Tag Archives: classifying topos

Where does the generic triangle live?

I recently posted the following question on MathOverflow.  It is a revision of an earlier question.

Motivation 1

There is such a thing as a generic group. In category theory this is done by constructing “theory” of the group, which is a category in a certain doctrine. Functors (in that doctrine) to Set, or more generally to any topos, are groups. The barest such theory (as usually seen) is the Lawverean algebraic theory of groups. This theory is a category containing an object and operations making it a group object in that category, and the theory is the smallest such category that contains all finite limits. There are fancier ones; the fanciest is the classifying topos for groups, which is in some sense the initial topos-with-group object. Since in a topos, you have full-scale first order intuitionistic logic, the classifying topos for groups allows you to reason about the generic group inside the classifying topos and the theorems you prove will be true for all groups. (This is only an approximation of the actual situation.) In particular you can’t prove it is abelian and you can’t prove it isn’t; the logic clearly does not have excluded middle.

Motivation 2

You can prove that a triangle that has two angles that are equal must be isosceles (has two sides that are equal). You can do this with Pappus’ proof: Look at the triangle, flip it over the perpendicular from the odd angle to the other side, look at it again, and the side-angle-side theorem shows you that the “two” triangles are congruent, so two sides much be equal. This appears to me to be true without requiring the parallel postulate. So the theorem and the proof must be true not only in Euclidean 2-space but in any surface of constant curvature. (Here I am getting into territory I know very little about, so this particular motivation may be totally misguided.)

The Question

So what I want is a classifying space of some sort that contains the generic triangle in such a way that maps of the correct sort to any surface of constant curvature are triangles, and so that Pappus’ proof can be carried out in the classifying space. The space doesn’t have to be a topos or a category at all. I have no clue as to what sort of structure it would be.

Note 1: Even the Lawvere theory of groups has its own internal logic — in this case equational logic. You certainly cannot prove the generic group is or is not abelian with equational logic.

Note 2: It does not seem reasonable to me that Pappus’ proof would work in a surface with variable curvature. But maybe there is some trick to define “angle mod curvature” that would make it true.

Send to Kindle

Just-in-time foundations

Introduction

In MathOverflow, statements similar to the following two occurred in comments:

  1. Sets and functions do not form a category
  2. Categories and functors do not form a category.

I cannot find either one of them now, but I want to talk about them anyway.

If you look at the definition of categories in various works (for example references [1] through [3] below) you find that the objects and arrows of a category must each form a “collection” or “class” together with certain operations.   The authors all describe the connection with Grothendieck’s concept of “universe” and define “large categories” and “small categories” in the usual way.  So Statement 1 above is simply wrong.

Statement 2 is more problematic.  The trouble is that if the word “categories” includes large categories then the objects do not form a set even in the second universe.  You have to go to the third universe.

Now there is a way to define categories where this issue does not come up.  It allows us to think about categories without having a particular system such as ZF and universes in mind.

A syntactic definition of category

A category consists of objects and arrows, together with four methods of construction M1 – M4 satisfying laws L1 -L7.  I treat “object” and “arrow” as predicates:  object[f] means f is an object and arrow[a] means a is an arrow.  “=” means equals in the mathematical sense.

M1 Source If arrow[f], object[f.source].
M2 Target If arrow [f], object[f.target].
M3 Identity If object[a],  arrow[a.identity].
M4 Comp If arrow[g] and arrow[f] and  f.target = g.source, then arrow[(g,f).comp].
L1. If object[a],  a.identity.source = a.
L2. If object[a], a.identity.target = a.
L3. If arrow[g] and arrow[f] and  f.target = g.source, then (g,f).comp.source = f.source.
L4. If arrow[g] and arrow[f] and  f.target = g.source, then (g,f).comp.target = g.target.
L5. If object[a] and arrow[f] and f.source = a, then (f, a.identity) = f.
L6.  If object[a] and arrow[g] and g.target = a, then (a.identity, g) = g.
L7.  If arrow[h] and arrow[g] and arrow[f] and h.source= g.target and g.source = f.target, then (h,(g,f).comp = ((h,g).comp, f.comp).
Remarks on this definition
1. I have deliberately made this definition look like a specification in an object oriented program (see [6]), although the syntax is not the same as any particular oo language.  It is as rigorous a mathematical definition as you could want, and it could presumably be compiled in some oo language, except that I don’t know if oo languages allow the conditional definition of a method as given in M4.
2.  I could have given the definition in mathematical English, for example “If f is an arrow then the source of f is an object”.  My point in providing the impenetrable definition above is to make a connection (admittedly incompletely established) with a part of math (the theory of oo languages) that is definitely rigorous but is not logic.  An informal definition in math English of course could also be transformed rigorously into first order logic.
3.  This definition is exactly equivalent to the FL sketch for categories given in my post [5].  That sketch has models in many categories, not just Set, as well as its generic model living in the corresponding FL-cattheory (or in the classifying topos it generates).
4.  Saunders Mac Lane defined metacategory in precisely this way in [1].  That was of course before anyone every heard of oo languages.  I think he should have made that the definition of category.

Just-in-time foundations

Mathematicians work inside the categories Set (sets and functions) and Cat (categories and functors) all the time, including functors to or from Cat or Set. When they consider a category, the use theorems that follow from the definition above.  They do not have to have foundations in mind.

Once in awhile, they are frustrated because they cannot talk about the set of objects of some category.  For example, Freyd’s solution set condition is required to prove the existence of a left adjoint because of that problem.  The ss condition is a work-around for a familiar obstruction to an easy way to prove something.  I can imagine coming up with such a work-around without ever giving a passing thought to foundations, in particular without thinking of universes.

When you work with a mathematical object, the syntax of the definitions and theorems give you all you need to justify the claim that something is a theorem.  You absolutely need models of the theory to think up and understand proofs, but the models could be sets or classes with structure, or functors (as in sketch theory), or you may work with generic models which may require you to use intuitionistic reasoning.  You don’t have to have any particular kind of model in mind when you work in Set or Cat.

When you do run into something like the impossibility of forming the set of objects of some category (which happens in any model theory environment that uses classical rather than intuitionistic reasonins) then you may want to consider an approach through some theory of foundations.  That is what most mathematicians do: they use just-in-time foundations. For example, in a particular application you may be happy to work in a topos with a set-of-all-objects, particularly if you are a certain type of computer scientists who lives in Pittsburgh.  You may be happy to explicitly consider universes, although I am not aware of any category-theoretical results that do explicitly mention universes.

But my point is that most mathematicians think about foundations only when they need to, and most mathematicians never need to think about foundations in their work. Moral: Don’t think in terms of foundations unless you have to.

This point of view is related to the recent discussions of pragmatic foundations [7] [8].

Side remark

The situation that you can’t always construct a set of somethings is analogous to the problem that you have in working with real numbers:  You can’t name most real numbers. This may get in the way of some analyst wanting to do something, I don’t know.  But in any branch of math, there are obstructions to things you want to do that really do get in your way.  For example, in beginning linear algebra, it may have occurred to you, to your annoyance, that if you have the basis of a subspace you can extend it to the basis for the whole space, but if you have a basis of the whole space, and a subspace, the basis may not contain a basis of the subspace.

References and links

  1. Saunders Mac Lane, Categories for the working mathematician. Springer-Verlag, 1971.
  2. Wikipedia article on category theory
  3. Michael Barr and Charles Wells, Category Theory for Computing Science, Third Edition (1999). Les Publications CRM, Montreal (publication PM023).
  4. Discussion of functions in abstractmath.org.
  5. Definitions into Mathematical Objects 7.
  6. Object oriented programming in Wikipedia.
  7. M. Gelfand, We Do Not Choose Mathematics as Our Profession, It Chooses Us: Interview with Yuri Manin.
  8. Discussion in n-category cafe.
Send to Kindle

Naive proofs

The monk problem

A monk starts at dawn at the bottom of a mountain and goes up a path to the top, arriving there at dusk. The next morning at dawn he begins to go down the path, arriving at dusk at the place he started from on the previous day. Prove that there is a time of day at which he is at the same place on the path on both days.

Proof: Envision both events occurring on the same day, with a monk starting at the top and another starting at the bottom at the same time and doing the same thing the monk did on different days. They are on the same path, so they must meet each other. The time at which they meet is the time required.

The pons asinorum

Theorem: If a triangle has two equal angles, then it has two equal sides.

Proof: In the figure below, assume angle ABC = angle ACB. Then triangle ABC is congruent to triangle ACB since the sides BC and CB are equal and the adjoining angles are equal.

PATriangle

I considered the monk problem at length in my post Proofs Without Dry Bones.  Proofs like the one given of the pons asinorum, particularly its involvement with labeling, recently came up on the mathedu mailing list.  See also my question on Math Overflow.

Naive proofs

These proofs share a characteristic property; I propose to say they are naive, in the sense Halmos used it in his title Naive Set Theory.

The monk problem proof is naive.

For the monk problem, you can give a model of a known mathematical type (for example model the paths as  smoothly parametrized curves on a surface) and use known theorems (for example the intermediate value theorem) and facts (for example that clock time is cyclical and invariant under the appropriate mapping) to prove it.  But the proof says nothing about that.

You could imagine inventing an original set of axioms for the monk problem, giving axioms for a structure that are satisfied by the monk’s journeys and their timing and that imply the result.  In principle, these could be very different from multivariable calculus ideas and still serve the purpose. (But I have not tried to come up with such a thing.)

But the proof as given simply uses directly  known facts about clock time and traveling on paths.  These are known to most people.  I have claimed in several places that this proof is still a mathematical proof.

Every proof is incomplete in the sense that they provide a mathematical model and analyze it using facts the reader is presumed to know.  Proofs never go all the way to foundations.  A naive proof simply depends more than usual on the reader’s knowledge: the percentage of explication is lower.  Perhaps “naive” should also include the connotation that the requisite knowledge is “common knowledge”.

The pons asinorum proof is naive.

This involves some subtle issues.  When I first wrote about this proof in the Handbook I envisioned the triangle as existing independently of any embedding in the plane, as if in the Platonic world of ideals.  I applied some labels and a relabeling and used a known theorem of Euclid’s geometry.  You certainly don’t have to know where the triangle is in order to understand the proof.

That’s a clue.  The triangle in the problem does not need to be planar. It is true for triangles in the sphere or on a saddle surface, because the proof does not involve the parallel axiom. But the connection with the absence of the parallel axiom is illusory.  When you imagine the triangle in your head the proof works directly for a triangle in any suitable geometry, by imagining the triangle as existing in and of itself, and not embedded in anything.

Questions

  1. How do you give a mathematical definition of a triangle so that it is independent of embedding?  This was the origin of my question on Math Overflow, although I muddled the issue by mentioning specific ways of doing it.
  2. (This is a variant of question 1.)  Is there anything like a classifying topos or space for a generic triangle?  In other words, a category or space or something that is just big enough to include the generic triangle and from which mappings to suitable spaces or categories produce what we usually mean by triangles.
  3. Some of the people on mathedu thought a triangle obviously had to have labels and others thought it obviously didn’t.  Specifically, is triangle ABC “the same” as triangle ACB?  Of course they are congruent.  Are they the sameThis is an evil question. The proof works on the generic isosceles triangle.  That’s enough.  Isn’t it?  All three corners of the generic isosceles triangle are different points.  Aren’t they?  (I have had second, third and nth thoughts about this point.)
  4. You can define a triangle as a list of lengths of edges and connectivity data.  But the generic triangle’s sides ought to be (images of) line segments, not abstract data.  I don’t really understand how to formulate this correctly.

Note

1.  I could avoid discussion of irrelevant side issues in the monk problem by referring to specific times of day for starting and stopping, instead of dawn and dusk.  But they really are irrelevant.

Send to Kindle