Tag Archives: isomorphism

Sets

I have been working my way through abstractmath.org, revising the articles and turning them into pure HTML so they will be easier to update. In some cases I am making substantial revisions. In particular, many of the articles need a more modern point of view.

 

The math community’s understanding of sets and structures has changed because of category theory and will change
because of homotopy type theory.

 

This post considers some issues and possibilities concerning the chapter on sets.

The references listed at the end of the article include several about homotopy type theory. They provide different viewpoints and require different levels of sophistication.

A specification of the concept of set

The abmath article Specification of sets specifies what a set is in this way:

A set is a single math object distinct from but completely determined by what its elements are.

I have used this specification for sets since the eighties, first in my Discrete Math lecture notes and then in abstractmath.org. It has proved useful because it is quite simple and the statement implies lots of immediate consequences. Each of the first four consequences in this list below exposes a confusion that some students have.

Consequences of the specification

  1. A set is a math object. It has the same status as the number “$143$” and the sine function and the real line: they are all objects of math. A set is not merely a typographically convenient way to define a certain collection of things.
  2. A set is a single object. Many beginners seem to have in their head that the set $\{3,4\}$ is two things.
  3. A set is distinct from its elements. The set $\{3,4\}$ is not $3$, it is not $4$, it is not a number at all.
  4. The spec implies that $\{3,4\}$ is the same set as $\{4,3\}$. Some students think they understand this but some of their mistakes show that they don’t really understand it.
  5. On the other hand, $\{3,5\}$ is a different set from $\{3,4\}$. I haven’t noticed this bothering students but it bothers me. See the discussion on ursets below.

Those consequences make the spec a useful teaching tool. But if a beginning abstract math student gets very far in their studies, some complications come up.

Defining “set”

In the late nineteenth century, math people started formally defining particular math structures such as groups and various
kinds of spaces. This was normally done by starting with a set and adding structure.

You may think that “starting with a set and adding structure” brushes a lot of complications under the rug. Well, don’t look under the rug, at least not right now.

The way they thought about sets was a informal version of what is now called naive set theory. In particular, they freely defined particular sets using what is essentially setbuilder notation, producing sets in a way which (I claim) satisfies my specification.

Bertrand Russell wakes everyone up

Then along came Russell’s paradox. In the context of this discussion, the paradox implied that the spec for sets is not a definition.The spec provides a set of necessary conditions for being a set. But it is not sufficient. You can say “Let $S$ be the set of all sets that…[satisfy some condition]” until you are blue in the face, but there are conditions (including the empty condition) that don’t define a set.

The Zermelo-Fraenkel axioms

The Zermelo-Fraenkel axioms were designed to provide a definition that didn’t create contradictions. The axioms accomplish this by creating a sort of hierarchy that requires that each set must be defined in terms of sets defined previously. They provide a good way (but not the only one) of providing a way of legitimizing our use of sets in math.

Observe that the “set of all sets” is certainly not “defined” in terms of previously defined sets!

Sets as a foundation

During those days there was a movement to provide a solid foundation for mathematics. After Zermelo-Fraenkel came along, the progress of thinking seemed to be:

  1. Sets are in trouble.
  2. Zermelo-Fraenkel solves our set difficulties.
  3. So let’s require that every math object be a set.

That list is oversimplified. In particular, the development of predicate logic was essential to this approach, but I can’t write about everything at once.

This leads to monsters such as the notorious definition of ordered pair:

The ordered pair $(a,b)$ is the set $\{a,\{b\}\}$.

This leads to the ludicrous statement that $a$ is an element of $(a,b)$ but that $b$ is not.

By saying every math object may be modeled as a set with structure, ZF set theory becomes a model of all of math. This approach gives a useful proof that all of math is as consistent as ZF set theory is.

But many mathematicians jumped to the conclusion that every math object must be a set with structure. This approach does not match the way mathematicians think about math objects. In particular, it makes computerized proof assistance hard to use because you have to translate your thinking into sets and first order logic.

Sets by category theory

“A mathematical object is determined by the role it plays in a category.” — A. Grothendieck

In category theory, you define math structures in terms of how they relate to other math structures. This shifts the emphasis from

What is it?

to

What are its properties?

For example, an ordered pair is a mathematical object $p$ determined by these properties:

  • It determines mathematical objects $p_1$ and $p_2$.
  • $p$ is completely determined by what $p_1$ is and what $p_2$ is.
  • If $p$ and $q$ are ordered pairs and $p_1=q_1$ and $p_2=q_2$ then $p=q$.

Categorical definition of set

“Categorical” here means “as understood in category theory”. It unfortunately has a very different meaning in model theory (set of axioms with only one model up to isomorphism) and in general usage, as in “My answer is categorically NO” said by someone who is red in the face. The word “categorial” has an entirely different meaning in linguistics. *Sigh*.

William Lawvere has produced an axiomatization of the category of sets.
The most accessible introduction to it that I know of is the article Rethinking set theory, by Tom Leinster. This axiomatization defines sets by their relationship with each other and other math objects in much the same way as the categorical definition of (for example) groups gives a definition of groups that works in any category.

“Set” means two different things

The word set as used informally has two different meanings.

  • According to my specification of sets, $\{3,4\}$ is a set and so is $\{3,5\}$.
  • $\{3,4\}$ and $\{3,5\}$ are not the same set because they don’t have the same elements.
  • But in the category of sets, any two $2$-element sets are isomorphic. (So are any two seven element sets.)
  • From a categorical point of view, two isomorphic objects in a category can be be thought of as the same object, with a caveat that you have better make it clear which isomorphism you are thinking of.

One of the great improvements in mathematics that homotopy type theory supplies is a systematic way of keeping track of the isomorphisms, the isomorphisms between the isomorphisms, and so on ad infinitum (literally). But note: I am just beginning to understand htt, so regard this remark as something to be suspicious of.

  • But $\{3,4\}$ and $\{3,5\}$ may not be thought of as the same object according to the spec I gave, because they don’t have the same elements.
  • This means that the traditional idea of set is not the same as the strict categorical idea of set.

I suggest that we keep the word “set” for the traditional concept and call the strict categorical concept an urset.

A traditional set is a structure on an urset

The traditional set $\{3,5\}$ consists of the unique two-element urset coindexed on the integers.

A (ur)set $S$ coindexed by a math structure $A$ is a monic map from $S$ to the underlying set of $A$. In this example, the map has codomain the integers and takes one element of the two-element urset to $3$ and the other to $5$.

Note added 2014-10-05 in response to Toby Bartels’ comment: I am inclined to use the names “abstract set” for “urset” and “concrete set” for coindexed sets when I revise the articles on sets. But most of the time we can get away with just “set”.

There is clearly no isomorphism of coindexed sets from $\{3,4\}$ to $\{3,5\}$, so those two traditional sets are not equal in the category of coindexed sets.

I made up the phrase “coindexed set” to use in this sense, since it is a kind of opposite of indexed set. If terminology for this already exists, lemme know. Linguists will tell you they use the word “coindexed” in a different sense.

Elements

The concept of “element” in categorical thinking is very different from the traditional idea, where an element of a set can be any mathematical object. In categorical thinking, an element of an object $A$ of a category $\mathbf{C}$ is an arrow $1\to A$ where $1$ is the terminal object. Thus $4$ as an integer is the arrow $1\to \mathbb{Z}$ whose unique value is the number $4$.

An object is an element of only one set

In the usage of category theory, the arrow $1\to\mathbb{R}$ whose value is the real number $4$ is a different math object from the arrow $1\to\mathbb{Z}$ whose value is the integer $4$.

A category theorist will probably agree that we can identify the integer $4$ with the real number $4$ via the well known canonical embedding of the ring of integers into the field of real numbers. But in categorical thinking you have to keep all such embeddings in mind; you don’t say the integer $4$ is the same thing as the real number $4$. (Most computer languages keep them distinct, too.)

This difference is actually not hard to get used to and is in fact an improvement over traditional set theory. When you do category theory you use lots of commutative diagrams. The embeddings show up as monic arrows and are essential in keeping the different objects ($\mathbb{Z}$ and $\mathbb{R}$ in the example) separate.

The paper Relating first-order set theory and elementary toposes, by Awodey, Butz, Simpson and Streicher, introduces a concept of “structural system of inclusions” that appears to me to restore the idea of object being an element of more than one set for many purposes.

Homotopy type theory allows an object to have only one type, with much the same effect as in the categorical approach.

Variable elements

The arrow $1\to \mathbb{Z}$ that picks out the integer $4$ is a constant function. It is useful to think of any arrow $A\to B$ of any category as a variable element (or generalized element) of the object $B$. For example, the function $f:\mathbb{R}\to \mathbb{R}$ defined by $f(x)=x^2$ allows you to
think of $x^2$ as a variable number with real parameter. This is another way of thinking about the “$y$” in the equation $y=x^2$, which is commonly called a dependent variable.

One way to think about $y$ is that some statements about it are true, some are false, and many statements are neither true nor false.

  • $y\geq 0$ is true.
  • $y\lt0$ is false.
  • $y\leq1$ is neither true nor false.

This way of thinking about variable objects clears up a lot of confusion about variables and deserves to be more widely used in teaching.

The book Category theory for computing science provides some examples of the use of variable elements as a way of thinking about categorical ideas.

References

Creative Commons License< ![endif]>

This work is licensed under a Creative
Commons Attribution-ShareAlike 2.5
License
.

Send to Kindle

A proof by diagram chasing



In Rigorous proofs, I went through the details of a medium-easy epsilon-delta proof in great detail as a way of showing what is hidden by the wording of the proof. In this post, I will do the same for an easy diagram-chasing proof in category theory. This theorem is stated and proved in Category Theory for Computing Science, page 365, but the proof I give here maximizes the diagram-chasing as a way of illustrating the points I want to make.

Theorem (J. Lambek) Let $F$ be a functor from a category to itself and let $\alpha:Fa\to a$ be an algebra for $F$ which is initial. Then $\alpha$ is an isomorphism.

Proof

  1. $F\alpha:FFa\to Fa$ is also an $F$-algebra.
  2. Initiality means that there is a unique algebra morphism $\eta:a\to Fa$ from $\alpha:Fa\to a$ to $F\alpha:FFa\to Fa$ for which this diagram commutes:



  3. To that diagram we can adjoin another (obviously) commutative square:



  4. Then the outside rectangle in the diagram above also commutes.
  5. This means that $\alpha\circ\eta:a\to a$ is an $F$-algebra morphism from $\alpha:Fa\to a$ to itself.
  6. Another such $F$-algebra morphism is $\text{id}_{A}$.
  7. Initiality of $\alpha$ means that the diagram below commutes:



  8. Because the upper bow and the left square both commute we are justified in inserting a diagonal arrow as below.



  9. Now we can read off the diagram that $F\alpha\circ F(\eta)=\text{id}_{Fa}$ and $\eta\circ\alpha=\text{id}_a$. By definition, then, $\eta$ is a two-sided inverse to $\alpha$, so $\alpha$ is an isomorphism.

Analysis of the proof

This is an analysis of the proof showing what is not mentioned in the proof, similar to the analysis in Rigorous proofs.

  • An $F$-algebra is any arrow of the form $\alpha:Fa\to a$. This definition directly verifies statement (1). You do need to know the definition of “functor” and that the notation $Fa$ means $F(a)$ and $FFa$ means $F(F(a))$.
  • When I am chasing diagrams, I visualize the commutativity of the diagram in (2) by thinking of the red path and the blue path as having the same composites in this graph:





    In other words, $F\alpha\circ F\eta=\eta\circ\alpha$. Notice that the diagram carries all the domain and codomain information for the arrows, whereas the formula “$F\alpha\circ F\eta=\eta\circ\alpha$” requires you to hold the domains and codomains in your head.

  • (Definition of morphism of $F$-algebra) The reader needs to know that a morphism of $F$ algebras is any arrow $\delta:c\to d$ for which




    commutes.
  • (Definition of initial $F$-algebra) $\alpha$ is an initial $F$-algebra means that for any algebra $\beta:Fb\to b$, there is a unique arrow $\delta$ for which the diagram above commutes.
  • (2) is justified by the last two definitions.
  • Pulling a “rabbit out of a hat” in a proof means introducing something that is obviously correct with no motivation, and then checking that it results in a proof. Step (9) in the proof given in Rigorous proofs has an example of adding zero cleverly. It is completely OK to pull a rabbit out of a hat in a proof, as long as the result is correct, but it makes students furious.
  • In statement (3) of the proof we are considering here, the rabbit is the trivially commutative diagram that is adjoined on the right of the diagram from (2).
  • Statement (4) uses a fact known to all diagram chasers: Two joined commutative squares make the outside rectangle commute. You can visualize this by seeing that the three red paths shown below all have the same composite. When I am chasing a complicated diagram I trace the various paths with my finger, or in my head.



    You could also show it by pointing out that $\alpha\circ F\alpha\circ F\eta=\alpha\circ\eta\circ\alpha$, but to check that I think most of us would go back and look at the diagram in (3) to see why it is true. Why not work directly with the diagram?

  • The definition of initiality requires that there be only one $F$-algebra morphism from $\alpha:Fa\to a$ to itself. This means that the upper and lower bows in (7) commute.
  • The diagonal identity arrow in (8) is justified by the fact that the upper bow is exactly the same diagram as the upper triangular diagram in (8). It follows that the upper triangle in (8) commutes. I visualize this as moving the bow down and to the left with the upper left node $Fa$ as a hinge, so that the two triangles coincide. (It needs to be flipped, too.) I should make an interactive diagram that shows this.
  • The lower triangle in (8) also commutes because the square in (2) is given to be commutative.
  • (Definition of isomorphism in a category) An arrow $f:a\to b$ in a category is an isomorphism if there is an arrow $g:b\to a$ for which these diagrams commute:


    xx


    This justifies statement (9).

Remark: I have been profligate in using as many diagrams as I want because this can be seen on a screen instead of on paper. That and the fact that much more data about domains and codomains are visible because I am using diagrams instead of equations involving composition means that the proof requires the readers to carry much less invisible data in their heads.

Send to Kindle