A new kind of introduction to category theory

About this article

  • This post is an alpha version of the first part of the intended article.
  • People who are beginners in learning abstract math concepts have many misunderstandings about the definitions and early theorems of category theory.
  • This article introduces a few basic concepts of category theory. It goes into detail in Purple Prose about the misunderstandings that can arise with each of the concepts. The article is not at all a complete introduction to categories.
  • My blog post Introducing abstract topics describes some of the strategies needed in teaching a new abstract math concept.
  • This article also introduces a few examples of categories that are primarily chosen to cause the reader to come up against some of those misunderstandings. The first example is completely abstract.
  • Math students usually see categories after considerable exposure to abstract math, but students in computing science and other fields may see them without having much background in abstraction. I hope teachers in such courses will include explanations of the sort of misunderstandings mentioned in this article.
  • Like all posts in Gyre&Gimble and all posts in abstractmath.org, this article is licensed under a Creative Commons Attribution-ShareAlike 2.5 License. If you are teaching a class involving category theory, feel free to hand it out, and to modify it (in which case you should include a link to this post).
  • You could also use the article as a source of remarks you make in the class about the topics.

About categories

To be written.

Definition of category

A category is a type of mathematical structure consisting of two types of data, whose relationships are entirely determined by some axioms. After the definition is complete, I will introduce several categories with a detailed discussion of each one, explaining how they fit the definition of category.

Axiom 1: Data

  1. A category consists of two types of data: objects and arrows.
  2. No object can be an arrow and no arrow can be an object.

Notes for Axiom 1

  • An object of a category can be any kind of mathematical object. It does not have to be a set and it does not have to have elements.
  • Arrows of a category are also called morphisms. You may be familiar with “homomorphisms”, “homeomorphisms” or “isomorphisms”, all of which are functions. This does not mean that a “morphism” in an arbitrary category is a function.

Axiom 2: Domain and codomain

  1. Each arrow has a domain and a codomain, each of which is an object of the category.
  2. The domain and the codomain of an arrow may or may not be the same object.
  3. Each arrow has only one domain and only one codomain.

Notes for Axiom 2

  • If $f$ is an arrow with domain $A$ and codomain $B$, that fact is typically shown either by the notation “$f:A\to B$” or by a diagram like this:
  • The notation “$f:A\to B$” is like that used for functions. This notation may be used in any category, but it does not imply that $f$ is a function or that $A$ and $B$ have elements.
  • For such an arrow, the notation “$\text{dom}(f)$” refers to $A$ and “$\text{cod}(f)$” refers to $B$.
  • For a given category $\mathsf{C}$, the collection of all the arrows with domain $A$ and codomain $B$ may be denoted by
    • “$\text{Hom}(A,B)$” or
    • “$\text{Hom}_\mathsf{C}(A,B)$” or
    • “$\mathsf{C}(A,B)$”.
  • Some newer books and articles in category theory use the name source for domain and target for codomain. This usage has the advantage that a newcomer to category theory will be less likely to think of an arrow as a function.

Axiom 3: Composition

  1. If $f$ and $g$ are arrows in a category for which $\text{cod}(f)=\text{dom}(g)$, as in this diagram:

    then there is a unique arrow with domain $A$ and codomain $C$ called the composite of $f$ and $g$.

Notes for Axiom 3

  • An important metaphor for composition is: Every path of length 2 has exactly one composite.
  • The unique arrow required by Axiom 3 may be denoted by “$g\circ f$” or “$gf$”. “$g\circ f$” is more explicit, but “$gf$” is much more commonly used by category theorists.
  • Many constructions in categories may be shown by diagrams, like the one used just above.
  • The diagram

    is said to commute if $h=g\circ f$. The idea is that going along $f$ and then $g$ is the same as going along $h$.

  • It is customary in some texts in category theory to indicate that a diagram commutes by putting a gyre in the middle:
  • The concept of category is an abstraction of the idea of function, and the composition of arrows is an abstraction of the composition of functions. It uses the same notation, “$g\circ f$”. If $f$ and $g$ are set functions, then for an element $x$ in the domain of $f$, \[(g\circ f)(x)=g(f(x))\]
  • But in an arbitrary category, it may make no sense to evaluate an arrow $f$ at some element $x$; indeed, the domain of $f$ may not have elements at all, and then the statement “$(g\circ f)(x)=g(f(x))$” is meaningless.
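For set functions, the formula $(g\circ f)(x)=g(f(x))$ can be tried out directly. The sketch below assumes finite functions stored as Python dictionaries; the names `f`, `g` and `compose` are illustrative only.

```python
# Finite set functions as dictionaries (an assumption of this sketch).
f = {1: "a", 2: "b"}            # f : {1, 2} -> {"a", "b"}
g = {"a": True, "b": False}     # g : {"a", "b"} -> {True, False}

def compose(g, f):
    """Return g∘f: apply f first, then g."""
    return {x: g[f[x]] for x in f}

gf = compose(g, f)
print(gf)
```

This works only because the objects here are sets with elements. In an abstract category the composite has to be given some other way, for example by a table listing the composite of each path of length 2.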

Axiom 4: Identity arrows

  1. For each object $A$ of a category, there is a unique arrow denoted by $\textsf{id}_A$.
  2. $\textsf{dom}(\textsf{id}_A)=A$ and $\textsf{cod}(\textsf{id}_A)=A$.
  3. For any object $B$ and any arrow $f:B\to A$, the diagram

    commutes.

  4. For any object $C$ and any arrow $g:A\to C$, the diagram

    commutes.

Notes for Axiom 4

  • The fact stated in Axiom 4(b) could be shown diagrammatically either as

    or as

  • Facts (c) and (d) can be written in algebraic notation: For any arrow $f$ going to $A$,\[\textsf{id}_A\circ f=f\]and for any arrow $g$ coming from $A$,\[g\circ \textsf{id}_A=g\]

Axiom 5: Associativity

  1. If $f$, $g$ and $h$ are arrows in a category for which $\text{cod}(f)=\text{dom}(g)$ and $\text{cod}(g)=\text{dom}(h)$, as in this diagram:

    then there is a unique arrow $k$ with domain $A$ and codomain $D$ called the composite of $f$, $g$ and $h$.

  2. In the diagram below, the two triangles containing $k$ must both commute.

Notes for Axiom 5

  • Axiom 5b requires that \[h\circ(g\circ f)=(h\circ g)\circ f\](which both equal $k$), which is the usual formula for associativity.
  • Note that the top two triangles commute by Axiom 3.
  • The associativity axiom means that we can get rid of parentheses and write \[k=h\circ g\circ f\]just as we do for addition and multiplication of numbers.
  • In my opinion the notation using categorical diagrams communicates information much more clearly than algebraic notation does. In particular, you don’t have to remember the domains and codomains of the arrows, because they appear in the picture. I admit that diagrams take up much more space, but now that we read math stuff on a computer screen instead of on paper, space is free.
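For set functions, associativity is not an extra assumption: it follows from the evaluation formula. A short derivation, assuming $f$, $g$ and $h$ are composable set functions and $x$ is an element of the domain of $f$:

```latex
\begin{align*}
\bigl(h\circ(g\circ f)\bigr)(x) &= h\bigl((g\circ f)(x)\bigr) = h\bigl(g(f(x))\bigr),\\
\bigl((h\circ g)\circ f\bigr)(x) &= (h\circ g)\bigl(f(x)\bigr) = h\bigl(g(f(x))\bigr).
\end{align*}
```

In an arbitrary category there may be no $x$ to evaluate at, so this argument is not available. That is why associativity has to be required as an axiom.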

Examples of categories

For the first three examples, I will give a detailed explanation about how they fit the definition of category.

Example 1: MyFin

This first example is a small, finite category which I have named $\mathsf{MyFin}$ (my very own finite category). It is not at all an important category, but it has advantages as a first example.

  • It’s small enough that you can see all the objects and arrows on the screen at once, but big enough not to be trivial.
  • The objects and arrows have no properties other than being objects and arrows. (The other examples involve familiar math objects.)
  • So in order to check that $\mathsf{MyFin}$ really obeys the axioms for a category, you can use only the skeletal information given here. As a result, you must really understand the axioms!

A correct proof will be based on axioms and theorems. The proof can be suggested by your intuitions, but intuitions are not enough. When working with $\mathsf{MyFin}$ you won’t have any intuitions!

A diagram for $\mathsf{MyFin}$

This diagram gives a partial description of $\mathsf{MyFin}$.

Now let’s see how to make the diagram above into a category.

Axiom 1

  • The objects of $\mathsf{MyFin}$ are $A$, $B$, $C$ and $D$.
  • The arrows are $f$, $g$, $h$, $j$, $k$, $r$, $s$, $u$, $v$, $w$ and $x$.
  • You can regard the letters just listed as names of the objects and arrows. The point is that at this stage all you know about the objects and arrows are their names.
  • If you prefer, you can think of the arrows as the actual arrows shown in the $\mathsf{MyFin}$ diagram.
  • Our definition of $\mathsf{MyFin}$ is an abstract definition. You may have seen multiplication tables of groups given in terms of undefined letters. (If you haven’t, don’t worry.) Those are also abstract definitions.
  • Most of our other definitions of categories involve math objects you actually know something about. They are like the definition of division, for example, where the math objects are integers.

Axiom 2

  • The domains and codomains of the arrows are shown by the diagram above.
  • For example, $\text{dom}(r)=A$ and $\text{cod}(r)=C$, and $\text{dom}(v)=\text{cod}(v)=B$.

Axiom 3

Showing the $\mathsf{MyFin}$ diagram does not completely define $\mathsf{MyFin}$. We must say what the composites of all the paths of length 2 are.

  • In fact, most of them are forced, but two of them are not.
  • We must have $g\circ f=r$ because $r$ is the only arrow possible for the composite, and Axiom 3 requires that every path of length 2 must have a composite.
  • For the same reason, $h\circ g=s$.
  • All the paths involving $u$, $v$, $w$ and $x$ are forced:

  • (p1) $u\circ u=u$, $v\circ v=v$, $w\circ w=w$ and $x\circ x=x$.
  • (p2) $f\circ u=f$, $r\circ u=r$, $j\circ u=j$ and $k\circ u=k$. You can see that, for example, $f\circ u=f$ by opening up the loop on $f$ like this:

    There is only one arrow going from $A$ to $B$, namely $f$, so $f$ has to be the composite $f\circ u$.

  • (p3) $v\circ f=f$, $g\circ v=g$ and $s\circ v=s$.
  • (p4) $w\circ g=g$, $w\circ r=r$ and $h\circ w=h$.
  • (p5) $x\circ h=h$, $x\circ s=s$, $x\circ j=j$ and $x\circ k=k$.

  • For $s\circ f$ and $h\circ r$, we have to choose between $j$ and $k$ as composites. Since $s\circ f=(h\circ g)\circ f$ and $h\circ r=h\circ (g\circ f)$, Axiom 5 requires that we choose one of $j$ and $k$ to be both composites.

    Definition: $s\circ f=h\circ r=j$.

    If we had defined $s\circ f=h\circ r=k$ we would have a different category, although one that is “isomorphic” to $\mathsf{MyFin}$ (you have to define “isomorphic” or look it up).
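The claim that all but two paths of length 2 are forced can be checked mechanically. The sketch below is one way to do it. The domains and codomains are an assumption: they are inferred from the composites listed in (p1) through (p5), and the function name `candidates` is made up for this illustration.

```python
# Domain and codomain of each arrow of MyFin, as inferred from (p1)-(p5).
arrows = {
    "f": ("A", "B"), "g": ("B", "C"), "h": ("C", "D"),
    "r": ("A", "C"), "s": ("B", "D"),
    "j": ("A", "D"), "k": ("A", "D"),
    "u": ("A", "A"), "v": ("B", "B"), "w": ("C", "C"), "x": ("D", "D"),
}
# The only arrow from each object to itself must be its identity (Axiom 4).
identities = {"u", "v", "w", "x"}

def candidates(first, second):
    """Arrows that could serve as the composite second∘first."""
    if first in identities:       # identity law forces second∘id = second
        return {second}
    if second in identities:      # identity law forces id∘first = first
        return {first}
    wanted = (arrows[first][0], arrows[second][1])
    return {a for a, dc in arrows.items() if dc == wanted}

# Every path of length 2: cod(first) must equal dom(second).
paths = [(p, q) for p in arrows for q in arrows
         if arrows[p][1] == arrows[q][0]]

unforced = {path: candidates(*path) for path in paths
            if len(candidates(*path)) > 1}
print(len(paths), "paths of length 2; unforced:", unforced)
```

Under these assumptions the only paths with more than one candidate are $f$ followed by $s$ and $r$ followed by $h$, and in both cases the candidates are $j$ and $k$.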

Axiom 4

    • It is clear from the $\mathsf{MyFin}$ diagram that for each object there is just one arrow that has that object both as domain and as codomain, as required by Axiom 4a.
    • The requirements in Axiom 4c and 4d are satisfied by statements (p1) through (p5).

    Axiom 5

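    • Since we have already required both $(h\circ g)\circ f$ and $h\circ(g\circ f)$ to be $j$, composition is associative. Every other path of length 3 contains an identity arrow, and for those the two ways of composing agree by (p1) through (p5).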
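The whole verification can also be done by brute force. The following is a minimal sketch, not a definitive implementation. It assumes the domains and codomains inferred from (p1) through (p5), and `build_composition` and `is_category` are hypothetical helper names. The composition table is keyed by `(second, first)` so that `table[("g", "f")]` stands for $g\circ f$.

```python
from itertools import product

# Domain and codomain of each arrow of MyFin, as inferred from (p1)-(p5).
arrows = {
    "f": ("A", "B"), "g": ("B", "C"), "h": ("C", "D"),
    "r": ("A", "C"), "s": ("B", "D"),
    "j": ("A", "D"), "k": ("A", "D"),
    "u": ("A", "A"), "v": ("B", "B"), "w": ("C", "C"), "x": ("D", "D"),
}
identity = {"A": "u", "B": "v", "C": "w", "D": "x"}

def build_composition(sf, hr):
    """Composition table, given the choices for s∘f and h∘r."""
    table = {("g", "f"): "r", ("h", "g"): "s",
             ("s", "f"): sf, ("h", "r"): hr}
    for a, (d, c) in arrows.items():
        table[(a, identity[d])] = a    # a∘id = a, as in (p1)-(p5)
        table[(identity[c], a)] = a    # id∘a = a
    return table

def is_category(table):
    composable = [(p, q) for p, q in product(arrows, repeat=2)
                  if arrows[p][1] == arrows[q][0]]
    # Axiom 3: each path of length 2 has a composite with the right ends.
    for p, q in composable:
        comp = table.get((q, p))
        if comp is None or arrows[comp] != (arrows[p][0], arrows[q][1]):
            return False
    # Axiom 4: identity laws.
    for a, (d, c) in arrows.items():
        if table[(a, identity[d])] != a or table[(identity[c], a)] != a:
            return False
    # Axiom 5: associativity for every path of length 3.
    for (p, q), t in product(composable, arrows):
        if arrows[q][1] == arrows[t][0]:
            if table[(t, table[(q, p)])] != table[(table[(t, q)], p)]:
                return False
    return True

print(is_category(build_composition("j", "j")))   # the definition in the text
print(is_category(build_composition("k", "k")))   # the other consistent choice
print(is_category(build_composition("j", "k")))   # mixed choice
```

With this encoding, choosing $j$ for both composites or $k$ for both passes every check, while the mixed choice fails associativity on the path $f$, $g$, $h$.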

    Example 2: Set

    To be written.

    This will be a very different example, because it involves known mathematical objects: sets and functions. But there are still issues, for example the fact that the inclusion of $\{1,2\}$ into $\{1,2,3\}$ and the identity map on $\{1,2\}$ are two different arrows in the category of sets.

    Example 3: IntegerDiv

    To be written.

    The objects are all the positive integers and there is an arrow from $m$ to $n$ if and only if $m$ divides $n$. So this example involves familiar objects and predicates, but the arrows are nevertheless not functions that take elements to elements. Integers don’t have elements. I would expect to show how the GCD of two integers is a limit.
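As a rough preview of the GCD remark, here is a sketch that tests the universal property on a finite range of positive integers. It assumes that the limit in question is the categorical product of two objects, and the names `has_arrow` and `is_product` are invented for the illustration. Since there is at most one arrow between any two objects, the uniqueness part of the universal property holds automatically.

```python
from math import gcd

def has_arrow(m, n):
    """There is an arrow m -> n exactly when m divides n."""
    return n % m == 0

def is_product(p, a, b, universe):
    """Check whether p behaves as a product of a and b within `universe`.

    p must have arrows to a and to b, and every d that has arrows
    to both a and b must have an arrow to p.
    """
    if not (has_arrow(p, a) and has_arrow(p, b)):
        return False
    return all(has_arrow(d, p)
               for d in universe
               if has_arrow(d, a) and has_arrow(d, b))

universe = range(1, 61)          # a finite test range, not the whole category
print(is_product(gcd(12, 18), 12, 18, universe))
print(is_product(3, 12, 18, universe))
```

The second call fails because $6$ divides both $12$ and $18$ but does not divide $3$. Note that the code only ever asks whether an arrow exists; it never evaluates an arrow at an element.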

    This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.

    Very early difficulties II

    This is the second part of a series of posts about certain difficulties math students have in the very early stages of studying abstract math. The first post, Very early difficulties in studying abstract math, gives some background to the subject and discusses one particular difficulty: Some students do not know that it is worthwhile to try starting a proof by rewriting what is to be proved using the definitions of the terms involved.

    Math StackExchange

    The website Math StackExchange is open to any questions about math, even very easy ones. It is in contrast with MathOverflow, which is aimed at professional mathematicians asking questions in their own field.

    Math SE contains many examples of the early difficulties discussed in this series of posts, and I recommend to math ed people (not just RUME people, since some abstract math occurs in advanced high school courses) that they consider reading through questions on Math SE for examples of misunderstandings that students have.

    There are two caveats:

    • Most questions on Math SE are at a high enough level that they don’t really concern these early difficulties.
    • Many of the questions are so confused that it is hard to pinpoint what is causing the difficulty that the questioner has.

    Connotations of English words

    The term(s) defined in a definition are often given ordinary English words as names, and the beginner automatically associates the connotations of the meaning of the English word with the objects defined in the definition.

    Infinite cardinals

    If $A$ is a finite set, the cardinality of $A$ is simply a natural number (including $0$). If $A$ is a proper subset of another finite set $B$, then the cardinality of $A$ is strictly less than the cardinality of $B$.

    In the nineteenth century, mathematicians extended the definition of cardinality to infinite sets, and for the most part cardinality has the same behavior as for finite sets. For example, the cardinal numbers are well-ordered. However, for infinite sets it is possible for a set and a proper subset of the set to have the same cardinality. For example, the cardinality of the set of natural numbers is the same as the cardinality of the set of rational numbers. This phenomenon causes major cognitive dissonance.

    Question 1331680 on Math Stack Exchange shows an example of this confusion. I have also discussed the problem with cardinality in the abstractmath.org section Cardinality.
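One concrete way to see that the rationals can be listed one after another is the Calkin-Wilf enumeration. This is a technique brought in purely for illustration; it is not taken from the posts cited above. The sketch assumes the recurrence $q_{n+1}=1/(2\lfloor q_n\rfloor-q_n+1)$ starting at $q_1=1$, which is known to list every positive rational exactly once. Zero and the negative rationals can then be interleaved to cover all of the rationals.

```python
from fractions import Fraction
from itertools import islice

def positive_rationals():
    """Yield the positive rationals in Calkin-Wilf order."""
    q = Fraction(1)
    while True:
        yield q
        floor_q = q.numerator // q.denominator
        q = 1 / (2 * floor_q - q + 1)

# Pair the natural numbers 1, 2, 3, ... with the rationals in this order.
first = list(islice(positive_rationals(), 7))
for n, q in enumerate(first, start=1):
    print(n, q)
```

Pairing the $n$th natural number with the $n$th fraction produced is the kind of one-to-one matching that the definition of “same cardinality” asks for.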

    Morphism in category theory

    The concept of category is defined by saying there is a bunch of objects called objects (sorry bout that) and a bunch of objects called morphisms, subject to certain axioms. One requirement is that there are functions from morphisms to objects choosing a “domain” and a “codomain” of each morphism. This is spelled out in Category Theory in Wikibooks, and in any other book on category theory.

    The concepts of morphism, domain and codomain in a category are therefore defined by abstract definitions, which means that any property of morphisms and their domains and codomains that is true in every category must follow from the axioms. However, the word “morphism” and the talk about domains and codomains naturally suggests to many students that a morphism must be a function, so they immediately and incorrectly expect to evaluate it at an element of its domain, or to treat it as a function in other ways.

    Example

    If $\mathcal{C}$ is a category, its opposite category $\mathcal{C}^{op}$ is defined this way:

    • The objects of $\mathcal{C}^{op}$ are the objects of $\mathcal{C}$.
    • A morphism $f:X\to Y$ of $\mathcal{C}^{op}$ is a morphism from $Y$ to $X$ of $\mathcal{C}$ (swap the domain and codomain).

    In Question 980933 on Math SE, the questioner is saying (among other things) that in $\text{Set}^{op}$, this would imply that there has to be a morphism from a nonempty set to the empty set. This of course is true, but the questioner is worried that you can’t have a function from a nonempty set to the empty set. That is also true, but what it implies is that in $\text{Set}^{op}$, the morphism from $\{1,2,3\}$ to the empty set is not a function from $\{1,2,3\}$ to the empty set. The morphism exists, but it is not a function. This does not in any sense make the definition of $\text{Set}^{op}$ incorrect.
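The definition of the opposite category is purely formal, which a small sketch can make visible. It assumes arrows are stored as a dictionary from names to (domain, codomain) pairs; the function `opposite` and the labels used are hypothetical. Composition is left out: in the opposite category it is the original composition taken in the reverse order.

```python
def opposite(arrows):
    """Swap the domain and codomain of every arrow; the arrows themselves are unchanged."""
    return {name: (c, d) for name, (d, c) in arrows.items()}

# A fragment of Set: the unique function from the empty set to {1, 2, 3}.
empty = frozenset()
s123 = frozenset({1, 2, 3})
set_fragment = {"empty_map": (empty, s123)}

op_fragment = opposite(set_fragment)
print(op_fragment["empty_map"])
```

In `op_fragment`, the arrow named `"empty_map"` has domain $\{1,2,3\}$ and codomain the empty set. It is still the same empty function as before; only the bookkeeping about which end counts as the domain has changed.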

    Student confusion like this tends to make the teacher want to have a one foot by six foot billboard in his classroom saying

    A MORPHISM DOESN’T HAVE TO BE A FUNCTION!

    However, even that statement causes confusion. The questioner who asked Question 1594658 essentially responded to the statement in purple prose above by assuming a morphism that is “not a function” must have two distinct values at some input!

    That questioner is still allowing the connotations of the word “morphism” to lead them to assume something that the definition of category does not give: that the morphism can evaluate elements of the domain to give elements of the codomain.

    So we need a more elaborate poster in the classroom:

    The definition of “category” makes no requirement
    that an object has elements
    or that morphisms evaluate elements.

    As was remarked long long ago, category theory is pointless.

    English words implementing logic

    There are lots of questions about logic that show that students really do not think that the definition of some particular logical construction can possibly be correct. That is why in the abstractmath.org chapter on definitions I inserted this purple prose:

    A definition is a totalitarian dictator.

    It is often the case that you can explain why the definition is worded the way it is, and of course when you can you should. But it is also true that the student has to grovel and obey the definition no matter how weird they think it is.

    Formula and term

    In logic you learn that a formula is a statement with variables in it, for example “$\exists x((x+5)^3\gt2)$”. The expression “$(x+5)^3$” is not a formula because it is not a statement; it is a “term”. But in English, $H_2O$ is a formula, the formula for water. As a result, some students have a remarkably difficult time understanding the difference between “term” and “formula”. I think that is because those students don’t really believe that the definition must be taken seriously.

    Exclusive or

    Question 804250 in MathSE says:

    “Consider $P$ and $Q$. Let $P+Q$ denote exclusive or. Then if $P$ and $Q$ are both true or are both false then $P+Q$ is false. If one of them is true and one of them is false then $P+Q$ is true. By exclusive or I mean $P$ or $Q$ but not both. I have been trying to figure out why the truth table is the way it is. For example if $P$ is true and $Q$ is true then no matter what would it be true?”

    I believe that the questioner is really confused by the plus sign: $P+Q$ ought to be true if $P$ and $Q$ are both true because that’s what the plus sign ought to mean.

    Yes, I know this is about a symbol instead of an English word, but I think the difficulty has the same dynamics as the English-word examples I have given.

    If I have understood this difficulty correctly, it is similar to the difficulty of students who want to know why $1$ is not a prime number. In that case, there is a good explanation.
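The truth table the questioner describes is short enough to generate. The sketch below also checks one reading under which the plus sign is reasonable, namely addition modulo 2. That reading is an aside of this sketch, not something the question mentions.

```python
from itertools import product

def xor(p, q):
    """Exclusive or: true when exactly one of p and q is true."""
    return p != q

table = [(p, q, xor(p, q)) for p, q in product([True, False], repeat=2)]
for row in table:
    print(row)

# Reading True as 1 and False as 0, exclusive or is addition modulo 2.
assert all(result == bool((int(p) + int(q)) % 2) for p, q, result in table)
```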

    Only if

    The phrase “only if” simply does not mean the same thing in math as it does in English. In Question 17562 in MathSE, a reader asks the question, why does “$P$ only if $Q$” mean the same as “if $P$ then $Q$” instead of “if $Q$ then $P$”?

    Many answerers wasted a lot of time trying to convince us that “$P$ only if $Q$” means the same as “if $P$ then $Q$” in ordinary English, when in fact it does not. That’s because in English, clauses involving “if” usually connote causation, which does not happen in math English.

    Consider these two pairs of examples.

    1. “I take my umbrella only if it is raining.”
    2. “If I take my umbrella, then it is raining.”
    3. “I flip that switch only if a light comes on.”
    4. “If I flip that switch, a light comes on.”

    The average non-mathematical English speaker will easily believe that (1) and (4) are true, but will balk at (2) and (3). To me, (3) means that the light coming on makes me flip the switch. (2) is more problematical, but it does (to me) have a feeling of causation going the wrong way. It is this difference that causes students to balk at the equivalence in math of “$P$ only if $Q$” and “If $P$, then $Q$”. In math, there is no such thing as causation, and the truth tables for implication force us to live with the fact that these two sentences mean the same thing.
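The equivalence can be confirmed by comparing truth tables. The sketch assumes one particular reading of “$P$ only if $Q$”, namely “it is not the case that $P$ holds while $Q$ fails”. The function names are made up for the illustration.

```python
from itertools import product

def implies(p, q):
    """Truth table of 'if P then Q'."""
    return (not p) or q

def only_if(p, q):
    """'P only if Q', read as: P never holds while Q fails."""
    return not (p and not q)

rows = [(p, q, implies(p, q), only_if(p, q))
        for p, q in product([True, False], repeat=2)]
for row in rows:
    print(row)
```

The last two entries agree in every row. The table has no column for causation, so whichever way the English sentence seems to point, the two statements cannot be told apart in math.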

    Henning Makholm’s answer to Question 17562 begins this way: “I don’t think there’s really anything to understand here. One simply has to learn as a fact that in mathematics jargon the words ‘only if’ invariably encode that particular meaning. It is not really forced by the everyday meanings of ‘only’ and ‘if’ in isolation; it’s just how it is.” That is the best way to answer the question. (Other answerers besides Makholm said something similar.)

    I have also discussed this difficulty (and other difficulties with logic) in the abmath section on “only if”.

    This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.