Tag Archives: monoid

A slow introduction to category theory

Category theory turns math inside-out. Definitions depend on nothing inside, but on everything outside. — John Cook

About this post

This is a draft of the first part of an article on category theory that will be posted on abstractmath.org. It replaces an earlier version that was posted in June, 2016.

During the last year or so, I have been monitoring the category theory questions on Math Stack Exchange. Some of the queries are clearly from people who do not have enough of a mathematical background to understand basic abstract reasoning, for example the importance of definitions and the difficulties described in the abmath artice on Dysfunctional attitudes and behaviors. Category theory has become important in several fields outside mathematics, for example computer science and database theory.

This article is intended to get people started in category theory by giving a very detailed definition of “category” and some examples described in detail with an emphasis on how the example fits the definition of category. That’s all the present version does, but I intend to add some examples of constructions and properties such as the dual category, product, and other concepts that some of the inquirers on Math Stack Exchange had great difficulty with.

There is no way in which this article is a proper introduction to category theory. It is intended only to give beginners some help over the initial steps of understanding the subject, particularly the aspects of understanding that cause many hopeful math majors to fall off the Abstraction Cliff.

About categories

To be written.

Definition of category

A category is a type of Mathematical structure consisting of two types of data, whose relationships are entirely determined by some axioms. After the definition is complete, I introduce several example categories with a detailed discussion of each one, explaining how they fit the definition of category.

Axiom 1: Data

A category consists of two types of data: objects and arrows.

Notes for Axiom 1

  • You will see in the section on Examples of categories that every definition of a category $\mathsf{C}$ starts by specifying what the objects of $\mathsf{C}$ are and what the arrows of $\mathsf{C}$ are. That is what Axiom I requires.
  • An object of a category can be any kind of mathematical object. It does not have to be a set and it does not have to have elements.
  • Arrows of a category are also called morphisms. You may be familiar with “homomorphisms”, “homeomorphisms” or “isomorphisms”, all of which are functions. This does not mean that a “morphism” in an arbitrary category is a function.

Axiom 2: Domain and codomain

Each arrow of a category has a domain and a codomain, each of which is an object of the category.

Notes for Axiom 2

  • The domain and the codomain of an arrow may or may not be the same object.
  • Each arrow has only one domain and only one codomain.
  • If $f$ is an arrow with domain $A$ and codomain $B$, that fact is typically shown either by the notation “$f:A\to B$” or by a diagram like this:
  • Warning: The notation “$f:A\to B$” is like that used for functions. This notation may be used in any category, but it does not imply that $f$ is a function or that $A$ and $B$ have elements.
  • For an arrow $f:A\to B$, the notation “$\text{dom}(f)$” refers to $A$ and “$\text{cod}(f)$” refers to $B$.
  • For a given category $\mathsf{C}$, the collection of all the arrows with domain $A$ and codomain $B$ may be denoted by
    • “$\text{Hom}(A,B)$” or
    • “$\text{Hom}_\mathsf{C}(A,B)$” or
    • “$\mathsf{C}(A,B)$”.

  • Some newer books and articles in category theory use the name source for domain and target for codomain. This usage has the advantage that a newcomer to category theory will be less likely to think of an arrow as a function.

Axiom 3: Composition

If $f$ and $g$ are arrows in a category for which $\text{cod}(f)=\text{dom}(g)$, as in this diagram:

then there is a unique arrow with domain $A$ and codomain $C$ called the composite of $f$ and $g$.

Notes for Axiom 3

  • The unique arrow required by Axiom 3 may be denoted by “$g\circ f$” or “$gf$”. “$g\circ f$” is more explicit, but “$gf$” is much more commonly used by category theorists.
  • Many constructions in categories may be shown by diagrams, like the one used just above.
  • The diagram

    is said to commute if $h=g\circ f$.

  • It is useful to think of $f$ followed by $g$ as a path in the diagram. Then a metaphor for composition is: Every path of length 2 has exactly one composite.
  • It is customary in some texts in category theory to indicate that a diagram commutes by putting a gyre in the middle:
  • Note that the composite of the path that I described as “$f$ followed by $g$” is written as “$g\circ f$” or “$gf$”, which seems backward. Nevertheless, the most common notation in category theory for the composite of “$f$ followed by $g$” is $gf$. Some authors in computer science write “$f;g$” for “$gf$” to get around this problem.
  • The concept of category is an abstraction of the idea of function, and the composition of arrows is an abstraction of the composition of functions. It uses the same notation, “$g\circ f$”. If $f$ and $g$ are set functions, then for an element $x$ in the domain of $f$, \[(g\circ f)(x)=g(f(x))\]
  • But in arbitrary category, it may make no sense to evaluate an arrow $f$ at some element $x$; indeed, the domain of $f$ may not have elements at all, and then the statement “$(g\circ f)(x)=g(f(x))$” is meaningless.

Axiom 4: Identity arrows

Note: WordpPress does not recognize the html command

    . Axiom 1 should be 4a, Axiom 2 4b Axiom 3 4c and Axiom 4 4d.

  1. For each object $A$ of a category, there is an arrow denoted by $\mathsf{id}_A$.
  2. $\textsf{dom}(\textsf{id}_A)=A$ and $\textsf{cod}(\textsf{id}_A)=A$.
  3. For any object $B$ and any arrow $f:B\to A$, the diagram


  4. For any object $C$ and any arrow $g:A\to C$, the diagram


Notes for Axiom 4

  • The fact stated in Axiom 4(b) could be shown diagrammatically either as

    or as

  • Facts (c) and (d) can be written in algebraic notation: For any arrow $f$ going to $A$,\[\textsf{id}_A\circ f=f\]and for any arrow $g$ coming from $A$,\[g\circ \textsf{id}_A=g\]
  • There may be many arrows with domain and codomain both equal to $A$ (for example in the category $\mathsf{Set}$), but only one of them is $\textsf{id}_A$. It can be proved that $\textsf{id}_A$ is the unique arrow satisfying both (c) and (d) of the axiom.

Axiom 5: Associativity

  1. If $f$, $g$ and $h$ are arrows in a category for which $\text{cod}(f)=\text{dom}(g)$ and $\text{cod}(g)=\text{dom}(h)$, as in this diagram:

    then there is a unique arrow $k$ with domain $A$ and codomain $D$ called the composite of $f$, $g$ and $h$.

  2. In the diagram below, the two triangles containing $k$ must both commute.

Notes for Axiom 5

  • Axiom 5b requires that \[h\circ(g\circ f)=(h\circ g)\circ f\](which both equal $k$), which is the usual algebraic notation for associativity.
  • Note that the top two triangles commute by Axiom 3.
  • The associativity axiom means that we can get rid of parentheses and write \[k=h\circ f\circ g\]just as we do for addition and multiplication of numbers.
  • In my opinion the notation using categorical diagrams communicates information much more clearly than algebraic notation does. In particular, you don’t have to remember the domains and codomains of the functions — they appear in the picture. I admit that diagrams take up much more space, but now that we read math stuff on a computer screen instead of on paper, space is free.

Examples of categories

For these examples, I give a detailed explanation about how they fit the definition of category.

Example 1: MyFin

This first example is a small, finite category which I have named $\mathsf{MyFin}$ (“my finite category”). It is not at all an important category, but it has advantages as a first example.

  • It’s small enough that you can see all the objects and arrows on the screen at once, but big enough not to be trivial.
  • The objects and arrows have no properties other than being objects and arrows. (Some of the other examples involve familiar math objects.)
  • So in order to check that $\mathsf{MyFin}$ really obeys the axioms for a category, you can use only the skeletal information given here. As a result, you must really understand the axioms!

A correct proof will be based on axioms and theorems.
The proof can be suggested by your intuitions,
but intuitions are not enough.
When working with $\mathsf{MyFin}$ you won’t have any intuitions!

A diagram for $\mathsf{MyFin}$

This diagram gives a partial description of $\mathsf{MyFin}$.

Now let’s see how to make the diagram above into a category.

Axiom 1: Data

  • The objects of $\mathsf{MyFin}$ are $A$, $B$, $C$ and $D$.
  • The arrows are $f$, $g$, $h$, $j$, $k$, $r$, $s$, $u$, $v$, $w$ and $x$.
  • You can regard the letters just listed as names of the objects and arrows. The point is that at this stage all you know about the objects and arrows are their names.
  • If you prefer, you can think of the arrows as the actual arrows shown in the $\mathsf{MyFin}$ diagram.
  • Our definition of $\mathsf{MyFin}$ is an abstract definition. You may have seen multiplication tables of groups given in terms of undefined letters. (If you haven’t, don’t worry.) Those are also abstract definitions.
  • Our other definitions of categories involve math objects you actually know something about.

Axiom 2: Domain and Codomain

  • The domains and codomains of the arrows are shown by the diagram above.
  • For example, $\text{dom}(r)=A$ and $\text{cod}(r)=C$, and $\text{dom}(v)=\text{cod}(v)=B$.

Axiom 3: Composition

Showing the $\mathsf{MyFin}$ diagram does not completely define $\mathsf{MyFin}$. We must say what the composites of all the paths of length 2 are.

  • In fact, most of them are forced, but two of them are not.
  • We must have $g\circ f=r$ because $r$ is the only arrow possible for the composite, and Axiom 3 requires that every path of length 2 must have a composite.
  • For the same reason, $h\circ g=s$.
  • All the paths involving $u$, $v$, $w$ and $x$ are forced:

  • (p1) $u\circ u=u$, $v\circ v=v$, $w\circ w=w$ and $x\circ x=x$.
  • (p2) $f\circ u=f$, $r\circ u=r$, $j\circ u=j$ and $k\circ u=k$. You can see that, for example, $f\circ u=f$ by opening up the loop on $f$ like this:

    There is only one arrow going from $A$ to $B$, namely$f$, so $f$ has to be the composite $f\circ u$.

  • (p3) $v\circ f=f$, $g\circ v=g$ and $s\circ v=s$.
  • (p4) $w\circ g=g$, $w\circ r=r$ and $h\circ w=h$.
  • (p5) $x\circ h=h$, $x\circ s=s$, $x\circ j=j$ and $x\circ k=k$.
  • For $s\circ f$ and $h\circ r$, we have to choose between $j$ and $k$ as composites. Since $s\circ f=(h\circ g)\circ f$ and $h\circ r=h\circ (g\circ f)$, Axiom 3 requires that we must chose one of $j$ and $k$ to be both composites.

    Definition: $s\circ f=h\circ r=j$.

    If we had defined $s\circ f=h\circ r=k$ we would have a different category, although one that is “isomorphic” to $\mathsf{MyFin}$ (you have to define “isomorphic” or look it up.)

Axiom 4: Identity arrows

Axiom 5: Associativity

  • Since we have already required both $(h\circ g)\circ f$ and $h\circ(g\circ f)$ to be $k$, composition is associative.

Example 2: IntegerDiv

  • This example uses familiar mathematical objects — positive integers.
  • The arrows are not functions that can be applied to elements, since integers do not have elements.

Axiom 1: Data

  • The objects of IntegerDiv are all the positive integers.
  • Suppose $m$ and $n$ are positive integers:
  • If $m$ divides $n$, there is exactly one arrow from $m$ to $n$. I will call this arrow $\textsf{mdn}$. (This is my notation. There is no standard notation for this category.) For example there is one arrow from $2$ to $6$, denoted by $\textsf{2d6}$.
  • If $m$ does not divide $n$, there is no arrow from $m$ to $n$.

Axiom 2: Domain and codomain

The arrow denoted by $\textsf{mdn}$ has domain $m$ and codomain $n$.




which may also be shown as

Axiom 3: Composition

The composite of

must be $\textsf{rdt}$, since that is the only arrow with domain $r$ and codomain $t$.

This fact can also be written this way: \[\mathsf{sdt}\circ\textsf{rds}=\textsf{rdt}\]

Axiom 4: Identity arrows

The composites


must commute since the arrows shown are the only possible arrows with the domains and codomains shown. In other words, $\textsf{id}_\textsf{r}=\textsf{rdr}$ and $\textsf{id}_\textsf{s}=\textsf{sds}$.

Axiom 5: Associativity

In the diagram below,

there is only one arrow from one integer to another, so $\textsf{k}$ must be both \[\textsf{tdu}\circ(\textsf{sdt}\circ\textsf{rds})\] and \[(\textsf{tdu}\circ\textsf{sdt})\circ\textsf{rds}\] as required.

Example 3: The category of Sets

In this section, I define the category $\mathsf{Set}$ (that is standard terminology in category theory.) This example will be very different from $\mathsf{MyFin}$, because it involves known mathematical objects — sets and functions.

Axiom 1: Data

  • Every set is an object of $\mathsf{Set}$ and nothing else is.
  • Every function between sets is an arrow of $\mathsf{Set}$ and nothing else is an arrow of $\mathsf{Set}$.

Axiom 2: Domain and codomain

For a given function $f$, $\text{dom}(f)$ is the domain of the function $f$ in the usual sense, and $\text{cod}(f)$ is the codomain of $f$ in the usual sense. (See Functions: specification and definition for more about domain and codomain.)


  • Let $f:\mathbb{R}\to\mathbb{R}$ be the function defined by $f(x):=x^2$. Then the arrow $f$ in $\mathsf{Set}$ satisfies $\text{dom}(f)= \mathbb{R}$ and also $\text{cod}(f)=\mathbb{R}$.
  • Let $j:\{1,2,3\}\to\{1,2,3,4\}$ be defined by $j(1):=1$, $j(2):=4$ and $j(3):=3$. Then $\text{dom}(j)=\{1,2,3\}$ and $\text{cod}(j)=\{1,2,3,4\}$.

Axiom 3: Composition

The composite of $f:A\to B$ and $g:B\to C$ is the function $g\circ f:A\to C$ defined by \[\text{(DC)}\,\,\,\,\,\,\,\,\,\,(g\circ f)(a):=g(f(a))\]


Many other categories have a similar definition of composition, including categories whose objects are math structures with underlying sets and whose arrows are structure-preserving functions between the underlying sets. But be warned: There are many useful categories whose arrows do not evaluate at an element of an object because the objects don’t have elements. In that case, (DC) is meaningless. This is true of $\mathsf{MyFin}$ and $\mathsf{IntegerDiv}$.

Axiom 4: Identity arrows

For a set $A$, the identity arrow $\textsf{id}_A:A\to A$ is, as you might expect, the identity function defined by $\textsf{id}_A(a)=a$ for every $a\in A$. We must prove that these diagrams commute:

The calculations below show that they commute. They use the definition of composite given by (DC).

  • For any $b\in B$, \[(\textsf{id}_A\circ f)(b)=\textsf{id}_A(f(b))=f(b)\]
  • For any $a\in A$, \[(g\circ \textsf{id}_A)(a)=g(\textsf{id}_A(a))=g(a)\]

Note: In $\mathsf{Set}$, there are generally many arrows from a particular set $S$ to itself (for example there are $4$ from $\{1,2\}$ to itself), but only one is the identity arrow.

Axiom 5: Associativity

Composition of arrows in $\mathsf{Set}$ is associative because function composition is associative. Suppose we have functions as in this diagram:

We must show that the two triangles containing $k$ in this diagram commute:

In algebraic notation, this requires showing that for every element $a\in A$,\[(h\circ(g\circ f))(a))=((h\circ g)\circ f)(a)\]

The calculation below does that. It makes repeated use of Definition (DC) of composition. For any $a\in A$,\[\begin{equation}
\big(h\circ (g\circ f)\big)(a)
& = h\big((g\circ f)(a)\big) \\
& = h\big(g(f(a))\big) \\
& = (h\circ g)(f(a)) \\
& = \big((h\circ g)\circ f\big)(a)

Example 4: The category of Monoids

  • This definition makes repeated use of the fact that a homomorphism of monoids is a set function. Specifically, if $(S,\Delta)$ and $(T,\nabla)$ are monoids with identities $e_S$ and $e_{T}$, a homomorphism $h:S\to T$ must be a set function that satisfies the following two axioms: \[\text{(ME)}\,\,\,\,\,\,\,\,h(e_S)=e_T\] and for all elements $s, s’$ of $S$, \[\text{(MM)}\,\,\,\,\,\,\,\,h(s\Delta s’)=h(s)\nabla h(s’)\]
  • The category of monoids may be denoted by $\mathsf{Mon}$.

Axiom 1: Data

  • Every monoid is an object of the category of monoids, and nothing else is.
  • If $f$ is a homomorphism of monoids, then $f$ is an arrow of the category of monoids, and nothing else is.

Axiom 2: Domain and codomain

If $(S,\Delta)$ and $(T,\nabla)$ are monoids and $f:(S,\Delta)\to(T,\nabla)$ is a homomorphism of monoids, then the domain of $f$ is $(S,\Delta)$ and the codomain of $f$ is $(T,\nabla)$.


  • Since $f$ takes elements of the set $S$ to elements of the set $T$, it is also an arrow in the category $\mathsf{Set}$. In general, very few functions from $S$ to $T$ will be monoid homomorphisms from $(S,\Delta)$ to $(T,\nabla)$.
  • Many authors do not distinguish between $f$ regarded as an arrow of $\mathsf{Mon}$ and $f$ regarded as an arrow of $\mathsf{Set}$. Others may write $U(f)$ for the arrow in $\mathsf{Set}$. “$U$” stands for “underlying functor“, also called “forgetful functor”.

Axiom 3: Composition

The composite of

is the composite $g\circ f$ as set functions:

It is necessary to check that $g\circ f$ is a monoid homomorphism. The following calculation shows that it preserves the monoid operation; it makes repeated use of equations (DC) and (MM).

The calculation: For elements $r$ and $r’$ of $R$,\[\begin{align*}
(g\circ f)(r\,{\scriptstyle \square}\, r’)
&=g\left(f(r\, {\scriptstyle \square}\, r’)\right)\,\,\,\,\,\text{(DC)}\\ &=g\left(f(r) {\scriptstyle\, \Delta}\, f(r’)\right)\,\,\,\,\,\text{(MM)}\\
&=g(f(r)){\scriptstyle \,\nabla}\, g(f(r’))\,\,\,\,\text{(MM)}\\
&=(g\circ f)(r){\scriptstyle \,\nabla}\,(g\circ f)(r’)\,\,\,\,\,\text{(DC)}

The fact that $g\circ f$ preserves the identity of the monoid is shown in the next section.

Axiom 4: Identity arrows

For a monoid $(S,\Delta)$, the identity function $\text{id}_S:S\to S$ preserves the monoid operation $\Delta$, because $\text{id}_S(s\Delta s’)=s\Delta s’$ by definition of the identity function, and that is $\text{id}_S(s)\Delta \text{id}_S(s’)$ for the same reason.

The required diagrams below must commute because the set functions commute and, by Axiom 3, the set composition of a monoid homomorphism is a monoid homomorphism.

We also need to show that $g\circ f$ as in

preserves identities. This calculation proves that it does; it uses (DC) and (ME)

(g\circ f)(\text{id}_R)

Axiom 5: Associativity

The diagram

in the category $\mathsf{Set}$ commutes, so the diagram

must also commute.


All these references are available on line.

  Creative Commons License        

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.

Send to Kindle

Naming mathematical objects

Commonword names confuse

Many technical words and phrases in math are ordinary English words ("commonwords") that are assigned a different and precisely defined mathematical meaning.  

  • Group  This sounds to the "layman" as if it ought to mean the same things as "set".  You get no clue from the name that it involves a binary operation with certain properties.  
  • Formula  In some texts on logic, a formula is a precisely defined expression that becomes a true-or-false sentence (in the semantics) when all its variables are instantiated.  So $(\forall x)(x>0)$ is a formula.  The word "formula" in ordinary English makes you think of things like "$\textrm{H}_2\textrm{O}$", which has no semantics that makes it true or false — it is a symbolic expression for a name.
  • Simple group This has a technical meaning: a group with no nontrivial normal subgroup.  The Monster Group is "simple".  Yes, the technical meaning is motivated by the usual concept of "simple", but to say the Monster Group is simple causes cognitive dissonance.

Beginning students come with the (generally subconscious) expectation that they will pick up clues about the meanings of words from connotations they are already familiar with, plus things the teacher says using those words.  They think in terms of refining an understanding they already have.  This is more or less what happens in most non-math classes.  They need to be taught what definition means to a mathematician.

Names that don't confuse but may intimidate

Other technical names in math don't cause the problems that commonwords cause.

Named after somebody The phrase "Hausdorff space" leads a math student to understand that it has a technical meaning.  They may not even know it is named after a person, but it screams "geek word" and "you don't know what it means".  That is a signal that you can find out what it means.  You don't assume you know its meaning. 

New made-up words  Words such as "affine", "gerbe"  and "logarithm" are made up of words from other languages and don't have an ordinary English meaning.  Acronyms such as "QED", "RSA" and "FOIL" don't occur often.  I don't know of any math objects other than "RSA algorithm" that have an acronymic name.  (No doubt I will think of one the minute I click the Publish button.)  Whole-cloth words such as "googol" are also rare.  All these sorts of words would be good to name new things since they do not fool the readers into thinking they know what the words mean.

Both types of words avoid fooling the student into thinking they know what the words mean, but some students are intimidated by the use of words they haven't seen before.  They seem to come to class ready to be snowed.  A minority of my students over my 35 years of teaching were like that, but that attitude was a real problem for them.


You can write for several different audiences.

Math fans (non-mathematicians who are interested in math and read books about it occasionally) In my posts Explaining higher math to beginners and in Renaming technical conceptsI wrote about several books aimed at explaining some fairly deep math to interested people who are not mathematicians.  They renamed some things. For example, Mark Ronan in Symmetry and the Monster used the phrase "atom" for "simple group" presumably to get around the cognitive dissonance.  There are other examples in my posts.  

Math newbies  (math majors and other students who want to understand some aspect of mathematics).  These are the people abstractmath.org is aimed at. For such an audience you generally don't want to rename mathematical objects. In fact, you need to give them a glossary to explain the words and phrases used by people in the subject area.   

Postsecondary math students These people, especially the math majors, have many tasks:

  • Gain an intuitive understanding of the subject matter.
  • Understand in practice the logical role of definitions.
  • Learn how to come up with proofs.
  • Understand the ins and outs of mathematical English, particularly the presence of ordinary English words with technical definitions.
  • Understand and master the appropriate parts of the symbolic language of math — not just what the symbols mean but how to tell a statement from a symbolic name.

It is appropriate for books for math fans and math newbies to try to give an understanding of concepts without necessary proving theorems.  That is the aim of much of my work, which has more an emphasis on newbies than on fans. But math majors need as well the traditional emphasis on theorem and proof and clear correct explanations.

Lately, books such as Visual Group Theory have addressed beginning math majors, trying for much more effective ways to help the students develop good intuition, as well as getting into proofs and rigor. Visual Group Theory uses standard terminology.  You can contrast it with Symmetry and the Monster and The Mystery of the Prime Numbers (read the excellent reviews on Amazon) which are clearly aimed at math fans and use nonstandard terminology.  

Terminology for algebraic structures

I have been thinking about the section of Abstracting Algebra on binary operations.  Notice this terminology:


The "standard names" are those in Wikipedia.  They give little clue to the meaning, but at least most of them, except "magma" and "group", sound technical, cluing the reader in to the fact that they'd better learn the definition.

I came up with the names in the right column in an attempt to make some sense out of them.  The design is somewhat like the names of some chemical compounds.  This would be appropriate for a text aimed at math fans, but for them you probably wouldn't want to get into such an exhaustive list.

I wrote various pieces meant to be part of Abstracting Algebra using the terminology on the right, but thought better of it. I realized that I have been vacillating between thinking of AbAl as for math fans and thinking of it as for newbies. I guess I am plunking for newbies.

I will call groups groups, but for the other structures I will use the phrases in the middle column.  Since the book is for newbies I will include a table like the one above.  I also expect to use tree notation as I did in Visual Algebra II, and other graphical devices and interactive diagrams.


In the sixties magmas were called groupoids or monoids, both of which now mean something else.  I was really irritated when the word "magma" started showing up all over Wikipedia. It was the name given by Bourbaki, but it is a bad name because it means something else that is irrelevant.  A magma is just any binary operation. Why not just call it that?  

Well, I will tell you why, based on my experience in Ancient Times (the sixties and seventies) in math. (I started as an assistant professor at Western Reserve University in 1965). In those days people made a distinction between a binary operation and a "set with a binary operation on it".  Nowadays, the concept of function carries with it an implied domain and codomain.  So a binary operation is a function $m:S\times S\to S$.  Thinking of a binary operation this way was just beginning to appear in the common mathematical culture in the late 60's, and at least one person remarked to me: "I really like this new idea of thinking of 'plus' and 'times' as functions."  I was startled and thought (but did not say), "Well of course it is a function".  But then, in the late sixties I was being indoctrinated/perverted into category theory by the likes of John Isbell and Peter Hilton, both of whom were briefly at Case Western Reserve University.  (Also Paul Dedecker, who gave me a glimpse of Grothendieck's ideas).

Now, the idea that a binary operation is a function comes with the fact that it has a domain and a codomain, and specifically that the domain is the Cartesian square of the codomain.  People who didn't think that a binary operation was a function had to introduce the idea of the universe (universal algebraists) or the underlying set (category theorists): you had to specify it separately and introduce terminology such as $(S,\times)$ to denote the structure.   Wikipedia still does it mostly this way, and I am not about to start a revolution to get it to change its ways.


In the olden days, people thought of groups in this way:

  • A group is a set $G$ with a binary operation denoted by juxtaposition that is closed on $G$, meaning that if $a$ and $b$ are any elements of $G$, then $ab$ is in $G$.
  • The operation is associative, meaning that if $a,\ b,\ c\in G$, then $(ab)c=a(bc)$.
  • The operation has a unity element, meaning an element $e$ for which for any element $a\in G$, $ae=ea=a$.
  • For each element $a\in G$, there is an element $b$ for which $ab=ba=e$.

This is a better way to describe a group:

  • A group consist of a nullary operation e, a unary operation inv,  and a binary operation denoted by juxtaposition, all with the same codomain $G$. (A nullary operation is a map from a singleton set to a set and a unary operation is a map from a set to itself.)
  • The value of e is denoted by $e$ and the value of inv$(a)$ is denoted by $a^{-1}$.
  • These operations are subject to the following equations, true for all $a,\ b,\ c\in G$:


    • $ae=ea=a$.
    • $aa^{-1}=a^{-1}a=e$.
    • $(ab)c=a(bc)$.

This definition makes it clear that a group is a structure consisting of a set and three operations whose axioms are all equations.  It was formulated by people in universal algebra but you still see the older form in texts.

The old form is not wrong, it is merely inelegant.  With the old form, you have to prove the unity and inverses are unique before you can introduce notation, and more important, by making it clear that groups satisfy equational logic you get a lot of theorems for free: you construct products on the cartesian power of the underlying set, quotients by congruence relations, and other things. (Of course, in AbAl those theorem will be stated later than when groups are defined because the book is for newbies and you want lots of examples before theorems.)


  1. Three kinds of mathematical thinkers (G&G post)
  2. Technical meanings clash with everyday meanings (G&G post)
  3. Commonword names for technical concepts (G&G post)
  4. Renaming technical concepts (G&G post)
  5. Explaining higher math to beginners (G&G post)
  6. Visual Algebra II (G&G post)
  7. Monads for high school II: Lists (G&G post)
  8. The mystery of the prime numbers: a review (G&G post)
  9. Hersh, R. (1997a), "Math lingo vs. plain English: Double entendre". American Mathematical Monthly, volume 104, pages 48–51.
  10. Names (in abmath)
  11. Cognitive dissonance (in abmath)
Send to Kindle