Tag Archives: group

Introducing abstract topics

I have been busy for the past several years revising abstractmath.org (abmath). Now I believe, perhaps foolishly, that most of the articles in abmath have reached beta, so now it is time for something new.

For some time I have been considering writing introductions to topics in abstract math, some typically studied by undergraduates and some taken by scientists and engineers. The topics I have in mind to do first include group theory and category theory.

The point of these introductions is to get the student started at the very beginning of the topic, when some students give up in total confusion. They meet and fall off of what I have called the abstraction cliff, which is discussed here and also in my blog posts Very early difficulties and Very early difficulties II.

I may have stolen the phrase “abstraction cliff” from someone else.

Group theory

Group theory sets several traps for beginning students.

Multiplication table

  • A student may balk when a small finite group is defined using a set of letters in a multiplication table.
    “But you didn’t say what the letters are or what the multiplication is?”
  • Such a definition is an abstract definition, in contrast to the definition of “prime”, for example, which is stated in terms of already known entities, namely the integers.
  • The multiplication table of a group tells you exactly what the binary operation is and any set with an operation that makes such a table correct is an example of the group being defined.
  • A student who has no understanding of abstraction is going to be totally lost in this situation. It is quite possible that the professor has never even mentioned the concept of abstract definition. The professor is probably like most successful mathematicians: when they were students, they understood abstraction without having to have it explained, and possibly without even noticing they did so.
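
As a minimal sketch of what such an abstract definition amounts to, the Python below takes a table on three meaningless letters and checks the group axioms directly from the table. The letters `e`, `a`, `b`, the table itself and the function name `is_group` are illustrative assumptions, not taken from any particular textbook. The last lines show one set with an operation (addition mod 3) that makes the table correct.

```python
from itertools import product

# Hypothetical table: the letters are bare symbols with no other meaning.
ELEMENTS = ["e", "a", "b"]
TABLE = {
    ("e", "e"): "e", ("e", "a"): "a", ("e", "b"): "b",
    ("a", "e"): "a", ("a", "a"): "b", ("a", "b"): "e",
    ("b", "e"): "b", ("b", "a"): "e", ("b", "b"): "a",
}

def is_group(elements, table):
    """Check closure, associativity, identity and inverses from the table alone."""
    closed = all(table[(x, y)] in elements for x, y in product(elements, repeat=2))
    associative = all(
        table[(table[(x, y)], z)] == table[(x, table[(y, z)])]
        for x, y, z in product(elements, repeat=3)
    )
    identities = [e for e in elements
                  if all(table[(e, x)] == x and table[(x, e)] == x for x in elements)]
    if not (closed and associative and identities):
        return False
    e = identities[0]
    return all(any(table[(x, y)] == e and table[(y, x)] == e for y in elements)
               for x in elements)

# One concrete model of the table: addition mod 3 under e->0, a->1, b->2.
MODEL = {"e": 0, "a": 1, "b": 2}
model_matches = all(
    MODEL[TABLE[(x, y)]] == (MODEL[x] + MODEL[y]) % 3
    for x, y in product(ELEMENTS, repeat=2)
)

print(is_group(ELEMENTS, TABLE), model_matches)  # True True
```

Nothing in `is_group` asks what the letters "are"; the table is the whole definition.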


Cosets

  • Cosets are a real killer. Some students at this stage are nowhere near thinking of a set as an object or a thing. The concept of applying a binary operation on a pair of sets (or any other mathematical objects with internal structure) is completely foreign to them. Did anyone ever talk to them about mathematical objects?
  • The consequence of this early difficulty is that such a student will find it hard to understand what a quotient group is, and that is one of the major concepts you get early in a group theory course.
  • The conceptual problems with multiplication of cosets are similar to those with pointwise addition of functions. Given two functions $f,g:\mathbb{R}\to\mathbb{R}$, you define $f+g$ to be the function \[(f+g)(x):=f(x)+g(x)\] Along with pointwise multiplication, this makes the space of functions $\mathbb{R}\to\mathbb{R}$ a ring with nice properties.
  • But you have to understand that each element of the ring is a function thought of as a single math object. The values of the function are properties of the function, but they are not elements of the ring. (You can include the real numbers in the ring as constant functions, but don’t confuse me with facts.)
  • Similarly the elements of the quotient group are math objects called cosets. They are not elements of the original group. (To add to the confusion, they are also blocks of a congruence.)
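
The two ideas above can be sketched in Python, where both a function and a set can be handled as a single value. The example group (the integers mod 6 under addition), the subgroup `H` and the helper names `add_functions`, `coset` and `add_cosets` are assumptions chosen for illustration.

```python
def add_functions(f, g):
    """Pointwise sum: the result is itself one function object."""
    return lambda x: f(x) + g(x)

G = range(6)               # the integers mod 6 under addition
H = frozenset({0, 3})      # a subgroup of G

def coset(a):
    """The coset a + H, treated as a single object (a frozenset)."""
    return frozenset((a + h) % 6 for h in H)

def add_cosets(A, B):
    """Add two cosets by adding their members pairwise."""
    return frozenset((a + b) % 6 for a in A for b in B)

cosets = {coset(a) for a in G}

# The sum of two cosets is again exactly one coset.
closed = all(add_cosets(A, B) in cosets for A in cosets for B in cosets)

print(sorted(sorted(c) for c in cosets))  # [[0, 3], [1, 4], [2, 5]]
print(closed)                             # True
```

The quotient group here has three elements, each of which is a set of two integers; none of them is an element of the original group.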

Isomorphic groups

  • Many books, and many professors (including me) regard two isomorphic groups as the same. I remember getting anguished questions: “But the elements of $\mathbb{Z}_2$ are equivalence classes and the elements of the group of permutations of $\{1,2\}$ are functions.”
  • I admit that regarding two isomorphic groups as the same needs to be treated carefully when, unlike $\mathbb{Z}_2$, the group has a nontrivial automorphism group. ($\mathbb{Z}_3$ is “the same as itself” in two different ways.) But you don’t have to bring that up the first time you attack that subject, any more than you have to bring up the fact that the category of sets does not have a set of objects on the first day you define categories.
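
The remark about automorphisms can be checked by brute force. This is only a sketch: the function name `automorphisms` is hypothetical, and $\mathbb{Z}_n$ is modeled as the integers $0,\dots,n-1$ under addition mod $n$.

```python
from itertools import permutations

def automorphisms(n):
    """All bijections f on {0,...,n-1} with f((a+b) % n) == (f(a)+f(b)) % n."""
    elems = list(range(n))
    found = []
    for images in permutations(elems):
        f = dict(zip(elems, images))
        if all(f[(a + b) % n] == (f[a] + f[b]) % n for a in elems for b in elems):
            found.append(f)
    return found

print(len(automorphisms(2)))  # 1
print(len(automorphisms(3)))  # 2
```

The single result for $n=2$ is the identity map; the two results for $n=3$ are the identity and the map that swaps $1$ and $2$.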

Category theory

Category theory causes similar troubles. Beginning college math majors don’t usually meet it early. But category theory has begun to be used in other fields, so plenty of computer science students, people dealing with databases, and so on are suddenly trying to understand categories and failing to do so at the very start.

The G&G post A new kind of introduction to category theory constitutes an alpha draft of the first part of an article introducing category theory following the ideas of this post.

Objects and arrows are abstract

  • Every once in a while someone asks a question on Math Stack Exchange that shows they have no idea that an object of a category need not have elements and that morphisms need not be functions that take elements to elements.
  • One questioner understood that the claim that a morphism need not be a function meant that it might be a multivalued function.


Duality

  • That misunderstanding comes up with duality. The definition of dual category requires turning the arrows around. Even if the original morphism takes elements to elements, the opposite morphism does not have to take elements to elements. In the case of the category of sets, an arrow in $\text{Set}^{op}$ cannot take elements to elements; consider, for example, the opposite of the function $\emptyset\to\{1,2\}$.
  • The fact that there is a concrete category equivalent to $\text{Set}^{op}$ is a red herring. It involves different sets: the function corresponding to the function just mentioned goes from a four-element set to a singleton. But in the category $\text{Set}^{op}$ as defined it is simply an arrow, not a function.

Not understanding how to use definitions

  • Some of the questioners on Math Stack Exchange ask how to prove a statement that is quite simple to prove directly from the definitions of the terms involved, but what they ask and what they are obviously trying to do is to gain an intuition in order to understand why the statement is true. This is backward: the first thing you should do is use the definition (at least in the first few days of a math class; after that you have to use theorems as well!).
  • I have discussed this in the blog post Insights into mathematical definitions (which gives references to other longer discussions by math ed people). See also the abmath section Rewrite according to the definitions.

How an introduction to a math topic needs to be written

The following list shows some of the tactics I am thinking of using in the math topic introductions. It is quite likely that I will conclude that some tactics won’t work, and I am sure that tactics I haven’t mentioned here will be used.

  • The introductions should not go very far into the subject. Instead, they should bring an exhaustive and explicit discussion of how to get into the very earliest part of the topic, perhaps the definition, some examples, and a few simple theorems. I doubt that a group theory student who hasn’t mastered abstraction and what proofs are about will ever be ready to learn the Sylow theorems.
  • You can’t do examples and definitions simultaneously, but you can come close by going through an example step by step, checking each part of the definition.
  • There is a real split between students who want the definitions first (most of whom don’t have the abstraction problems I am trying to overcome) and those who really really think they need examples first (the majority) because they don’t understand abstraction.

  • When you introduce an axiom, give an example of how you would prove that some binary operation satisfies the axiom. For example, if the axiom is that every element of a group must have an inverse, right then and there prove that addition on the integers satisfies the axiom and disprove that multiplication on integers satisfies it.
  • When the definition uses some undefined math objects, point out immediately with examples that you can’t have any intuition about them except what the axioms give you. (In contrast to the definition of division of integers, where you and the student already have intuitions about the objects.)
  • Make explicit the possible problems with abstractmath.org and Gyre&Gimble) will indeed find it difficult to become mathematical researchers — but not impossible!
  • But that is not the point. All college math professors will get people who will go into theoretical computing science, and therefore need to understand category theory, or into particle physics, and need to understand groups, and so on.
  • By being clear at the earliest stages of how mathematicians actually do math, they will produce more people in other fields who actually have some grasp of what is going on with the topics they have studied in math classes, and hopefully will be willing to go back and learn some more math if some type of math rears its head in the theories of their field.
  • Besides, why do you want to alienate huge numbers of people from math, as our way of teaching in the past has done?
  • “Our” means grammar school teachers, high school teachers and college professors.
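
As a sketch of the tactic suggested in the list above for the inverse axiom (the symbols $n$ and $m$ are introduced here only for illustration): for addition on the integers, the identity is $0$ and every $n\in\mathbb{Z}$ has the inverse $-n$, since \[n+(-n)=(-n)+n=0.\] For multiplication on the integers, the identity is $1$, and the axiom fails for $n=2$: an inverse $m$ would have to satisfy \[2m=1,\] but $2m$ is even for every integer $m$ and $1$ is odd. One counterexample is enough to disprove the axiom.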


Thanks to Kevin Clift for corrections.


This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.


A very early satori that occurs with beginning abstract math students

In the previous post Pattern recognition and me, I wrote about how much I enjoyed sudden flashes of understanding that were caused by my recognizing a pattern (or learning about a pattern). I have had several such, shall we say, Thrills in learning about math and doing research in math. This post is about a very early thrill I had when I first started studying abstract algebra. As is my wont, I will make various pronouncements about what these mean for teaching and understanding math.


Early in any undergraduate course involving group theory, you learn about cosets.

Basic facts about cosets

  1. Every subgroup of a group generates a set of left cosets and a set of right cosets.
  2. If $H$ is a subgroup of $G$ and $a$ and $b$ are elements of $G$, then $a$ and $b$ are in the same left coset of $H$ if and only if $a^{-1}b\in H$. They are in the same right coset of $H$ if and only if $ab^{-1}\in H$.
  3. Alternative definition: $a$ and $b$ are in the same left coset of $H$ if $a=bh$ for some $h\in H$ and are in the same right coset of $H$ if $a=hb$ for some $h\in H$.
  4. One of the (left or right) cosets of $H$ is $H$ itself.
  5. The relations
    $a\underset{L}\sim b$ if and only if $a^{-1}b\in H$
    $a\underset{R}\sim b$ if and only if $ab^{-1}\in H$
    are equivalence relations.

  6. It follows from (5) that each of the set of left cosets of $H$ and the set of right cosets of $H$ is a partition of $G$.
  7. By definition, $H$ is a normal subgroup of $G$ if the two sets of cosets coincide.
  8. The index of a subgroup in a group is the cardinal number of (left or right) cosets the subgroup has.
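
Several of these facts can be checked on a small example. The sketch below assumes the group of all permutations of $\{0,1,2\}$, stored as tuples, and a two-element subgroup `H`; the helper names `compose`, `inverse` and `is_partition` are made up for this illustration.

```python
from itertools import permutations

def compose(p, q):
    """(p*q)(i) = p(q(i)); permutations of {0,1,2} stored as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """The permutation that undoes p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

G = list(permutations(range(3)))     # all six permutations of three letters
H = [(0, 1, 2), (1, 0, 2)]           # the identity and one transposition

left_cosets = {frozenset(compose(a, h) for h in H) for a in G}
right_cosets = {frozenset(compose(h, a) for h in H) for a in G}

def is_partition(blocks, whole):
    """Blocks are pairwise disjoint and their union is the whole set."""
    return (sum(len(b) for b in blocks) == len(whole)
            and set().union(*blocks) == set(whole))

# Fact (2): a and b share a left coset exactly when a^{-1} b is in H.
criterion_agrees = all(
    (frozenset(compose(a, h) for h in H) == frozenset(compose(b, h) for h in H))
    == (compose(inverse(a), b) in H)
    for a in G for b in G
)

print(len(left_cosets))                  # 3  (the index of H)
print(is_partition(left_cosets, G))      # True
print(is_partition(right_cosets, G))     # True
print(criterion_agrees)                  # True
print(left_cosets == right_cosets)       # False, so this H is not normal
```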

Elementary proofs in group theory

In the course, you will be asked to prove some of the interrelationships between (2) through (5) using just the definitions of group and subgroup. The teacher assigns these exercises to train the students in the elementary algebra of elements of groups.


  1. If $a=bh$ for some $h\in H$, then $b=ah'$ for some $h'\in H$. Proof: If $a=bh$, then $ah^{-1}=(bh)h^{-1}=b(hh^{-1})=b$, so take $h'=h^{-1}$, which is in $H$ because $H$ is a subgroup.
  2. If $a^{-1}b\in H$, then $b=ah$ for some $h\in H$. Proof: $b=a(a^{-1}b)$.
  3. The relation “$\underset{L}\sim$” is transitive. Proof: Let $a^{-1}b\in H$ and $b^{-1}c\in H$. Then $a^{-1}c=a^{-1}bb^{-1}c$ is the product of two elements of $H$ and so is in $H$.

Miscellaneous remarks about the examples

  • Which exercises are used depends on what is taken as definition of coset.
  • In proving Exercise 2 at the board, the instructor might write “Proof: $b=a(a^{-1}b)$” on the board and then point to the expression “$a^{-1}b$” and say, “$a^{-1}b$ is in $H$!”
  • I wrote “$a^{-1}c=a^{-1}bb^{-1}c$” in Exercise 3. That will result in some brave student asking, “How on earth did you think of inserting $bb^{-1}$ like that?” The only reasonable answer is: “This is a trick that often helps in dealing with group elements, so keep it in mind.” See Rabbits.
  • That expression “$a^{-1}c=a^{-1}bb^{-1}c$” doesn’t explicitly mention that it uses associativity. That, too, might cause pointing at the board.
  • Pointing at the board is one thing you can do in a video presentation that you can’t do in a text. But in watching a video, it is harder to flip back to look at something done earlier. Flipping is easier to do if the video is short.
  • The first sentence of the proof of Exercise 3 is, “Let $a^{-1}b\in H$ and $b^{-1}c\in H$.” This uses rewrite according to the definition. One hopes that beginning group theory students already know about rewrite according to the definition. But my experience is that there will be some who don’t automatically do it.
  • In beginning abstract math courses, very few teachers tell students about rewrite according to the definition. Why not?

  • An excellent exercise for the students that would require more than short algebraic calculations would be:
    • Discuss which of the two definitions of left coset embedded in (2), (3), (5) and (6) is preferable.
    • Show in detail how it is equivalent to the other definition.

A theorem

In the undergraduate course, you will almost certainly be asked to prove this theorem:

A subgroup $H$ of index $2$ of a group $G$ is normal in $G$.

Proving the theorem

In trying to prove this, a student may fiddle around with the definition of left and right coset for a while using elementary manipulations of group elements as illustrated above. Then a lightbulb appears:

In the 1980’s or earlier a well known computer scientist wrote to me that something I had written gave him a satori. I was flattered, but I had to look up “satori”.

If the subgroup has index $2$ then there are two left cosets and two right cosets. One of the left cosets and one of the right cosets must be $H$ itself. Since each set of cosets is a partition of $G$, the other left coset must be the complement of $H$ and so must the other right coset. So those two cosets must be the same set! So $H$ is normal in $G$.
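
The observation can be watched in action on one small case. This sketch assumes the permutations of $\{0,1,2\}$ stored as tuples and takes `H` to be the three rotations, a subgroup of index $2$; the names `compose` and `complement` are illustrative only. It checks a single example and is not a substitute for the argument above.

```python
from itertools import permutations

def compose(p, q):
    """(p*q)(i) = p(q(i)); permutations stored as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

G = set(permutations(range(3)))
H = frozenset({(0, 1, 2), (1, 2, 0), (2, 0, 1)})   # the three rotations

left_cosets = {frozenset(compose(a, h) for h in H) for a in G}
right_cosets = {frozenset(compose(h, a) for h in H) for a in G}
complement = frozenset(G - H)

print(left_cosets == {H, complement})    # True
print(right_cosets == {H, complement})   # True
```

Both families consist of $H$ and its complement, so they coincide without any element-by-element comparison.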

This is one of the earlier cases of sudden pattern recognition that occurs among students of abstract math. Its main attraction for me is that suddenly after a bunch of algebraic calculations (enough to determine that the cosets form a partition) you get the fact that the left cosets are the same as the right cosets by a purely conceptual observation with no computation at all.

This proof raises a question:

Why isn’t this point immediately obvious to students?

I have to admit that it was not immediately obvious to me. However, before I thought about it much someone told me how to do it. So I was denied the Thrill of figuring this out myself. Nevertheless I thought the solution was, shall we say, cute, and so had a little thrill.

A story about how the light bulb appears

In doing exercises like those above, the student has become accustomed to using algebraic manipulation to prove things about groups. They naturally start doing such calculations to prove this theorem. They persevere for a while…

Scenario I

Some students may be in the habit of abandoning their calculations, getting up to walk around, and trying to find other points of view.

  1. They think: What else do I know besides the definitions of cosets?
  2. Well, the cosets form a partition of the group.
  3. So they draw a picture of two boxes for the left cosets and two boxes for the right cosets, marking one box in each as being the subgroup $H$.
  4. If they have a sufficiently clear picture in their head of how a partition behaves, it dawns on them that the other two boxes have to be the same.

Remarks about Scenario I

  • Not many students at the earliest level of abstract math ever take a break and walk around with the intent of having another approach come to mind. Those who do Will Go Far. Teachers should encourage this practice. I need to push this in abstractmath.org.
  • In good weather, David Hilbert would stand outside at a shelf doing math or writing it up. Every once in a while he would stop for a while and work in his garden. The breaks no doubt helped. So did standing up, I bet. (I don’t remember where I read this.)
  • This scenario would take place only if the students have a clear understanding of what a partition is. I suspect that often the first place they see the connection between equivalence relations and partitions is in a hasty introduction at the beginning of a group theory or abstract algebra course, so the understanding has not had long to sink in.

Scenario II

Some students continue to calculate…

  1. They might say, suppose $a$ is not in $H$. Then it is in the other left coset, namely $aH$.
  2. Now suppose $a$ is not in the “other” right coset, the one that is not $H$. But there are only two right cosets, so $a$ must be in $H$.
  3. But that contradicts the first calculation I made, so the only possibility left is that $a$ is in the right coset $Ha$. So $aH\subseteq Ha$.
  4. Aha! But then I can use the same argument the other way around, getting $Ha\subseteq aH$.
  5. So it must be that $aH=Ha$. Aha! …indeed.

Remarks about Scenario II

  • In step (2), the student is starting a proof by contradiction. Many beginning abstract math students are not savvy enough to do this.
  • Step (4) involves recognizing that an argument has a dual. Abstractmath.org does not mention dual arguments and I can’t remember emphasizing the idea to my classes. Tsk.
  • Scenario II involves the student continuing algebraic calculations till the lightbulb strikes. The lightbulb could also occur in other places in the calculation.



Dysfunctions in doing math I

I am in the middle of revising the article in abstractmath.org on dysfunctional attitudes and behaviors in doing math. Here are three of the sections I have finished.

Misuse of analogy

When William Rowan Hamilton was trying to understand the new type of number called quaternions (MW, Wik) that he invented, he assumed by analogy that like other numbers, quaternion multiplication was commutative. It was a major revelation to him that they were not commutative.

Analogy may suggest new theorems or ways of doing things. But it is fallible. What happens particularly often in abstract math is applying a rule to a situation where it is not appropriate. This is an easy trap to fall into when the notation in two different cases has the same form; that is an example of formal analogy.

Matrix multiplication

Matrix multiplication is not commutative

If $r$ and $s$ are real numbers then the products $rs$ and $sr$ are always the same number. In other words, multiplication of real numbers is commutative : $rs = sr$ for all real numbers $r$ and $s$.

The product of two matrices $M$ and $N$ is written $MN$, just as for numbers. But matrix multiplication is not commutative. For example,
\[\begin{pmatrix}1 & 2\\ 3 & 4\end{pmatrix}\begin{pmatrix}3 & 1\\ 3 & 2\end{pmatrix}=\begin{pmatrix}9 & 5\\ 21 & 11\end{pmatrix}\]
but
\[\begin{pmatrix}3 & 1\\ 3 & 2\end{pmatrix}\begin{pmatrix}1 & 2\\ 3 & 4\end{pmatrix}=\begin{pmatrix}6 & 10\\ 9 & 14\end{pmatrix}\]
Because $rs = sr$ for numbers, the formal similarity of the notation suggests $MN = NM$, which is wrong.

This means you can’t blindly manipulate $MNM$ to become $M^2N$. More generally, a law such as $(MN)^n=M^nN^n$ is not correct when $M$ and $N$ are matrices.
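
Both claims are easy to check numerically. The sketch below uses plain Python lists of rows and a hand-written helper `matmul` (a name chosen here, not a library function), with $M$ and $N$ the two matrices from the example above.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

M = [[1, 2], [3, 4]]
N = [[3, 1], [3, 2]]

MN = matmul(M, N)
NM = matmul(N, M)
print(MN)   # [[9, 5], [21, 11]]
print(NM)   # [[6, 10], [9, 14]]

# Compare (MN)^2 with M^2 N^2.
lhs = matmul(MN, MN)
rhs = matmul(matmul(M, M), matmul(N, N))
print(lhs == rhs)   # False
```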

You must understand the meanings
of the symbols you manipulate.

The product of two nonzero matrices can be 0

If the product of two numbers is 0, then one or both of the numbers is zero. But that is not true for matrix multiplication:
\[\begin{pmatrix}-2 & 2\\ -1 & 1\end{pmatrix}\begin{pmatrix}1 & 1\\ 1 & 1\end{pmatrix}=\begin{pmatrix}0 & 0\\ 0 & 0\end{pmatrix}\]
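
A quick check of this zero product, again as a sketch with plain lists of rows; the helper `matmul` and the names `A` and `B` are assumptions made for the illustration.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[-2, 2], [-1, 1]]
B = [[1, 1], [1, 1]]

print(matmul(A, B))   # [[0, 0], [0, 0]]
print(matmul(B, A))   # [[-3, 3], [-3, 3]]
```

Neither factor is the zero matrix, and multiplying in the other order does not give zero, which ties this example back to non-commutativity.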

Canceling sine

  • Beginning calculus students have already learned algebra.
  • They have learned that an expression such as $xy$ means $x$ times $y$.
  • They have learned to cancel like terms in a quotient, so that for example \[\frac{3x}{3y}=\frac{x}{y}\]
  • They have learned to write the value of a function $f$ at the input $x$ by $f(x)$.
  • They have seen people write $\sin x$ instead of $\sin(x)$ but have never really thought about it.
  • So they write \[\frac{\sin x}{\sin y}=\frac{x}{y}\]

This happens fairly often in freshman calculus classes. But you wouldn’t do that, would you?
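
One numerical counterexample is enough to show the “cancellation” is wrong. The values $x=1$ and $y=2$ (in radians) are arbitrary choices for this sketch.

```python
import math

x, y = 1.0, 2.0   # arbitrary sample inputs, in radians

ratio_of_sines = math.sin(x) / math.sin(y)
ratio_of_inputs = x / y

print(ratio_of_sines)    # about 0.925
print(ratio_of_inputs)   # 0.5
```

Here $\sin$ is the name of a function applied to $x$, not a factor multiplied by $x$, so there is nothing to cancel.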

Boundary values of definitions

Definitions are usually inclusive

Definitions of math concepts usually include the special cases they generalize.


  • A square is a special case of rectangle. As far as I know texts that define “rectangle” include squares in the definition. Thus a square is a rectangle.
  • A straight line is a curve.
  • A group is a semigroup.
  • An integer is a real number. (But not always in computing languages — see here.)
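
Python gives one illustration of how a computing language can treat this differently from math. This is only an example from a single language and is not necessarily the situation the linked note has in mind.

```python
import numbers

n = 3

print(isinstance(n, numbers.Real))   # True: the abstract numeric tower is inclusive
print(isinstance(n, float))          # False: the concrete types int and float are distinct
print(n == 3.0)                      # True: the values still compare equal
```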

But not always

  • The axioms of a field include a bunch of axioms that a one-element set satisfies, plus a special axiom that does nothing but exclude the one-element set. So a field has to have at least two elements, and that fact does not follow from the other axioms.
  • Boolean algebras are usually defined that way, too, but not always. MathWorld gives several definitions of Boolean algebra that disagree on this point.

When boundary values are not special cases

Definitions may or may not include other types of boundary values.


  • If $S$ is a set, it is a subset of itself. The empty set is also a subset of $S$.
  • Similarly the divisors of $6$ are $-6$, $-3$, $-2$, $-1$, $1$, $2$, $3$ and $6$, not just $2$ and $3$ and not just $1$, $2$, $3$ and $6$ (there are two different boundaries here).
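
Both boundary conventions can be seen in a few lines of Python. The function name `divisors` and the sample set `S` are assumptions made for this sketch.

```python
def divisors(n):
    """All integers d, positive or negative, with n % d == 0."""
    return [d for d in range(-abs(n), abs(n) + 1) if d != 0 and n % d == 0]

print(divisors(6))   # [-6, -3, -2, -1, 1, 2, 3, 6]

S = {1, 2, 3}
print(S <= S)        # True: a set is a subset of itself
print(set() <= S)    # True: the empty set is a subset of S
```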

But …

  • The positive real numbers include everything bigger than $0$, but not $0$. (Note).


A definition that includes such special cases may be called inclusive; otherwise it is exclusive. People new to abstract math very commonly use words defined inclusively as if their definition was exclusive.

  • They say things such as “That’s not a rectangle, it is a square!” and “Is that a group or a semigroup?”
  • They object if you say “Consider the complex number $\pi $.”

This appears to be natural linguistic behavior. Even so, math is picky-picky: a square is a rectangle, a group is a semigroup and $\pi$ is a complex number (of course, it is also a real number).


Intimidation

  • You attend a math lecture and the speaker starts talking about things you never heard of.
  • Your fellow students babble at you about manifolds and tensors and you thought they were car parts and lamps.
  • You suspect your professor is deliberately talking over your head to put you down.
  • You suspect your friends are trying to make you believe they are much smarter than you are.
  • You suspect your friends are smarter than you are.

There are two possibilities:

  • They are not trying to intimidate you (most common).
  • They are deliberately setting out to intimidate you with their arcane knowledge so you will know what a worm you are. (There are people like that.)

Another possibility, which can overlap with the two above, is:

  • You expect to be intimidated. You may be what might be called a co-intimidator, similar to the way someone who is codependent wants some other person to be dependent on them. (This is not like the “co” in category theory: “product” and “coproduct” have a symmetric relationship with each other, but the co-intimidator relation is asymmetric.)

There are many ways to get around being intimidated.

  • Ask “What the heck is a manifold?”
  • (In a lecture where it might be imprudent or impractical to ask) Write down what they say, then later ask a friend or look it up.
  • Most teachers like to be asked to explain something. Yes, I know some professors repeatedly put down people. Change sections! If you can’t, live with it! Not knowing something says nothing bad about you.

And remember:

If you don’t know something
probably many other students don’t know it either.


Naming mathematical objects

Commonword names confuse

Many technical words and phrases in math are ordinary English words ("commonwords") that are assigned a different and precisely defined mathematical meaning.  

  • Group: This sounds to the "layman" as if it ought to mean the same thing as "set". You get no clue from the name that it involves a binary operation with certain properties.
  • Formula: In some texts on logic, a formula is a precisely defined expression that becomes a true-or-false sentence (in the semantics) when all its variables are instantiated. So $(\forall x)(x>0)$ is a formula. The word "formula" in ordinary English makes you think of things like "$\textrm{H}_2\textrm{O}$", which has no semantics that makes it true or false; it is a symbolic expression for a name.
  • Simple group: This has a technical meaning: a group with no nontrivial proper normal subgroup. The Monster Group is "simple". Yes, the technical meaning is motivated by the usual concept of "simple", but to say the Monster Group is simple causes cognitive dissonance.

Beginning students come with the (generally subconscious) expectation that they will pick up clues about the meanings of words from connotations they are already familiar with, plus things the teacher says using those words.  They think in terms of refining an understanding they already have.  This is more or less what happens in most non-math classes.  They need to be taught what definition means to a mathematician.

Names that don't confuse but may intimidate

Other technical names in math don't cause the problems that commonwords cause.

Named after somebody: The phrase "Hausdorff space" leads a math student to understand that it has a technical meaning. They may not even know it is named after a person, but it screams "geek word" and "you don't know what it means". That is a signal that you can find out what it means. You don't assume you know its meaning.

New made-up words: Words such as "affine", "gerbe" and "logarithm" are made up of words from other languages and don't have an ordinary English meaning. Acronyms such as "QED", "RSA" and "FOIL" don't occur often. I don't know of any math objects other than "RSA algorithm" that have an acronymic name. (No doubt I will think of one the minute I click the Publish button.) Whole-cloth words such as "googol" are also rare. All these sorts of words would be good to name new things since they do not fool the readers into thinking they know what the words mean.

Both types of words avoid fooling the student into thinking they know what the words mean, but some students are intimidated by the use of words they haven't seen before.  They seem to come to class ready to be snowed.  A minority of my students over my 35 years of teaching were like that, but that attitude was a real problem for them.


You can write for several different audiences.

Math fans (non-mathematicians who are interested in math and read books about it occasionally): In my posts Explaining higher math to beginners and Renaming technical concepts, I wrote about several books aimed at explaining some fairly deep math to interested people who are not mathematicians. They renamed some things. For example, Mark Ronan in Symmetry and the Monster used the phrase "atom" for "simple group", presumably to get around the cognitive dissonance. There are other examples in my posts.

Math newbies  (math majors and other students who want to understand some aspect of mathematics).  These are the people abstractmath.org is aimed at. For such an audience you generally don't want to rename mathematical objects. In fact, you need to give them a glossary to explain the words and phrases used by people in the subject area.   

Postsecondary math students These people, especially the math majors, have many tasks:

  • Gain an intuitive understanding of the subject matter.
  • Understand in practice the logical role of definitions.
  • Learn how to come up with proofs.
  • Understand the ins and outs of mathematical English, particularly the presence of ordinary English words with technical definitions.
  • Understand and master the appropriate parts of the symbolic language of math — not just what the symbols mean but how to tell a statement from a symbolic name.

It is appropriate for books for math fans and math newbies to try to give an understanding of concepts without necessarily proving theorems. That is the aim of much of my work, which has more of an emphasis on newbies than on fans. But math majors also need the traditional emphasis on theorem and proof and clear, correct explanations.

Lately, books such as Visual Group Theory have addressed beginning math majors, trying for much more effective ways to help the students develop good intuition, as well as getting into proofs and rigor. Visual Group Theory uses standard terminology.  You can contrast it with Symmetry and the Monster and The Mystery of the Prime Numbers (read the excellent reviews on Amazon) which are clearly aimed at math fans and use nonstandard terminology.  

Terminology for algebraic structures

I have been thinking about the section of Abstracting Algebra on binary operations.  Notice this terminology:


The "standard names" are those in Wikipedia.  They give little clue to the meaning, but at least most of them, except "magma" and "group", sound technical, cluing the reader in to the fact that they'd better learn the definition.

I came up with the names in the right column in an attempt to make some sense out of them.  The design is somewhat like the names of some chemical compounds.  This would be appropriate for a text aimed at math fans, but for them you probably wouldn't want to get into such an exhaustive list.

I wrote various pieces meant to be part of Abstracting Algebra using the terminology on the right, but thought better of it. I realized that I have been vacillating between thinking of AbAl as for math fans and thinking of it as for newbies. I guess I am plunking for newbies.

I will call groups groups, but for the other structures I will use the phrases in the middle column.  Since the book is for newbies I will include a table like the one above.  I also expect to use tree notation as I did in Visual Algebra II, and other graphical devices and interactive diagrams.


In the sixties magmas were called groupoids or monoids, both of which now mean something else.  I was really irritated when the word "magma" started showing up all over Wikipedia. It was the name given by Bourbaki, but it is a bad name because it means something else that is irrelevant.  A magma is just any binary operation. Why not just call it that?  

Well, I will tell you why, based on my experience in Ancient Times (the sixties and seventies) in math. (I started as an assistant professor at Western Reserve University in 1965). In those days people made a distinction between a binary operation and a "set with a binary operation on it".  Nowadays, the concept of function carries with it an implied domain and codomain.  So a binary operation is a function $m:S\times S\to S$.  Thinking of a binary operation this way was just beginning to appear in the common mathematical culture in the late 60's, and at least one person remarked to me: "I really like this new idea of thinking of 'plus' and 'times' as functions."  I was startled and thought (but did not say), "Well of course it is a function".  But then, in the late sixties I was being indoctrinated/perverted into category theory by the likes of John Isbell and Peter Hilton, both of whom were briefly at Case Western Reserve University.  (Also Paul Dedecker, who gave me a glimpse of Grothendieck's ideas).

Now, the idea that a binary operation is a function comes with the fact that it has a domain and a codomain, and specifically that the domain is the Cartesian square of the codomain.  People who didn't think that a binary operation was a function had to introduce the idea of the universe (universal algebraists) or the underlying set (category theorists): you had to specify it separately and introduce terminology such as $(S,\times)$ to denote the structure.   Wikipedia still does it mostly this way, and I am not about to start a revolution to get it to change its ways.


In the olden days, people thought of groups in this way:

  • A group is a set $G$ with a binary operation denoted by juxtaposition that is closed on $G$, meaning that if $a$ and $b$ are any elements of $G$, then $ab$ is in $G$.
  • The operation is associative, meaning that if $a,\ b,\ c\in G$, then $(ab)c=a(bc)$.
  • The operation has a unity element, meaning an element $e$ for which for any element $a\in G$, $ae=ea=a$.
  • For each element $a\in G$, there is an element $b$ for which $ab=ba=e$.

This is a better way to describe a group:

  • A group consists of a nullary operation e, a unary operation inv, and a binary operation denoted by juxtaposition, all with the same codomain $G$. (A nullary operation is a map from a singleton set to a set, and a unary operation is a map from a set to itself.)
  • The value of e is denoted by $e$ and the value of inv$(a)$ is denoted by $a^{-1}$.
  • These operations are subject to the following equations, true for all $a,\ b,\ c\in G$:


    • $ae=ea=a$.
    • $aa^{-1}=a^{-1}a=e$.
    • $(ab)c=a(bc)$.

This definition makes it clear that a group is a structure consisting of a set and three operations whose axioms are all equations.  It was formulated by people in universal algebra but you still see the older form in texts.
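Because every axiom in this form is an equation, a finite candidate can be checked mechanically. Here is a minimal sketch in Python, under my own assumptions: the function name `is_group` and the two examples (addition and subtraction mod 5) are invented for illustration and are not taken from any library or text.

```python
from itertools import product

def is_group(elements, e, inv, mul):
    """Brute-force check of the three equational axioms on a finite carrier."""
    unit = e()  # the nullary operation is a function of no arguments
    identity_ok = all(mul(a, unit) == a and mul(unit, a) == a for a in elements)
    inverse_ok = all(mul(a, inv(a)) == unit and mul(inv(a), a) == unit
                     for a in elements)
    assoc_ok = all(mul(mul(a, b), c) == mul(a, mul(b, c))
                   for a, b, c in product(elements, repeat=3))
    return identity_ok and inverse_ok and assoc_ok

Z5 = range(5)

# Illustrative example: the integers mod 5 under addition.
z5_is_group = is_group(Z5, e=lambda: 0,
                       inv=lambda a: (-a) % 5,
                       mul=lambda a, b: (a + b) % 5)

# Illustrative non-example: subtraction mod 5 is a binary operation on the
# same set, but it fails both the identity and the associativity equations.
sub_is_group = is_group(Z5, e=lambda: 0,
                        inv=lambda a: a,
                        mul=lambda a, b: (a - b) % 5)

print(z5_is_group, sub_is_group)
```

Note that the checker never has to search for a unity or for inverses: they are handed over as operations, which is exactly the point of the equational form.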

The old form is not wrong; it is merely inelegant.  With the old form, you have to prove that the unity and inverses are unique before you can introduce notation. More important, by making it clear that groups satisfy equational logic you get a lot of theorems for free: you can construct products on the cartesian product of the underlying sets, quotients by congruence relations, and other things. (Of course, in AbAl those theorems will be stated later than when groups are defined because the book is for newbies and you want lots of examples before theorems.)


  1. Three kinds of mathematical thinkers (G&G post)
  2. Technical meanings clash with everyday meanings (G&G post)
  3. Commonword names for technical concepts (G&G post)
  4. Renaming technical concepts (G&G post)
  5. Explaining higher math to beginners (G&G post)
  6. Visual Algebra II (G&G post)
  7. Monads for high school II: Lists (G&G post)
  8. The mystery of the prime numbers: a review (G&G post)
  9. Hersh, R. (1997a), "Math lingo vs. plain English: Double entendre". American Mathematical Monthly, volume 104, pages 48–51.
  10. Names (in abmath)
  11. Cognitive dissonance (in abmath)

Whole numbers

Sue Van Hattum wrote in response to a recent post:

I’d like to know what you think of my ‘abuse of terminology’. I teach at a community college, and I sometimes use incorrect terms (and tell the students I’m doing so), because they feel more aligned with common sense.

To me, and to most students, the phrase “whole numbers” sounds like it refers to anything that doesn’t need fractions to represent it, and should include negative numbers. (It then, of course, would mean the same thing that the word integers does.) So I try to avoid the phrase, mostly. But I sometimes say we’ll use it with the common sense meaning, not the official math meaning.

Her comments brought up a couple of things I want to blather about.

Official meaning

There is no such thing as an "official math meaning".  Mathematical notation has no governing authority and research mathematicians are too ornery to go along with one anyway.  There is a good reason for that attitude:  Mathematical research constantly causes us to rethink the relationship among different mathematical ideas, which can make us want to use names that show our new view of the ideas.  An excellent example of that is the evolution of the concept of "function" over the past 150 years, traced in the Wikipedia article.

What some "authorities" say about "whole number":

  • MathWorld  says that "whole number" is used to mean any of these:  Any positive integer, any nonnegative integer or any integer.
  • Wikipedia also allows all three meanings.
  • Webster's New World dictionary (for which I have been a consultant, but they didn't ask me about whole numbers!) gives "any integer" as a second meaning.
  • American Heritage Dictionary gives "any integer" as the only meaning.
  • Someone stole my copy of Merriam Webster.

Common Sense Meaning

Mathematicians think and talk about any particular kind of math object using images and metaphors.  Sometimes (not very often) the name they give to a math object embodies a metaphor.  Examples:

  • A complex number is usually notated using two real parameters, so it looks more complicated than a real number.
  • "Rings" were originally called that because the first examples were integers (mod n) for some positive integer n, and you can think of them as going around a clock showing n hours.

Unfortunately, much of the time the name of a kind of object contains a suggestive metaphor that is bad,  meaning that it suggests an erroneous picture or idea of what the object is like.

  • A "group" ought to be a bunch of things.  In other words, the word ought to mean "set".
  • The word "line" suggests that it ought to be a row of points.  That suggests that each point on a line ought to have one next to it.  But that's not true on the "real line"!

Sue's idea that the "common sense" meaning of "whole number" is "integer" refers, I think, to the built-in metaphor of the phrase "whole number" (unbroken number).

I urge math teachers to do these things:

  • Explain to your students that the same math word or phrase can mean different things in different books.
  • Convince your students to avoid being fooled by the common-sense (metaphorical) meaning of a mathematical phrase.



Defining “category”

The concept of category is typically taught later in undergrad math than the concept of group is.  It is supposedly a more advanced concept.  Indeed, the typical examples of categories used in applications are more advanced than some of those in group theory (for example, symmetries of geometric shapes and operations on numbers).

Here are some thoughts on how categories could be taught as early as groups, if not earlier.

Nodes and arrows

Small finite categories can be pictured as a graph using nodes and arrows, together with a specification of the identity arrows and a definition of the composition.  (I am using the word “graph” the way category people use it:  a directed graph with possible multiple edges and loops.)

An example is the category pictured below with three objects and seven arrows. The composition is forced except for $kh$, which I hereby define to be $f$.

This way of picturing a category is  easy to grasp. The composite $kh$ visibly has to be either $f$ or $g$.  There is only one choice for the composite of any other composable pair.  Still, the choice of composite is not deducible directly by looking at the graph.

A first class in category theory using graphs as examples could start with this example, or the example in Note 1 below.  This example is nontrivial (never start any subject with trivial examples!) and easy to grasp, in this case using the extraordinary preprocessing your brain does with the input from your eyes.  The definition of category is complicated enough that you should probably present the graph and then give the definition while pointing to what each clause says about the graph.

Most abstract structures have several different ways of representing them. In contrast, when you discuss categorial concepts the standard object-and-arrow notation is the overwhelming favorite.  It reveals domains and codomains and composable pairs, in fact almost everything except which of several possible arrows the composite actually is.  If for example you try to define category using sets and functions as your running example, the student has to do a lot of on-the-go chunking — thinking of a set as a single object, of a set function (which may involve lots of complicated data) as a single chunk with a domain and a codomain, and so on.  But an example shown as a graph comes already chunked and in a picture that is guaranteed to be the most common kind of display they will see in discussions of categories.

After you do these examples, you can introduce trivial and simple graph examples in which the composition is entirely induced; for example these three:

(In case you are wondering, one of them is the empty category.)  I expect that you should also introduce another graph non-example in which associativity fails.

Multiplication tables

The multiplication table for a group is easy to understand, too, in the sense that it gives you a simple method of calculating the product of any two elements.  But it doesn’t provide a visual way to see the product as a category-as-graph does.  Of course, the graph representation works only for finite categories, just as the multiplication table works only for finite groups.

You can give a multiplication table for a small finite category, too, like the one below for the category above.  (“iA” means the identity arrow on A and composition, as usual in category theory, is right to left.) This is certainly more abstract than the graph picture, but it does hit you in the face with the fact that the multiplication is partial.
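The partial nature of the multiplication is easy to see in code. The picture and the table are not reproduced in this text, so the sketch below rests on an assumption: I take the graph to have objects A, B, C and non-identity arrows h: A → B, k: B → C and f, g: A → C, which is one graph consistent with the description (three objects, seven arrows, and $kh$ forced to be $f$ or $g$). The function names are my own.

```python
# Assumed (hypothetical) reconstruction of the graph: each arrow is mapped
# to its (domain, codomain) pair.
arrows = {
    "iA": ("A", "A"), "iB": ("B", "B"), "iC": ("C", "C"),
    "h": ("A", "B"), "k": ("B", "C"), "f": ("A", "C"), "g": ("A", "C"),
}
identity = {"A": "iA", "B": "iB", "C": "iC"}

def compose(second, first, kh="f"):
    """Composite 'second after first' (right to left), or None if undefined."""
    dom1, cod1 = arrows[first]
    dom2, cod2 = arrows[second]
    if cod1 != dom2:
        return None          # not a composable pair: the table has a gap here
    if first == identity[dom1]:
        return second        # composing with an identity is forced
    if second == identity[cod2]:
        return first
    return kh                # the only remaining composable pair is (k, h)

def is_associative(kh):
    """Check (c b) a == c (b a) for every composable triple."""
    for a in arrows:
        for b in arrows:
            for c in arrows:
                ba, cb = compose(b, a, kh), compose(c, b, kh)
                if ba is not None and cb is not None:
                    if compose(c, ba, kh) != compose(cb, a, kh):
                        return False
    return True

# Number of filled cells in the 7 x 7 table.
defined = sum(compose(s, t) is not None for s in arrows for t in arrows)
print(defined, "of", len(arrows) ** 2, "entries are defined")
print(is_associative("f"), is_associative("g"))
```

Passing `kh="g"` instead of `kh="f"` gives the other category structure on the same assumed graph.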


1. My suggested example of a category given as a graph shows clearly that you can define two different categorial structures on the graph.  One problem is that the two different structures are isomorphic categories.  In fact, if you engage the students in a discussion about these examples someone may notice that!  So you should probably also use the graph below, where you can define several different category structures that are not all isomorphic.

2. Multiplication tables and categories-as-graphs-with-composition are extensional presentations.  This means they are presented with all their parts laid out in front of you.  Most groups and categories are given by definitions as accumulations of properties (see concept in the Handbook of Mathematical Discourse).  These definitions tend to make some requirements such as associativity obvious.

Students are sometimes bothered by extensional definitions.  “What are h and k (in the category above)?  What are a, b and c?” (in a group given as a set of letters and a multiplication table).


Function as map

This is a first draft of an article to eventually appear in abstractmath.

Images and metaphors

To explain a math concept, you need to explain how mathematicians think about the concept. This is what in abstractmath I call the images and metaphors carried by the concept. Of course you have to give the precise definition of the concept and basic theorems about it. But without the images and metaphors most students, not to mention mathematicians from a different field, will find it hard to prove much more than some immediate consequences of the definition. Nor will they have much sense of the place of the concept in math and applications.

Teachers will often explain the images and metaphors with handwaving and pictures in a fairly vague way. That is good to start with, but it’s important to get more precise about the images and metaphors. That’s because images and metaphors are often not quite a good fit for the concept — they may suggest things that are false and not suggest things that are true. For example, if a set is a container, why isn’t the element-of relation transitive? (A coin in a coinpurse in your pocket is a coin in your pocket.)

“A metaphor is a useful way to think about something, but it is not the same thing as the same thing.” (I think I stole that from the Economist.) Here, I am going to get precise with the notion that a function is a map. I am acting like a mathematician in “getting precise”, but I am getting precise about a metaphor, not about a mathematical object.

A function is a map

A map (ordinary paper map) of Minnesota has the property that each point on the paper represents a point in the state of Minnesota. This map can be represented as a mathematical function from a subset of a 2-sphere to $\mathbb{R}^2$. The function is a mathematical idealization of the relation between the state and the piece of paper, analogous to the mathematical description of the flight of a rocket ship as a function from $\mathbb{R}$ to $\mathbb{R}^3$.

The Minnesota map-as-function is probably continuous and differentiable, and as is well known it can be angle preserving or area preserving but not both.

So you can say there is a point on the paper that represents the location of the statue of Paul Bunyan in Bemidji. There is a set of points that represents the part of the Mississippi River that lies in Minnesota. And so on.

A function has an image. If you think about it you will realize that the image is just a certain portion of the piece of paper. Knowing that a particular point on the paper is in the image of the function is not the information contained in what we call “this map of Minnesota”.

This yields what I consider a basic insight about function-as-map:  The map contains the information about the preimage of each point on the paper map. So:

The map in the sense of a “map of Minnesota” is represented by the whole function, not merely by the image.

I think that is the essence of the metaphor that a function is a map. And I don’t think newbies in abstractmath always understand that relationship.
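A toy version of this point, as a sketch only: the three towns are real, but the grid coordinates are made up for illustration, and a Python dictionary stands in for the function from the state to the paper.

```python
# Two "maps" from places to (invented) grid positions on the paper.
paper_map = {"Bemidji": (2, 5), "Duluth": (6, 4), "Minneapolis": (4, 1)}
scrambled = {"Bemidji": (6, 4), "Duluth": (4, 1), "Minneapolis": (2, 5)}

def preimage(f, point):
    """All places that the map f sends to the given point on the paper."""
    return {place for place, p in f.items() if p == point}

# The same marks appear on the paper in both cases ...
same_image = set(paper_map.values()) == set(scrambled.values())
# ... but the two functions are different maps.
same_function = paper_map == scrambled

print(same_image, same_function)
print(preimage(paper_map, (2, 5)), preimage(scrambled, (2, 5)))
```

Both dictionaries have the same image, yet only one of them tells you where Bemidji is. The information lives in the preimages, that is, in the whole function.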

A morphism is a map

The preceding discussion doesn’t really represent how we think of a paper map of Minnesota. We don’t think in terms of points at all. What we see are marks on the map showing where some particular things are. If it is a road map it has marks showing a lot of roads, a lot of towns, and maybe county boundaries. If it is a topographical map it will show level curves showing elevation. So a paper map of a state should be represented by a structure preserving map, a morphism. Road maps preserve some structure, topographical maps preserve other structure.

The things we call “maps” in math are usually morphisms. For example, you could say that every simple closed curve in the plane is an equivalence class of maps from the unit circle to the plane. Here taking the equivalence class means forgetting the parametrization.

The very reason I have to mention forgetting the parametrization is that the commonest mathematical way to talk about morphisms is as point-to-point maps with certain properties. But we think about a simple closed curve in the plane as just a distorted circle. The point-to-point correspondence doesn’t matter. So this example is really talking about a morphism as a shape-preserving map. Mathematicians introduced points into talking about preserving shapes in the nineteenth century and we are so used to doing that that we think we have to have points for all maps.

Not that points aren’t useful. But I am analyzing the metaphor here, not the technical side of the math.

Groups are functors

People who don’t do category theory think the idea of a mathematical structure as a functor is weird. From the point of view of the preceding discussion, a particular group is a functor from the generic group to some category. (The target category is Set if the group is discrete, Top if it is a topological group, and so on.)

The generic group is a group in a category called its theory or sketch that is just big enough to let it be a group. If the theory is the category with finite products that is just big enough then it is the Lawvere theory of the group. If it is a topos that is just big enough then it is the classifying topos of groups. The theory in this sense is equivalent to some theory in the sense of string-based logic, for example the signature-with-axioms (equational theory) or the first order theory of groups. Johnstone’s Elephant book is the best place to find the translation between these ideas.

A particular group is represented by a finite-limit-preserving functor on the algebraic theory, or by a logical functor on the classifying topos, and so on; constructions which bring with them the right concept of group homomorphisms as well (they will be any natural transformations).

The way we talk about groups mimics the way we talk about maps. We look at the symmetric group on five letters and say its multiplication is noncommutative. “Its multiplication” tells us that when we talk about this group we are talking about the functor, not just the values of the functor on objects. We use the same symbols of juxtaposition for multiplication in any group, “$1$” or “$e$” for the identity, “$a^{-1}$” for the inverse of $a$, and so on. That is because we are really talking about the multiplication, identity and inverse function in the generic group — they really are the same for all groups. That is because a group is not its underlying set, it is a functor. Just like the map of Minnesota “is” the whole function from the state to the paper, not just the image of the function.


"Automorphisms of group extensions" augmented

There has recently been an uptick in citations to my paper [1].  Several works over the years ([2], [3], [4]) have given proofs of my theorem that are easier to understand and more informative, so I have posted a package here that contains the original paper, a correction I published later, and the references below.  Malfait’s article in particular embeds my exact sequence into a remarkable cube of exact sequences.

[1] Charles Wells, Automorphisms of group extensions, Trans. Amer. Math. Soc. 155 (1970), 189-194.

[2] Kung Wei Yang, Isomorphisms of group extensions.  Pacific J. Math. Volume 50, Number 1 (1974), 299-304.

[3] D.J.S. Robinson, Applications of cohomology to the theory of groups, Groups – St. Andrews 1981, London Math. Soc. Lecture Notes vol. 71 (1982), pp. 46–80.

[4] Wim Malfait, The (outer) automorphism group of a group extension.   Bull. Belg. Math Soc. 9 (2002), 361-372.


Mathematical concepts

This post was triggered by John Armstrong’s comment on my last post.

We need  to distinguish two ideas: representations of a mathematical concept and the total concept.  (I will say more about terminology later.)

Example: We can construct the quotient of a group by the kernel of a group homomorphism by taking the cosets of the kernel and defining a multiplication on them.  We can construct the image of the homomorphism by taking the set of values of the homomorphism and using the multiplication induced by the codomain group.   The quotient group and the image are the same mathematical structure in the sense that anything useful you can say about one is true of the other.   For example, it may be useful to know the cardinality of the quotient (image) but it is not useful to know what its elements are.

But hold on, as the Australians say, if we knew that the codomain was an abelian group then we would know that the quotient group was abelian because the elements of the image form a subgroup of the codomain. (But the Australians I know wouldn’t say that.)

Now that kind of thinking is based on the idea that the elements of the image are “really” elements of the codomain whereas elements of the quotient are “really” subsets of the domain.  That is outmoded thinking.  The image and the quotient are the same in all important aspects because they are naturally isomorphic.   We should think of the quotient as just as much a subgroup of the codomain as the image is.  John Baez (I think) would say that to ask whether the subgroup embedding is the identity on elements or not is an evil question.

Let’s step back and look at what is going on here.  The definition of the quotient group is a construction using cosets.  The definition of the image is a construction using values of the homomorphism.  Those are two different specific  representations of the same concept.
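To see the two representations side by side, here is a small sketch in Python. The homomorphism is my own choice for illustration (multiplication by 3 on the integers mod 12, written additively), and all the names are invented. It builds the quotient out of cosets, builds the image out of values, and checks that the obvious map between them is a bijection that preserves the operation.

```python
n = 12  # work in Z_12 under addition mod 12

def phi(x):
    """Illustrative homomorphism Z_12 -> Z_12 sending x to 3x mod 12."""
    return (3 * x) % n

G = range(n)
kernel = frozenset(x for x in G if phi(x) == 0)

# Representation 1: the quotient, whose elements are the cosets x + kernel.
cosets = {frozenset((x + k) % n for k in kernel) for x in G}

def coset_add(c1, c2):
    """Add two cosets elementwise; the result is again a coset."""
    return frozenset((a + b) % n for a in c1 for b in c2)

# Representation 2: the image, whose elements are values of phi.
image = {phi(x) for x in G}

def to_image(coset):
    """phi is constant on each coset, so any representative will do."""
    return phi(next(iter(coset)))

is_bijection = (len(cosets) == len(image)
                and {to_image(c) for c in cosets} == image)
preserves_addition = all(
    to_image(coset_add(c1, c2)) == (to_image(c1) + to_image(c2)) % n
    for c1 in cosets for c2 in cosets
)

print(sorted(kernel), sorted(image), is_bijection, preserves_addition)
```

The elements of `cosets` are sets of numbers and the elements of `image` are numbers, but no statement about the group structure can tell the two apart.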

But what is the concept, as distinct from its representations?  Intuitively, it is

  • All the constructions made possible by the definition of the concept.
  • All the statements that are true about the concept.

(That is not precise.)

The total concept is like the clone plus the equational theory of a specific type of algebra in the sense of universal algebra.  The clone is all the operations you can construct knowing the given signature and equations and the equational theory is the set of all equations that follow from them.  That is one way of describing it.  Another is the monad in Set that gives the type of algebra — the operations are the arrows and the equations are the commutative diagrams.

Note: The preceding description of the monad is not quite right.  Also the whole discussion omits mention of the fact that we are in the world (doctrine) of universal algebra.  In the world of first order logic, for example, we need to refer to the classifying topos of the category of algebras of that type (or to its first order theory).


We need better terminology for all this.  I am not going to propose better terminology, so this is a shaggy dog story.

Math ed people talk about a particular concept image of a concept as well as the total schema of the concept.

In categorical logic, we talk about the sketch or presentation of the concept vs. the theory. The theory is a category (of the kind appropriate to the doctrine) that contains all the possible constructions and commutative diagrams that follow from the presentation.

In this post I have used “total concept” to refer to the schema or theory.  I have referred to the particular constructions as “representations” (for example, constructing the image of a homomorphism by cosets or by values of the homomorphism).

“Representation” does not have the same connotations as “presentation”.  Indeed a presentation of a group and a representation of a group are mathematically  two different things.  But I suspect they are two different aspects of the same idea.

All this needs to be untangled.  Maybe we should come up with two completely arbitrary words, like “dostak” and “dosh”.
