Naming mathematical objects

Commonword names confuse

Many technical words and phrases in math are ordinary English words ("commonwords") that are assigned a different and precisely defined mathematical meaning.  

  • Group  This sounds to the "layman" as if it ought to mean the same thing as "set".  You get no clue from the name that it involves a binary operation with certain properties.
  • Formula  In some texts on logic, a formula is a precisely defined expression that becomes a true-or-false sentence (in the semantics) when all its variables are instantiated.  So $(\forall x)(x>0)$ is a formula.  The word "formula" in ordinary English makes you think of things like "$\textrm{H}_2\textrm{O}$", which has no semantics that makes it true or false — it is a symbolic expression for a name.
  • Simple group This has a technical meaning: a nontrivial group whose only normal subgroups are the trivial subgroup and the group itself.  The Monster Group is "simple".  Yes, the technical meaning is motivated by the usual concept of "simple", but to say the Monster Group is simple causes cognitive dissonance.

Beginning students come with the (generally subconscious) expectation that they will pick up clues about the meanings of words from connotations they are already familiar with, plus things the teacher says using those words.  They think in terms of refining an understanding they already have.  This is more or less what happens in most non-math classes.  They need to be taught what definition means to a mathematician.

Names that don't confuse but may intimidate

Other technical names in math don't cause the problems that commonwords cause.

Named after somebody The phrase "Hausdorff space" leads a math student to understand that it has a technical meaning.  They may not even know it is named after a person, but it screams "geek word" and "you don't know what it means".  That is a signal that you can find out what it means.  You don't assume you know its meaning. 

New made-up words  Words such as "affine", "gerbe"  and "logarithm" are made up of words from other languages and don't have an ordinary English meaning.  Acronyms such as "QED", "RSA" and "FOIL" don't occur often.  I don't know of any math objects other than "RSA algorithm" that have an acronymic name.  (No doubt I will think of one the minute I click the Publish button.)  Whole-cloth words such as "googol" are also rare.  All these sorts of words would be good to name new things since they do not fool the readers into thinking they know what the words mean.

Both types of words avoid fooling the student into thinking they know what the words mean, but some students are intimidated by the use of words they haven't seen before.  They seem to come to class ready to be snowed.  A minority of my students over my 35 years of teaching were like that, but that attitude was a real problem for them.

Audience

You can write for several different audiences.

Math fans (non-mathematicians who are interested in math and read books about it occasionally). In my posts Explaining higher math to beginners and Renaming technical concepts I wrote about several books aimed at explaining some fairly deep math to interested people who are not mathematicians.  They renamed some things. For example, Mark Ronan in Symmetry and the Monster used the word "atom" for "simple group", presumably to get around the cognitive dissonance.  There are other examples in my posts.

Math newbies  (math majors and other students who want to understand some aspect of mathematics).  These are the people abstractmath.org is aimed at. For such an audience you generally don't want to rename mathematical objects. In fact, you need to give them a glossary to explain the words and phrases used by people in the subject area.   

Postsecondary math students These people, especially the math majors, have many tasks:

  • Gain an intuitive understanding of the subject matter.
  • Understand in practice the logical role of definitions.
  • Learn how to come up with proofs.
  • Understand the ins and outs of mathematical English, particularly the presence of ordinary English words with technical definitions.
  • Understand and master the appropriate parts of the symbolic language of math — not just what the symbols mean but how to tell a statement from a symbolic name.

It is appropriate for books for math fans and math newbies to try to give an understanding of concepts without necessarily proving theorems.  That is the aim of much of my work, which has more of an emphasis on newbies than on fans. But math majors also need the traditional emphasis on theorem and proof and clear, correct explanations.

Lately, books such as Visual Group Theory have addressed beginning math majors, trying for much more effective ways to help the students develop good intuition, as well as getting into proofs and rigor. Visual Group Theory uses standard terminology.  You can contrast it with Symmetry and the Monster and The Mystery of the Prime Numbers (read the excellent reviews on Amazon) which are clearly aimed at math fans and use nonstandard terminology.  

Terminology for algebraic structures

I have been thinking about the section of Abstracting Algebra on binary operations.  Notice this terminology:

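[Table image "boptable": the standard names for structures with a binary operation, alongside two columns of alternative names.]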

The "standard names" are those in Wikipedia.  They give little clue to the meaning, but at least most of them, except "magma" and "group", sound technical, cluing the reader in to the fact that they'd better learn the definition.

I came up with the names in the right column in an attempt to make some sense out of them.  The design is somewhat like the names of some chemical compounds.  This would be appropriate for a text aimed at math fans, but for them you probably wouldn't want to get into such an exhaustive list.

I wrote various pieces meant to be part of Abstracting Algebra using the terminology on the right, but thought better of it. I realized that I have been vacillating between thinking of AbAl as for math fans and thinking of it as for newbies. I guess I am plunking for newbies.

I will call groups groups, but for the other structures I will use the phrases in the middle column.  Since the book is for newbies I will include a table like the one above.  I also expect to use tree notation as I did in Visual Algebra II, and other graphical devices and interactive diagrams.

Magmas

In the sixties magmas were called groupoids or monoids, both of which now mean something else.  I was really irritated when the word "magma" started showing up all over Wikipedia. It was the name given by Bourbaki, but it is a bad name because it means something else that is irrelevant.  A magma is just any binary operation. Why not just call it that?  

Well, I will tell you why, based on my experience in Ancient Times (the sixties and seventies) in math. (I started as an assistant professor at Western Reserve University in 1965.) In those days people made a distinction between a binary operation and a "set with a binary operation on it".  Nowadays, the concept of function carries with it an implied domain and codomain.  So a binary operation is a function $m:S\times S\to S$.  Thinking of a binary operation this way was just beginning to appear in the common mathematical culture in the late sixties, and at least one person remarked to me: "I really like this new idea of thinking of 'plus' and 'times' as functions."  I was startled and thought (but did not say), "Well of course it is a function".  But then, in the late sixties I was being indoctrinated/perverted into category theory by the likes of John Isbell and Peter Hilton, both of whom were briefly at Case Western Reserve University.  (Also Paul Dedecker, who gave me a glimpse of Grothendieck's ideas.)

Now, the idea that a binary operation is a function comes with the fact that it has a domain and a codomain, and specifically that the domain is the Cartesian square of the codomain.  People who didn't think that a binary operation was a function had to introduce the idea of the universe (universal algebraists) or the underlying set (category theorists): you had to specify it separately and introduce terminology such as $(S,\times)$ to denote the structure.   Wikipedia still does it mostly this way, and I am not about to start a revolution to get it to change its ways.
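To make the "binary operation is a function" viewpoint concrete, here is a minimal Python sketch. It is only an illustration: the names `is_binary_operation`, `S`, `m` and `not_closed` are mine, and a finite operation is modeled as a dictionary keyed by pairs. The check confirms that the domain is exactly the Cartesian square of `S` and that every value lands in `S`.

```python
from itertools import product

def is_binary_operation(S, m):
    """Return True if the dict m is a function S x S -> S."""
    domain_ok = set(m.keys()) == set(product(S, repeat=2))
    codomain_ok = all(value in S for value in m.values())
    return domain_ok and codomain_ok

# Addition mod 3, stored as a function m: S x S -> S.
S = {0, 1, 2}
m = {(a, b): (a + b) % 3 for a in S for b in S}

# Ordinary addition takes values outside S (for example 2 + 2 = 4),
# so it is not a binary operation on S.
not_closed = {(a, b): a + b for a in S for b in S}

print(is_binary_operation(S, m))           # True
print(is_binary_operation(S, not_closed))  # False
```

Once the operation is stored this way, the underlying set can be read off from the domain, which is why the function viewpoint does not need a separately specified universe.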

Groups

In the olden days, people thought of groups in this way:

  • A group is a set $G$ with a binary operation denoted by juxtaposition that is closed on $G$, meaning that if $a$ and $b$ are any elements of $G$, then $ab$ is in $G$.
  • The operation is associative, meaning that if $a,\ b,\ c\in G$, then $(ab)c=a(bc)$.
  • The operation has a unity element, meaning an element $e$ for which for any element $a\in G$, $ae=ea=a$.
  • For each element $a\in G$, there is an element $b$ for which $ab=ba=e$.

This is a better way to describe a group:

  • A group consists of a nullary operation e, a unary operation inv,  and a binary operation denoted by juxtaposition, all with the same codomain $G$. (A nullary operation is a map from a singleton set to a set and a unary operation is a map from a set to itself.)
  • The value of e is denoted by $e$ and the value of inv$(a)$ is denoted by $a^{-1}$.
  • These operations are subject to the following equations, true for all $a,\ b,\ c\in G$:
    • $ae=ea=a$.
    • $aa^{-1}=a^{-1}a=e$.
    • $(ab)c=a(bc)$.

This definition makes it clear that a group is a structure consisting of a set and three operations whose axioms are all equations.  It was formulated by people in universal algebra but you still see the older form in texts.
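The equational description translates almost directly into code. The following Python sketch is an illustration under my own assumptions: the function name `satisfies_group_equations` and the example on the integers mod 5 are not from the text, and the check only works for a finite carrier. It takes the three operations as data and tests the three equations; it does not separately test that the operations take values in the carrier.

```python
from itertools import product

def satisfies_group_equations(G, e, inv, mul):
    """Check the three equational axioms on a finite carrier G.

    e   : nullary operation (a function of no arguments returning an element)
    inv : unary operation G -> G
    mul : binary operation G x G -> G
    """
    unit = e()
    identity_law = all(mul(a, unit) == a == mul(unit, a) for a in G)
    inverse_law = all(mul(a, inv(a)) == unit == mul(inv(a), a) for a in G)
    associative_law = all(
        mul(mul(a, b), c) == mul(a, mul(b, c))
        for a, b, c in product(G, repeat=3)
    )
    return identity_law and inverse_law and associative_law

# Integers mod 5 under addition, with 0 as unity and negation as inverse.
n = 5
Z5 = range(n)
is_group = satisfies_group_equations(
    Z5, lambda: 0, lambda a: (-a) % n, lambda a, b: (a + b) % n
)

# Same carrier with subtraction as the binary operation: the equations fail
# (for instance 0 - 1 is 4 mod 5, not 1, so 0 is not a unity element).
is_not_group = satisfies_group_equations(
    Z5, lambda: 0, lambda a: a, lambda a, b: (a - b) % n
)

print(is_group, is_not_group)
```

Because every axiom is an equation between terms, the checker never has to search for an element that "exists"; the unity and the inverses are supplied as operations.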

The old form is not wrong; it is merely inelegant.  With the old form, you have to prove that the unity and inverses are unique before you can introduce notation for them. More important, by making it clear that groups satisfy equational logic you get a lot of theorems for free: you can construct products on the Cartesian power of the underlying set, quotients by congruence relations, and other things. (Of course, in AbAl those theorems will be stated later than when groups are defined because the book is for newbies and you want lots of examples before theorems.)
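For reference, the uniqueness arguments that the old form requires are short. Here is a sketch in the notation of the old definition, where $e'$ stands for a hypothetical second unity element and $b'$ for a hypothetical second inverse of $a$ (both symbols are introduced only for this derivation):

```latex
\begin{align*}
e &= ee' = e' \\
b &= be = b(ab') = (ba)b' = eb' = b'
\end{align*}
```

The first line uses the unity law twice, once for $e'$ and once for $e$. The second uses the unity law, the fact that $b'$ is an inverse of $a$, associativity, and the fact that $b$ is an inverse of $a$.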

References

  1. Three kinds of mathematical thinkers (G&G post)
  2. Technical meanings clash with everyday meanings (G&G post)
  3. Commonword names for technical concepts (G&G post)
  4. Renaming technical concepts (G&G post)
  5. Explaining higher math to beginners (G&G post)
  6. Visual Algebra II (G&G post)
  7. Monads for high school II: Lists (G&G post)
  8. The mystery of the prime numbers: a review (G&G post)
  9. Hersh, R. (1997a), "Math lingo vs. plain English: Double entendre". American Mathematical Monthly, volume 104, pages 48–51.
  10. Names (in abmath)
  11. Cognitive dissonance (in abmath)

Whole numbers

Sue Van Hattum wrote in response to a recent post:

I’d like to know what you think of my ‘abuse of terminology’. I teach at a community college, and I sometimes use incorrect terms (and tell the students I’m doing so), because they feel more aligned with common sense.

To me, and to most students, the phrase “whole numbers” sounds like it refers to anything that doesn’t need fractions to represent it, and should include negative numbers. (It then, of course, would mean the same thing that the word integers does.) So I try to avoid the phrase, mostly. But I sometimes say we’ll use it with the common sense meaning, not the official math meaning.

Her comments brought up a couple of things I want to blather about.

Official meaning

There is no such thing as an "official math meaning".  Mathematical notation has no governing authority and research mathematicians are too ornery to go along with one anyway.  There is a good reason for that attitude:  Mathematical research constantly causes us to rethink the relationship among different mathematical ideas, which can make us want to use names that show our new view of the ideas.  An excellent example of that is the evolution of the concept of "function" over the past 150 years, traced in the Wikipedia article.

What some "authorities" say about "whole number":

  • MathWorld  says that "whole number" is used to mean any of these:  Any positive integer, any nonnegative integer or any integer.
  • Wikipedia also allows all three meanings.
  • Webster's New World dictionary (for which I have been a consultant, but they didn't ask me about whole numbers!) gives "any integer" as a second meaning.
  • American Heritage Dictionary gives "any integer" as the only meaning.
  • Someone stole my copy of Merriam Webster.

Common Sense Meaning

Mathematicians think about and talk about any particular kind of math object using images and metaphors.  Sometimes (not very often) the name they give to a math object embodies a metaphor.  Examples:

  • A complex number is usually notated using two real parameters, so it looks more complicated than a real number.
  • "Rings" were originally called that because the first examples were integers (mod n) for some positive integer, and you can think of them as going around a clock showing n hours.

Unfortunately, much of the time the name of a kind of object contains a suggestive metaphor that is bad,  meaning that it suggests an erroneous picture or idea of what the object is like.

  • A "group" ought to be a bunch of things.  In other words, the word ought to mean "set".
  • The word "line" suggests that it ought to be a row of points.  That suggests that each point on a line ought to have one next to it.  But that's not true on the "real line"!

Sue's idea that the "common sense" meaning of "whole number" is "integer" refers, I think, to the built-in metaphor of the phrase "whole number" (unbroken number).

I urge math teachers to do these things:

  • Explain to your students that the same math word or phrase can mean different things in different books.
  • Convince your students to avoid being fooled by the common-sense (metaphorical) meaning of a mathematical phrase.

Defining “category”

The concept of category is typically taught later in undergrad math than the concept of group is.  It is supposedly a more advanced concept.  Indeed, the typical examples of categories used in applications are more advanced than some of those in group theory (for example, symmetries of geometric shapes and operations on numbers).

Here are some thoughts on how categories could be taught as early as groups, if not earlier.

Nodes and arrows

Small finite categories can be pictured as a graph using nodes and arrows, together with a specification of the identity arrows and a definition of the composition.  (I am using the word “graph” the way category people use it:  a directed graph with possible multiple edges and loops.)

An example is the category pictured below with three objects and seven arrows. The composition is forced except for $kh$, which I hereby define to be $f$.

This way of picturing a category is  easy to grasp. The composite $kh$ visibly has to be either $f$ or $g$.  There is only one choice for the composite of any other composable pair.  Still, the choice of composite is not deducible directly by looking at the graph.

A first class in category theory using graphs as examples could start with this example, or the example in Note 1 below.  This example is nontrivial (never start any subject with trivial examples!) and easy to grasp, in this case using the extraordinary preprocessing your brain does with the input from your eyes.  The definition of category is complicated enough that you should probably present the graph and then give the definition while pointing to what each clause says about the graph.

Most abstract structures have several different ways of representing them. In contrast, when you discuss categorial concepts the standard object-and-arrow notation is the overwhelming favorite.  It reveals domains and codomains and composable pairs, in fact almost everything except which of several possible arrows the composite actually is.  If for example you try to define category using sets and functions as your running example, the student has to do a lot of on-the-go chunking — thinking of a set as a single object, of a set function (which may involve lots of complicated data) as a single chunk with a domain and a codomain, and so on.  But an example shown as a graph comes already chunked and in a picture that is guaranteed to be the most common kind of display they will see in discussions of categories.

After you do these examples, you can introduce trivial and simple graph examples in which the composition is entirely induced; for example these three:

(In case you are wondering, one of them is the empty category.)  I expect that you should also introduce another graph non-example in which associativity fails.

Multiplication tables

The multiplication table for a group is easy to understand, too, in the sense that it gives you a simple method of calculating the product of any two elements.  But it doesn’t provide a visual way to see the product as a category-as-graph does.  Of course, the graph representation works only for finite categories, just as the multiplication table works only for finite groups.

You can give a multiplication table for a small finite category, too, like the one below for the category above.  (“iA” means the identity arrow on A and composition, as usual in category theory, is right to left.) This is certainly more abstract than the graph picture, but it does hit you in the face with the fact that the multiplication is partial.
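A multiplication table for a small category can also be written down and checked mechanically. The Python sketch below is a hypothetical reconstruction, not the table from the picture: I am assuming the graph has objects A, B, C, identity arrows iA, iB, iC, and arrows h: A → B, k: B → C and f, g: A → C, which is consistent with "three objects and seven arrows" and with $kh$ having to be $f$ or $g$. The names `arrows`, `make_composition` and `is_category` are mine.

```python
from itertools import product

# Assumed source and target of each arrow (a guess at the pictured graph).
arrows = {
    "iA": ("A", "A"), "iB": ("B", "B"), "iC": ("C", "C"),
    "h": ("A", "B"), "k": ("B", "C"),
    "f": ("A", "C"), "g": ("A", "C"),
}
identity = {"A": "iA", "B": "iB", "C": "iC"}

def make_composition(kh):
    """Build the partial table; compose[(q, p)] means 'q after p'."""
    compose = {}
    for p, (src_p, tgt_p) in arrows.items():
        for q, (src_q, tgt_q) in arrows.items():
            if tgt_p != src_q:
                continue  # q after p is undefined: the multiplication is partial
            if p == identity[src_p]:
                compose[(q, p)] = q   # composing with an identity is forced
            elif q == identity[tgt_q]:
                compose[(q, p)] = p
            else:
                compose[(q, p)] = kh  # the only remaining pair is (k, h)
    return compose

def is_category(compose):
    """Check typing, identity laws and associativity for the table."""
    for (q, p), r in compose.items():
        # The composite must run from the source of p to the target of q.
        if arrows[r] != (arrows[p][0], arrows[q][1]):
            return False
    for p, (src, tgt) in arrows.items():
        if compose[(p, identity[src])] != p or compose[(identity[tgt], p)] != p:
            return False
    for p, q, r in product(arrows, repeat=3):
        if (q, p) in compose and (r, q) in compose:
            # (r q) p must equal r (q p) whenever both composites are defined.
            if compose[(compose[(r, q)], p)] != compose[(r, compose[(q, p)])]:
                return False
    return True

table = make_composition("f")
print(len(table), "of", len(arrows) ** 2, "pairs are composable")
print(is_category(table), is_category(make_composition("g")))
print(is_category(make_composition("h")))  # wrong source/target, so not a category
```

Under these assumptions only 12 of the 49 ordered pairs of arrows have a composite, which is the partiality the table makes visible, and both choices of $kh$ give a category.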

Notes

1. My suggested example of a category given as a graph shows clearly that you can define two different categorial structures on the graph.  One problem is that the two different structures are isomorphic categories.  In fact, if you engage the students in a discussion about these examples someone may notice that!  So you should probably also use the graph below, where you can define several different category structures that are not all isomorphic.

2. Multiplication tables and categories-as-graphs-with-composition are extensional presentations.  This means they are presented with all their parts laid out in front of you.  Most groups and categories are given by definitions as accumulations of properties (see concept in the Handbook of Mathematical Discourse).  These definitions tend to make some requirements such as associativity obvious.

Students are sometimes bothered by extensional definitions.  “What are h and k (in the category above)?  What are a, b and c?” (in a group given as a set of letters and a multiplication table).

Function as map

This is a first draft of an article to eventually appear in abstractmath.

Images and metaphors

To explain a math concept, you need to explain how mathematicians think about the concept. This is what in abstractmath I call the images and metaphors carried by the concept. Of course you have to give the precise definition of the concept and basic theorems about it. But without the images and metaphors most students, not to mention mathematicians from a different field, will find it hard to prove much more than some immediate consequences of the definition. Nor will they have much sense of the place of the concept in math and applications.

Teachers will often explain the images and metaphors with handwaving and pictures in a fairly vague way. That is good to start with, but it’s important to get more precise about the images and metaphors. That’s because images and metaphors are often not quite a good fit for the concept — they may suggest things that are false and not suggest things that are true. For example, if a set is a container, why isn’t the element-of relation transitive? (A coin in a coinpurse in your pocket is a coin in your pocket.)
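The coin purse example can be written out with nested sets. This is a small Python sketch; the variable names are mine, and a string stands in for the coin.

```python
# A coin in a coin purse in a pocket, modeled as nested sets.
coin = "coin"
purse = frozenset({coin})     # the purse contains the coin
pocket = frozenset({purse})   # the pocket contains the purse

coin_in_purse = coin in purse        # True
purse_in_pocket = purse in pocket    # True
coin_in_pocket = coin in pocket      # False: membership is not transitive

print(coin_in_purse, purse_in_pocket, coin_in_pocket)
```

The only element of `pocket` is the purse itself, so the container metaphor suggests something ("the coin is in the pocket") that the element-of relation does not deliver.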

“A metaphor is a useful way to think about something, but it is not the same thing as the same thing.” (I think I stole that from the Economist.) Here, I am going to get precise with the notion that a function is a map. I am acting like a mathematician in “getting precise”, but I am getting precise about a metaphor, not about a mathematical object.

A function is a map

A map (ordinary paper map) of Minnesota has the property that each point on the paper represents a point in the state of Minnesota. This map can be represented as a mathematical function from a subset of a 2-sphere to $\mathbb{R}^2$. The function is a mathematical idealization of the relation between the state and the piece of paper, analogous to the mathematical description of the flight of a rocket ship as a function from $\mathbb{R}$ to $\mathbb{R}^3$.

The Minnesota map-as-function is probably continuous and differentiable, and as is well known it can be angle preserving or area preserving but not both.

So you can say there is a point on the paper that represents the location of the statue of Paul Bunyan in Bemidji. There is a set of points that represents the part of the Mississippi River that lies in Minnesota. And so on.

A function has an image. If you think about it you will realize that the image is just a certain portion of the piece of paper. Knowing that a particular point on the paper is in the image of the function is not the information contained in what we call “this map of Minnesota”.

This yields what I consider a basic insight about function-as-map:  The map contains the information about the preimage of each point on the paper map. So:

The map in the sense of a “map of Minnesota” is represented by the whole function, not merely by the image.

I think that is the essence of the metaphor that a function is a map. And I don’t think newbies in abstractmath always understand that relationship.
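Here is a toy Python sketch of the difference between a function and its image. Everything in it is made up for illustration: the paper coordinates, the choice of Duluth and Minneapolis alongside Bemidji, and the names `image`, `preimage`, `correct_map` and `scrambled_map`. Two "maps" mark the same three spots on the paper, but only one of them puts each town where it belongs.

```python
def image(f):
    """The set of points on the paper that the function hits."""
    return set(f.values())

def preimage(f, point):
    """The places in the state that a given point on the paper represents."""
    return {place for place, spot in f.items() if spot == point}

# Hypothetical paper coordinates for three towns.
correct_map = {"Bemidji": (2, 8), "Duluth": (7, 6), "Minneapolis": (5, 2)}
scrambled_map = {"Bemidji": (5, 2), "Duluth": (2, 8), "Minneapolis": (7, 6)}

same_image = image(correct_map) == image(scrambled_map)   # True
same_function = correct_map == scrambled_map              # False

print(same_image, same_function)
print(preimage(correct_map, (2, 8)), preimage(scrambled_map, (2, 8)))
```

The two dictionaries have the same image, so the image alone cannot tell a usable map from a scrambled one; the information lives in which place goes to which point.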

A morphism is a map

The preceding discussion doesn’t really represent how we think of a paper map of Minnesota. We don’t think in terms of points at all. What we see are marks on the map showing where some particular things are. If it is a road map it has marks showing a lot of roads, a lot of towns, and maybe county boundaries. If it is a topographical map it will show level curves showing elevation. So a paper map of a state should be represented by a structure preserving map, a morphism. Road maps preserve some structure, topographical maps preserve other structure.

The things we call “maps” in math are usually morphisms. For example, you could say that every simple closed curve in the plane is an equivalence class of maps from the unit circle to the plane. Here “equivalence class” means that we forget the parametrization.

The very reason I have to mention forgetting the parametrization is that the commonest mathematical way to talk about morphisms is as point-to-point maps with certain properties. But we think about a simple closed curve in the plane as just a distorted circle. The point-to-point correspondence doesn’t matter. So this example is really talking about a morphism as a shape-preserving map. Mathematicians introduced points into talking about preserving shapes in the nineteenth century and we are so used to doing that that we think we have to have points for all maps.

Not that points aren’t useful. But I am analyzing the metaphor here, not the technical side of the math.

Groups are functors

People who don’t do category theory think the idea of a mathematical structure as a functor is weird. From the point of view of the preceding discussion, a particular group is a functor from the generic group to some category. (The target category is Set if the group is discrete, Top if it is a topological group, and so on.)

The generic group is a group in a category called its theory or sketch that is just big enough to let it be a group. If the theory is the category with finite products that is just big enough then it is the Lawvere theory of the group. If it is a topos that is just big enough then it is the classifying topos of groups. The theory in this sense is equivalent to some theory in the sense of string-based logic, for example the signature-with-axioms (equational theory) or the first order theory of groups. Johnstone’s Elephant book is the best place to find the translation between these ideas.

A particular group is represented by a finite-product-preserving functor on the algebraic theory, or by a logical functor on the classifying topos, and so on. These constructions bring with them the right concept of group homomorphism as well: the homomorphisms are the natural transformations.

The way we talk about groups mimics the way we talk about maps. We look at the symmetric group on five letters and say its multiplication is noncommutative. “Its multiplication” tells us that when we talk about this group we are talking about the functor, not just the values of the functor on objects. We use the same symbols of juxtaposition for multiplication in any group, “$1$” or “$e$” for the identity, “$a^{-1}$” for the inverse of $a$, and so on. That is because we are really talking about the multiplication, identity and inverse function in the generic group; they really are the same for all groups. That is because a group is not its underlying set, it is a functor. Just like the map of Minnesota “is” the whole function from the state to the paper, not just the image of the function.

"Automorphisms of group extensions" augmented

There has recently been an uptick in citations to my paper [1].  Several works over the years ([2], [3], [4]) have given proofs of my theorem that are easier to understand and more informative, so I have posted a package here that contains the original paper, a correction I published later, and the references below.  Malfait’s article in particular embeds my exact sequence into a remarkable cube of exact sequences.

[1] Charles Wells, Automorphisms of group extensions, Trans. Amer. Math. Soc. 155 (1970), 189–194.

[2] Kung Wei Yang, Isomorphisms of group extensions, Pacific J. Math. 50 (1974), no. 1, 299–304.

[3] D.J.S. Robinson, Applications of cohomology to the theory of groups, Groups – St. Andrews 1981, London Math. Soc. Lecture Notes vol. 71 (1982), 46–80.

[4] Wim Malfait, The (outer) automorphism group of a group extension, Bull. Belg. Math. Soc. 9 (2002), 361–372.

Mathematical concepts

This post was triggered by John Armstrong’s comment on my last post.

We need  to distinguish two ideas: representations of a mathematical concept and the total concept.  (I will say more about terminology later.)

Example: We can construct the quotient of the domain of a group homomorphism by its kernel by taking the cosets of the kernel and defining a multiplication on them.  We can construct the image of the homomorphism by taking the set of values of the homomorphism and using the multiplication induced by the codomain group.   The quotient group and the image are the same mathematical structure in the sense that anything useful you can say about one is true of the other.   For example, it may be useful to know the cardinality of the quotient (image) but it is not useful to know what its elements are.

But hold on, as the Australians say, if we knew that the codomain was an abelian group then we would know that the quotient group was abelian because the elements of the image form a subgroup of the codomain. (But the Australians I know wouldn’t say that.)

Now that kind of thinking is based on the idea that the elements of the image are “really” elements of the codomain whereas elements of the quotient are “really” subsets of the domain.  That is outmoded thinking.  The image and the quotient are the same in all important aspects because they are naturally isomorphic.   We should think of the quotient as just as much a subgroup of the codomain as the image is.  John Baez (I think) would say that to ask whether the subgroup embedding is the identity on elements or not is an evil question.

Let’s step back and look at what is going on here.  The definition of the quotient group is a construction using cosets.  The definition of the image is a construction using values of the homomorphism.  Those are two different specific  representations of the same concept.
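The two representations can be compared directly on a small example. The Python sketch below is an illustration under my own assumptions: the homomorphism `phi` (multiplication by 4 on the integers mod 12, written additively) and all the names are chosen by me. It builds the quotient from cosets of the kernel, builds the image from values, and checks that the obvious correspondence is a bijection that preserves the operation.

```python
n = 12

def phi(x):
    """A sample homomorphism Z/12 -> Z/12 (an assumption for illustration)."""
    return (4 * x) % n

G = range(n)
kernel = frozenset(x for x in G if phi(x) == 0)

# Representation 1: the quotient, whose elements are cosets of the kernel.
cosets = {frozenset((x + k) % n for k in kernel) for x in G}

# Representation 2: the image, whose elements are values of phi.
image = {phi(x) for x in G}

def coset_sum(X, Y):
    """The group operation on cosets, computed elementwise."""
    return frozenset((x + y) % n for x in X for y in Y)

def to_image(X):
    """Send a coset to the single value phi takes on it."""
    values = {phi(x) for x in X}
    assert len(values) == 1  # phi is constant on each coset of its kernel
    return values.pop()

is_bijection = (
    {to_image(X) for X in cosets} == image and len(cosets) == len(image)
)
preserves_operation = all(
    to_image(coset_sum(X, Y)) == (to_image(X) + to_image(Y)) % n
    for X in cosets for Y in cosets
)

print(sorted(kernel), sorted(image), len(cosets))
print(is_bijection, preserves_operation)
```

The elements of the two structures are different kinds of things (sets of numbers in one case, numbers in the other), yet every check that matters about the group structure comes out the same.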

But what is the concept, as distinct from its representations?  Intuitively, it is

  • All the constructions made possible by the definition of the concept.
  • All the statements that are true about the concept.

(That is not precise.)

The total concept is like the clone plus the equational theory of a specific type of algebra in the sense of universal algebra.  The clone is all the operations you can construct knowing the given signature and equations and the equational theory is the set of all equations that follow from them.  That is one way of describing it.  Another is the monad in Set that gives the type of algebra — the operations are the arrows and the equations are the commutative diagrams.

Note: The preceding description of the monad is not quite right.  Also the whole discussion omits mention of the fact that we are in the world (doctrine) of universal algebra.  In the world of first order logic, for example, we need to refer to the classifying topos of the category of algebras of that type (or to its first order theory).

Terminology

We need better terminology for all this.  I am not going to propose better terminology, so this is a shaggy dog story.

Math ed people talk about a particular concept image of a concept as well as the total schema of the concept.

In categorical logic, we talk about the sketch or presentation of the concept vs. the theory. The theory is a category (of the kind appropriate to the doctrine) that contains all the possible constructions and commutative diagrams that follow from the presentation.

In this post I have used “total concept” to refer to the schema or theory.  I have referred to the particular things as “representations” (for example, constructing the image of a homomorphism by cosets or by values of the homomorphism).

“Representation” does not have the same connotations as “presentation”.  Indeed a presentation of a group and a representation of a group are mathematically  two different things.  But I suspect they are two different aspects of the same idea.

All this needs to be untangled.  Maybe we should come up with two completely arbitrary words, like “dostak” and “dosh”.