Naming mathematical objects

Commonword names confuse

Many technical words and phrases in math are ordinary English words ("commonwords") that are assigned a different and precisely defined mathematical meaning.  

  • Group  This sounds to the "layman" as if it ought to mean the same things as "set".  You get no clue from the name that it involves a binary operation with certain properties.  
  • Formula  In some texts on logic, a formula is a precisely defined expression that becomes a true-or-false sentence (in the semantics) when all its variables are instantiated.  So $(\forall x)(x>0)$ is a formula.  The word "formula" in ordinary English makes you think of things like "$\textrm{H}_2\textrm{O}$", which has no semantics that makes it true or false — it is a symbolic expression for a name.
  • Simple group This has a technical meaning: a group with no nontrivial normal subgroup.  The Monster Group is "simple".  Yes, the technical meaning is motivated by the usual concept of "simple", but to say the Monster Group is simple causes cognitive dissonance.

Beginning students come with the (generally subconscious) expectation that they will pick up clues about the meanings of words from connotations they are already familiar with, plus things the teacher says using those words.  They think in terms of refining an understanding they already have.  This is more or less what happens in most non-math classes.  They need to be taught what definition means to a mathematician.

Names that don't confuse but may intimidate

Other technical names in math don't cause the problems that commonwords cause.

Named after somebody The phrase "Hausdorff space" leads a math student to understand that it has a technical meaning.  They may not even know it is named after a person, but it screams "geek word" and "you don't know what it means".  That is a signal that you can find out what it means.  You don't assume you know its meaning. 

New made-up words  Words such as "affine", "gerbe"  and "logarithm" are made up of words from other languages and don't have an ordinary English meaning.  Acronyms such as "QED", "RSA" and "FOIL" don't occur often.  I don't know of any math objects other than "RSA algorithm" that have an acronymic name.  (No doubt I will think of one the minute I click the Publish button.)  Whole-cloth words such as "googol" are also rare.  All these sorts of words would be good to name new things since they do not fool the readers into thinking they know what the words mean.

Both types of words avoid fooling the student into thinking they know what the words mean, but some students are intimidated by the use of words they haven't seen before.  They seem to come to class ready to be snowed.  A minority of my students over my 35 years of teaching were like that, but that attitude was a real problem for them.

Audience

You can write for several different audiences.

Math fans (non-mathematicians who are interested in math and read books about it occasionally) In my posts Explaining higher math to beginners and in Renaming technical conceptsI wrote about several books aimed at explaining some fairly deep math to interested people who are not mathematicians.  They renamed some things. For example, Mark Ronan in Symmetry and the Monster used the phrase "atom" for "simple group" presumably to get around the cognitive dissonance.  There are other examples in my posts.  

Math newbies  (math majors and other students who want to understand some aspect of mathematics).  These are the people abstractmath.org is aimed at. For such an audience you generally don't want to rename mathematical objects. In fact, you need to give them a glossary to explain the words and phrases used by people in the subject area.   

Postsecondary math students These people, especially the math majors, have many tasks:

  • Gain an intuitive understanding of the subject matter.
  • Understand in practice the logical role of definitions.
  • Learn how to come up with proofs.
  • Understand the ins and outs of mathematical English, particularly the presence of ordinary English words with technical definitions.
  • Understand and master the appropriate parts of the symbolic language of math — not just what the symbols mean but how to tell a statement from a symbolic name.

It is appropriate for books for math fans and math newbies to try to give an understanding of concepts without necessary proving theorems.  That is the aim of much of my work, which has more an emphasis on newbies than on fans. But math majors need as well the traditional emphasis on theorem and proof and clear correct explanations.

Lately, books such as Visual Group Theory have addressed beginning math majors, trying for much more effective ways to help the students develop good intuition, as well as getting into proofs and rigor. Visual Group Theory uses standard terminology.  You can contrast it with Symmetry and the Monster and The Mystery of the Prime Numbers (read the excellent reviews on Amazon) which are clearly aimed at math fans and use nonstandard terminology.  

Terminology for algebraic structures

I have been thinking about the section of Abstracting Algebra on binary operations.  Notice this terminology:

boptable

The "standard names" are those in Wikipedia.  They give little clue to the meaning, but at least most of them, except "magma" and "group", sound technical, cluing the reader in to the fact that they'd better learn the definition.

I came up with the names in the right column in an attempt to make some sense out of them.  The design is somewhat like the names of some chemical compounds.  This would be appropriate for a text aimed at math fans, but for them you probably wouldn't want to get into such an exhaustive list.

I wrote various pieces meant to be part of Abstracting Algebra using the terminology on the right, but thought better of it. I realized that I have been vacillating between thinking of AbAl as for math fans and thinking of it as for newbies. I guess I am plunking for newbies.

I will call groups groups, but for the other structures I will use the phrases in the middle column.  Since the book is for newbies I will include a table like the one above.  I also expect to use tree notation as I did in Visual Algebra II, and other graphical devices and interactive diagrams.

Magmas

In the sixties magmas were called groupoids or monoids, both of which now mean something else.  I was really irritated when the word "magma" started showing up all over Wikipedia. It was the name given by Bourbaki, but it is a bad name because it means something else that is irrelevant.  A magma is just any binary operation. Why not just call it that?  

Well, I will tell you why, based on my experience in Ancient Times (the sixties and seventies) in math. (I started as an assistant professor at Western Reserve University in 1965). In those days people made a distinction between a binary operation and a "set with a binary operation on it".  Nowadays, the concept of function carries with it an implied domain and codomain.  So a binary operation is a function $m:S\times S\to S$.  Thinking of a binary operation this way was just beginning to appear in the common mathematical culture in the late 60's, and at least one person remarked to me: "I really like this new idea of thinking of 'plus' and 'times' as functions."  I was startled and thought (but did not say), "Well of course it is a function".  But then, in the late sixties I was being indoctrinated/perverted into category theory by the likes of John Isbell and Peter Hilton, both of whom were briefly at Case Western Reserve University.  (Also Paul Dedecker, who gave me a glimpse of Grothendieck's ideas).

Now, the idea that a binary operation is a function comes with the fact that it has a domain and a codomain, and specifically that the domain is the Cartesian square of the codomain.  People who didn't think that a binary operation was a function had to introduce the idea of the universe (universal algebraists) or the underlying set (category theorists): you had to specify it separately and introduce terminology such as $(S,\times)$ to denote the structure.   Wikipedia still does it mostly this way, and I am not about to start a revolution to get it to change its ways.

Groups

In the olden days, people thought of groups in this way:

  • A group is a set $G$ with a binary operation denoted by juxtaposition that is closed on $G$, meaning that if $a$ and $b$ are any elements of $G$, then $ab$ is in $G$.
  • The operation is associative, meaning that if $a,\ b,\ c\in G$, then $(ab)c=a(bc)$.
  • The operation has a unity element, meaning an element $e$ for which for any element $a\in G$, $ae=ea=a$.
  • For each element $a\in G$, there is an element $b$ for which $ab=ba=e$.

This is a better way to describe a group:

  • A group consist of a nullary operation e, a unary operation inv,  and a binary operation denoted by juxtaposition, all with the same codomain $G$. (A nullary operation is a map from a singleton set to a set and a unary operation is a map from a set to itself.)
  • The value of e is denoted by $e$ and the value of inv$(a)$ is denoted by $a^{-1}$.
  • These operations are subject to the following equations, true for all $a,\ b,\ c\in G$:

     

    • $ae=ea=a$.
    • $aa^{-1}=a^{-1}a=e$.
    • $(ab)c=a(bc)$.

This definition makes it clear that a group is a structure consisting of a set and three operations whose axioms are all equations.  It was formulated by people in universal algebra but you still see the older form in texts.

The old form is not wrong, it is merely inelegant.  With the old form, you have to prove the unity and inverses are unique before you can introduce notation, and more important, by making it clear that groups satisfy equational logic you get a lot of theorems for free: you construct products on the cartesian power of the underlying set, quotients by congruence relations, and other things. (Of course, in AbAl those theorem will be stated later than when groups are defined because the book is for newbies and you want lots of examples before theorems.)

References

  1. Three kinds of mathematical thinkers (G&G post)
  2. Technical meanings clash with everyday meanings (G&G post)
  3. Commonword names for technical concepts (G&G post)
  4. Renaming technical concepts (G&G post)
  5. Explaining higher math to beginners (G&G post)
  6. Visual Algebra II (G&G post)
  7. Monads for high school II: Lists (G&G post)
  8. The mystery of the prime numbers: a review (G&G post)
  9. Hersh, R. (1997a), "Math lingo vs. plain English: Double entendre". American Mathematical Monthly, volume 104, pages 48–51.
  10. Names (in abmath)
  11. Cognitive dissonance (in abmath)

Idempotents by sketches and forms

 
This post provides a detailed description of an example of a mathematical structure presented as a sketch and as a form.  It is a supplement to my article An Introduction to forms.  Most of the constructions I mention here are given in more detail in that article.
 
It helps in reading this post to be familiar with the basic ideas of category, including commutative diagram and limit cone, and of the concepts of logical theory and model in logic.
 

Sketches and forms

sketch of a mathematical structure is a collection of objects and arrows that make up a digraph (directed graph), together with some specified cones, cocones and diagrams in the digraph.  A model of the sketch is a digraph morphism from the digraph to some category that takes the cones to limit cones, the cocones to colimit cocones, and the diagrams to commutative diagrams.  A morphism of models of a sketch from one model to another in the same category is a natural transformation.  Sketches can be used to define all kinds of algebraic structures in the sense of universal algebra, and many other types of structures (including many types of categories).  

There are many structures that sketches cannot sketch.  Forms were first defined in [4].  They can define anything a sketch can define and lots of other things.  [5] gives a leisurely description of forms suitable for people who have a little bit of knowledge of categories and [1] gives a more thorough description.  

An idempotent is a very simple kind of algebraic structure.  Here I will describe both a sketch and a form for idempotents. In another post I will do the same for binops (magmas).

Idempotent

An idempotent is a unary operation $u$ for which $u^2=u$.

  • If $u$ is a morphism in a category whose morphisms are set functions, a function $u:S\to S$ is an idempotent if $u(u(x))=u(x)$ for all $x$ in the domain.  
  • Any identity element in any category is an idempotent.
  • A nontrivial example is the function $u(x,y):=(-x,0)$ on the real plane.  

Any idempotent $u$ makes the following diagram commute

and that diagram can be taken as the definition of idempotent in any category.

The diagram is in green.  In this post (and in [5]) diagrams in the category of models of a sketch or a form are shown in green.

A sketch for idempotents

The sketch for idempotents contains a digraph with one object and one arrow from that object to itself (above left) and one diagram (above right).  It has no cones or cocones.  So this is an almost trivial example.  When being expository (well, I can hardly say "when you are exposing") your first example should not be trivial, but it should be easy.  Let's call the sketch $\mathcal{S}$.

  • The diagram looks the same as the green diagram above.  It is in black, because I am showing things in syntax (things in sketches and forms) in black and semantics (things in categories of models) in green.
  • The green diagram is a commutative diagram in some category (unspecified).  
  • The black diagram is a diagram in a digraph. It doesn't make sense to say it is commutative because digraphs don't have composition of arrows.
  • Each sketch has a specific digraph and lists of specific diagrams, cones and cocones.  The left digraph above is not in the list of diagrams of $\mathcal{S}$ (see below).

The definition of sketch says that every diagram in the official list of diagrams of a given sketch must become a commutative diagram in a model.  This use of the word "become" means in this case that a model must be a digraph morphism $M:\mathcal{S}\to\mathcal{C}$ for some category $\mathcal{C}$ for which the diagram below commutes.

This sketch generates a category called the Theory ("Cattheory" in [5]) of the sketch $\mathcal{S}$, denoted by $\text{Th}(\mathcal{S})$.  It is roughly the "smallest" category containing $f$ and $C$ for which the diagrams in $\mathcal{S}$ are commutative.  
 
This theory contains the generic model $G:\mathcal{S}\to \text{Th}(\mathcal{S})$ that takes $f$ and $C$ to themselves.
  • $G$ is "generic" because anything you prove about $G$ is true of every model of $\mathcal{S}$ in any category.
  • In particular, in the category $\text{Th}(\mathcal{S})$, $G(f)\circ G(f)=G(f)$.  
  • $G$ is a universal morphism in the sense of category theory: It lifts any model $M:\mathcal{S}\to\mathcal{C}$ to a unique functor $\bar{M}=M\circ G:\text{Th}(\mathcal{S})\to\mathcal{C}$ which can therefore be regarded as the same model.  See Note [2].
SInce models are functors, morphisms between models are natural transformations.  This gives what you would normally call homomorphisms for models of almost any sketchable structure.  In [2] you can find a sketch for groups, and indeed the natural transformations between models are group homomorphisms.

Sketching categories

You can sketch categories with a sketch CatSk containing diagrams and cones, but no cocones.  This is done in detail in [3]. The resulting theory $\text{Th}(\mathbf{CatSk})$ is required to be the least category-with-finite-limits generated by $\mathcal{S}$ with the diagrams becoming commutative diagrams and the cones becoming limit cones.  This theory is the FL-Theory for categories, which I will call ThCat (suppressing mention of FL).  

Doctrines

In general the theory of a particular kind of structure contains a parameter that denotes its doctrine. The sketch $\mathcal{S}$ for idempotents didn't require cones, but you can construct theories $\text{Th}(\mathcal{S})$, $\text{Th} (\text{FP},\mathcal{S})$ and $\text{Th}(\text{FL},\mathcal{S})$ for idempotents (FP means it is a category with finite products).  

In a strong sense, all these theories have the same models, namely idempotents, but the doctrine of the theory allows you to use more mechanisms for proving properties of idempotents.  (The doctrine for $\text{Th}(\mathcal{S})$ provides for equational proofs for unary operations only, a doctrine which has no common name such as FP or FS.)  The paper [1] is devoted to explicating proof in the context of forms, using graphs and diagrams instead of formulas that are strings of symbols.

Describing composable pairs of arrows

The form for any type of structure is constructed using the FL theory for some type of category, for example category with all limits, cartesian closed category, topos, and so on.  The form for idempotents can be constructed in ThCat (no extra structure needed).  The form for reflexive function spaces (for example) needs the FL theory for cartesian closed categories (see [5]).

Such an FL theory must contain objects $\text{ob}$ and $\text{ar}$ that become the set of objects and the set of arrows of the category that a model produces.  (Since FL theories have models in any category with finite limits, I could have said "object of objects" and "object of arrows".  But in this post I will talk about only models in Set.)

ThCat contains an object  $\text{ar}_2$ that represents composable pairs of arrows.  That requires a cone to define it:

This must become a limit cone in a model.

  • I usually show cones in blue. 
  • $\text{dom}$ and $\text{cod}$ give (in a model) the domain and codomain of an arrow.
  • $\text{lfac}$ gives the left factor and $\text{rfac}$ gives the right factor. It is usually useful to give suggestive names to some of the projections in situations like this, since they will be used elsewhere (where they will be black!).
  • The objects and arrows in the diagram (including $\text{ar}_2$) are already members of the FL theory for categories.
  • This diagram is annotated in green with sample names of objects and arrows that might exist in a model.  Atish and I introduced that annotation system in [1] to help you chase the diagram and think about what it means.

This cone is a graph-based description of the object of composable arrows in a category (as opposed to a linguistic or string-based description).

Describing endomorphisms

Now an idempotent must be an endomorphism, so we provide a cone describing the object of endomorphisms in a category. This cone already exists in the FL theory for categories.

  • $\text{loop}$ is a monomorphism (in fact a regular mono because it is the mono produced by an equalizer) so it is not unreasonable to give the element annotation for $\text{endo}$ and $\text{ar}$ the same name.
  • "$\text{dc}$" takes $f$ to its domain and codomain. 
  • $\text{loop}$ and "$\text{dc}$" were not created when I produced the cone above.  They were already in the FL theory for categories.
 
Since the cone defining $\text{ar}_2$ is a limit cone (in the Theory, not in a model), if you have any other commutative cone (purple) to that cone, a unique arrow (red) $\text{diag}$ automatically is present as shown below:

This particular purple cone is the limit cone defining $\text{endo}$ just defined.  Now $\text{diag}$ is a specific arrow in the FL theory for categories. In a model of the theory (which is a category in Set or in some other category) takes an endomorphism to the corresponding pair of composable arrows.

The object of idempotents

Now using these arrows we can define the object $\text{idm}$ of idempotents using the diagram below. See Note [3].

 

 

 

 

 

Idm is an object in ThCat.  In any category, in other words in any model of ThCat, idm becomes the set of idempotent arrows in that category.

In the terminology of [5], the object idm is the form for idempotents, and the cone it is the limit of is the description of idempotent.  

Now take ThCat and adjoin an arrow $g:1\to\text{idm}$.  You get a new FL category I will call the FL-theory of the form for idempotents.  A model of the theory of the form in Set  is a category with a specified idempotent. A particular example of a model of the form idm in the category of real linear vector spaces is the map $u(x,y):=(-x,0)$ of the (set of points of) the real plane to itself (it is an idempotent endomorphism of $\textbf{R}^2$).  

This example is typical of forms and their models, except in one way:  Idempotents are also sketchable, as I described above.  Many mathematical structures can be perceived as models of forms, but not models of sketches, such as reflexive function spaces as in [5].

Notes

[1] The diagrams shown in this post were drawn in Mathematica.  The code for them is shown in the notebook SketchFormExamples.nb .  I am in the early stages of developing a package for drawing categorical diagrams in Mathematica, so this notebook shows the diagrams defined in very primitive machine-code-like Mathematica.  The package will not rival xypic for TeX any time soon.  I am doing it so I can produce diagrams (including 3D diagrams) you can manipulate.

[2] In practice I would refer to the names of the objects and arrows in the sketch rather than using the M notation:  I might write $f\circ f=f$ instead of $M(f)\circ M(f)=M(f)$ for example.  Of course this confuses syntax with semantics, which sounds like a Grievous Sin, but it is similar to what we do all the time in writing math:  "In a semigroup, $x$ is an idempotent if $xx=x$."  We use same notation for the binary operation for any semigroup and we use $x$ as an arbitrary element of most anything.  Actually, if I write $f\circ f=f$ I can claim I am talking in the generic model, since any statement true in the generic model is true in any model.  So there.

[3] In the Mathematica notebook SketchFormExamples.nb in which I drew these diagrams, this diagram is plotted in Euclidean 3-space and can be viewed from different viewpoints by running your cursor over it.

References

[1] Atish Bagchi and Charles Wells, Graph-Base Logic and Sketches, draft, September 2008, on ArXiv.

[2] Michael Barr and Charles Wells, Category Theory for Computing Science (1999). Les Publications CRM, Montreal (publication PM023).

[3] Michael Barr and Charles Wells, Toposes, Triples and Theories (2005). Reprints in Theory and Applications of Categories 1.

[4] Charles Wells, A generalization of the concept of sketch, Theoretical Computer Science 70, 1990

[5] Charles Wells, An Introduction to forms.

 

 

 

An Introduction to Forms

In 2009, I wrote a sequence of posts on this blog explaining the concept of form that I introduced in [1].  I have now updated and combined them into an article [2].  The posts no longer exist on the blog. The article contains links to other papers on forms.

[1] A generalization of the concept of sketch, Theoretical Computer Science 70, 1990.

[2] An Introduction to forms.

More about defining “category”


In a recent post, I wrote about defining “category” in a way that (I hope) makes it accessible to undergraduate math majors at an early stage.  I have several more things to say about this.

Early intro to categories

The idea is to define a category as a directed graph equipped with an additional structure of composition of paths subject to some axioms.  By giving several small finite examples of categories drawn in that way that gives you an understanding of “category” that has several desirable properties:

  • You get the idea of what a category is in one lecture.
  • With the right choice of examples you get several fine points cleared up:
    • The composition is added structure.
    • A loop doesn’t have to be an identity.
    • Associativity is a genuine requirement –  it is not automatic.
  • You get immediate access to what is by far the most common notation used to work with a category — objects (nodes) and arrows.
  • You don’t have to cope with the difficult chunking required when the first examples given are sets-with-structure and structure-preserving functions.  It’s quite hard to focus on a couple of dots on the paper each representing a group or a topological space and arrows each representing a whole function (not the value of the function!).

Introduce more examples

Then the teacher can go on with the examples that motivated categories in the first place: the big deal categories such as sets, groups and topological spaces.   But they can be introduced using special cases so they don’t require much background.

  • Draw some finite sets and functions between them.  (As an exercise, get the students to find some finite sets and functions that make the picture a category with $f=kh$ as the composite and $f\neq g$.)
  • If the students have had calculus,  introduce them to the category whose objects are real finite nonempty intervals with continuous or differentiable mappings between them.  (Later you can prove that this category is a groupoid!)
  • Find all the groups on a two element set and figure out which maps preserve group multiplication.  (You don’t have to use the word “group” — you can simply show both of them and work out which maps preserve multiplication — and discover isomorphism!.)  This introduces the idea of the arrows being structure-preserving mape. You can get more complicated and use semigroups as well.  If the students know Mathematica you could even do magmas.  Well, maybe not.

All this sounds like a project you could do with high school students.

Large and small

If all this were just a high school (or intro-to-math-for-math-majors) project you wouldn’t have to talk about large vs. small.  However, I have some ideas about approaching this topic.

In the first place, you can define category, or any other mathematical object that might involve a proper class, using the syntactic approach I described in Just-in-time foundations.  You don’t say “A category consists of a set of objects and a set of arrows such that …”.  Instead you say something like “A category $\mathcal{C}$ has objects $A,\,B,\,C\ldots$ such that…”.

This can be understood as meaning “For any $A$, the statement $A$ is an object of  $\mathcal{C}$ is either true or false”, and so on.

This approach is used in the Wikibook on category theory.  (Note: this is a permanent link to the November 28 version of the section defining categories, which is mostly my work.  As always with Wikimedia things it may be entirely different when you read this.)

If I were dictator of the math world (not the same thing as dictator of MathWorld) I would want definitions written in this syntactic style.  The trouble is that mathematicians are now so used to mathematical objects having to be sets-with-structure that wording the definition as I did above may leave them feeling unmoored.  Yet the technique avoids having to mention large vs. small until a problem comes up. (In category theory it sometimes comes up when you want to quantify over all objects.)

The ideas outlined in this subsection could be a project for math majors.  You would have to introduce Russell’s Paradox.  But for an early-on intro to categories you could just use the syntactic wording and avoid large vs. small altogether.

 

http://en.wikibooks.org/w/index.php?title=Category_Theory/Categories&stableid=2221684

Defining “category”

The concept of category is typically taught later in undergrad math than the concept of group is.  It is supposedly a more advanced concept.  Indeed, the typical examples of categories used in applications are more advanced than some of those in group theory (for example, symmetries of geometric shapes and operations on numbers).

Here are some thoughts on how categories could be taught as early as groups, if not earlier.

Nodes and arrows

Small finite categories can be pictured as a graph using nodes and arrows, together with a specification of the identity arrows and a definition of the composition.  (I am using the word “graph” the way category people use it:  a directed graph with possible multiple edges and loops.)

An example is the category pictured below with three objects and seven arrows. The composition is forced except for $kh$, which I hereby define to be $f$.

This way of picturing a category is  easy to grasp. The composite $kh$ visibly has to be either $f$ or $g$.  There is only one choice for the composite of any other composable pair.  Still, the choice of composite is not deducible directly by looking at the graph.

A first class in category theory using graphs as examples could start with this example, or the example in Note 1 below.  This example is nontrivial (never start any subject with trivial examples!) and easy to grasp, in this case using the extraordinary preprocessing your brain does with the input from your eyes.  The definition of category is complicated enough that you should probably present the graph and then give the definition while pointing to what each clause says about the graph.

Most abstract structures have several different ways of representing them. In contrast, when you discuss categorial concepts the standard object-and-arrow notation is the overwhelming favorite.  It reveals domains and codomains and composable pairs, in fact almost everything except which of several possible arrows the composite actually is.  If for example you try to define category using sets and functions as your running example, the student has to do a lot of on-the-go chunking — thinking of a set as a single object, of a set function (which may involve lots of complicated data) as a single chunk with a domain and a codomain, and so on.  But an example shown as a graph comes already chunked and in a picture that is guaranteed to be the most common kind of display they will see in discussions of categories.

After you do these examples, you can introduce trivial and simple graph examples in which the composition is entirely induced; for example these three:

(In case you are wondering, one of them is the empty category.)  I expect that you should also introduce another graph non-example in which associativity fails.

Multiplication tables

The multiplication table for a group is easy to understand, too, in the sense that it gives you a simple method of calculating the product of any two elements.  But it doesn’t provide a visual way to see the product as a category-as-graph does.  Of course, the graph representation works only for finite categories, just as the multiplication table works only for finite groups.

You can give a multiplication table for a small finite category, too, like the one below for the category above.  (“iA” means the identity arrow on A and composition, as usual in category theory, is right to left.) This is certainly more abstract than the graph picture, but it does hit you in the face with the fact that the multiplication is partial.

Notes

1. My suggested example of a category given as a graph shows clearly that you can define two different categorial structures on the graph.  One problem is that the two different structures are isomorphic categories.  In fact, if you engage the students in a discussion about these examples someone may notice that!  So you should probably also use the graph below,where you can define several different category structures that are not all isomorphic. 

2. Multiplication tables and categories-as-graphs-with-composition are extensional presentations.  This means they are presented with all their parts laid out in front of you.  Most groups and categories are given by definitions as accumulations of properties (see concept in the Handbook of Mathematical Discourse).  These definitions tend to make some requirements such as associativity obvious.

Students are sometimes bothered by extensional definitions.  “What are h and k (in the category above)?  What are a, b and c?” (in a group given as a set of letters and a multiplication table).

Showing categorical diagrams in 3D

In Graph-Based Logic and Sketches, Atish Bagchi and I needed to construct a lot of cones based on fairly complicated diagrams. We generally show the base diagram and left the reader to imagine the cone. This post is an experiment in presenting such a diagram in 3D, with its cone and other constructions based on it.

To understand this post, you need a basic understanding of categories, functors and limit cones (see References).

To manipulate the diagram below, you must have Wolfram CDF Player installed on your computer. It is available free from their website.  Note that you can also drag the diagram around in three dimensions to see it from different perspectives.  (Of course it isn’t really in three dimensions.   Your eyes-to-brain module reconstructs the illusion of three dimensions when you twirl the diagram around.)

The notebook and CDF files that generate this display may be downloaded from here:

These files may be used and modified as you wish according to the Creative Commons rule listed under “Permissions” (at the top of the window).

The sketch for categories: composition

A finite-limit sketch (FL sketch) is a category with finite limits given by specifying certain  nodes and arrows, commutative diagrams using these nodes and arrows, and limit cones based on diagrams using the given nodes and arrows.  A model of an FL sketch is a finite-limit-preserving functor from the FL sketch into some category $latex \mathcal{C}$.  Detailed descriptions of FL-sketches are  in References [1], [2] and [3] (below).

Categories themselves may be sketched by FL-sketches. Here I will present the part of the sketch that constructs (in a model) the object of composites of two arrows.  This is the specification for composite:

  1. The composite of two arrows $latex f:A\to B$ and $latex g:B’\to C$ is defined if and only if $latex B=B’$.
  2. The composite is denoted by $latex gf$.
  3. The domain of $latex gf$ is $latex A$ and the codomain is $latex C$.

We start with a diagram in the FL sketch for categories that gives the data corresponding to two arrows that may be composed.  This diagram involves nodes ob and ar, which in a model become the object of objects and the object of arrows of the category object in $latex \mathcal{C}$.  (Suppose $latex \mathcal{C}$ is the category of sets; then the model is simply a small category.  The node ob goes to the set of objects of the small category and ar goes to the set of arrows.)  The arrows labeled dom and cod take (in a model) an arrow to its domain and codomain respectively. Here is the diagram:

Note that this is a diagram, not a directed graph (digraph). (In the paper, Atish and I, like most category theorists, say “graph” instead of “digraph”.) It has an underlying digraph (see Chapter 2 of Graph-Based Logic and Sketches), but the labeling of several different nodes of the underlying digraph by the name of the same node of the sketch is meaningful. 

Here, the key fact is that in the diagram there are two arrows, one labeled dom and the other cod, to the same node labeled ob, and two other arrows to two different nodes labeled ob. 

Now click c1.

This shows a cone over the diagram.  One of the nodes in the sketch must be cp (in other words given beforehand; that is, we are specifying not only that the blue stuff is a limit cone but that the limit is the node cp.)   In a model, this cone must become a limit cone.  It follows from the properties of limits that the elements of cp in the model in Sets are pairs of arrows with the property that one has a codomain that is the same as the domain of the other.  The label “cp” stands for “compatible pairs”.

Now click c2.

The green stuff is a diagram showing two arrows from the node labeled ar to the left and right nodes labeled ob in the original black diagram.  This is not a cone; it is just a diagram.  In a model, any arrow in the vertex must have domain the same as the domain of one of the arrows in the compatible pair, and codomain the same as the codomain of the other arrow of the pair.  Thus in the model, an arrow living in the set labeled with “ar” in green must satisfy requirement 3 in the specification for composition given above.

Note that the requirement that the green diagram be commutative in a model is vacuous, so it doesn’t matter whether we specify it specifically as a diagram in the sketch or not.

Now click c3.

The arrow labeled comp must be specified as an arrow in the sketch.  We want its value to be the composite of an element of cp in a model, in other words a compatible pair of arrows.  At this point that will not necessarily be true.  But all can be saved:

Now click c4.

We must specify that the diagram given by the thick arrows must be a diagram of the sketch.  The fact that it must become commutative in a model means exactly that the red arrow comp from cp to ar takes a compatible pair to an arrow that satisfies requirements 1–3 of the specification of composite shown above.

References

  1. Peter T. Johnstone, Sketches of an Elephant: A Topos Theory Compendium, Volume 2 (Oxford Logic Guides 44), by Oxford University Press, ISBN 978-0198524960.
  2. Michael Barr and Charles Wells, Category theory for computing science (1999).  Les Publications CRM, Montreal (publication PM023).  (This is the easiest to start with but it doesn’t get very far.)
  3. Michael Barr and Charles Wells, Toposes, Triples and Theories (2005).  Reprints in Theory and Applications of Categories 1.

 
$latex \mathcal{C}$

Function as map

This is a first draft of an article to eventually appear in abstractmath.

Images and metaphors

To explain a math concept, you need to explain how mathematicians think about the concept. This is what in abstractmath I call the images and metaphors carried by the concept. Of course you have to give the precise definition of the concept and basic theorems about it. But without the images and metaphors most students, not to mention mathematicians from a different field, will find it hard to prove much more than some immediate consequences of the definition. Nor will they have much sense of the place of the concept in math and applications.

Teachers will often explain the images and metaphors with handwaving and pictures in a fairly vague way. That is good to start with, but it’s important to get more precise about the images and metaphors. That’s because images and metaphors are often not quite a good fit for the concept — they may suggest things that are false and not suggest things that are true. For example, if a set is a container, why isn’t the element-of relation transitive? (A coin in a coinpurse in your pocket is a coin in your pocket.)

“A metaphor is a useful way to think about something, but it is not the same thing as the same thing.” (I think I stole that from the Economist.) Here, I am going to get precise with the notion that a function is a map. I am acting like a mathematician in “getting precise”, but I am getting precise about a metaphor, not about a mathematical object.

A function is a map

A map (ordinary paper map) of Minnesota has the property that each point on the paper represents a point in the state of Minnesota. This map can be represented as a mathematical function from a subset of a 2-sphere to $latex {{\mathbb R}^2}&fg=000000$. The function is a mathematical idealization of the relation between the state and the piece of paper, analogous to the mathematical description of the flight of a rocket ship as a function from $latex {{\mathbb R}}&fg=000000$ to $latex {{\mathbb R}^3}&fg=000000$.

The Minnesota map-as-function is probably continuous and differentiable, and as is well known it can be angle preserving or area preserving but not both.

So you can say there is a point on the paper that represents the location of the statue of Paul Bunyan in Bemidji. There is a set of points that represents the part of the Mississippi River that lies in Minnesota. And so on.

A function has an image. If you think about it you will realize that the image is just a certain portion of the piece of paper. Knowing that a particular point on the paper is in the image of the function is not the information contained in what we call “this map of Minnesota”.

This yields what I consider a basic insight about function-as-map:  The map contains the information about the preimage of each point on the paper map. So:

The map in the sense of a “map of Minnesota” is represented by the whole function, not merely by the image.

I think that is the essence of the metaphor that a function is a map. And I don’t think newbies in abstractmath always understand that relationship.

A morphism is a map

The preceding discussion doesn’t really represent how we think of a paper map of Minnesota. We don’t think in terms of points at all. What we see are marks on the map showing where some particular things are. If it is a road map it has marks showing a lot of roads, a lot of towns, and maybe county boundaries. If it is a topographical map it will show level curves showing elevation. So a paper map of a state should be represented by a structure preserving map, a morphism. Road maps preserve some structure, topographical maps preserve other structure.

The things we call “maps” in math are usually morphisms. For example, you could say that every simple closed curve in the plane is an equivalence class of maps from the unit circle to the plane. Here equivalence class meaning forget the parametrization.

The very fact that I have to mention forgetting the parametrization is that the commonest mathematical way to talk about morphisms is as point-to-point maps with certain properties. But we think about a simple closed curve in the plane as just a distorted circle. The point-to-point correspondence doesn’t matter. So this example is really talking about a morphism as a shape-preserving map. Mathematicians introduced points into talking about preserving shapes in the nineteenth century and we are so used to doing that that we think we have to have points for all maps.

Not that points aren’t useful. But I am analyzing the metaphor here, not the technical side of the math.

Groups are functors

People who don’t do category theory think the idea of a mathematical structure as a functor is weird. From the point of view of the preceding discussion, a particular group is a functor from the generic group to some category. (The target category is Set if the group is discrete, Top if it is a topological group, and so on.)

The generic group is a group in a category called its theory or sketch that is just big enough to let it be a group. If the theory is the category with finite products that is just big enough then it is the Lawvere theory of the group. If it is a topos that is just big enough then it is the classifying topos of groups. The theory in this sense is equivalent to some theory in the sense of string-based logic, for example the signature-with-axioms (equational theory) or the first order theory of groups. Johnstone’s Elephant book is the best place to find the translation between these ideas.

A particular group is represented by a finite-limit-preserving functor on the algebraic theory, or by a logical functor on the classifying topos, and so on; constructions which bring with them the right concept of group homomorphisms as well (they will be any natural transformations).

The way we talk about groups mimics the way we talk about maps. We look at the symmetric group on five letters and say its multiplication is noncommutative. “Its multiplication” tells us that when we talk about this group we are talking about the functor, not just the values of the functor on objects. We use the same symbols of juxtaposition for multiplication in any group, “$latex {1}&fg=000000$” or “$latex {e}&fg=000000$” for the identity, “$latex {a^{-1}}&fg=000000$” for the inverse of $latex {a}&fg=000000$, and so on. That is because we are really talking about the multiplication, identity and inverse function in the generic group — they really are the same for all groups. That is because a group is not its underlying set, it is a functor. Just like the map of Minnesota “is” the whole function from the state to the paper, not just the image of the function.

Mathematical concepts

This post was triggered by John Armstrong’s comment on my last post.

We need  to distinguish two ideas: representations of a mathematical concept and the total concept.  (I will say more about terminology later.)

Example: We can construct the quotient of the kernel of a group homomorphism by taking its cosets and defining a multiplication on them.  We can construct the image of the homomorphism by take the set of values of the homomorphism and using the multiplication induced by the codomain group.   The quotient group and the image are the same mathematical structure in the sense that anything useful you can say about one is true of the other.   For example, it may be useful to know the cardinality of the quotient (image) but it is not useful to know what its elements are.

But hold on, as the Australians say, if we knew that the codomain was an Abelian group then we would know that the quotient group was abelian because the elements of the image form a subgroup of the codomain. (But the Australians I know wouldn’t say that.)

Now that kind of thinking is based on the idea that the elements of the image are “really” elements of the codomain whereas elements of the quotients are “really” subsets of the domain.  That is outmoded thinking.  The image and the quotient are the same in all important aspects because they are naturally isomorphic.   We should think of the quotient as just as much as subgroup of the codomain as the image is.  John Baez (I think) would say that to ask whether the subgroup embedding is the identity on elements or not is an evil question.

Let’s step back and look at what is going on here.  The definition of the quotient group is a construction using cosets.  The definition of the image is a construction using values of the homomorphism.  Those are two different specific  representations of the same concept.

But what is the concept, as distinct from its representations?  Intuitively, it is

  • All the constructions made possible by the definition of the concept.
  • All the statements that are true about the concept.

(That is not precise.)

The total concept is like the clone plus the equational theory of a specific type of algebra in the sense of universal algebra.  The clone is all the operations you can construct knowing the given signature and equations and the equational theory is the set of all equations that follow from them.  That is one way of describing it.  Another is the monad in Set that gives the type of algebra — the operations are the arrows and the equations are the commutative diagrams.

Note: The preceding description of the monad is not quite right.  Also the whole discussion omits mention of the fact that we are in the world (doctrine) of universal algebra.  In the world of first order logic, for example, we need to refer to the classifying topos of the category of algebras of that type (or to its first order theory).

Terminology

We need better terminology for all this.  I am not going to propose better terminology, so this is a shaggy dog story.

Math ed people talk about a particular concept image of a concept as well as the total schema of the concept.

In categorical logic, we talk about the sketch or presentation of the concept vs. the theory. The theory is a category (of the kind appropriate to the doctrine) that contains all the possible constructions and commutative diagrams that follow from the presentation.

In this post I have used “total concept” to refer to the schema or theory.  I have referred the particular things as  “representations” (for example construct the image of a homomorphism by cosets or by values of the homomorphism).

“Representation” does not have the same connotations as “presentation”.  Indeed a presentation of a group and a representation of a group are mathematically  two different things.  But I suspect they are two different aspects of the same idea.

All this needs to be untangled.  Maybe we should come up with two completely arbitrary words, like “dostak” and “dosh”.