Category Archives: exposition

Idempotents by sketches and forms

 
This post provides a detailed description of an example of a mathematical structure presented as a sketch and as a form.  It is a supplement to my article An Introduction to forms.  Most of the constructions I mention here are given in more detail in that article.
 
It helps in reading this post to be familiar with the basic ideas of category theory, including commutative diagrams and limit cones, and with the concepts of logical theory and model in logic.
 

Sketches and forms

A sketch of a mathematical structure is a collection of objects and arrows that make up a digraph (directed graph), together with some specified cones, cocones and diagrams in the digraph.  A model of the sketch is a digraph morphism from the digraph to some category that takes the cones to limit cones, the cocones to colimit cocones, and the diagrams to commutative diagrams.  A morphism of models of a sketch from one model to another in the same category is a natural transformation.  Sketches can be used to define all kinds of algebraic structures in the sense of universal algebra, and many other types of structures (including many types of categories).

There are many structures that sketches cannot sketch.  Forms were first defined in [4].  They can define anything a sketch can define and lots of other things.  [5] gives a leisurely description of forms suitable for people who have a little bit of knowledge of categories and [1] gives a more thorough description.  

An idempotent is a very simple kind of algebraic structure.  Here I will describe both a sketch and a form for idempotents. In another post I will do the same for binops (magmas).

Idempotent

An idempotent is a unary operation $u$ for which $u^2=u$.

  • In a category whose morphisms are set functions, a function $u:S\to S$ is an idempotent if $u(u(x))=u(x)$ for all $x$ in $S$.  
  • Any identity arrow in any category is an idempotent.
  • A nontrivial example is the function $u(x,y):=(x,0)$ on the real plane.  
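
The defining equation is easy to test numerically.  The small Python illustration below assumes that maps on the plane are represented as functions on pairs; the helper name `is_idempotent_on` is made up for this post.  It only samples a few points, so it can refute idempotence but cannot prove it.

```python
def is_idempotent_on(u, points):
    """Return True if u(u(p)) == u(p) for every sample point p."""
    return all(u(u(p)) == u(p) for p in points)

def project(p):
    """Projection onto the x-axis: (x, y) -> (x, 0)."""
    x, _ = p
    return (x, 0.0)

def reflect_project(p):
    """(x, y) -> (-x, 0): applying it twice gives (x, 0), not (-x, 0)."""
    x, _ = p
    return (-x, 0.0)

samples = [(0.0, 0.0), (1.0, 2.0), (-3.5, 4.0)]
print(is_idempotent_on(project, samples))          # True
print(is_idempotent_on(reflect_project, samples))  # False
```

The projection passes, while the map that also flips the sign of $x$ fails, because its square is the projection rather than itself.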

Any idempotent $u$ makes the following diagram commute

and that diagram can be taken as the definition of idempotent in any category.

The diagram is in green.  In this post (and in [5]) diagrams in the category of models of a sketch or a form are shown in green.

A sketch for idempotents

The sketch for idempotents contains a digraph with one object and one arrow from that object to itself (above left) and one diagram (above right).  It has no cones or cocones.  So this is an almost trivial example.  When being expository (well, I can hardly say "when you are exposing") your first example should not be trivial, but it should be easy.  Let's call the sketch $\mathcal{S}$.

  • The diagram looks the same as the green diagram above.  It is in black, because I am showing things in syntax (things in sketches and forms) in black and semantics (things in categories of models) in green.
  • The green diagram is a commutative diagram in some category (unspecified).  
  • The black diagram is a diagram in a digraph. It doesn't make sense to say it is commutative because digraphs don't have composition of arrows.
  • Each sketch has a specific digraph and lists of specific diagrams, cones and cocones.  The left digraph above is not in the list of diagrams of $\mathcal{S}$ (see below).

The definition of sketch says that every diagram in the official list of diagrams of a given sketch must become a commutative diagram in a model.  This use of the word "become" means in this case that a model must be a digraph morphism $M:\mathcal{S}\to\mathcal{C}$ for some category $\mathcal{C}$ for which the diagram below commutes.
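
For models in Set this condition can be checked by brute force.  The Python illustration below assumes that $M(C)$ is the finite set $\{0,\dots,n-1\}$ and stores $M(f)$ as a tuple of values; the function name `idempotents_on` is hypothetical.  It lists every function satisfying $M(f)\circ M(f)=M(f)$, which is to say every model of $\mathcal{S}$ on that set.

```python
from itertools import product

def idempotents_on(n):
    """All functions u: {0..n-1} -> {0..n-1} with u(u(x)) == u(x), as tuples."""
    carrier = range(n)
    return [u for u in product(carrier, repeat=n)
            if all(u[u[x]] == u[x] for x in carrier)]

counts = [len(idempotents_on(n)) for n in range(5)]
print(counts)  # [1, 1, 3, 10, 41]
```

On a two-element set, for example, three of the four functions are models; the one that swaps the two elements is not.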

This sketch generates a category called the Theory ("Cattheory" in [5]) of the sketch $\mathcal{S}$, denoted by $\text{Th}(\mathcal{S})$.  It is roughly the "smallest" category containing $f$ and $C$ for which the diagrams in $\mathcal{S}$ are commutative.  
 
This theory contains the generic model $G:\mathcal{S}\to \text{Th}(\mathcal{S})$ that takes $f$ and $C$ to themselves.
  • $G$ is "generic" because anything you prove about $G$ is true of every model of $\mathcal{S}$ in any category.
  • In particular, in the category $\text{Th}(\mathcal{S})$, $G(f)\circ G(f)=G(f)$.  
  • $G$ is a universal morphism in the sense of category theory: It lifts any model $M:\mathcal{S}\to\mathcal{C}$ to a unique functor $\bar{M}:\text{Th}(\mathcal{S})\to\mathcal{C}$ satisfying $\bar{M}\circ G=M$, which can therefore be regarded as the same model.  See Note [2].

Since models are functors, morphisms between models are natural transformations.  This gives what you would normally call homomorphisms for models of almost any sketchable structure.  In [2] you can find a sketch for groups, and indeed the natural transformations between models are group homomorphisms.
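
For the idempotent sketch in Set, a natural transformation between two models $(S,u)$ and $(T,v)$ has a single component $h:S\to T$, and naturality says $h\circ u=v\circ h$.  The Python illustration below checks that square for functions stored as dictionaries; the names `is_model_morphism`, `h_good` and `h_bad` are invented for the example.

```python
def is_model_morphism(h, u, v):
    """Check the naturality square h(u(x)) == v(h(x)) for all x in the source set.

    u is an idempotent on S and v an idempotent on T, both stored as dicts;
    h is a function S -> T, also stored as a dict.
    """
    return all(h[u[x]] == v[h[x]] for x in u)

# Model 1: S = {0, 1, 2}; u sends 1 to 0 and fixes 0 and 2.
u = {0: 0, 1: 0, 2: 2}
# Model 2: T = {"a", "b"}; v is constant at "a".
v = {"a": "a", "b": "a"}

h_good = {0: "a", 1: "b", 2: "a"}
h_bad = {0: "b", 1: "b", 2: "a"}

print(is_model_morphism(h_good, u, v))  # True
print(is_model_morphism(h_bad, u, v))   # False
```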

Sketching categories

You can sketch categories with a sketch $\mathbf{CatSk}$ containing diagrams and cones, but no cocones.  This is done in detail in [3]. The resulting theory $\text{Th}(\mathbf{CatSk})$ is required to be the least category-with-finite-limits generated by $\mathbf{CatSk}$ with the diagrams becoming commutative diagrams and the cones becoming limit cones.  This theory is the FL-Theory for categories, which I will call ThCat (suppressing mention of FL).  

Doctrines

In general the theory of a particular kind of structure contains a parameter that denotes its doctrine. The sketch $\mathcal{S}$ for idempotents didn't require cones, but you can construct theories $\text{Th}(\mathcal{S})$, $\text{Th} (\text{FP},\mathcal{S})$ and $\text{Th}(\text{FL},\mathcal{S})$ for idempotents (FP means it is a category with finite products).  

In a strong sense, all these theories have the same models, namely idempotents, but the doctrine of the theory allows you to use more mechanisms for proving properties of idempotents.  (The doctrine for $\text{Th}(\mathcal{S})$ provides for equational proofs for unary operations only, a doctrine which has no common name such as FP or FS.)  The paper [1] is devoted to explicating proof in the context of forms, using graphs and diagrams instead of formulas that are strings of symbols.

Describing composable pairs of arrows

The form for any type of structure is constructed using the FL theory for some type of category, for example category with all limits, cartesian closed category, topos, and so on.  The form for idempotents can be constructed in ThCat (no extra structure needed).  The form for reflexive function spaces (for example) needs the FL theory for cartesian closed categories (see [5]).

Such an FL theory must contain objects $\text{ob}$ and $\text{ar}$ that become the set of objects and the set of arrows of the category that a model produces.  (Since FL theories have models in any category with finite limits, I could have said "object of objects" and "object of arrows".  But in this post I will talk about only models in Set.)

ThCat contains an object  $\text{ar}_2$ that represents composable pairs of arrows.  That requires a cone to define it:

This must become a limit cone in a model.

  • I usually show cones in blue. 
  • $\text{dom}$ and $\text{cod}$ give (in a model) the domain and codomain of an arrow.
  • $\text{lfac}$ gives the left factor and $\text{rfac}$ gives the right factor. It is usually useful to give suggestive names to some of the projections in situations like this, since they will be used elsewhere (where they will be black!).
  • The objects and arrows in the diagram (including $\text{ar}_2$) are already members of the FL theory for categories.
  • This diagram is annotated in green with sample names of objects and arrows that might exist in a model.  Atish and I introduced that annotation system in [1] to help you chase the diagram and think about what it means.

This cone is a graph-based description of the object of composable arrows in a category (as opposed to a linguistic or string-based description).
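
In a model in Set the limit can be computed directly.  The Python illustration below uses a made-up three-arrow category and assumes the convention that a pair (left, right) is composable when the domain of the left factor equals the codomain of the right factor, so that the composite is "left after right".

```python
from itertools import product

# A tiny category given by its arrows: name -> (domain, codomain).
arrows = {"id_A": ("A", "A"), "id_B": ("B", "B"), "f": ("A", "B")}

def dom(a):
    return arrows[a][0]

def cod(a):
    return arrows[a][1]

# Pullback of dom and cod: pairs (left, right) with dom(left) == cod(right).
ar2 = [(left, right) for left, right in product(arrows, repeat=2)
       if dom(left) == cod(right)]

def lfac(pair):
    """Projection to the left factor."""
    return pair[0]

def rfac(pair):
    """Projection to the right factor."""
    return pair[1]

print(sorted(ar2))
```

Of the nine pairs of arrows only four survive; the pair ("f", "id_B") is rejected, for instance, because the domain of f is A while the codomain of id_B is B.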

Describing endomorphisms

Now an idempotent must be an endomorphism, so we provide a cone describing the object of endomorphisms in a category. This cone already exists in the FL theory for categories.

  • $\text{loop}$ is a monomorphism (in fact a regular mono because it is the mono produced by an equalizer) so it is not unreasonable to give the element annotation for $\text{endo}$ and $\text{ar}$ the same name.
  • "$\text{dc}$" takes $f$ to its domain and codomain. 
  • $\text{loop}$ and "$\text{dc}$" were not created when I produced the cone above.  They were already in the FL theory for categories.
 
Since the cone defining $\text{ar}_2$ is a limit cone (in the Theory, not in a model), if you have any other commutative cone (purple) over the same diagram, a unique arrow (red) $\text{diag}$ from its vertex to $\text{ar}_2$ is automatically present, as shown below:

This particular purple cone is the limit cone defining $\text{endo}$ given above.  Now $\text{diag}$ is a specific arrow in the FL theory for categories. In a model of the theory (which is a category in Set or in some other category), $\text{diag}$ takes an endomorphism $f$ to the composable pair $(f,f)$.
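
Here is how $\text{endo}$, $\text{loop}$, "$\text{dc}$" and $\text{diag}$ might look in a Set model.  The four-arrow category is invented for the example, and reading "$\text{dc}$" as the map that returns the common domain and codomain of an endomorphism is my assumption about the missing picture.

```python
# Arrows of a small category: name -> (domain, codomain).
arrows = {"id_A": ("A", "A"), "e": ("A", "A"), "f": ("A", "B"), "id_B": ("B", "B")}

# endo: the arrows whose domain equals their codomain (an equalizer of dom and cod).
endo = [a for a, (d, c) in arrows.items() if d == c]

def loop(a):
    """Inclusion of endo into ar; it is injective, so the name is kept."""
    return a

def dc(a):
    """The common domain and codomain of an endomorphism."""
    d, c = arrows[a]
    assert d == c, "dc is only defined on endomorphisms"
    return d

def diag(a):
    """Send an endomorphism a to the composable pair (a, a)."""
    return (loop(a), loop(a))

print(endo)       # ['id_A', 'e', 'id_B']
print(diag("e"))  # ('e', 'e')
```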

The object of idempotents

Now using these arrows we can define the object $\text{idm}$ of idempotents using the diagram below. See Note [3].

 

 

 

 

 

$\text{idm}$ is an object in ThCat.  In any model of ThCat in Set, in other words in any small category, $\text{idm}$ becomes the set of idempotent arrows in that category.
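
To see what this means in one concrete model, the Python illustration below takes the one-object category whose arrows are all functions from $\{0,1\}$ to itself.  The name `comp` for the composition arrow is an assumption (the post does not name it), and with a single object every arrow is an endomorphism.

```python
from itertools import product

# One-object category: arrows are all functions {0,1} -> {0,1}, stored as
# tuples (u[0], u[1]). With a single object, every arrow is an endomorphism.
ar = list(product(range(2), repeat=2))
endo = ar

def comp(pair):
    """Compose a composable pair (left, right) to 'left after right'."""
    left, right = pair
    return tuple(left[right[x]] for x in range(2))

def diag(a):
    """Send an endomorphism a to the composable pair (a, a)."""
    return (a, a)

# idm: the endomorphisms a for which comp(diag(a)) equals a.
idm = [a for a in endo if comp(diag(a)) == a]
print(idm)  # [(0, 0), (0, 1), (1, 1)]
```

The two constant functions and the identity are idempotent; the swap (1, 0) is not, since composing it with itself gives the identity.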

In the terminology of [5], the object idm is the form for idempotents, and the cone it is the limit of is the description of idempotent.  

Now take ThCat and adjoin an arrow $g:1\to\text{idm}$.  You get a new FL category I will call the FL-theory of the form for idempotents.  A model of the theory of the form in Set is a category with a specified idempotent. A particular example of a model of the form idm in the category of real vector spaces is the map $u(x,y):=(x,0)$ of the (set of points of) the real plane to itself (it is an idempotent endomorphism of $\textbf{R}^2$).  

This example is typical of forms and their models, except in one way:  Idempotents are also sketchable, as I described above.  Many mathematical structures can be perceived as models of forms, but not models of sketches, such as reflexive function spaces as in [5].

Notes

[1] The diagrams shown in this post were drawn in Mathematica.  The code for them is shown in the notebook SketchFormExamples.nb .  I am in the early stages of developing a package for drawing categorical diagrams in Mathematica, so this notebook shows the diagrams defined in very primitive machine-code-like Mathematica.  The package will not rival xypic for TeX any time soon.  I am doing it so I can produce diagrams (including 3D diagrams) you can manipulate.

[2] In practice I would refer to the names of the objects and arrows in the sketch rather than using the M notation:  I might write $f\circ f=f$ instead of $M(f)\circ M(f)=M(f)$ for example.  Of course this confuses syntax with semantics, which sounds like a Grievous Sin, but it is similar to what we do all the time in writing math:  "In a semigroup, $x$ is an idempotent if $xx=x$."  We use the same notation for the binary operation for any semigroup and we use $x$ as an arbitrary element of most anything.  Actually, if I write $f\circ f=f$ I can claim I am talking in the generic model, since any statement true in the generic model is true in any model.  So there.

[3] In the Mathematica notebook SketchFormExamples.nb in which I drew these diagrams, this diagram is plotted in Euclidean 3-space and can be viewed from different viewpoints by running your cursor over it.

References

[1] Atish Bagchi and Charles Wells, Graph-Based Logic and Sketches, draft, September 2008, on ArXiv.

[2] Michael Barr and Charles Wells, Category Theory for Computing Science (1999). Les Publications CRM, Montreal (publication PM023).

[3] Michael Barr and Charles Wells, Toposes, Triples and Theories (2005). Reprints in Theory and Applications of Categories 1.

[4] Charles Wells, A generalization of the concept of sketch, Theoretical Computer Science 70, 1990

[5] Charles Wells, An Introduction to forms.

 

 

 


Whole numbers

Sue Van Hattum wrote in response to a recent post:

I’d like to know what you think of my ‘abuse of terminology’. I teach at a community college, and I sometimes use incorrect terms (and tell the students I’m doing so), because they feel more aligned with common sense.

To me, and to most students, the phrase “whole numbers” sounds like it refers to anything that doesn’t need fractions to represent it, and should include negative numbers. (It then, of course, would mean the same thing that the word integers does.) So I try to avoid the phrase, mostly. But I sometimes say we’ll use it with the common sense meaning, not the official math meaning.

Her comments brought up a couple of things I want to blather about.

Official meaning

There is no such thing as an "official math meaning".  Mathematical notation has no governing authority and research mathematicians are too ornery to go along with one anyway.  There is a good reason for that attitude:  Mathematical research constantly causes us to rethink the relationship among different mathematical ideas, which can make us want to use names that show our new view of the ideas.  An excellent example of that is the evolution of the concept of "function" over the past 150 years, traced in the Wikipedia article.

What some "authorities" say about "whole number":

  • MathWorld  says that "whole number" is used to mean any of these:  Any positive integer, any nonnegative integer or any integer.
  • Wikipedia also allows all three meanings.
  • Webster's New World dictionary (for which I have been a consultant, but they didn't ask me about whole numbers!) gives "any integer" as a second meaning.
  • American Heritage Dictionary gives "any integer" as the only meaning.
  • Someone stole my copy of Merriam Webster.

Common Sense Meaning

Mathematicians think about and talk about any particular kind of math object using images and metaphors.  Sometimes (not very often) the name they give to a math object embodies a metaphor.  Examples:

  • A complex number is usually notated using two real parameters, so it looks more complicated than a real number.
  • "Rings" were originally called that because the first examples were integers (mod n) for some positive integer, and you can think of them as going around a clock showing n hours.

Unfortunately, much of the time the name of a kind of object contains a suggestive metaphor that is bad,  meaning that it suggests an erroneous picture or idea of what the object is like.

  • A "group" ought to be a bunch of things.  In other words, the word ought to mean "set".
  • The word "line" suggests that it ought to be a row of points.  That suggests that each point on a line ought to have one next to it.  But that's not true on the "real line"!

Sue's idea that the "common sense" meaning of "whole number" is "integer" refers, I think, to the built-in metaphor of the phrase "whole number" (unbroken number).

I urge math teachers to do these things:

  • Explain to your students that the same math word or phrase can mean different things in different books.
  • Convince your students to avoid being fooled by the common-sense (metaphorical) meaning of a mathematical phrase.

 


Mathematical usage

Comments about mathematical usage, extending those in my post on abuse of notation.

Geoffrey Pullum, in his post Dogma vs. Evidence: Singular They, makes some good points about usage that I want to write about in connection with mathematical usage.  There are two different attitudes toward language usage abroad in the English-speaking world. (See Note [1])

  • What matters is what people actually write and say.   Usage in this sense may often be reported with reference to particular dialects or registers, but in any case it is based on evidence, for example citations of quotations or a linguistic corpus.  (Note [2].)  This approach is scientific.
  • What matters is what a particular writer (of usage or style books) believes about  standards for speaking or writing English.  Pullum calls this "faith-based grammar".  (People who think in this way often use the word "grammar" for usage.)  This approach is unscientific.

People who write about mathematical usage fluctuate between these two camps.

My writings in the Handbook of Mathematical Discourse and abstractmath.org are mostly evidence based, with some comments here and there deprecating certain usages because they are confusing to students.  I think that is about the right approach.  Students need to know what is actual mathematical usage, even usage that many mathematicians deprecate.

Most math usage that is deprecated (by me and others) is deprecated for a reason.  This reason should be explained, and that is enough to stop it being faith-based.  To make it really scientific you ought to cite evidence that students have been confused by the usage.  Math education people have done some work of this sort.  Most of it is at the K-12 level, but some have worked with college students, observing the way they solve problems or how they understand some concepts, and this work often cites examples.

Examples of usage to be deprecated

 

Powers of functions

$f^n(x)$ can mean either iterated composition or multiplication of the values.  For example, $f^2(x)$ can mean $f(x)f(x)$ or $f(f(x))$.  This is exacerbated by the fact that in undergrad calculus texts, $\sin^{-1}x$ refers to the arcsine, and $\sin^2 x$ refers to $\sin x\sin x$.  This causes innumerable students trouble.  It is a Big Deal.
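
A few lines of Python make the two readings concrete.  The helper names `compose_power` and `value_power` are invented for this illustration; only the `math` functions are standard.

```python
import math

def compose_power(f, n):
    """n-fold composition: x -> f(f(...f(x)...))."""
    def g(x):
        for _ in range(n):
            x = f(x)
        return x
    return g

def value_power(f, n):
    """Pointwise power: x -> f(x) ** n."""
    return lambda x: f(x) ** n

x = 0.5
print(compose_power(math.sin, 2)(x))  # sin(sin(0.5))
print(value_power(math.sin, 2)(x))    # sin(0.5) ** 2
print(math.asin(x), 1 / math.sin(x))  # arcsine versus reciprocal of the sine
```

The two readings of the exponent 2 give different numbers, and so do the two readings of the exponent -1, so the notation alone never settles which one is meant.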

In

Set "in" another set.  This is discussed in the Handbook.  My impression is that for students the problem is that they confuse "element of" with "subset of", and the fact that "in" is used for both meanings is not the primary culprit.  That's because most sets in practice don't have both sets and non-sets as elements.  So the problem is a Big Deal when students first meet with the concept of set, but the notational confusion with "in" is only a Small Deal.
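
Python happens to keep the two relations apart (`in` for element, `<=` for subset), which makes it easy to build one of the rare sets where both hold for the same thing.  This is only an illustration; `frozenset` is used because Python sets cannot contain ordinary mutable sets.

```python
one = frozenset({1})          # the set {1}
S = {1, 2, one}               # S = {1, 2, {1}}

print(1 in S)                 # True: 1 is an element of S
print(one in S)               # True: {1} is an element of S
print(one <= S)               # True: {1} is also a subset of S
print(frozenset({2}) in S)    # False: {2} is not an element of S
print(frozenset({2}) <= S)    # True: but {2} is a subset of S
```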

Two

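This refers to the quirk that when mathematicians speak of "two" things, the two things are allowed to be the same.  It is not a Big Deal.  But I have personally witnessed students (in upper level undergrad courses) who were confused by this.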

Parentheses

The many uses of parentheses, discussed in abstractmath.  (The Handbook article on parentheses gives citations, including one in which the notation "$(a,b)$" means open interval once and GCD once in the same sentence!)  I think the only part that is a Big Deal, or maybe Medium Deal, is the fact that the value of a function $f$ at an input $x$ can be written either as "$f\,x$" or as "$f(x)$".  In fact, we do without the parentheses when the name of the function is a convention, as in $\sin x$ or $\log x$, and with the parentheses when it is a variable symbol, as in "$f(x)$".  (But a substantial minority of mathematicians use $f\,x$ in the latter case.  Not to mention $xf$.)  This causes some beginning calculus students to think "$\sin x$" means "sin" times $x$.

More

The examples given above are only a sampling of troubles caused by mathematical notation.   Many others are mentioned in the Handbook and in Abstractmath, but they are scattered.  I welcome suggestions for other examples, particularly at the college and graduate level. Abstractmath will probably have a separate article listing the examples someday…

Notes

[1] The situation Pullum describes for English is probably different in languages such as Spanish, German and French, which have Academies that dictate usage for the language.  On the other hand, from what I know about them most speakers of those languages ignore their dictates.

[2] Actually, they may use more than one corpus, but I didn't want to write "corpuses" or "corpora" because either way I would get sharp comments from faith-based usage people.

References on mathematical usage

Bagchi, A. and C. Wells (1997), Communicating Logical Reasoning.

Bagchi, A. and C. Wells (1998)  Varieties of Mathematical Prose.

Bullock, J. O. (1994), ‘Literacy in the language of mathematics’. American Mathematical Monthly, volume 101, pages 735-743.

de Bruijn, N. G. (1994), ‘The mathematical vernacular, a language for mathematics with typed sets’. In Selected Papers on Automath, Nederpelt, R. P., J. H. Geuvers, and R. C. de Vrijer, editors, volume 133 of Studies in Logic and the Foundations of Mathematics, pages 865-935.  

Epp, S. S. (1999), ‘The language of quantification in mathematics instruction’. In Developing Mathematical Reasoning in Grades K-12. Stiff, L. V., editor (1999),  NCTM Publications.  Pages 188-197.

Gillman, L. (1987), Writing Mathematics Well. Mathematical Association of America

Higham, N. J. (1993), Handbook of Writing for the Mathematical Sciences. Society for Industrial and Applied Mathematics.

Knuth, D. E., T. Larrabee, and P. M. Roberts (1989), Mathematical Writing, volume 14 of MAA Notes. Mathematical Association of America.

Krantz, S. G. (1997), A Primer of Mathematical Writing. American Mathematical Society.

O'Halloran, K. L.  (2005), Mathematical Discourse: Language, Symbolism And Visual Images.  Continuum International Publishing Group.

Pimm, D. (1987), Speaking Mathematically: Communications in Mathematics Classrooms.  Routledge & Kegan Paul.

Schweiger, F. (1994b), ‘Mathematics is a language’. In Selected Lectures from the 7th International Congress on Mathematical Education, Robitaille, D. F., D. H. Wheeler, and C. Kieran, editors. Sainte-Foy: Presses de l’Université Laval.

Steenrod, N. E., P. R. Halmos, M. M. Schiffer, and J. A. Dieudonné (1975), How to Write Mathematics. American Mathematical Society.

Wells, C. (1995), Communicating Mathematics: Useful Ideas from Computer Science.

Wells, C. (2003), Handbook of Mathematical Discourse

Wells, C. (ongoing), Abstractmath.org.


Two

The post Are these questions unambiguous? in the blog Explaining Mathematics concerns the funny way mathematicians use the number “two” (Note [3]).  This is discussed in Abstractmath.org, based on usage quotations (see Note [1]) in the Handbook of Mathematical Discourse. They are citations  54, 119, 220, 229, 260, 322, 323 and 338.  The list is in the online version of the Handbook (see Note [2]) which takes forever to load.  (There is a separate file for users of the paperback book but it is currently trashed.)

The usage quirk concerning “two” is exemplified by statements such as these:

  1. The sum of any two even integers is even.
  2. Courant gives Leibniz’ rule for finding the Nth derivative of the product of two functions.  (This is from Citation 323.)
  3. Are there two positive integers m and n, both greater than 1, satisfying mn=9? (This is from Explaining Mathematics.)

Statements 1 and 2 are of course true.  They are still true if the “two” things are the same.  Mathematicians generally assume that such a statement includes the case where the two things are the same.  If the case that they are the same is excluded, the statement becomes an unnecessarily weak assertion.

Statement 3, in my opinion, is badly written.  If the two positive integers have to be distinct, the answer is “no”.   I think any competent mathematical writer would write something like, “There are not two distinct integers m and n both greater than 1 for which mn = 9”.

It is fair to say that when mathematicians refer to “two integers” in statements like these, they are allowed to be the same.  If they can’t be the same for the sentence to remain true, they will (or at least should) insert a word such as “distinct”.
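
The two readings of Statement 3 can be compared by brute force.  The Python illustration below is only a check of that one example; the function name `pairs_with_product` and its `distinct` flag are made up here.

```python
def pairs_with_product(target, distinct):
    """Pairs (m, n) of integers greater than 1 with m * n == target."""
    return [(m, n)
            for m in range(2, target + 1)
            for n in range(2, target + 1)
            if m * n == target and (m != n or not distinct)]

print(pairs_with_product(9, distinct=False))  # [(3, 3)]
print(pairs_with_product(9, distinct=True))   # []
```

So the answer to Statement 3 is “yes” if the two integers may coincide and “no” if they must be distinct.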

Of course, in some sentences the two integers can’t be the same because of some condition imposed in the context.  That doesn’t happen in the citations I have listed.  Maybe someone can contribute an example.

Notes

[1] In the Handbook, usage quotations are called “citations”.  It appears to me that the commonest name for citations among lexicographers is “usage quotations”, so I will start calling them that.

[2] I created the online version of the Handbook hastily in 2006.  It needs work, since it has TeX mistakes (which may irritate you but should not interfere with readability) and omits the quotations, illustrations, and some backlinks, including backlinks for the citations.  Some Day When I Get A Round Tuit…

[3] This funny property of “two” was discussed many years ago by Steenrod or Knuth or someone, and is mentioned in a paper by Susanna Epp, but I don’t currently have access to any of the references.

 


Abuse of notation

I have recently read the Wikipedia article on Abuse of Notation (this link is to the version of 29 December 2011, since I will eventually edit it).  The Handbook of Mathematical Discourse and abstractmath.org mention this idea briefly.  It is time to expand the abstractmath article and to redo parts of the Wikipedia article, which  contains some confusions.

This is a preliminary draft, part of which I’ll incorporate into abstractmath after you readers make insightful comments :).

The phrase “Abuse of Notation” is used in articles and books written by research mathematicians.  It is part of Mathematical English.  This post is about

  • What “abuse of notation” means in mathematical writing and conversation.
  • What it could be used to mean.
  • Mathematical usage in general.  I will discuss this point in the context of the particular phrase “abuse of notation”, not a bad way to talk about a subject.

Mathematical Usage

Sources

If I’m going to write about the usage of Mathematical English, I should ideally verify what I claim about the usage by finding citations for a claim: documented quotations that illustrate the usage.  This is the standard way to produce any dictionary.

There is no complete authoritative source for usage of words and phrases in Mathematical English (ME), or for that matter for usage in the Symbolic Language (SL).

  • The Oxford Concise Dictionary of Mathematics [2] covers technical terms and symbols used in school math and in much of undergraduate math, but not so much of research math.  It does not mention being based on citations and it hardly talks about usage at all, even for notorious student-confusing notations such as “$\sin^k x$”. But it appears quite accurate with good explanations of the math it covers.
  • I wrote Handbook of Mathematical Discourse to stimulate investigations into mathematical usage.  It describes a good many usages in Mathematical English and the Symbolic Language, documented with citations of quotations, but is quite incomplete (as I said in its Introduction).  The Handbook has 428 citations for various usages.  (They are at the end of the on-line PDF version. They are not in the printed book, but are on the web with links to pages in the printed book.)
  • MathWorld has an extensive list of mathematical words, phrases and symbols, and accurate definitions or descriptions of them, even for a great many advanced research topics. It also frequently mentions usage (see formula and inverse sine), but does not give citations.
  • Wikipedia has the most complete set of definitions of mathematical objects that I know of.  The entries sometimes mention usage. I have not detected any entry that gives citations for usage.  Not that that should stop anyone from adding them.

Teaching mathematical usage

In explaining mathematical usage to students, particularly college-level or higher math students, you have choices:

  1. Tell them what you think the usage of a word, phrase, or symbol is, without researching citations.
  2. Tell them what you think the usage ought to be.
  3. Tell them what you think the usage is, supported by citations.

(1) has the problem that you can be wrong.  In fact when I worked on the Handbook I was amazed  at how wrong I could be in what the usage was, in spite of the fact that I had been thinking about usage in ME and SL since I first started teaching (and kept a folder of what I had noticed about various usages).  However,  professional mathematicians generally have a reasonably accurate idea about usage for most things, particularly in their field and in undergraduate courses.

(2) is dangerous.  Far too many mathematicians (but nevertheless a minority) introduce usage in articles and lecturing that is not common or that they invented themselves. As a result their students will be confused in trying to read other sources and may argue with other teachers about what is “correct”.  It is a gross violation of teaching ethics to tell the students that (for example) “x > 0” allows x = 0 and not mention to them that nearly all written mathematics does not allow that.  (Did you know that a small percentage of mathematicians and educators do use that meaning, including in some secondary institutions in some countries?  It is partly Bourbaki’s fault.)

(3) You often can’t tell them what the usage is, supported by citations, because, as mentioned above, documented mathematical usage is sparse.

I think people should usually choose (1) instead of (2).  If they do want to introduce a new usage or notation because it is “more logical” or because “my thesis advisor used it” or something, they should reconsider.  Most such attempts have failed, and thousands of students have been confused by the attempts.

Abuse of notation

“Abuse of notation” is a phrase used in mathematical writing to describe terminology and notation that does not have transparent meaning. (Transparent meaning is described in some detail under “compositional” in the Handbook.)

Abuse of notation was originally defined in French, where the word “abus” does not carry the same strongly negative connotation that it does in English.

Suppression of parameters

One widely noticed practice called “abuse of notation” is the use of the name of the underlying set of a mathematical structure to refer to a structure. For example, a group is a structure $(G,*)$ where $G$ is a set and $*$ is a binary operation with certain properties. The most common way to refer to this structure is simply to call it $G$. Since any set of cardinality greater than 1 has more than one group structure on it, this does not include all the information needed to determine the group. This type of usage is cited in 82 below.  It is an example of suppression of parameters.
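
The smallest case shows what is being suppressed.  The Python illustration below puts two different group operations on the same two-element set; the checker `is_group` and the names `star1` and `star2` are invented for the example.

```python
from itertools import product

def is_group(elements, op):
    """Check the group axioms for a finite operation table op[(a, b)]."""
    if not all(op[a, b] in elements for a, b in product(elements, repeat=2)):
        return False  # not closed
    if not all(op[op[a, b], c] == op[a, op[b, c]]
               for a, b, c in product(elements, repeat=3)):
        return False  # not associative
    identities = [e for e in elements
                  if all(op[e, a] == a == op[a, e] for a in elements)]
    if not identities:
        return False
    e = identities[0]
    return all(any(op[a, b] == e == op[b, a] for b in elements) for a in elements)

G = [0, 1]
star1 = {(a, b): (a + b) % 2 for a, b in product(G, repeat=2)}      # identity is 0
star2 = {(a, b): (a + b + 1) % 2 for a, b in product(G, repeat=2)}  # identity is 1

print(is_group(G, star1), is_group(G, star2), star1 == star2)  # True True False
```

Both tables make the set into a group, but with different identity elements, so naming the set alone does not say which group is meant.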

Writing “$\log x$” without mentioning the base of the logarithm is also an example of suppression of parameters.  I think most mathematicians would regard this as a convention rather than as an abuse of notation.  But I have no citations for this (although they would probably be easy to find).  I doubt that it is possible to find a rational distinction between “abuse of notation” and “convention”; it is all a matter of what people are used to saying.

Synecdoche

The naming of a structure by using the name of its underlying set is also an example of synecdoche, the naming of a whole by a part (for example, “wheels” to mean a car).

Another type of synecdoche that has been called abuse of notation is referring to an equivalence class by naming one of its elements.  I do not have a good quotation-citation that shows this use.  Sometimes people write 2 + 4 = 1 when they are working in the Galois field with 5 elements.  But that can be interpreted in more than one way.  If GF[5] consists of equivalence classes of integers (mod 5) then they are indeed using 2 (for example) to stand for the equivalence class of 2.  But they could instead define GF[5] in the obvious way with underlying set {0,1,2,3,4}.  In any case, making distinctions of that sort is pedantic, since the two structures are related by a natural isomorphism (next paragraph!)
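
The first reading, where 2 stands for its equivalence class, works only because the sum does not depend on which representatives you pick.  The Python illustration below checks that for a handful of representatives; the helper `same_class` is a made-up name.

```python
MOD = 5

def same_class(a, b):
    """True if a and b represent the same equivalence class of integers mod 5."""
    return (a - b) % MOD == 0

# "2 + 4 = 1" read as a statement about classes: [2] + [4] = [1].
print(same_class(2 + 4, 1))  # True

# The sum does not depend on the representatives chosen for [2] and [4].
reps_of_2 = [2, 7, -3, 12]
reps_of_4 = [4, 9, -1, 104]
print(all(same_class(a + b, 1) for a in reps_of_2 for b in reps_of_4))  # True
```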

Identifying objects via isomorphism

This is quite commonly called “abuse of notation” and is exemplified in citations 209, 395 and AB3.

Overloaded notation

John Harrison, in [1], uses “abuse of notation” to describe the use of a function symbol to apply to both an element of its domain and a subset of the domain.  This is an example of overloaded notation.  I have not found another citation for this usage other than Harrison and I don’t remember anyone using it.  Another example of overloaded notation is the use of the same symbol “$\times$” for multiplication of numbers, matrices and 3-vectors.  I have never heard that called abuse of notation.  But I have no authority to say anything about this usage because I haven’t made the requisite thorough search of the literature.
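
In a programming language the overloading Harrison describes would usually be avoided by giving the two operations different names.  The Python illustration below does that with a hypothetical helper `image`; the function `f` is just a sample.

```python
def f(x):
    """A sample function on integers."""
    return x * x

def image(func, subset):
    """The image {func(x) : x in subset}, which is often also written f(A)."""
    return {func(x) for x in subset}

A = {-2, -1, 0, 1, 2}
print(f(2))         # 4: f applied to an element
print(image(f, A))  # the set {0, 1, 4}: "f" applied to a subset
```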

Powers of functions

The Wikipedia Article on abuse of notation (29 Dec 2011 version) mentions the fact that f^2(x) can mean either f(x)f(x) or f(f(x)).   I have never heard this called abuse of notation and I don’t think it should be called that.  The notation “f^2(x)” can in ordinary usage mean one of two things and the author or teacher should say which one they mean.  Many math phrases or symbolic expressions  can mean more than one thing and the author generally should say which.  I don’t see the point of calling this phenomenon abuse of notation.
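The two meanings are easy to separate once they are written as operations on functions. Here is a minimal Python sketch; the helper names `pointwise_square` and `compose_twice` are invented for the illustration.

```python
import math


def pointwise_square(f):
    """f^2 read as the product: x -> f(x) * f(x)."""
    return lambda x: f(x) * f(x)


def compose_twice(f):
    """f^2 read as the composite: x -> f(f(x))."""
    return lambda x: f(f(x))


x = 1.0
print(pointwise_square(math.sin)(x))  # sin(1) squared
print(compose_twice(math.sin)(x))     # sin(sin(1))
```

The two results differ, so a reader who is not told which meaning is intended cannot recover it from the notation alone.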

Radial concept

The Wikipedia article mentions phrases such as “partial function”.  This article does provide a citation for Bourbaki for calling a sentence such as “Let f:A\to B be a partial function” abuse of notation.  Bourbaki is wrong in a deep sense (as the article implies).  There are several points to make about this:

  • Some authors, particularly in logic, define a function to be what most of us call a partial function.  Some authors  require a ring to have a unit and others don’t.  So what?
  • The phrase “partial function” has a standard meaning in math:  Roughly “it is a function except it is defined on only part of its domain”.  Precisely, f:A\to B is a partial function if it is a function f:A'\to B for some subset A' of A.
  • A partial function is not in general a function.  A stepmother is not a mother.  A left identity may not be an identity, but the phrase “left identity” is defined precisely.   An incomplete proof is not a proof, but you know what the phrase means! (Compare “expectant mother”).   This is the way we normally talk and think.  See the article “radial concept” in the Handbook.
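The precise definition in the second bullet can be sketched in Python with a dictionary that is defined on only some elements of A. The particular sets and the name `apply_partial` are assumptions made up for the example.

```python
from typing import Dict, Optional

A = {1, 2, 3, 4}
B = {"a", "b"}

# f is defined only on the subset A' = {1, 3} of A.
f: Dict[int, str] = {1: "a", 3: "b"}


def apply_partial(x: int) -> Optional[str]:
    """Return f(x) if x lies in the domain of definition A', else None."""
    return f.get(x)


domain_of_definition = set(f)
print(domain_of_definition <= A)  # True: A' is a subset of A
print(apply_partial(3))           # b
print(apply_partial(2))           # None: f is undefined at 2
```

So f is a perfectly good function from A' to B, and it is a function from A to B only in the special case A' = A.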

Other uses

AB4 involves a redefinition of  “\in” in a special case.  Authors redefine symbols all the time.  This kind of redefinition on the fly probably should be avoided, but since they did it I am glad they mentioned it.

I have not talked about some of the uses mentioned in the Wikipedia article because I don’t yet understand them well enough.  AB1 and AB2 refer to a common use with pullback that I am not sure I understand (in terms of how the author is thinking of it).  I also don’t understand AB5.  Suggestions from readers would be appreciated.

Kill it!

Well, it’s more polite to say, we don’t need the phrase “abuse of notation” and it should be deprecated.

  • The use of the word “abuse” makes it sound like a bad thing, and most instances of abuse of notation are nothing of the sort.  They make mathematical writing much more readable.
  • Nearly everywhere it is used it could just as well be called a convention.  (This requires verification by studying math texts.)

Citations

The first three citations are in the Handbook list; the numbers refer to that list’s numbering. The others I searched out for the purpose of this post.

82. Busenberg, S., D. C. Fisher, and M. Martelli (1989), Minimal periods of discrete and smooth orbits. American Mathematical Monthly, volume 96, pages 5–17. [p. 8. Lines 2–4.]

Therefore, a normed linear space is really a pair (\mathbf{E},\|\cdot\|) where \mathbf{E} is a linear vector space and \|\cdot\|:\mathbf{E}\to(0,\infty) is a norm. In speaking of normed spaces, we will frequently abuse this notation and write \mathbf{E} instead of the pair (\mathbf{E},\|\cdot\|).

209. Hunter, T. J. (1996), On the homology spectral sequence for topological Hochschild homology. Transactions of the American Mathematical Society, volume 348, pages 3941–3953. [p. 3934. Lines 8–6 from bottom.]

We will often abuse notation by omitting mention of the natural isomorphisms making \wedge associative and unital.

395. Teitelbaum, J. T. (1991), ‘The Poisson kernel for Drinfeld modular curves’. Journal of the American Mathematical Society, volume 4, pages 491–511. [p. 494. Lines 1–4.]

\ldots may find a homeomorphism x:E\to \mathbb{P}^1_k such that \displaystyle x(\gamma u) = \frac{ax(u)+b}{cx(u)+d}. We will tend to abuse notation and identify E with \mathbb{P}^1_k by means of the function x.

AB1. Fujita, T. On the structure of polarized manifolds with total deficiency one.  I. J. Math. Soc. Japan, Vol. 32, No. 4, 1980.

Here we show examples of symbols used in this paper \ldots

L_{T}: The pull back of L to a space T by a given morphism T\rightarrow S . However, when there is no danger of confusion, we OFTEN write L instead of L_T by abuse of notation.

AB2. Sternberg, S. Minimal coupling and the symplectic mechanics of a classical particle in the presence of a Yang-Mills field. Physics, Vol. 74, No. 12, pp. 5253-5254, December 1977.

On the other hand, let us, by abuse of notation, continue to write \Omega for the pullback of \Omega from F to P \times F by projection onto the second factor. Thus, we can write \xi_Q\rfloor\Omega = \xi_F\rfloor\Omega and \ldots

AB3. Dobson, D, and Vogel, C. Convergence of an iterative method for total variation denoising. SIAM J. Numer. Anal., Vol. 34, pp. 1779, October, 1997.

Consider the approximation

(3.7) u\approx U\stackrel{\text{def}}{=}\sum_{j=1}^N U_j\phi_j \ldots

In an abuse of notation, U will represent both the coefficient vector \{U_j\}_{j=1}^N and the corresponding linear combination (3.7).

AB4. Lewis, R, and Torczon, V. Pattern search algorithms for bound constrained minimization.  NASA Contractor Report 198306; ICASE Report No. 96-20.

By abuse of notation, if A is a matrix, y\in A means that the vector y is a column of A.

AB5. Allemandi, G, Borowiecz, A. and Francaviglia, M. Accelerated Cosmological Models in Ricci squared Gravity. ArXiv:hep-th/0407090v2, 2008.

This allows to reinterpret both f(S) and f'(S) as functions of \tau in the expressions:
\begin{equation*}\begin{cases}  f(S) = f(F(\tau)) = f(\tau )\\  f'(S) = f'(F(\tau )) = f'(\tau )\end{cases}\end{equation*}
following the abuse of notation f(F(t )) = f(t ) and f'(F(t )) = f'(t ).

References

[1] Harrison, J. Criticism and reconstruction, in Formalized Mathematics (1996).

[2] Clapham, C. and J. Nicholson.  Oxford Concise Dictionary of Mathematics, Fourth Edition (2009).  Oxford University Press.

 


More about defining “category”


In a recent post, I wrote about defining “category” in a way that (I hope) makes it accessible to undergraduate math majors at an early stage.  I have several more things to say about this.

Early intro to categories

The idea is to define a category as a directed graph equipped with an additional structure of composition of paths subject to some axioms.  Giving several small finite examples of categories drawn in that way gives you an understanding of “category” that has several desirable properties:

  • You get the idea of what a category is in one lecture.
  • With the right choice of examples you get several fine points cleared up:
    • The composition is added structure.
    • A loop doesn’t have to be an identity.
    • Associativity is a genuine requirement —  it is not automatic.
  • You get immediate access to what is by far the most common notation used to work with a category — objects (nodes) and arrows.
  • You don’t have to cope with the difficult chunking required when the first examples given are sets-with-structure and structure-preserving functions.  It’s quite hard to focus on a couple of dots on the paper each representing a group or a topological space and arrows each representing a whole function (not the value of the function!).

Introduce more examples

Then the teacher can go on with the examples that motivated categories in the first place: the big deal categories such as sets, groups and topological spaces.   But they can be introduced using special cases so they don’t require much background.

  • Draw some finite sets and functions between them.  (As an exercise, get the students to find some finite sets and functions that make the picture a category with $f=kh$ as the composite and $f\neq g$.)
  • If the students have had calculus,  introduce them to the category whose objects are real finite nonempty intervals with continuous or differentiable mappings between them.  (Later you can prove that this category is a groupoid!)
  • Find all the groups on a two element set and figure out which maps preserve group multiplication.  (You don’t have to use the word “group”: you can simply show both of them, work out which maps preserve multiplication, and discover isomorphism!)  This introduces the idea of the arrows being structure-preserving maps. You can get more complicated and use semigroups as well.  If the students know Mathematica you could even do magmas.  Well, maybe not.

All this sounds like a project you could do with high school students.

Large and small

If all this were just a high school (or intro-to-math-for-math-majors) project you wouldn’t have to talk about large vs. small.  However, I have some ideas about approaching this topic.

In the first place, you can define category, or any other mathematical object that might involve a proper class, using the syntactic approach I described in Just-in-time foundations.  You don’t say “A category consists of a set of objects and a set of arrows such that …”.  Instead you say something like “A category $\mathcal{C}$ has objects $A,\,B,\,C\ldots$ such that…”.

This can be understood as meaning “For any $A$, the statement $A$ is an object of  $\mathcal{C}$ is either true or false”, and so on.

This approach is used in the Wikibook on category theory.  (Note: this is a permanent link to the November 28 version of the section defining categories, which is mostly my work.  As always with Wikimedia things it may be entirely different when you read this.)

If I were dictator of the math world (not the same thing as dictator of MathWorld) I would want definitions written in this syntactic style.  The trouble is that mathematicians are now so used to mathematical objects having to be sets-with-structure that wording the definition as I did above may leave them feeling unmoored.  Yet the technique avoids having to mention large vs. small until a problem comes up. (In category theory it sometimes comes up when you want to quantify over all objects.)

The ideas outlined in this subsection could be a project for math majors.  You would have to introduce Russell’s Paradox.  But for an early-on intro to categories you could just use the syntactic wording and avoid large vs. small altogether.

 

http://en.wikibooks.org/w/index.php?title=Category_Theory/Categories&stableid=2221684


Defining “category”

The concept of category is typically taught later in undergrad math than the concept of group is.  It is supposedly a more advanced concept.  Indeed, the typical examples of categories used in applications are more advanced than some of those in group theory (for example, symmetries of geometric shapes and operations on numbers).

Here are some thoughts on how categories could be taught as early as groups, if not earlier.

Nodes and arrows

Small finite categories can be pictured as a graph using nodes and arrows, together with a specification of the identity arrows and a definition of the composition.  (I am using the word “graph” the way category people use it:  a directed graph with possible multiple edges and loops.)

An example is the category pictured below with three objects and seven arrows. The composition is forced except for $kh$, which I hereby define to be $f$.

This way of picturing a category is  easy to grasp. The composite $kh$ visibly has to be either $f$ or $g$.  There is only one choice for the composite of any other composable pair.  Still, the choice of composite is not deducible directly by looking at the graph.

A first class in category theory using graphs as examples could start with this example, or the example in Note 1 below.  This example is nontrivial (never start any subject with trivial examples!) and easy to grasp, in this case using the extraordinary preprocessing your brain does with the input from your eyes.  The definition of category is complicated enough that you should probably present the graph and then give the definition while pointing to what each clause says about the graph.

Most abstract structures have several different ways of representing them. In contrast, when you discuss categorial concepts the standard object-and-arrow notation is the overwhelming favorite.  It reveals domains and codomains and composable pairs, in fact almost everything except which of several possible arrows the composite actually is.  If for example you try to define category using sets and functions as your running example, the student has to do a lot of on-the-go chunking — thinking of a set as a single object, of a set function (which may involve lots of complicated data) as a single chunk with a domain and a codomain, and so on.  But an example shown as a graph comes already chunked and in a picture that is guaranteed to be the most common kind of display they will see in discussions of categories.

After you do these examples, you can introduce trivial and simple graph examples in which the composition is entirely induced; for example these three:

(In case you are wondering, one of them is the empty category.)  I expect that you should also introduce another graph non-example in which associativity fails.

Multiplication tables

The multiplication table for a group is easy to understand, too, in the sense that it gives you a simple method of calculating the product of any two elements.  But it doesn’t provide a visual way to see the product as a category-as-graph does.  Of course, the graph representation works only for finite categories, just as the multiplication table works only for finite groups.

You can give a multiplication table for a small finite category, too, like the one below for the category above.  (“iA” means the identity arrow on A and composition, as usual in category theory, is right to left.) This is certainly more abstract than the graph picture, but it does hit you in the face with the fact that the multiplication is partial.
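A table of this kind can also be generated and checked mechanically. The Python sketch below rests on an assumption: it takes the objects to be A, B, C with h: A→B, k: B→C and f, g: A→C, which fits the description "three objects and seven arrows" but is a guess at the endpoints. The function names are likewise invented for the illustration.

```python
from itertools import product

# Assumed endpoints (domain, codomain) for the seven arrows.
arrows = {
    "iA": ("A", "A"), "iB": ("B", "B"), "iC": ("C", "C"),
    "h": ("A", "B"), "k": ("B", "C"),
    "f": ("A", "C"), "g": ("A", "C"),
}
identity = {"A": "iA", "B": "iB", "C": "iC"}


def compose(second, first, kh="f"):
    """Right-to-left composite `second first`, or None when the pair is not composable."""
    dom1, cod1 = arrows[first]
    dom2, cod2 = arrows[second]
    if cod1 != dom2:
        return None                # the multiplication is partial
    if first == identity[dom1]:
        return second
    if second == identity[cod2]:
        return first
    return kh                      # the only remaining composable pair is (k, h)


def is_associative(kh="f"):
    """Check (a b) c == a (b c) for every composable triple of arrows."""
    for a, b, c in product(arrows, repeat=3):
        ab, bc = compose(a, b, kh), compose(b, c, kh)
        if ab is not None and bc is not None:
            if compose(ab, c, kh) != compose(a, bc, kh):
                return False
    return True


# Print the table: rows are the second arrow, columns the first; "." marks an undefined composite.
names = list(arrows)
print("     " + " ".join(f"{n:>3}" for n in names))
for second in names:
    row = [compose(second, first) or "." for first in names]
    print(f"{second:>3}: " + " ".join(f"{c:>3}" for c in row))

print(is_associative("f"), is_associative("g"))
```

Under these assumptions both choices for kh pass the associativity check, which is the sense in which the graph alone does not determine the composition.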

Notes

1. My suggested example of a category given as a graph shows clearly that you can define two different categorial structures on the graph.  One problem is that the two different structures are isomorphic categories.  In fact, if you engage the students in a discussion about these examples someone may notice that!  So you should probably also use the graph below, where you can define several different category structures that are not all isomorphic.

2. Multiplication tables and categories-as-graphs-with-composition are extensional presentations.  This means they are presented with all their parts laid out in front of you.  Most groups and categories are given by definitions as accumulations of properties (see concept in the Handbook of Mathematical Discourse).  These definitions tend to make some requirements such as associativity obvious.

Students are sometimes bothered by extensional definitions.  “What are h and k (in the category above)?  What are a, b and c?” (in a group given as a set of letters and a multiplication table).


Definition of “function”

I have made a major revision of the abstractmath.org article Functions: Specification and Definition.   The links from the revised article lead into the main abstractmath website, but links from other articles on the website still go back to the old version. So if you click on a link in the revised version, make it come up in a new window.

I expect to link the revision in after I make a few small changes, and I will take into account any comments from you all.

Remarks

1.  You will notice that the new version is in PDF instead of HTML.  A couple of other articles on the website are already in PDF, but I don’t expect to continue replacing HTML by PDF.   It is too much work.  Besides, you can’t shrink it to fit tablets.

2. It would also have been a lot of work to adapt the revision so that I could display it directly on Word Press.  In some cases I have written revisions first in WP and then posted them on the abmath website.  That is not so difficult and I expect to do it again.


Freezing a family of functions

The interactive examples in this post require installing Wolfram CDF player, which is free and works on most desktop computers using Firefox, Safari and Internet Explorer, but not Chrome. The source code is the Mathematica Notebook algebra1.nb, which is available for free use under a Creative Commons Attribution-ShareAlike 2.5 License. The notebook can be read by CDF Player if you cannot make the embedded versions in this post work.

Some background

  • Generally, I have advocated using all sorts of images and metaphors to enable people to think about particular mathematical objects more easily.
  • In previous posts I have illustrated many ways (some old, some new, many recently using Mathematica CDF files) that you can provide such images and metaphors, to help university math majors get over the abstraction cliff.
  • When you have to prove something you find yourself throwing out the images and metaphors (usually a bit at a time rather than all at once) to get down to the rigorous view of math [1], [2], [3], to the point where you think of all the mathematical objects you are dealing with as unchanging and inert (not reacting to anything else).  In other words, dead.
  • The simple example of a family of functions in this post is intended to give people a way of thinking about getting into the rigorous view of the family.  So this post uses image-and-metaphor technology to illustrate a way of thinking about one of the basic proof techniques in math (representing the object in rigor mortis so you can dissect it).  I suppose this is meta-math-ed.  But I don’t want to think about that too much…
  • This example also illustrates the difference between parameters and variables. The bottom line is that the difference is entirely in how we think about them. I will write more about that later.

 A family of functions

This graph shows individual members of the family of functions \( y=a\sin\,x\) for various values of a. Let’s look at some of the ways you can think about this.

  • Each choice of a “shows the function for that value of the parameter a”.  But really, it shows the graph of the function, and in fact only the part between x=-4 and x=4.
  • You can also think of it as showing the function changing shape as a changes over time (as you slide the controller back and forth).

Well, you can graph something changing over time by introducing another axis for time.  When you graph vertical motion of a particle over time you use a two-dimensional picture, one axis representing time and the other the height of the particle. Our representation of the function y=a\sin\,x is a two-dimensional object (using its graph) so we represent the function in 3-space, as in this picture, where the slider not only shows the current (graph of the) function for parameter value a but also locates it over a on the z axis.

The picture below shows the surface given by y=a\sin\,x as a function of both variables a and x. Note that this graph is static: it does not change over time (no slide bar!). This is the family of functions represented as a rigorous (dead!) mathematical object.

If you click the “Show Curves” button, you will see a selection of the curves in the middle diagram above drawn as functions of x for certain values of a. Each blue curve is thus a sine wave of amplitude a. Pushing that button illustrates the process going on in your mind when you concentrate on one aspect of the surface, namely its cross-sections in the x direction.
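The same shift of viewpoint can be sketched in a few lines of Python (this is an illustration only, not the Mathematica code behind the diagrams; the names `surface` and `member` are made up). The family is one static function of two variables, and fixing a picks out a single cross-section.

```python
import math
from functools import partial


def surface(a: float, x: float) -> float:
    """The whole family as one static function of the two variables a and x."""
    return a * math.sin(x)


def member(a: float):
    """Freeze the parameter a to get one member of the family, a function of x alone."""
    return partial(surface, a)


f2 = member(2.0)                        # the sine wave of amplitude 2
xs = [-4 + 0.5 * i for i in range(17)]  # sample points from x = -4 to x = 4
print([round(f2(x), 3) for x in xs])
```

Nothing in the code distinguishes a from x as a kind of quantity; the only difference is which argument gets fixed first.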

Reference [4] gives the code for the diagrams in this post, as well as a couple of others that may add more insight to the idea. Reference [5] gives similar constructions for a different family of functions.

References

  1. Rigorous view in abstractmath.org 
  2. Representations II: Dry Bones (post)
  3. Representations III: Rigor and Rigor Mortis (post)
  4. FamiliesFrozen.nb.
  5. AnotherFamiliesFrozen.nb (Mathematica file showing another family of functions)

Thinking about abstract math

 

The abstraction cliff

In universities in the USA, a math major typically starts with calculus, followed by courses such as linear algebra, discrete math, or a special intro course for math majors (which may be taken simultaneously with calculus), then goes on to abstract algebra, analysis, and other courses involving abstraction and proofs.

At this point, too many of them hit a wall; their grades drop and they change majors.  They had been getting good grades in high school and in calculus because they were strong in algebra and geometry, but the sudden increase in abstraction in the newer courses completely baffles them. I believe that one big difficulty is that they can't grasp how to think about abstract mathematical objects.  (See Reference [9] and note [a].)   They have fallen off the abstraction cliff.  We lose too many math majors this way. (Abstractmath.org is my major effort to address the problems math majors have during or after calculus.)

This post is a summary of the way I see how mathematicians and students think about math.  I will use it as a reference in later posts where I will write about how we can communicate these ways of thinking.

Concept Image

In 1981, Tall and Vinner  [5] introduced the notion of the concept image that a person has about a mathematical concept or object.   Their paper's abstract says

The concept image consists of all the cognitive structure in the individual's mind that is associated with a given concept. This may not be globally coherent and may have aspects which are quite different from the formal concept definition.

The concept image you may have of an abstract object generally contains many kinds of constituents:

  • visual images of the object
  • metaphors connecting the object to other concepts
  • descriptions of the object in mathematical English
  • descriptions and symbols of the object in the symbolic language of math
  • kinetic feelings concerning certain aspects of the object
  • how you calculate parameters of the object
  • how you prove particular statements about the object

This list is incomplete and the items overlap.  I will write in detail about these ideas later.

The name "concept image" is misleading (see note [b]), so when I have written about them, I have called them metaphors or mental representations as well as concept images, for example in [3] and [4].

Abstract mathematical concepts

This is my take on the notion of concept image, which may be different from that of most researchers in math ed. It owes a lot to the ideas of Reuben Hersh [7], [8].

  • An abstract mathematical concept is represented physically in your brain by what I have called "modules" [1] (physical constituents or activities of the brain [c]).
  • The representation generally consists of many modules.  They correspond to the list of constituents of a concept image given above.  There is no assumption that all the modules are "correct".
  • This representation exists in a semi-public network of mathematicians' and students' brains. This network exercises (incomplete) control over your personal representation of the abstract structure by means of conversation with other mathematicians and reading books and papers.  In this sense, an abstract concept is a social object.  (This is the only point of view in the philosophy of math that I know of that contains any scientific content.)

Notes

[a]  Before you object that abstraction isn't the only thing they have trouble with, note that a proof is an abstract mathematical object. The written proof is a representation of the abstract structure of the proof.  Of course, proofs are a special kind of abstract structure that causes special problems for students.

[b] Cognitive science people use "image" to include nonvisual representations, but not everyone does.  Indeed, cognitive scientists use "metaphor" as well with a broader meaning than your high school English teacher.  A metaphor involves the cognitive merging of parts of two concepts (specifically with other parts not merged). See [6].

[c] Note that I am carefully not saying what the modules actually are — neurons, networks of neurons, events in the brain, etc.   From the point of view of teaching and understanding math, it doesn't matter what they are, only that they exist and live in a society where they get modified by memes  (ideas, attitudes, styles physically transmitted from brain to brain by speech, writing, nonverbal communication, appearance, and in other ways).

References

  1. Math and modules of the mind (previous post)
  2. Mathematical Concepts (previous post)
  3. Mental, physical and mathematical representations (previous post)
  4. Images and Metaphors (abstractmath.org)
  5. David Tall and Shlomo Vinner, Concept Image and Concept Definition in Mathematics with particular reference to limits and continuity, Educational Studies in Mathematics, 12 (May, 1981), no. 2, 151–169.
  6. Conceptual metaphor (Wikipedia article).
  7. What is mathematics, really? by Reuben Hersh, Oxford University Press, 1999.  Read online at Questia.
  8. 18 Unconventional Essays on the Nature of Mathematics, by Reuben Hersh. Springer, 2005.
  9. Mathematical objects (abstractmath.org).

 

 
