Tag Archives: notation

Notation for sets

This is a revision of the section of abstractmath.org on notation for sets.

Sets of numbers

The following notation for sets of numbers is fairly standard.


  • Some authors use $\mathbb{I}$ for $\mathbb{Z}$, but $\mathbb{I}$ is also used for the unit interval.
  • Many authors use $\mathbb{N}$ to denote the nonnegative integers instead
    of the positive ones.
  • To remember $\mathbb{Q}$, think “quotient”.
  • $\mathbb{Z}$ is used because the German word for “number” is “Zahl” (plural “Zahlen”); an integer is a “ganze Zahl”.

Until the 1930’s, Germany was the world center for scientific and mathematical study, and at least until the 1960’s, being able to read scientific German was required of anyone who wanted a degree in science. A few years ago I was asked to transcribe some hymns from a German hymnbook: not into English, but merely from fraktur (the old German alphabet) into the Roman alphabet. I sometimes feel that I am the last living American to be able to read fraktur easily.

Element notation

The expression “$x\in A$” means that $x$ is an element of the set $A$. The expression “$x\notin A$” means that $x$ is not an element of $A$.

“$x\in A$” is pronounced in any of the following ways:

  • “$x$ is in $A$”.
  • “$x$ is an element of $A$”.
  • “$x$ is a member of $A$”.
  • “$A$ contains $x$”.
  • “$x$ is contained in $A$”.


  • Warning: The math English phrase “$A$ contains $B$” can mean either “$B\in A$” or “$B\subseteq A$”.
  • The Greek letter epsilon occurs in two forms in math, namely $\epsilon$ and $\varepsilon$. Neither of them is the symbol for “element of”, which is “$\in$”. Nevertheless, it is not uncommon to see either “$\epsilon$” or “$\varepsilon$” being used to mean “element of”.
  • $4$ is an element of all the sets $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$.
  • $-5\notin \mathbb{N}$ but it is an element of all the others.

List notation

Definition: list notation

A set with a small number of elements may be denoted by listing the elements inside braces (curly brackets). The list must include exactly all of the elements of the set and nothing else.


The set $\{1,\,3,\,\pi \}$ contains the numbers $1$, $3$ and $\pi $ as elements, and no others. So $3\in \{1,3,\pi \}$ but $-3\notin \{1,\,3,\,\pi \}$.

Properties of list notation

List notation shows every element and nothing else

If $a$ occurs in a list notation, then $a$ is in the set the notation defines.  If it does not occur, then it is not in the set.

Be careful

When I say “$a$ occurs” I don’t mean it necessarily occurs using that name. For example, $3\in\{3+5,2+3,1+2\}$.

The order in which the elements are listed is irrelevant

For example, $\{2,5,6\}$ and $\{5,2,6\}$ are the same set.

Repetitions don’t matter

$\{2,5,6\}$, $\{5,2,6\}$, $\{2,2,5,6 \}$ and $\{2,5,5,5,6,6\}$ are all different representations of the same set. That set has exactly three elements, no matter how many numbers you see in the list notation.

Multisets may be written with braces and repeated entries, but then the repetitions mean something.
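Python's built-in `set` type ignores order and repetition in the same way, so it can serve as a quick sanity check of the two properties above. This is only an illustrative sketch; using `collections.Counter` as a stand-in for a multiset is my assumption, not something the text prescribes.

```python
from collections import Counter

# Order and repetition are irrelevant for sets.
a = {2, 5, 6}
b = {5, 2, 6}
c = {2, 2, 5, 6}
d = {2, 5, 5, 5, 6, 6}
same_set = (a == b == c == d)   # all four notations denote one set
size = len(d)                   # counts distinct elements only

# In a multiset (modeled here by Counter) the repetitions do mean something.
m1 = Counter([2, 5, 6])
m2 = Counter([2, 2, 5, 6])
same_multiset = (m1 == m2)

print(same_set, size, same_multiset)
```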

When elements are sets

When (some of) the elements in list notation are themselves sets (more about that here), care is required.  For example, the numbers $1$ and $2$  are not elements of the set \[S:=\left\{ \left\{ 1,\,2,\,3 \right\},\,\,\left\{ 3,\,4 \right\},\,3,\,4 \right\}\]The elements listed include the set $\{1, 2, 3\}$ among others, but not the number $2$.  The set $S$ contains four elements, two sets and two numbers. 

Another way of saying this is that the element relation is not transitive: The facts that $A\in B$ and $B\in C$ do not imply that $A\in C$. 
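The set $S$ above can be mimicked in Python to see the failure of transitivity concretely. The sketch below uses `frozenset` for the inner sets only because Python requires set elements to be hashable; that is a quirk of the language, not of the mathematics.

```python
# S = { {1, 2, 3}, {3, 4}, 3, 4 }
inner = frozenset({1, 2, 3})
S = {inner, frozenset({3, 4}), 3, 4}

print(len(S))        # two sets and two numbers
print(inner in S)    # the set {1, 2, 3} is an element of S
print(2 in inner)    # 2 is an element of {1, 2, 3} ...
print(2 in S)        # ... but 2 is not an element of S
```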

Sets are arbitrary

  • Any mathematical object can be the element of a set.
  • The elements of a set do not have to have anything in common.
  • The elements of a set do not have to form a pattern.
  • $\{1,3,5,6,7,9,11,13,15,17,19\}$ is a set. There is no point in asking, “Why did you put that $6$ in there?” (Sets can be arbitrary.)
  • Let $f$ be the function on the reals for which $f(x)=x^3-2$. Then \[\left\{\pi^3,\mathbb{Q},f,42,\{1,2,7\}\right\}\] is a set. Sets do not have to be homogeneous in any sense.

Setbuilder notation


Suppose $P$ is an assertion. Then the expression “$\left\{x|P(x) \right\}$” denotes the set of all objects $x$ for which $P(x)$ is true. It contains no other elements.

  • The notation “$\left\{ x|P(x) \right\}$” is called setbuilder notation.
  • The assertion $P$ is called the defining condition for the set.
  • The set $\left\{ x|P(x) \right\}$ is called the truth set of the assertion $P$.

In these examples, $n$ is an integer variable and $x$ is a real variable.

  • The expression “$\{n| 1\lt n\lt 6 \}$” denotes the set $\{2, 3, 4, 5\}$. The defining condition is “$1\lt n\lt 6$”.  The set $\{2, 3, 4, 5\}$ is the truth set of the assertion “$n$ is an integer and $1\lt n\lt 6$”.
  • The notation $\left\{x|{{x}^{2}}-4=0 \right\}$ denotes the set $\{2,-2\}$.
  • $\left\{ x|x+1=x \right\}$ denotes the empty set.
  • $\left\{ x|x+0=x \right\}=\mathbb{R}$.
  • $\left\{ x|x\gt6 \right\}$ is the infinite set of all real numbers bigger than $6$.  For example, $6\notin \left\{ x|x\gt6 \right\}$ and $17\pi \in \left\{ x|x\gt6 \right\}$.
  • The set $\mathbb{I}$ defined by $\mathbb{I}=\left\{ x|0\le x\le 1 \right\}$ has among its elements $0$, $1/4$, $\pi /4$, $1$, and an infinite number of
    other numbers. $\mathbb{I}$ is fairly standard notation for this set – it is called the unit interval.
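Set comprehensions in Python read almost exactly like setbuilder notation, with one important difference: the variable has to range over a finite collection. In the sketch below, `universe` is a hypothetical finite stand-in for the integers (and, for the examples with $x$, a very crude stand-in for the reals), so the last line only imitates $\left\{ x|x+0=x \right\}=\mathbb{R}$.

```python
# Hypothetical finite stand-in for the integers; setbuilder notation has no such limit.
universe = range(-10, 11)

between = {n for n in universe if 1 < n < 6}       # {n | 1 < n < 6}
roots = {x for x in universe if x**2 - 4 == 0}     # {x | x^2 - 4 = 0}
empty = {x for x in universe if x + 1 == x}        # {x | x + 1 = x}
everything = {x for x in universe if x + 0 == x}   # {x | x + 0 = x}

print(between, roots, empty, everything == set(universe))
```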

Usage and terminology

  • A colon may be used instead of “|”. So $\{x|x\gt6\}$ could be written $\{x:x\gt6\}$.
  • Logicians and some mathematicians call the truth set of $P$ the extension of $P$. This is not connected with the usual English meaning of “extension” as an add-on.
  • When the assertion $P$ is an equation, the truth set of $P$ is usually called the solution set of $P$. So $\{2,-2\}$ is the solution set of $x^2=4$.
  • The expression “$\{n|1\lt n\lt6\}$” is commonly pronounced as “The set of integers such that $1\lt n$ and $n\lt6$.” This means exactly the set $\{2,3,4,5\}$. Students whose native language is not English sometimes assume that a set such as $\{2,4,5\}$ fits the description.

Setbuilder notation is tricky

Looking different doesn’t mean they are different.

A set can be expressed in many different ways in setbuilder notation. For example, $\left\{ x|x\gt6 \right\}=\left\{ x|x\ge 6\text{ and }x\ne 6 \right\}$. Those two expressions denote exactly the same set. (But $\left\{x|x^2\gt36 \right\}$ is a different set.)
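A finite experiment can make this plausible, though it proves nothing about the real numbers. The name `sample` below is a hypothetical finite pool of integers chosen just for illustration.

```python
sample = range(-10, 11)   # hypothetical finite sample; the sets in the text are sets of reals

s1 = {x for x in sample if x > 6}
s2 = {x for x in sample if x >= 6 and x != 6}
s3 = {x for x in sample if x**2 > 36}

print(s1 == s2)          # two descriptions, one set
print(s1 == s3)          # a genuinely different set
print(sorted(s3 - s1))   # the extra elements are the numbers below -6
```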

Russell’s Paradox

In certain areas of math research, setbuilder notation can go seriously wrong. See Russell’s Paradox if you are curious.

Variations on setbuilder notation

An expression may be used left of the vertical line in setbuilder notation, instead of a single variable.

Giving the type of the variable

You can use an expression on the left side of setbuilder notation to indicate the type of the variable.


The unit interval $\mathbb{I}$ could be defined as \[\mathbb{I}=\left\{x\in \mathbb{R}\,|\,0\le x\le 1 \right\}\]making it clear that it is a set of real numbers rather than, say, rational numbers.  You can always get rid of the type expression to the left of the vertical line by complicating the defining condition, like this:\[\mathbb{I}=\left\{ x|x\in \mathbb{R}\text{ and }0\le x\le 1 \right\}\]

Other expressions on the left side

Other kinds of expressions occur before the vertical line in setbuilder notation as well.


The set\[\left\{ {{n}^{2}}\,|\,n\in \mathbb{Z} \right\}\]consists of all the squares of integers; in other words its elements are $0,1,4,9,16,\ldots$.  This definition could be rewritten as $\left\{m|\text{ there is an }n\in \mathbb{Z}\text{ such that }m={{n}^{2}} \right\}$.


Let $A=\left\{1,3,6 \right\}$.  Then $\left\{ n-2\,|\,n\in A\right\}=\left\{ -1,1,4 \right\}$.


Be careful when you read such expressions.


The integer $9$ is an element of the set \[\left\{{{n}^{2}}\,|\,n\in \mathbb{Z}\text{ and }n\ne 3 \right\}\]It is true that $9={{3}^{2}}$ and that $3$ is excluded by the defining condition, but it is also true that $9={{(-3)}^{2}}$ and $-3$ is not an integer ruled out by the defining condition.
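The examples in this section, with an expression to the left of the vertical line, correspond to Python comprehensions whose output expression is not just the bare variable. As before, `universe` is a hypothetical finite stand-in for $\mathbb{Z}$, so the first set is only a truncated version of the set of all squares.

```python
universe = range(-5, 6)   # hypothetical finite stand-in for the integers

squares = {n**2 for n in universe}                       # {n^2 | n in Z}, truncated
A = {1, 3, 6}
shifted = {n - 2 for n in A}                             # {n - 2 | n in A}
squares_without_3 = {n**2 for n in universe if n != 3}   # {n^2 | n in Z and n != 3}, truncated

print(sorted(squares))
print(shifted == {-1, 1, 4})
print(9 in squares_without_3)   # 9 still arises from n = -3
```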


Sets. Previous post.


Toby Bartels for corrections.


This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.


More alphabets

This post is the third and last in a series of posts containing revisions of the abstractmath.org article Alphabets. The first two were:

Addition to the listings for the Greek alphabet

Sigma: $\Sigma,\,\sigma$ or ς: sĭg'mɘ. The upper case $\Sigma $ is used for indexed sums.  The lower case $\sigma$ (don't call it "oh") is used for the standard deviation and also for the sum-of-divisors function. The ς form for the lower case has not as far as I know been used in math writing, but I understand that someone is writing a paper that will use it.

Hebrew alphabet

Aleph, א, is the only Hebrew letter that is widely used in math. $\aleph_0$ is the cardinality of the set of integers. A set with cardinality $\aleph_0$ is countably infinite. More generally, $\aleph_0$ is the first of the aleph numbers; the later ones are $\aleph_1$, $\aleph_2$, $\aleph_3$, and so on.

Cardinality theorists also write about the beth (ב) numbers, and the gimel (ג) function. I am not aware of other uses of the Hebrew alphabet.

If you are thinking of using other Hebrew letters, watch out: If you type two Hebrew letters in a row in HTML they show up on the screen in reverse order. (I didn't know HTML was so clever.)

Cyrillic alphabet

The Cyrillic alphabet is used to write Russian and many other languages in that area of the world. Wikipedia says that the letter Ш, pronounced "sha", is the only Cyrillic letter used in math. I have not investigated further.

The letter is used in several different fields, to denote the Tate-Shafarevich group, the Dirac comb and the shuffle product.

It seems to me that there is a whole world of possibilities for brash young mathematicians to name mathematical objects with other Cyrillic letters. Examples:

  • Ж. Use it for an ornate construction, like the Hopf fibration or a wreath product.
  • Щ. This would be mean because it is hard to pronounce.
  • Ъ. Guaranteed to drive people crazy, since it is silent. (It does have a name, though: "Yehr".)
  • Э. Its pronunciation indicates you are unimpressed (think Fonz).
  • ю. Pronounced "you". "ю may provide a counterexample". "I do?"

Type styles

Boldface and italics

A typeface is a particular design of letters.  The typeface you are reading is Arial.  This is Times New Roman. This is Goudy. (Goudy may not render correctly on your screen if you don't have it installed.)

Typefaces typically come in several styles, such as bold (or boldface) and italic.


Arial Normal   Arial italic   Arial bold
Times Normal   Times italic   Times bold
Goudy Normal   Goudy italic   Goudy bold

Boldface and italics are used with special meanings (conventions) in mathematics. Not every author follows these conventions.

Styles (bold, italic, etc.) of a particular typeface are supposedly called fonts.  In fact, these days “font” almost always means the same thing as “typeface”, so I  use “style” instead of “font”.


A letter denoting a vector is put in boldface by many authors.

  • “Let $\mathbf{v}$ be a vector in 3-space.”  Its coordinates typically would be denoted by $v_1$, $v_2$ and $v_3$.
  • You could also define it this way:  “Let $\mathbf{v}=({{v}_{1}},{{v}_{2}},{{v}_{3}})$ be a vector in 3-space.”  (See parenthetic assertion.)

It is hard to do boldface on a chalkboard, so lecturers may use $\vec{v}$ instead of $\mathbf{v}$. This is also seen in print.


The definiendum (word or phrase being defined) may be put in boldface or italics. Sometimes the boldface or italics is the only clue you have that the term is being defined. See Definitions.



“A group is Abelian if its multiplication is commutative,” where the word “Abelian” is set in boldface or in italics to mark it as the term being defined.


Italics are used for emphasis, just as in general English prose. Rarely (in my experience) boldface may be used for emphasis.

In the symbolic language

It is standard practice in printed math to put single-letter variables in italics.   Multiletter identifiers are usually upright.


Example: "$f(x)=a{{x}^{2}}+\sin x$".  Note that mathematicians would typically refer to $a$ as a “constant” or “parameter”, but in the sense we use the word “variable” here, it is a variable, and so is $f$.


On the other hand, “e” is the proper name of a specific number, and so is “i”. Neither is a variable. Nevertheless in print they are usually given in italics, as in ${{e}^{ix}}=\cos x+i\sin x$.  Some authors would write this as ${{\text{e}}^{\text{i}x}}=\cos x+\text{i}\,\sin x$.  This practice is recommended by some stylebooks for scientific writing, but I don't think it is very common in math.

Blackboard bold


Blackboard bold letters are capital Roman letters written with double vertical strokes.   They look like this:


In lectures using chalkboards, they are used to imitate boldface.

In print, the most common use is to represent certain sets of numbers:


  • Mathe­ma­tica uses some lower case blackboard bold letters.
  • Many mathe­ma­tical writers disapprove of using blackboard bold in print.  I say the more different letter shapes that are available the better.  Also a letter in blackboard bold is easier to distinguish from ordinary upright letters than a letter in boldface is, particularly on computer screens.

Images and metaphors in math

About this post

This post is the new revision of the chapter on Images and Metaphors in abstractmath.org.

Images and metaphors in math

In this chapter, I say something about mental represen­tations (metaphors and images) in general, and provide examples of how metaphors and images help us understand math – and how they can confuse us.

Pay special attention to the section called two levels!  The distinction made there is vital but is often not made explicit.

Besides mental represen­tations, there are other kinds of represen­tations used in math, discussed in the chapter on represen­tations and models.

Mathe­matics is the tinkertoy of metaphor. –Ellis D. Cooper

Images and metaphors in general

We think and talk about our experiences of the world in terms of images and metaphors that are ultimately derived from immediate physical experience.  They are mental represen­tations of our experiences.

See Thinking about thought.



We know what a pyramid looks like.  But when we refer to the government’s food pyramid we are not talking about actual food piled up to make a pyramid.  We are talking about a visual image of the pyramid.    


We know by direct physical experience what it means to be warm or cold.  We use these words as metaphors
in many ways: 

  • We refer to a person as having a warm or cold personality.  This has nothing to do with their body temperature.
  • When someone is on a treasure hunt we may tell them they are “getting warm”, even if they are hunting outside in the snow.

Children don’t always sort meta­phors out correctly. Father: “We are all going to fly to Saint Paul to see your cousin Petunia.” Child: “But Dad, I don’t know how to fly!”

Other terminology

  • My use of the word “image” means mental image. In the study of literature, the word “image” is used in a more general way, to refer to an expression that evokes a mental image.
  • I use “metaphor” in the sense of conceptual metaphor. The word metaphor in literary studies is related to my use but is defined in terms of how it is expressed.
  • The metaphors mentioned above involving “warm” and “cold” evoke a sensory experience, and so could be called images as well.
  • In math education, the phrase concept image means the mental structure associated with a concept, so there may be no direct connection with sensory experience.  
  • In abstractmath.org, I use the phrase metaphors and images to talk about all our mental represen­tations, without trying for fine distinctions.

Mental represen­tations are imperfect

One basic fact about metaphors and images is that they apply only to certain aspects of the situation.

  • When someone is getting physically warm we would expect them to start sweating.
  • But if they are getting warm in a treasure hunt we don’t expect them to start sweating. 
  • We don’t expect the food pyramid to have a pharaoh buried underneath it, either.

Our brains handle these aspects of mental represen­tations easily and usually without our being conscious of them.  They are one of the primary ways we understand the world.

Images and metaphors in math

Half this game is 90% mental. –Yogi Berra

Types of represen­tations

Mathematicians who work with a particular kind of mathematical object have mental representations of that type of object that help them understand it.  These mental representations come in many forms.  Most of them fit into one of the types below, but the list shouldn’t be taken too seriously: Some representations fit more than one of these types, and some may not fit into any of them except awkwardly.

  • Visual
  • Notation
  • Kinetic
  • Process
  • Object

All mental represen­tations are conceptual metaphors. Metaphors are treated in detail in this chapter and in the chapter on images and metaphors for functions.  See also literalism and Proofs without dry bones on Gyre&Gimble.

Below I list some examples. Many of them refer to the arch function, the function defined by $h(t)=25-{{(t-5)}^{2}}$.

Visual image

Geometric figures

The arch function

  • You can picture the arch function in terms of its graph, which is a parabola.     This visualization suggests that the function has a single maximum point that appears to occur at $t=5$. That is an example of how metaphors can suggest (but not prove) theorems.
  • You can think of the arch function
    more physically, as like the Gateway Arch. This metaphor is suggested by the graph.

Interior of a shape

  • The interior of a closed curve or a sphere is called that because it is like the interior in the everyday sense of a bucket or a house.
  • Sometimes, the interior can be described using analytic geometry. For example, the interior of the circle $x^2+y^2=1$ is the set of points \[\{(x,y)|x^2+y^2\lt1\}\]
  • But the “interior” metaphor is imperfect: The boundary of a real-life container such as a bucket has thickness, in contrast to the boundary of a closed curve or a sphere. 
  • This observation illustrates my description of a metaphor as identifying part of one situation with part of another. One aspect is emphasized; another aspect, where they may differ, is ignored.
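The set of points given in the second bullet above translates directly into a membership test. Here is a minimal Python sketch; the function name `in_interior` is my own invention for illustration.

```python
def in_interior(x: float, y: float) -> bool:
    """Return True when (x, y) lies strictly inside the circle x^2 + y^2 = 1."""
    return x**2 + y**2 < 1

print(in_interior(0.5, 0.5))   # a point inside the circle
print(in_interior(1.0, 0.0))   # a point on the circle itself, so not in the interior
print(in_interior(1.0, 1.0))   # a point outside
```

Note that the boundary point is excluded: unlike the wall of a bucket, the circle has no thickness, and the strict inequality is what separates it from the interior.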

Real number line

  • You may think of the real numbers as lying along a straight line (the real line) that extends infinitely far in both directions.  This is both visual and a metaphor (a real number “is” a place on the real line).
  • This metaphor is imperfect because you can’t draw the whole real line, but only part of it. But you can’t draw the whole graph of the curve $y=25-(t-5)^2$, either.

Continuous functions

No gaps

“Continuous functions don’t have gaps in the graph”.    This is a visual image, and it is usually OK.

  • But consider the curve defined by $y=25-(t-5)^2$ for every real $t$ except $t=1$. It is not defined at $t=1$ (and so the function is discontinuous there) but its graph looks exactly like the graph in the figure above because no matter how much you magnify it you can’t see the gap.
  • This is a typi­cal math example that teachers make up to raise your consciousness.

  • So is there a gap or not?

No lifting

“Continuous functions can be drawn without lifting the chalk.” This is true in most familiar cases (provided you draw the graph only on a finite interval). But consider the graph of the function defined by $f(0)=0$ and \[f(t)=t\sin\frac{1}{t}\ \ \ \ \ \ \ \ \ \ (0\lt t\lt 0.16)\]
(see Split Definition). This curve is continuous and is infinitely long even though it is defined on a finite interval, so you can’t draw it with a chalk at all, picking up the chalk or not. Note that it has no gaps.
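The claim that the curve is infinitely long can be made plausible numerically. Any polygon whose corners lie on a curve is no longer than the curve, so the sketch below measures polygons through more and more of the points where $\sin\frac{1}{t}=\pm1$. The sample points $t_k=1/((k+\frac12)\pi)$ and the function names are my own choices; the computation suggests, but of course does not prove, that the lengths grow without bound.

```python
import math

def f(t: float) -> float:
    """The function from the text: f(0) = 0 and f(t) = t*sin(1/t) otherwise."""
    return 0.0 if t == 0 else t * math.sin(1.0 / t)

def polygon_length(num_points: int) -> float:
    """Length of the polygon through the points (t_k, f(t_k)),
    where t_k = 1/((k + 1/2)*pi) for k = 2, ..., num_points + 1.
    At these points sin(1/t_k) is +1 or -1, alternating with k,
    and every t_k lies in the interval 0 < t < 0.16."""
    ts = [1.0 / ((k + 0.5) * math.pi) for k in range(2, num_points + 2)]
    pts = [(t, f(t)) for t in ts]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

# Each tenfold increase in the number of corners adds a roughly constant amount of length.
for n in (10, 100, 1000, 10000):
    print(n, round(polygon_length(n), 3))
```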

Keeping concepts separate by using mental “space”

I personally use visual images to remember relationships between abstract objects, as well.  For example, if I think of three groups, two of which are isomorphic (for example $\mathbb{Z}_{3}$ and $\text{Alt}_3$), I picture them as in three different places in my head with a connection between the two isomorphic ones.


Here I give some examples of thinking of math objects in terms of the notation used to name them. There is much more about notation as mathe­matical represen­tation in these sections of abmath:

Notation is both something you visualize in your head and also a physical represen­tation of the object.  In fact notation can also be thought of as a mathe­matical object in itself (common in mathe­matical logic and in theoretical computing science.)   If you think about what notation “really is” a lot,  you can easily get a headache…


  • When I think of the square root of $2$, I visualize the symbol “$\sqrt{2}$”. That is both a typographical object and a mathe­matically defined symbolic represen­tation of the square root of $2$.
  • Another symbolic represen­tation of the square root of $2$ is “$2^{1/2}$”. I personally don’t visualize that when I think of the square root of $2$, but there is nothing wrong with visualizing it that way.
  • What is dangerous is thinking that the square root of $2$ is the symbol “$\sqrt{2}$” (or “$2^{1/2}$” for that matter). The square root of $2$ is an abstract mathe­matical object given by a precise mathe­matical definition.
  • One precise definition of the square root of $2$ is “the positive real number $x$ for which $x^2=2$”. Another definition is that $\sqrt{2}=\exp\left(\frac{1}{2}\log2\right)$.


  • If I mention the number “two thousand, six hundred forty six” you may visualize it as “$2646$”. That is its decimal represen­tation.
  • But $2646$ also has a prime factorization, namely $2\times3^3\times7^2$.
  • It is wrong to think of this number as being the notation “$2646$”. Different notations have different values, and there is no mathe­matical reason to make “$2646$” the “genuine” represen­tation. See represen­tations and Models.
  • For example, the prime factor­ization of $2646$ tells you imme­diately that it is divisible by $49$.
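The two representations of this number can be converted into each other mechanically. The following Python sketch uses plain trial division, which is merely one convenient way to find the prime factorization; the function name is hypothetical.

```python
def prime_factorization(n: int) -> dict:
    """Return {prime: exponent} for an integer n >= 2, found by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # Whatever is left after removing all small factors is itself prime.
        factors[n] = factors.get(n, 0) + 1
    return factors

print(prime_factorization(2646))   # the exponents of 2, 3 and 7
print(2646 % 49 == 0)              # divisibility by 49, read off from the exponent of 7
```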

When I was in high school in the 1950’s, I was taught that it was incorrect to say “two thousand, six hundred and forty six”. Being naturally rebellious I used that extra “and” in the early 1960’s in dictating some number in a telegraph mes­sage. The Western Union operator corrected me. Of course, the “and” added to the cost. (In case you are wondering, I was in the middle of a postal Diplomacy game in Graustark.)


Set notation

You can think of the set containing $1$, $3$ and $5$ and nothing else as represented by its common list notation $\{1, 3, 5\}$.  But remember that $\{5, 1,3\}$ is another notation for the same set. In other words the list notation has irrelevant features – the order in which the elements are listed in this case.


Shoot a ball straight up

  • The arch function could model the height over time of a physical object, perhaps a ball shot vertically upwards on a planet with no atmosphere.
  • The ball starts upward at time $t=0$ at elevation $0$, reaches an elevation of (for example) $16$ units at time $t=2$, and lands at $t=10$.
  • The parabola is not the path of the ball. The ball goes up and down along the $x$-axis. A point on the parabola shows its location on the $x$-axis at time $t$.
  • When you think about this event, you may imagine a physical event continuing over time, not just as a picture but as a feeling of going up and down.
  • This feeling of the ball going up and down is created in your mind presumably using mirror neurons. It is connected in your mind by a physical connection to the understanding of the function that has been created as connections among some of your neurons.
  • Although $h(t)$ models the height of the ball, it is not the same thing as the height of the ball.  A mathe­matical object may have a relationship in our mind to physical processes or situations, but it is distinct from them.
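The elevations quoted in the list above can be checked by evaluating the arch function directly. A small Python sketch:

```python
def h(t: float) -> float:
    """The arch function h(t) = 25 - (t - 5)^2."""
    return 25 - (t - 5) ** 2

# Elevation at launch, at t = 2, at the top of the flight, and at landing.
for t in (0, 2, 5, 10):
    print(t, h(t))
```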


  1. This example involves a picture (graph of a function).  According to this report, kinetic understanding can also help with learning math that does not involve pictures. For example, when I think of evaluating the function ${{x}^{2}}+1$ at $3$, I visualize $3$ moving into the $x$ slot and then the formula $3^2+1$ transforming itself into $10$. (Not all mathematicians visualize it this way.)
  2. I make the point of emphasizing the physical existence in your brain of kinetic feelings (and all other metaphors and images) to make it clear that this whole section on images and metaphors is about objects that have a physical existence; they are not abstract ideals in some imaginary ideal space not in our world. See Thinking about thought.

I remember visualizing algebra I this way even before I had ever heard of the Transformers.


It is common to think of a function as a process: you put in a number (or other object) and the process produces another number or other object. There are examples in Images and metaphors for functions.

Long division

Let’s divide $66$ by $7$ using long division. The process consists of writing down the decimal places one by one.

  1. You guess at or count on your fingers to find the largest integer $n$ for which $7n\lt66$. That integer is $9$.
  2. Write down $9.$ ($9$ followed by a decimal point).
  3. $66-9\times7=3$, so find the largest integer $n$ for which $7n\lt3\times10$, which is $4$.
  4. Adjoin $4$ to your answer, getting $9.4$
  5. $3\times10-7\times4=2$, so find the largest integer $n$ for which $7n\lt2\times10$, which is $2$.
  6. Adjoin $2$ to your answer, getting $9.42$.
  7. $2\times10-7\times2=6$, so find the largest integer $n$ for which $7n\lt6\times10$, which is $8$.
  8. Adjoin $8$ to your answer, getting $9.428$.
  9. $6\times10-7\times8=4$, so find the largest integer $n$ for which $7n\lt4\times10$, which is $5$.
  10. Adjoin $5$ to your answer, getting $9.4285$.

You can continue with the procedure to get as many decimal places as you wish of $\frac{66}{7}$.
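The steps above can be sketched as a short Python function. This is one possible rendering, not the only one: the function name and parameters are hypothetical, and `divmod` finds the largest $n$ with $7n\le$ the current number, which agrees with the strict inequality in the steps whenever the division does not come out even (as is the case for $\frac{66}{7}$).

```python
def long_division_digits(numerator: int, denominator: int, places: int) -> str:
    """Decimal expansion of numerator/denominator, truncated to `places` decimal places.
    Assumes numerator >= 0 and denominator > 0."""
    # Steps (1) and (2): the integer part, followed by a decimal point.
    quotient, remainder = divmod(numerator, denominator)
    answer = f"{quotient}."
    # Each pass through the loop mirrors one pair of steps:
    # multiply the remainder by 10, find the next digit, adjoin it to the answer.
    count = 0
    while count < places:
        digit, remainder = divmod(remainder * 10, denominator)
        answer += str(digit)
        count += 1
    return answer

print(long_division_digits(66, 7, 4))   # prints 9.4285
```

The `places` argument plays the role of your stamina: the loop runs once for each decimal place requested.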


The sequence of actions just listed is quite difficult to follow. The difficulty is not in understanding what the steps say to do, but in seeing where the numbers come from. So do this exercise!

Exercise worth doing:

Check that the procedure above is exactly what you do to divide $66$ by $7$ by the usual method taught in grammar school:

  • The long division process produces as many decimal places as you have stamina for. It is likely for most readers that when you do long division by hand you have done it so much that you know what to do next without having to consult a list of instructions.
  • It is a process or procedure but not what you might want to call a function. The process recursively constructs the successive integers occurring in the decimal expansion of $\frac{66}{7}$.
  • When you carry out the grammar school procedure above, you know at each step what to do next. That is why it is a process. But do you have the procedure in your head all at once?
  • Well, instructions (5) through (10) could be written in a programming language as a while loop, grouping the instructions in pairs of commands ((5) and (6), (7) and (8), and so on). However many times you go through the while loop determines the number of decimal places you get.
  • It can also be described as a formally defined recursive function $F$ for which $F(n)$ is the $n$th digit in the answer.
  • The program and the recursive definition mentioned in the last two bullets are both exercises worth doing.
  • Each of the answers to the exercises is then a mathematical object, and that brings us to the next type of metaphor…


A particular kind of metaphor or image for a mathematical concept is that of a mathematical object that represents the concept.


  • The number $10$ is a mathematical object. The expression “$3^2+1$” is also a mathematical object. It encapsulates the process of squaring $3$ and adding $1$, and so its value is $10$.
  • The long division process above finds the successive decimal places of a fraction of integers. A program that carries out the algorithm encapsulates the process of long division as an algorithm. The result is a mathematical object.
  • The expression “$1958$” is a mathematical object, namely the decimal represen­tation of the number $1958$. The expression
    “$7A6$” is the hexadecimal represen­tation of $1958$. Both represen­tations are mathematical objects with precise definitions.
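The two representations of $1958$ in the last bullet can be produced and compared in Python. This sketch relies only on the built-in `format` and `int` functions.

```python
n = 1958
decimal_repr = str(n)        # the decimal representation, as a string of digits
hex_repr = format(n, "X")    # the hexadecimal representation, with upper-case digits

print(decimal_repr, hex_repr)
print(int("7A6", 16) == n)   # both strings name the same number
```

The strings differ, yet they denote one number, which is the distinction between a representation and the object it represents.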

Representations as math objects are discussed primarily in representations and Models. The difference between representations as math objects and other kinds of mental representations (images and metaphors) is primarily that a math object has a precise mathematical definition. Even so, they are also mental representations.

Uses of mental represen­tations

Mental represen­tations of a concept make up what is arguably the most important part of the mathe­matician’s understanding of the concept.

  • Mental representations of mathematical objects using metaphors and images are necessary for understanding and communicating about them (especially with types of objects that are new to us).
  • They are necessary for seeing how the theory can be applied.
  • They are useful for coming up with proofs. (See example below.) 

Many represen­tations

 Different mental represen­tations of the same kind of object
help you understand different aspects of the object. 

Every important mathe­matical object
has many different kinds of represen­tations
and mathe­maticians typically keep
more than one of them in mind at once.

But images and metaphors are also dangerous (see below).

New concepts and old ones

We especially depend on metaphors and images to understand a math concept that is new to us.  But if we work with it for a while, finding lots of examples, and
eventually proving theorems and providing counterexamples to conjectures, we begin to understand the concept in its own terms and the images and metaphors tend to fade away from our awareness.

Then, when someone asks us about this concept that we are now experts with, we
trundle out our old images and metaphors – and are often surprised at how difficult and misleading our listener finds them!

Some mathe­maticians retreat from images and metaphors because of this and refuse to do more than state the definition and some theorems about the concept. They are wrong to do this. That behavior encourages the attitude of many people that

  • Mathe­maticians can’t explain things.
  • Math concepts are incomprehensible or bizarre.
  • You have to have a mathe­matical mind to understand math.

In my opinion the third statement is only about 10 percent true.

All three of these statements are half-truths. There is no doubt that a lot of abstract math is hard to understand, but understanding is certainly made easier with the use of images and metaphors. 

Images and metaphors on this website

This website has many examples of useful mental represen­tations.  Usually, when a chapter discusses a particular type of mathe­matical object, say rational numbers, there will be a subhead entitled “Images and metaphors for rational numbers”.  This will suggest ways of thinking about them that many have found useful. 

Two levels of images and metaphors

Images and metaphors have to be used at two different levels, depending on your purpose. 

  • You should expect to use the rich view for understanding, applications, and coming up with proofs.
  • You must limit yourself to the rigorous view when constructing and checking proofs.

Math teachers and texts typically do not make an explicit distinction between these views, and you have to learn about it by osmosis. In practice, teachers and texts do make the distinction implicitly.  They will say things like, “You can think about this theorem as …” and later say, “Now we give a rigorous proof of the theorem.”  Abstractmath.org makes this distinction explicit in many places throughout the site.

The rich view

The kind of metaphors and images discussed in the mental represen­tations section above make math rich, colorful and intriguing to think about.  This is the rich view of math.  The rich view is vitally important.  

  • It is what makes math useful and interesting.
  • It helps us to understand the math we are working with.
  • It suggests applications.
  • It suggests approaches to proofs.

You expect the ball whose trajectory is modeled by the function $h(t)$ above to slow down as it rises, so the derivative of $h$ must be smaller at $t = 4$ than it is at $t = 2$. A mathematician might even say that that is an “informal proof” that $h'(4)<h'(2)$. A rigorous proof is given below.

The rigorous view: inertness

When we are constructing a definition or proof we cannot
trust all those wonderful images and metaphors. 

  • Definitions must
    not use metaphors.
  • Proofs must use only logical reasoning based on definitions and
    previously proved theorems.

From the point of view of doing proofs, math
objects must be thought of as inert (or static),
like your pet rock. This means they

  • don’t move or change over time, and
  • don’t interact with other objects, even other mathe­matical objects.

(See also abstract object).

  • When
    mathe­maticians say things like, “Now we give a rigorous proof…”, part of what they mean is that they have to forget about all the color
    and excitement of the rich view and think of math objects as totally
    inert. Like, put the object under an anesthetic
    when you are proving something about it.
  • As I wrote previously, when you are trying to understand the arch function $h(t)=25-{{(t-5)}^{2}}$, it helps to think of it as representing a ball thrown directly upward, or as a graph describing the height of the ball at time $t$ which bends over like an arch at the time when the ball stops going upward and begins to fall down.
  • When you are proving something about it, you must be in the frame of mind that says the function (or the graph) is all laid out in front of you, unmoving. That is what the rigorous mode requires. Note that the rigorous mode is a way of thinking, not a claim about what the arch function “really is”.
  • When in rigorous mode,  a mathe­matician will
    think of $h$ as a complete mathe­matical object all at once,
    not changing over time. The
    function is the total relationship of the input values of the input parameter
    $t$ to the output values $h(t)$.  It consists of a bunch of interrelated information, but it doesn’t do anything and it doesn’t change.

Formal proof that $h'(4)<h'(2)$

Above, I gave an informal argument for this.   The rigorous way to see that $h'(4)\lt h'(2)$ for the arch function is to calculate the derivative \[h'(t)=10-2t\] and plug in 4 and 2 to get \[h'(4)=10-8=2\] which is less than $h'(2)=10-4=6$.

Note the embedded

This argument picks out particular data about the function that
prove the statement.  It says nothing about anything slowing down as $t$
increases.  It says nothing about anything at all changing.
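The calculation above can be replayed mechanically. The following is only a minimal sketch in Python: the names `h`, `h_prime` and `central_difference` are made up for this illustration, and the numerical difference quotient is included merely as a sanity check on the hand-computed derivative, not as part of the proof.

```python
def h(t):
    """The arch function from the text: h(t) = 25 - (t - 5)^2."""
    return 25 - (t - 5) ** 2


def h_prime(t):
    """The derivative computed by hand in the text: h'(t) = 10 - 2t."""
    return 10 - 2 * t


def central_difference(f, t, step=1e-3):
    """Numerical estimate of f'(t); an assumption-laden check, not a proof."""
    return (f(t + step) - f(t - step)) / (2 * step)


# The rigorous argument only compares two numbers. Nothing "moves".
print(h_prime(4), h_prime(2), h_prime(4) < h_prime(2))

# The difference quotient agrees with the formula up to rounding error.
print(abs(central_difference(h, 4) - h_prime(4)) < 1e-6)
```

Notice that the code, like the proof, treats $h$ as a fixed rule relating inputs to outputs. It never refers to a ball or to time passing.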

Other examples

  • The rigorous way to say that “Integers go to infinity in both directions” is something like this:  “For every integer n there is an integer k such that k < n  and an integer m such that n < m.”
  • The rigorous way to say that continuous functions don’t have gaps in their graph is to use the $\varepsilon$-$\delta$ definition of continuity.
  • Conditional assertions are one important aspect of mathe­matical reasoning in which this concept of unchanging inertness clears up a lot of misunderstanding.   “If… then…” in our intuition contains an idea of causation and of one thing happening before another (see also here).  But if objects are inert they don’t cause anything and if they are unchanging then “when” is meaningless.

The rigorous view does not apply to all abstract objects, but only to mathe­matical objects.  See abstract objects for examples.

Metaphors and images are dangerous

The price of metaphor is eternal vigilance.–Norbert Wiener

Every mental representation has flaws. Each one provides a way of thinking about an $A$ as a kind of $B$ in some respects. But the representation can have irrelevant features. People new to the subject will be tempted to think about $A$ as a kind of $B$ in inappropriate respects as well. This is a form of cognitive dissonance.

 It may be that most difficulties students have with abstract math are based on not knowing which aspects of a given represen­tation are applicable in a given situation.  Indeed, on not being consciously aware that in general you must restrict the applicability of the mental pictures that come with a represen­tation.

In abstractmath.org you will sometimes see this statement:  “What is wrong with this metaphor:”  (or image, or represen­tation) to warn you about the flaws of that particular represen­tation.


The graph of the arch function $h(t)$ makes it look like the two arms going downward become so nearly vertical that the curve has vertical asymptotes.
But it does not have asymptotes. The arms going down are underneath every point of the $x$-axis. For example, there is a point on the curve underneath the point $(999,0)$, namely $(999, -988011)$.


A set is sometimes described as analogous to a container. But consider: the integer 3 is “in” the set of all odd integers, and it is also “in” the set $\left\{ 1,\,2,\,3 \right\}$. How could something be in two containers at once? (More about this here.)

An analogy can be help­ful, but it isn’t the same thing as the same thing. – The Economist


Mathe­maticians think of the real numbers as constituting a line infinitely long in both directions, with each number as a point on the line. But this does not mean that you can think of the line as a row of points. See density of the real line.


We commonly think of functions as machines that turn one number into another. But this does not mean that, given any such function, we can construct a machine (or a program) that can calculate it. For many functions, it is not only impractical to do, it is theoretically impossible.
They are not computable. In other words, the machine picture of a function does not apply to all functions.


The images and metaphors you use
to think about a mathe­matical object
are limited in how they apply.

The images and metaphors you use to think about the subject
cannot be directly used in a proof.
Only definitions and previously proved theorems can be used in a proof.

Final remarks

Mental represen­tations are physical represen­tations

It seems likely that cognitive phenomena such as images and metaphors are physically represented in the brain as collec­tions of neurons connected in specific ways.  Research on this topic is pro­ceeding rapidly.  Perhaps someday we will learn things about how we think physi­cally that actually help us learn things about math.

In any case, thinking about mathe­matical objects as physi­cally represented in your brain (not neces­sarily completely or correctly!) wipes out a lot of the dualistic talk about ideas and physical objects as
separate kinds of things.  Ideas, in partic­ular math objects, are emergent constructs in the
physical brain. 

About metaphors

The language that nature speaks is mathematics. The language that ordinary human beings speak is metaphor. – Freeman Dyson

“Metaphor” is used in abstractmath.org to describe a type of thought configuration.  It is an implicit conceptual identification
of part of one type of situation with part of another. 

Metaphors are a fundamental way we understand the world. In particular, they are a fundamental way we understand math.

The word “metaphor”

The word “metaphor” is also used in rhetoric as the name of a type of figure of speech. Authors often refer to metaphor in the sense of a thought configuration as a conceptual metaphor. Other figures of speech, such as simile and synecdoche, correspond to conceptual metaphors as well.

References for metaphors in general cognition:

Fauconnier, G. and Turner, M., The Way We Think: Conceptual Blending And The Mind’s Hidden Complexities. Basic Books, 2008.

Lakoff, G., Women, Fire, and Dangerous Things. The University of Chicago Press, 1986.

Lakoff, G. and Mark Johnson, Metaphors We Live By. The University of Chicago Press, 1980.

References for metaphors and images in math:

Byers, W., How Mathematicians Think. Princeton University Press, 2007.

Lakoff, G. and R. E. Núñez, Where Mathematics Comes From. Basic Books, 2000.

Math Stack Exchange list of explanatory images in math.

Núñez, R. E., “Do Real Numbers Really Move?” Chapter
in 18 Unconventional Essays on the Nature of Mathematics, Reuben Hersh,
Ed. Springer, 2006.

Charles Wells,
Handbook of Mathematical Discourse.

Charles Wells, Conceptual blending. Post in Gyre&Gimble.

Other articles in abstractmath.org

Conceptual and computational

Functions: images and metaphors

Real numbers: images and metaphors

represen­tations and models

Sets: metaphors and images

Creative Commons License

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.


The only axiom of algebra

This is one of a series of posts I am writing to help me develop my thoughts about how particular topics in my book Abstracting Algebra (“AbAl“) should be organized. This post concerns the relation between substitution and evaluation that essentially constitutes the definition of algebra. The Mathematica code for the diagrams is in Subs Eval.nb.

Substitution and evaluation

This post depends heavily on your understanding of the ideas in the post Presenting binary operations as trees.

Notation for evaluation

I have been denoting evaluation of an expression represented as a tree like this:

In standard algebra notation this would be written:\[(6-4)-1=2-1=1\]


This treatment of evaluation is intended to give you an intuition about evaluation that is divorced from the usual one-dimensional (well, nearly) notation of standard algebra. So it is sloppy. It omits fine points that will have to be included in AbAl.

  • The evaluation goes from bottom up until it reaches a single value.
  • If you reach an expression with an empty box, evaluation stops. Thus $(6-3)-a$ evaluates only to $3-a$.
  • $(6-a)-1$ doesn’t evaluate further at all, although you can use properties peculiar to “minus” to change it to $5-a$.
  • I used the boxed “1” to show that the value is represented as a trivial tree, not a number. That’s so it can be substituted into another tree.
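The evaluation rules just listed can be sketched in a few lines of Python. The encoding is an assumption made for this illustration and is not the notation of the post or of Subs Eval.nb: a tree is either an `int` (a boxed value, that is, a trivial tree), a `str` (an empty box, named so that it can be told apart from other boxes), or a tuple `(symbol, left, right)`. The names `OPS` and `evaluate` are likewise hypothetical.

```python
import operator

# Assumed encoding (not from the post): int = boxed value, str = empty box,
# tuple (symbol, left, right) = a binary operation applied to two subtrees.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(tree):
    """Evaluate bottom-up, stopping wherever an empty box blocks the way."""
    if not isinstance(tree, tuple):
        return tree                       # a value or an empty box: nothing to do
    symbol, left, right = tree
    left, right = evaluate(left), evaluate(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[symbol](left, right)   # both children are values: collapse the node
    return (symbol, left, right)          # an empty box remains: evaluation stops here


print(evaluate(("-", ("-", 6, 4), 1)))    # (6-4)-1
print(evaluate(("-", ("-", 6, 3), "a")))  # (6-3)-a
print(evaluate(("-", ("-", 6, "a"), 1)))  # (6-a)-1
```

The three calls correspond to the three cases in the list: complete evaluation to a single value, evaluation that stops at $3-a$, and an expression that does not evaluate further at all. The sketch deliberately knows nothing about properties peculiar to “minus”, so it will not turn $(6-a)-1$ into $5-a$.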

Notation for substitution

I will use a configuration like this

to indicate the data needed to substitute the lower tree into the upper one at the variable (blank box). The result of the substitution is the tree

In standard algebra one would say, “Substitute $3\times 4$ for $a$ in the expression $a+5$.” Note that in doing this you have to name the variable.


“If you substitute $12$ for $a$ in $a+5$ you get $12+5$”:

results in


“If you substitute $3\times 4$ for $a$ in $a+b$ you get $3\times4+b$”:

results in


Like evaluation, this treatment of substitution omits details that will have to be included in AbAl.

  • You can also substitute on the right side.
  • Substitution in standard algebraic notation often requires sudden syntactic changes because the standard notation squeezes an essentially two-dimensional tree into a nearly one-dimensional string. Example: “If you substitute $3+ 4$ for $a$ in $a\times b$ you get $(3+4)\times b$”.
  • The allowed renaming of free variables except when there is a clash causes students much trouble. This has to be illustrated and contrasted with the “binop is tree” treatment which is context-free. Example: The variable $b$ in the expression $(3\times 4)+b$ by itself could be changed to $a$ or $c$, but in the sentence “If you substitute $3+ 4$ for $a$ in $a\times b$ you get $(3+4)\times b$”, the $b$ is bound. It is going to be difficult to decide how much of this needs explaining.
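Here is a minimal Python sketch of substitution on trees, together with a toy printer that shows where the parentheses come from. The tuple encoding and the names `substitute` and `render` are assumptions for this illustration only. The printer handles just the one case discussed above (a sum sitting directly under a product) and is not a general pretty-printer.

```python
# Assumed encoding (not from the post): int = boxed value, str = empty box,
# tuple (symbol, left, right) = a binary operation applied to two subtrees.

def substitute(tree, name, replacement):
    """Replace every empty box called `name` in `tree` by the tree `replacement`."""
    if isinstance(tree, tuple):
        symbol, left, right = tree
        return (symbol,
                substitute(left, name, replacement),
                substitute(right, name, replacement))
    return replacement if tree == name else tree


def render(tree, parent=None):
    """Flatten a tree into one-line notation (simplified: only + and * are handled)."""
    if not isinstance(tree, tuple):
        return str(tree)
    symbol, left, right = tree
    text = f"{render(left, symbol)} {symbol} {render(right, symbol)}"
    # Simplifying assumption: only a sum directly under a product needs parentheses.
    return f"({text})" if parent == "*" and symbol == "+" else text


print(render(substitute(("+", "a", 5), "a", ("*", 3, 4))))    # substitute 3*4 for a in a+5
print(render(substitute(("*", "a", "b"), "a", ("+", 3, 4))))  # substitute 3+4 for a in a*b
```

On trees, `substitute` never has to think about parentheses: it just grafts one tree onto another. The parentheses appear only in `render`, that is, only when the tree is flattened into standard notation.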

The axiom

The Axiom for Algebra says that the operations of substitution and evaluation commute: if you apply them in either order, you get the same resulting tree. That says that for the current example, this diagram commutes:

The Only Axiom for Algebra

In standard algebra notation, this becomes:

  • Substitute, then evaluate: If $a=3\times 4$, then $a+5=3\times 4+5=12+5$.
  • Evaluate, then substitute: If $a=3\times 4$, then $a=12$, so $a+5=12+5$.

Well, how underwhelming. In ordinary algebra notation my so-called Only Axiom amounts to a mere rewording. But that’s the point:

The Only Axiom of Algebra is what makes algebraic manipulation work.
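The commuting property can be checked on small examples. The sketch below is one possible reading of the axiom, under stated assumptions: trees are encoded as an `int`, a `str` (empty box) or a tuple `(symbol, left, right)`, and `evaluate` runs all the way up the tree instead of one step at a time. Because of that second assumption, the example $a+5$ is compared by its final value, while the example $a+b$ (where evaluation is blocked by the empty box) shows the two orders producing the same tree.

```python
import operator

# Assumed encoding and names, for illustration only.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(tree):
    """Evaluate bottom-up, stopping wherever an empty box blocks the way."""
    if not isinstance(tree, tuple):
        return tree
    symbol, left, right = tree
    left, right = evaluate(left), evaluate(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[symbol](left, right)
    return (symbol, left, right)


def substitute(tree, name, replacement):
    """Replace every empty box called `name` in `tree` by the tree `replacement`."""
    if isinstance(tree, tuple):
        symbol, left, right = tree
        return (symbol,
                substitute(left, name, replacement),
                substitute(right, name, replacement))
    return replacement if tree == name else tree


lower = ("*", 3, 4)                                  # 3 x 4

# Upper tree a + b: the two orders give the very same tree, 12 + b.
upper = ("+", "a", "b")
print(evaluate(substitute(upper, "a", lower)))       # substitute, then evaluate
print(substitute(upper, "a", evaluate(lower)))       # evaluate, then substitute

# Upper tree a - 5: subtraction is neither associative nor commutative,
# yet the final values still agree.
upper = ("-", "a", 5)
print(evaluate(substitute(upper, "a", lower)) ==
      evaluate(substitute(upper, "a", evaluate(lower))))
```

A handful of examples proves nothing, of course. The point is only that the check never uses any property of the particular operation involved.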

Miscellaneous comments

  • In functional notation, the Only Axiom says precisely that $\text{eval}\circ\text{subst}=\text{subst}\circ(\text{eval},\text{id})$.
  • The Only Axiom has a symmetric form: $\text{eval}\circ\text{subst}=\text{subst}\circ(\text{id},\text{eval})$ for the right branch.
  • You may expostulate: “What about associativity and commutativity. They are axioms of algebra.” But they are axioms of particular parts of algebra. That’s why I include examples using operations such as subtraction. The Only Axiom is the (ahem) only one that applies to all algebraic expressions.
  • You may further expostulate: Using monads requires the unitary or oneidentity axiom. Here that means that a binary operation $\Delta$ can be applied to one element $a$, and the result is $a$. My post Monads for high school III. shows how it is used for associative operations. The unitary axiom is necessary for representing arbitrary binary operations as a monad, which is a useful way to give a theoretical treatment of algebra. I don’t know if anyone has investigated monads-without-the-unitary-axiom. It sounds icky.
  • The Only Axiom applies to things such as single-valued functions, which are unary operations, and ternary and higher operations. It also applies to algebraic expressions involving many different operations of different arities. In that sense, my presentation of the Only Axiom only gives a special case.
  • In the case of unary operations, evaluation is what we usually call evaluation. If you think about sets the way I do (as a special kind of category), evaluation is the same as composition. See “Rethinking Set Theory”, by Tom Leinster, American Mathematical Monthly, May, 2014.
  • Calculus functions such as sine and the exponential are unary operations. But not all of calculus is algebra, because substitution in the differential and integral operators is context-sensitive.


Preceding posts in this series

Remarks concerning these posts
  • Each of the posts in this series discusses how I will present a small part of AbAl.
  • The wording of some parts of the posts may look like a first draft, and such wording may indeed appear in the text.
  • In many places I will talk about how I should present the topic, since I am not certain about it.

Other references


Variations in meaning in math

Words in a natural language may have different meanings in different social groups or different places.  Words and symbols in both mathematical English and the symbolic language vary according to specialty and, occasionally, country (see convention, default).  And words and symbols can change their meanings from place to place within the same mathematical discourse (see scope).

This article mostly provides pointers to other articles in abstractmath.org that give more details about the ideas.


A convention in mathematical discourse is notation or terminology used with a special meaning in certain contexts or in certain fields. Articles and books in a specialty do not always clue you in on these conventions.

Some conventions are nearly universal in math.

Example 1

The use of “if” to mean “if and only if” in a definition is a convention. More about this here. This is a hidden definition by cases. “Hidden” means that no one tells the students, except for Susanna Epp and me.

Example 2

Constants or parameters are conventionally denoted by a, b, … , functions by f, g, … and variables by x, y,…. More.

Example 3

Referring to a group (or other mathematical structure) and its underlying set by the same name is a convention. This is an example of both synecdoche and context-sensitivity.

Example 4

The meaning of ${{\sin }^{n}}x$ in many calculus books is:

  • The inverse sine (arcsin) if $n=-1$.
  • The mult­iplica­tive power for positive $n$; in other words, ${{\sin }^{n}}x={{(\sin x)}^{n}}$ if $n\ne -1$.

This, like Example 1, is a definition by cases. Unlike Example 1, calculus books often make it explicit. Explicit or not, this usage is an abomination.

Some conventions are pervasive among math­ematicians but different conventions hold in other subjects that use mathematics.

  • Scientists and engineers may regard a truncated decimal such as 0.252 as an approximation, but a mathematician is likely to read it as an exact rational number, namely $\frac{252}{1000}$.
  • In most computer languages a distinction is made between real numbers and integers;
    42 would be an integer but 42.0 would be a real number.  Older mathematicians may not know this.
  • Mathematicians use i to denote the imaginary unit. In electrical engineering it is commonly denoted j instead, a fact that many mathematicians are un­aware of. I first learned about it when a student asked me if i was the same as j.
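These three conventions can be seen side by side in Python, used here only as one example of a computer language. Note that what the text calls a “real number” is a `float` in Python, and that Python happens to follow the engineering convention of writing the imaginary unit as `j`.

```python
from fractions import Fraction

# A mathematician's reading of 0.252: the exact rational number 252/1000.
exact = Fraction("0.252")
print(exact, exact == Fraction(252, 1000))

# The floating-point number 0.252 is a nearby binary fraction, not exactly 252/1000.
print(Fraction(0.252) == Fraction(252, 1000))

# Integers and "real" (floating-point) numbers are different types...
print(type(42).__name__, type(42.0).__name__)
# ...even though they compare as equal.
print(42 == 42.0)

# The imaginary unit is written with j, not i.
print(1j * 1j)
```

The second `print` shows the scientist's reading in action: the machine stores an approximation, and only the `Fraction` built from the string is the exact rational number.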

Conventions may vary by country.

  • In France and possibly other countries schools may use “positive” to mean “nonnegative”, so that zero is positive. 
  • In the secondary schools in some places, the value of sin x may be computed clockwise starting at (0,1)  instead of counterclockwise starting at (1,0).  I have heard this from students. 

Conventions may vary by specialty within math.

“Field” and “log” are examples.


An interface to a computer program may have many possible choices for the user to make. In most cases, the interface will use certain choices automatically when the user doesn’t specify them.  One says the program defaults to those choices.  


  • A word processing program may default to justified paragraphs and insert mode, but allow you to pick ragged right or typeover mode.
  • There is a sense in which the word “ski” defaults to snow skiing in Minnesota and to water skiing in Georgia.
  • “CSU” defaults to Cleveland State University in northern Ohio and to Colorado State University in parts of the west.

I have spent a lot of time in both Minnesota and Georgia and the remarks about skiing are based on my own observation. But these usages are not absolute. Some affluent Georgians may refer to snow skiing as “skiing”, for example, and this usage can result in a put-down if the hearer thinks they are talking about water skiing. One wonders where the boundary line is. Perhaps people in Kentucky are confused on the issue.

Math language behaves in this way, too.

Default usage in mathematical discourse


  • In high school, $\pi$ refers by default to the ratio of the circumference of a circle to its diameter.  Students are often quite surprised when they get to abstract math courses and discover the many other meanings of $\pi $ (see here).
  • Recently authors in the popular literature seem to think that $\phi$ (phi) defaults to the golden ratio.  In fact, a search through the research literature shows very few hits for $\phi$ meaning the golden ratio: in other words, it usually means something else. 
  • The set $\mathbb{R}$ of real numbers has many different group structures defined on it but “The group $\mathbb{R}$” essentially always means that the group operation is ordinary addition.  In other words, “$\mathbb{R}$” as a group defaults to +.  Analogous remarks apply to “the field $\mathbb{R}$”. 
  • In informal conversation among many analysts, functions are continuous by default.
  • It used to be the case that in informal conversations among topologists, “group” defaulted to Abelian group. I don’t know whether that is still true or not.


This meaning of “default” has made it into dictionaries only since around 1960 (see the Wikipedia entry). This usage does not carry a derogatory connotation.   In abstractmath.org I am using the word to mean a special type of convention that imposes a choice of parameter, so that it is a special case of both “convention” and “suppression of parameters”.


Both mathematical English and the symbolic language have a feature that is uncommon in ordinary spoken or written English:  The meaning of a phrase or a symbolic expression can be different in different parts of the discourse.   The portion of the text in which a particular meaning is in effect is called the scope of the meaning.  This is accomplished in several ways.

Explicit statement


  • “In this paper, all groups are abelian”. This means that for every instance of the word “group” or of any symbol denoting a group, the group is constrained to be abelian. The scope in this case is the whole paper. See assumption.
  • “Suppose (or “let” or “assume”) $n$ is divisible by $4$”. Before this statement, you could not assume $n$ is divisible by $4$. Now you can, until the end of the current paragraph or section.


The definition of a word, phrase or symbol sets its meaning.  If the word definition is used and the scope is not given explicitly, it is probably the whole discourse.


“Definition.  An integer is even if it is divisible by 2.”  This is marked as a definition, so it establishes the meaning of the word “even” (when applied to an integer) for the rest of the text. 


Used in modus ponens (see here) and (along with let, usually “now let…”) in proof by cases.

Example (modus ponens)

Suppose you want to prove that if an integer $n$ is divisible by $4$ then it is even. To show that it is even you must show that it is divisible by $2$. So you write:

  • “Let $n$ be divisible by $4$. That means $n=4k$ for some integer $k$. But then $n=2(2k)$, so $n$ is even by definition.”

Now if you start a new paragraph with something like “For any integer $n\ldots$” you can no longer assume $n$ is divisible by $4$.

Example (proof by cases)

Theorem: For all integers $n$, $n^2+n+1$ is odd.


  • “$n$ is even” means that $n=2s$ for some integer $s$.
  • “$n$ is odd” means that $n=2t+1$ for some integer $t$.


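  • Suppose $n$ is even. Then
    \[n^2+n+1=(2s)^2+2s+1=4s^2+2s+1=2(2s^2+s)+1\]
    so $n^2+n+1$ is odd. (See Zooming and Chunking.)

  • Now suppose $n$ is odd. Then
    \[n^2+n+1=(2t+1)^2+(2t+1)+1=4t^2+6t+3=2(2t^2+3t+1)+1\]
    So $n^2+n+1$ is odd.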


The proof I just gave uses only the definition of even and odd and some high school algebra. Some simple grade-school facts about even and odd numbers are:

  • Even plus even is even.
  • Odd plus odd is even.
  • Even times even is even.
  • Odd times odd is odd.

Put these facts together and you get a nicer proof (I think anyway): $n^2+n$ is even, so when you add $1$ to it you must get an odd number.
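A quick machine check can catch a slip in the algebra, although it is no substitute for either proof, since it only looks at finitely many integers. A minimal sketch in Python (the range and the names `is_odd`, `all_odd` and `all_even` are arbitrary choices for this illustration):

```python
def is_odd(k):
    """True if k is odd. Python's % returns 0 or 1 here, even for negative k."""
    return k % 2 == 1


sample = range(-1000, 1001)

# The theorem, checked on a finite sample only.
all_odd = all(is_odd(n * n + n + 1) for n in sample)

# The key step of the "nicer" proof: n^2 + n is always even.
all_even = all((n * n + n) % 2 == 0 for n in sample)

print(all_odd, all_even)
```

The code also illustrates scope in a small way: inside each generator expression `n` is a bound variable, just as it is in “for all integers $n$”.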

Bound variables

A variable is bound if it is in the scope of an integral, quantifier, summation, or other binding operator. More here.


Consider this text:

“Exercise: Show that for all real numbers $x$, it is true that $x^2\geq0$. Proof: Let $x=-2$. Then $x^2=(-2)^2=4$ which is greater than $0$. End of proof.”

The problem with that text is that in the statement, “For all real numbers $x$, it is true that $x^2\geq0$”, $x$ is a bound variable. It is bound by the universal quantifier “for all” which means that $x$ can be any real number whatever. But in the next sentence, the meaning of $x$ is changed by the assumption that $x=-2$. So the statement that $x^2\geq0$ has been shown only for $-2$. As a result the proof does not cover all cases.

Many students just beginning to learn to do proofs make this mistake. Fellow students who are a little further along may be astonished that someone would write something like that paragraph and might sneer at them. But this common mistake does not deserve a sneer, it deserves an explanation. This is an example of the ratchet effect.

Variable meaning in natural language

Meanings commonly vary in natural language because of conventions and defaults. But varying in scope during a conversation seems to me uncommon.

It does occur in games. In Skat and Bridge, the meaning of “trump” changes from hand to hand. The meaning of “strike” in a baseball game changes according to context: If the current batter has already had fewer than two strikes, a foul is a strike, but not otherwise.

I have not come up with non-game examples, and anyway games are played by rules that are suspiciously like mathematical axioms. Perhaps you can think of some non-game occasions in which meaning is determined by scoping that I have overlooked.


A mathematical saga

This post outlines some of the intellectual developments in the history of math. I call it a saga because it is like one:

  • It is episodic, telling one story after another.
  • It does not try to give an accurate history, but its episodes resemble what happened in math in the last 3000 years.
  • It tells about only a few of the things that happened.

Early techniques

We represented numbers by symbols.

Thousands of years ago, we figured out how to write down words and phrases in such a way that someone much later could read and understand them.

Naturally, we wanted to keep records of the number of horses the Queen owned, so we came up with various notations for numbers (number representing count). In some societies, these symbols were separate from the symbols used to represent words.

We invented algorithms

We discovered positional notation. We write $213$, which is based on a system: it means $2\times100+1\times10+3\times 1$. This notation encapsulates a particular computation of a number (its base-10 representation). (The expression $190+23$ is another piece of notation that encapsulates a computation that yields $213$.)

Compare that to the Roman notation $CCXIII$, which is an only partly organized jumble.
Try adding $CCXIII+CDXXIX$. (The answer is $DCXLII$.)
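To make the contrast concrete, here is a small Python sketch of both notations. The function names and the table `ROMAN` are made up for this illustration, and the Roman conversion assumes well-formed numerals in the usual modern subtractive style.

```python
ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"), (90, "XC"),
         (50, "L"), (40, "XL"), (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
LETTER = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def from_positional(digits, base=10):
    """[2, 1, 3] means 2*100 + 1*10 + 3*1: one uniform rule for every digit."""
    total = 0
    for digit in digits:
        total = total * base + digit
    return total


def roman_to_int(text):
    """A letter counts negatively when a larger letter follows it (IV, XL, CD, ...)."""
    total = 0
    for i, letter in enumerate(text):
        value = LETTER[letter]
        if i + 1 < len(text) and LETTER[text[i + 1]] > value:
            total -= value
        else:
            total += value
    return total


def int_to_roman(number):
    """Greedy conversion using the table of letters and subtractive pairs."""
    parts = []
    for value, symbol in ROMAN:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


print(from_positional([2, 1, 3]))
print(int_to_roman(roman_to_int("CCXIII") + roman_to_int("CDXXIX")))
```

Notice that the sketch does not add the Roman numerals directly. It translates them to numbers, adds, and translates back, which is itself a hint of how awkward the notation is for computing.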

Positional notation allowed us to create the straightforward method of addition involving adding single digits and carrying:
Measuring land requires multiplication, which positional notation also allows us to perform easily.
The invention of such algorithms (methodically manipulating symbols) made it easy to calculate with numbers.
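The grade-school addition method can be written down as a short program that only ever adds single digits. This is a minimal sketch in Python: the name `add_with_carry` and the choice to pass numbers as lists of digits, most significant digit first, are assumptions made for this illustration.

```python
def add_with_carry(a_digits, b_digits, base=10):
    """Add two numbers given as digit lists, most significant digit first."""
    a = a_digits[::-1]                 # work from the units column leftward
    b = b_digits[::-1]
    result = []
    carry = 0
    for column in range(max(len(a), len(b))):
        digit_a = a[column] if column < len(a) else 0
        digit_b = b[column] if column < len(b) else 0
        carry, digit = divmod(digit_a + digit_b + carry, base)
        result.append(digit)
    if carry:
        result.append(carry)           # a final carry creates a new leading digit
    return result[::-1]


print(add_with_carry([2, 1, 3], [4, 2, 9]))   # 213 + 429
print(add_with_carry([9, 9], [1]))            # 99 + 1
```

The procedure manipulates symbols column by column and never needs to know what the whole number “is”. The `base` parameter is there to show that nothing in the method depends on ten.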

Geometry: Direct construction of mathematical objects

We discovered geometry in ancient times, in laying out plots of land and designing buildings. We had a bunch of names for different shapes and for some of them we knew how to calculate their area, perimeter and other things.

Euclid showed how to construct new geometric figures from given ones using specific methods (ruler and compasses) that preserve some properties.


We can bisect a line segment (black) by drawing two circles (blue) centered at the endpoints with radius the length of the line segment. We then construct a line segment (red) between the points of intersection of the circles; it intersects the given line segment at its midpoint. These constructions can be thought of as algorithms creating and acting on geometric figures rather than on symbols.

It is true that diagrams were drawn to represent line segments, triangles and so on.
But the diagrams are visualization helpers. The way we think about the process is that we are operating directly on the geometric objects to create new ones. We are thinking of the objects Platonically, although we don’t have to share Plato’s exact concept of their reality. It is enough to say we are thinking about the objects as if they were real.

Axioms and theorems

Euclid came up with the idea that we should write down axioms that are true of these figures and constructions, so that we can systematically use the constructions
to prove theorems about figures using axioms and previously proved theorems. This provided documented reasoning (in natural language, not in symbols) for building up a collection of true statements about math objects.


After creating some tools for proving triangles are congruent, we can prove that the intersection of the red and black lines in the figure really is the midpoint of the black line by constructing the four green line segments below and making appeals to congruences between the triangles that show up:

Note that the green lines have the same length as the black line.

Euclid thought about axioms and theorems as applying to geometry, but he also proved theorems about numbers by representing them as ratios of line segments.


People in ancient India and Greece knew how to solve linear and quadratic equations using verbal descriptions of what you should do.
Later, we started using a symbolic language to express numerical problems and symbolic manipulation to solve (some of) them.


The quadratic formula is an encapsulated computation that provides the roots of a quadratic equation. Newton’s method is a procedure for finding a root of an arbitrary polynomial. It is recursive in the loose sense (it does not always give an answer).
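A minimal Python sketch of Newton's method for polynomials shows the sense in which it “does not always give an answer”. The function name, the tolerance and the step limit are arbitrary choices for this illustration. The polynomial and its derivative are evaluated together by Horner's rule.

```python
def newton(coefficients, guess, tolerance=1e-12, max_steps=50):
    """Look for a real root of a polynomial (coefficients listed highest degree first).

    Returns an approximate root, or None if the method gives up.
    """
    x = guess
    for _ in range(max_steps):
        value, slope = 0.0, 0.0
        for c in coefficients:          # Horner's rule for p(x) and p'(x) together
            slope = slope * x + value
            value = value * x + c
        if abs(value) < tolerance:
            return x                    # close enough to a root
        if slope == 0:
            return None                 # horizontal tangent: no next guess exists
        x -= value / slope              # follow the tangent line down to the axis
    return None                         # ran out of steps without settling down


print(newton([1, 0, -2], 1.0))   # x^2 - 2, starting at 1
print(newton([1, 0, 1], 1.0))    # x^2 + 1 has no real root
```

For $x^2-2$ the guesses settle quickly near $\sqrt{2}$. For $x^2+1$ the second guess is $0$, where the tangent is horizontal, and the procedure stops without an answer.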

The symbolic language is a vast expansion of the symbolic notation for numbers. A major innovation was to introduce variables to represent unknowns and to state equations that are always true.


Aristotle developed an early form of logic (syllogisms) aimed at determining which arguments in rhetoric were sound. “All men are mortal. Socrates is a man. Therefore Socrates is mortal.” This was written in sentences, not in symbols.

By explicit analogy with algebra, we introduced symbolism and manipulation rules for logical reasoning, with an eye toward making mathematical reasoning sound and to some extent computable. For example, in one dialect of logical notation, modus ponens (used in the Socrates syllogism) is expressed as $(P\rightarrow Q,\,P)\,\,\vdash\,\, Q$. This formula is an encapsulated algorithm: it says that if you know $P\rightarrow Q$ and $P$ are valid (are theorems) then $Q$ is valid as well.

Crises of understanding

We struggled with the notion of function as a result of dealing with infinite series. For example, the limit of a sequence of algebraic expressions may not be an algebraic expression. It would no longer do to think of a function as the same thing as an algebraic expression.

We realized that Euclid’s axioms for geometry lacked clarity. For example, as I understand it, the original version of his axioms didn’t imply that the two circles in the proof above had to intersect each other. There were other more subtle problems. Hilbert made a big effort to spell out the axioms in more detail.

We refined our understanding of logic by trying to deal with the mysteries of calculus, limits and spaces. An example is the difference between continuity and uniform continuity.
We also created infinitesimals, only to throw up our hands because we could not make a logic that fit them. Infinitesimals were temporarily replaced by the use of epsilon-delta methods.

We began to understand that there are different kinds of spaces. For example, there were other models of some of Euclid’s axioms than just Euclidean space, and some of those models showed that the parallel axiom is independent of the other axioms. And we became aware of many kinds of topological spaces and manifolds.

We started to investigate sets, in part because spaces have sets of points. Then we discovered that a perfectly innocent activity like considering the set of all sets resulted in an impossibility.

We were led to consider how to understand the Axiom of Choice from several upsetting discoveries. For example, the Banach-Tarski “paradox” implies that you can rearrange the points in a sphere of radius $1$ to make two spheres of radius $1$.

Mathematics adopts a new covenant… for a while

These problems caused a kind of tightening up, or rigorizing.
For a period of time, less than a century, we settled into a standard way of practicing research mathematics called new math or modern math. Those names were used mostly by math educators. Research mathematicians might have called it axiomatic math based on set theory. Although I was around for the last part of that period I was not aware of any professional mathematicians calling it anything at all; it was just what we did.

First, we would come up with a new concept, type of math object, or a new theorem. In this creative process we would freely use intuition, metaphors, images and analogies.


We might come up with the idea that a function reaches its maximum when its graph swoops up from the left, then goes horizontally for an infinitesimal amount of time, then swoops down to the right. The point at which it was going horizontally would obviously have to be the maximum.

But when we came to publish a paper about the subject, all these pictures would disappear. All our visual, metaphorical/conceptual and kinetic feelings that explain the phenomenon would have to be suppressed.

Rigorizing consisted of a set of practices, which I will hint at:

Orthodox behavior among mathematicians in 1950

Definition in terms of sets and axioms

Each mathematical object had to be defined in some way that started with a set and some other data defined in terms of the set. Axioms were imposed on these data. Everything had to be defined in terms of sets, including functions and relations. (Multiple sets were used occasionally.)

Definitions done in this way omit a lot of the intuition that we have concerning the object being defined.

  • The definition of group as a set with a binary operation satisfying some particular axioms does not tell you that groups constitute the essence of symmetry.
  • The definitions of equivalence relation and of partition do not even hint that they define the same concept.

Even so, definitions done in this way have an advantage: They tend to be close to minimal in the sense that to verify that something fits the definition requires checking no more (or not much more) than necessary.
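The remark above that equivalence relations and partitions define the same concept can be checked on a small finite example. The following Python sketch is only an illustration under stated assumptions: the function names `classes` and `relation_from` are hypothetical, and the relation passed in is assumed to really be an equivalence relation (the code does not verify that).

```python
def classes(elements, related):
    """Split `elements` into blocks, assuming `related` is an equivalence relation."""
    blocks = []
    for x in elements:
        for block in blocks:
            # Comparing with one representative suffices for an equivalence relation.
            if related(x, next(iter(block))):
                block.add(x)
                break
        else:
            blocks.append({x})
    return blocks


def relation_from(blocks):
    """Recover the relation from a partition: x ~ y iff they share a block."""
    return lambda x, y: any(x in b and y in b for b in blocks)


same_mod_3 = lambda a, b: (a - b) % 3 == 0
blocks = classes(range(6), same_mod_3)
recovered = relation_from(blocks)
```

Going from the relation to the partition and back gives a relation that agrees with the original on every pair, which is the sense in which the two definitions carry the same information.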

Proofs had to be clearly translatable into symbolic logic

First order logic (and other sorts of logic) were well developed and proofs were written in a way that they could in principle be reduced to arguments written in the notation of symbolic logic and following the rules of inference of logic. This resulted in proofs which did not appeal to intuition, metaphors or pictures.


In the case of the theorem that the maximum of a (differentiable) function occurs only where the derivative is zero, that meant epsilon-delta proofs in which the proof appeared as a thick string of symbols. Here, “thick” means it had superscripts, subscripts, and other things that gave the string a fractal dimension of about $1.2$ (just guessing!).


When I was a student at Oberlin College in 1959, Fuzzy Vance (Elbridge P. Vance) would sometimes stop in the middle of an epsilon-delta proof and draw pictures and provide intuition. Before he started that he would say “Shut the door, don’t tell anyone”. (But he told us!)


A more famous example of this is the story that Oscar Zariski, when presenting a proof in algebraic geometry at the board, would sometimes remind himself of a part of a proof by hunching over the board so the students couldn’t see what he was doing and drawing a diagram which he would immediately erase. (I fault him for not telling them about the diagram.)

It doesn’t matter whether this story is true or not. It is true in the sense that any good myth is true.


I wrote about rigor in these articles:

Rigorous view in abstractmath.org.

Dry bones, post in this blog.

Logic and sets clarify but get in the way

The orthodox method of “define it by sets and axioms” and “make proofs at least resemble first order logic” clarified a lot of suspect proofs. But it got in the way of intuition, and excessive teaching by means of proofs made it harder for students to learn.

  • The definition of a concept can make you think of things that are foreign to your intuition of the concept. A function is a mapping. The ordered pairs are a secondary construction; you should not think of ordered pairs as essential to your intuition. Even so, the definition of function in terms of ordered pairs got rid of a lot of cobwebs.
  • The cartesian product of sets is obviously an associative binary operation. Except that if you define the cartesian product of sets in terms of ordered pairs then it is not associative.
  • Not only that, but if you define the ordered pair $(a,b)$ as $\{\{a,b\},a\}$ then you have to say that $a$ is an element of $(a,b)$ but $b$ is not. That is not merely an inconvenient definition of ordered pair, it is wrong. It is not a bad way to show that the concept of ordered pair is consistent with ZF set theory, but that is a minor point mathematicians hardly ever worry about.
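The last two bullets can be tried out directly. This is a small Python sketch, not a foundation for anything: `kpair` is a hypothetical name for the coding $\{\{a,b\},a\}$ using `frozenset`, and Python tuples stand in for ordered pairs when forming cartesian products.

```python
from itertools import product


def kpair(a, b):
    """Code the ordered pair (a, b) as the set {{a, b}, a}."""
    return frozenset({frozenset({a, b}), a})


pair = kpair(1, 2)
a_is_element = 1 in pair   # a is literally an element of the coded pair
b_is_element = 2 in pair   # b is not

# Cartesian products built from ordered pairs (tuples here) are not associative:
A, B, C = {1}, {2}, {3}
left = set(product(set(product(A, B)), C))    # elements look like ((1, 2), 3)
right = set(product(A, set(product(B, C))))   # elements look like (1, (2, 3))
```

The sets `left` and `right` are in obvious bijection but are not equal, which is the sense in which the set-theoretic product fails to be associative.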

Mathematical methods applied to everything

The early methods described at the beginning of this post began to be used everywhere in math.

Algorithms on symbols

Algorithms, or methodical procedures, began with the addition and multiplication algorithms and Euclid’s ruler and compass constructions, but they began to be used everywhere.

They are applied to the symbols of math, for example to describe rules for calculating derivatives and integrals and for summing infinite series.

Algorithms are used on strings, arrays and diagrams of math symbols, for example concatenating lists, multiplying matrices, and calculating binary operations on trees.

Algorithms as definitions

Algorithms are used to define the strings that make up the notation of symbolic logic. Such definitions include something like: “If $E$ and $F$ are expressions then $(E)\land (F)$ and $(\forall x)(E)$ are expressions”. So if $E$ is “$x\geq 3$” then $(\forall x)(x\geq 3)$ is an expression. This had the effect of turning an expression in symbolic logic into a mathematical object. Deduction rules such as “$E\land F\vdash E$” also become mathematical objects in this way.
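One way to see an expression become an object is to write the formation rules as data constructors. The sketch below is an assumption-laden illustration in Python: the class names `Atom`, `And` and `ForAll`, the ASCII rendering, and the function `and_elim_left` (standing for the rule $E\land F\vdash E$) are all hypothetical choices, and atomic expressions are simply taken as given strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    text: str            # an atomic expression such as "x>=3", taken as given


@dataclass(frozen=True)
class And:
    left: object         # if E and F are expressions, so is (E) and (F)
    right: object


@dataclass(frozen=True)
class ForAll:
    var: str             # if E is an expression, so is (forall x)(E)
    body: object


def render(e):
    """Turn an expression object back into a string of symbols."""
    if isinstance(e, Atom):
        return e.text
    if isinstance(e, And):
        return f"({render(e.left)}) and ({render(e.right)})"
    if isinstance(e, ForAll):
        return f"(forall {e.var})({render(e.body)})"
    raise TypeError("not an expression")


def and_elim_left(e):
    """The deduction rule 'E and F |- E' as a function on expression objects."""
    if not isinstance(e, And):
        raise ValueError("rule applies only to conjunctions")
    return e.left


E = Atom("x>=3")
quantified = ForAll("x", E)
```

Here `render(quantified)` gives the string `(forall x)(x>=3)`, while `quantified` itself is a value that can be stored, compared and passed to functions.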

We can define the symbols and expressions of algebra, calculus, and other parts of math using algorithms, too. This became a big deal when computer algebra programs such as Mathematica came in.


You can define the set $RP$ of real polynomials this way:

  • $0\in RP$
  • If $p\in RP$ then $p+r x^n\in RP$, where $x$ is a variable, $r$ is a real number and $n$ is a nonnegative integer.

That is a recursive definition. You can also define polynomials by pattern recognition:

Let $n$ be a positive integer, $a_0,\,a_1,\,\ldots,\,a_n$ be real numbers and $k_0,\,k_1,\,\ldots,\,k_n$ be nonnegative integers. Then $a_0 x^{k_0}+a_1 x^{k_1}+\ldots+ a_n x^{k_n}$ is a polynomial.

The recursive version is a way of letting a compiler discover that a string of symbols is a polynomial. That sort of thing became a Big Deal when computers arrived in our world.
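As a rough illustration of how the recursive definition turns into a recognizer, here is a Python sketch. Everything about the representation is an assumption: `ZERO` stands for the polynomial $0$, the tuple `("plus", p, r, n)` stands for $p+rx^n$, and the names `is_rp` and `evaluate` are hypothetical. A real compiler would work on strings of symbols rather than on nested tuples.

```python
ZERO = ("zero",)   # first clause: 0 is in RP


def is_rp(e):
    """Recognize members of RP by following the two clauses of the definition."""
    if e == ZERO:
        return True
    return (
        isinstance(e, tuple)
        and len(e) == 4
        and e[0] == "plus"                   # second clause: p + r*x^n
        and is_rp(e[1])                      # p must itself be in RP
        and isinstance(e[2], (int, float))   # r is a real number
        and isinstance(e[3], int)
        and e[3] >= 0                        # n is a nonnegative integer
    )


def evaluate(e, x):
    """Value of the polynomial at x, computed along the same recursion."""
    if e == ZERO:
        return 0
    _, p, r, n = e
    return evaluate(p, x) + r * x ** n


# x^2 + 1, built as ((0 + 1*x^2) + 1*x^0)
x_squared_plus_one = ("plus", ("plus", ZERO, 1, 2), 1, 0)
```

The recognizer and the evaluator share the shape of the definition: one base case and one recursive case.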

Algorithms on mathematical objects

I am using the word “algorithm” in a loose sense to mean any computation that may or may not give a result. Computer programs are algorithms, but so is the quadratic formula. You might not think of a formula as an algorithm, but that is because if you use it in a computer program you just type in the formula; the language compiler has a built-in algorithm to execute calculations given by formulas.
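For instance, the quadratic formula written as a program shows the "may or may not give a result" part explicitly. This is a minimal sketch restricted to real roots; the function name `real_roots` and the choice to return `None` when there is no result are assumptions of the sketch.

```python
import math


def real_roots(a, b, c):
    """Real roots of a*x^2 + b*x + c by the quadratic formula, or None."""
    if a == 0:
        return None              # not a quadratic, so the formula does not apply
    disc = b * b - 4 * a * c
    if disc < 0:
        return None              # no real roots: the computation gives no result
    s = math.sqrt(disc)
    return ((-b - s) / (2 * a), (-b + s) / (2 * a))
```

For $x^2-5x+4$ this returns the pair `(1.0, 4.0)`, and for $x^2+1$ it returns `None`.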

It has not been clearly understood that mathematicians apply algorithms not only to symbols, but also directly to mathematical objects. Socrates thought that way long ago, as I described in the construction of a midpoint above. The procedure says “draw circles with center at the endpoints of the line segment.” It doesn’t say “draw pictures of circles…”

In the last section and this one, I am talking about how we think of applying an algorithm. Socrates thought he was talking about ideal lines and circles that exist in some other universe that we can access by thought. We can think about them as real things without making a metaphysical claim like Socrates did about them. Our brains are wired to think of abstract ideas in many of the same ways we think about physical objects.


The unit circle (as a topological space at least) is the quotient space of the space $\mathbb{R}$ of real numbers mod the equivalence relation defined by: $x\sim y$ if and only if $x-y$ is an integer.

Mathematicians who understand that construction may have various images in their mind when they read this. One would be something like imagining the real line $\mathbb{R}$ and then gluing all the points together that are an integer apart. This is a distinctly dizzying thing to think about but mathematicians aren’t worried because they know that taking the quotient of a space is a well-understood construction that works. They might check that by imagining the unit circle as the real line wrapped around an infinite number of times, with points an integer apart corresponding to the same point on the unit circle. (When I did that check I hastily inserted the parenthetical remark saying “as a topological space” because I realized the construction doesn’t preserve the metric.) The point of this paragraph is that many mathematicians think of this construction as a construction on math objects, not a construction on symbols.
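The "wrapping" check in that paragraph can be imitated numerically. The Python sketch below uses the map $x\mapsto(\cos 2\pi x,\,\sin 2\pi x)$, which is one standard way to realize the quotient; the names `to_circle` and `same_point` and the tolerance are assumptions of the sketch, and floating-point comparison is only approximate.

```python
import math


def to_circle(x):
    """Send a real number x to the point (cos 2*pi*x, sin 2*pi*x) on the unit circle."""
    t = 2 * math.pi * x
    return (math.cos(t), math.sin(t))


def same_point(p, q, tol=1e-9):
    """Compare two points of the plane up to a small floating-point tolerance."""
    return (math.isclose(p[0], q[0], abs_tol=tol)
            and math.isclose(p[1], q[1], abs_tol=tol))


glued = same_point(to_circle(0.25), to_circle(3.25))      # differ by the integer 3
not_glued = same_point(to_circle(0.25), to_circle(0.75))  # differ by 1/2
```

Points an integer apart land on the same point of the circle, and points that are not an integer apart do not.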

Everything is a mathematical object

A lot of concepts start out as semi-vague ideas and eventually get defined as mathematical objects.


  • A function was originally thought of as a formula, but then got formalized in the days of orthodoxy as a set of ordered pairs with the functional property.
  • The concept of equation has been turned into a math object many times, for example in universal algebra and in logic. I suspect that some practitioners in those fields might disagree with me. This requires further research.
  • Propositions are turned into math objects by Boolean Algebra.
  • Perhaps numbers were always thought of as math objects, but much later the set $\mathbb{N}$ of all natural numbers and the set $\mathbb{R}$ of all real numbers came to be thought of explicitly as math objects, causing some mathematicians to have hissy fits.
  • Definitions are math objects. This has been done in various ways. A particular theory is a mathematical object, and it is essentially a definition by definition (!): Its models are what the theory defines. A particular example of “theory” is first-order theory which was the gold standard in the orthodox era. A classifying topos is also a math object that is essentially a definition.

Category Theory

The introduction of categories broke the orthodoxy of everything-is-a-set. Category theory has become widely used as a language in many branches of math. It started with problems in homological algebra arising in at least these ways:

  • Homotopy classes of continuous functions are not functions in the set theory sense. So we axiomatized the concept of function as an arrow (morphism) in a category.
  • The concept of mathematical object is axiomatized as an object in a category. This forces all properties of an object to be expressed in terms of its external relations with other objects and arrows.
  • Categories capture the idea of “kind of math”. There is a category of groups and homomorphisms, a category of topological spaces and continuous maps, and so on. This is a new level of abstraction. Before, if someone said “I work in finite groups”, their field was a clear idea and people knew what they were talking about, but now the category of finite groups is a mathematical object.
  • Homology maps one kind of math (topology) into another kind (algebra). Since categories capture the general notion of “kind of math”, we invented the idea of functor to capture the idea of modeling or representing one branch of math in another one. So Homology became a mathematical object.
  • The concept of functor allowed the definition of natural transformation as a mathematical object. Before categories, naturality was only an informal idea.

Advantages of category theory

  • Categories, in the form of toposes, quickly became candidates to replace set theory as a foundation system for math. They are more flexible and allow the kind of logic you want to use (classical, intuitionistic and others) to be a parameter in your foundational system.
  • “Arrow” (morphism) replaced not only the concept of function but also the concept of “element of” (“$\in$”). It allows the concept of variable elements. (This link is to a draft of a section of abstractmath.org that has not been included in the table of contents yet.) It also requires that an “element” has to be an element of one single object; for example, the maps $1\to \mathbb{N}$ and $1\to \mathbb{R}$ that pick out the number $42$ are not the same maps, although of course they are related by the canonical inclusion map $\mathbb{N}\to\mathbb{R}$.
  • Diagrams are used in proofs and give much better immediate understanding than formulas written in strings, which compress a lot of things unnecessarily into thick strings that require understanding lots of conventions and holding things in your memory.
  • Categories-as-kinds-of-math makes it easy to turn an analogy, for example between products of groups and products of spaces, into two examples of the same kind of mathematical object: Namely, a product in a category.

Disadvantages of category theory

  • Category theory requires a new way of thinking. Some people think that is a disadvantage. But genuine innovation is always disruptive. New technology breaks old technology. Of course, the new technology has to turn out to be useful to win out.
  • Category theory has several notions of “equal”. Objects can be the same or isomorphic. Categories can be isomorphic or equivalent. When you are doing category theory, you should never worry about whether two objects are equal: that is considered evil. Category theorists generally ignored the fuzziness of this problem because you can generally get away with it. Still, it was an example of something that had not been turned into a mathematical definition. Two ways of accomplishing this are anafunctors and homotopy type theory.

I expect to write about homotopy type theory soon. It may be the Next Revolution.


1.000… and 0.999…


Note: This post uses MathJax. If you see mathematical formulas with dollar signs around them, or badly formatted formulas, try refreshing the screen. Sometimes you have to do it two or three times.
Recently Julian Wilson sent me this letter:
It is well known that students often have trouble accepting that $0.999\ldots$ is the same number as $1.000\ldots$.  However, there is at least one context in which these could be regarded as in some sense as being distinct. In a discrete dynamical system where the next iterate is formed by multiplying the current value by 10 and dropping the leading digit, and where you make a note at each iteration of the first digit after the decimal point, then 0.9999… generates a sequence of 9s, whereas 1.0000… generates a sequence of 0s. The imagery is of a stretching a circle, wrapping it ten times around itself and recording in which sector (labeled 0 to 9) you end up.
From the dynamical systems perspective, being in state 9 (and remaining there after each iteration) is different from being in state 0.
The $0.9999\ldots =1.0000\ldots$ equation is associated with several conceptual difficulties that math students have, which I will describe here.
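The dynamical system in Julian's letter can be sketched in Python by working with digit sequences rather than with numbers, which is why it can tell the two representations apart. The function name `itinerary`, and the choice to represent a state by an iterator over the digits after the decimal point, are assumptions made for this illustration.

```python
from itertools import repeat


def itinerary(fraction_digits, steps):
    """Note the first digit after the decimal point at each of `steps` iterations.

    `fraction_digits` iterates over the digits after the decimal point.
    Multiplying by 10 and dropping the leading digit shifts that sequence
    one place to the left, which is what consuming one digit does here.
    """
    digits = iter(fraction_digits)
    noted = []
    for _ in range(steps):
        first = next(digits)   # the digit now sitting just after the point
        noted.append(first)    # record it; consuming it performs the shift
    return noted


nines = itinerary(repeat(9), 5)   # the string 0.999...
zeros = itinerary(repeat(0), 5)   # the string 1.000... (integer part dropped)
```

The two runs record `[9, 9, 9, 9, 9]` and `[0, 0, 0, 0, 0]`. The input to this procedure is a string of digits, not the number the string represents.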

The decimal representation is not the number

Another way of describing the equation is to say that "$0.999\ldots$" and "$1.000\ldots$" are distinct decimal representations of the same number, namely $1$. Julian's proposal provides a different interpretation of the notation, in which "$0.999\ldots$" and "$1.000\ldots$" are strings of symbols generated by two different machines.  Of course, that is correct.  But both are correct decimal notation, and both correspond to the same number.

Mathematical writing will sometimes use notation to mean the abstract mathematical object it refers to, and at other times the text is referring to the notation itself.  For example,

$x^2+1$ is always positive.

refers to the value of $x^2+1$, but

If you substitute $5$ for $x$ in $x^2+1$ you get $26$.

refers to the expression "$x^2+1$".  Careful authors would write,

If you substitute $5$ for $x$ in "$x^2+1$" you get $26$.

This ambiguity in using mathematical notation is an example of what philosophers call the "use-mention" distinction, but they apply the phrase to many other situations as well.  Mathematicians have an operational knowledge of this distinction but many of them are not consciously aware of it.


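A decimal representation of a number by definition represents the number that a certain power series converges to. The two power series corresponding to 1.000… and to 0.999… both converge to 1:

\[1.000\ldots \;=\; 1+\sum_{n=1}^{\infty}\frac{0}{10^n} \;=\; 1\]

\[0.999\ldots \;=\; 0+\sum_{n=1}^{\infty}\frac{9}{10^n} \;=\; 1\]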

They are different power series (mention) but converge (use) to the same number.
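The convergence claim can be watched with exact arithmetic. In this Python sketch the function name `partial_sum` is hypothetical; it adds the integer part and the first $n$ decimal places as exact fractions, so no rounding is involved.

```python
from fractions import Fraction


def partial_sum(integer_part, digit, n):
    """integer_part + digit/10 + digit/10^2 + ... + digit/10^n, computed exactly."""
    return integer_part + sum(Fraction(digit, 10 ** k) for k in range(1, n + 1))


nines_5 = partial_sum(0, 9, 5)   # 0.99999
ones_5 = partial_sum(1, 0, 5)    # 1.00000
gap = 1 - nines_5                # how far the partial sum is from 1
```

After $n$ places the partial sum for $0.999\ldots$ differs from $1$ by exactly $10^{-n}$, so the gap can be made smaller than any positive number. That is what it means for the series to converge to $1$.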

Most students new to abstract math are not aware of the importance of definition in math. As they learn more, they may still hold on to the idea that you have to discover or reason out what a math word or expression means.  In purple prose, THE DEFINITION IS A DICTATOR. 

This does not mean that you can understand the concept merely by reading the definition.  The definition usually does not mention most of the important things about the concept.

Completed Infinity

A common remark by newbies about $0.999\ldots$ is that it gets closer and closer to $1$ but does not get there. So it can't be equal to $1$.  This shows a lack of understanding of completed infinity.  The point is that the notation "$0.999\ldots$" refers to a string beginning with "$0.$" and followed by an infinite sequence of $9$'s.  Now "$s$ is an infinite sequence of $9$'s" means precisely that $s$ has an entry $s_n$ for every positive integer $n$, and that $s_n$ is $9$ for every positive integer $n$. 

  • The expression is gradually unrolling over time, and does not ever "get there". 
  • All the nines are already there.

Both the preceding sentences are metaphorical.  They are about how you should think about "$0.999\ldots$".  The first metaphor is bad, the second metaphor is good.  Neither statement is a formal mathematical statement.  Neither statement says anything about what the sequence really is.  They are not statements about reality at all, they are about how you should think about the sequence if you are going to understand what mathematicians say about it. 

Metaphors are crucial to understanding math.  Too many students use the wrong metaphors, but too often no one tells them about it.

We need a math ed text for teachers

I am thinking of precalculus through typical college math major courses.  The issues I have discussed in this post are occasionally written about in the math ed literature but I have had difficulty finding many articles (on the web and on JStor) about these specific ideas.  Anyway, articles are not what we need.  We need a modest paperback book specifically aimed at teachers, covering the kinds of cognitive difficulties math students have when faced with abstraction. 

What I have written in abstractmath.org and in the Handbook are examples of what I mean, but they don't cover all the problems and they suffer from lack of focus.  (Note that the material in abstractmath.org and in posts on this blog can be used freely under a Creative Commons license — click on "Permissions" in the blue banner at the top of this page). 

Among math ed researchers, I have learned a lot from papers by Anna Sfard and David Tall.



Algebra is a difficult foreign language



In a previous post, I said that the symbolic language of mathematics is difficult to learn and that we don't teach it well. (The symbolic language includes as a subset the notation used in high school algebra, precalculus, and calculus.) I gave some examples in that post but now I want to go into more detail.  This discussion is an incomplete sketch of some aspects of the syntax of the symbolic language.  I will write one or more posts about the semantics later.

The languages of math

First, let's distinguish between mathematical English and the symbolic language of math. 

  • Mathematical English is a special register or jargon of English. It not only has its own special vocabulary, like any jargon, but also uses ordinary English words such as "If…then", "definition" and "let" in special ways. 
  • The symbolic language of math is a distinct, special-purpose written language which is not a dialect of the English language and can in fact be read by mathematicians with little knowledge of English.
    • It has its own symbols and rules that are quite different from spoken languages. 
    • Simple expressions can be pronounced, but complicated expressions may only be pointed to or referred to.
  • A mathematical article or book is typically written using mathematical English interspersed with expressions in the symbolic language of math.

Symbolic expressions

A symbolic noun (logicians call it a term) is an expression in the symbolic language that names a number or other mathematical object, and may carry other information as well.

  • "3" is a noun denoting the number 3.
  • "$\text{Sym}_3$" is a noun denoting the symmetric group on 3 letters.
  • "$2+1$" is a noun denoting the number 3.  But it contains more information than that: it describes a way of calculating 3 as a sum.
  • "$\sin^2\frac{\pi}{4}$" is a noun denoting the number $\frac{1}{2}$, and it also describes a computation that yields the number $\frac{1}{2}$.  If you understand the symbolic language and know that $\sin$ is a numerical function, you can recognize "$\sin^2\frac{\pi}{4}$" as a symbolic noun representing a number even if you don't know how to calculate it.
  • "$2+1$" and "$\sin^2\frac{\pi}{4}$" are said to be encapsulated computations.
    • The word "encapsulated" refers to the fact that to understand what the expressions mean, you must think of the computation not as a process but as an object.
    • Note that a computer program is also an object, not a process.
  • "$a+1$" and "$\sin^2\frac{\pi x}{4}$" are encapsulated computations containing variables that represent numbers. In these cases you can calculate the value of these computations if you give values to the variables.  

A symbolic statement is a symbolic expression that represents a statement that is either true or false or free, meaning that it contains variables and is true or false depending on the values assigned to the variables.

  • $\pi\gt0$ is a symbolic assertion that is true.
  • $\pi\lt0$ is a symbolic assertion that is false.  The fact that it is false does not stop it from being a symbolic assertion.
  • $x^2-5x+4\gt0$ is an assertion that is true for $x=5$ and false for $x=1$.
  • $x^2-5x+4=0$ is an assertion that is true for $x=1$ and $x=4$ and false for all other numbers $x$.
  • $x^2+2x+1=(x+1)^2$ is an assertion that is true for all numbers $x$. 
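A free statement behaves like a function from values of its variables to truth values, and the last three examples can be checked that way. The Python names below (`is_positive`, `is_root`, `is_identity`) are hypothetical, and testing the integers from $-10$ to $10$ is only a spot check, not a proof about all numbers.

```python
def is_positive(x):
    return x ** 2 - 5 * x + 4 > 0            # true or false depending on x


def is_root(x):
    return x ** 2 - 5 * x + 4 == 0           # true only for x = 1 and x = 4


def is_identity(x):
    return x ** 2 + 2 * x + 1 == (x + 1) ** 2   # true for every x tried


sample = range(-10, 11)
roots_found = [x for x in sample if is_root(x)]
identity_holds = all(is_identity(x) for x in sample)
```

Each function is the assertion with its variable left open. Supplying a value for `x` is what turns it into something that is simply true or false.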

Properties of the symbolic language

The constituents of a symbolic expression are symbols for numbers, variables and other mathematical objects. In a particular expression, the symbols are arranged according to conventions that must be understood by the reader. These conventions form the syntax or grammar of symbolic expressions. 

The symbolic language has been invented piecemeal by mathematicians over the past several centuries. It is thus a natural language and like all natural languages it has irregularities and often results in ambiguous expressions. It is therefore difficult to learn and requires much practice to learn to use it well. Students learn the grammar in school and are often expected to understand it by osmosis instead of by being taught specifically.  However, it is not as difficult to learn well as a foreign language is.

In the basic symbolic language, expressions are written as strings of symbols.

  • The symbolic language gives (sometimes ambiguous) meaning to symbols placed above or below the line of symbols, so the strings are in some sense more than one dimensional but less than two-dimensional.
  • Integral notation, limit notation, and others, are two-dimensional enough to have two or three levels of symbols. 
  • Matrices are fully two-dimensional symbols, and so are commutative diagrams.
  • I will not consider graphs (in both senses) and geometric drawings in this post because I am not sure what I want to write about them.

Syntax of the language

One of the basic methods of the symbolic language is the use of constructors.  These can usually be analyzed as functions or operators, but I am thinking of "constructor" as a linguistic device for producing an expression denoting a mathematical object or assertion. Ordinary languages have constructors, too; for example "-ness" makes a noun out of an adjective ("good" to "goodness") and "and" forms a grouping ("men and women").

Special symbols

The language uses special symbols both as names of specific objects and as constructors.

  • The digits "0", "1", "2" are named by special symbols.  So are some other objects: "$\emptyset$", "$\infty$".
  • Certain verbs are represented by special symbols: "$=$", "$\lt$", "$\in$", "$\subseteq$".
  • Some constructors are infixes: "$2+3$" denotes the sum of 2 and 3 and "$2-3$" denotes the difference between them.
  • Others are placed before, after, above or even below the name of an object.  Examples: $a'$, which can mean the derivative of $a$ or the name of another variable; $n!$ denotes $n$ factorial; $a^\star$ is the dual of $a$ in some contexts; $\vec{v}$ constructs a vector whose name is "$v$".
  • Letters from other alphabets may be used as names of objects, either defined in the context of a particular article, or with more nearly global meaning such as "$\pi$" (but "$\pi$" can denote a projection, too).

This is a lot of stuff for students to learn. Each symbol has its own rules of use (where you put it, which sort of expression you may use it with, etc.).  And the meaning is often determined by context. For example $\pi x$ usually means $\pi$ multiplied by $x$, but in some books it can mean the function $\pi$ evaluated at $x$. (But this is a remark about semantics; more in another post.)

"Systematic" notation

  • The form "$f(x)$" is systematically used to denote the value of a function $f$ at the input $x$.  But this usage has variations that confuse beginning students:
    • "$\sin\,x$" is more common than "$\sin(x)$".
    • When the function has just been named as a letter, "$f(x)$" is more common than "$fx$" but many authors do use the latter.
  • Raising a symbol after another symbol commonly denotes exponentiation: "$x^2$" denotes $x$ times $x$.  But it is used in a different meaning in the case of tensors (and elsewhere).
  • Lowering a symbol after another symbol, as in "$x_i$"  may denote an item in a sequence.  But "$f_x$" is more likely to denote a partial derivative.
  • The integral notation is quite complicated.  The expression \[\int_a^b f(x)\,dx\] has three parameters, $a$, $b$ and $f$, and a bound variable $x$ that specifies the variable used in the formula for $f$.  Students gradually learn the significance of these facts as they work with integrals. 
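The remark about the integral's three parameters and one bound variable shows up clearly when the integral is written as a program. The sketch below is a plain midpoint-rule approximation, chosen only for brevity; the function name `integrate` and the number of subintervals are assumptions.

```python
def integrate(f, a, b, n=1000):
    """Midpoint-rule approximation to the integral of f from a to b.

    The parameters are exactly a, b and f. The bound variable of the
    notation has no counterpart here: it is internal to f.
    """
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


square_in_x = lambda x: x * x
square_in_t = lambda t: t * t    # same function, different bound variable

area_x = integrate(square_in_x, 0, 1)
area_t = integrate(square_in_t, 0, 1)
```

Renaming the bound variable from `x` to `t` changes nothing: both calls compute the same approximation to $\frac{1}{3}$.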


Variables

Variables have deep problems concerned with their meaning (semantics). But substitution for variables causes syntactic problems that students have difficulty with as well.

  • Substituting $4$ for $x$ in the expression $3+x$ results in $3+4$. 
  • Substituting $4$ for $x$ in the expression $3x$ results in $12$, not $34$. 
  • Substituting "$y+z$" for $x$ in the expression $3x$ results in $3(y+z)$, not $3y+z$.  Some of my calculus students in performing this substitution would write $3\,\,y+z$, using a space to separate.  The rules don't allow that, but I think it is a perfectly natural mistake. 
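The mistakes in this list are what you get from substituting into the string of symbols rather than into the structure it stands for. The Python sketch below contrasts the two. The tree representation (a tuple `(op, left, right)` with `"*"` written explicitly) and the names `naive_substitute`, `substitute` and `show` are assumptions made for illustration.

```python
def naive_substitute(expr, var, replacement):
    """Substitute by replacing characters in the string. This is the natural mistake."""
    return expr.replace(var, replacement)


def substitute(tree, var, replacement):
    """Substitute into an expression tree such as ("*", 3, "x") for 3x."""
    if tree == var:
        return replacement
    if isinstance(tree, tuple):
        op, left, right = tree
        return (op,
                substitute(left, var, replacement),
                substitute(right, var, replacement))
    return tree   # a constant or a different variable


def show(tree):
    """Write a tree as a string, parenthesizing sums that occur inside products."""
    if not isinstance(tree, tuple):
        return str(tree)
    op, left, right = tree
    l, r = show(left), show(right)
    if op == "*":
        if isinstance(left, tuple) and left[0] == "+":
            l = f"({l})"
        if isinstance(right, tuple) and right[0] == "+":
            r = f"({r})"
    return f"{l}{op}{r}"


three_x = ("*", 3, "x")
wrong_number = naive_substitute("3x", "x", "4")        # "34"
wrong_sum = naive_substitute("3x", "x", "y+z")         # "3y+z"
right_number = show(substitute(three_x, "x", 4))       # "3*4"
right_sum = show(substitute(three_x, "x", ("+", "y", "z")))   # "3*(y+z)"
```

The tree version needs the parentheses only when it is written back out as a string, which is one way to see that the parenthesization rule belongs to the syntax rather than to the meaning.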

Using expressions and writing about them

  • If I write "If $x$ is an odd integer, then $3+x$ is odd", then I am using $3+x$ in a sentence. It is a noun denoting an unspecified number which can be constructed in a specified way.
  • When I mention substituting $4$ for $x$ in "$3+x$", I am talking about the expression $3+x$.  I am not writing about a number, I am writing about a string of symbols.  This distinction causes students major difficulties and teachers hardly ever talk about it.
  • In the section on variables, I wrote "the expression $3+x$", which shows more explicitly that I am talking about it as an expression.
    • Note that quotes in novels don't mean you are talking about the expression inside the quotes, it means you are describing the act of a person saying something.
  • It is very common to write something like, "If I substitute $4$ for $x$ in $3x$ I get $3 \times 4=12$".  This is called a parenthetic assertion, and it is literally nonsense (it says I get an equation).
  • If I pronounce the sentence "We know that $x\gt0$", I pronounce "$x\gt0$" as "$x$ is greater than zero".  If I pronounce the sentence "For any $x\gt0$ there is $y\gt0$ for which $x\gt y$", then I pronounce the expression "$x\gt0$" as "$x$ greater than zero".  This is an example of context-sensitive pronunciation.
  • There is a lot more about parenthetic assertions and context-sensitive pronunciation in More about the languages of math.


I have described some aspects of the syntax of the symbolic language of math. Learning that syntax is difficult and requires a lot of practice. Students who manage to learn the syntax and semantics can go on to learn further math, but students who don't are forever blocked from many rewarding careers. I heard someone say at the MathFest in Madison that about 25% of all high school students never really understand algebra.  I have only taught college students, but some students (maybe 5%) who get into freshman calculus in college are weak enough in algebra that they cannot continue. 

I am not proposing that all aspects of the syntax (or semantics) be taught explicitly.  A lot must be learned by doing algebra, where they pick up the syntax subconsciously just as they pick up lots of other behavior-information in and out of school. But teachers should explicitly understand the structure of algebra at least in some basic way so that they can be aware of the source of many of the students' problems. 

It is likely that the widespread use of computers will allow some parts of the symbolic language of math to be replaced by other methods such as using Excel or some visual manipulation of operations as suggested in my post Mathematical and linguistic ability.  It is also likely that the symbolic language will gradually be improved to get rid of ambiguities and irregularities.  But a deliberate top-down effort to simplify notation will not succeed. Such things rarely succeed.





Mathematical usage

Comments about mathematical usage, extending those in my post on abuse of notation.

Geoffrey Pullum, in his post Dogma vs. Evidence: Singular They, makes some good points about usage that I want to write about in connection with mathematical usage.  There are two different attitudes toward language usage abroad in the English-speaking world. (See Note [1])

  • What matters is what people actually write and say.   Usage in this sense may often be reported with reference to particular dialects or registers, but in any case it is based on evidence, for example citations of quotations or a linguistic corpus.  (Note [2].)  This approach is scientific.
  • What matters is what a particular writer (of usage or style books) believes about  standards for speaking or writing English.  Pullum calls this "faith-based grammar".  (People who think in this way often use the word "grammar" for usage.)  This approach is unscientific.

People who write about mathematical usage fluctuate between these two camps.

My writings in the Handbook of Mathematical Discourse and abstractmath.org are mostly evidence based, with some comments here and there deprecating certain usages because they are confusing to students.  I think that is about the right approach.  Students need to know what is actual mathematical usage, even usage that many mathematicians deprecate.

Most math usage that is deprecated (by me and others) is deprecated for a reason.  This reason should be explained, and that is enough to stop it being faith-based.  To make it really scientific you ought to cite evidence that students have been confused by the usage.  Math education people have done some work of this sort.  Most of it is at the K-12 level, but some researchers have worked with college students, observing the way they solve problems or how they understand some concepts, and this work often cites examples.

Examples of usage to be deprecated


Powers of functions

$f^n(x)$ can mean either iterated composition or multiplication of the values.  For example, $f^2(x)$ can mean $f(x)f(x)$ or $f(f(x))$.  This is exacerbated by the fact that in undergrad calculus texts, $\sin^{-1}x$ refers to the arcsine, and $\sin^2 x$ refers to $\sin x\sin x$.  This causes innumerable students trouble.  It is a Big Deal.
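
The two readings can be compared numerically. The following is a minimal Python sketch; the helper names `compose_power` and `value_power` are invented for this illustration and do not come from any library.

```python
import math

def compose_power(f, n):
    """Return the n-fold composition of f with itself (n >= 1)."""
    def iterated(x):
        for _ in range(n):
            x = f(x)
        return x
    return iterated

def value_power(f, n):
    """Return the function x -> f(x)**n (n may be negative)."""
    return lambda x: f(x) ** n

x = 0.5

# Two readings of "sin^2 x".
iterated_sq = compose_power(math.sin, 2)(x)   # sin(sin(x))
product_sq = value_power(math.sin, 2)(x)      # sin(x) * sin(x)

# Two readings of "sin^{-1} x".
arcsine = math.asin(x)                        # inverse function (calculus-text meaning)
reciprocal = value_power(math.sin, -1)(x)     # 1 / sin(x)

print("sin(sin x) =", iterated_sq, "  (sin x)^2 =", product_sq)
print("arcsin x   =", arcsine, "  1/sin x   =", reciprocal)
```

At the same input the two readings give different numbers for both exponents, which is why a text needs to say which reading it intends.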


Set "in" another set.  This is discussed in the Handbook.  My impression is that for students the problem is that they confuse "element of" with "subset of", and the fact that "in" is used for both meanings is not the primary culprit.  That's because most sets in practice don't have both sets and non-sets as elements.  So the problem is a Big Deal when students first meet with the concept of set, but the notational confusion with "in" is only a Small Deal.
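
Programming languages usually keep the two relations apart, which makes the distinction easy to experiment with. This is a small Python sketch (the sets `A` and `B` are made up for the illustration); `in` tests "element of" and `<=` tests "subset of".

```python
# Python sets cannot contain mutable sets, so the inner sets are frozensets.
one = frozenset({1})
two = frozenset({2})

A = {1, 2, one}   # A = {1, 2, {1}}: has both a set and non-sets as elements
B = {1, 2}        # B has only numbers as elements

# In A, the set {1} is both an element and a subset.
one_is_element_of_A = one in A     # {1} is listed as an element of A
one_is_subset_of_A = one <= A      # every element of {1}, namely 1, is in A

# The set {2} is a subset of A but not an element of A.
two_is_element_of_A = two in A
two_is_subset_of_A = two <= A

# In B, {1} can only be a subset, never an element.
one_is_element_of_B = one in B
one_is_subset_of_B = one <= B

print(one_is_element_of_A, one_is_subset_of_A)
print(two_is_element_of_A, two_is_subset_of_A)
print(one_is_element_of_B, one_is_subset_of_B)
```

Only a set like `A`, which mixes sets and non-sets as elements, makes the sentence "A contains {1}" genuinely ambiguous; for `B` only the subset reading can be true.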


This is not a Big Deal.  But I have personally witnessed students (in upper level undergrad courses) that were confused by this.


The many uses of parentheses, discussed in abstractmath.  (The Handbook article on parentheses gives citations, including one in which the notation "$(a,b)$" means open interval once and GCD once in the same sentence!)  I think the only part that is a Big Deal, or maybe Medium Deal, is the fact that the value of a function $f$ at an input $x$ can be written either as "$f\,x$" or as "$f(x)$".  In fact, we do without the parentheses when the name of the function is a convention, as in $\sin x$ or $\log x$, and with the parentheses when it is a variable symbol, as in "$f(x)$".  (But a substantial minority of mathematicians use $f\,x$ in the latter case.  Not to mention $xf$.)  This causes some beginning calculus students to think "$\sin x$" means "sin" times $x$.


The examples given above are only a sampling of troubles caused by mathematical notation.   Many others are mentioned in the Handbook and in Abstractmath, but they are scattered.  I welcome suggestions for other examples, particularly at the college and graduate level. Abstractmath will probably have a separate article listing the examples someday…


[1] The situation Pullum describes for English is probably different in languages such as Spanish, German and French, which have Academies that dictate usage for the language.  On the other hand, from what I know about them most speakers of those languages ignore their dictates.

[2] Actually, they may use more than one corpus, but I didn't want to write "corpuses" or "corpora" because in either way I would get sharp comments from faith-based usage people.

References on mathematical usage

Bagchi, A. and C. Wells (1997), Communicating Logical Reasoning.

Bagchi, A. and C. Wells (1998)  Varieties of Mathematical Prose.

Bullock, J. O. (1994), ‘Literacy in the language of mathematics’. American Mathematical Monthly, volume 101, pages 735–743.

de Bruijn, N. G. (1994), ‘The mathematical vernacular, a language for mathematics with typed sets’. In Selected Papers on Automath, Nederpelt, R. P., J. H. Geuvers, and R. C. de Vrijer, editors, volume 133 of Studies in Logic and the Foundations of Mathematics, pages 865–935.

Epp, S. S. (1999), ‘The language of quantification in mathematics instruction’. In Developing Mathematical Reasoning in Grades K-12, Stiff, L. V., editor, NCTM Publications, pages 188–197.

Gillman, L. (1987), Writing Mathematics Well. Mathematical Association of America

Higham, N. J. (1993), Handbook of Writing for the Mathematical Sciences. Society for Industrial and Applied Mathematics.

Knuth, D. E., T. Larrabee, and P. M. Roberts (1989), Mathematical Writing, volume 14 of MAA Notes. Mathematical Association of America.

Krantz, S. G. (1997), A Primer of Mathematical Writing. American Mathematical Society.

O'Halloran, K. L.  (2005), Mathematical Discourse: Language, Symbolism And Visual Images.  Continuum International Publishing Group.

Pimm, D. (1987), Speaking Mathematically: Communications in Mathematics Classrooms.  Routledge & Kegan Paul.

Schweiger, F. (1994b), ‘Mathematics is a language’. In Selected Lectures from the 7th International Congress on Mathematical Education, Robitaille, D. F., D. H. Wheeler, and C. Kieran, editors. Sainte-Foy: Presses de l’Université Laval.

Steenrod, N. E., P. R. Halmos, M. M. Schiffer, and J. A. Dieudonné (1975), How to Write Mathematics. American Mathematical Society.

Wells, C. (1995), Communicating Mathematics: Useful Ideas from Computer Science.

Wells, C. (2003), Handbook of Mathematical Discourse

Wells, C. (ongoing), Abstractmath.org.


Abuse of notation

I have recently read the Wikipedia article on Abuse of Notation (this link is to the version of 29 December 2011, since I will eventually edit it).  The Handbook of Mathematical Discourse and abstractmath.org mention this idea briefly.  It is time to expand the abstractmath article and to redo parts of the Wikipedia article, which  contains some confusions.

This is a preliminary draft, part of which I’ll incorporate into abstractmath after you readers make insightful comments :).

The phrase “Abuse of Notation” is used in articles and books written by research mathematicians.  It is part of Mathematical English.  This post is about

  • What “abuse of notation” means in mathematical writing and conversation.
  • What it could be used to mean.
  • Mathematical usage in general.  I will discuss this point in the context of the particular phrase “abuse of notation”, not a bad way to talk about a subject.

Mathematical Usage


If I’m going to write about the usage of Mathematical English, I should ideally verify what I claim about the usage by finding citations for a claim: documented quotations that illustrate the usage.  This is the standard way to produce any dictionary.

There is no complete authoritative source for usage of words and phrases in Mathematical English (ME), or for that matter for usage in the Symbolic Language (SL).

  • The Oxford Concise Dictionary of Mathematics [2] covers technical terms and symbols used in school math and in much of undergraduate math, but not so much of research math.  It does not mention being based on citations and it hardly talks about usage at all, even for notorious student-confusing notations such as “$\sin^k x$”. But it appears quite accurate with good explanations of the math it covers.
  • I wrote Handbook of Mathematical Discourse to stimulate investigations into mathematical usage.  It describes a good many usages in Mathematical English and the Symbolic Language, documented with citations of quotations, but is quite incomplete (as I said in its Introduction).  The Handbook has 428 citations for various usages.  (They are at the end of the on-line PDF version. They are not in the printed book, but are on the web with links to pages in the printed book.)
  • MathWorld has an extensive list of mathematical words, phrases and symbols, and accurate definitions or descriptions of them, even for a great many advanced research topics. It also frequently mentions usage (see formula and inverse sine), but does not give citations.
  • Wikipedia has the most complete set of definitions of mathematical objects that I know of.  The entries sometimes mention usage. I have not detected any entry that gives citations for usage.  Not that that should stop anyone from adding them.

Teaching mathematical usage

In explaining mathematical usage to students, particularly college-level or higher math students, you have choices:

  1. Tell them what you think the usage of a word, phrase, or symbol is, without researching citations.
  2. Tell them what you think the usage ought to be.
  3. Tell them what you think the usage is, supported by citations.

(1) has the problem that you can be wrong.  In fact when I worked on the Handbook I was amazed  at how wrong I could be in what the usage was, in spite of the fact that I had been thinking about usage in ME and SL since I first started teaching (and kept a folder of what I had noticed about various usages).  However,  professional mathematicians generally have a reasonably accurate idea about usage for most things, particularly in their field and in undergraduate courses.

(2) is dangerous.  Far too many mathematicians (though still a minority) introduce usage in articles and lectures that is not common or that they invented themselves. As a result their students will be confused in trying to read other sources and may argue with other teachers about what is “correct”.  It is a gross violation of teaching ethics to tell the students that (for example) “$x > 0$” allows $x = 0$ and not mention to them that nearly all written mathematics does not allow that.  (Did you know that a small percentage of mathematicians and educators do use that meaning, including in some secondary institutions in some countries?  It is partly Bourbaki’s fault.)

(3) You often can’t tell them what the usage is, supported by citations, because, as mentioned above, documented mathematical usage is sparse.

I think people should usually choose (1) instead of (2).  If they do want to introduce a new usage or notation because it is “more logical” or because “my thesis advisor used it” or something, they should reconsider.  Most such attempts have failed, and thousands of students have been confused by the attempts.

Abuse of notation

“Abuse of notation” is a phrase used in mathematical writing to describe terminology and notation that does not have transparent meaning. (Transparent meaning is described in some detail under “compositional” in the Handbook.)

Abuse of notation was originally defined in French, where the word “abus” does not carry the same strongly negative connotation that it does in English.

Suppression of parameters

One widely noticed practice called “abuse of notation” is the use of the name of the underlying set of a mathematical structure to refer to the structure itself. For example, a group is a structure $(G,\text{*})$ where $G$ is a set and $*$ is a binary operation with certain properties. The most common way to refer to this structure is simply to call it $G$. Since any set of cardinality greater than 1 has more than one group structure on it, this does not include all the information needed to determine the group. This type of usage is illustrated in citation 82 below.  It is an example of suppression of parameters.
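
To see concretely that the underlying set does not determine the group, here is a brute-force Python sketch. The function names are invented for the illustration, and the choice of the four-element set with addition mod 4 and bitwise XOR is just one convenient example.

```python
from itertools import product

G = [0, 1, 2, 3]           # one underlying set

def add_mod4(a, b):        # first operation: the cyclic group of order 4
    return (a + b) % 4

def xor(a, b):             # second operation: the Klein four-group
    return a ^ b

def is_group(elements, op):
    """Brute-force check of closure, associativity, identity and inverses."""
    closed = all(op(a, b) in elements
                 for a, b in product(elements, repeat=2))
    associative = all(op(op(a, b), c) == op(a, op(b, c))
                      for a, b, c in product(elements, repeat=3))
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not (closed and associative and identities):
        return False
    e = identities[0]
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)

def count_self_inverse(elements, op, identity):
    """Number of elements a with a * a = identity."""
    return sum(1 for a in elements if op(a, a) == identity)

print(is_group(G, add_mod4), is_group(G, xor))
print(count_self_inverse(G, add_mod4, 0), count_self_inverse(G, xor, 0))
```

Both operations make the same set into a group, but the two groups have different numbers of self-inverse elements, so they are not even isomorphic. Saying "the group $G$" leaves the operation to be understood from context.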

Writing “$\log x$” without mentioning the base of the logarithm is also an example of suppression of parameters.  I think most mathematicians would regard this as a convention rather than as an abuse of notation.  But I have no citations for this (although they would probably be easy to find).  I doubt that it is possible to find a rational distinction between “abuse of notation” and “convention”; it is all a matter of what people are used to saying.
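
The suppressed base is a genuine parameter, as a quick computation shows. This sketch uses Python's standard `math` module, where `math.log` with one argument is the natural logarithm; the input `x = 100.0` is an arbitrary choice for the illustration.

```python
import math

x = 100.0

natural = math.log(x)      # base e
common = math.log10(x)     # base 10
binary = math.log2(x)      # base 2

# Making the parameter explicit: math.log accepts the base as a second argument.
explicit_common = math.log(x, 10)

# Change of base ties the readings together: log_b(x) = ln(x) / ln(b).
via_change_of_base = natural / math.log(10)

print(natural, common, binary)
print(explicit_common, via_change_of_base)
```

The three readings of "log 100" differ by constant factors, so a reader who assumes the wrong base gets a wrong number rather than an error message.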


The naming of a structure by using the name of its underlying set is also an example of synecdoche, the naming of a whole by a part (for example, “wheels” to mean a car).

Another type of synecdoche that has been called abuse of notation is referring to an equivalence class by naming one of its elements.  I do not have a good quotation-citation that shows this use.  Sometimes people write $2 + 4 = 1$ when they are working in the Galois field with 5 elements.  But that can be interpreted in more than one way.  If GF[5] consists of equivalence classes of integers (mod 5) then they are indeed using 2 (for example) to stand for the equivalence class of 2.  But they could instead define GF[5] in the obvious way with underlying set $\{0,1,2,3,4\}$.  In any case, making distinctions of that sort is pedantic, since the two structures are related by a natural isomorphism (next paragraph!).
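
The two interpretations of $2 + 4 = 1$ can be modeled side by side. In this Python sketch the names `add_residues`, `ModClass` and `canonical` are hypothetical, chosen only for the illustration, and only addition is modeled.

```python
P = 5

# Reading 1: the underlying set is {0, 1, 2, 3, 4}, with addition reduced mod 5.
def add_residues(a, b):
    return (a + b) % P

# Reading 2: an element is an equivalence class of integers,
# and any integer in the class may be used to name it.
class ModClass:
    def __init__(self, representative):
        self.rep = representative              # any member of the class

    def __add__(self, other):
        return ModClass(self.rep + other.rep)  # add representatives, no reduction

    def __eq__(self, other):
        # Two names denote the same class when they differ by a multiple of 5.
        return (self.rep - other.rep) % P == 0

    def __hash__(self):
        return hash(self.rep % P)

def canonical(c):
    """The natural isomorphism: send a class to its member in {0, ..., 4}."""
    return c.rep % P

print(add_residues(2, 4))                      # sum computed in reading 1
print(canonical(ModClass(2) + ModClass(4)))    # sum in reading 2, then translated
print(ModClass(2) == ModClass(7))              # 2 and 7 name the same class
```

Adding in either model and then translating with `canonical` gives the same answer, which is the sense in which distinguishing the two structures is pedantic.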

Identifying objects via isomorphism

This is quite commonly called “abuse of notation” and is exemplified in citations 209, 395 and AB3.

Overloaded notation

John Harrison, in [1], uses “abuse of notation” to describe the use of a function symbol to apply to both an element of its domain and a subset of the domain.  This is an example of overloaded notation.  I have not found another citation for this usage other than Harrison and I don’t remember anyone using it.  Another example of overloaded notation is the use of the same symbol “$\times$” for multiplication of numbers, matrices and 3-vectors.  I have never heard that called abuse of notation.  But I have no authority to say anything about this usage because I haven’t made the requisite thorough search of the literature.
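
The overloading of a function symbol on elements and on subsets can be imitated in code. The following Python sketch is only an illustration: `image` and `overloaded` are made-up helper names, and squaring is an arbitrary example function.

```python
def f(x):
    return x * x

def image(func, subset):
    """The set {func(x) : x in subset}; in math this is often written f(S) too."""
    return {func(x) for x in subset}

def overloaded(func):
    """Wrap func so that a single name accepts an element or a set of elements."""
    def apply(arg):
        if isinstance(arg, (set, frozenset)):
            return image(func, arg)   # argument is read as a subset of the domain
        return func(arg)              # argument is read as an element of the domain
    return apply

F = overloaded(f)

print(F(3))             # value at an element
print(F({-2, 2, 3}))    # image of a subset
```

The wrapper decides which meaning applies by looking at the kind of argument, just as a reader does. That works only while no element of the domain is itself a set of elements of the domain; when that happens the overloaded notation becomes ambiguous.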

Powers of functions

The Wikipedia article on abuse of notation (29 Dec 2011 version) mentions the fact that $f^2(x)$ can mean either $f(x)f(x)$ or $f(f(x))$.   I have never heard this called abuse of notation and I don’t think it should be called that.  The notation “$f^2(x)$” can in ordinary usage mean one of two things and the author or teacher should say which one they mean.  Many math phrases or symbolic expressions  can mean more than one thing and the author generally should say which.  I don’t see the point of calling this phenomenon abuse of notation.

Radial concept

The Wikipedia article mentions phrases such as “partial function”.  The article cites Bourbaki as calling a sentence such as “Let $f:A\to B$ be a partial function” an abuse of notation.  Bourbaki is wrong in a deep sense (as the article implies).  There are several points to make about this:

  • Some authors, particularly in logic, define a function to be what most of us call a partial function.  Some authors  require a ring to have a unit and others don’t.  So what?
  • The phrase “partial function” has a standard meaning in math:  Roughly “it is a function except it is defined on only part of its domain”.  Precisely, $f:A\to B$ is a partial function if it is a function $f:A'\to B$ for some subset $A'$ of $A$.
  • A partial function is not in general a function.  A stepmother is not a mother.  A left identity may not be an identity, but the phrase “left identity” is defined precisely.   An incomplete proof is not a proof, but you know what the phrase means! (Compare “expectant mother”).   This is the way we normally talk and think.  See the article “radial concept” in the Handbook.
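
The precise definition of partial function given above translates directly into a small check. In this Python sketch a finite function is modeled as a dictionary from inputs to outputs; the names `is_partial_function` and `is_total_function` and the example set `A` are assumptions made for the illustration.

```python
A = {-2, -1, 0, 1, 2}

# The reciprocal is defined only on A' = A minus {0}.
reciprocal = {a: 1 / a for a in A if a != 0}

# The identity is defined on all of A.
identity = {a: a for a in A}

def is_partial_function(table, source):
    """True when the table is a function on some subset A' of the source set."""
    return set(table) <= source

def is_total_function(table, source):
    """True when the table is defined on every element of the source set."""
    return set(table) == source

domain_of_definition = set(reciprocal)

print(is_partial_function(reciprocal, A), is_total_function(reciprocal, A))
print(is_partial_function(identity, A), is_total_function(identity, A))
print(sorted(domain_of_definition))
```

Under this definition every function on $A$ is also a partial function on $A$ (take $A' = A$), but a partial function such as the reciprocal need not be a function on $A$.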

Other uses

AB4 involves a redefinition of “$\in$” in a special case.  Authors redefine symbols all the time.  This kind of redefinition on the fly probably should be avoided, but since they did it I am glad they mentioned it.

I have not talked about some of the uses mentioned in the Wikipedia article because I don’t yet understand them well enough.  AB1 and AB2 refer to a common use with pullback that I am not sure I understand (in terms of how the author is thinking of it).  I also don’t understand AB5.  Suggestions from readers would be appreciated.

Kill it!

Well, it’s more polite to say, we don’t need the phrase “abuse of notation” and it should be deprecated.

  • The use of the word “abuse” makes it sound like a bad thing, and most instances of abuse of notation are nothing of the sort.  They make mathematical writing much more readable.
  • Nearly everywhere it is used it could just as well be called a convention.  (This requires verification by studying math texts.)


The first three citations are in the Handbook list; the numbers refer to that list’s numbering. The others I searched out for the purpose of this post.

82. Busenberg, S., D. C. Fisher, and M. Martelli (1989), Minimal periods of discrete and smooth orbits. American Mathematical Monthly, volume 96, pages 5–17. [p. 8. Lines 2–4.]

Therefore, a normed linear space is really a pair $(\mathbf{E},\|\cdot\|)$ where $\mathbf{E}$ is a linear vector space and $\|\cdot\|:\mathbf{E}\to(0,\infty)$ is a norm. In speaking of normed spaces, we will frequently abuse this notation and write $\mathbf{E}$ instead of the pair $(\mathbf{E},\|\cdot\|)$.

209. Hunter, T. J. (1996), On the homology spectral sequence for topological Hochschild homology. Transactions of the American Mathematical Society, volume 348, pages 3941–3953. [p. 3934. Lines 8–6 from bottom.]

We will often abuse notation by omitting mention of the natural isomorphisms making $\wedge$ associative and unital.

395. Teitelbaum, J. T. (1991), ‘The Poisson kernel for Drinfeld modular curves’. Journal of the American Mathematical Society, volume 4, pages 491–511. [p. 494. Lines 1–4.]

… may find a homeomorphism $x:E\to \mathbb{P}^1_k$ such that $\displaystyle x(\gamma u) = \frac{ax(u)+b}{cx(u)+d}$. We will tend to abuse notation and identify $E$ with $\mathbb{P}^1_k$ by means of the function $x$.

AB1. Fujita, T. On the structure of polarized manifolds with total deficiency one.  I. J. Math. Soc. Japan, Vol. 32, No. 4, 1980.

Here we show examples of symbols used in this paper …

$L_{T}$: The pull back of $L$ to a space $T$ by a given morphism $T\rightarrow S$. However, when there is no danger of confusion, we OFTEN write $L$ instead of $L_T$ by abuse of notation.

AB2. Sternberg, S. Minimal coupling and the symplectic mechanics of a classical particle in the presence of a Yang-Mills field. Physics, Vol. 74, No. 12, pp. 5253–5254, December 1977.

On the other hand, let us, by abuse of notation, continue to write $\Omega$ for the pullback of $\Omega$ from $F$ to $P \times F$ by projection onto the second factor. Thus, we can write $\xi_Q\rfloor\Omega = \xi_F\rfloor\Omega$ and …

AB3. Dobson, D, and Vogel, C. Convergence of an iterative method for total variation denoising. SIAM J. Numer. Anal., Vol. 34, pp. 1779, October, 1997.

Consider the approximation

(3.7) $u\approx U\stackrel{\text{def}}{=}\sum_{j=1}^N U_j\phi_j$ …

In an abuse of notation, $U$ will represent both the coefficient vector $\{U_j\}_{j=1}^N$ and the corresponding linear combination (3.7).

AB4. Lewis, R, and Torczon, V. Pattern search algorithms for bound constrained minimization.  NASA Contractor Report 198306; ICASE Report No. 96-20.

By abuse of notation, if $A$ is a matrix, $y\in A$ means that the vector $y$ is a column of $A$.

AB5. Allemandi, G, Borowiec, A. and Francaviglia, M. Accelerated Cosmological Models in Ricci squared Gravity. ArXiv:hep-th/0407090v2, 2008.

This allows to reinterpret both $f(S)$ and $f'(S)$ as functions of $\tau$ in the expressions:
\begin{equation*}\begin{cases}  f(S) = f(F(\tau)) = f(\tau )\\  f'(S) = f'(F(\tau )) = f'(\tau )\end{cases}\end{equation*}
following the abuse of notation $f(F(t )) = f(t )$ and $f'(F(t )) = f'(t )$.


[1] Harrison, J. Criticism and reconstruction, in Formalized Mathematics (1996).

[2] Clapham, C. and J. Nicholson.  Oxford Concise Dictionary of Mathematics, Fourth Edition (2009).  Oxford University Press.

