The intent of mathematical assertions

An assertion in mathematical writing can be a claim, a definition or a constraint.  It may be difficult to determine the intent of the author.  That is discussed briefly here.

Assertions in math texts can play many different roles.

English sentences can state facts, ask questions, give commands, and do other things.  The intent of an English sentence is often obvious, but sometimes it can be unexpectedly different from what is apparent in the sentence.  For example, the statement “Could you turn the TV down?” is apparently a question expecting a yes or no answer, but in fact it may be a request. (See the Wikipedia article on speech acts.) Such things are normally understood by people who know each other, but people for whom English is a foreign language or who have a different culture have difficulties with them.

There are some problems of this sort in math English and the symbolic language, too.  An assertion can have the intent of being a claim, a definition, or a constraint.

Most of the time the intent of an assertion in math is obvious. But there are conventions and special formats that newcomers to abstract math may not recognize, so they misunderstand the point of the assertion. This section takes a brief look at some of the problems.


The way I am using the words “assertion”, “claim”, and “constraint” is not standard usage in math, logic or linguistics.


In most circumstances, you would expect that if a lecturer or author makes a math assertion, they are claiming that it is a true statement, and you would be right.

  1. “The $240$th digit of $\pi$ after the decimal point is $4$.”
  2. “If a function is differentiable, it must be continuous.”
  3. “$7\gt3$”


  • You don’t have to know whether these statements are true or not to recognize them as claims. An incorrect claim is still a claim.
  • The assertion in (1) is a statement, in this case a false one.  If it claimed the googolth digit was $4$ you would never be able to tell whether it is true or not, but it still would be an assertion intended as a claim.
  • The assertion in (2) uses the standard math convention that an indefinite noun phrase (such as “a widget”) in the subject of a sentence is universally quantified (see also the article about “a” in the Glossary). In the same way, “An integer divisible by $4$ must be even” claims that any integer divisible by $4$ is even. This statement is a claim, and it is true.
  • (3) is a (true) claim in the symbolic language. (Note that “$3 + 4$” is not an assertion at all, much less a claim.)
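As a rough illustration of the universal-quantifier convention, the second example claim and the divisibility example can be written out in the symbolic language. The predicate names $D$ and $C$ are labels introduced here for illustration only; they are not notation used elsewhere in this article.

```latex
% "If a function is differentiable, it must be continuous."
% D(f): f is differentiable;  C(f): f is continuous  (illustrative labels)
\[ \forall f\,\bigl(D(f)\Rightarrow C(f)\bigr) \]

% "An integer divisible by 4 must be even."
\[ \forall n\in\mathbb{Z}\,\bigl(4\mid n \Rightarrow 2\mid n\bigr) \]
```

In both cases the English sentence contains no word such as “all” or “every”; the quantifier comes entirely from the indefinite article.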


Definitions are discussed primarily in the chapter on definitions. A definition is not the same thing as a claim.


The definition

“An integer is even if it is divisible by $2$”

makes the claim

“An integer is even if and only if it is divisible by $2$”


(If you are surprised that the definition uses “if” but the claim uses “if and only if”, see the Glossary article on “if”.)
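A small sketch in the symbolic language of the gap between the wording of the definition and what it asserts. The explicit quantifier over $\mathbb{Z}$ is spelled out here as an assumption about the intended domain.

```latex
% Wording of the definition:  n is even  IF  2 divides n.
% What the definition asserts (both directions):
\[ \forall n\in\mathbb{Z}\,\bigl(n\text{ is even} \iff 2\mid n\bigr) \]

% The same sentence read as an ordinary claim would assert only one direction:
\[ \forall n\in\mathbb{Z}\,\bigl(2\mid n \Rightarrow n\text{ is even}\bigr) \]
```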

Unmarked definitions

Math texts sometimes define something without saying that it is a definition. Because of that, students may sometimes think a claim is a definition.


Suppose that the concept of “even integer” was new to you and the book said, “A number is even if it is divisible by $4$.” Perhaps you thought that this was a definition. Later the book refers to $6$ as even and you pull your hair out wondering why. The statement is a correct claim but an incorrect definition. A good writer would write something like “Recall that a number is even if it is divisible by $2$, so that in particular it is even if it is divisible by $4$.”

On the other hand, you may think a definition is only a claim.


A lecturer may say “By definition, an integer is even if it is divisible by $2$”, and you write down: “An integer is even if it is divisible by $2$”. Later, you get all panicky wondering How did she know that?? (This has happened to me.)

The confusion in the preceding example can also occur if a book says, “An integer is even if it is divisible by $2$” (with “even” in boldface or italics) and you don’t know about the convention that when an author puts a word or phrase in boldface or italics it may mean that they are defining it.

A good writer always labels definitions.


Here are two assertions that contain variables.

  • “$n$ is even.”
  • “$x\gt1$”.

Such an assertion is a constraint (or a condition) if the intent is that the assertion will hold in that part of the text (the scope of the constraint). The part of the text in which it holds is usually the immediate vicinity unless the author explicitly says it will hold in a larger part of the text such as “this chapter” or “in the rest of the book”.

  • Sometimes the wording makes it clear that the phrase is a constraint. So a statement such as “Suppose $3x^2-2x-5\geq0$” is a constraint on the possible values of $x$.
  • The statement “Suppose $n$ is even” is an explicit requirement that $n$ be even and an implicit requirement that $n$ be an integer.
  • A condition for which you are told to find the solution(s) is a constraint. For example: “Solve the equation $3x^2-2x-5=0$”. This equation is a constraint on the variable $x$. “Solving” the equation means saying explicitly which numbers make the equation true.
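To make “saying explicitly which numbers make the equation true” concrete, here is a minimal Python sketch that solves the example equation with the quadratic formula and tests a few values against the inequality constraint from the first bullet. The function names `solve_quadratic` and `satisfies_constraint` are invented for this illustration.

```python
import math

def solve_quadratic(a, b, c):
    """Return the real solutions of a*x**2 + b*x + c == 0 (a != 0), in increasing order."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})

def satisfies_constraint(x):
    """True when x meets the constraint 3x^2 - 2x - 5 >= 0."""
    return 3 * x * x - 2 * x - 5 >= 0

# The equation 3x^2 - 2x - 5 = 0 factors as (3x - 5)(x + 1) = 0.
solutions = solve_quadratic(3, -2, -5)
print(solutions)

# The inequality holds outside the interval between the two solutions.
print([x for x in (-2, -1, 0, 1, 2) if satisfies_constraint(x)])
```

The equation constrains $x$ to exactly two values, $-1$ and $5/3$, while the inequality constrains $x$ to lie outside the open interval between them.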


The constraint may appear in parentheses after an assertion, as a postcondition on that assertion.


“$x^2\gt x\qquad(\text{all }x\gt1)$”

which means that if the constraint “$x\gt1$” holds, then “$x^2\gt x$” is true. In other words, for all $x\gt1$, the statement $x^2\gt x$ is true. In this statement, “$x^2\gt x$” is not a constraint, but a claim which is true when the constraint is true.
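Spelled out symbolically, the displayed statement is an implication under a universal quantifier. The short derivation after it is one way to see why the claim holds once the constraint is assumed; it is added here as an illustration.

```latex
\[ \forall x\,\bigl(x\gt 1 \Rightarrow x^2\gt x\bigr) \]

% Why the claim holds under the constraint:
\[ x\gt 1 \;\Rightarrow\; x\gt 0 \text{ and } x-1\gt 0
   \;\Rightarrow\; x^2-x = x(x-1)\gt 0 . \]
```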

Creative Commons License

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.



This is a revised draft of the article on context in math texts.


Written and especially spoken language depends heavily on the context – the physical surroundings, the preceding conversation, and social and cultural assumptions.  Mathematical statements are produced in such contexts, too, but here I will discuss a special thing that happens in math conversation and writing that does not seem to happen much in other sorts of discourse:

The meanings of expressions
in both the symbolic language and math English
change from phrase to phrase
as the speaker or writer changes the constraints on them.


In a math text, before the occurrence of a phrase such as “Let $n=3$”, $n$ may be known only as an integer variable.  After the phrase, it means specifically $3$.  So this phrase changes the meaning of $n$ by constraining $n$ to be $3$.  We say the context of occurrences of “$n$” before the phrase requires only that $n$ be an integer, but after the occurrence the context requires $n=3$.


In this article, the context at a particular location in mathematical discourse is the sum total of what the reader or listener can know about the symbols and names used in the discourse when they have read everything up to that location.


  • Each clause can change the meaning of or constraints on one or more symbols or names. The conventions in effect during the discourse can also put constraints on the symbols and names.
  • Chierchia and McConnell-Ginet give a mathematical definition of context in the sense described here.
  • The references to “before” and “after” the phrase “Let $n=3$” refer to the physical location in text and to actual time in spoken math. There is more about this phenomenon in the Handbook of Mathematical Discourse, page 252, items (f) and (g).
  • Contextual changes of this sort take place using the pretense that you are reading the text in order, which many students and professionals do not do (they are “grasshoppers”).
  • I am not aware of much context-changing in everyday speech. One place it does occur is in playing games. For example, during some card games the word “trumps” changes meaning from time to time.
  • In symbolic logic, the context at a given place may be denoted by “$\Gamma$”.
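The idea of a context that changes clause by clause can be sketched as a small data structure. This is only a toy model under the assumption that constraints are stored as plain strings; the class name `Context` and its methods are invented for this illustration and are not taken from any of the cited works.

```python
class Context:
    """Toy model of a discourse context: what is known about each symbol so far."""

    def __init__(self):
        self.constraints = {}  # symbol -> list of constraint strings, in reading order

    def constrain(self, symbol, constraint):
        """Record a new constraint on a symbol, as when a clause is read."""
        self.constraints.setdefault(symbol, []).append(constraint)

    def known_about(self, symbol):
        """Everything the reader knows about the symbol at this point."""
        return list(self.constraints.get(symbol, []))

    def release(self, symbol):
        """Forget the symbol, as when the scope of its constraints ends."""
        self.constraints.pop(symbol, None)

ctx = Context()
ctx.constrain("n", "n is an integer")   # e.g. "Let n be an integer"
before = ctx.known_about("n")
ctx.constrain("n", "n = 3")             # "Let n = 3"
after = ctx.known_about("n")
print(before)
print(after)
```

Reading “Let $n=3$” does not replace the earlier knowledge that $n$ is an integer; it adds to it, so the context after the phrase is strictly richer than the context before it.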

Detailed example of a math text

Here is a typical example of a theorem and its proof.  It is printed twice, the second time with comments about the changes of context.  This is the same proof that is already analyzed practically to death in the chapter on presentation of proofs.

First time through

Definition: Divides

Let $m$ and $n$ be integers with $m\ne 0$. The statement “$m$ divides $n$” means that there is an integer $q$ for which $n=qm$.


Theorem

Let $m$, $n$ and $p$ be integers, with $m$ and $n$ nonzero, and suppose $m$ divides $n$ and $n$ divides $p$.  Then $m$ divides $p$.


Proof

By definition of divides, there are integers $q$ and $q'$ for which $n=qm$ and $p=q'n$. We must prove that there is an integer $q''$ for which $p=q''m$. But $p=q'n=q'qm$, so let $q''=q'q$.  Then $p=q''m$.

Second time, with analysis

Definition: Divides

Begins a definition. The word “divides” is the word being defined. The scope of the definition is the following paragraph.

Let $m$ and $n$ be integers

$m$ and $n$ are new symbols in this discourse, constrained to be integers.

with $m\ne 0$

Another constraint on $m$.

The statement “$m$ divides $n$” means that

This phrase means that what follows is the definition of “$m$ divides $n$”.

there is an integer $q$

“There is” signals that we are beginning an existence statement and that $q$ is the bound variable within the existence statement.

for which $n=qm$

Now we know that “$m$ divides $n$” and “there is an integer $q$ for which $n=qm$” are equivalent statements.  Notes: (1) The first statement would only have implied the second statement if this had not been in the context of a definition. (2) After the conclusion of the definition, $m$, $n$ and $q$ are undefined variables.


Theorem

This announces that the next paragraph is a statement that has been proved. In fact, in real time the statement was proved long before this discourse was written, but in terms of reading the text in order, it has not yet been proved.

Let $m$, $n$ and $p$ be integers,

“Let” tells us that the following statement is the hypothesis of an implication, so we can assume that $m$, $n$ and $p$ are all integers.  This changes the status of $m$ and $n$, which were variables used in the preceding paragraph, but whose constraints disappeared at the end of the paragraph.  We are starting over with $m$ and $n$.

with $m$ and $n$ nonzero,

This clause is also part of the hypothesis. We can assume $m$ and $n$ are constrained to be nonzero.

and suppose $m$ divides $n$ and $n$ divides $p$.

This is the last clause in the hypothesis. We can assume that $m$ divides $n$ and $n$ divides $p$.

Then $m$ divides $p$.

This is a claim that $m$ divides $p$. It has a different status from the assumptions that $m$ divides $n$ and $n$ divides $p$. If we are going to follow the proof we have to treat $m$ and $n$ as if they divide $n$ and $p$ respectively. However, we can’t treat $m$ as if it divides $p$. All we know is that the author is claiming that $m$ divides $p$, given the facts in the hypothesis.


Proof

An announcement that a proof is about to begin, meaning a chain of math reasoning. The fact that it is a proof of the Theorem just stated is not explicitly stated.

By definition of divides, there are integers $q$ and $q'$ for which $n=qm$ and $p=q'n$.

The proof uses the direct method (rather than contradiction or induction or some other method) and begins by rewriting the hypothesis using the definition of “divides”. The proof does not announce the use of these techniques; it just starts in doing it. So $q$ and $q'$ are new symbols that satisfy the equations $n=qm$ and $p=q'n$. The phrase “by definition of divides” justifies the introduction of $q$ and $q'$. $m$, $n$ and $p$ have already been introduced in the statement of the Theorem.

We must prove that there is an integer $q''$ for which $p=q''m$.

Introduces a new variable $q''$ which has not been given a value. We must define it so that $p=q''m$; this requirement is justified (without saying so) by the definition of “divides”.

But $p=q'n=q'qm$,

This is a claim about $p$, $q$, $q'$, $m$ and $n$.  It is justified by certain preceding sentences but this justification is not made explicit. Note that “$p=q'n=q'qm$” pivots on $q'n$, in other words makes two claims about it.

so let $q''=q'q$.

We have already introduced $q''$; now we give it the value $q''=q'q$.

Then $p=q''m$.

This is an assertion about $p$, $q''$ and $m$, justified (but not explicitly; note the hidden use of associativity) by the previous claim that $p=q'n=q'qm$.


The proof is now complete, although no statement asserts that it is.
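For comparison, here is the chain of equalities in the proof with every justification written out, including the associativity step that the text leaves implicit. Primes are written as $q'$ and $q''$; the layout is a sketch added for illustration, not part of the original proof.

```latex
\begin{align*}
p &= q'n      && \text{definition of divides, applied to “$n$ divides $p$”}\\
  &= q'(qm)   && \text{substituting } n=qm \text{ (definition of divides, applied to “$m$ divides $n$”)}\\
  &= (q'q)m   && \text{associativity of multiplication}\\
  &= q''m     && \text{setting } q''=q'q
\end{align*}
```

Since $q''=q'q$ is a product of integers, it is an integer, which is what the definition of “$m$ divides $p$” requires.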


If you have some skill in reading proofs, all the stuff in the commentary paragraphs above happens in your brain without, for the most part, your being conscious of it.


Thanks to Chris Smith for correcting errors.

References for “context”

Chierchia, G. and S. McConnell-Ginet (1990), Meaning and Grammar. The MIT Press.

de Bruijn, N. G. (1994), “The mathematical vernacular, a language for mathematics with typed sets”. In Selected Papers on Automath, Nederpelt, R. P., J. H. Geuvers, and R. C. de Vrijer, editors, volume 133 of Studies in Logic and the Foundations of Mathematics, pages 865–935. Elsevier.

Steenrod, N. E., P. R. Halmos, M. M. Schiffer, and J. A. Dieudonné (1975), How to Write Mathematics. American Mathematical Society.


The Mathematics Depository: A Proposal


This post is about taking texts written in mathematical English and the symbolic language and encoding them in a formal language that could be tested by an automated proof verifier. This is a very difficult undertaking, but we could get closer and closer to a working system by a worldwide effort continuing over, probably, decades. The system would have to contain many components working together to create incremental improvements in the process.

This post, which is a first draft, outlines some suggestions as to how this could work. I do not discuss the encoding required, which is not my area of expertise. Yes, I understand that coding is the hard part!

Much work has been done by computing scientists in developing proof checking and proof-finding programs. Work has also been done, primarily by math education workers but also by some philosophers and computing scientists, in uncovering the many areas where ordinary math language is ambiguous and deviates from ordinary English usage. These characteristics confuse students and also make it hard to design a program that can interpret the language. I have been working in that area mostly from the math ed point of view for the last twenty years.

The Reference section lists many references to the problem of parsing mathematical English, some from the point of view of automatic translation of math language into code, but most from the point of view of helping students understand how to understand it.

The Mathematics Depository

I imagine a system for converting documents written in math language into machine-readable language and testing their claims. An organization, call it the Mathematics Depository, would be developed that is supported by many countries, organizations and individual supporters. It should consist of several components listed below, no doubt with other components as we become aware of needing them. The organization would be tasked with supporting and improving these components over time.

The main parts of the system

Each component is described in more detail later in this post.

  • A Proof Verifier (PV), that inputs a proof and determines if it is correct.
  • A specification of a supported subset of Mathematical English and the symbolic language, that I will call Strict Math English (SME).
  • A Text-SME Converter, a program that would input a text written in ordinary math English that has been annotated by a knowledgeable person and convert it into SME.
  • An SME-PV Converter that will convert text written in SME into code that can be directly read by the Proof Verifier.
  • One or more Automatic Theorem Provers, that to begin with can take fairly simple conjectures written in SME and sometimes succeed in proving them.
  • An Annotation System containing an Annotation Editor that would allow a person to use SME to annotate an article written in ordinary math English so that it could be read by the Text-SME Converter.
  • A Data Base that would include the texts that have been collected in this endeavor, along with the annotations and the results of the proof checking.
  • A Data Base Miner that would watch for patterns in the annotations as new papers were submitted. The operators might also program it to watch for patterns in other aspects of the operation.

These facilities would be organized so that the systems work together, with the result that the individual components I named improve over time, both automatically and via human intervention.

Flow of Work

  1. A math text is submitted.
  2. If it is already in Strict Math English (SME), it is input to the Proof Verifier (PV).
  3. Otherwise, the math text is input into the Annotation System.
  4. The resulting annotated text is input into the Text-SME Converter.
  5. The output of the Text-SME Converter is input into the Proof Verifier.
  6. The PV incorporates each definition in the text into the context of the math text. This is a specific meaning of the word “context”, including a list of the status of variables (bound, unbound, type, and so on), meanings of technical words, and other facts created in the text. “Context” is described informally in my article on context in math texts. That article gives references to the formal literature.
  7. In my experience mathematicians spend only a little time reading arguments step by step as described in the Context article. They usually look at a theorem and try to figure it out themselves, “cheating” occasionally by glancing at parts of the proof.

  8. Each mathematical assertion in the text is marked as a claim.
  9. The checking process records those claims occurring in the proof that are not proved in the text, along with any references given to other texts.
  10. If a reference to a result in another text is made, the PV looks for the result in the Database. If it does not find it, the PV incorporates the result and its location in the Database as an externally proven but untested claim.
  11. If no reference or proof for a claim is given, the PV checks the Database to see if it has already been proved.
  12. Any claim in the current text not shown as proven in the Database is submitted to the Automatic Theorem Prover (ATP). The output of the ATP is put in the database (proved, counterexample found, or unable to determine truth).
  13. If a segment of text is presented as a proof, it is input into the PV to be verified.
  14. The PV reports the result for each claimed proof, which can be one of several possibilities:
    • A counterexample for a proof is found, so the claim that the proof was supposed to establish is false.
    • The proof contains gaps, so the claim is unsettled.
    • The proof is reported as correct.
  15. At the end of the process, all the information gathered is put into the Database:
    • The original text showing all the annotations.
    • The text in SME.
    • All claims, with their status (proven true, proven false, truth unknown, reference if one was given).
    • Every proof, with its status and the entire context at each step of the proof.
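The claim-handling part of this flow (roughly steps 9 through 15) can be sketched in a few lines of Python. Everything here is hypothetical: the `Claim` record, the `process` function, the status strings, and the two callables standing in for the Proof Verifier and an Automatic Theorem Prover are invented for illustration, and a real system would be far more elaborate.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

@dataclass
class Claim:
    statement: str
    proof: Optional[str] = None       # text presented as a proof, if any
    reference: Optional[str] = None   # citation of another text, if any
    status: str = "truth unknown"

def process(claims: List[Claim],
            database: Dict[str, str],
            verify_proof: Callable[[str, str], bool],
            auto_prove: Callable[[str], Optional[bool]]) -> List[Claim]:
    """Assign a status to each claim and record it in the database."""
    for claim in claims:
        if claim.proof is not None:
            # A segment presented as a proof goes to the (stand-in) verifier.
            verified = verify_proof(claim.statement, claim.proof)
            claim.status = "proven true" if verified else "truth unknown"
        elif database.get(claim.statement) == "proven true":
            claim.status = "proven true"
        elif claim.reference is not None:
            claim.status = "externally proven but untested"
        else:
            # No proof, no reference, not in the database: try the (stand-in) prover.
            outcome = auto_prove(claim.statement)  # True, False, or None
            claim.status = {True: "proven true", False: "proven false"}.get(
                outcome, "truth unknown")
        database[claim.statement] = claim.status
    return claims

# Usage with trivial stand-ins for the verifier and the prover.
db = {"2 divides 4": "proven true"}
claims = [
    Claim("2 divides 4"),
    Claim("m divides p", proof="p = q'n = q'qm"),
    Claim("a cited lemma", reference="another paper"),
    Claim("an open conjecture"),
]
process(claims, db, verify_proof=lambda s, p: True, auto_prove=lambda s: None)
statuses = [c.status for c in claims]
print(statuses)
```

The sketch keeps the three-way outcome of the theorem prover (proved, counterexample found, unable to determine truth) separate from the verifier's yes-or-no answer, since the flow above treats them differently.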


The proof verifier

  • Proof checking programs have been developed over the last thirty or so years. The MD should write or adapt one or more Proof Verifiers and improve them incrementally as a result of experience in running the system. In this post I have assumed the use of just one Proof Verifier.
  • The Proof Verifier should be designed to read the output of the SME-PV converter.
  • The PV must read a whole math text in SME, identify and record each claim and check each proof (among other things). This is different from current proof verifiers, which take exactly one proof as input.
  • The PV must create the context of each proof and change it step by step as it reads each syntactic fragment of the math text.
  • Typically the context for a claimed proof is built up in the whole math text, not just in the part called “Proof”.
  • The PV should automatically query the Data Base for unproved steps in a proof in the input text to see if they have already been verified somewhere else. These results should be quoted in a proof verifier output.
  • The PV should also automatically submit steps in the proof that haven’t been verified to the Automatic Theorem Provers and wait for the step to be verified or not.
  • The Proof Verifier should output details of the result of the checking, whether or not it succeeded in verifying the whole input text. In particular, it should list the steps in proofs that it failed to verify, including steps for which the input text cited a proof in some other paper, whether or not that paper is in the MD system.
  • The Proof Verifier should be available online for anyone to submit, in SME, a mathematical text claiming to prove a theorem. Submission might require a small charge.

Strict Math English

  • One of the most important aspects of the system would be the simultaneous incremental updating of the SME and the SME-PV Converter.
  • The idea is that SME would get more and more inclusive of the phrases and clauses it allows.

Example: Universal Assertions

At the start SME might allow these statements to be recognized as the same universal assertion:

  • “$\forall x(x^2+1\gt0)$”
  • “For all [every, any] $x$, $x^2+1\gt0$.” (universality asserted using an English word.)
  • “For all [every, any] $x$, $x^2+1$ is positive.”

As time goes on, a person or the Data Base Miner might detect that many annotators also recognized these statements as saying the same thing:

  • “$x^2+1\gt0\quad(\text{all } x)$” (as a displayed statement)
  • “$x^2+1$ is positive for every $x$.” Universality asserted using an adjective in a postposited phrase.
  • “$x^2+1$ is always positive.” Universality hidden in a postposited adverb that seems to be referring to time!
  • There are more examples in my article Universally True Assertions. See also Susanna Epp’s article on quantification for other problems in this area.

These other variations would then be added to Strict Math English. (This is only an example of how the system would evolve. I have no doubt that in fact all the terminology mentioned above would be included at the outset, since they are all documented in the math ed literature.)
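As a very rough sketch of what “recognized as the same universal assertion” could mean mechanically, the following Python code maps a few plain-text surface forms to one canonical triple. The patterns, the function name `normalize`, and the plain-text input format are all assumptions made for this illustration; a real SME specification would need a proper grammar rather than regular expressions.

```python
import re

# Each pattern captures the bound variable (var) and the asserted body.
PATTERNS = [
    # "For all x, BODY."  /  "Every x, BODY."
    re.compile(r"^(?:for )?(?:all|every|any) (?P<var>\w+), (?P<body>.+?)\.?$", re.IGNORECASE),
    # "BODY (all x)"
    re.compile(r"^(?P<body>.+?) \(all (?P<var>\w+)\)\.?$", re.IGNORECASE),
    # "BODY for every x."
    re.compile(r"^(?P<body>.+?) for (?:all|every|any) (?P<var>\w+)\.?$", re.IGNORECASE),
]

def normalize(sentence):
    """Return ('forall', var, body) for a recognized universal form, else None."""
    text = sentence.strip()
    for pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return ("forall", match.group("var"), match.group("body"))
    return None

variants = [
    "For all x, x^2+1 > 0.",
    "x^2+1 > 0 (all x)",
    "x^2+1 > 0 for every x.",
]
print([normalize(v) for v in variants])
print(normalize("x^2+1 is always positive."))  # not recognized by these patterns
```

The last sentence is not matched, which mirrors the situation described above: a form such as “always” would have to be added to the recognized patterns once annotators have repeatedly marked it as universal.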

Even at the start, SME will include phrases and clauses in the English language as well as symbolic expressions. It is notorious that automatically parsing general English sentences is difficult and that the ubiquity of metaphors makes it essentially impossible to reliably construct the meaning of a sentence. That is why SME must start with a very narrow subset of math English. But even in early days, it should include some stereotyped metaphors, such as using “always” in universal assertions.

The SME-PV Converter

  • The SME-PV Converter would read documents written in SME and convert them into code readable by the proof checking program, as well as by the automatic theorem provers.
  • Such a program is essentially the subject of Ganesalingam’s book.
  • Converting SME so that the Proof Verifier can handle it involves lots of subtleties. For example, if the text says, “For any $x$, $x^2+1\gt0$”, the translation has to recognize not only that this is a universally quantified statement with $x$ as the bound variable, but that $x$ must be a real number, since complex numbers don’t do greater-than.
  • Frequent revisions of the SME-PV Converter will be necessary since its input language, the SME, will be constantly expanded.
  • It may be that the output language of the SME-PV Converter (which the Proof Verifier and Automatic Theorem Provers read) will require only infrequent revisions.
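The typing subtlety in the “For any $x$” example can be made explicit in the symbolic language. The following is a sketch of the translation a converter would need to produce, with the domain $\mathbb{R}$ supplied as the inferred type.

```latex
% Surface form in the text:  "For any x, x^2+1 > 0"
% Translation with the implicit type made explicit:
\[ \forall x\in\mathbb{R}\,\bigl(x^2+1\gt 0\bigr) \]

% The order relation is what forces the type. Over the complex numbers the
% statement is not even well formed, and for x = i the left side is 0:
\[ i^2+1 = 0 . \]
```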

The Automatic Theorem Provers

  • The system could support several ATP’s, each one adapted to read the output of the SME-PV Converter.
  • The Automatic Theorem Provers should provide output in such a way that the Proof Verifier can include in its report the positive or negative results of the Theorem Prover in detail.

The Annotation System

  • The Annotation system would facilitate construction of a data structure that connects each annotation to the specific piece of text it rewrites. The linking should be facilitated by the Annotation Editor.
  • For example, an annotation that is meant to explain that the statement (in the input text) “$x^2+1$ is always greater than $0$” is to be translated as “$\forall x(x^2+1\gt0)$” (which is presumably allowed by SME) should cause the first statement to be linked to the second statement. The first statement, the one in the input text, should not be changed. This will enable the Data Base Miner to find patterns of similar text being annotated in similar ways.
  • The annotations should clarify words, symbolic expressions and sentences in the input text to allow the Proof Verifier to input them correctly.
  • In particular, every claim that a statement is true should be marked as a proposed theorem, and similarly every proof should be marked as a proof and every definition should be marked as a definition. Such labeling is often omitted in the math literature. Annotators would have to recognize segments of the text as claims, proofs and definitions and annotate them as such.
  • The annotations would be written in the current version of Strict Math English. Since SME is frequently updated, the instructions for the annotator would also have to be frequently updated.
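One possible shape for the data structure that links an annotation to the piece of text it rewrites is a record holding character offsets into the unchanged source. The sketch below is an assumption about how this might be done; the `Annotation` class, its field names, and the example sentence are invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    start: int   # offset of the first character of the annotated span
    end: int     # offset one past the last character of the span
    kind: str    # e.g. "claim", "definition", "proof"
    sme: str     # the rendering in Strict Math English supplied by the annotator

def annotated_span(text, annotation):
    """Return the exact piece of source text that the annotation refers to."""
    return text[annotation.start:annotation.end]

source = "Clearly x^2+1 is always greater than 0."
phrase = "x^2+1 is always greater than 0"
start = source.index(phrase)
note = Annotation(start, start + len(phrase), "claim", "forall x (x^2+1 > 0)")

print(annotated_span(source, note))
print(note.sme)
```

Because the annotation stores offsets rather than edited text, the input text itself is never changed, and a miner can later compare the original wording with the SME rendering.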


  • If a paper used the word “domain” without defining it, the annotator would clarify whether it meant an open connected set, a type of ring, a type of poset, or the domain of a function. See Example 1.
  • Annotators will note instances in which the same text will use a symbol with two different meanings. See Example 2.
  • In a phrase, a single occurrence of a symbol can require an annotation that assigns more than one attribute to the symbol. See Example 3.

The Annotation Editor

  • The annotators should be provided with an Annotation Editor designed specifically for annotation.
  • The editor should include a system for linking an annotation to the exact phrase it annotates. The linking should be easy to follow for a person reading the annotated document, and it should also provide the information the Text-SME Converter needs.

The Annotators

  • Great demands will be made of an annotator.
  • They must understand the detailed meaning of the text they annotate. This means they must be quite familiar with the field of math the text is concerned with.
  • They must learn SME. I know for a fact that many mathematicians are not good at learning foreign languages. It will help that SME will be a subset of the full language of math.
  • All this means that annotators must be chosen carefully and paid well. This means that not very many papers will get annotated by paid annotators, so that there will have to be some committee that chooses the papers to be annotated. This will be a genuine bottleneck.
  • One thing that will help in the long run is that the SME should evolve to include more features of the general language of math, so many mathematicians will actually write their papers in SME and submit them directly to the Depository. (“Long run” may mean more than ten years.)

The Text-to-SME Converter

  • This converter takes a math text in ordinary Math English that has been annotated and converts it into SME.
  • The format for feeding it to the Automatic Theorem Prover may very well have to be different from the format to be read by a human. Both formats should be saved.

The Data Base

  • The Data Base would contain all math papers that have been run through the Proof Verifier, along with the results found by the Proof Verifier. A paper should be included whether or not every claim in the paper was verified.
  • Funding agencies (and private individuals) might choose particularly important papers and pay more money for annotation for those than for other papers.
  • Mathematicians in a particular field could be hired to annotate particular articles in their field, using a standard annotation language that would develop through time.
  • The annotated papers would be made freely available to the public.
  • It will no doubt prove useful for the Data Base to contain many other items. Possibilities:
    • A searchable list of all theorems that have been verified.
    • A glossary: a list of math words that have been defined in the papers in the Depository. This will include synonyms and words with multiple meanings.

The Data Base Miner

Watch for patterns

The DBM would watch for patterns in annotation as new annotated papers were submitted. It should probably look only at annotated papers whose proofs had been verified. The patterns might include:

  • Correlation between annotations that associate particular meanings to particular words or symbols with the branch of math the paper belongs to. See Example 1.
  • Noting that a particular format of combining symbols usually results in the same kind of annotation. See Example 4.
  • Providing data in such a way that lexicographers studying math English could make use of them. My Handbook began with my doing lexicographical research on math English, but I found it so slow that when I started I resolved not to do such research any more. Nevertheless, it needs to be done and the Database should make the process much easier.

Statistical translation

Since the annotated papers will be stored in the Data Base, the Data Base Miner could use the annotations in somewhat the same way some language translators work (in part): to translate a phrase, it will find occurrences of the phrase in the source language that have been translated into the target language and use the most common translation. In this case the source language is the paper (in English) and the target language is in annotated math English readable by the Proof Verifier. Once the Database includes most of the papers ever published (twenty years from now?), statistical translation might actually become useful.
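The “use the most common translation” idea can be sketched with a frequency table. The phrase pairs below are made-up sample data, and the function names `build_table` and `translate` are invented for this illustration; actual statistical translation systems are considerably more sophisticated.

```python
from collections import Counter, defaultdict

def build_table(pairs):
    """Count how often each source phrase was annotated with each SME rendering."""
    table = defaultdict(Counter)
    for source_phrase, sme in pairs:
        table[source_phrase][sme] += 1
    return table

def translate(table, phrase):
    """Return the most frequently used SME rendering of the phrase, or None if unseen."""
    if phrase not in table:
        return None
    return table[phrase].most_common(1)[0][0]

# Hypothetical (source phrase, annotation) pairs gathered from annotated papers.
pairs = [
    ("sin^2 x", "(sin x)^2"),
    ("sin^2 x", "(sin x)^2"),
    ("f^2(x)", "f(f(x))"),
    ("f^2(x)", "f(x)^2"),
    ("f^2(x)", "f(f(x))"),
]
table = build_table(pairs)
print(translate(table, "sin^2 x"))
print(translate(table, "f^2(x)"))
print(translate(table, "g^2(x)"))
```

A phrase that has always been annotated the same way translates reliably, while a phrase with competing annotations gets only the majority reading, which is exactly where human annotators would still be needed.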


Example 1: Meaning varies with branch of math

  • “Field” means one thing in an algebra paper and another in a mathematical physics paper.
  • “Domain” means
    • An open connected set in topology.
    • A type of ring in algebra.
    • A type of poset in theoretical computing science.
    • The domain of a function, everywhere in math, which makes it seem that this is going to be very hard to distinguish without human help!
  • “Log” usually implies base $2$ in the computing world, base $10$ in engineering (but I am not sure how prevalent this meaning is there), and base $e$ in pure math. With exceptions!

Example 2: Meaning varies even in the same article

    • The notation “$(a,b)$” can mean an ordered pair, an open interval, or the GCD. What’s worse, there are many instances where the symbol is used without definition. Citation 139 in the Handbook provides a single sentence in which the last two meanings both occur:

      $\dots$ Richard Darst and Gerald Taylor investigated the differentiability of functions $f^p$ (which for our purposes we will restrict to $(0,1)$) defined for each $p\geq1$ by
      \[F(x):=
      \begin{cases}
      0 & \text{if }x\text{ is irrational}\\
      \displaystyle{\frac{1}{n^p}} & \text{if }x = \displaystyle{\frac{m}{n}}\text{ with }(m,n)=1
      \end{cases}\]

      The sad thing is that any mathematician will know immediately what each occurrence means. This may be a case where the correct annotation will never be automatically detectable.

    Example 3: One mention of a symbol may require several meanings

    In the sentence, “This infinite series converges to $\zeta(2)=\frac{\pi^2}{6}\approx 1.65$,” the annotator would provide two pieces of information about “$\frac{\pi^2}{6}$”, namely that it is both the right constituent of the equation “$\zeta(2)=\frac{\pi^2}{6}$” and the left constituent of the approximation statement “$\frac{\pi^2}{6}\approx 1.65$” — and that these two statements were the constituents of an asserted conjunction. (See my post Pivoted symbols.)

    Example 4: Function to a power

    Some expressions not in the SME will almost always be annotated in the same way. This makes them discoverable by the Data Base Miner.

    • “$\sin^{-1}x$” always means $\arcsin x$.
    • For positive $n$, “$\sin^n x$” always means $(\sin x)^n$. It never means the $n$-fold application of $\sin$ to $x$.
    • In contrast, for an arbitrary function symbol, $f^n(x)$ will often be annotated as $n$-fold application of $f$ and also often as $f(x)^n$. (And maybe those last two possibilities are correlated by branch of math.)
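The two readings of a function raised to a power can be compared numerically. The following is a minimal Python sketch; the helper names `power_of_value` and `iterate` are hypothetical, introduced here only for illustration.

```python
import math

def power_of_value(f, n, x):
    """Read f^n(x) as (f(x))^n."""
    return f(x) ** n

def iterate(f, n, x):
    """Read f^n(x) as the n-fold application f(f(...f(x)...))."""
    for _ in range(n):
        x = f(x)
    return x

x = 1.0
print(power_of_value(math.sin, 2, x))  # (sin 1)^2, the usual meaning of sin^2 x
print(iterate(math.sin, 2, x))         # sin(sin 1), a different number
```

The two results differ, which is why an annotator has to record which reading is intended for an arbitrary function symbol.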


    I believe that work in formal verification has tended to overlook the work on math language difficulties in math ed, so I have included some articles from that specialty.

    The following are posts from my blog Gyre&Gimble. They are in reverse chronological order.

    Creative Commons License

    This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.


    Pivoted symbols

    This post describes one problem that someone new to abstract math who is reading mathematical papers might find difficult to understand. Any attempt at machine translation of mathematical prose into computer-readable code would also have to take this phenomenon into account. The math ed literature describes many other problems like this for readers, some of which cause much more trouble than this example.

    I referred to the phenomenon described below as a parenthetic assertion in these publications:

    A symbol is pivoted if it is embedded just once in an expression that, when spelled out explicitly, requires it to appear twice because it plays two different roles in two different phrases or clauses. The examples below convince me that “pivoted symbol” is a better name than “parenthetic assertion”.


    The expression “$1\lt x\lt 2$” can be spelled out in these ways:

    1. “$1$ is less than $x$, which is less than $2$”.
    2. “$1$ is less than $x$ and $x$ is less than $2$”.

    In both cases, $x$ appears just once in the expression, but any reasonable rendering requires it to appear in two clauses. (The word “which” in the first statement refers to $x$ according to the rules for anaphora in English, so “which” is a homonym for $x$ there.)
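Some programming languages accept the pivoted form directly. Python is one of them: a chained comparison such as `1 < x < 2` is defined to mean the same as `1 < x and x < 2`. The sketch below (function names are hypothetical) writes the condition both ways and checks that they agree.

```python
def between_chained(x):
    # x is written once but takes part in two comparisons.
    return 1 < x < 2

def between_spelled_out(x):
    # The same condition with the pivoted symbol copied into both clauses.
    return 1 < x and x < 2

for value in (0.5, 1.5, 2.5):
    assert between_chained(value) == between_spelled_out(value)
```

A translator into a language without chained comparisons would have to produce the second form, copying $x$.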


    “For any $x\gt0$ there is a $y\gt0$ such that $x\gt y$.”

    • This is mathematical shorthand for: “For any $x$ that is greater than $0$ there is a $y$ that is greater than $0$ such that $x$ is greater than $y$.”
    • The phrase “For any $x$ that is greater than $0$” conveys two pieces of information:
      • $x$ is bound by a universal quantifier.
      • $x$ is a variable in a phrase constraining it.
    • The constraint on $x$ to be bigger than $0$ fills the slot of a subordinate clause playing the role of an adjective, with $x$ as head. The word “that” is anaphoric.
    • The situation for $y$ in this expression is similar.
    • The elimination of “that” (which is one occurrence of $x$) in the expressions “$x$ greater than $0$” and “$x\gt 0$” fits a common pattern in English of omitting “that” or “that is”. Consider: “I saw a house bigger than the White House.” So calling the statement a “parenthetic assertion” doesn’t seem to fit the situation very well.
    • That’s why I have titled this post “Pivoted symbols”. The key to this name is that a translation into a formal language is going to have to encode two facts about $x$: It is bound by a quantifier and it is constrained to be greater than $0$. It has to be copied to accomplish this.
    • There are similar examples in Contrapositive grammar, by David Butler.

    “This infinite series converges to $\zeta(2)=\frac{\pi^2}{6}\approx 1.65$.”

    This example can be read in two ways that are different in English grammar but have the same logical content:

    • “This infinite series converges to $\zeta(2)$, which is $\frac{\pi^2}{6}$, which is approximately $1.65$.”
    • “$\ldots$ converges to $\zeta(2)$, and $\zeta(2)=\frac{\pi^2}{6}$, and $\frac{\pi^2}{6}$ is approximately $1.65$.”

    “Let us return for a moment to the circle $S^1\subseteq \mathbb{C}=\mathbb{R}^2$.” (Citation 426 in the Handbook of mathematical discourse.)


    is a sum of $2n$th powers of elements in $\mathbb{Q}(t)$ for all $n$. (Citation 332 in the Handbook of mathematical discourse.)

    Science-fiction fans used to do something similar with words instead of clauses, writing “yed” for “ye editor” and “scientifiction” for “scientific fiction”.


    The real numbers

    My website contains separate short articles about certain number systems (natural numbers, integers, rationals, reals). The intent of each article is to discuss problems that students have when they begin studying abstract math. The articles do not give complete coverage of each system. They contain links when concepts are mentioned that the reader might not be familiar with.

    This post is a revision of the article on the real numbers. The other articles have also been recently revised.


    A real number is a number that can be represented as a (possibly infinite) decimal expansion, such as 2.56, -3 (which is -3.0), 1/3 (which has the infinite decimal expansion 0.333…), and $\pi$. Every integer and every rational number is a real number, but numbers such as $\sqrt{2}$ and $\pi$ are real numbers that are not rational.

    • I will not give a mathematical definition of “real number”.  There are several equivalent definitions of real number all of which are quite complicated.   Mathematicians rarely think about real numbers in terms of these definitions; what they have in mind when they work with them are their familiar algebraic and topological properties.
    • “Real number” is a technical term.  Real numbers are not any more “genuine” than any other numbers.
    • Integers and rational numbers are real numbers, but there are real numbers that are not integers or rationals. One such number is $\sqrt{2}$. Such numbers are called irrational numbers.

    Properties of the real numbers


    The real numbers are closed under addition, subtraction, and multiplication, as well as division by a nonzero number.

    Notice that these are exactly the same arithmetic closure properties that rational numbers have. In the previous sections in this chapter on numbers, each new number system (natural numbers, integers and rational numbers) was closed under more arithmetic operations than the earlier ones. We don’t appear to have gained anything concerning arithmetic operations in going from the rationals to the reals.

    The real numbers do allow you to find zeroes of some polynomials that don’t have rational zeroes. For example, the equation $x^2-2=0$ has the root $x=\sqrt{2}$, which is a real number but not a rational number. However, you get only some zeroes of polynomials by going to the reals — consider the equation $x^2+2=0$, which requires going to the complex numbers to get a root.

    Closed under limits

    The real numbers are closed under another operation (not an algebraic operation) that rational numbers are not closed under:

    The real numbers are closed under taking limits. That fact is the primary reason real numbers are so important in math, science and engineering.

    Consider: The concepts of continuous function, derivative and integral — the basic ideas in calculus and differential equations — are all defined in terms of limits. Those are the basic building blocks of mathematical analysis, which provides most of the mathematical tools used by scientists and engineers.
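To see concretely that the rationals are not closed under limits, one can build a sequence of rational numbers whose limit is $\sqrt{2}$. The sketch below uses Newton's iteration $x_{k+1}=(x_k+2/x_k)/2$; that choice of sequence is an assumption made for illustration and is not taken from the article. Python's `fractions.Fraction` keeps every iterate exactly rational.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Return the rational Newton iterates x_{k+1} = (x_k + 2/x_k)/2, starting at 1."""
    x = Fraction(1)
    iterates = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        iterates.append(x)
    return iterates

for x in newton_sqrt2(4):
    # Each x is an exact fraction; x*x - 2 shrinks but is never 0.
    print(x, float(x * x - 2))
```

Every term of the sequence is rational, yet the number the terms approach is $\sqrt{2}$, which is real but not rational.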

    Some images and metaphors for real numbers

    Line segments

    The length of any line segment is given by a positive real number.



    The diagonal of the square above has length $2\sqrt{2}$.

    Directed line segments

    Measuring directed line segments requires the use of negative real numbers as well as positive ones. You can regard the diagonal above as a directed line segment. If you regard “left to right” as the positive direction (which is what we usually do), then if you measure it from right to left you get $-2\sqrt{2}$.

    Real numbers are quantities

    Real numbers are used to measure continuous variable quantities.

    • The temperature at a given place and a given time.
    • The speed of a moving car.
    • The amount of water in a particular jar.


    • Temperature, speed, volume of water are thought of as quantities that can change, or be changed, which is why I called them “variable” quantities.
    • The name “continuous” for these quantities indicates that the quantity can change from one value to another without “jumping”. (This is a metaphor, not a mathematical definition!)

    If you have $1.334 \text{ cm}^3$ of water in a jar you can add any additional small amount into it or you can withdraw any small amount from it.  The volume does not suddenly jump from $1.334$ to $1.335$ – as you put in the water it goes up gradually from $1.334$ to $1.335$.


    This explanation of “continuous quantity” is done in terms of how we think about continuous quantities, not in terms of a mathematical definition.  In fact, since you can’t measure an amount smaller than one molecule of water, the volume does jump up in tiny discrete amounts.   Because of quantum phenomena, temperature and speed change in tiny jumps, too (much tinier than molecules).

    Quantum jumps and individual molecules are ignored in large-scale physical applications because the scale at which they occur is so tiny it doesn’t matter.  For such applications, physicists and chemists (and cooks and traffic policemen!) think of the quantities they are measuring as continuous, even though at tiny scales they are not.

    The fact that scientists and engineers treat changes of physical quantities as continuous, ignoring the fact that they are not continuous at tiny scales, is sometimes called the “continuum hypothesis”. This is not what mathematicians mean by that phrase: see continuum hypothesis in Wikipedia.

    The real line

    It is useful to visualize the set of real numbers as the real line.

    The real line goes off to infinity in both directions. Each real number represents a location on the real line. Some locations are shown here:

    The locations are commonly called points on the real line.  This can lead to a seriously mistaken mental image of the reals as a row of points, like beads.  Just as in the case of the rationals, there is no real number “just to the right” of a given real number. 

    Decimal representation of the real numbers

    In this section, I will go into more detail about the decimal representation of the real numbers. There are two reasons for doing this.

    • People just beginning abstract math tend to think in terms of bad metaphors about the real numbers as decimals, and I want to introduce ways of thinking about them that are more helpful.
    • The real numbers can be defined in terms of the decimal representation. This is spelled out in a blog post by Tim Gowers. The definition requires some detail and in some ways is inelegant compared to the definitions usually used in analysis textbooks. But it means that the more you understand about the decimal representation, the better you understand real numbers, and in a pretty direct way.

    The decimal representation of a real number is also called its decimal expansion.  A representation can be given to other bases besides $10$; more about that here.

    Decimal representation as directed length.

    The decimal representation of a real number gives the approximate location of the number on the real line as its directed distance from $0$.

    • The rational number $1/2$ is real and has the decimal representation $0.5$.
    • The rational number $-1/2$ has the representation $-0.5$.
    • The number $1/3$ is also real and has the infinite decimal representation $0.333\ldots$. There is an infinite number of $3$’s, or to put it another way, for every positive integer $n$, the $n$th decimal place of the decimal representation of $1/3$ is $3$.
    • The number $\pi $ has a decimal representation beginning $3.14159\ldots$. So you can locate $\pi$ approximately by going $3.14$ units to the right from $0$.  You can locate it more exactly by going $3.14159$ units to the right, if you can measure that accurately.  The decimal representation of $\pi$ is infinitely long so you can theoretically represent it with as much accuracy as you wish.  In practice, of course, it would take longer than the age of the universe to find the first ${{10}^{({{10}^{10}})}}$ digits.

    Bar notation

    It is customary to put a bar over a sequence of digits at the end of a decimal representation to indicate that the sequence is repeated forever. 

    • $42\frac{1}{3}=42.\overline{3}$
    • $52.71656565\ldots$ (the group $65$ repeating infinitely often) may be written $52.71\overline{65}$.
    • A decimal representation that is only finitely long, for example $5.477$, could also be written $5.477\overline{0}$.
    • In particular, $6=6.0=6.\overline{0}$, and that works for any integer.
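The bar notation can be computed mechanically for any rational number by long division: once a remainder repeats, the digits repeat from that point on. The following Python sketch (the function name `repeating_decimal` is hypothetical, and it assumes a nonnegative numerator and positive denominator) returns the integer part, the digits before the bar, and the digits under the bar.

```python
def repeating_decimal(p, q):
    """Split the decimal expansion of p/q (p >= 0, q > 0) into
    (integer part, non-repeating digits, repeating digits)."""
    integer_part, remainder = divmod(p, q)
    digits = []
    seen = {}  # remainder -> index of the digit it produces
    while remainder not in seen:
        seen[remainder] = len(digits)
        remainder *= 10
        digits.append(str(remainder // q))
        remainder %= q
    start = seen[remainder]  # the repeating block begins here
    return integer_part, "".join(digits[:start]), "".join(digits[start:])

print(repeating_decimal(127, 3))      # (42, '', '3'), that is 42.3 with a bar over the 3
print(repeating_decimal(5477, 1000))  # (5, '477', '0'), that is 5.477 with a bar over a trailing 0
print(repeating_decimal(6, 1))        # (6, '', '0')
```

Note that a terminating decimal comes out with the repeating block `0`, matching the convention in the list above.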


    If you give the first few decimal places of a real number, you are giving an approximation to it.  Mathematicians on the one hand and scientists and engineers on the other tend to treat expressions such as $3.14159$ in two different ways:

    • The mathematician may think of it as a precisely given number, namely $\frac{314159}{100000}$, so in particular it represents a rational number. This number is not $\pi$, although it is close to it.
    • The scientist or engineer will probably treat it as the known part of the decimal representation of a real number. From their point of view, one knows $3.14159$ to six significant figures.
    • This article always takes the mathematician’s point of view.  If I refer to $3.14159$, I mean the rational number $\frac{314159}{100000}$.  I may also refer to $\pi$ as “approximately $3.14159$”.

    Integers and reals in computer languages

    Computer languages typically treat integers as if they were distinct from real numbers. In particular, many languages have the convention that the expression ‘$2$’ denotes the integer and the expression ‘$2.0$’ denotes the real number.   Mathematicians do not use this convention.  They usually regard the integer $2$ and the real number $2.0$ as the same mathematical object.
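Python is one language that follows this convention, and it is used here only as an illustration; the article does not single out any language. Note also that Python's `float` is a finite-precision approximation, not a true real number.

```python
two_int = 2
two_real = 2.0

print(type(two_int).__name__)   # int
print(type(two_real).__name__)  # float
print(two_int == two_real)      # True: equal as numbers, though stored as different types
```

The equality test agrees with the mathematician's view, while the type distinction reflects the computing convention.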

    Decimal representation and infinite series

    The decimal representation of a real number is shorthand for a particular infinite series.  Suppose the part before the decimal place is the integer $n$ and the part after the decimal place is\[d_1d_2d_3\ldots\]where $d_i$ is the digit in the $i$th place.  (For example, for $\pi$, $n=3$, $d_1=1$, $d_2=4$, $d_3=1$, and so forth.)  Then the decimal notation $n.d_1d_2d_3\ldots$ represents the limit of the infinite series\[n+\sum_{i=1}^{\infty}\frac{d_i}{10^i}\]



    The number $42\frac{1}{3}$ is exactly equal to the sum of the infinite series, which is represented by the expression $42.\overline{3}$.

    If you stop the series after a finite number of terms, then the number is approximately equal to the resulting sum. For example, $42\frac{1}{3}$ is approximately equal to\[42+\frac{3}{10}+\frac{3}{100}+\frac{3}{1000}\]which is the same as $42.333$.

    This inequality gives an estimate of the accuracy of this approximation:\[42.333\lt42\frac{1}{3}\lt42.334\]
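The partial sum and the inequality can be checked with exact rational arithmetic. This is a minimal Python sketch; the function name `partial_sum` is hypothetical.

```python
from fractions import Fraction

def partial_sum(n, digits):
    """Exact value of n + sum of d_i / 10**i over the given digits d_1, d_2, ..."""
    return n + sum(Fraction(d, 10 ** i) for i, d in enumerate(digits, start=1))

approx = partial_sum(42, [3, 3, 3])
exact = Fraction(127, 3)  # 42 1/3

print(approx)                                        # 42333/1000
print(approx < exact < approx + Fraction(1, 1000))   # True
```

Stopping after three digits leaves an error smaller than one unit in the third decimal place.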

    How to think about infinite decimal representations

    The expression $42.\overline{3}$ must be thought of as including all the $3$’s all at once rather than as gradually extending to the right over an infinite period of time.

    In ordinary English, the “…” often indicates continuing through time, as in this example

    “They climbed to the top of the ridge, and saw another, higher ridge in the distance, so they walked to that ridge and climbed it, only to see another one still further away…”

    But the situation with decimal representations is different:

    The decimal representation of $42\frac{1}{3}$ as $42.333\ldots$ must be thought of as a complete, infinitely long sequence of decimal digits, every one of which (after the decimal point) is a “$3$” right now.

    In the same way, you need to think of the decimal expansion of $\sqrt{2}$ as having all its decimal digits in place at once. Of course, in this case you have to calculate them in order. And note that calculating them is only finding out what they are. They are already there!

    The preceding description is about how a mathematician thinks about infinite decimal expansions.  The thinking has some sort of physical representation in your head that allows you to think about the hundred millionth decimal place of $\sqrt{2}$ or of $\pi$ even if you don’t know what it is. This does not mean that you have an infinite number of slots in your brain, one for each decimal place!  Nor does it mean that the infinite number of decimal places actually exist “somewhere”.  After all, you can think about unicorns and they don’t actually exist somewhere.

    Exact definitions

    Both the following statements are true:

    • The numbers $1/3$, $\sqrt{2}$ and $\pi$ have infinitely long decimal representations, in contrast for example to $\frac{1}{2}$, whose decimal representation is exactly $0.5$.
    • The expressions “$1/3$”, “$\sqrt{2}$” and “$\pi$” exactly determine the numbers $1/3$, $\sqrt{2}$ and $\pi$.

    These two statements don’t contradict each other. All three numbers have exact definitions.

    • $1/3$ is exactly the number that gives 1 when multiplied by $3$.
    • $\sqrt{2}$ is exactly the unique positive real number whose square is $2$.
    • $\pi$ is exactly the ratio of the circumference of a circle to its diameter.

    The decimal representation of each one to a finite number of places provides an approximate location of that number on the real line. On the other hand, the complete decimal representation of each one represents it exactly, although you can’t write it down.

    Different decimal representations for the same number

    The decimal representations of two different real numbers must be different. However, two different decimal representations can, in certain circumstances, represent the same real number. This happens when the decimal representation ends in an infinite sequence of $9$’s or an infinite sequence of $0$’s.


    • $0.\overline{9}=1.\overline{0}$. This means that $0.\overline{9}$ is exactly the same number as $1$. It is not just an approximation of $1$.
    • $3.4\overline{9}=3.5\overline{0}$. Indeed, $3.4\overline{9}$, $3.5$, $35/10$, and $7/2$ are all different representations of the same number.

    The Wikipedia article “$0.\overline{9}$” is an elaborate discussion of the fact that $0.\overline{9}=1$, a fact that many students find hard to believe.
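One way to make the equality plausible is to compute how far the finite truncations $0.9$, $0.99$, $0.999,\ldots$ fall short of $1$. The Python sketch below (the function name `nines` is hypothetical) does this with exact fractions.

```python
from fractions import Fraction

def nines(k):
    """Exact value of 0.99...9 with k nines."""
    return sum(Fraction(9, 10 ** i) for i in range(1, k + 1))

for k in (1, 2, 5, 10):
    gap = 1 - nines(k)
    print(k, gap)  # the gap is exactly 1/10**k
```

The gap after $k$ nines is exactly $1/10^k$, which becomes smaller than any positive number as $k$ grows. Since a decimal representation stands for the limit of its partial sums, $0.\overline{9}$ denotes $1$ itself, not something slightly less.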


    Abstraction and axiomatic systems

    Abstraction and the axiomatic method

    This post will become an article in


    An abstraction of a concept $C$ is a concept $C'$ with these properties:

    • $C'$ includes all instances of $C$ and
    • $C'$ is constructed by taking as axioms certain assertions that are true of all instances of $C$.

    There are two major situations where abstraction is used in math.

    • $C$ may be a familiar concept or property that has not yet been given a math definition.
    • $C$ may already have a mathematical definition using axioms. In that case the abstraction will be a generalization of $C$.

    In both cases, the math definition may allow instances of $C'$ that were not originally thought of as being part of $C$.

    Example: Relations

    Mathematicians have made use of relations between math objects since antiquity.

    • For real numbers $r$ and $s$, “$r\lt s$” means that $r$ is less than $s$. So the statement “$5\lt 7$” is true, but the statement “$7\lt 5$” is false. We say that “$\lt$” is a relation on the real numbers. Other relations on real numbers denoted by symbols are “$=$” and “$\leq$”.
    • Suppose $m$ and $n$ are positive integers. $m$ and $n$ are said to be relatively prime if the greatest common divisor of $m$ and $n$ is $1$. So $5$ and $7$ are relatively prime, but $15$ and $21$ are not relatively prime. So being relatively prime is a relation on positive integers. This is a relation that does not have a commonly used symbol.
    • The concept of congruence of triangles has been used for a couple of millennia. In recent centuries it has been denoted by the symbol “$\cong$”. Congruence is a relation on triangles.

    One could say that a relation is a true-or-false statement that can be made about a pair of math objects of a certain type. Logicians have in fact made that a formal definition. But when set theory came to be used around 100 years ago as a basis for all definitions in math, we started using this definition:

    A relation on a set $S$ is a set $\alpha$ of ordered pairs of elements of $S$.

    “$\alpha$” is the Greek letter alpha.

    The idea is that if $s$ is related by $\alpha$ to $t$, then $(s,t)$ is an element of $\alpha$, and if $s$ is not related by $\alpha$ to $t$, then $(s,t)$ is not an element of $\alpha$. That abstracts the everyday concept of relationship by focusing on the property that a relation either holds or doesn’t hold between two given objects.

    For example, the less-than relation on the set of all real numbers $\mathbb{R}$ is the set \[\alpha:=\{(r,s)|r\in\mathbb{R}\text{ and }s\in\mathbb{R}\text{ and }r\lt s\}\] In other words, $r\lt s$ if and only if $(r,s)\in \alpha$.


    A consequence of this definition is that any set of ordered pairs is a relation. Example: Let $\xi:=\{(2,3),(2,9),(9,1),(9,2)\}$. Then $\xi$ is a relation on the set $\{1,2,3,9\}$. Your reaction may be: What relation IS it? Answer: just that set of ordered pairs. You know that $2\xi3$ and $2\xi9$, for example, but $9\xi3$ is false. There is no other definition of $\xi$.

    Yes, the relation $\xi$ is weird. It is an arbitrary definition. It does not have any verbal description other than listing the elements of $\xi$. It is probably useless. Live with it.

    The symbol “$\xi$” is a Greek letter. It looks weird, so I used it to name a weird relation. Its upper case version is “$\Xi$”, which is even weirder. I pronounce “$\xi$” as “ksee” but most mathematicians call it “si” or “zi” (rhyming with “pie”).

    Defining a relation as any old set of ordered pairs is an example of a reconstructive generalization.
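The set-of-ordered-pairs definition translates almost word for word into code. In this minimal Python sketch a relation is a `set` of tuples and "is related to" is membership; the helper name `related` is hypothetical, and the less-than relation is restricted to a finite set because a program cannot list all pairs of real numbers.

```python
xi = {(2, 3), (2, 9), (9, 1), (9, 2)}

def related(relation, s, t):
    """True exactly when the ordered pair (s, t) belongs to the relation."""
    return (s, t) in relation

print(related(xi, 2, 3))  # True
print(related(xi, 3, 2))  # False: the pairs are ordered

# The less-than relation restricted to the finite set {1, 2, 3, 9}.
S = {1, 2, 3, 9}
less_than = {(r, s) for r in S for s in S if r < s}
print(related(less_than, 2, 9))  # True
```

Nothing in the code distinguishes the "meaningful" relation `less_than` from the arbitrary relation `xi`: both are just sets of pairs.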

    $n$-ary relations

    Years ago, mathematicians started coming up with things that were like relations but which involved more than two elements of a set.


    Let $r$, $s$ and $t$ be real numbers. We say that “$s$ is between $r$ and $t$” if $r\lt s$ and $s\lt t$. Then betweenness is a relation that is true or false about three real numbers.

    Mathematicians now call this a ternary relation. The abstract definition of a ternary relation is this: A ternary relation on a set $S$ is a set of ordered triples of elements of $S$. This is a reconstructive generalization of the concept of relation that allows ordered triples of elements as well as ordered pairs of elements.

    In the case of betweenness, we have to decide on the ordering. Let us say that the betweenness relation holds for the triple $(r,s,t)$ if $r\lt s$ and $s\lt t$. So $(4,5,7)$ is in the betweenness relation and $(4,7,5)$ is not.
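As a set of ordered triples, the betweenness relation can be built explicitly on any finite set of numbers. This Python sketch uses the ordering convention just stated, with the middle entry of the triple being the one in between; the function name `betweenness` is hypothetical, and the restriction to a finite set is an assumption needed to make the relation listable.

```python
from itertools import product

def betweenness(S):
    """The ternary relation {(r, s, t) : r < s and s < t} on a finite set S."""
    return {(r, s, t) for r, s, t in product(S, repeat=3) if r < s and s < t}

B = betweenness({4, 5, 7})
print((4, 5, 7) in B)  # True
print((4, 7, 5) in B)  # False
```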

    You could argue that in the sentence, “$s$ is between $r$ and $t$”, the $s$ comes first, so that we should say that the betweenness relation (meaning $r$ is between $s$ and $t$) holds for $(r,s,t)$ if $s\lt r$ and $r\lt t$. Well, when you write an article you can write it that way. But I am writing this article.

    Nowadays we talk about $n$-ary relations for any positive integer $n$. One consequence of this is that if we want to talk just about sets of ordered pairs we must call them binary relations.

    When I was a child there was only one kind of guitar and it was called “a guitar”. (My older cousin Junior had a guitar, but I had only a plastic ukulele.) Some time in the fifties, electrically amplified guitars came into being, so we had to refer to the original kind as “acoustic guitars”. I was a teenager when this happened, and being a typical teenager, I was completely contemptuous of the adults who reacted with extreme irritation at the phrase “acoustic guitar”.

    The axiomatic method

    The axiomatic method is a technique for studying math objects of some kind by formulating them as a type of math structure. You take some basic properties of the kind of structure you are interested in and set them down as axioms, then deduce other properties (that you may or may not have already known) as theorems. The point of doing this is to make your reasoning and all your assumptions completely explicit.

    Nowadays research papers typically state and prove their theorems in terms of math structures defined by axioms, although a particular paper may not mention the axioms but merely refer to other papers or texts where the axioms are given.  For some common structures such as the real numbers and sets, the axioms are not only not referenced, but the authors clearly don’t even think about them in terms of axioms: they use commonly-known properties (of real numbers or sets, for example) without reference.

    The axiomatic method in practice

    Typically when using the axiomatic method some of these things may happen:

    • You discover that there are other examples of this system that you hadn’t previously known about.  This makes the axioms more broadly applicable.
    • You discover that some properties that your original examples had don’t hold for some of the new examples.  Depending on your research goals, you may then add some of those properties to the axioms, so that the new examples are not examples any more.
    • You may discover that some of your axioms follow from others, so that you can omit them from the system.

    Example: Continuity

    A continuous function (from the set of real numbers to the set of real numbers) is sometimes described as a function whose graph you can draw without lifting your chalk from the board.  This is a physical description, not a mathematical definition.

    In the nineteenth century, mathematicians talked about continuous functions but became aware that they needed a rigorous definition.  One possibility was functions given by formulas, but that didn’t work: some formulas give discontinuous functions and they couldn’t think of formulas for some continuous functions.

    This description of nineteenth century math is an oversimplification.

    Cauchy produced the definition we now use (the epsilon-delta definition) which is a rigorous mathematical version of the no-lifting-chalk idea and which included the functions they thought of as continuous.

    To their surprise, some clever mathematicians produced examples of some weird continuous functions that you can’t draw, for example the sine blur function.  In the terminology in the discussion of abstraction above, the abstraction $C'$ (epsilon-delta continuous functions) had functions in it that were not in $C$ (no-chalk-lifting functions). On the other hand, their definition now applied to functions between some spaces besides the real numbers, for example the complex numbers, for which drawing the graph without lifting the chalk doesn’t even make sense.

    Example: Rings

    Suppose you are studying the algebraic properties of numbers.  You know that addition and multiplication are both associative operations and that they are related by the distributive law:  $x(y+z)=xy+xz$. Both addition and multiplication have identity elements ($0$ and $1$) and satisfy some other properties as well: addition forms a commutative group for example, and if $x$ is any number, then $0\cdot x=0$.

    One way to approach this problem is to write down some of these laws as axioms on a set with two binary operations without assuming that the elements are numbers. In doing this, you are abstracting some of the properties of numbers.

    Certain properties such as those in the first paragraph of this example were chosen to define a type of math structure called a ring. (The precise set of axioms for rings is given in the Wikipedia article.)

    You may then prove theorems about rings strictly by logical deduction from the axioms without calling on your familiarity with numbers.

    When mathematicians did this, the following events occurred:

    • They discovered systems such as matrices whose elements are not numbers but which obey most of the axioms for rings.
    • Although multiplication of numbers is commutative, multiplication of matrices is not commutative.
    • Now they had to decide whether to require commutativity of multiplication as an axiom for rings or not.  In this example, historically, mathematicians decided not to require multiplication to be commutative, so (for example) the set of all $2\times 2$ matrices with real entries is a ring.
    • They then defined a commutative ring to be a ring in which multiplication is commutative.
    • So the name “commutative ring” means the multiplication is commutative, because addition in rings is always commutative. Mathematical names are not always transparent.

    • You can prove from the axioms that in any ring, $0 x=0$ for all $x$, so you don’t need to include it as an axiom.
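The failure of commutativity for matrices is easy to check by hand or by machine. The following minimal Python sketch multiplies two particular $2\times 2$ integer matrices in both orders; the matrices `A` and `B` and the helper `mat_mul` are choices made here for illustration.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))
ZERO = ((0, 0), (0, 0))

print(mat_mul(A, B))             # ((2, 1), (1, 1))
print(mat_mul(B, A))             # ((1, 1), (1, 2))
print(mat_mul(ZERO, A) == ZERO)  # True, an instance of 0x = 0
```

The two products differ, so multiplication in this ring is not commutative, while the zero matrix still annihilates `A` as the ring axioms predict.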

    Nowadays, all math structures are defined by axioms.

    Other examples

    • Historically, the first example of something like the axiomatic method is Euclid’s axiomatization of geometry.  The axiomatic method began to take off in the late nineteenth century and now is a standard tool in math.  For more about the axiomatic method see the Wikipedia article.
    • Partitions and equivalence relations are two other concepts that have been axiomatized. Remarkably, although the axioms for the two types of structures are quite different, every partition is in fact an equivalence relation in exactly one way, and any equivalence relation is a partition in exactly one way.
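The correspondence between partitions and equivalence relations can be sketched for finite sets. In the Python code below (function names are hypothetical), a partition is a set of `frozenset` blocks and an equivalence relation is a set of ordered pairs. The second function assumes its input really is an equivalence relation; it does not check the axioms.

```python
def relation_from_partition(blocks):
    """Equivalence relation (set of ordered pairs) whose classes are the given blocks."""
    return {(a, b) for block in blocks for a in block for b in block}

def partition_from_relation(relation):
    """Partition (set of frozensets) formed by the classes of an equivalence relation."""
    elements = {a for a, _ in relation}
    return {frozenset(b for a2, b in relation if a2 == a) for a in elements}

blocks = {frozenset({1, 2}), frozenset({3}), frozenset({4, 5, 6})}
E = relation_from_partition(blocks)
print(partition_from_relation(E) == blocks)  # True: the round trip recovers the partition
```

Two elements are related exactly when they lie in the same block, and the blocks are recovered as the sets of mutually related elements.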


    Many articles on the web about the axiomatic method emphasize the representation of the axiom system as a formal logical theory (formal system). 
    In practice, mathematicians create and use a particular axiom system as a tool for research and understanding, and state and prove theorems of the system in semi-formal narrative form rather than in formal logic.


    The great math mystery


    Last night Nova aired The great math mystery, a documentary that describes mathematicians’ ideas about whether math is discovered or invented, whether it is “out there” or “in our head”. It was well-done. Things were explained clearly using images and metaphors, although they did show Maxwell’s equations as algebra (without explaining it). The visual illustrations of connections between Maxwell’s equations and music and electromagnetic waves were among the best parts of the documentary.

    In my opinion they made good choices of mathematical ideas to cover, but I imagine a lot of research mathematicians will have a hissy that they didn’t cover XXX (their subject).

    The applications to physics dominated the show (that is not a complaint), but someone did mention the remarkable depth of number theory. Number theory is deep pure math that has indeed had some applications, but that’s not why some of the greatest mathematicians in the world have spent their lives on the subject. I believe logic and proof were never mentioned, and that is completely appropriate for a video made for the general public. Some mathematicians will disagree with that last sentence.

    Where does math live?

    The question,

    Does math live

    • In an ideal world separate from the physical world,
    • in the physical world, or
    • in our brains?

    has a perfectly clear answer: It exists in our brains.

    Ideal world

    The notion that math lives in an ideal world, as Plato supposedly believed, has no evidence for it at all.

    I suppose you could say that Plato’s ideal world does exist, in our brains. But that wouldn’t be quite correct: We have a mental image of Plato’s ideal world in our brains, but that image is not the whole ideal world: If we know about triangles, we can imagine the Ideal Triangle to be in his world, but we have to know about the zeta function or the monster group to visualize them to be in his world. Even then, the monster group in our brain is just a collection of neurons connected to concepts such as “largest sporadic simple group” or “contains \[2^{46} \cdot 3^{20} \cdot 5^9 \cdot 7^6 \cdot 11^2 \cdot 13^3 \cdot 17 \cdot 19 \cdot 23 \cdot 29 \cdot 31 \cdot 41 \cdot 47 \cdot 59 \cdot 71\] elements”, but there is not a neuron for each element! We don’t have that many neurons.

    The size of the monster group does not live in my brain. I copied it from Wikipedia.

    Real world

    Our collective experience is that math is extraordinarily useful for modeling many aspects of the real world. But in what sense does that mean it exists in the real world?

    There is a sense in which a model of the real world exists in our brains. If we know some of the math that explains certain aspects of the real world, our brains have neuron connections that make that math live in our brain and in some sense in the model of the real world that is in our brain. But does that mean the math is “out there”? I don’t see why.

    Math is a social endeavor

    One point that usually gets left out of discussions of Platonism is this: Some math exists in any individual person’s brain. But math also exists in society. The math floating around in the individual brains of people is subject to frequent amendments to those people’s understanding because they interact with the real world and in particular with other people.

    In particular, theoretical math exists in the society of mathematicians. It is constantly fluctuating because mathematicians talk to each other. They also explain it to non-mathematicians, which as everyone knows can bring new insights into the brain of the person doing the explaining.

    So I think that the best answer to the question, where does math live? is that math is a bunch of memes that live in our social brain.


    I have written about these issues before:


    Functions: Metaphors, Images and Representations

    Please read this post at I originally posted the document here but some of the diagrams would not render, and I haven’t been able to figure out why. Sorry for having to redirect.


    Problems caused for students by the two languages of math

    The two languages of math

    Mathematics is communicated using two languages: Mathematical English and the symbolic language of math (more about them in two languages).

    This post is a collection of examples of the sorts of trouble that the two languages cause beginning abstract math students. I have gathered many of them here since they are scattered throughout the literature. I would welcome suggestions for other references to problems caused by the languages of math.

    In many of the examples, I give links to the literature and leave you to fish out the details there. Almost all of the links are to documents on the internet.

    There is an extensive list of references.


    Scattered through this post are conjectures. Like most of my writing about difficulties students have with math language, these conjectures are based on personal observation over 37 years of teaching mostly computer engineering and math majors. The only hard research of any sort I have done in math ed consists of the 426 citations of mathematical writing included in the Handbook of Mathematical Discourse.


    This post is an attempt to gather together the ways in which math language causes trouble for students. It is even more preliminary and rough than most of my other posts.

    • The arrangement of the topics is unsatisfactory. Indeed, the topics are so interrelated that it is probably impossible to give a satisfactory linear order to them. That is where writing on line helps: Lots of forward and backward references.
    • Other people and I have written extensively about some of the topics, and they have lots of links. Other topics are stubs and need to be filled out. I have probably missed important points about and references to many of them.
    • Please note that many of the most important difficulties that students have with understanding mathematical ideas are not caused by the languages of math and are not represented here.

    I expect to revise this article periodically as I find more references and examples and understand some of the topics better. Suggestions would be very welcome.

    Intricate symbolic expressions

    I have occasionally had students tell me that they have great difficulty understanding a complicated symbolic expression. They can’t just look at it and learn something about what it means.


    Consider the symbolic expression \[\displaystyle\left(\frac{x^3-10}{3 e^{-x}+1}\right)^6\]

    Now, I could read this expression aloud as if it were text, or more precisely describe it so that someone else could write it down. But if I am in math mode and see this expression I don’t “read” it, even to myself.

    I am one of those people who much of the time think in pictures or abstractions without words. (See references here.)

    In this case I would look at the expression as a structured picture. I could determine a number of things about it, and when I was explaining it I would point at the board, not try to pronounce it or part of it:

    • The denominator is always positive so the expression is defined for all reals.
    • The exponent is even so the value of the expression is always nonnegative. I would say, “This (pointing at the exponent) is an even power so the expression is never negative.”
    • It is zero in exactly one place, namely $x=\sqrt[3]{10}$.
    • Its derivative is also $0$ at $\sqrt[3]{10}$. You can see this without calculating the formula for the derivative (ugh).
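
    The last observation can be checked with the chain rule. As a short sketch, write $u(x)$ for the fraction inside the parentheses (the name $u$ is introduced here only for illustration):

    \[u(x)=\frac{x^3-10}{3e^{-x}+1},\qquad \frac{d}{dx}\,u(x)^6=6\,u(x)^5\,u'(x)\]

    Since $u(\sqrt[3]{10})=0$, the factor $u(x)^5$ vanishes there, so the derivative is $0$ at $\sqrt[3]{10}$ whatever the value of $u'$ is.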

    There is much more about this example in Zooming and Chunking.

    Algebra in high school

    Many high school students are stymied by algebra, never do well at it, and hate math as a result. I have known many such people over the years. A revealing remark that I have heard many times is that “algebra is totally meaningless to me”. This is sometimes accompanied by a remark that geometry is “obvious” or something similar. This may be because they think they have to “read” an algebraic expression instead of studying it as they would a graph or a diagram.


    Many beginning abstract math students have difficulty understanding a symbolic expression like the one above. Could this be caused by resistance to treating the expression as a structure to be studied?

    Context-sensitive pronunciation

    A symbolic assertion (“formula” to logicians) can be embedded in a math English sentence in different ways, requiring the symbolic assertion to be pronounced in different ways. The assertion itself is not modified in any way in these different situations.

    I used the phrase “symbolic assertion” in because students are confused by the logicians’ use of “formula”.
    In everyday English, “$\text{H}_2\text{O}$” is the “formula” for water, but it is a term, not an assertion.


    “For every real number $x\gt0$ there is a real number $y$ such that $x\gt y\gt0$.”

    • In the sentence above, the assertion “$x\gt0$” must be pronounced “$x$ that is greater than $0$” or something similar.
    • The standalone assertion “$x\gt0$” is pronounced “$x$ is greater than $0$.”
    • The sentence “Let $x\gt0$” must be pronounced “Let $x$ be greater than $0$”.

    The consequence is that the symbolic assertion, in this case “$x\gt0$”, does not reveal the role it plays in the math English sentence that it is embedded in.

    Many of the examples occurring later in the post are also examples of context-sensitive pronunciation.


    Many students are subconsciously bothered by the way the same symbolic expression is pronounced differently in different math English sentences.

    This probably impedes some students’ progress. Teachers should point this phenomenon out with examples.

    Students should be discouraged from pronouncing mathematical expressions.

    For one thing, this could get you into trouble. Consider pronouncing “$\sqrt{3+5}+6$”. In any case, when you are reading any text you don’t pronounce the words, you just take in their meaning. Why not take in the meaning of algebraic expressions in the same way?

    Parenthetic assertions

    A parenthetic assertion is a symbolic assertion embedded in a sentence in math English in such a way that it is a subordinate clause.


    In the math English sentence

    “For every real number $x\gt0$ there is a real number $y$ such that $x\gt y\gt0$”

    mentioned above, the symbolic assertion “$x\gt0$” plays the role of a subordinate clause.

    It is not merely that the pronunciation is different compared to that of the independent statement “$x\gt0$”. The math English sentence is hard to parse. The obvious (to an experienced mathematician) meaning is that the beginning of the sentence can be read this way: “For every real number $x$, which is bigger than $0$…”.

    But a new student might try to read it as “For every real number $x$ is greater than $0$ …” by literally substituting the standalone meaning of “$x\gt0$” where it occurs in the sentence. This makes the text what linguists call a garden path sentence. The student has to stop and start over to try to make sense of it, and the symbolic expression lacks the natural language hints that help the reader understand how it should be read.

    Note that the other two symbolic expressions in the sentence are not parenthetic assertions. The phrase “real number” needs to be followed by a term, and it is, and the phrase “such that” must be followed by a clause, and it is.

    More examples

    • “Consider the circle $S^1\subseteq\mathbb{C}=\mathbb{R}^2$.” This has subordinate clauses to depth 2.
    • “The infinite series $\displaystyle\sum_{k=1}^\infty\frac{1}{k^2}$ converges to $\displaystyle\zeta(2)=\frac{\pi^2}{6}\approx1.64$.”
    • “We define a null set in $I:=[a,b]$ to be a set that can be covered by a countable collection of intervals with arbitrarily small total length.” This shows a parenthetical definition.
    • “Let $F:A\to B$ be a function.”
      A type declaration is a function? In any case, it would be better to write this sentence simply as “Let $F:A\to B$”.

    David Butler’s post Contrapositive grammar has other good examples.

    Math texts are in general badly written. Students need to be taught how to read badly written math as well as how to write math clearly. Those that succeed (in my observation) in being able to read math texts often solve the problem by glancing at what is written and then reconstructing what the author is supposedly saying.


    Some students are baffled, or at least bothered consciously or unconsciously, by parenthetic assertions, because the clues that would exist in a purely English statement are missing.

    Nevertheless, many if not most math students read parenthetic assertions correctly the first time and never even notice how peculiar they are.

    What makes the difference between them and the students who are stymied by parenthetic assertions?

    There is another conjecture concerning parenthetic assertions below.

    Context-sensitive meaning

    “If” in definitions


    The word “if” in definitions does not mean the same thing that it means in other math statements.

    • In the definition “An integer is even if it is divisible by $2$,” “if” means “if and only if”. In particular, the definition implies that an integer is not even if it is not divisible by $2$.
    • In a theorem, for example “If a function is differentiable, then it is continuous”, the word “if” has the usual one-way meaning. In particular, in this case, a continuous function might not be differentiable.
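
    The two meanings can be set side by side in symbols. This is only an illustrative sketch; the counterexample $|x|$ is a standard one and is not taken from the discussion above.

    \[\text{Definition: } n \text{ is even} \iff 2 \mid n \qquad\qquad \text{Theorem: } f \text{ is differentiable} \implies f \text{ is continuous}\]

    The converse of the theorem fails: $f(x)=|x|$ is continuous but not differentiable at $0$.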

    Context-sensitive meaning occurs in ordinary English as well. Think of a strike in baseball.


    The nearly universal custom of using “if” to mean “if and only if” in definitions makes it harder for students to understand implication.

    This custom is not the major problem in understanding the role of definitions. See my article Definitions.

    Underlying sets


    In a course in group theory, a lecturer may say at one point, “Let $F:G\to H$ be a homomorphism”, and at another point, “Let $g\in G$”.

    In the first sentence, $G$ refers to the group, and in the second sentence it refers to the underlying set of the group.

    This usage is almost universal. I think the difficulty it causes is subtle. When you refer to $\mathbb{R}$, for example, you (usually) are referring to the set of real numbers together with all its canonical structure. The way students think of it, a real number comes with its many relations and connections with the other real numbers, ordering, field properties, topology, and so on.

    But in a group theory class, you may define the Klein $4$-group to be $\mathbb{Z}_2\times\mathbb{Z}_2$. Later you may say “the symmetry group of a rectangle that is not a square is the Klein $4$-group.” Almost invariably some student will balk at this.

    Referring to a group by naming its underlying set is also an example of synecdoche.


    Students expect every important set in math to have a canonical structure. When they get into a course that is a bit more abstract, suddenly the same set can have different structures, and math objects with different underlying sets can have the same structure. This catastrophic shift in a way of thinking should be described explicitly with examples.

    Way back when, it got mighty upsetting when the earth started going around the sun instead of vice versa. Remind your students that these upheavals happen in the math world too.

    Overloaded notation

    Identity elements

    A particular text may refer to the identity element of any group as $e$.

    This is as far as I know not a problem for students. I think I know why: There is a generic identity element. The identity element in any group is an instantiation of that generic identity element. The generic identity element exists in the sketch for groups; every group is a functor defined on that sketch. (Or if you insist, the generic identity element exists in the first order theory for groups.) I suspect mathematicians subconsciously think of identity elements in this way.

    Matrix multiplication

    Matrix multiplication is not commutative. A student may forget this and write $A^2B^2=(AB)^2$. This also happens in group theory courses.

    This problem occurs because the symbolic language uses the same symbol for many different operations, in this case the juxtaposition notation for multiplication. This phenomenon is called overloaded notation and is discussed here.
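
    A concrete pair of matrices makes the failure visible. The following is a minimal sketch in Python using plain nested lists; the helper name matmul and the particular matrices A and B are chosen here for illustration only.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

# Two 2x2 matrices that do not commute (an illustrative choice).
A = [[1, 1],
     [0, 1]]
B = [[1, 0],
     [1, 1]]

AB = matmul(A, B)
BA = matmul(B, A)

lhs = matmul(matmul(A, A), matmul(B, B))   # A^2 B^2
rhs = matmul(AB, AB)                       # (AB)^2

print(AB != BA)   # True: A and B do not commute
print(lhs)        # [[5, 2], [2, 1]]
print(rhs)        # [[5, 3], [3, 2]]
```

    Expanding $(AB)^2=ABAB$ shows why: rewriting it as $A^2B^2$ would require swapping the middle $B$ and $A$.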


    Noncommutative binary operations written using juxtaposition cause students trouble because going to noncommutative operations requires abandoning some overlearned reflexes in doing algebra.

    Identity elements seem to behave the same in any binary operation, so there are no reflexes to unlearn. There are generic binary operations of various types as well. That’s why mathematicians are comfortable overloading juxtaposition. But to get to be a mathematician you have to unlearn some reflexes.


    Sometimes you need to reword a math statement that contains symbolic expressions. This particularly causes trouble in connection with negation.

    Ordinary English

    The English language is notorious among language learners for making it complicated to negate a sentence. The negation of “I saw that movie” is “I did not see that movie”. (You have to put “do not” (using the appropriate form of “do”) before the verb and then modify the verb appropriately.) You can’t just say “I not saw that movie” (as in Spanish) or “I saw not that movie” (as in German).


    The method in English used to negate a sentence may cause problems with math students whose native language is not English. (But does it cause math problems with those students?)

    Negating symbolic expressions


    • The negation of “$n$ is even and a prime” is “$n$ is either odd or it is not a prime”. The negation should not be written “$n$ is not even and a prime” because that sentence is ambiguous. In the heat of doing a proof students may sometimes think the negation is “$n$ is odd and $n$ is not a prime,” essentially forgetting about De Morgan. (He must roll over in his grave a lot.)
    • The negation of “$x\gt0$” is “$x\leq0$”. It is not “$x\lt0$”. This is a very common mistake.
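
    Both negations can be checked mechanically on a sample of integers. The following is a rough sketch in Python; the helper names is_prime and is_even and the sample range are assumptions made for illustration.

```python
def is_prime(n):
    """Return True if n is a prime number (trial division)."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

def is_even(n):
    return n % 2 == 0

sample = range(-20, 21)

# Correct negation (De Morgan): "n is odd or n is not a prime".
correct = all((not (is_even(n) and is_prime(n))) ==
              ((not is_even(n)) or (not is_prime(n)))
              for n in sample)

# Mistaken negation: "n is odd and n is not a prime".
# Collect the integers where it disagrees with the true negation.
disagreements = [n for n in sample
                 if (not (is_even(n) and is_prime(n))) !=
                    ((not is_even(n)) and (not is_prime(n)))]

print(correct)                    # True
print(4 in disagreements)         # True: 4 is even and not a prime
print((not (0 > 0)) == (0 < 0))   # False: x = 0 separates "not x > 0" from "x < 0"
```

    The mistaken form goes wrong whenever exactly one of the two conditions fails, for example at $n=4$ (even, not a prime) or $n=3$ (odd, a prime).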

    These examples are difficulties caused by not understanding the math. They are not directly caused by difficulties with the languages of math.

    Negating expressions containing parenthetic assertions

    Suppose you want to prove:

    “If $f:\mathbb{R}\to\mathbb{R}$ is differentiable, then $f$ is continuous”.

    A good way to do this is by using the contrapositive. A mechanical way of writing the contrapositive is:

    “If $f$ is not continuous, then $f:\mathbb{R}\to\mathbb{R}$ is not differentiable.”

    That is not good. The sentence needs to be massaged:

    “If $f:\mathbb{R}\to\mathbb{R}$ is not continuous, then $f$ is not differentiable.”

    Even better would be to write the original sentence as:

    “Suppose $f:\mathbb{R}\to\mathbb{R}$. If $f$ is differentiable, then $f$ is continuous.”

    This is discussed in detail in David Butler’s post Contrapositive grammar.


    Students need to be taught to understand parenthetic assertions that occur in the symbolic language and to learn to extract a parenthetic assertion and write it as a standalone assertion ahead of the statement it occurs in.


    The scope of a word or variable consists of the part of the text for which its current definition is in effect.


    • “Suppose $n$ is divisible by $4$.” The scope is probably the current paragraph or perhaps the current proof. This means that the properties of $n$ are constrained in that section of the text.
    • “In this book, all rings are unitary.” This will hold for the whole book.

    There are many more examples in the article Scope.

    If you are a grasshopper (you like to dive into the middle of a book or paper to find out what it says), the scope of a variable can be hard to determine. It is particularly difficult for commonly used words or symbols that have been defined differently from the usual usage. You may not suspect that this has happened, since the word or symbol might be defined only once, early in the text. Some books on writing mathematics have urged writers to keep global definitions to a minimum. This is good advice.

    Finding the scope is considerably easier when the text is online and you can search for the definition.


    Knowing the scope of a word or variable can be difficult. It is particularly hard when the word or variable has a large scope (chapter or whole book).


    Variables are often introduced in math writing and then used in the subsequent discussion. In a complicated discussion, several variables may be referred to that have different statuses, some of them introduced several pages before. There are many particular ways discussed below that can cause trouble for students. This post is restricted to trouble in connection with the languages of math. The concept of variable is difficult in itself, not just because of the way the math languages represent them, but that is not covered here.

    Much of this part of the post is based on work of Susanna Epp, including three papers listed in the references. Her papers also include many references to other work in the math ed literature that has to do with understanding variables.

    See also Variables in and Variables in Wikipedia.


    Students blunder by forgetting the type of the variable they are dealing with. The example given previously of problems with matrix multiplication is occasioned by forgetting the type of a variable.


    Students sometimes have problems because they forget the data type of the variables they are dealing with. This is primarily caused by overloaded notation.

    Dependent and independent

    If you define $y=x^2+1$, then $x$ is an independent variable and $y$ is a dependent variable. But dependence and independence of variables are more general than that example suggests.
    In an epsilon-delta proof of the limit of a function (example below), $\varepsilon$ is independent and $\delta$ is dependent on $\varepsilon$, although not functionally dependent.


    Distinguishing dependent and independent variables causes problems, particularly when the dependence is not clearly functional.

    I recently ran across a discussion of this on the internet but failed to record where I saw it. Help!

    Bound and free

    This causes trouble with integration, among other things. It is discussed in Variables and Substitution. I expect to add some references to the math ed literature soon.


    Some of these variables may be given by existential instantiation, in which case they are dependent on variables that define them. Others may be given by universal instantiation, in which case the variable is generic; it is independent of other variables, and you can’t impose arbitrary restrictions on it.

    Existential instantiation

    A theorem that an object exists under certain conditions allows you to name it and use it by that name in further arguments.


    Suppose $m$ and $n$ are integers. Then by definition, $m$ divides $n$ if there is an integer $q$ such that $n=qm$. Then you can use “$q$” in further discussion, but $q$ depends on $m$ and $n$. You must not use it with any other meaning unless you start a new paragraph and redefine it.

    So the following (start of a) “proof” blunders by ignoring this restriction:

    Theorem: If an integer $m$ divides both integers $n$ and $p$, then $m$ divides $n+p$.

    “Proof”: “Let $n = qm$ and $p = qm$ …”
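
    A sketch of how the proof could start correctly: each existential instantiation gets its own name. The name $q'$ is introduced here only for illustration.

    \[n=qm,\qquad p=q'm,\qquad\text{so}\qquad n+p=qm+q'm=(q+q')m\]

    Using the same letter $q$ for both would silently assume $n=p$, which the hypotheses of the theorem do not grant.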

    Universal instantiation

    It is a theorem that for any integer $n$, there is no integer strictly between $n$ and $n+1$. So if you are given an arbitrary integer $k$, there is no integer strictly between $k$ and $k+1$. There is no integer between $42$ and $43$.

    By itself, universal instantiation does not seem to cause problems, provided you pay attention to the types of your variables. (“There is no integer between $\pi$ and $\pi+1$” is false.)

    However, when you introduce variables using both universal and existential quantification, students can get confused.


    Consider the definition of limit:

    Definition: $\lim_{x\to a} f(x)=L$ if and only if for every $\epsilon\gt0$ there is a $\delta\gt0$ for which if $0\lt|x-a|\lt\delta$ then $|f(x)-L|\lt\epsilon$.

    A proof for a particular instance of this definition is given in detail in Rabbits out of a Hat. In this proof, you may not put constraints on $\epsilon$ except the given one that it is positive. On the other hand, you have to come up with a definition of $\delta$ and prove that it works. The $\delta$ depends on what $f$, $a$ and $L$ are, and on $\epsilon$, but there are always infinitely many values of $\delta$ which fit the constraints, and you have to come up with only one. So in general, two people doing this proof will not get the same answer.
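
    To make the roles of $\epsilon$ and $\delta$ concrete, here is a small worked instance. The function $f(x)=3x+1$ and the point $a=2$ are chosen here only for illustration; they are not necessarily the example treated in Rabbits out of a Hat.

    \[\text{Claim: }\lim_{x\to 2}(3x+1)=7.\qquad\text{Given }\epsilon\gt0,\text{ take }\delta=\frac{\epsilon}{3}.\qquad\text{If }|x-2|\lt\delta\text{ then }|(3x+1)-7|=3|x-2|\lt3\delta=\epsilon.\]

    Any smaller positive value, such as $\delta=\epsilon/4$, works equally well, which is why two people can give different correct answers.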


    Susanna Epp’s paper Proof issues with existential quantification discusses the problems that students have with both existential and universal quantification with excellent examples. In particular, that paper gives examples of problems students have that are not hinted at here.


    A nearly final version of The Handbook of Mathematical Discourse is available on the web with links, including all the citations. This version contains some broken links. I am unable to recompile it because TeX has evolved enough since 2003 that the source no longer compiles. The paperback version (without the citations) can be bought as a book here. (There are usually cheaper used versions on Amazon.) is a website for beginning students in abstract mathematics. It includes most of the material in the Handbook, but not the citations. The Introduction gives you a clue as to what it is about.

    Two languages

    My take on the two languages of math is discussed in these articles:

    The Language of Mathematics, by Mohan Ganesalingam, covers these two languages in more detail than any other book I know of. He says right away on page 18 that mathematical language consists of “textual sentences with symbolic material embedded like ‘islands’ in the text.” So for him, math language is one language.

    I have envisioned two separate languages for math in and in the Handbook, because in fact you can in principle translate any mathematical text into either English or logical notation (first order logic or type theory), although the result in either case would be impossible to understand for any sizeable text.

    Topics in

    Context-sensitive interpretation.

    “If” in definitions.

    Mathematical English.

    Parenthetic assertion.


    Semantic contamination.


    The symbolic language of math


    Zooming and Chunking.

    Topics in the Handbook of mathematical discourse.

    These topics have a strong overlap with the topics with the same name in They are included here because the Handbook contains links to citations of the usage.


    “If” in definitions.

    Parenthetic assertion.


    Posts in Gyre&Gimble


    Naming mathematical objects

    Rabbits out of a Hat.

    Semantics of algebra I.

    Syntactic and semantic thinkers

    Technical meanings clash with everyday meanings

    Thinking without words.

    Three kinds of mathematical thinkers

    Variations in meaning in math.

    Other references

    Contrapositive grammar, blog post by David Butler.

    Proof issues with existential quantification, by Susanna Epp.

    The role of logic in teaching proof, by Susanna Epp (2003).

    The language of quantification in mathematics instruction, by Susanna Epp (1999).

    The Language of Mathematics: A Linguistic and Philosophical Investigation
    by Mohan Ganesalingam, 2013. (Not available from the internet.)

    On the communication of mathematical reasoning, by Atish Bagchi and Charles Wells (1998a), PRIMUS, volume 8, pages 15–27.

    Variables in Wikipedia.


    Notation for sets

    This is a revision of the section of on notation for sets.

    Sets of numbers

    The following notation for sets of numbers is fairly standard.

    • $\mathbb{N}$ denotes the set of positive integers.
    • $\mathbb{Z}$ denotes the set of all integers.
    • $\mathbb{Q}$ denotes the set of rational numbers.
    • $\mathbb{R}$ denotes the set of real numbers.
    • $\mathbb{C}$ denotes the set of complex numbers.


    • Some authors use $\mathbb{I}$ for $\mathbb{Z}$, but $\mathbb{I}$ is also used for the unit interval.
    • Many authors use $\mathbb{N}$ to denote the nonnegative integers instead
      of the positive ones.
    • To remember $\mathbb{Q}$, think “quotient”.
    • $\mathbb{Z}$ is used because the German word for “integer” is “Zahl”.

    Until the 1930’s, Germany was the world center for scientific and mathematical study, and at least until the 1960’s, being able to read scientific German was required of anyone who wanted a degree in science. A few years ago I was asked to transcribe some hymns from a German hymnbook, not into English, but merely from fraktur (the old German alphabet) into the Roman alphabet. I sometimes feel that I am the last living American to be able to read fraktur easily.

    Element notation

    The expression “$x\in A$” means that $x$ is an element of the set $A$. The expression “$x\notin A$” means that $x$ is not an element of $A$.

    “$x\in A$” is pronounced in any of the following ways:

    • “$x$ is in $A$”.
    • “$x$ is an element of $A$”.
    • “$x$ is a member of $A$”.
    • “$A$ contains $x$”.
    • “$x$ is contained in $A$”.


    • Warning: The math English phrase “$A$ contains $B$” can mean either “$B\in A$” or “$B\subseteq A$”.
    • The Greek letter epsilon occurs in two forms in math, namely $\epsilon$ and $\varepsilon$. Neither of them is the symbol for “element of”, which is “$\in$”. Nevertheless, it is not uncommon to see either “$\epsilon$” or “$\varepsilon$” being used to mean “element of”.
    • $4$ is an element of all the sets $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$.
    • $-5\notin \mathbb{N}$ but it is an element of all the others.

    List notation

    Definition: list notation

    A set with a small number of elements may be denoted by listing the elements inside braces (curly brackets). The list must include exactly all of the elements of the set and nothing else.


    The set $\{1,\,3,\,\pi \}$ contains the numbers $1$, $3$ and $\pi $ as elements, and no others. So $3\in \{1,3,\pi \}$ but $-3\notin \{1,\,3,\,\pi \}$.

    Properties of list notation

    List notation shows every element and nothing else

    If $a$ occurs in a list notation, then $a$ is in the set the notation defines.  If it does not occur, then it is not in the set.

    Be careful

    When I say “$a$ occurs” I don’t mean it necessarily occurs using that name. For example, $3\in\{3+5,2+3,1+2\}$.

    The order in which the elements are listed is irrelevant

    For example, $\{2,5,6\}$ and $\{5,2,6\}$ are the same set.

    Repetitions don’t matter

    $\{2,5,6\}$, $\{5,2,6\}$, $\{2,2,5,6 \}$ and $\{2,5,5,5,6,6\}$ are all different representations of the same set. That set has exactly three elements, no matter how many numbers you see in the list notation.

    Multisets may be written with braces and repeated entries, but then the repetitions mean something.
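
    Python’s built-in set type happens to follow the same conventions as list notation, so the properties above can be tried out directly. This is only a sketch; the variable names a, b, c and d are chosen here for illustration.

```python
# Order is irrelevant and repetitions do not matter.
a = {2, 5, 6}
b = {5, 2, 6}
c = {2, 2, 5, 6}
d = {2, 5, 5, 5, 6, 6}

print(a == b == c == d)   # True: four notations, one set
print(len(d))             # 3

# An element may occur under a different name: 1 + 2 is 3.
print(3 in {3 + 5, 2 + 3, 1 + 2})   # True
```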

    When elements are sets

    When (some of) the elements in list notation are themselves sets (more about that here), care is required.  For example, the numbers $1$ and $2$  are not elements of the set \[S:=\left\{ \left\{ 1,\,2,\,3 \right\},\,\,\left\{ 3,\,4 \right\},\,3,\,4 \right\}\]The elements listed include the set $\{1, 2, 3\}$ among others, but not the number $2$.  The set $S$ contains four elements, two sets and two numbers. 

    Another way of saying this is that the element relation is not transitive: The facts that $A\in B$ and $B\in C$ do not imply that $A\in C$. 
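
    The set $S$ above can be imitated in Python as a rough sketch. One assumption forced by the language: a Python set cannot contain ordinary mutable sets, so the inner sets are written as frozensets. The names A and B below are illustrative.

```python
# Inner sets must be frozensets; that is a Python detail, not math.
S = {frozenset({1, 2, 3}), frozenset({3, 4}), 3, 4}

print(len(S))                      # 4: two sets and two numbers
print(frozenset({1, 2, 3}) in S)   # True
print(2 in S)                      # False: 2 is in an element of S, not in S
print(1 in S)                      # False

# The element relation is not transitive.
A, B = 2, frozenset({1, 2, 3})
print(A in B and B in S and A not in S)   # True
```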

    Sets are arbitrary

    • Any mathematical object can be the element of a set.
    • The elements of a set do not have to have anything in common.
    • The elements of a set do not have to form a pattern.
    • $\{1,3,5,6,7,9,11,13,15,17,19\}$ is a set. There is no point in asking, “Why did you put that $6$ in there?” (Sets can be arbitrary.)
    • Let $f$ be the function on the reals for which $f(x)=x^3-2$. Then \[\left\{\pi^3,\mathbb{Q},f,42,\{1,2,7\}\right\}\] is a set. Sets do not have to be homogeneous in any sense.

    Setbuilder notation


    Suppose $P$ is an assertion. Then the expression “$\left\{x|P(x) \right\}$” denotes the set of all objects $x$ for which $P(x)$ is true. It contains no other elements.

    • The notation “$\left\{ x|P(x) \right\}$” is called setbuilder notation.
    • The assertion $P$ is called the defining condition for the set.
    • The set $\left\{ x|P(x) \right\}$ is called the truth set of the assertion $P$.

    In these examples, $n$ is an integer variable and $x$ is a real variable.

    • The expression “$\{n| 1\lt n\lt 6 \}$” denotes the set $\{2, 3, 4, 5\}$. The defining condition is “$1\lt n\lt 6$”.  The set $\{2, 3, 4, 5\}$ is the truth set of the assertion “$n$ is an integer and $1\lt n\lt 6$”.
    • The notation $\left\{x|{{x}^{2}}-4=0 \right\}$ denotes the set $\{2,-2\}$.
    • $\left\{ x|x+1=x \right\}$ denotes the empty set.
    • $\left\{ x|x+0=x \right\}=\mathbb{R}$.
    • $\left\{ x|x\gt6 \right\}$ is the infinite set of all real numbers bigger than $6$.  For example, $6\notin \left\{ x|x\gt6 \right\}$ and $17\pi \in \left\{ x|x\gt6 \right\}$.
    • The set $\mathbb{I}$ defined by $\mathbb{I}=\left\{ x|0\le x\le 1 \right\}$ has among its elements $0$, $1/4$, $\pi /4$, $1$, and infinitely many other numbers. $\mathbb{I}$ is fairly standard notation for this set, which is called the unit interval.
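
    Set comprehensions in Python read much like setbuilder notation, so some of the examples above can be sketched in code. One assumption must be stated loudly: a program cannot range over all integers or all real numbers, so the name `universe` below is a small finite stand-in for the type of the variable. The condition $x+0=x$ therefore yields the whole stand-in universe rather than $\mathbb{R}$.

```python
# Assumption: a small finite universe of integers stands in for the variable's type.
universe = range(-10, 11)

# {n | 1 < n < 6}
middle = {n for n in universe if 1 < n < 6}

# {x | x^2 - 4 = 0}
roots = {x for x in universe if x**2 - 4 == 0}

# {x | x + 1 = x}: no element satisfies the condition
nothing = {x for x in universe if x + 1 == x}

# {x | x + 0 = x}: every element satisfies the condition
everything = {x for x in universe if x + 0 == x}

print(sorted(middle), sorted(roots), nothing == set(), everything == set(universe))
```

    In each comprehension the part after `if` plays the role of the defining condition, and the resulting set is the truth set of that condition within the chosen universe.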

    Usage and terminology

    • A colon may be used instead of “|”. So $\{x|x\gt6\}$ could be written $\{x:x\gt6\}$.
    • Logicians and some mathematicians call the truth set of $P$ the extension of $P$. This is not connected with the usual English meaning of “extension” as an add-on.
    • When the assertion $P$ is an equation, the truth set of $P$ is usually called the solution set of $P$. So $\{2,-2\}$ is the solution set of $x^2=4$.
    • The expression “$\{n|1\lt n\lt6\}$” is commonly pronounced as “The set of integers such that $1\lt n$ and $n\lt6$.” This means exactly the set $\{2,3,4,5\}$. Students whose native language is not English sometimes assume that a set such as $\{2,4,5\}$ fits the description.

    Setbuilder notation is tricky

    Looking different doesn’t mean they are different.

    A set can be expressed in many different ways in setbuilder notation. For example, $\left\{ x|x\gt6 \right\}=\left\{ x|x\ge 6\text{ and }x\ne 6 \right\}$. Those two expressions denote exactly the same set. (But $\left\{x|x^2\gt36 \right\}$ is a different set.)
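
    The comparison can be spot-checked on a finite sample. This is only a sketch: `sample` is an assumed range of integers, and agreement on a sample does not prove that two sets of real numbers are equal, although a single disagreement does prove that they differ.

```python
# Assumption: a finite sample of integers, used only to spot-check the three conditions.
sample = range(-10, 11)

first = {x for x in sample if x > 6}
second = {x for x in sample if x >= 6 and x != 6}
third = {x for x in sample if x**2 > 36}

print(first == second)        # the two conditions pick out the same elements
print(sorted(third - first))  # elements satisfying x^2 > 36 but not x > 6
```

    The third condition also admits numbers less than $-6$, which is why it defines a different set.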

    Russell’s Paradox

    In certain areas of math research, setbuilder notation can go seriously wrong. See Russell’s Paradox if you are curious.

    Variations on setbuilder notation

    An expression may be used to the left of the vertical line in setbuilder notation, instead of a single variable.

    Giving the type of the variable

    You can use an expression on the left side of setbuilder notation to indicate the type of the variable.


    The unit interval $\mathbb{I}$ could be defined as \[\mathbb{I}=\left\{x\in \mathbb{R}\,|\,0\le x\le 1 \right\}\]making it clear that it is a set of real numbers rather than, say, rational numbers.  You can always get rid of the type expression to the left of the vertical line by complicating the defining condition, like this:\[\mathbb{I}=\left\{ x|x\in \mathbb{R}\text{ and }0\le x\le 1 \right\}\]

    Other expressions on the left side

    Other kinds of expressions occur before the vertical line in setbuilder notation as well.


    The set\[\left\{ {{n}^{2}}\,|\,n\in \mathbb{Z} \right\}\]consists of all the squares of integers; in other words its elements are $0, 1, 4, 9, 16, \ldots$.  This definition could be rewritten as $\left\{m|\text{ there is an }n\in \mathbb{Z}\text{ such that }m={{n}^{2}} \right\}$.


    Let $A=\left\{1,3,6 \right\}$.  Then $\left\{ n-2\,|\,n\in A\right\}=\left\{ -1,1,4 \right\}$.


    Be careful when you read such expressions.


    The integer $9$ is an element of the set \[\left\{{{n}^{2}}\,|\,n\in \mathbb{Z}\text{ and }n\ne 3 \right\}\]It is true that $9={{3}^{2}}$ and that $3$ is excluded by the defining condition, but it is also true that $9={{(-3)}^{2}}$, and $-3$ is an integer that is not ruled out by the defining condition.
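
    These variations also have a direct counterpart in Python comprehensions, where the expression before `for` corresponds to the expression left of the vertical line. As a sketch, `integers` below is an assumed finite stand-in for $\mathbb{Z}$, so only the squares of integers from $-5$ to $5$ appear.

```python
# Assumption: a finite range standing in for the set Z of all integers.
integers = range(-5, 6)

# {n^2 | n in Z}, restricted to the stand-in range
squares = {n**2 for n in integers}

# {n - 2 | n in A} with A = {1, 3, 6}
A = {1, 3, 6}
shifted = {n - 2 for n in A}

# {n^2 | n in Z and n != 3}: is 9 still an element?
squares_without_3 = {n**2 for n in integers if n != 3}
witnesses = [n for n in integers if n != 3 and n**2 == 9]

print(sorted(squares))
print(sorted(shifted))
print(9 in squares_without_3, witnesses)
```

    The list `witnesses` records which allowed values of $n$ produce $9$; it contains $-3$, which is exactly why $9$ remains in the set.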




    Thanks to Toby Bartels for corrections.


    This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.

