Algebra is a difficult foreign language

Note: This post uses MathJax.  If you see mathematical formulas with dollar signs around them, or badly formatted formulas, try refreshing the screen. Sometimes you have to do it two or three times.


In a previous post, I said that the symbolic language of mathematics is difficult to learn and that we don't teach it well. (The symbolic language includes as a subset the notation used in high school algebra, precalculus, and calculus.) I gave some examples in that post but now I want to go into more detail.  This discussion is an incomplete sketch of some aspects of the syntax of the symbolic language.  I will write one or more posts about the semantics later.

The languages of math

First, let's distinguish between mathematical English and the symbolic language of math. 

  • Mathematical English is a special register or jargon of English. It has not only its special vocabulary, like any jargon, but also used ordinary English words such as "If…then", "definition" and "let" in special ways. 
  • The symbolic language of math is a distinct, special-purpose written language which is not a dialect of the English language and can in fact be read by mathematicians with little knowledge of English.
    • It has its own symbols and rules that are quite different from spoken languages. 
    • Simple expressions can be pronounced, but complicated expressions may only be pointed to or referred to.
  • A mathematical article or book is typically written using mathematical English interspersed with expressions in the symbolic language of math.

Symbolic expressions

A symbolic noun (logicians call it a term) is an expression in the symbolic language that names a number or other mathematical object, and may carry other information as well.

  • "3" is a noun denoting the number 3.
  • "$\text{Sym}_3$" is a noun denoting the symmetric group of order 3.
  • "$2+1$" is a noun denoting the number 3.  But it contains more information than that: it describes a way of calculating 3 as a sum.
  • "$\sin^2\frac{\pi}{4}$" is a noun denoting the number $\frac{1}{2}$, and it also describes a computation that yields the number $\frac{1}{2}$.  If you understand the symbolic language and know that $\sin$ is a numerical function, you can recognize "$\sin^2\frac{\pi}{4}$" as a symbolic noun representing a number even if you don't know how to calculate it.
  • "$2+1$" and "$\sin^2\frac{\pi}{4}$" are said to be encapsulated computations.
    • The word "encapsulated" refers to the fact that to understand what the expressions mean, you must think of the computation not as a process but as an object.
    • Note that a computer program is also an object, not a process.
  • "$a+1$" and "$\sin^2\frac{\pi x}{4}$" are encapsulated computations containing variables that represent numbers. In these cases you can calculate the value of these computations if you give values to the variables.  

symbolic statement is a symbolic expression that represents a statement that is either true or false or free, meaning that it contains variables and is true or false depending on the values assigned to the variables.

  • $\pi\gt0$ is a symbolic assertion that is true.
  • $\pi\lt0$ is a symbolic assertion that it is false.  The fact that it is false does not stop it from being a symbolic assertion.
  • $x^2-5x+4\gt0$ is an assertion that is true for $x=5$ and false for $x=1$.
  • $x^2-5x+4=0$ is an assertion that is true for $x=1$ and $x=4$ and false for all other numbers $x$.
  • $x^2+2x+1=(x+1)^2$ is an assertion that is true for all numbers $x$. 

Properties of the symbolic language

The constituents of a symbolic expression are symbols for numbers, variables and other mathematical objects. In a particular expression, the symbols are arranged according to conventions that must be understood by the reader. These conventions form the syntax or grammar of symbolic expressions. 

The symbolic language has been invented piecemeal by mathematicians over the past several centuries. It is thus a natural language and like all natural languages it has irregularities and often results in ambiguous expressions. It is therefore difficult to learn and requires much practice to learn to use it well. Students learn the grammar in school and are often expected to understand it by osmosis instead of by being taught specifically.  However, it is not as difficult to learn well as a foreign language is.

In the basic symbolic language, expressions are written as strings of symbols.

  • The symbolic language gives (sometimes ambiguous) meaning to symbols placed above or below the line of symbols, so the strings are in some sense more than one dimensional but less than two-dimensional.
  • Integral notation, limit notation, and others, are two-dimensional enough to have two or three levels of symbols. 
  • Matrices are fully two-dimensional symbols, and so are commutative diagrams.
  • I will not consider graphs (in both senses) and geometric drawings in this post because I am not sure what I want to write about them.

Syntax of the language

One of the basic methods of the symbolic language is the use of constructors.  These can usually be analyzed as functions or operators, but I am thinking of "constructor" as a linguistic device for producing an expression denoting a mathematical object or assertion. Ordinary languages have constructors, too; for example "-ness" makes a noun out of a verb ("good" to "goodness") and "and" forms a grouping ("men and women").

Special symbols

The language uses special symbols both as names of specific objects and as constructors.

  • The digits "0", "1", "2" are named by special symbols.  So are some other objects: "$\emptyset$", "$\infty$".
  • Certain verbs are represented by special symbols: "$=$", "$\lt$", "$\in$", "$\subseteq$".
  • Some constructors are infixes: "$2+3$" denotes the sum of 2 and 3 and "$2-3$" denotes the difference between them.
  • Others are placed before, after, above or even below the name of an object.  Examples: $a'$, which can mean the derivative of $a$ or the name of another variable; $n!$ denotes $n$ factorial; $a^\star$ is the dual of $a$ in some contexts; $\vec{v}$ constructs a vector whose name is "$v$".
  • Letters from other alphabets may be used as names of objects, either defined in the context of a particular article, or with more nearly global meaning such as "$\pi$" (but "$\pi$" can denote a projection, too).

This is a lot of stuff for students to learn. Each symbol has its own rules of use (where you put it, which sort of expression you may it with, etc.)  And the meaning is often determined by context. For example $\pi x$ usually means $\pi$ multiplied by $x$, but in some books it can mean the function $\pi$ evaluated at $x$. (But this is a remark about semantics — more in another post.)

"Systematic" notation

  • The form "$f(x)$" is systematically used to denote the value of a function $f$ at the input $x$.  But this usage has variations that confuse beginning students:
    • "$\sin\,x$" is more common than "$\sin(x)$".
    • When the function has just been named as a letter, "$f(x)$" is more common that "$fx$" but many authors do use the latter.
  • Raising a symbol after another symbol commonly denotes exponentiation: "$x^2$" denotes $x$ times $x$.  But it is used in a different meaning in the case of tensors (and elsewhere).
  • Lowering a symbol after another symbol, as in "$x_i$"  may denote an item in a sequence.  But "$f_x$" is more likely to denote a partial derivative.
  • The integral notation is quite complicated.  The expression \[\int_a^b f(x)\,dx\] has three parameters, $a$, $b$ and $f$, and a bound variable $x$ that specifies the variable used in the formula for $f$.  Students gradually learn the significance of these facts as they work with integrals. 


Variables have deep problems concerned with their meaning (semantics). But substitution for variables causes syntactic problems that students have difficulty with as well.

  • Substituting $4$ for $x$ in the expression $3+x$ results in $3+4$. 
  • Substituting $4$ for $x$ in the expression $3x$ results in $12$, not $34$. 
  • Substituting "$y+z$" in the expression $3x$ results in $3(y+z)$, not $3y+z$.  Some of my calculus students in preforming this substitution would write $3\,\,y+z$, using a space to separate.  The rules don't allow that, but I think it is a perfectly natural mistake. 

Using expressions and writing about them

  • If I write "If $x$ is an odd integer, then $3+x$ is odd", then I am using $3+x$ in a sentence. It is a noun denoting an unspecified number which can be constructed in a specified way.
  • When I mention substituting $4$ for $x$ in "$3+x$", I am talking about the expression $3+x$.  I am not writing about a number, I am writing about a string of symbols.  This distinction causes students major difficulties and teacher hardly ever talk about it.
  • In the section on variables, I wrote "the expression $3+x$", which shows more explicitly that I am talking about it as an expression.
    • Note that quotes in novels don't mean you are talking about the expression inside the quotes, it means you are describing the act of a person saying something.
  • It is very common to write something like, "If I substitute $4$ for $x$ in $3x$ I get $3 \times 4=12$".  This is called a parenthetic assertion, and it is literally nonsense (it says I get an equation).
  • If I pronounce the sentence "We know that $x\gt0$" we pronounce "$x\gt0$" as "$x$ is greater than zero",  If I pronounce the sentence "For any $x\gt0$ there is $y\gt0$ for which $x\gt y$", then I pronounce the expression "$x\gt0$" as "$x$ greater than zero$",  This is an example of context-sensitive pronunciation
  • There is a lot more about parenthetic assertions and context-sensitive pronunciation in More about the languages of math.


I have described some aspects of the syntax of the symbolic language of math. Learning that syntax is difficult and requires a lot of practice. Students who manage to learn the syntax and semantics can go on to learn further math, but students who don't are forever blocked from many rewarding careers. I heard someone say at the MathFest in Madison that about 25% of all high school students never really understand algebra.  I have only taught college students, but some students (maybe 5%) who get into freshman calculus in college are weak enough in algebra that they cannot continue. 

I am not proposing that all aspects of the syntax (or semantics) be taught explicitly.  A lot must be learned by doing algebra, where they pick up the syntax subconsciously just as they pick up lots of other behavior-information in and out of school. But teachers should explicitly understand the structure of algebra at least in some basic way so that they can be aware of the source of many of the students' problems. 

It is likely that the widespread use of computers will allow some parts of the symbolic language of math to be replaced by other methods such as using Excel or some visual manipulation of operations as suggested in my post Mathematical and linguistic ability.  It is also likely that the symbolic language will gradually be improved to get rid of ambiguities and irregularities.  But a deliberate top-down effort to simplify notation will not succeed. Such things rarely succeed.




Send to Kindle

9 thoughts on “Algebra is a difficult foreign language”

Leave a Reply

Your email address will not be published. Required fields are marked *