Please read this post at abstractmath.org. I originally posted the document here but some of the diagrams would not render, and I haven’t been able to figure out why. Sorry for having to redirect.
I have been working my way through abstractmath.org, revising the articles and turning them into pure HTML so they will be easier to update. In some cases I am making substantial revisions. In particular, many of the articles need a more modern point of view.
The math community’s understanding of sets and structures has changed because of category theory and will change
because of homotopy type theory.
This post considers some issues and possibilities concerning the chapter on sets.
The references listed at the end of the article include several about homotopy type theory. They provide different viewpoints and require different levels of sophistication.
The abmath article Specification of sets specifies what a set is in this way:
A set is a single math object distinct from but completely determined by what its elements are.
I have used this specification for sets since the eighties, first in my Discrete Math lecture notes and then in abstractmath.org. It has proved useful because it is quite simple and the statement implies lots of immediate consequences, several of which expose confusions that some students have.
Those consequences make the spec a useful teaching tool. But if a beginning abstract math student gets very far in their studies, some complications come up.
In the late nineteenth century, math people started formally defining particular math structures such as groups and various
kinds of spaces. This was normally done by starting with a set and adding structure.
You may think that “starting with a set and adding structure” brushes a lot of complications under the rug. Well, don’t look under the rug, at least not right now.
The way they thought about sets was an informal version of what is now called naive set theory. In particular, they freely defined particular sets using what is essentially set-builder notation, producing sets in a way which (I claim) satisfies my specification.
Then along came Russell’s paradox. In the context of this discussion, the paradox implied that the spec for sets is not a definition. The spec provides necessary conditions for being a set, but they are not sufficient. You can say “Let $S$ be the set of all sets that…[satisfy some condition]” until you are blue in the face, but there are conditions (including the empty condition) that don’t define a set.
The Zermelo-Fraenkel axioms were designed to provide a definition that didn’t create contradictions. The axioms accomplish this by creating a sort of hierarchy that requires each set to be defined in terms of previously defined sets. They provide a good way (but not the only one) of legitimizing our use of sets in math.
Observe that the “set of all sets” is certainly not “defined” in terms of previously defined sets!
During those days there was a movement to provide a solid foundation for mathematics. After Zermelo-Fraenkel came along, the progress of thinking seemed to be: define sets by the ZF axioms, define every other math object as a set with structure, and (in principle) write every proof in first order logic.
That list is oversimplified. In particular, the development of predicate logic was essential to this approach, but I can’t write about everything at once.
This leads to monsters such as the notorious definition of ordered pair:
The ordered pair $(a,b)$ is the set $\{a,\{a,b\}\}$.
This leads to the ludicrous statement that $a$ is an element of $(a,b)$ but that $b$ is not.
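You can even watch this happen in a programming language. Here is a sketch in Python (my own illustration, using frozensets to stand in for the sets of the definition above):

    def pair(a, b):
        # the "short" encoding discussed above: (a, b) = {a, {a, b}}
        return frozenset({a, frozenset({a, b})})

    p = pair(3, 5)
    print(3 in p)    # True:  the first coordinate is literally an element of the pair
    print(5 in p)    # False: the second coordinate is not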
By saying every math object may be modeled as a set with structure, ZF set theory becomes a model of all of math. This approach gives a useful proof that all of math is as consistent as ZF set theory is.
But many mathematicians jumped to the conclusion that every math object must be a set with structure. This approach does not match the way mathematicians think about math objects. In particular, it makes computerized proof assistance hard to use because you have to translate your thinking into sets and first order logic.
“A mathematical object is determined by the role it plays in a category.” — A. Grothendieck
In category theory, you define math structures in terms of how they relate to other math structures. This shifts the emphasis from
What is it?
to
What are its properties?
For example, an ordered pair is a mathematical object $p$ determined by these properties: $p$ has a first coordinate and a second coordinate, and $p$ is completely determined by them, in the sense that two ordered pairs are equal if and only if their first coordinates are equal and their second coordinates are equal.
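One standard way to make this precise in categorical language is the universal property of the product (the formulation below is the usual one; the notation $p_1$, $p_2$ is mine): an object $A\times B$ together with arrows $p_1:A\times B\to A$ and $p_2:A\times B\to B$ is a product of $A$ and $B$ if for every object $X$ and arrows $f:X\to A$ and $g:X\to B$ there is exactly one arrow $\langle f,g\rangle:X\to A\times B$ with \[p_1\circ\langle f,g\rangle=f\quad\text{and}\quad p_2\circ\langle f,g\rangle=g.\] An ordered pair $(a,b)$ of elements is then the arrow $\langle a,b\rangle:1\to A\times B$ determined by the elements $a:1\to A$ and $b:1\to B$, where elements are viewed as arrows from the terminal object, as discussed further below.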
“Categorical” here means “as understood in category theory”. It unfortunately has a very different meaning in model theory (set of axioms with only one model up to isomorphism) and in general usage, as in “My answer is categorically NO” said by someone who is red in the face. The word “categorial” has an entirely different meaning in linguistics. *Sigh*.
William Lawvere has produced an axiomatization of the category of sets.
The most accessible introduction to it that I know of is the article Rethinking set theory, by Tom Leinster. This axiomatization defines sets by their relationship with each other and other math objects in much the same way as the categorical definition of (for example) groups gives a definition of groups that works in any category.
The word set as used informally has two different meanings.
One of the great improvements in mathematics that homotopy type theory supplies is a systematic way of keeping track of the isomorphisms, the isomorphisms between the isomorphisms, and so on ad infinitum (literally). But note: I am just beginning to understand homotopy type theory, so regard this remark as something to be suspicious of.
I suggest that we keep the word “set” for the traditional concept and call the strict categorical concept an urset.
The traditional set $\{3,5\}$ consists of the unique two-element urset coindexed by the integers.
A (ur)set $S$ coindexed by a math structure $A$ is a monic map from $S$ to the underlying set of $A$. In this example, the map has codomain the integers and takes one element of the two-element urset to $3$ and the other to $5$.
Note added 2014-10-05 in response to Toby Bartels’ comment: I am inclined to use the names “abstract set” for “urset” and “concrete set” for coindexed sets when I revise the articles on sets. But most of the time we can get away with just “set”.
There is clearly no isomorphism of coindexed sets from $\{3,4\}$ to $\{3,5\}$, so those two traditional sets are not equal in the category of coindexed sets.
I made up the phrase “coindexed set” to use in this sense, since it is a kind of opposite of indexed set. If terminology for this already exists, lemme know. Linguists will tell you they use the word “coindexed” in a different sense.
The concept of “element” in categorical thinking is very different from the traditional idea, where an element of a set can be any mathematical object. In categorical thinking, an element of an object $A$ of a category $\mathbf{C}$ is an arrow $1\to A$ where $1$ is the terminal object. Thus $4$ as an integer is the arrow $1\to \mathbb{Z}$ whose unique value is the number $4$.
In the usage of category theory, the arrow $1\to\mathbb{R}$ whose value is the real number $4$ is a different math object from the arrow $1\to\mathbb{Z}$ whose value is the integer $4$.
A category theorist will probably agree that we can identify the integer $4$ with the real number $4$ via the well known canonical embedding of the ring of integers into the field of real numbers. But in categorical thinking you have to keep all such embeddings in mind; you don’t say the integer $4$ is the same thing as the real number $4$. (Most computer languages keep them distinct, too.)
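A rough programming analogue (Python here; the comparison is mine, not from the original article): the integer $4$ and the floating-point $4.0$ are distinct objects of distinct types, related by an explicit conversion.

    # The integer 4 and the "real" 4.0 are different objects of different types.
    print(type(4), type(4.0))    # <class 'int'> <class 'float'>
    print(4 == 4.0)              # True: the comparison implicitly uses the embedding
    print(float(4))              # 4.0 -- the embedding made explicit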
This difference is actually not hard to get used to and is in fact an improvement over traditional set theory. When you do category theory you use lots of commutative diagrams. The embeddings show up as monic arrows and are essential in keeping the different objects ($\mathbb{Z}$ and $\mathbb{R}$ in the example) separate.
The paper Relating first-order set theory and elementary toposes, by Awodey, Butz, Simpson and Streicher, introduces a concept of “structural system of inclusions” that appears to me to restore the idea of object being an element of more than one set for many purposes.
Homotopy type theory allows an object to have only one type, with much the same effect as in the categorical approach.
The arrow $1\to \mathbb{Z}$ that picks out the integer $4$ is a constant function. It is useful to think of any arrow $A\to B$ of any category as a variable element (or generalized element) of the object $B$. For example, the function $f:\mathbb{R}\to \mathbb{R}$ defined by $f(x)=x^2$ allows you to
think of $x^2$ as a variable real number, parametrized by the real number $x$. This is another way of thinking about the “$y$” in the equation $y=x^2$, which is commonly called a dependent variable.
One way to think about $y$ is that some statements about it are true, some are false, and many statements are neither true nor false.
This way of thinking about variable objects clears up a lot of confusion about variables and deserves to be more widely used in teaching.
The book Category theory for computing science provides some examples of the use of variable elements as a way of thinking about categorical ideas.
This work is licensed under a Creative
Commons Attribution-ShareAlike 2.5
License.
This post is the new revision of the chapter on Images and Metaphors in abstractmath.org.
In this chapter, I say something about mental representations (metaphors and images) in general, and provide examples of how metaphors and images help us understand math – and how they can confuse us.
Pay special attention to the section called two levels! The distinction made there is vital but is often not made explicit.
Besides mental representations, there are other kinds of representations used in math, discussed in the chapter on representations and models.
Mathematics is the tinkertoy of metaphor. –Ellis D. Cooper
We think and talk about our experiences of the world in terms of images and metaphors that are ultimately derived from immediate physical experience. They are mental representations of our experiences.

We know what a pyramid looks like. But when we refer to the government’s food pyramid we are not talking about actual food piled up to make a pyramid. We are talking about a visual image of the pyramid.
We know by direct physical experience what it means to be warm or cold. We use these words as metaphors
in many ways:
Children don’t always sort metaphors out correctly. Father: “We are all going to fly to Saint Paul to see your cousin Petunia.” Child: “But Dad, I don’t know how to fly!”
One basic fact about metaphors and images is that they apply only to certain aspects of the situation.
Our brains handle these aspects of mental representations easily and usually without our being conscious of them. They are one of the primary ways we understand the world.
Half this game is 90% mental. –Yogi Berra
Mathematicians who work with a particular kind of mathematical object
have mental representations of that type of object that help them
understand it. These mental representations come in many forms. Most of them fit into one of the types below, but the list shouldn’t be taken too seriously: some representations fit more than one of these types, and some may not fit into any of them except awkwardly.
All mental representations are conceptual metaphors. Metaphors are treated in detail in this chapter and in the chapter on images and metaphors for functions. See also literalism and Proofs without dry bones on Gyre&Gimble.
Below I list some examples. Many of them refer to the arch function, the function defined by $h(t)=25-(t-5)^{2}$.


“Continuous functions don’t have gaps in the graph”. This is a visual image, and it is usually OK.
This is a typical math example that teachers make up to raise your consciousness.
“Continuous functions can be drawn without lifting the chalk.” This is true in most familiar cases (provided you draw the graph only on a finite interval). But consider the graph of the function defined by $f(0)=0$ and \[f(t)=t\sin\frac{1}{t}\ \ \ \ \ \ \ \ \ \ (0\lt t\lt 0.16)\]
(see Split Definition). This curve is continuous and is infinitely long even though it is defined on a finite interval, so you can’t draw it with chalk at all, whether you pick up the chalk or not. Note that it has no gaps.
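Here is a numerical sketch of that claim (the sampling parameters are my own choices, not from the original text): the length of the graph on $[\varepsilon,0.16]$ keeps growing as $\varepsilon$ shrinks, roughly like the harmonic series, so the whole curve on $(0,0.16)$ has infinite length.

    import numpy as np

    def graph_length(eps, n=400_000):
        # Approximate the length of the graph of t*sin(1/t) on [eps, 0.16]
        # by a fine polygonal path (n must be large enough to resolve the
        # oscillations near eps).
        t = np.linspace(eps, 0.16, n)
        y = t * np.sin(1.0 / t)
        return np.hypot(np.diff(t), np.diff(y)).sum()

    for eps in [0.05, 0.02, 0.01, 0.005, 0.002]:
        print(eps, round(graph_length(eps), 2))   # the lengths keep increasing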

I personally use visual images to remember relationships between abstract objects, as well. For example, if I think of three groups, two of which are isomorphic (for example $\mathbb{Z}_{3}$ and $\text{Alt}_3$), I picture them in three different places in my head with a connection between the two isomorphic ones.
Here I give some examples of thinking of math objects in terms of the notation used to name them. There is much more about notation as mathematical representation in these sections of abmath:
Notation is both something you visualize in your head and also a physical representation of the object. In fact notation can also be thought of as a mathematical object in itself (common in mathematical logic and in theoretical computing science.) If you think about what notation “really is” a lot, you can easily get a headache…
One precise definition of the square root of $2$ is “the positive real number $x$ for which $x^2=2$”. Another definition is that $\sqrt{2}=e^{\frac{1}{2}\log2}$.
For example, the prime factorization of $2646$ tells you immediately that it is divisible by $49$.
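A quick check of that observation (a sketch using the sympy library):

    from sympy import factorint

    print(factorint(2646))    # {2: 1, 3: 3, 7: 2}, i.e. 2646 = 2 * 3^3 * 7^2
    print(2646 % 49 == 0)     # True: the factor 7^2 = 49 is visible in the factorization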
When I was in high school in the 1950’s, I was taught that it was incorrect to say “two thousand, six hundred and forty six”. Being naturally rebellious I used that extra “and” in the early 1960’s in dictating some number in a telegraph message. The Western Union operator corrected me. Of course, the “and” added to the cost. (In case you are wondering, I was in the middle of a postal Diplomacy game in Graustark.)
You can think of the set containing $1$, $3$ and $5$ and nothing else as represented by its common list notation $\{1, 3, 5\}$. But remember that $\{5, 1,3\}$ is another notation for the same set. In other words the list notation has irrelevant features – the order in which the elements are listed in this case.

I remember visualizing algebra I this way even before I had ever heard of the Transformers.
It is common to think of a function as a process: you put in a number (or other object) and the process produces another number or other object. There are examples in Images and metaphors for functions.
Let’s divide $66$ by $7$ using long division. The process consists of writing down the decimal places one by one.
You can continue with the procedure to get as many decimal places as you wish of $\frac{66}{7}$.
The sequence of actions in that procedure is quite difficult to follow. What is difficult is not understanding what to do at each step, but seeing where the numbers come from. So here is an exercise worth doing: check that the procedure above is exactly what you do when you divide $66$ by $7$ by the usual method taught in grammar school.

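A minimal sketch of the digit-by-digit procedure in Python (the function name and layout are mine):

    def decimal_digits(numerator, divisor, places):
        # Long division: produce the integer part and the first `places`
        # decimal digits of numerator/divisor, one digit at a time.
        digits = []
        quotient, remainder = divmod(numerator, divisor)
        for _ in range(places):
            remainder *= 10                       # "bring down" a zero
            digit, remainder = divmod(remainder, divisor)
            digits.append(str(digit))
        return f"{quotient}." + "".join(digits)

    print(decimal_digits(66, 7, 6))   # 9.428571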
A particular kind of metaphor or image for a mathematical concept is that of a mathematical object that represents the concept.
Representations as math objects are discussed primarily in the chapter on representations and models. The difference between representations as math objects and other kinds of mental representations (images and metaphors) is primarily that a math object has a precise mathematical definition. Even so, they are also mental representations.
Mental representations of a concept make up what is arguably the most important part of the mathematician’s understanding of the concept.
Different mental representations of the same kind of object
help you understand different aspects of the object.
Every important mathematical object has more than one mental representation.
But images and metaphors are also dangerous (see below).
We especially depend on metaphors and images to understand a math concept that is new to us. But if we work with it for a while, finding lots of examples, and
eventually proving theorems and providing counterexamples to conjectures, we begin to understand the concept in its own terms and the images and metaphors tend to fade away from our awareness.
Then, when someone asks us about this concept that we are now experts with, we
trundle out our old images and metaphors – and are often surprised at how difficult and misleading our listener finds them!
Some mathematicians retreat from images and metaphors because of this and refuse to do more than state the definition and some theorems about the concept. They are wrong to do this. That behavior encourages the attitude of many people that
All three of these statements are half-truths; in my opinion the third is only about 10 percent true. There is no doubt that a lot of abstract math is hard to understand, but understanding is certainly made easier with the use of images and metaphors.
This website has many examples of useful mental representations. Usually, when a chapter discusses a particular type of mathematical object, say rational numbers, there will be a subhead entitled “Images and metaphors for rational numbers”. This will suggest ways of thinking about them that many have found useful.
Images and metaphors have to be used at two different levels, depending on your purpose.
Math teachers and texts typically do not make an explicit distinction between these views, and you have to learn about it by osmosis. In practice, teachers and texts do make the distinction implicitly. They will say things
like, “You can think about this theorem as …” and later saying, “Now we give a rigorous proof of the theorem.” Abstractmath.org makes this distinction explicit in many places throughout the site.
The kind of metaphors and images discussed in the mental representations section above make math rich, colorful and intriguing to think about. This is the rich view of math. The rich view is vitally important.
You expect the ball whose trajectory is modeled by the function $h(t)$ above to slow down as it rises, so the derivative of $h$ must be smaller at $t=4$ than it is at $t=2$. A mathematician might even say that that is an “informal proof” that $h'(4)\lt h'(2)$. A rigorous proof is given below.
When we are constructing a definition or proof we cannot
trust all those wonderful images and metaphors.
From the point of view of doing proofs, math objects must be thought of as inert (or static), like your pet rock. This means they do not change, they do not move, and they do not interact with other mathematical objects. (See also abstract object.)
Above, I gave an informal argument for this. The rigorous way to see that $h'(4)\lt h'(2)$ for the arch function is to calculate the derivative \[h'(t)=10-2t\] and plug in 4 and 2 to get \[h'(4)=10-8=2\] which is less than $h'(2)=10-4=6$.
This argument picks out particular data about the function that
prove the statement. It says nothing about anything slowing down as $t$
increases. It says nothing about anything at all changing.
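If you like, the rigorous computation above can also be checked symbolically. Here is a sketch using the sympy library:

    import sympy as sp

    t = sp.symbols('t')
    h = 25 - (t - 5)**2                   # the arch function
    dh = sp.diff(h, t)                    # its derivative
    print(sp.expand(dh))                  # 10 - 2*t
    print(dh.subs(t, 4), dh.subs(t, 2))   # 2 6, so h'(4) < h'(2)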
The rigorous view does not apply to all abstract objects, but only to mathematical objects. See abstract objects for examples.
The price of metaphor is eternal vigilance.–Norbert Wiener
Every mental representation has flaws. Each one provides a way of thinking about an $A$ as a kind of $B$ in some respects. But the representation can have irrelevant features. People new to the subject will be tempted to think about $A$ as a kind of $B$ in inappropriate respects as well. This is a form of cognitive dissonance.
It may be that most difficulties students have with abstract math are based on not knowing which aspects of a given representation are applicable in a given situation. Indeed, on not being consciously aware that in general you must restrict the applicability of the mental pictures that come with a representation.
In abstractmath.org you will sometimes see this statement: “What is wrong with this metaphor:” (or image, or representation) to warn you about the flaws of that particular representation.
The graph of the arch function $h(t)$ makes it look like the two arms going downward become so nearly vertical that the curve has vertical asymptotes.
But it does not have asymptotes. The arms going down are underneath every point of the $x$-axis. For example, there is a point on the curve underneath the point $(999,0)$, namely $(999, -988011)$.
A set is sometimes described as analogous to a container. But consider: the integer $3$ is “in” the set of all odd integers, and it is also “in” the set $\left\{ 1,\,2,\,3 \right\}$. How could something be in two containers at once? (More about this here.)
An analogy can be helpful, but it isn’t the same thing as the same thing. – The Economist
Mathematicians think of the real numbers as constituting a line infinitely long in both directions, with each number as a point on the line. But this does not mean that you can think of the line as a row of points. See density of the real line.
We commonly think of functions as machines that turn one number into another. But this does not mean that, given any such function, we can construct a machine (or a program) that can calculate it. For many functions that is not only impractical, it is theoretically impossible: they are not computable (see http://en.wikipedia.org/wiki/Recursive_function_theory#Turing_computability). In other words, the machine picture of a function does not apply to all functions.
The images and metaphors you use to think about the subject are not the subject itself, and they may not apply to every aspect of it.
It seems likely that cognitive phenomena such as images and metaphors are physically represented in the brain as collections of neurons connected in specific ways. Research on this topic is proceeding rapidly. Perhaps someday we will learn things about how we think physically that actually help us learn things about math.
In any case, thinking about mathematical objects as physically represented in your brain (not necessarily completely or correctly!) wipes out a lot of the dualistic talk about ideas and physical objects as
separate kinds of things. Ideas, in particular math objects, are emergent constructs in the
physical brain.
The language that nature speaks is mathematics. The language that ordinary human beings speak is metaphor. –Freeman Dyson
“Metaphor” is used in abstractmath.org to describe a type of thought configuration. It is an implicit conceptual identification
of part of one type of situation with part of another.
Metaphors are a fundamental way we understand the world. In particular, they are a fundamental way we understand math.
The word “metaphor” is also used in rhetoric as the name of a type of figure of speech. Authors often refer to metaphor in the meaning of thought configuration as a conceptual metaphor. Other figures of speech, such as simile and synecdoche, correspond to conceptual metaphors as well.
Fauconnier, G. and Turner, M., The Way We Think: Conceptual Blending and the Mind’s Hidden Complexities. Basic Books, 2008.
Lakoff, G., Women, Fire, and Dangerous Things. The University of Chicago Press, 1986.
Lakoff, G. and Johnson, M., Metaphors We Live By. The University of Chicago Press, 1980.
Byers, W., How Mathematicians Think. Princeton University Press, 2007.
Lakoff, G. and Núñez, R. E., Where Mathematics Comes From. Basic Books, 2000.
Math Stack Exchange list of explanatory images in math.
Núñez, R. E., “Do Real Numbers Really Move?” Chapter in 18 Unconventional Essays on the Nature of Mathematics, Reuben Hersh, Ed. Springer, 2006.
Charles Wells, Handbook of Mathematical Discourse.
Charles Wells, Conceptual blending. Post in Gyre&Gimble.
Functions: images and metaphors
Real numbers: images and metaphors
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.
Cognitive neuroscientists have taken the point of view that concepts, memories, words, and so on are represented in the brain by physical systems: perhaps they are individual neurons, or systems of structures, or even waves of discharges. In my previous writing I have referred to these as modules, and I will do that here. Each module is connected to many other modules that encode various properties of the concept, thoughts and memories that occur when you think of that concept (in other words stimulate the module), and so on.
How these modules implement the way we think and perceive the world is not well understood and forms a major research task of cognitive neuroscience. The fact that they are implemented in physical systems in the brain gives us a new way of thinking about thought and perception.
There is a module in your brain representing the concept of grandmother. It is likely to be connected to other modules representing your actual grandmothers if you have any memory of them. These modules are connected to many others — memories (if you knew them), other relatives related to them, incidents in their lives that you were told about, and so on. Even if you don’t have any memory of them, you have a module representing the fact that you don’t have any memory of them, and maybe modules explaining why you don’t.
Each different aspect related to “grandmother” belongs to a separate module somehow connected to the grandmother module. That may be hard to believe, but the human brain has over eighty billion neurons.
There is a module in your brain connected with the number $42$. That module has many connections to things you know about it, such as its factorization, the fact that it is an integer, and so on. The module may also have connections to a module concerning the attitude that $42$ is the Answer. If it does, that module may have a connection with the module representing Douglas Adams. He was physically outside your body, but is the number $42$ outside your body?
That has a decidedly complicated answer. The number $42$ exists in a network of brains which communicate with each other and share some ideas about properties of $42$. So it exists socially. This social existence occasionally changes your knowledge of the properties of $42$ and in particular may make you realize that you were wrong about some of its aspects. (Perhaps you once thought it was $7\times 8$.)
This example suggests how I have been using the module idea to explain how we think about math.
I am proposing to use the idea of module as a metaphor for thinking about thinking. I believe that it clarifies a lot of the confusion people have about the relation between thinking and the real world. In particular it clarifies why we think of mathematical objects as if they were real-world objects (see Modules and math below.)
I am explicitly proposing this metaphor as a successor to previous metaphors drawn from science to explain things. For example when machines became useful in the 18th century many naturalists used metaphors such as the Universe is a Machine or the Body is a Machine as a way of understanding the world. In the 20th century we fell heavily for the metaphor that the Mind Is A Computer (or Program). Both the 18th century and the 20th century metaphors (in my opinion) improved our understanding of things, even though they both fell short in many ways.
In no way am I claiming that the ways of thinking I am pushing have anything but a rough resemblance to current neuroscientists’ thinking. Even so, further discoveries in neuroscience may give us even more insight into thinking than they do now. Unless at some point something goes awry and we have to, ahem, think differently again.
For thousands of years, new scientific theories have been giving us new metaphors for thinking about life, the universe and everything. What I am saying is: here is a new apple on the tree of knowledge; let’s eat it.
The rest of this post elaborates my proposed metaphor. Like any metaphor, it gets some things right and some wrong, and my explanations of how it works are no doubt full of errors and dubious ideas. Nevertheless, I think it is worth thinking about thought using these ideas with the usual correction process that happens in society with new metaphors.
We don’t have any direct perception of the “real world”; we have only the sensations we get from those parts of our body which sense things in the world. These sensations are organized by our brain into a theory of the world.
Our brain also has a theory of society: we are immersed in a world of people, we have close connections with some of them, and we have more distant connections with many others via speech, stories, reading and various kinds of long-distance communication.
The module associated with a math object is connected to many other modules, some of which have nothing to do with math.
Many of the ideas in this post come from my previous writing, listed in the references. This post was also inspired by ideas from Chomsky, Jackendoff (particularly Chapter 9), the Scientific American article Brain cells for Grandmother by Quian Quiroga, Fried and Koch, and the papers by Ernest and Hersh.
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.
This post outlines some of the intellectual developments in the history of math. I call it a saga because it is like one:
Thousands of years ago, we figured out how to write down words and phrases in such a way that someone much later could read and understand them.
Naturally, we wanted to keep records of the number of horses the Queen owned, so we came up with various notations for numbers (a number here representing a count). In some societies, these symbols were separate from the symbols used to represent words.
We discovered positional notation. We write $213$, which is based on a system: it means $2\times100+1\times10+3\times 1$. This notation encapsulates a particular computation of a number (its base-10 representation). (The expression $190+23$ is another piece of notation that encapsulates a computation that yields $213$.)
Compare that to the Roman notation $CCXIII$, which is an only partly organized jumble.
Try adding $CCXIII+CDXXIX$. (The answer is $DCXLII$.)
Positional notation allowed us to create the straightforward method of addition involving adding single digits and carrying:
\[\overset{\hspace{6pt}1\phantom{1}}
{\frac
{\overset{\displaystyle{213}}{429}}{642}
}
\]
Measuring land requires multiplication, which positional notation also allows us to perform easily.
The invention of such algorithms (methodically manipulating symbols) made it easy to calculate with numbers.
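A sketch of that carrying algorithm in Python (the digit-list representation and the function name are my own conventions):

    def add_base10(x_digits, y_digits):
        # Grade-school addition of two numbers given as lists of base-10 digits,
        # most significant digit first. Assumes both lists have the same length.
        result, carry = [], 0
        for dx, dy in zip(reversed(x_digits), reversed(y_digits)):
            carry, digit = divmod(dx + dy + carry, 10)
            result.append(digit)
        if carry:
            result.append(carry)
        return list(reversed(result))

    print(add_base10([2, 1, 3], [4, 2, 9]))   # [6, 4, 2]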
We discovered geometry in ancient times, in laying out plots of land and designing buildings. We had a bunch of names for different shapes and for some of them we knew how to calculate their area, perimeter and other things.
Euclid showed how to construct new geometric figures from given ones using specific methods (ruler and compasses) that preserve some properties.
We can bisect a line segment (black) by drawing two circles (blue) centered at its endpoints, each with radius the length of the segment. We then construct a line segment (red) between the two points where the circles intersect; it crosses the given line segment at its midpoint. These constructions can be thought of as algorithms creating and acting on geometric figures rather than on symbols.

It is true that diagrams were drawn to represent line segments, triangles and so on.
But the diagrams are visualization helpers. The way we think about the process is that we are operating directly on the geometric objects to create new ones. We are thinking of the objects Platonically, although we don’t have to share Plato’s exact concept of their reality. It is enough to say we are thinking about the objects as if they were real.
Euclid came up with the idea that we should write down axioms that are true of these figures and constructions, so that we can systematically use the constructions
to prove theorems about figures using axioms and previously proved theorems. This provided documented reasoning (in natural language, not in symbols) for building up a collection of true statements about math objects.
After creating some tools for proving triangles are congruent, we can prove that the intersection of the red and black lines in the figure really is the midpoint of the black line by constructing the four green line segments below and appealing to congruences between the triangles that show up:

Note that the green lines have the same length as the black line.
Euclid thought about axioms and theorems as applying to geometry, but he also proved theorems about numbers by representing them as ratios of line segments.
People in ancient India and Greece knew how to solve linear and quadratic equations using verbal descriptions of what you should do.
Later, we started using a symbolic language to express numerical problems and symbolic manipulation to solve (some of) them.
The quadratic formula is an encapsulated computation that provides the roots of a quadratic equation. Newton’s method is a procedure for finding a root of an arbitrary polynomial. It is recursive in the loose sense (it does not always give an answer).
The symbolic language is a vast expansion of the symbolic notation for numbers. A major innovation was to introduce variables to represent unknowns and to state equations that are always true.
Aristotle developed an early form of logic (syllogisms) aimed at determining which arguments in rhetoric were sound. “All men are mortal. Socrates is a man. Therefore Socrates is mortal.” This was written in sentences, not in symbols.
By explicit analogy with algebra, we introduced symbolism and manipulation rules for logical reasoning, with an eye toward making mathematical reasoning sound and to some extent computable. For example, in one dialect of logical notation, modus ponens (used in the Socrates syllogism) is expressed as $(P\rightarrow Q,\,P)\,\,\vdash\,\, Q$. This formula is an encapsulated algorithm: it says that if you know $P\rightarrow Q$ and $P$ are valid (are theorems) then $Q$ is valid as well.
We struggled with the notion of function as a result of dealing with infinite series. For example, the limit of a sequence of algebraic expressions may not be an algebraic expression. It would no longer do to think of a function as the same thing as an algebraic expression.
We realized that Euclid’s axioms for geometry lacked clarity. For example, as I understand it, the original version of his axioms didn’t imply that the two circles in the proof above had to intersect each other. There were other more subtle problems. Hilbert made a big effort to spell out the axioms in more detail.
We refined our understanding of logic by trying to deal with the mysteries of calculus, limits and spaces. An example is the difference between continuity and uniform continuity.
We also created infinitesimals, only to throw up our hands because we could not make a logic that fit them. Infinitesimals were temporarily replaced by the use of epsilon-delta methods.
We began to understand that there are different kinds of spaces. For example, there were other models of some of Euclid’s axioms than just Euclidean space, and some of those models showed that the parallel axiom is independent of the other axioms. And we became aware of many kinds of topological spaces and manifolds.
We started to investigate sets, in part because spaces have sets of points. Then we discovered that a perfectly innocent activity like considering the set of all sets resulted in an impossibility.
We were led to consider how to understand the Axiom of Choice by several upsetting discoveries. For example, the Banach-Tarski “paradox” implies that you can cut a solid ball of radius $1$ into finitely many pieces and rearrange them into two solid balls of radius $1$.
These problems caused a kind of tightening up, or rigorizing.
For a period of time, less than a century, we settled into a standard way of practicing research mathematics called new math or modern math. Those names were used mostly by math educators. Research mathematicians might have called it axiomatic math based on set theory. Although I was around for the last part of that period I was not aware of any professional mathematicians calling it anything at all; it was just what we did.
First, we would come up with a new concept, type of math object, or a new theorem. In this creative process we would freely use intuition, metaphors, images and analogies.
We might come up with the idea that a function reaches its maximum when its graph swoops up from the left, then goes horizontally for an infinitesimal amount of time, then swoops down to the right. The point at which it was going horizontally would obviously have to be the maximum.
But when we came to publish a paper about the subject, all these pictures would disappear. All our visual, metaphorical/conceptual and kinetic feelings that explain the phenomenon would have to be suppressed.
Rigorizing consisted of a set of practices, which I will hint at:
Each mathematical object had to be defined in some way that started with a set and some other data defined in terms of the set. Axioms were imposed on these data. Everything had to be defined in terms of sets, including functions and relations. (Multiple sets were used occasionally.)
Definitions done in this way omit a lot of the intuition that we have concerning the object being defined.
Even so, definitions done in this way have an advantage: They tend to be close to minimal in the sense that to verify that something fits the definition requires checking no more (or not much more) than necessary.
First order logic (and other sorts of logic) were well developed and proofs were written in a way that they could in principle be reduced to arguments written in the notation of symbolic logic and following the rules of inference of logic. This resulted in proofs which did not appeal to intuition, metaphors or pictures.
In the case of the theorem that the maximum of a (differentiable) function occurs only where the derivative is zero, that meant epsilon-delta proofs in which the proof appeared as a thick string of symbols. Here, “thick” means it had superscripts, subscripts, and other things that gave the string a fractal dimension of about $1.2$ (just guessing!).
When I was a student at Oberlin College in 1959, Fuzzy Vance (Elbridge P. Vance) would sometimes stop in the middle of an epsilon-delta proof and draw pictures and provide intuition. Before he started that he would say “Shut the door, don’t tell anyone”. (But he told us!)
A more famous example of this is the story that Oscar Zariski, when presenting a proof in algebraic geometry at the board, would sometimes remind himself of a part of a proof by hunching over the board so the students couldn’t see what he was doing and drawing a diagram which he would immediately erase. (I fault him for not telling them about the diagram.)
It doesn’t matter whether this story is true or not. It is true in the sense that any good myth is true.
I wrote about rigor in these articles:
Rigorous view in abstractmath.org.
Dry bones, post in this blog.
The orthodox method of “define it by sets and axioms” and “make proofs at least resemble first order logic” clarified a lot of suspect proofs. But it got in the way of intuition, and excessive teaching via such proofs made it harder for students to learn.
The early methods described at the beginning of this post began to be used everywhere in math.
Algorithms, or methodical procedures, began with the addition and multiplication algorithms and Euclid’s ruler and compass constructions, but they began to be used everywhere.
They are applied to the symbols of math, for example to describe rules for calculating derivatives and integrals and for summing infinite series.
Algorithms are used on strings, arrays and diagrams of math symbols, for example concatenating lists, multiplying matrices, and calculating binary operations on trees.
Algorithms are used to define the strings that make up the notation of symbolic logic. Such definitions include something like: “If $E$ and $F$ are expressions then $(E)\land (F)$ and $(\forall x)(E)$ are expressions”. So if $E$ is “$x\geq 3$” then $(\forall x)(x\geq 3)$ is an expression. This had the effect of turning an expression in symbolic logic into a mathematical object. Deduction rules such as “$E\land F\vdash E$” also become mathematical objects in this way.
We can define the symbols and expressions of algebra, calculus, and other parts of math using algorithms, too. This became a big deal when computer algebra programs such as Mathematica came in.
You can define the set $RP$ of real polynomials this way: every real number is in $RP$, the variable $x$ is in $RP$, and if $p$ and $q$ are in $RP$, then so are $p+q$ and $p\,q$.
That is a recursive definition. You can also define polynomials by pattern recognition:
Let $n$ be a positive integer, let $a_0, a_1, \ldots, a_n$ be real numbers and let $k_0, k_1, \ldots, k_n$ be nonnegative integers. Then $a_0 x^{k_0}+a_1 x^{k_1}+\cdots+a_n x^{k_n}$ is a polynomial.
The recursive version is a way of letting a compiler discover that a string of symbols is a polynomial. That sort of thing became a Big Deal when computers arrived in our world.
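A minimal sketch of how a recursive definition lets a program recognize polynomials (the nested-tuple representation is my own choice, not anything standard):

    def is_polynomial(e):
        # every real number is a polynomial
        if isinstance(e, (int, float)):
            return True
        # the variable x is a polynomial
        if e == 'x':
            return True
        # if p and q are polynomials, so are p + q and p * q
        if isinstance(e, tuple) and len(e) == 3 and e[0] in ('+', '*'):
            return is_polynomial(e[1]) and is_polynomial(e[2])
        return False

    print(is_polynomial(('+', ('*', 3.0, ('*', 'x', 'x')), 5.0)))   # True:  3x^2 + 5
    print(is_polynomial(('+', 'x', ('/', 1, 'x'))))                 # False: 1/x is not allowed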
I am using the word “algorithm” in a loose sense to mean any computation that may or may not give a result. Computer programs are algorithms, but so is the quadratic formula. You might not think of a formula as an algorithm, but that is because if you use it in a computer program you just type in the formula; the language compiler has a built-in algorithm to execute calculations given by formulas.
It has not been clearly understood that mathematicians apply algorithms not only to symbols, but also directly to mathematical objects. Socrates thought that way long ago, as I described in the construction of a midpoint above. The procedure says “draw circles with center at the endpoints of the line segment.” It doesn’t say “draw pictures of circles…”
In the last section and this one, I am talking about how we think of applying an algorithm. Socrates thought he was talking about ideal lines and circles that exist in some other universe that we can access by thought. We can think about them as real things without making a metaphysical claim like Socrates did. Our brains are wired to think of abstract ideas in many of the same ways we think about physical objects.
The unit circle (as a topological space at least) is the quotient space of the space $\mathbb{R}$ of real numbers mod the equivalence relation defined by: $x\sim y$ if and only if $x-y$ is an integer.
Mathematicians who understand that construction may have various images in their mind when they read this. One would be something like imagining the real line $\mathbb{R}$ and then gluing all the points together that are an integer apart. This is a distinctly dizzying thing to think about but mathematicians aren’t worried because they know that taking the quotient of a space is a well-understood construction that works. They might check that by imagining the unit circle as the real line wrapped around an infinite number of times, with points an integer apart corresponding to the same point on the unit circle. (When I did that check I hastily inserted the parenthetical remark saying “as a topological space” because I realized the construction doesn’t preserve the metric.) The point of this paragraph is that many mathematicians think of this construction as a construction on math objects, not a construction on symbols.
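A small numerical illustration (the map $t\mapsto(\cos 2\pi t,\sin 2\pi t)$ is the standard one; the check itself is my own sketch): points of $\mathbb{R}$ that differ by an integer land on the same point of the unit circle.

    import numpy as np

    def to_circle(t):
        # wrap the real line around the unit circle
        return np.array([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)])

    print(np.allclose(to_circle(0.3), to_circle(7.3)))    # True:  0.3 ~ 7.3
    print(np.allclose(to_circle(0.3), to_circle(0.8)))    # False: not equivalent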
A lot of concepts start out as semi-vague ideas and eventually get defined as mathematical objects.
The introduction of categories broke the orthodoxy of everything-is-a-set. It has become widely used as a language in many branches of math. It started with problems in homological algebra arising in at least these two ways:
I expect to write about homotopy type theory soon. It may be the Next Revolution.
A mathematician’s mental representation of a function is generally quite rich and may involve many different metaphors and images kept in mind simultaneously. The abmath article on metaphors and images for functions discusses many of these representations, although the article is incomplete. This post is a fairly thorough rewrite of the discussion in that article of the representation of the concept of “function” as a mathematical object. You must think of functions as math objects when you are taking the rigorous view, which happens when you are trying to prove something about functions (or large classes of functions) in general.
What often happens is that you visualize one of your functions in many of the ways described in this article (it is a calculation, it maps one space to another, its graph is bounded, and so on) but those images can mislead you. So when you are completely stuck, you go back to thinking of the function as an axiomatically-defined mathematical structure of some sort that just sits there, like a complicated machine where you can see all the parts and how they relate to each other. That enables you to prove things by strict logical deduction. (Mathematicians mostly only go this far when they are desperate. We would much rather quote somebody’s theorem.) This is what I have called the dry bones approach.
The “mathematical structure” is most commonly a definition of function in terms of sets and axioms. The abmath article Specification and definition of “function” discusses the usual definitions of “function” in detail.
This example is intended to raise your consciousness about the possibilities for functions as objects.
Consider the function $f:\mathbb{R}\to\mathbb{R}$ defined by $f(x)=2\sin^{2}x-1$. Its value can be computed at many different numbers but it is a single, static math object.
The discussion above shows many examples of thinking of a function as an object. You are thinking about it as an undivided whole, as a chunk, just as you think of the number $3$ (or $\pi$) as just a thing. You think the same way about your bicycle as a whole when you say, “I’ll ride my bike to the library”. But if the transmission jams, then you have to put it down on the grass and observe its individual pieces and their relation to each other (the chain came off a gear or whatever), in much the same way as noticing that the function $g(x)=x^3$ goes through the origin and looks kind of flat there, but at $(2,8)$ it is really rather steep. Phrases like “steep” and “goes through the origin” are a clue that you are thinking of the function as a curve that goes left to right and levels off in one place and goes up fast in another — you are thinking in a dynamic, not a static way like the dry bones of a math object.
I propose the phrase "shared mental object" to name the sort of thing that includes mathematical objects, abstract objects, fictional objects and other concepts with the following properties:
It is the name "shared mental object" that is a new idea; the concept has been around in philosophy and math ed for a while and has been called various things, especially "abstract object", which is the name I have used in abstractmath.
I will go into detail concerning some examples in order to make the concept clear. If you examine this concept deeply you discover many fine points, nested ideas and circles of examples that go back on themselves. I will not get very far into these fine points here, but I have written about some of them in posts and in abmath (see references). I am working on a post about some of the fine points and will publish it if I can control its tendency to expand into infinite proliferation and recursion.
There is a story about the early days of telegraphy: A man comes into the newly-opened telegraph station and asks to send a telegram to his son who is working in another city. He writes out the message and gives it to the operator with his payment. The operator puts the message on a spike and clicks the key in front of him for a while, then says, "I have sent your message. Thanks for shopping at Postal Telegraph". The man looks astonished and points at the message and says, "But it is still here!"
A message is a shared mental object.
Other examples that are similar in nature to messages are schedules and the month of September (see Math Objects in abmath, where they are called abstract objects.). In English-speaking communities, September is a cultural default: you are expected to know what it is. You can know that September is a month and that right this minute it is not September (unless it is September). You may think that September has 31 days and most people would say you are wrong, but they would agree that you and they are talking about the same month.
The general concept of the month of September and facts concerning it have been in shared existence in English-speaking cultural groups for (maybe) a thousand years. In contrast, a message is usually shared by only two or three people and it has a short life; a few years from now, it may be that none of the people involved with the message remember what it said or even that it existed.
A symbol, such as the letter "a" or the integral sign "$\int$", is a shared mental object. Like the month of September, but unlike messages, letters are shared by large cultural entities, every language community that uses the Latin alphabet (and more) in the case of "a", and math and tech people in the case of "$\int$".
The letter "a" is represented physically on paper, a blackboard or a screen, among other things. If you are literate in English and recognize an occurrence as representing the letter, you probably do this using a process in the brain that is automatic and that operates outside your awareness.
Literate readers of English also generally agree that a string of letters either does or does not represent the word "default" but there are borderline cases (as in those little boxes where you have to prove you are not a robot) where they may disagree or admit that they don't know. Even so, the letter "a" and the word "default" are shared in the minds of many people and there is general (but not absolutely universal) agreement on when you are seeing representations of them.
Fictional objects such as Sherlock Holmes and unicorns are shared mental objects. I wrote briefly about them in Mathematical objects and will not go into them here.
The integer $111$, the integral $\int_0^1 x^2\,dx$ and the set of all real numbers are all mathematical objects. They are all shared mental objects. In most of the world, people with a little education will know that $111$ is a number and what it means to have $111$ beans in a jar (for example). They know that it is one more than $110$ and a lot more than $42$.
Mathematicians, scientists and STEM students will know something about what $\int_0^1 x^2\,dx$ means and they will probably know how to calculate it. Most of them may be able to do it in their head. I have taught calculus so many times that I know it "by heart", which means that it is associated in my brain with the number $1/3$ in such a way that when I see the integral the number automatically and without effort pops up (in the same way that I know September has 30 days).
Beginning calculus students may have a confused and incorrect understanding of the set of all real numbers in several ways, but practicing mathematicians (and many others) know that it is an uncountably infinite dense set and they think of it as an object. A student very likely does not think of it as an object, but as a sprawling unimaginable space that you cannot possibly regard as a thing. Students may picture a real number as having another real number sitting right beside it — the next biggest one. Most practicing mathematicians think of the set of real numbers as a completed infinity — every real number is already there — and they know that between any two of them there is another one.
As a consequence, when students and professors talk about real numbers, the student sometimes finds that the professor says things that sound completely wrong, and the professor hears the student say things that are bizarre and confused. They firmly believe they are talking about the same thing, the real numbers, but the student is seen by the professor as wrong and the professor is seen by the student as talking meaningless nonsense. Even so, they believe they are talking about the same thing.
I tried various other names before I came to "shared mental objects".
The major advantage of "shared mental object" is that it describes the important properties of the concept: It is a mental object and it is shared by people. It has no philosophical implications concerning platonism, either. Mathematical objects do have special properties of verifiability that general shared mental objects do not, but my terminology does not suggest any existence of absolute truth or of an Ideal existing in another world. I don't believe in such things, but some people do and I want to point out that "shared mental object" does not rule such things out — it merely gives a direct evidence-based description of a phenomenon that actually exists in the real world.
Abstract objects in the Stanford Encyclopedia of Philosophy
Abstract object in Wikipedia
Mathematical objects in abstractmath
Mathematical objects in Wikipedia
R. Hersh, What is Mathematics, Really?
Previous posts
Representations of mathematical objects
Representations III: Rigor and Rigor Mortis
This post uses MathJax. If you see mathematical expressions with dollar signs around them, or badly formatted formulas, try refreshing the screen. Sometimes you have to do it two or three times.
This is a long post. Notes on viewing.
A mathematical object, or a type of math object, is represented in practice in a great variety of ways, including some that mathematicians rarely think of as "representations".
In this post you will find examples and comments about many different types of representations as well as references to the literature. I am not aware that anyone has considered all these different ideas of representation in one place before. Reading through this post should raise your consciousness about what is going on when you do math.
This is also an experiment in exposition. The examples are discussed in a style similar to the way a Mathematica command is discussed in the Documentation Center, using mostly nonhierarchical bulleted lists. I find it easy to discover what I want to know when it is written in that way. (What is hard is discovering the name of a command that will do what I want.)
The representation itself may be a mathematical object, such as:
A math object can be represented visually using a physical object such as a picture, graph (in several senses), or diagram.
If you are a mathematician, a math object such as "$42$", "the real numbers" or "continuity" has a mental representation in your brain.
Conceptual metaphors are a particular kind of mental representation of an object which involve mentally associating some aspects of the objects with some aspects of something else — a physical object, an image, an action or another abstract object.
A representation of a math object may or may not
This list shows many of the possibilities of representation. In each case I discuss the example in terms of the two bulleted lists above. Some of the examples are reused from my previous publications.
Example (F1) "Let $f(x)$ be the function defined by $f(x)=x^3-x$."
You would expect $f(x)$ by itself to mean the value of $f$ at $x$, but in (F1) the $x$ has the property of a bound variable. In mathematical English, "let" binds variables. However, after the definition, in the text the "$x$" in the expression "$f(x)$" will be free, but the $f$ will be bound to the specific meaning. It is reasonable to say that the term "$f(x)$" represents the expression "$x^3-x$" and that $f$ is the (temporary) name of the function. Nevertheless, it is very common to say "the function $f(x)$" to mean $f$.
A fluent reader of mathematical English knows all this, but probably no one has ever said it explicitly to them. Mathematical English and the symbolic language should be taught explicitly, including its peculiarities such as "the function $f(x)$". (You may want to deprecate this usage when you teach it, but students deserve to understand its meaning.)
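A rough programming analogue (my own comparison, not from the abmath article): in a Python definition the parameter plays the role of the bound variable, and afterwards the name refers to the function itself.

    # In the definition below, x is bound (it is just the parameter name);
    # afterwards f names the whole function, and f(2) denotes a value.
    f = lambda x: x**3 - x
    print(f(2))    # 6 -- "f(2)" is the value of f at 2
    print(f)       # the function object itself, what "the function f" refers to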
You have a mental representation of the positive integers $1,2,3,\ldots$. In this discussion I will assume that "you" know a certain amount of math. Non-mathematicians may have very different mental representations of the integers.
This is a graph of the function $y=x^3-x$:
Example (C1) The $\epsilon-\delta$ definition of the continuity of a function $f:\mathbb{R}\to\mathbb{R}$ may be given in the symbolic language of math:
A function $f$ is continuous at a number $c$ if \[\forall\epsilon\,(\epsilon\gt0\implies\exists\delta\,(\delta\gt0\wedge\forall x\,(|x-c|\lt\delta\implies|f(x)-f(c)|\lt\epsilon)))\]
Example (C2) The definition of continuity can also be represented in mathematical English like this:
A function $f$ is continuous at a number $c$ if for any $\epsilon\gt0$ there is a $\delta\gt0$ such that for any $x$, if $|x-c|\lt\delta$, then $|f(x)-f(c)|\lt\epsilon$.
Example (C3) The definition of continuity can be given in a formally defined first order logical theory.
Example (C4) A function from one topological space to another is continuous if the inverse of every open set in the codomain is an open set in the domain.
Example (C4) "The graph of a continuous function can be drawn without picking up the chalk".
The interactive examples in this post require installing Wolfram CDF player, which is free and works on most desktop computers using Firefox, Safari and Internet Explorer, but not Chrome. The source code is the Mathematica Notebook Representing sets.nb, which is available for free use under a Creative Commons Attribution-ShareAlike 2.5 License. The notebook can be read by CDF Player if you cannot make the embedded versions in this post work.
Sets are represented in the math literature in several different ways, some of which are described here, along with some other possibilities. Introducing a variety of representations of any type of math object is desirable because students tend to assume that the representation is the object.
The standard representation for a finite set is of the form "$\{1,3,5,6\}$". This particular example represents the unique set containing the integers $1$, $3$, $5$ and $6$ and nothing else. This means precisely that the statement "$n$ is an element of $S$" is true if $n=1$, $n=3$, $n=5$ or $n=6$, and it is false if $n$ represents any other mathematical object.
In the way the notation is usually used, "$\{1,3,5,6\}$", "$\{3,1,5,6\}$", "$\{1,5,3,6\}$", "$\{1,6,3,5,1\}$" and "$\{6,6,3,5,1,5\}$" all represent the same set. Textbooks sometimes say "order and repetition don't matter". But that is a statement about this particular representation style for sets. It is not a statement about sets.
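Python's set literals happen to follow the same convention, which allows a quick illustration (an aside of mine, not part of the original discussion):

    # Order and repetition in the notation do not change which set is denoted.
    print({1, 3, 5, 6} == {3, 1, 5, 6} == {1, 6, 3, 5, 1} == {6, 6, 3, 5, 1, 5})   # True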
It would be nice to come up with a representation for sets that doesn't involve an ordering. Traditional algebraic notation is essentially one-dimensional and so automatically imposes an ordering (see Algebra is a difficult foreign language).
In Visible Algebra II, I experimented with the idea of putting the elements at random inside a circle and letting them visibly move around like goldfish in a bowl. (That experiment was actually for multisets but it applies to sets, too.) This is certainly a representation that does not impose an ordering, but it is also distracting. Our visual system is attracted to movement (but not as much as a cat's visual system).
One possibility would be to extend the machinery in a visible algebra system that allows you to make a box you could drag elements into.
This box would order the elements in some canonical order (numerical order for numbers, alphabetical order for strings of letters or words) with the property that if you inserted an element in the wrong place it would rearrange itself, and if you tried to insert an element more than once the representation would not change. What you would then have is a unique representation of the set.
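A minimal sketch of that idea in Python (the function name canonical is hypothetical; it merely stands in for the proposed box): sort the elements into a fixed order and discard duplicates, so that every finite set of numbers gets exactly one display.

    def canonical(elements):
        # Keep one copy of each element and show them in numerical order.
        return sorted(set(elements))

    print(canonical([5, 1, 6, 3]))         # [1, 3, 5, 6]
    print(canonical([1, 6, 3, 5, 1, 5]))   # [1, 3, 5, 6] -- inserting an element twice changes nothing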
An example is the device below. (If you have Mathematica, not just CDF player, you can type in numbers as you wish instead of having to use the buttons.)
This does not allow a representation of a heterogeneous set such as $\{3,\mathbb{R},\emptyset,\left(\begin{array}{cc}1&2\\0&1\\ \end{array}\right)\}$. So what? You can't represent every function by a graph, either.
The tree notation used in my visual algebra posts could be used for sets as well, as illustrated below. The system allows you to drag the elements listed into different positions, including all around the set node. If you had a node for lists, that would not be possible.
This representation has the pedagogical advantage of showing that a set is not its elements.
Infinite sets are sometimes represented using the curly bracket notation using a pattern that defines the set. For example, the set of even integers could be represented by $\{0,2,4,6,\ldots\}$. Such a representation is necessarily a convention, since any beginning pattern can in fact represent an infinite number of different infinite sets. Personally, I would write, "Consider the even integers $\{0,2,4,6,\ldots\}$", but I would not write, "Consider the set $\{0,2,4,6,\ldots\}$".
By the way, if you are writing for newbies, you should say, "Consider the set of even integers $\{0,2,4,6,\ldots\}$". The sentence "Consider the even integers $\{0,2,4,6,\ldots\}$" is unambiguous because by convention a list of numbers in curly brackets defines a set. But newbies need lots of redundancy.
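To see why the pattern is only a convention, compare the intended rule with one of the infinitely many impostors that agree with it on the displayed terms. This Python sketch is my illustration (the names evens and impostor are made up); both formulas begin $0,2,4,6$:

    def evens(n):
        return 2 * n

    def impostor(n):
        # Agrees with evens(n) for n = 0, 1, 2, 3, then wanders off.
        return 2 * n + n * (n - 1) * (n - 2) * (n - 3)

    print([evens(n) for n in range(6)])      # [0, 2, 4, 6, 8, 10]
    print([impostor(n) for n in range(6)])   # [0, 2, 4, 6, 32, 130]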
Setbuilder notation is exemplified by $\{x|x>0\}$, which denotes the positive reals, given a convention or explicit statement that $x$ represents a real number. This allows the representation of some infinite sets without depending on a possibly ambiguous pattern.
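Setbuilder notation has a close programming analogue in set comprehensions. The positive reals cannot be listed, so this Python sketch of mine restricts $x$ to a small range of integers:

    # A finite analogue of {x | x > 0}, with x drawn from the integers -5..5.
    positives = {x for x in range(-5, 6) if x > 0}
    print(positives)   # {1, 2, 3, 4, 5}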
A Visible Algebra system needs to allow this, too. That could be (necessarily incompletely) done in this way:
The Can't Tell answer is a necessary requirement because the general question of whether an element is in a set defined by a first order sentence is undecidable. Perhaps the system could add some choices:
Even so, I'll bet a system using Mathematica could answer many questions like this for sentences referring to a specific polynomial, using the Solve or NSolve command. For example, the answer to the question, "Is $3\in\{n|n\lt0 \text{ and } n^2=9\}$?" (where $n$ ranges over the integers) would be "No", and the answer to "Is $\{n|n\lt0 \text{ and } n^2=9\}$ empty?" would also be "No". [Corrected 2012.10.24]
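The post mentions Mathematica's Solve and NSolve; as a stand-in, here is a hedged sketch of the same two questions answered with SymPy in Python (the variable names are mine):

    from sympy import Eq, S, solveset, symbols

    n = symbols('n', integer=True)
    # The set {n | n < 0 and n^2 = 9}, with n ranging over the integers.
    candidates = solveset(Eq(n**2, 9), n, domain=S.Integers)   # {-3, 3}
    the_set = {s for s in candidates if s < 0}                 # {-3}

    print(3 in the_set)        # False -- so the answer to "Is 3 in the set?" is No
    print(len(the_set) == 0)   # False -- so the answer to "Is the set empty?" is No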
In a previous post, I said that the symbolic language of mathematics is difficult to learn and that we don't teach it well. (The symbolic language includes as a subset the notation used in high school algebra, precalculus, and calculus.) I gave some examples in that post but now I want to go into more detail. This discussion is an incomplete sketch of some aspects of the syntax of the symbolic language. I will write one or more posts about the semantics later.
First, let's distinguish between mathematical English and the symbolic language of math.
A symbolic noun (logicians call it a term) is an expression in the symbolic language that names a number or other mathematical object, and may carry other information as well.
A symbolic statement is a symbolic expression that represents a statement that is either true, false, or free, meaning that it contains variables and becomes true or false depending on the values assigned to those variables.
The constituents of a symbolic expression are symbols for numbers, variables and other mathematical objects. In a particular expression, the symbols are arranged according to conventions that must be understood by the reader. These conventions form the syntax or grammar of symbolic expressions.
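The term/statement distinction shows up directly in computer algebra systems. A small SymPy sketch (my example, not from the post): $x^3-x$ is a symbolic noun, while $x\gt0$ is a symbolic statement with a free variable.

    from sympy import symbols

    x = symbols('x')
    term = x**3 - x       # a symbolic noun: it names a number once x is fixed
    statement = x > 0     # a symbolic statement: true or false once x is fixed

    print(term.subs(x, 2))        # 6
    print(statement.subs(x, 2))   # True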
The symbolic language has been invented piecemeal by mathematicians over the past several centuries. It is thus a natural language, and like all natural languages it has irregularities and often produces ambiguous expressions. It is therefore difficult to learn and requires much practice to use well. Students learn the grammar in school and are often expected to pick it up by osmosis instead of being taught it explicitly. However, it is not as difficult to learn well as a foreign language is.
In the basic symbolic language, expressions are written as strings of symbols.
One of the basic methods of the symbolic language is the use of constructors. These can usually be analyzed as functions or operators, but I am thinking of "constructor" as a linguistic device for producing an expression denoting a mathematical object or assertion. Ordinary languages have constructors, too; for example, "-ness" makes a noun out of an adjective ("good" to "goodness") and "and" forms a grouping ("men and women").
The language uses special symbols both as names of specific objects and as constructors.
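Computer algebra systems make the constructors explicit. In SymPy, for instance, the expression written $x^3-x$ is literally built from constructors named Add, Mul and Pow; this sketch of mine displays that hidden structure.

    from sympy import Add, Mul, srepr, symbols

    x = symbols('x')
    expr = Add(Mul(3, x), -1)   # the constructor form of the expression 3x - 1
    print(expr)                 # 3*x - 1
    print(srepr(x**3 - x))      # Add(Pow(Symbol('x'), Integer(3)), Mul(Integer(-1), Symbol('x')))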
This is a lot of stuff for students to learn. Each symbol has its own rules of use: where you put it, which sorts of expressions you may combine it with, and so on. And the meaning is often determined by context. For example, $\pi x$ usually means $\pi$ multiplied by $x$, but in some books it can mean the function $\pi$ evaluated at $x$. (But that is a remark about semantics; more in another post.)
Variables have deep problems concerned with their meaning (semantics). But substitution for variables causes syntactic problems that students have difficulty with as well.
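Here is one such syntactic problem, in a short sketch of mine: substituting $a+1$ for $x$ in $x^2$ must produce $(a+1)^2$, and the parentheses do not insert themselves. Naive textual substitution gets it wrong; SymPy's subs gets it right.

    from sympy import symbols

    x, a = symbols('x a')
    print((x**2).subs(x, a + 1))          # (a + 1)**2 -- the correct substitution
    print("x**2".replace("x", "a + 1"))   # a + 1**2 -- textual substitution, which means a + 1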
I have described some aspects of the syntax of the symbolic language of math. Learning that syntax is difficult and requires a lot of practice. Students who manage to learn the syntax and semantics can go on to learn further math, but students who don't are forever blocked from many rewarding careers. I heard someone say at the MathFest in Madison that about 25% of all high school students never really understand algebra. I have only taught college students, but some students (maybe 5%) who get into freshman calculus in college are weak enough in algebra that they cannot continue.
I am not proposing that all aspects of the syntax (or semantics) be taught explicitly. A lot must be learned by students doing algebra, where they pick up the syntax subconsciously, just as they pick up lots of other behavioral information in and out of school. But teachers should explicitly understand the structure of algebra, at least in some basic way, so that they can be aware of the source of many of their students' problems.
It is likely that the widespread use of computers will allow some parts of the symbolic language of math to be replaced by other methods such as using Excel or some visual manipulation of operations as suggested in my post Mathematical and linguistic ability. It is also likely that the symbolic language will gradually be improved to get rid of ambiguities and irregularities. But a deliberate top-down effort to simplify notation will not succeed. Such things rarely succeed.