This post has been superseded by the post Inverse Image Demo Revisited
In a recent post I began a discussion of the mental, physical and mathematical representations of a mathematical object. The discussion continues here. Mathematicians, linguists, cognitive scientists and math educators have investigated some aspects of this topic, but there are many subtle connections between the different ideas which need to be studied.
I don’t have any overall theoretical grasp of these relationships. What I will do here is grope for an overall theory by mentioning a whole bunch of fine points. Some of these have been discussed in the literature and some (as far as I know) have not been discussed. Many of them (I hope) can be described as “an obvious fact about representations but no one has pointed it out before”. Such fine points could be valuable; I think some scholars who have written about mathematical discourse and math in the classroom are not aware of many of these facts.
I am hoping that by thrashing around like this here (for graphs of functions) and for other concepts (set, function, triangle, number …) some theoretical understanding may emerge of what it means to understand math, do math, and talk about math.
Let’s look at the graph of the function .
What you are looking at is a physical representation of the graph of the function. The graph creates in your brain a mental representation of the graph of the function. These are subtly related to each other and to the mathematical definition of the graph.
Fine points
1. This is metaphor in the sense lately used by cognitive scientists, for example in [6]. A metaphor can be described roughly as two mental images in which certain parts of one are identified with certain parts of another, in other words a pushout. The rhetorical use of the word “metaphor” requires it to be a figure of speech expressed in a certain way (the identification is direct rather than expressed by “is like” or some such thing.) In my use in this article a metaphor is something that occurs in your brain. The form it takes in speech or writing is not relevant.
2. I have noticed, for example, that some students don’t really understand that the left and right tails go off to infinity horizontally as well as vertically. In fact, the picture above could mislead someone into thinking the curve has vertical asymptotes: The right tail looks like it goes straight up. How could it get to x equals a billion if it goes straight up?
3. The “mental image” is of course a physical structure in your brain. So mental representations are physical representations.
4. I presume this “annotation” is some kind of physical connection between neurons or something. It is clear that a “mental image” is some sort of physical construction or event in the brain, but from what little I know about cognitive science, the scientists themselves are still arguing about the form of the construction. I would appreciate more information on this. (If the physical representation of mental images is indeed still controversial, this says nothing bad about cognitive science, which is very new.)
[1] Mental Representations in Math (previous post).
[2] Definitions (in abstractmath).
[3] Lakoff, G. and R. E. Núñez (2000), Where Mathematics Comes From. Basic Books.
This post is a completely rewritten version of the abstractmath article on the definition of function. Like every part of abstractmath, the chapter on functions is designed to get you started thinking about functions. It is in no way complete. Wikipedia has much more complete coverage of mathematical functions, but be aware that the coverage is scattered over many articles.
The concept of function in mathematics is as important as any mathematical idea. The mathematician’s concept of function includes the kinds of functions you studied in calculus but is much more abstract and general. If you are new to abstract math you need to know:
I will use two running examples throughout this discussion:
We start by giving a specification of “function”. (See the abstractmath article on specification.) After that, we get into the technicalities of the definitions of the general concept of function.
Specification: A function is a mathematical object which determines and is completely determined by the following data:
The operation of finding f(x) given f and x is called evaluation.
Comment: The formula above that defines the function in fact defines a function of complex numbers (even quaternions).
In the nineteenth century, mathematicians realized that it was necessary for some purposes (particularly harmonic analysis) to give a mathematical definition of the concept of function. A stricter version of this definition turned out to be necessary in algebraic topology and other fields, and that is the one I give here.
To state this definition we need a preliminary idea.
A set R of ordered pairs has the functional property if two pairs in R with the same first coordinate have to have the same second coordinate (which means they are the same pair).
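A minimal Python sketch of the functional property, assuming the relation is given as a finite set of pairs. The function name and the two example sets are made up for illustration and do not come from abstractmath.

```python
def has_functional_property(pairs):
    """Return True if no two pairs share a first coordinate but differ in the second."""
    seen = {}
    for first, second in pairs:
        if first in seen and seen[first] != second:
            return False  # same first coordinate, different second coordinate
        seen[first] = second
    return True


# Hypothetical example sets, chosen only for illustration.
R1 = {(1, 2), (2, 4), (3, 2)}  # two pairs share a second coordinate: allowed
R2 = {(1, 2), (1, 5), (3, 2)}  # two pairs share a first coordinate: not allowed

print(has_functional_property(R1))
print(has_functional_property(R2))
```

The check only looks at first coordinates. Two pairs with the same second coordinate, as in R1, never cause a failure.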
Examples
The graph of the function given above has the functional property. The graph is the set
If you repeatedly plug in one real number over and over, you get out the same real number every time. Example:
This set has the functional property because if is any real number, the formula
defines a specific real number. (This description of the graph implicitly assumes that
.) No other pair whose first coordinate is
is in the graph of
, only
. That is because when you plug
into the formula
, you get
every time. Of course,
is in the graph, but that does not contradict the functional property.
and
have the same second coordinate, but that is OK.
How to think about the functional property
The point of the functional property is that for any pair in the set of ordered pairs, the first coordinate determines what the second one is. That’s why you can write “f(x)” for any x in the domain of f and not be ambiguous.
A function is a mathematical structure consisting of the following objects:
According to the definition of function, ,
and
are three different functions.
Suppose we have two sets A and B with A ⊆ B.
Remark: The identity function and an inclusion function for the same set A have exactly the same graph, namely {(a, a) | a ∈ A}.
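The remark can be checked on a small case. This is a sketch only: the class `Function`, the sets `A` and `B`, and the other names are assumptions made for the example, with A taken to be a subset of B. A function is stored as a domain, a codomain and a graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Function:
    domain: frozenset
    codomain: frozenset
    graph: frozenset  # set of (input, output) pairs with the functional property

    def __call__(self, x):
        # Evaluation: look up the unique pair whose first coordinate is x.
        for a, b in self.graph:
            if a == x:
                return b
        raise ValueError(f"{x!r} is not in the domain")


A = frozenset({1, 2})
B = frozenset({1, 2, 3})
diagonal = frozenset((a, a) for a in A)

identity_on_A = Function(domain=A, codomain=A, graph=diagonal)
inclusion_A_in_B = Function(domain=A, codomain=B, graph=diagonal)

same_graph = identity_on_A.graph == inclusion_A_in_B.graph
same_function = identity_on_A == inclusion_A_in_B
print(same_graph, same_function)
```

The two objects agree on every input and have equal graphs, yet they compare as different because their codomains differ. That is the distinction the stricter definition is designed to keep.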
The set has the functional property, so it is the graph of a function. Call the function
. Then its domain is
and
and
.
is not defined because there is no ordered pair in H beginning with
(hence
is not in
.)
I showed above that the graph of the function , ordinarily described as “the function
”, has the functional property. The specification of function requires that we say what the domain is and what the value is at each point. These two facts are determined by the graph.
Because of the examples above, many authors define a function as a graph with the functional property. Now, the graph of a function may be denoted by
. This is an older, less strict definition of function that doesn’t work correctly with the concepts of algebraic topology, category theory, and some other branches of mathematics.
For this less strict definition of function, , which causes a clash of our mental images of “graph” and “function”. In every important way except the less-strict definition, they ARE different!
A definition is a device for making the meaning of math technical terms precise. When mathematicians think of “function” they think of many aspects of functions, such as a map of one shape into another, a graph in the real plane, a computational process, a renaming, and so on. One of the ways of thinking of a function is to think about its graph. That happens to be the best way to define the concept of function. (It is the less strict definition and it is a necessary concept in the modern definition given here.)
The occurrence of the graph in either definition doesn’t make thinking of a function in terms of its graph the most important way of visualizing it. I don’t think it is even in the top three.
For a given mathematical object, a mathematician may have:
The boldface things in this list are related to each other in lots of ways, and they are fuzzy and overlap and don’t include every phenomenon connected with a math object.
I have written about these things ([1], [2], [3], [4]). So have lots of other people. In this post I summarize these ideas. I expect to write about particular examples later on and will use this as a reference.
The following examples point out a few of the relationships between the ideas in boldface above. There is much more to understand.
Function as black box
The idea that a function is a black box or machine with input and output is a metaphor for a function.
A is a metaphor for B means that A and B are cognitively pasted together in such a way that the behavior of A is in many ways like the behavior of B. Such a thing is both useful and dangerous, dangerous because there will be ways in which A behaves that suggest inappropriate ideas about B.
The function as machine is a good metaphor: for example functional composition involves connecting the output of one machine to the input of another, and the inverse function is like running the machine backward.
The function as machine is a bad metaphor: For example, it is wrong to think you could build a machine to calculate any given function exactly. But you can still imagine such a machine, given by a specification (it outputs the value of the function at a given input) and then, in your imagination, connecting the input of one to the output of another must perforce calculate the composite of the corresponding functions.
Like any metaphor, this is a mental representation. That means the metaphor has a physical instantiation in your brain. So a metaphor has a physical representation.
Different people won’t have quite the same concept of a particular metaphor. So a metaphor will have lots of slightly different physical representations, but mathematicians form a community, and communication between mathematicians fine-tunes the different physical instantiations so that they correspond more closely to each other. This is the sense in which mathematical objects have a shared existence in a community as Reuben Hersh has suggested.
A function is a mathematical object, which can be rigorously specified as a set of ordered pairs together with a domain and a codomain. There is a cognitive relationship between the concepts of function as math object and function as black box with input and output.
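The good side of the machine metaphor can be sketched in Python, where a "machine" is just a callable. The machines `double`, `add_one` and `halve` are hypothetical examples, not anything from the text.

```python
def compose(g, f):
    """Connect the output of machine f to the input of machine g."""
    return lambda x: g(f(x))


def double(x):
    return 2 * x


def add_one(x):
    return x + 1


def halve(x):
    # Runs the 'double' machine backward.
    return x / 2


double_then_add_one = compose(add_one, double)
round_trip = compose(halve, double)

print(double_then_add_one(5))
print(round_trip(7))
```

The sketch also shows the bad side. Python floats are approximations, so a machine like this does not calculate an arbitrary real function exactly.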
Triangle
A triangle can be drawn, or created on a computer and a physical image printed out. You may also have a mental image of the triangle.
The physical and the mental images are not the same thing, but they are definitely related. The relationship is mediated by the neuronal circuitry behind your retinas, which performs a highly sophisticated transformation of the pixels on your retina into an organized physical structure in your brain, connected to various other neurons.
This circuitry exists because it helps us get a useful understanding of the world through our eyes. So a picture of a triangle takes advantage of pre-existing neuron structure to generate a useful mental representation that helps us understand and prove things about triangles.
This mental representation also lives in a community of mathematicians. Like any community, it has subgroups with “dialects”, that is, varying understandings of the representation.
For example, a mathematician who looks at the triangle below sees a triangle that looks like a right triangle. A student sees a triangle that is a right triangle.
This is “sees” in the sense of what their brain reports after all that processing. The mathematician’s brain connects the “triangle I am seeing” module (in their brain) to the “looks like a right triangle” module, but does not connect it to the “is a right triangle” module because they don’t see any statement in the surrounding text that it is a right triangle. The student, on the other hand, fallaciously makes the connection to “is a right triangle” directly.
In some sense, a student who does not make that connection directly is already a mathematician.
A triangle also exists as a mathematical object in your and my brain. It is described by a formal mathematical definition. The pictures of triangles you see above do not fit this definition. For one thing, the line segments in the pictures have thickness. But the pictures trigger a reaction in your neurons that causes your brain to cognitively paste together the line segments in the drawing to the segments required by the formal definition. This is a kind of metaphor of concrete-to-abstract that connects drawings to math objects that mathematicians use all the time.
Note that this “concrete-to-abstract metaphor” itself has a physical existence in your brain. It drops, for example, the property of thickness that the line segments in the drawing have when matching them (in the metaphor) with the line segments in the corresponding abstract triangle. On the other hand, it preserves the sense that all three angles in the triangle are acute. The abstract mathematical concept of triangle (the generic triangle) has no requirement on the angles except that they add up to pi.
Summary
The discussions above describe a few of the complex and subtle relationships that exist between
I have purported to discuss how mathematics is understood (especially in connection with language) in several articles and a book but only a few of the relationships I just described are mentioned in any of those articles. Perhaps one or two things I said caused you to react: “Actually, that’s obviously true but I never thought of it before”. (Much the way I had mathematicians in the ’60’s tell me, “I see what you mean that addition is a function of two variables, but I never thought of it that way before”.) (I was a brash category theorist wannabe then.)
A lot of research has been done on understanding math, and some research has been done on mathematical discourse. But what has been done has merely exposed the fin of the shark.
References
[1] Images and metaphors (in abstractmath).
[2] Representations and Models (in abstractmath).
[3] Mathematical Concepts (previous blog).
[4] Mental Representations in Math (previous blog).
Operation: Is it just a name or is there a metaphor behind it?
A function of the form S × S → S may be called a binary operation on S. The main point to notice is that it takes pairs of elements of S to the same set S.
A binary operation is a special case of an n-ary operation for any natural number n, which is a function of the form Sⁿ → S. A 1-ary (unary) operation on S is a function from a set to itself (such as the map that takes an element of a group to its inverse), and a 0-ary (nullary) operation on S is a constant.
It is useful at times to consider multisorted algebra, where a binary operation can be a function S₁ × S₂ → S₃ where the Sᵢ are possibly different sets. Then a unary operation is simply a function.
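These arities can be illustrated with a short Python sketch. The particular operations (integer addition, negation, the constant zero, string repetition) are examples I chose for the purpose. The type hints stand in for the sets involved.

```python
def add(a: int, b: int) -> int:
    """Binary operation on the integers: takes a pair of integers to an integer."""
    return a + b


def negate(a: int) -> int:
    """Unary operation on the integers: a function from the set to itself."""
    return -a


def zero() -> int:
    """Nullary operation on the integers: no inputs, so it is a constant."""
    return 0


def repeat(s: str, n: int) -> str:
    """Multisorted binary operation: the two inputs and the output need not share a set."""
    return s * n


print(add(2, 3), negate(4), zero(), repeat("ab", 3))
```

In the multisorted setting `len`, which takes a string to an integer, would count as a unary operation. This is just the observation that a multisorted unary operation is simply a function.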
Calling a function a multisorted unary operation suggests a different way of thinking about it, but as far as I can tell the difference is only that the author is thinking of algebraic operations as examples. This does not seem to be a different metaphor the way “function as map” and “function as transformation” are different metaphors. Am I missing something?
In the 1960’s some mathematicians (not algebraists) were taken aback by the idea that addition of real numbers (for example) is a function. I observed this personally. I don’t think any mathematician would react this way today.
Multivalued functions
I am reconstructing the abstractmath website and am currently working on the part on functions. This has generated some bloggable blustering.
The phrase multivalued function refers to an object that is like a function f: S → T except that for s ∈ S, f(s) may denote more than one value. Multivalued functions arose in considering complex functions such as the square root. Another example: the indefinite integral is a multivalued operator.
It is useful to think of a multivalued function as a function although it violates one of the requirements of being a function (being single-valued).
A multivalued function can be modeled as a function with domain S and codomain the set of all subsets of T. The two meanings are equivalent in a strong sense (naturally equivalent). Even so, it seems to me that they represent two different ways of thinking about multivalued functions: “The value may be any of these things…” as opposed to “The value is this whole set of things.” The “value may be any of these…” idea has a perfectly good mathematical model: a relation (set of ordered pairs) from S to T which is the inverse of a surjective function.
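The two models can be compared on a small finite case. The Python sketch below uses integer square roots on a three-element set as a hypothetical example; the function names and the sets `S` and `T` are assumptions made for illustration.

```python
def as_set_valued(relation, domain):
    """Model 'the value is this whole set': send each s to the set of its values."""
    return {s: frozenset(t for (a, t) in relation if a == s) for s in domain}


def as_relation(set_valued):
    """Model 'the value may be any of these': recover the set of ordered pairs."""
    return {(s, t) for s, values in set_valued.items() for t in values}


S = {0, 1, 4}
T = {0, 1, -1, 2, -2}

# The relation from S to T is the inverse of the surjective function t -> t*t from T to S.
sqrt_relation = {(t * t, t) for t in T}
sqrt_sets = as_set_valued(sqrt_relation, S)

print(sorted(sqrt_sets[4]))
print(as_relation(sqrt_sets) == sqrt_relation)
```

Going from the relation to the set-valued function and back returns the original relation. This is a finite shadow of the equivalence of the two meanings, even though the two data structures suggest different ways of thinking.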
Phrases such as “multivalued function” and “partial function” upset some uptight types who say things like, “But a multivalued function is not a function!”. A stepmother is not a mother, either.
I fulminated at length about this in the Handbook article on radial category. (This is conceptual category in the sense of Lakoff, Women, fire and dangerous things, University of Chicago, 1986.) The Handbook is online, but it downloads very slowly, so I have extracted the particular page on radial categories here.
Functions generate a radial category of concepts in mathematics. There are lots of other concepts in math that have generated radial categories. Think of “incomplete proof” or “left identity”. Radial categories are a basic mechanism of the way we think and function in the world. They should not be banished from mathematics.
This is a first draft of an article to eventually appear in abstractmath.
To explain a math concept, you need to explain how mathematicians think about the concept. This is what in abstractmath I call the images and metaphors carried by the concept. Of course you have to give the precise definition of the concept and basic theorems about it. But without the images and metaphors most students, not to mention mathematicians from a different field, will find it hard to prove much more than some immediate consequences of the definition. Nor will they have much sense of the place of the concept in math and applications.
Teachers will often explain the images and metaphors with handwaving and pictures in a fairly vague way. That is good to start with, but it’s important to get more precise about the images and metaphors. That’s because images and metaphors are often not quite a good fit for the concept — they may suggest things that are false and not suggest things that are true. For example, if a set is a container, why isn’t the element-of relation transitive? (A coin in a coinpurse in your pocket is a coin in your pocket.)
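The failure of the container metaphor shows up directly if the coin, purse and pocket are modeled as nested sets. This is a small Python sketch using `frozenset`, with names chosen to match the example.

```python
coin = "coin"
purse = frozenset({coin})    # the purse contains the coin
pocket = frozenset({purse})  # the pocket contains the purse

coin_in_purse = coin in purse
purse_in_pocket = purse in pocket
coin_in_pocket = coin in pocket

print(coin_in_purse, purse_in_pocket, coin_in_pocket)
```

The only element of `pocket` is the purse, so the coin is not an element of `pocket`, although a physical coin in a purse in your pocket is certainly in your pocket.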
“A metaphor is a useful way to think about something, but it is not the same thing as the same thing.” (I think I stole that from the Economist.) Here, I am going to get precise with the notion that a function is a map. I am acting like a mathematician in “getting precise”, but I am getting precise about a metaphor, not about a mathematical object.
A map (ordinary paper map) of Minnesota has the property that each point on the paper represents a point in the state of Minnesota. This map can be represented as a mathematical function from a subset of a 2-sphere to ℝ². The function is a mathematical idealization of the relation between the state and the piece of paper, analogous to the mathematical description of the flight of a rocket ship as a function from ℝ to ℝ³.
The Minnesota map-as-function is probably continuous and differentiable, and as is well known it can be angle preserving or area preserving but not both.
So you can say there is a point on the paper that represents the location of the statue of Paul Bunyan in Bemidji. There is a set of points that represents the part of the Mississippi River that lies in Minnesota. And so on.
A function has an image. If you think about it you will realize that the image is just a certain portion of the piece of paper. Knowing that a particular point on the paper is in the image of the function is not the information contained in what we call “this map of Minnesota”.
This yields what I consider a basic insight about function-as-map: The map contains the information about the preimage of each point on the paper map. So:
The map in the sense of a “map of Minnesota” is represented by the whole function, not merely by the image.
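A toy version of this point in Python: two hypothetical "maps" that mark exactly the same spots on the paper but assign them to different places. The spot labels and the choice of towns are invented for the example.

```python
# Each dictionary sends a place in the state to a spot on the paper.
map_one = {"Bemidji": "spot a", "Duluth": "spot b", "Saint Paul": "spot c"}
map_two = {"Bemidji": "spot c", "Duluth": "spot a", "Saint Paul": "spot b"}


def image(f):
    """The portion of the paper that is used at all."""
    return set(f.values())


def preimage(f, spot):
    """What a given spot on the paper stands for."""
    return {place for place, s in f.items() if s == spot}


same_image = image(map_one) == image(map_two)
same_map = map_one == map_two

print(same_image, same_map)
print(preimage(map_one, "spot a"), preimage(map_two, "spot a"))
```

Both maps have the same image, but only the whole function tells you that "spot a" stands for Bemidji on one and Duluth on the other.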
I think that is the essence of the metaphor that a function is a map. And I don’t think newbies in abstractmath always understand that relationship.
The preceding discussion doesn’t really represent how we think of a paper map of Minnesota. We don’t think in terms of points at all. What we see are marks on the map showing where some particular things are. If it is a road map it has marks showing a lot of roads, a lot of towns, and maybe county boundaries. If it is a topographical map it will show level curves showing elevation. So a paper map of a state should be represented by a structure preserving map, a morphism. Road maps preserve some structure, topographical maps preserve other structure.
The things we call “maps” in math are usually morphisms. For example, you could say that every simple closed curve in the plane is an equivalence class of maps from the unit circle to the plane. Here “equivalence class” means: forget the parametrization.
The very reason I have to mention forgetting the parametrization is that the commonest mathematical way to talk about morphisms is as point-to-point maps with certain properties. But we think about a simple closed curve in the plane as just a distorted circle. The point-to-point correspondence doesn’t matter. So this example is really talking about a morphism as a shape-preserving map. Mathematicians introduced points into talking about preserving shapes in the nineteenth century and we are so used to doing that that we think we have to have points for all maps.
Not that points aren’t useful. But I am analyzing the metaphor here, not the technical side of the math.
People who don’t do category theory think the idea of a mathematical structure as a functor is weird. From the point of view of the preceding discussion, a particular group is a functor from the generic group to some category. (The target category is Set if the group is discrete, Top if it is a topological group, and so on.)
The generic group is a group in a category called its theory or sketch that is just big enough to let it be a group. If the theory is the category with finite products that is just big enough then it is the Lawvere theory of the group. If it is a topos that is just big enough then it is the classifying topos of groups. The theory in this sense is equivalent to some theory in the sense of string-based logic, for example the signature-with-axioms (equational theory) or the first order theory of groups. Johnstone’s Elephant book is the best place to find the translation between these ideas.
A particular group is represented by a finite-limit-preserving functor on the algebraic theory, or by a logical functor on the classifying topos, and so on; constructions which bring with them the right concept of group homomorphisms as well (they will be any natural transformations).
The way we talk about groups mimics the way we talk about maps. We look at the symmetric group on five letters and say its multiplication is noncommutative. “Its multiplication” tells us that when we talk about this group we are talking about the functor, not just the values of the functor on objects. We use the same symbols in any group: juxtaposition for multiplication, “1” or “e” for the identity, “x⁻¹” for the inverse of x, and so on. That is because we are really talking about the multiplication, identity and inverse function in the generic group — they really are the same for all groups. That is because a group is not its underlying set, it is a functor. Just like the map of Minnesota “is” the whole function from the state to the paper, not just the image of the function.
In MathOverflow, statements similar to the following two occurred in comments:
I cannot find either one of them now, but I want to talk about them anyway.
If you look at the definition of categories in various works (for example references [1] through [3] below) you find that the objects and arrows of a category must each form a “collection” or “class” together with certain operations. The authors all describe the connection with Grothendieck’s concept of “universe” and define “large categories” and “small categories” in the usual way. So Statement 1 above is simply wrong.
Statement 2 is more problematic. The trouble is that if the word “categories” includes large categories then the objects do not form a set even in the second universe. You have to go to the third universe.
Now there is a way to define categories where this issue does not come up. It allows us to think about categories without having a particular system such as ZF and universes in mind.
A category consists of objects and arrows, together with four methods of construction M1 – M4 satisfying laws L1 – L7. I treat “object” and “arrow” as predicates: object[f] means f is an object and arrow[a] means a is an arrow. “=” means equals in the mathematical sense.
Mathematicians work inside the categories Set (sets and functions) and Cat (categories and functors) all the time, including functors to or from Cat or Set. When they consider a category, they use theorems that follow from the definition above. They do not have to have foundations in mind.
Once in a while, they are frustrated because they cannot talk about the set of objects of some category. For example, Freyd’s solution set condition is required to prove the existence of a left adjoint because of that problem. The solution set condition is a work-around for a familiar obstruction to an easy way to prove something. I can imagine coming up with such a work-around without ever giving a passing thought to foundations, in particular without thinking of universes.
When you work with a mathematical object, the syntax of the definitions and theorems gives you all you need to justify the claim that something is a theorem. You absolutely need models of the theory to think up and understand proofs, but the models could be sets or classes with structure, or functors (as in sketch theory), or you may work with generic models which may require you to use intuitionistic reasoning. You don’t have to have any particular kind of model in mind when you work in Set or Cat.
When you do run into something like the impossibility of forming the set of objects of some category (which happens in any model theory environment that uses classical rather than intuitionistic reasoning) then you may want to consider an approach through some theory of foundations. That is what most mathematicians do: they use just-in-time foundations. For example, in a particular application you may be happy to work in a topos with a set-of-all-objects, particularly if you are a certain type of computer scientist who lives in Pittsburgh. You may be happy to explicitly consider universes, although I am not aware of any category-theoretical results that do explicitly mention universes.
But my point is that most mathematicians think about foundations only when they need to, and most mathematicians never need to think about foundations in their work. Moral: Don’t think in terms of foundations unless you have to.
This point of view is related to the recent discussions of pragmatic foundations [7] [8].
Side remark
The situation that you can’t always construct a set of somethings is analogous to the problem that you have in working with real numbers: You can’t name most real numbers. This may get in the way of some analyst wanting to do something, I don’t know. But in any branch of math, there are obstructions to things you want to do that really do get in your way. For example, in beginning linear algebra, it may have occurred to you, to your annoyance, that if you have the basis of a subspace you can extend it to the basis for the whole space, but if you have a basis of the whole space, and a subspace, the basis may not contain a basis of the subspace.
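The linear algebra annoyance is easy to see in the plane. The Python sketch below takes the standard basis of the plane and the line spanned by (1, 1), both chosen by me as an illustration. It checks whether either basis vector lies on that line, using the 2×2 determinant as the test for being a scalar multiple.

```python
def on_line_through(u, v):
    """True if v is a scalar multiple of the nonzero plane vector u."""
    return u[0] * v[1] - u[1] * v[0] == 0


standard_basis = [(1, 0), (0, 1)]  # a basis of the whole plane
u = (1, 1)                         # spans a one-dimensional subspace

basis_vectors_in_subspace = [b for b in standard_basis if on_line_through(u, b)]
print(basis_vectors_in_subspace)
```

Neither basis vector lies in the subspace, so no subset of this basis is a basis of the subspace. Going the other way is always possible: the basis {(1, 1)} of the line extends to a basis of the plane, for instance by adding (1, 0).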
In my post on automatic spelling reform, I mentioned the various attempts at spelling reform that have resulted in both the old and new systems being used, which only makes things worse. This happens in Christian denominations, too. Someone (Martin Luther, John Wesley) tries to reform things; result: two denominations. But a lot of the time the reform effort simply disappears. The Chicago Tribune tried for years to get us to write “thru” and “tho” — and failed. Nynorsk (really a language reform rather than a spelling reform) is down to 18% of the population and the results of allowing Nynorsk forms to be used in the standard language have mostly been nil. (See Note 1.)
In my early years as a mathematician I wrote a bunch of papers writing functions on the right (including the one mentioned in the last post). I was inspired by some algebraists and particularly by Beck’s Thesis (available online via TAC), which I thought was exceptionally well-written. This makes function composition read left to right and makes the pronunciation of commutative diagrams get along with notation, so when you see the diagram below you naturally write h = fg instead of h = gf. 
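The two conventions can be put side by side in Python. This is only a sketch: `f`, `g` and both helper names are hypothetical, and it assumes that in the diagram f is applied first and g second.

```python
def compose_on_the_left(g, f):
    """Functions written on the left: 'gf' means apply f first, then g."""
    return lambda x: g(f(x))


def compose_on_the_right(f, g):
    """Functions written on the right: 'fg' means apply f first, then g."""
    return lambda x: g(f(x))


def f(x):
    return x + 1


def g(x):
    return x * 10


h_left = compose_on_the_left(g, f)    # written h = gf
h_right = compose_on_the_right(f, g)  # written h = fg

print(h_left(2), h_right(2))
```

Both expressions name the same function h. The only difference is whether the order of the symbols matches the order in which the arrows are followed.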
Sadly, I gave all that up before 1980 (I just looked at some of my old papers to check). People kept complaining. I even completely rewrote one long paper (Reference [3]) changing from right hand to left hand (just like Samoa). I did this in Zürich when I had the gout, and I was happy to do it because it was very complicated and I had a chance to check for errors.
Well, I adapted. I have learned to read the arrows backward (g then f in the diagram above). Some French category theorists write the diagram backward, thus:

But I was co-authoring books on category theory in those days and didn’t think people would accept it. Not to mention Mike Barr (not that he is not a people, oh, never mind).
Nevertheless, we should have gone the other way. We should have adopted the Dvorak keyboard and Betamax, too.
Notes
[1] A lifelong Norwegian friend of ours said that when her children say “boka” instead of “boken” it sounds like hillbilly talk does to Americans. I kind of regretted this, since I grew up in north Georgia and have been a kind of hillbilly-wannabe (mostly because of the music); I don’t share that negative reaction to hillbillies. On the other hand, you can fageddabout “ho” for “hun”.
References
[1] Charles Wells, Automorphisms of group extensions, Trans. Amer. Math. Soc. 155 (1970), 189–194.
[2] John Martino and Stewart Priddy, Group extensions and automorphism group rings, Homology, Homotopy and Applications 5 (2003), 53–70.
[3] Charles Wells, Wreath product decomposition of categories 1, Acta Sci. Math. Szeged 52 (1988), 307–319.