Naming mathematical objects

Commonword names confuse

Many technical words and phrases in math are ordinary English words ("commonwords") that are assigned a different and precisely defined mathematical meaning.  

  • Group  This sounds to the "layman" as if it ought to mean the same thing as "set".  You get no clue from the name that it involves a binary operation with certain properties.  
  • Formula  In some texts on logic, a formula is a precisely defined expression that becomes a true-or-false sentence (in the semantics) when all its variables are instantiated.  So $(\forall x)(x>0)$ is a formula.  The word "formula" in ordinary English makes you think of things like "$\textrm{H}_2\textrm{O}$", which has no semantics that makes it true or false — it is a symbolic expression for a name.
  • Simple group  This has a technical meaning: a nontrivial group whose only normal subgroups are the trivial subgroup and the group itself.  The Monster Group is "simple".  Yes, the technical meaning is motivated by the usual concept of "simple", but to say the Monster Group is simple causes cognitive dissonance.

Beginning students come with the (generally subconscious) expectation that they will pick up clues about the meanings of words from connotations they are already familiar with, plus things the teacher says using those words.  They think in terms of refining an understanding they already have.  This is more or less what happens in most non-math classes.  They need to be taught what definition means to a mathematician.

Names that don't confuse but may intimidate

Other technical names in math don't cause the problems that commonwords cause.

Named after somebody.  The phrase "Hausdorff space" leads a math student to understand that it has a technical meaning.  They may not even know it is named after a person, but it screams "geek word" and "you don't know what it means".  That is a signal that you can find out what it means.  You don't assume you know its meaning. 

New made-up words.  Words such as "affine", "gerbe" and "logarithm" are made up of words from other languages and don't have an ordinary English meaning.  Acronyms such as "QED", "RSA" and "FOIL" don't occur often.  I don't know of any math objects other than "RSA algorithm" that have an acronymic name.  (No doubt I will think of one the minute I click the Publish button.)  Whole-cloth words such as "googol" are also rare.  All these sorts of words would be good to name new things since they do not fool the readers into thinking they know what the words mean.

Both types of words avoid fooling the student into thinking they know what the words mean, but some students are intimidated by the use of words they haven't seen before.  They seem to come to class ready to be snowed.  A minority of my students over my 35 years of teaching were like that, but that attitude was a real problem for them.

Audience

You can write for several different audiences.

Math fans (non-mathematicians who are interested in math and read books about it occasionally).  In my posts Explaining higher math to beginners and Renaming technical concepts, I wrote about several books aimed at explaining some fairly deep math to interested people who are not mathematicians.  They renamed some things. For example, Mark Ronan in Symmetry and the Monster used the word "atom" for "simple group", presumably to get around the cognitive dissonance.  There are other examples in my posts.  

Math newbies  (math majors and other students who want to understand some aspect of mathematics).  These are the people abstractmath.org is aimed at. For such an audience you generally don't want to rename mathematical objects. In fact, you need to give them a glossary to explain the words and phrases used by people in the subject area.   

Postsecondary math students These people, especially the math majors, have many tasks:

  • Gain an intuitive understanding of the subject matter.
  • Understand in practice the logical role of definitions.
  • Learn how to come up with proofs.
  • Understand the ins and outs of mathematical English, particularly the presence of ordinary English words with technical definitions.
  • Understand and master the appropriate parts of the symbolic language of math — not just what the symbols mean but how to tell a statement from a symbolic name.

It is appropriate for books for math fans and math newbies to try to give an understanding of concepts without necessarily proving theorems.  That is the aim of much of my work, which has more of an emphasis on newbies than on fans. But math majors also need the traditional emphasis on theorem and proof and on clear, correct explanations.

Lately, books such as Visual Group Theory have addressed beginning math majors, trying for much more effective ways to help the students develop good intuition, as well as getting into proofs and rigor. Visual Group Theory uses standard terminology.  You can contrast it with Symmetry and the Monster and The Mystery of the Prime Numbers (read the excellent reviews on Amazon) which are clearly aimed at math fans and use nonstandard terminology.  

Terminology for algebraic structures

I have been thinking about the section of Abstracting Algebra on binary operations.  Notice this terminology:

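[Table "boptable": names for algebraic structures with a binary operation, in three columns; the image is not reproduced here.]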

The "standard names" are those in Wikipedia.  They give little clue to the meaning, but at least most of them, except "magma" and "group", sound technical, cluing the reader in to the fact that they'd better learn the definition.

I came up with the names in the right column in an attempt to make some sense out of them.  The design is somewhat like the names of some chemical compounds.  This would be appropriate for a text aimed at math fans, but for them you probably wouldn't want to get into such an exhaustive list.

I wrote various pieces meant to be part of Abstracting Algebra using the terminology on the right, but thought better of it. I realized that I have been vacillating between thinking of AbAl as for math fans and thinking of it as for newbies. I guess I am plunking for newbies.

I will call groups groups, but for the other structures I will use the phrases in the middle column.  Since the book is for newbies I will include a table like the one above.  I also expect to use tree notation as I did in Visual Algebra II, and other graphical devices and interactive diagrams.

Magmas

In the sixties magmas were called groupoids or monoids, both of which now mean something else.  I was really irritated when the word "magma" started showing up all over Wikipedia. It was the name given by Bourbaki, but it is a bad name because it means something else that is irrelevant.  A magma is just any binary operation. Why not just call it that?  

Well, I will tell you why, based on my experience in Ancient Times (the sixties and seventies) in math. (I started as an assistant professor at Western Reserve University in 1965). In those days people made a distinction between a binary operation and a "set with a binary operation on it".  Nowadays, the concept of function carries with it an implied domain and codomain.  So a binary operation is a function $m:S\times S\to S$.  Thinking of a binary operation this way was just beginning to appear in the common mathematical culture in the late 60's, and at least one person remarked to me: "I really like this new idea of thinking of 'plus' and 'times' as functions."  I was startled and thought (but did not say), "Well of course it is a function".  But then, in the late sixties I was being indoctrinated/perverted into category theory by the likes of John Isbell and Peter Hilton, both of whom were briefly at Case Western Reserve University.  (Also Paul Dedecker, who gave me a glimpse of Grothendieck's ideas).

Now, the idea that a binary operation is a function comes with the fact that it has a domain and a codomain, and specifically that the domain is the Cartesian square of the codomain.  People who didn't think that a binary operation was a function had to introduce the idea of the universe (universal algebraists) or the underlying set (category theorists): you had to specify it separately and introduce notation such as $(S,\times)$ to denote the structure.   Wikipedia still does it mostly this way, and I am not about to start a revolution to get it to change its ways.
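The two viewpoints can be put side by side in a minimal Python sketch, assuming a small finite set so that everything can be checked exhaustively. The names `S`, `m` and `is_binary_operation_on` are made up for this illustration.

```python
from itertools import product

# Underlying set: the integers 0..4 (an arbitrary small example).
S = frozenset(range(5))

def m(a, b):
    """Addition mod 5, viewed as a function m : S x S -> S."""
    return (a + b) % 5

def is_binary_operation_on(op, carrier):
    """Check that op sends every pair from carrier x carrier back into carrier."""
    return all(op(a, b) in carrier for a, b in product(carrier, repeat=2))

# Older style: name the set and the operation separately, as a pair (S, m).
structure = (S, m)

print(is_binary_operation_on(m, S))                   # True
print(is_binary_operation_on(lambda a, b: a + b, S))  # False: 4 + 4 = 8 is not in S
```

In the function-first view the codomain is part of what `m` is; in the older view the pair `(S, m)` has to carry the set along explicitly.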

Groups

In the olden days, people thought of groups in this way:

  • A group is a set $G$ with a binary operation denoted by juxtaposition that is closed on $G$, meaning that if $a$ and $b$ are any elements of $G$, then $ab$ is in $G$.
  • The operation is associative, meaning that if $a,\ b,\ c\in G$, then $(ab)c=a(bc)$.
  • The operation has a unity element, meaning an element $e$ for which for any element $a\in G$, $ae=ea=a$.
  • For each element $a\in G$, there is an element $b$ for which $ab=ba=e$.

This is a better way to describe a group:

  • A group consists of a nullary operation e, a unary operation inv,  and a binary operation denoted by juxtaposition, all with the same codomain $G$. (A nullary operation is a map from a singleton set to a set and a unary operation is a map from a set to itself.)
  • The value of e is denoted by $e$ and the value of inv$(a)$ is denoted by $a^{-1}$.
  • These operations are subject to the following equations, true for all $a,\ b,\ c\in G$:

     

    • $ae=ea=a$.
    • $aa^{-1}=a^{-1}a=e$.
    • $(ab)c=a(bc)$.

This definition makes it clear that a group is a structure consisting of a set and three operations whose axioms are all equations.  It was formulated by people in universal algebra but you still see the older form in texts.

The old form is not wrong; it is merely inelegant.  With the old form, you have to prove the unity and inverses are unique before you can introduce notation, and more important, by making it clear that groups satisfy equational logic you get a lot of theorems for free: you construct products on the cartesian power of the underlying set, quotients by congruence relations, and other things. (Of course, in AbAl those theorems will be stated later than when groups are defined because the book is for newbies and you want lots of examples before theorems.)
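As a rough illustration of the equational form, here is a Python sketch that packages a group as three operations and checks the three equations by brute force. It assumes a finite example (the integers mod 5 under addition), and the function names are chosen only for this sketch.

```python
from itertools import product

# Example: the integers mod 5 under addition (an arbitrary choice).
G = range(5)

def e():
    """Nullary operation: picks out the unity element."""
    return 0

def inv(a):
    """Unary operation: the inverse of a."""
    return (-a) % 5

def mul(a, b):
    """Binary operation, written as juxtaposition in the text."""
    return (a + b) % 5

def satisfies_group_equations(G, e, inv, mul):
    """Check the three equations for all elements of a finite G."""
    unity = all(mul(a, e()) == a == mul(e(), a) for a in G)
    inverses = all(mul(a, inv(a)) == e() == mul(inv(a), a) for a in G)
    assoc = all(mul(mul(a, b), c) == mul(a, mul(b, c))
                for a, b, c in product(G, repeat=3))
    return unity and inverses and assoc

print(satisfies_group_equations(G, e, inv, mul))  # True

# Subtraction mod 5 with the same e and inv fails the equations.
print(satisfies_group_equations(G, e, inv, lambda a, b: (a - b) % 5))  # False
```

Because every axiom is an equation between terms built from `e`, `inv` and `mul`, the check is nothing more than evaluating both sides on all inputs.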

References

  1. Three kinds of mathematical thinkers (G&G post)
  2. Technical meanings clash with everyday meanings (G&G post)
  3. Commonword names for technical concepts (G&G post)
  4. Renaming technical concepts (G&G post)
  5. Explaining higher math to beginners (G&G post)
  6. Visual Algebra II (G&G post)
  7. Monads for high school II: Lists (G&G post)
  8. The mystery of the prime numbers: a review (G&G post)
  9. Hersh, R. (1997a), "Math lingo vs. plain English: Double entendre". American Mathematical Monthly, volume 104, pages 48–51.
  10. Names (in abmath)
  11. Cognitive dissonance (in abmath)

Improve your language

"Improve your language" probably makes you think of commands from certain uppity friends like:

  • Don't say, "I have to work just like everybody has to work" — say "just as".
  • Don't say, "Who are you talking to?", say "To whom are you talking?"

These are statements made by people who believe in "correct English" (conforming to a standard imposed by some educated white people).

This blog is not about that sort of thing.  It is about bugs in the English language.  I have written about that before (Bugs in English and in math; Comma rule found dysfunctional). 

1. Flammable and Inflammable

Both these words mean the same thing.  This is a bug that can do real damage.  In fact companies that make flammable products have policies requiring the use of "flammable" and "fireproof" to avoid what could be serious damage.  Webster's New World Dictionary warns against using "inflammable", but under "flammable", which seems pointless.  (I was a contributing editor to the fourth edition but they didn't ask me about "inflammable".)  Wiktionary also warns against using "inflammable", but not the Oxford English Dictionary.

By the way, I recently learned that some government agency has instituted a standard that "Exit" signs should be in green lighting. Many older ones are red, which usually means "Stop".  As usual, the European Union agitated for this long before the USA did.  I wonder if red exit signs ever fooled anyone.

2. Unisex 3rd person singular pronoun needed

This bug does not cause explosions, except metaphorically, but it is a real problem.  Until the last few years, the only way to achieve neutrality was through clumsy rewording.  In my three books (two written with Michael Barr), we alternated using "he" and "she".  In academic prose, it is common to write things such as "If the reader factors the polynomial, he will discover…".  We would sometimes write "…she will discover…".  No one complained.  Lots of other recent academic writers do this trick, too.  

In these posts and in abstractmath I have Reached A Higher Level (or Lower, according to some people) and use "you" a lot, both in the usual meaning and in the colloquial use replacing "one" (meaning 8 in the OED).  Examples:

  • "If you factor the polynomial, you'll discover…" (Notice the "you'll" –contractions are happening a lot in academic writing these days, too, and in research papers, not just science popularizations.  See The revolution in technical exposition II.)
  • "When someone refers to imaginary numbers it makes you think they are fictitious."   

Brits to the rescue

However, many writers, especially in Britain, have been deliberately using "they" as a 3rd person singular pronoun.  This is the OED meaning 2, and it dates back a long way.  OED meaning 3 is also relevant.  This is discussed extensively in Wikipedia. I have given several other references below.  Note that they are mostly British.

My favorite OED quote is from Fielding: "Every Body fell a laughing, as how could they help it".  I sometimes say things like, "I'm a-running around doing errands" because I think of myself as a southerner (of the American variety), but that is purely posturing: I never heard anyone say a-anything when I was growing up in (the USA version of) Georgia.

The next two entries were in a previous post.

3. Right

Q: "Should I turn left at the next corner?" A: "Right".  Probably most Americans who drive now know this bug.  The answer could mean "yes" or "turn right".  So we have to stop and think how to answer this question.  That makes it a bug.  When Jane and I drive together we have learned to answer that question "yes" or "correct".  

4. Too, two

Comment: "We will take Route 30".  Answer: "We will take Route 30 too".  (Say it out loud.)  This bug may be responsible for the survival of the word "also".  

Note that unlike the case of "right", this is a bug only of spoken English.

Repairing English

Examples 1 and 2 exhibit cases where English bugs cause genuine problems that need repair.  In both cases, deliberate efforts are being made in an organized bottom-up effort to solve the problems.   And both efforts seem to be working.  In the other examples, people come up with workarounds, but not in an organized fashion.  

Now, English does not have an Academy that thinks it runs the language. Spanish and French do.  But in fact when they try to do anything the least bit radical, they usually fail.

  • The Spanish Royal Academy has tried for years to enforce certain rules for the use of third person pronouns but they have apparently (correct me if I am wrong) failed to have any effect.  
  • There was a German spelling reform in the 1990's that the main German speaking countries agreed to and tried to enforce, but they failed miserably.
  • Three branches of the French government, along with the French Academy, had long furious discussions about how to translate "cloud computing"  into French.  Many people in the literary and government power structure do not want French people to use English words while speaking French.   But the stuffier types would not allow "informatique en nuage", so the "problem" was left unsolved.  Meanwhile the French go on calling it "cloud computing", as on this website.  

"Informatique en nuage" would be a calque on English, like "Adam's apple" is a calque in English on French "pomme d'Adam".  Meanwhile, English, which thankfully has no Academy, has made hundreds of calques on other languages, mostly to our benefit, not to mention borrowing an enormous number of words directly from French.  

People also change the way they talk without a good reason: consider "between you and I", which is an unneeded change, but it appears to be on its way to standard.  (I say that because, unlike usages such as "I don't want no cabbage", "between you and I" is very common among educated young people.)  Older people often can't stand any change in the way we talk, but as I said in another context, Old Fogies don't like "between you and I", but they die and then the younger people do what they like.

I think it is safe to say that needed reform in a language often comes from the ingenuity of the people, sometimes in severe cases with leadership from nongovernmental groups of people, but most often simply by people changing how they talk.  

References

All purpose pronoun, New York Times blog post on singular they.

Bugs in English and in math, previous post

Category theory for computing science, by Michael Barr and Charles Wells

Handbook of mathematical discourse, by Charles Wells

If someone tells you singular 'they' is wrong, please do tell them to get stuffed (Telegraph blog post)

Singular they (Economist blog post)

Singular they (Wikipedia)

The French Get Lost in the Clouds Over a New Term in the Internet Age (Wall Street Journal) 

The revolution in technical exposition II, previous post

Toposes, triples and theories, by Michael Barr and Charles Wells

 

Shared mental objects

I propose the phrase "shared mental object" to name the sort of thing that includes mathematical objects, abstract objects, fictional objects and other concepts with the following properties:

  • They are not physical objects
  • We think of them as objects 
  • We share them with other people

It is the name "shared mental object" that is a new idea; the concept has been around in philosophy and math ed for a while and has been called various things, especially "abstract object", which is the name I have used in abstractmath.

I will go into detail concerning some examples in order to make the concept clear.  If you examine this concept deeply you discover many fine points, nested ideas and circles of examples that go back on themselves.  I will not get very far into these fine points here, but I have written about some of them in posts and in abmath (see references).  I am working on a post about some of the fine points and will publish it if I can control its tendency to expand into infinite proliferation and recursion.

Examples

 

Messages

There is a story about the early days of telegraphy:  A man comes into the newly-opened telegraph station and asks to send a telegram to his son who is working in another city. He writes out the message and gives it to the operator with his payment.  The operator puts the message on a spike and clicks the key in front of him for a while, then says, "I have sent your message.  Thanks for shopping at Postal Telegraph".  The man looks astonished and points at the message and says, "But it is still here!"

A message is a shared mental object.  

  • It may be represented by a physical object, such as a piece of paper with writing on it, and people commonly refer to the paper as the message.  
  • It may be a verbal message from you, perhaps delivered by another person to a third person by speech.  
  • The delivery process may introduce errors (so can sending a telegram).  So the thoughts in the three brains (the sender, the deliverer and the recipient) can differ from each other, but they can still talk about "the message" as if it were one object.

Other examples that are similar in nature to messages are schedules and the month of September (see Math Objects in abmath, where they are called abstract objects).  In English-speaking communities, September is a cultural default: you are expected to know what it is. You can know that September is a month and that right this minute it is not September (unless it is September). You may think that September has 31 days and most people would say you are wrong, but they would agree that you and they are talking about the same month.

The general concept of the month of September and facts concerning it have been in shared existence in English-speaking cultural groups for (maybe) a thousand years.  In contrast, a message is usually shared by only two or three people and it has a short life; a few years from now, it may be that none of the people involved with the message remember what it said or even that it existed.

Symbols

A symbol, such as the letter "a" or the integral sign "$\int$", is a shared mental object.  Like the month of September, but unlike messages, symbols are shared by large cultural entities: every language community that uses the Latin alphabet (and more) in the case of "a", and math and tech people in the case of "$\int$". 

The letter "a" is represented physically on paper, a blackboard or a screen, among other things.  If you are literate in English and recognize an occurrence as representing the letter, you probably do this using a process in the brain that is automatic and that operates outside your awareness.

Literate readers of English also generally agree that a string of letters either does or does not represent the word "default" but there are borderline cases (as in those little boxes where you have to prove you are not a robot) where they may disagree or admit that they don't know.  Even so, the letter "a" and the word "default" are shared in the minds of many people and there is general (but not absolutely universal) agreement on when you are seeing representations of them.

Fictional objects

Fictional objects such as Sherlock Holmes and unicorns are shared mental objects.  I wrote briefly about them in Mathematical objects and will not go into them here.  

Mathematical objects 

The integer $111$, the integral $\int_0^1 x^2\,dx$ and the set of all real numbers are all mathematical objects.   They are all shared mental objects.  In most of the world, people with a little education will know that $111$ is a number and what it means to have $111$ beans in a jar (for example).  They know that it is one more than $110$ and a lot more than $42$.  

Mathematicians, scientists and STEM students will know something about what $\int_0^1 x^2\,dx$ means and they will probably know how to calculate it.  Most of them may be able to do it in their head.  I have taught calculus so many times that I know it "by heart", which means that it is associated in my brain with the number $1/3$ in such a way that when I see the integral the number automatically and without effort pops up (in the same way that I know September has 30 days).
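For readers who do not have it by heart, the value comes from a one-line application of the power rule:

$$\int_0^1 x^2\,dx=\left[\frac{x^3}{3}\right]_0^1=\frac{1^3}{3}-\frac{0^3}{3}=\frac{1}{3}.$$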

Beginning calculus students may have a confused and incorrect understanding of the set of all real numbers in several ways, but practicing mathematicians (and many others) know that it is an uncountably infinite dense set and they think of it as an object.  A student very likely does not think of it as an object, but as a sprawling unimaginable space that you cannot possibly regard as a thing. Students may picture a real number as having another real number sitting right beside it — the next biggest one. Most practicing mathematicians think of the set of real numbers as a completed infinity – every real number is already there —  and they know that between any two of them there is another one.

As a consequence, when students and professors talk about real numbers the student finds that sometimes the professor says things that sound completely wrong and the professor hears the student say things that are bizarre and confused.  They firmly believe they are talking about the same thing, the real numbers, but the student is seen by the professor as wrong and the professor is seen by the student as talking meaningless nonsense.  Even so, they believe they are talking about the same thing.

Nomenclature

I tried various other names before I came to "shared mental objects".

  • I called them abstract objects in abstractmath.  The word "abstract" does not convey their actual character — they are mental and they are shared.
  • They are non-physical objects, a phrase widely used in philosophy, but naming something by a negation is always a bad idea.  
  • Co-mental objects is ugly and comental looks like a misspelling.
  • Intermental objects sounds like it has something to do with burial.  Maybe InterMental?
  • The word entity may avoid some confusion caused by the word "object", which suggests physical object.  But "object" is widely used in philosophy and in math ed in the way it is used here.
  • Meme?  Well, in some sense a shared mental object is a meme.  Memes have a connotation of forcing themselves into your brain that I don't want, but I want to consider the relationship further.

The major advantage of "shared mental object" is that it describes the important properties of the concept: It is a mental object and it is shared by people.  It has no philosophical implications concerning platonism, either. Mathematical objects do have special properties of verifiability that general shared mental objects do not, but my terminology does not suggest any existence of absolute truth or of an Ideal existing in another world.  I don't believe in such things, but some people do and I want to point out that "shared mental object" does not rule such things out — it merely gives a direct evidence-based description of a phenomenon that actually exists in the real world.

References  

Abstract objects in the Stanford Encyclopedia of Philosophy

Abstract object in Wikipedia

Mathematical objects in abstractmath

Mathematical objects in Wikipedia

What is Mathematics, Really?  R. Hersh, Oxford University Press, 1997

Previous posts

Representations of mathematical objects 

Representations III: Rigor and Rigor Mortis

Representations II: Dry Bones

Notes on Viewing  

This post uses MathJax. If you see mathematical expressions with dollar signs around them, or badly formatted formulas, try refreshing the screen. Sometimes you have to do it two or three times.

Representing and thinking about sets

Representations of sets

Sets are represented in the math literature in several different ways, some mentioned here.  Also mentioned are some other possibilities.  Introducing a variety of representations of any type of math object is desirable because students tend to assume that the representation is the object.

Curly bracket notation

The standard representation for a finite set is of the form "$\{1,3,5,6\}$". This particular example represents the unique set containing the integers $1$, $3$, $5$ and $6$ and nothing else. This means precisely that the statement "$n$ is an element of $\{1,3,5,6\}$" is true if $n=1$, $n=3$, $n=5$ or $n=6$, and it is false if $n$ represents any other mathematical object. 

In the way the notation is usually used, "$\{1,3,5,6\}$", "$\{3,1,5,6\}$", "$\{1,5,3,6\}$", "$\{1,6,3,5,1\}$" and "$\{6,6,3,5,1,5\}$" all represent the same set. Textbooks sometimes say "order and repetition don't matter". But that is a statement about this particular representation style for sets. It is not a statement about sets.
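Python's set literals happen to follow the same convention, so the point can be tried out directly. This is only an illustration that uses Python's built-in `set` as a stand-in for finite mathematical sets.

```python
# Five different written forms, mirroring the examples in the text.
forms = [
    {1, 3, 5, 6},
    {3, 1, 5, 6},
    {1, 5, 3, 6},
    {1, 6, 3, 5, 1},
    {6, 6, 3, 5, 1, 5},
]

# All five expressions evaluate to the same set.
print(all(s == forms[0] for s in forms))  # True
print(len(forms[-1]))                     # 4

# A list is a different kind of object: order and repetition are part of it.
print([1, 3, 5, 6] == [3, 1, 5, 6])       # False
```

The five literals differ as strings of symbols, yet they denote one object, which is exactly the distinction between the notation and the set.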

It would be nice to come up with a representation for sets that doesn't involve an ordering. Traditional algebraic notation is essentially one-dimensional and so automatically imposes an ordering (see Algebra is a difficult foreign language).    

Let the elements move

In Visible Algebra II, I experimented with the idea of putting the elements at random inside a circle and letting them visibly move around like goldfish in a bowl.  (That experiment was actually for multisets but it applies to sets, too.)  This is certainly a representation that does not impose an ordering, but it is also distracting.  Our visual system is attracted to movement (but not as much as a cat's visual system).  

Enforce natural ordering

One possibility would be to extend the machinery in a visible algebra system that allows you to make a box you could drag elements into. 

This box would order the elements in some canonical order (numerical order for numbers, alphabetical order for strings of letters or words) with the property that if you inserted an element in the wrong place it would rearrange itself, and if you tried to insert an element more than once the representation would not change.  What you would then have is a unique representation of the set.

An example is the device below.  (If you have Mathematica, not just CDF player, you can type in numbers as you wish instead of having to use the buttons.) 

This does not allow a representation of a heterogeneous set such as $\{3,\mathbb{R},\emptyset,\left(\begin{array}{cc}1&2\\0&1\\ \end{array}\right)\}$.  So what?  You can't represent every function by a graph, either.
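The behavior of such a box can be imitated in a few lines of Python. This is a sketch of the idea only, not the code behind the demo; the class name `CanonicalBox` is hypothetical, and it assumes the elements can be compared with each other (all numbers, or all strings).

```python
import bisect

class CanonicalBox:
    """A hypothetical 'box' that keeps its elements in sorted order without repeats."""

    def __init__(self):
        self._items = []

    def insert(self, x):
        # Find where x belongs in the canonical (sorted) order.
        i = bisect.bisect_left(self._items, x)
        # Inserting an element that is already present changes nothing.
        if i == len(self._items) or self._items[i] != x:
            self._items.insert(i, x)

    def __repr__(self):
        return "{" + ", ".join(map(str, self._items)) + "}"

box = CanonicalBox()
for x in [6, 6, 3, 5, 1, 5]:
    box.insert(x)
print(box)  # {1, 3, 5, 6}
```

Whatever order the elements are dropped in, and however often, the displayed form is the same, so each set of comparable elements gets exactly one representation.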

Hanger notation

The tree notation used in my visual algebra posts could be used for sets as well, as illustrated below. The system allows you to drag the elements listed into different positions, including all around the set node. If you had a node for lists, that would not be possible.

This representation has the pedagogical advantage of showing that a set is not its elements.

  • A set is distinct from its elements.
  • A set is completely determined by what the elements are.

Pattern recognition

Infinite sets are sometimes represented in curly bracket notation by a pattern that suggests the set.  For example, the set of even integers could be represented by $\{0,2,4,6,\ldots\}$.  Such a representation is necessarily a convention, since any beginning pattern can in fact represent infinitely many different infinite sets.  Personally, I would write, "Consider the even integers $\{0,2,4,6,\ldots\}$", but I would not write,  "Consider the set $\{0,2,4,6,\ldots\}$".

By the way, if you are writing for newbies, you should say, "Consider the set of even integers $\{0,2,4,6,\ldots\}$". The sentence "Consider the even integers $\{0,2,4,6,\ldots\}$" is unambiguous because by convention a list of numbers in curly brackets defines a set. But newbies need lots of redundancy.
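The remark that a beginning pattern does not determine the set is easy to check by machine. In the Python sketch below, `impostor` is a rule invented for this illustration: it agrees with the even integers on the first four terms and then goes its own way.

```python
def evens(k):
    """k-th term of the even non-negative integers: 0, 2, 4, 6, 8, ..."""
    return 2 * k

def impostor(k):
    """Agrees with evens(k) for k = 0, 1, 2, 3, then departs from it."""
    return 2 * k + k * (k - 1) * (k - 2) * (k - 3)

print([evens(k) for k in range(6)])     # [0, 2, 4, 6, 8, 10]
print([impostor(k) for k in range(6)])  # [0, 2, 4, 6, 32, 130]
```

Multiplying the extra term by any nonzero integer gives yet another rule that starts $0,2,4,6$, so there are infinitely many candidates for what "$\ldots$" might mean.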

Representation by a sentence

Setbuilder notation is exemplified by $\{x|x>0\}$, which denotes the positive reals, given a convention or explicit statement that $x$ represents a real number.  This allows the representation of some infinite sets without depending on a possibly ambiguous pattern. 

A Visible Algebra system needs to allow this, too. That could be (necessarily incompletely) done in this way:

  • You type in a sentence into a Setbuilder box that defines the set.
  • You then attach a box to the Setbuilder box containing a possible element.
  • The system then answers Yes, No, or Can't Tell.

The Can't Tell answer is a necessary requirement because the general question of whether an element is in a set defined by a first order sentence is undecidable. Perhaps the system could add some choices:

  • Try for a second.
  • Try for an hour.
  • Try for a year.
  • Try for the age of the universe.

Even so, I'll bet a system using Mathematica could answer many questions like this for sentences referring to a specific polynomial, using the Solve or NSolve command.  For example, the answer to the question, "Is $3\in\{n|n\lt0 \text{ and } n^2=9\}$?" (where $n$ ranges over the integers) would be "No", and the answer to  "Is $\{n|n\lt0 \text{ and } n^2=9\}$ empty?" would also be "No". [Corrected 2012.10.24]
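The three-answer interface can be sketched in plain Python. This is not how Mathematica's Solve works; it is a brute-force illustration in which the function names and the search bound are assumptions made for the example.

```python
def is_member(candidate, predicate):
    """Answer 'Yes' or 'No' by evaluating the defining sentence at the candidate."""
    return "Yes" if predicate(candidate) else "No"

def is_empty(predicate, search_bound):
    """Look for a witness among the integers n with |n| <= search_bound.

    Finding a witness settles the question: the set is not empty.
    Finding none says nothing about integers outside the bound,
    so the honest answer in that case is Can't Tell.
    """
    for n in range(-search_bound, search_bound + 1):
        if predicate(n):
            return "No"
    return "Can't Tell"

# The set {n | n < 0 and n^2 = 9}, with n ranging over the integers.
def defining_sentence(n):
    return n < 0 and n * n == 9

print(is_member(3, defining_sentence))      # No
print(is_empty(defining_sentence, 100))     # No (-3 is a witness)
print(is_empty(lambda n: n * n == 2, 100))  # Can't Tell
```

The search bound plays the role of the "try for a second / an hour / a year" choices: a larger bound can turn a Can't Tell into a No, but a bounded search alone never produces a Yes to the emptiness question.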

References

  1. Explaining “higher” math to beginners (previous post)
  2. Algebra is a difficult foreign language (previous post)
  3. Visible Algebra II (previous post)
  4. Sets: Notation (abstractmath article)
  5. Setbuilder notation (Wikipedia)

Notes on Viewing  

  • This post uses MathJax. If you see mathematical expressions with dollar signs around them, or badly formatted formulas, try refreshing the screen. Sometimes you have to do it two or three times.
  • To manipulate the demos in this post, you must have Wolfram CDF Player installed on your computer. It is available free from the Wolfram website. The code for the demos is in the Mathematica notebook Representing sets.nb.  

Semantics of algebra I

In the post Algebra is a difficult foreign language  I listed some of the difficulties of the syntax of the symbolic language of math (which includes high school algebra and precalculus).  The semantics causes difficulties as well.  Again I will list some examples without any attempt at completeness.

The status of the symbolic language as a language

There is a sharp distinction between the symbolic language of math and mathematical English, which I have written about in The languages of math and in the Handbook of mathematical discourse. Other authors do not make this sharp distinction (see the list of references at the end of this post). The symbolic language occurs embedded in mathematical English and the embedding has its own semantics which may cause great difficulty for students.

The symbolic language of math can be described as a natural formal language. Pieces of it were invented by mathematicians and others over the course of the last several hundred years. Individual pieces (notation such as "$3x+1=2y$") can be given a strictly formal syntax, but the whole system is ambiguous, inconsistent, and context-sensitive.  When you get to the research level, it has many dialects: Research mathematicians in one field may not be able to read research articles in a very different field.

Examples

I think the examples below will make these claims plausible.  This should be the subject of deep research.

Superscripts and functions

  • A superscript, as in $5^2$ or $x^3$, has a pretty standard meaning denoting a power, at least until you get to higher level stuff such as tensors.  
  • A function can be denoted by a letter, symbol, or string, and the notation $f(x)$ refers to the value of the function at input $x$.  

For functions defined on numbers, it is common in precalculus and higher to write $f^2(x)$ to denote $(f(x))^2=f(x)\,f(x)$.  Since the values of certain multiletter functions are commonly written without the parentheses (for example, $\sin\,x$), one writes $\sin^2x$ to mean $(\sin\,x)^2$.

The notation $f^n$ is also widely used to mean the $n$th iterate of $f$ (if it exists), so $f^3(x)=f(f(f(x)))$ and so on.  This leads naturally to writing $f^{-1}(x)$ for the inverse function of $f$; this is common notation whether the function $f$ is bijective or not (in which case $f^{-1}$ is set-valued).  Thus $\sin^{-1}x$ means $\arcsin\,x$.

It is notorious that words in mathematical English have different meanings in different texts.  This is an example in the symbolic language (and not just at the research level) of a systematic construction that can give expressions that have ambiguous meanings.

This phenomenon is an example of why I say the symbolic language of math is a natural formal language: exponent notation, originally used for repeated multiplication of values, has been naturally extended to the binary operation of composition.  And that leads to students thinking that $\sin^{-1}x$ means $\frac{1}{\sin\,x}$. 
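
The competing readings of a superscript on a function can be made concrete with a small Python sketch. The helper names `power_of_value` and `iterate` are invented for this illustration; they are not standard library functions.

```python
import math

def power_of_value(f, n):
    """Read f^n(x) as (f(x))^n: a power of the value."""
    return lambda x: f(x) ** n

def iterate(f, n):
    """Read f^n(x) as f applied n times to x (n >= 0): a power under composition."""
    def g(x):
        for _ in range(n):
            x = f(x)
        return x
    return g

x = 0.5
print(power_of_value(math.sin, 2)(x))    # (sin x)^2, the usual reading of sin^2 x
print(iterate(math.sin, 2)(x))           # sin(sin x), the iterate reading
print(math.asin(x))                      # arcsin x, the usual reading of sin^{-1} x
print(power_of_value(math.sin, -1)(x))   # 1/sin x, the reading students fall into
```

The four printed numbers are all different, which is the whole difficulty: the same superscript notation selects a different computation depending on a convention that the notation itself does not show.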

History can overtake notation, too: Mathematicians probably took to writing $\sin\,x$ instead of $\sin(x)$ because it saves writing.  That was not very misleading in the old days when mathematical variables were always single symbols.  But students see multiletter variable names all the time these days (in programming languages, Excel and elsewhere), so of course some of them think $\sin\,x$ means $\sin$ times $x$. People who do this are not idiots.

Juxtaposition

Juxtaposition of two symbols means many different things.

  • If $m$ and $n$ are numbers, $mn$ denotes the product of the two numbers.
    • Multiplication is commutative, so $mn$ and $nm$ denote the same number, but they correspond to different calculations.  
  • If $M$ and $N$ are matrices, $MN$ denotes the matrix product of the two matrices.
    • This is a binary operation but it is not the same operation denoted by juxtaposition of numbers. (In fact it involves both addition and multiplication of numbers.)
    • Now $MN$ may not be the same matrix as $NM$.
  • If $A$ and $B$ are points in a geometric drawing, $AB$ denotes the line segment from $A$ to $B$.
    • This is a function of two variables denoting points whose value is a line segment.  
    • It is not what is usually called a binary operation, although as an opinionated category theorist I would call it a multisorted binary operation.
    • It is commutative, but it doesn't make sense to ask if it is associative.

This phenomenon is called overloaded notation.  

  • In order to understand the meaning of the juxtaposition of symbols, you have to know the type of the variables.
  • The surrounding text may tell you specifically that the variables denote matrices or whatever. So this is an instance of context-sensitive semantics. 
    • Students tend to expect that they know what any formula means in isolation from the text.  It may make them very sad to discover that this doesn't work — once they believe it, which can take quite a while.
  • In many cases the problem is alleviated by the use of convention.
    • Matrices are usually denoted by capital letters, numbers by lower case letters.
    • But points in geometry are usually denoted by capital letters too.  So you have to know that referring to a geometric diagram is significant to understanding the notation. This is an indirect form of context-sensitivity.  Did any teacher ever point this out to students?  Does it appear anywhere in print?
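
Programmers meet the same phenomenon as operator overloading. The following Python sketch is only an analogy, with made-up names (`Point`, `Segment`, `juxtapose`) and matrices represented naively as lists of rows; it shows one "juxtaposition" whose meaning is chosen by the types of its two arguments.

```python
from dataclasses import dataclass
from numbers import Number

@dataclass(frozen=True)
class Point:
    x: float
    y: float

@dataclass(frozen=True)
class Segment:
    ends: frozenset   # unordered, so AB and BA are the same segment

def juxtapose(a, b):
    """Interpret the juxtaposition 'ab' according to the types of a and b."""
    if isinstance(a, Number) and isinstance(b, Number):
        return a * b                                     # product of numbers
    if isinstance(a, list) and isinstance(b, list):      # matrices as lists of rows
        return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)]
                for row in a]
    if isinstance(a, Point) and isinstance(b, Point):
        return Segment(frozenset({a, b}))                # segment between two points
    raise TypeError("juxtaposition is not defined for these types")

M = [[1, 1], [0, 1]]
N = [[1, 0], [1, 1]]
A, B = Point(0, 0), Point(1, 2)

print(juxtapose(3, 4) == juxtapose(4, 3))   # True:  mn = nm
print(juxtapose(M, N) == juxtapose(N, M))   # False: MN and NM differ here
print(juxtapose(A, B) == juxtapose(B, A))   # True:  AB and BA are the same segment
```

The program can inspect the types of its arguments. A reader of a printed formula has to recover them from the surrounding text and from conventions about letters.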

The earlier example of $\sin^{-1}x$ is a case which is not context-sensitive.  Knowing the types of the variables won't help.  Of course, if the author explains which meaning is meant, that explanation is within the context of the book!  That is not a lot of help for grasshoppers like me who look back and forth at different parts of a math book instead of reading it straight through.  

Equations

Consider the expressions

  1. $x^2-5x+4=0$
  2. $x^2+y^2=1$
  3. $x^2+2x+1=(x+1)^2$

They are assertions that two expressions have the same value. A strictly logical view of an equation containing variables is that it puts a constraint on the variables.  It is true of some numbers (or pairs of numbers) and false of others.  That is the defining property of an equation. Equation 1 requires that $x=1$ or $x=4$.  Equation 2 imposes a constraint which is satisfied by uncountably many pairs of real numbers, and is also not true of uncountably many pairs. But equation 3 puts no constraint on the variable.  It is true of every number $x$.
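
The strictly logical view can be imitated in a few lines of Python by treating each equation as a true-or-false test and trying it on sample values. This is only a sketch: the names `eq1`, `eq2`, `eq3` and the integer sample are choices made for the illustration, and an integer sample can of course show only a few of the uncountably many real solutions of equation 2.

```python
# Each equation becomes a predicate: a function returning True or False.
eq1 = lambda x: x**2 - 5*x + 4 == 0
eq2 = lambda x, y: x**2 + y**2 == 1
eq3 = lambda x: x**2 + 2*x + 1 == (x + 1)**2

sample = range(-10, 11)   # the integers from -10 to 10

# Values in the sample that satisfy equation 1
print([x for x in sample if eq1(x)])                          # [1, 4]

# Integer pairs in the sample that satisfy equation 2
print([(x, y) for x in sample for y in sample if eq2(x, y)])

# Equation 3 holds for every sampled value
print(all(eq3(x) for x in sample))                            # True
```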

A strictly logical view of symbolic notation does math a disservice.  Here, the notion that an equation is by definition a symbolic statement that has a truth set and a falsity set may be correct but it is not the important thing about any particular equation. When we read and do math we have many different metaphors and images about a concept.  The definition of a kind of object is often in terms of things that may not be the most important things to know about it.  (One of the most important facts about groups is that they are an abstraction of symmetries, which the axioms don't mention at all.)

Equation 1 is something that would make most people set out to discover the truth set.  Equation 2 calls out for drawing its graph.  Equation 3, being an identity, is useful in algebraic reasoning.  The images they call up are different and what you do with them is different.  The images and metaphors that cluster around a concept are an important part of the semantics of the symbolic language.

I expect to post separately about the semantics of variables and about the semantics of symbolic language embedded in mathematical English.

References

Algebra is a difficult foreign language

Algebra

In a previous post, I said that the symbolic language of mathematics is difficult to learn and that we don't teach it well. (The symbolic language includes as a subset the notation used in high school algebra, precalculus, and calculus.) I gave some examples in that post but now I want to go into more detail.  This discussion is an incomplete sketch of some aspects of the syntax of the symbolic language.  I will write one or more posts about the semantics later.

The languages of math

First, let's distinguish between mathematical English and the symbolic language of math. 

  • Mathematical English is a special register or jargon of English. It not only has its special vocabulary, like any jargon, but also uses ordinary English words such as "If…then", "definition" and "let" in special ways. 
  • The symbolic language of math is a distinct, special-purpose written language which is not a dialect of the English language and can in fact be read by mathematicians with little knowledge of English.
    • It has its own symbols and rules that are quite different from spoken languages. 
    • Simple expressions can be pronounced, but complicated expressions may only be pointed to or referred to.
  • A mathematical article or book is typically written using mathematical English interspersed with expressions in the symbolic language of math.

Symbolic expressions

A symbolic noun (logicians call it a term) is an expression in the symbolic language that names a number or other mathematical object, and may carry other information as well.

  • "3" is a noun denoting the number 3.
  • "$\text{Sym}_3$" is a noun denoting the symmetric group on 3 letters.
  • "$2+1$" is a noun denoting the number 3.  But it contains more information than that: it describes a way of calculating 3 as a sum.
  • "$\sin^2\frac{\pi}{4}$" is a noun denoting the number $\frac{1}{2}$, and it also describes a computation that yields the number $\frac{1}{2}$.  If you understand the symbolic language and know that $\sin$ is a numerical function, you can recognize "$\sin^2\frac{\pi}{4}$" as a symbolic noun representing a number even if you don't know how to calculate it.
  • "$2+1$" and "$\sin^2\frac{\pi}{4}$" are said to be encapsulated computations.
    • The word "encapsulated" refers to the fact that to understand what the expressions mean, you must think of the computation not as a process but as an object.
    • Note that a computer program is also an object, not a process.
  • "$a+1$" and "$\sin^2\frac{\pi x}{4}$" are encapsulated computations containing variables that represent numbers. In these cases you can calculate the value of these computations if you give values to the variables.  

A symbolic statement is a symbolic expression that represents a statement that is either true or false or free, meaning that it contains variables and is true or false depending on the values assigned to the variables.

  • $\pi\gt0$ is a symbolic assertion that is true.
  • $\pi\lt0$ is a symbolic assertion that is false.  The fact that it is false does not stop it from being a symbolic assertion.
  • $x^2-5x+4\gt0$ is an assertion that is true for $x=5$ and false for $x=1$.
  • $x^2-5x+4=0$ is an assertion that is true for $x=1$ and $x=4$ and false for all other numbers $x$.
  • $x^2+2x+1=(x+1)^2$ is an assertion that is true for all numbers $x$. 

Properties of the symbolic language

The constituents of a symbolic expression are symbols for numbers, variables and other mathematical objects. In a particular expression, the symbols are arranged according to conventions that must be understood by the reader. These conventions form the syntax or grammar of symbolic expressions. 

The symbolic language has been invented piecemeal by mathematicians over the past several centuries. It is thus a natural language and like all natural languages it has irregularities and often results in ambiguous expressions. It is therefore difficult to learn and requires much practice to learn to use it well. Students learn the grammar in school and are often expected to understand it by osmosis instead of by being taught specifically.  However, it is not as difficult to learn well as a foreign language is.

In the basic symbolic language, expressions are written as strings of symbols.

  • The symbolic language gives (sometimes ambiguous) meaning to symbols placed above or below the line of symbols, so the strings are in some sense more than one dimensional but less than two-dimensional.
  • Integral notation, limit notation, and others, are two-dimensional enough to have two or three levels of symbols. 
  • Matrices are fully two-dimensional symbols, and so are commutative diagrams.
  • I will not consider graphs (in both senses) and geometric drawings in this post because I am not sure what I want to write about them.

Syntax of the language

One of the basic methods of the symbolic language is the use of constructors.  These can usually be analyzed as functions or operators, but I am thinking of "constructor" as a linguistic device for producing an expression denoting a mathematical object or assertion. Ordinary languages have constructors, too; for example "-ness" makes a noun out of an adjective ("good" to "goodness") and "and" forms a grouping ("men and women").

Special symbols

The language uses special symbols both as names of specific objects and as constructors.

  • The digits "0", "1", "2" are named by special symbols.  So are some other objects: "$\emptyset$", "$\infty$".
  • Certain verbs are represented by special symbols: "$=$", "$\lt$", "$\in$", "$\subseteq$".
  • Some constructors are infixes: "$2+3$" denotes the sum of 2 and 3 and "$2-3$" denotes the difference between them.
  • Others are placed before, after, above or even below the name of an object.  Examples: $a'$, which can mean the derivative of $a$ or the name of another variable; $n!$ denotes $n$ factorial; $a^\star$ is the dual of $a$ in some contexts; $\vec{v}$ constructs a vector whose name is "$v$".
  • Letters from other alphabets may be used as names of objects, either defined in the context of a particular article, or with more nearly global meaning such as "$\pi$" (but "$\pi$" can denote a projection, too).

This is a lot of stuff for students to learn. Each symbol has its own rules of use (where you put it, which sort of expression you may use it with, etc.).  And the meaning is often determined by context. For example $\pi x$ usually means $\pi$ multiplied by $x$, but in some books it can mean the function $\pi$ evaluated at $x$. (But this is a remark about semantics — more in another post.)

"Systematic" notation

  • The form "$f(x)$" is systematically used to denote the value of a function $f$ at the input $x$.  But this usage has variations that confuse beginning students:
    • "$\sin\,x$" is more common than "$\sin(x)$".
    • When the function has just been named as a letter, "$f(x)$" is more common than "$fx$" but many authors do use the latter.
  • Raising a symbol after another symbol commonly denotes exponentiation: "$x^2$" denotes $x$ times $x$.  But it is used in a different meaning in the case of tensors (and elsewhere).
  • Lowering a symbol after another symbol, as in "$x_i$"  may denote an item in a sequence.  But "$f_x$" is more likely to denote a partial derivative.
  • The integral notation is quite complicated.  The expression \[\int_a^b f(x)\,dx\] has three parameters, $a$, $b$ and $f$, and a bound variable $x$ that specifies the variable used in the formula for $f$.  Students gradually learn the significance of these facts as they work with integrals. 

Variables

Variables have deep problems concerned with their meaning (semantics). But substitution for variables causes syntactic problems that students have difficulty with as well.

  • Substituting $4$ for $x$ in the expression $3+x$ results in $3+4$. 
  • Substituting $4$ for $x$ in the expression $3x$ results in $12$, not $34$. 
  • Substituting "$y+z$" for $x$ in the expression $3x$ results in $3(y+z)$, not $3y+z$.  Some of my calculus students in performing this substitution would write $3\,\,y+z$, using a space to separate.  The rules don't allow that, but I think it is a perfectly natural mistake. 
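
One way to see why these substitutions go wrong is that substitution really acts on the structure of an expression, not on its string of symbols. The Python sketch below contrasts the two. The tuple representation `(operator, left, right)` and the functions `substitute` and `show` are assumptions made for this illustration, and products are written with an explicit `*` because juxtaposition is exactly what makes string splicing ambiguous.

```python
def substitute(expr, var, replacement):
    """Replace every occurrence of the variable var in an expression tree."""
    if expr == var:
        return replacement
    if isinstance(expr, tuple):
        op, left, right = expr
        return (op,
                substitute(left, var, replacement),
                substitute(right, var, replacement))
    return expr   # a number or a different variable

def show(expr, parent_op=None):
    """Write a tree as a string, parenthesizing a sum that sits inside a product."""
    if not isinstance(expr, tuple):
        return str(expr)
    op, left, right = expr
    text = show(left, op) + op + show(right, op)
    return "(" + text + ")" if op == "+" and parent_op == "*" else text

three_x = ("*", 3, "x")   # the expression 3x as a tree

print("3x".replace("x", "4"))                            # 34  (splicing the string)
print(show(substitute(three_x, "x", 4)))                 # 3*4
print(show(substitute(three_x, "x", ("+", "y", "z"))))   # 3*(y+z)
```

The parentheses in the last line are not part of what was substituted. They are added when the tree is written out as a string, which is the step students are never told about.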

Using expressions and writing about them

  • If I write "If $x$ is an odd integer, then $3+x$ is odd", then I am using $3+x$ in a sentence. It is a noun denoting an unspecified number which can be constructed in a specified way.
  • When I mention substituting $4$ for $x$ in "$3+x$", I am talking about the expression $3+x$.  I am not writing about a number, I am writing about a string of symbols.  This distinction causes students major difficulties and teachers hardly ever talk about it.
  • In the section on variables, I wrote "the expression $3+x$", which shows more explicitly that I am talking about it as an expression.
    • Note that quotes in novels don't mean you are talking about the expression inside the quotes; they mean you are describing the act of a person saying something.
  • It is very common to write something like, "If I substitute $4$ for $x$ in $3x$ I get $3 \times 4=12$".  This is called a parenthetic assertion, and it is literally nonsense (it says I get an equation).
  • If I pronounce the sentence "We know that $x\gt0$", I pronounce "$x\gt0$" as "$x$ is greater than zero".  If I pronounce the sentence "For any $x\gt0$ there is $y\gt0$ for which $x\gt y$", then I pronounce the expression "$x\gt0$" as "$x$ greater than zero".  This is an example of context-sensitive pronunciation.
  • There is a lot more about parenthetic assertions and context-sensitive pronunciation in More about the languages of math.

Conclusion

I have described some aspects of the syntax of the symbolic language of math. Learning that syntax is difficult and requires a lot of practice. Students who manage to learn the syntax and semantics can go on to learn further math, but students who don't are forever blocked from many rewarding careers. I heard someone say at the MathFest in Madison that about 25% of all high school students never really understand algebra.  I have only taught college students, but some students (maybe 5%) who get into freshman calculus in college are weak enough in algebra that they cannot continue. 

I am not proposing that all aspects of the syntax (or semantics) be taught explicitly.  A lot must be learned by doing algebra, where students pick up the syntax subconsciously just as they pick up lots of other behavior-information in and out of school. But teachers should explicitly understand the structure of algebra at least in some basic way so that they can be aware of the source of many of the students' problems. 

It is likely that the widespread use of computers will allow some parts of the symbolic language of math to be replaced by other methods such as using Excel or some visual manipulation of operations as suggested in my post Mathematical and linguistic ability.  It is also likely that the symbolic language will gradually be improved to get rid of ambiguities and irregularities.  But a deliberate top-down effort to simplify notation will not succeed. Such things rarely succeed.

Mathematical and linguistic ability

Some personal history

When I was young, I was your typical nerdy geek.  (Never mind what I am now that I am old.)

In high school, I was fascinated by languages, primarily by their structure.  I would have wanted to become a linguist if I had known there was such a thing.  I was good at grasping the structure of a language and read grammars for fun. I was only pretty good at picking up vocabulary. I studied four different languages in high school and college and Turkish when I was in the military.  I know a lot about their structure but am not fluent in any of them (possibly including English).

After college, I decided to go to math grad school.  This was soon after Sputnik and jobs for PhD's were temporarily easy to get.

I always found algebra easy.  When I had to learn other symbolic languages, for example set theory, first order logic, and early programming languages, I found them easy too.  I had enough geometric insight that I did well in all my math courses, but my real strength was in learning languages. 

When I got a job at (what is now) Case Western Reserve University, I began learning category theory and a bit of cohomology of groups. I wrote a paper about group automorphisms that got into Transactions of the AMS.  (Full disclosure: I am bragging). 

The way Saunders Mac Lane did cohomology, he used "$+$" as a noncommutative operation.  No problem with that, I did lots of calculations in his notation.  In reading category theory I learned how to reason using commutative diagrams.  That is radically different from other math — it isn't strings of symbols — but I caught on. I read Beck's thesis in detail.  Beck wrote functions on the right (unlike Mac Lane) which I adapted to with no problem.  In fact my automorphisms paper and many others in those days were written with functions on the right. 

Later on in my career, I learned to program in Forth reasonably well. It is a reverse Polish language. Then (by virtue of summer grants in the 1990's) I learned to use Mathematica, which I now use a lot:  I am an "experienced" user but not an "expert".

Learning foreign languages in studying math

I taught mostly engineering students during my 35 years at CWRU (especially computer engineering). When I used a text (including my own discrete math class notes) some students pleaded with me not to use $P\wedge Q$ and $P \vee Q$ but let them use $PQ$ and $P+Q$ like they did in their CS courses.  Likewise $1$ and $0$ instead of T and F.  Many of them simply could not switch easily between different codes.  Similar problems occurred in classes in first order logic. 

In the early days of calculators when most of them were reverse Polish, some students never mastered their use. 

These days, a common complaint about Mathematica is that it is a difficult language to learn; at the MAA meeting in Madison (where I am as I write this) they didn't even staff a booth.  Apparently too many of the professors can't handle Mathematica.

I gave up writing papers with functions on the right because several professional mathematicians complained that they found them too hard to read. I guess not all professional mathematicians can switch code easily, either. 

There are many great mathematicians whose main strength is geometric understanding, not linguistic understanding.  Nevertheless, to become a mathematician you have to have enough linguistic ability to learn…

Algebra

The big elephant in the room is ordinary symbolic algebra as is used in high school algebra and precalculus.  This of course causes difficulty among first year calculus students, too, but college profs are spared the problem that high school teachers have with a large percentage of the students never really grasping how algebra works.  We don't see those students in STEM courses.

It is surely the case that algebra is a difficult and unintuitive foreign language.  I have carried on about this in my stuff about the languages of math in my abstractmath site. 

Some students already in college don't really understand expressions such as $x^2$.  You still get some who sporadically think it means $2x$.  (They don't always think that, but it happens when they are off guard.)  Lots of them don't understand the difference between $x^2$ and $2^x$.

In complicated situations, students don't grasp the difference between an expression such as $x^2+2x+1$ and a statement like $x^2+2x+1=0$.  Not to mention the difference between the way $x^2+2x+1=0$ and $x^2+2x+1=(x+1)^2$ are different kinds of statements even though the difference is not indicated in the syntax.

There are many irregularities and ambiguities (just like any natural language — the symbolic language of math is a natural language!): consider $\sin xy$, $\sin x + y$, $\sin x/y$.  (Don't squawk to me about order of operators.  That's as bad as aus, außer, bei, mit, zu.  German can't help it, but mathematical notation could.)

One monstrous ambiguity is $(x,y)$, which could be an ordered pair, the GCD, or an open interval.  I found an example of two of those in the same sentence in the Handbook of Mathematical Discourse, and today in a lecture I saw someone use it with two meanings about three inches apart on a transparency.

Anyway, the symbolic language of math is difficult and we don't teach it well.

Structuring calculations

There are other ways to structure calculations that are much more transparent.  Most of them use two or three dimensions.

  • Spreadsheets: It is easy to approximate the zeros of a function using a spreadsheet and changing the input till you get the value near zero. Why can't middle school students be taught that?
  • Bret Victor has made suggestions for easy ways to calculate things.
  • My post Visible Algebra I suggests a two-dimensional approach to putting together calculations.  (There are several more posts coming about that idea.)
  • Mathematica interactive demos could maybe be provided in a way that would allow them to be joined together to make a complicated calculation. (Modules such as an inverse image constructor.)  I have not tried to do this.
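
The spreadsheet idea in the first item can be imitated in a few lines of Python, for readers who want to see the procedure spelled out. This is a rough sketch, not a recommended root-finder: `tabulate` plays the role of two spreadsheet columns, `zoom_in` plays the role of the person retyping the starting value and step size, and both names are invented for this example.

```python
def tabulate(f, start, step, rows=11):
    """Return rows of (x, f(x)), like two columns of a spreadsheet."""
    return [(start + i * step, f(start + i * step)) for i in range(rows)]

def zoom_in(f, start, step, rounds):
    """Repeatedly retabulate the first interval on which f changes sign."""
    for _ in range(rounds):
        table = tabulate(f, start, step)
        for (x0, y0), (_x1, y1) in zip(table, table[1:]):
            if y0 * y1 <= 0:                  # sign change between two rows
                start, step = x0, step / 10   # look at that interval, ten times finer
                break
        else:
            return None                       # no sign change in this table
    return start

# Approximate a zero of x^2 - 2 between 1 and 2
print(zoom_in(lambda x: x * x - 2, 1.0, 0.1, 6))   # close to the square root of 2
```

Each round gains roughly one more decimal digit, which is what a student would see on screen when changing the input cells by hand.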

A lot of these alternatives work better because they make full use of two dimensions.  Toolkits could be made for elementary school students (there are some already but I am not familiar with them).  

It is impractical to expect that every high school student master basic algebraic notation.  It is difficult and we don't know how to teach it to everyone. With the right toolkits, we could enable everyone, not just students, to put together usable calculations on their computer and experiment with them.  This includes working out the effect of different payment periods on loans, how much paint you need for a room, and many other things.

STEM students will still have to learn algebraic notation as we use it now.  It should be taught as a foreign language with explicit instruction in its syntax (sentences and terms, scope of an operator, and so on), ambiguities and peculiarities.

Metaphors in computing science 2

In Metaphors in Computer Science 1, I discussed some metaphors used when thinking about various aspects of computing.  This is a continuation of that post.

Metaphor: A program is a list of instructions.

  • I discussed this metaphor in detail in the earlier post.
  • Note particularly that the instructions can be in a natural or a programming language. (Is that a zeugma?)  Many writers would call instructions in a natural language an algorithm.
  • I will continue to use “program” in the broader sense.

Metaphor: A programming language is a language.

  • This metaphor is a specific conceptual blend that associates the strings of symbols that constitute a program in a computer language with text in a natural language.
  • The metaphor is based on some similarities between expressions in a programming language and expressions in a natural language.
    • In both, the expressions have a meaning.
    • Both natural and programming languages have specific rules for constructing well-formed expressions.
  • This way of thinking ignores many deep differences between programming languages and natural languages. In particular, they don’t talk about the same things!
  • The metaphor has been powerful in suggesting ways of thinking about computer programs, for example semantics (below) and ambiguity.

Metaphor: A computer program is a list of statements

  • A consequence of this metaphor is that a computer program is a list of symbols that can be stored in a computer’s memory.
  • This metaphor comes with the assumption that if the program is written in accordance with the language’s rules, a computer can execute the program and perhaps produce an output.
  • This is the profound discovery, probably by Alan Turing, that made the computer revolution possible. (You don’t have to have different physical machines to do different things.)
  • You may want me to say more in the heading above: “A computer program is a list of statements in a programming language that satisfies the well-formedness requirements of the language.”  But the point of the metaphor is only that a program is a list of statements.  The metaphor is not intended to define the concept of “program”.

Metaphor: A program in a computer language has meanings.

A program is intended to mean something to a human reader.

  • Some languages are designed to be easily read by a human reader: Cobol, Basic, SQL.
    • Their instructions look like English.
    • The algorithm can nevertheless be difficult to understand.
  • Some languages are written in a dense symbolic style.
    • In many cases the style is an extension of the style of algebraic formulas: C, Fortran.
    • Other languages are written in a notation not based on algebra:  Lisp, APL, Forth.
  • The boundary between “easily read” and “dense symbolic” is a matter of opinion!

A program is intended to be executed by a computer.

  • The execution always involves translation into intermediate languages. 
    • Most often the execution requires repeated translation into a succession of intermediate languages.
    • Each translation requires the preservation of the intended meaning of the program.
  • The preservation of intended meaning is what is usually called the semantics of a programming language.
    • In fact, the meaning of the program to a person could be called semantics, too.
    • And the human semantics had better correspond in “meaning” to the machine semantics!
  • The actual execution of the program requires successive changes in the state of the computer.
    • By “state” I mean a list of the form of the electrical charges of each unit of memory in the computer.
    • Or you can restrict it to the relevant units of memory, but spelling that out is horrifying to contemplate.
    • The resulting state of the machine after the program is run is required to preserve the intended meaning, just as all the intermediate translations are.
    • Notice that the actual execution is a series of physical events.  You can describe the execution in English or in some notation, but that notation is not the actual execution.

References

Conceptual blend (Wikipedia)

Conceptual metaphors (Wikipedia)

Images and Metaphors (article in abstractmath)

Semantics in computer science (Wikipedia)

Metaphors in computing science I

(This article is continued in Metaphors in computing science II)

Michael Barr recently told me of a transcription of a talk by Edsger Dijkstra dissing the use of metaphors in teaching programming and advocating that every program be written together with a proof that it works.  This led me to think about the metaphors used in computing science, and that is what this post is about.  It is not a direct answer to what Dijkstra said. 

We understand almost anything by using metaphors.  This is a broader sense of metaphor than that thing in English class where you had to say "my love is a red red rose" instead of "my love is like a red red rose".  Here I am talking about conceptual metaphors (see references at the end of the post).  

Metaphor: A program is a set of instructions

You can think of a program as a list of instructions that you can read and, if the list is not very complicated, understand well enough to carry out.  This metaphor comes from your experience with directions on how to do something (like directions from Google Maps or for assembling a toy).  In the case of a program, you can visualize doing what the program says to do and coming out with the expected output. This is one of the fundamental metaphors for programs. 

Such a program may be informal text or it may be written in a computer language.

Example

A description of how to calculate $n!$ in English could be:  "Multiply the integers $1$ through $n$".  In Mathematica, you could define the factorial function this way:

fac[n_] := Apply[Times, Table[i, {i, 1, n}]]

This more or less directly copies the English definition, which could have been reworded as "Apply the Times function to the integers from $1$ to $n$ inclusive."  Mathematica programmers customarily use the abbreviation "@@" for Apply because it is more convenient:

fac[n_] := Times @@ Table[i, {i, 1, n}]

As far as I know, C does not have list operations built in.  This simple program gives you the factorial function evaluated at $n$:

 j=1;  for (i=2; i<=n; i++)   j=j*i; return j;  

This does the calculation in a different way: it goes through the numbers $1, 2,\ldots,n$ and multiplies the result-so-far by the new number.  If you are old enough to remember Pascal or Basic, you will see that there you could use a FOR loop to accomplish the same thing.
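
The one-line fragment above leaves out the declarations and the function wrapper.  A minimal sketch of a complete version might look like the following.  The name fac_loop, the choice of unsigned long long and the limit of $20$ are my assumptions, not part of the original fragment; the limit is there because $21!$ no longer fits in 64 bits.

```c
#include <assert.h>

/* Returns n! for 0 <= n <= 20.  The name fac_loop and the bound are
   illustrative choices, not taken from the original fragment. */
unsigned long long fac_loop(unsigned int n)
{
    unsigned long long j = 1;      /* the result so far */

    assert(n <= 20);               /* 21! would overflow 64 bits */
    for (unsigned int i = 2; i <= n; i++)
        j = j * i;                 /* multiply the result so far by the new number */
    return j;
}
```

Calling fac_loop(6) should give $720$.  Note that when $n$ is $0$ or $1$ the loop body never runs and the function returns $1$.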

What this metaphor makes you think of

Every metaphor suggests both correct and incorrect ideas about the concept.  

  • If you think of a list of instructions, you typically think that you should carry out the instructions in order.  (If they are Ikea instructions, your experience may have taught you that you must carry out the instructions in order.)  
  • In fact, you don't have to "multiply the numbers from $1$ to $n$" in order at all: You could break the list of numbers into several lists and give each one to a different person to do, and they would give their answers to you and you would multiply them together.
  • The instructions for calculating the factorial can be translated directly into Mathematica instructions, which do not specify an order.   When $n$ is large enough, Mathematica would in fact do something like the process of giving it to several different people (well, processors) to speed things up.
  • I had hoped that Wolfram Alpha would answer "720" if I wrote "multiply the numbers from $1$ to $6$" in its box, but it didn't work.  If it had worked, the instruction in English would not be translated at all. (Note added 7 July 2012:  Wolfram has repaired this.)
  • The example program for C that I gave above explicitly multiplies the numbers together in order from little to big.  That is the way it is usually taught in class.  In fact, you could program a package for lists using pointers (a process taught in class!) and then use your package to write a C program that looks like the  "multiply the numbers from $1$ to $n$" approach.  I don't know much about C; a reader could probably tell me other better ways to do it.
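
The list-package idea in the last bullet can be sketched roughly as follows.  This is only one possible design under my own assumptions: the names node, list_range, list_product, list_free and fac_list are all hypothetical, and a real package would want better error handling than abort().

```c
#include <assert.h>
#include <stdlib.h>

/* A minimal singly linked list of integers (a hypothetical "list package"). */
struct node {
    long long value;
    struct node *next;
};

/* Builds the list lo, lo+1, ..., hi.  Returns NULL (the empty list)
   when hi < lo.  Aborts if memory runs out. */
struct node *list_range(int lo, int hi)
{
    struct node *head = NULL;

    /* Build from the back so each new cell simply goes in front. */
    for (int k = hi; k >= lo; k--) {
        struct node *cell = malloc(sizeof *cell);
        if (cell == NULL)
            abort();
        cell->value = k;
        cell->next = head;
        head = cell;
    }
    return head;
}

/* Multiplies all the entries of a list together; the empty product is 1. */
long long list_product(const struct node *list)
{
    long long result = 1;

    for (; list != NULL; list = list->next)
        result *= list->value;
    return result;
}

/* Releases every cell of a list. */
void list_free(struct node *list)
{
    while (list != NULL) {
        struct node *next = list->next;
        free(list);
        list = next;
    }
}

/* "Multiply the numbers from 1 to n", written with the list package. */
long long fac_list(int n)
{
    struct node *numbers;
    long long result;

    assert(n >= 0 && n <= 20);     /* 21! would overflow long long */
    numbers = list_range(1, n);
    result = list_product(numbers);
    list_free(numbers);
    return result;
}
```

The body of fac_list now reads much like the English sentence.  The ordering has not gone away, though: list_product still walks the list from front to back.  The order has only become a detail hidden inside the package.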

So notice what happened:

  • You can translate the "multiply the numbers from $1$ to $n$" directly into Mathematica.
  •  For C, you have to write a program that implements multiplying the numbers from $1$ to $n$. Implementation in this sense doesn't seem to come up when we think about instruction sets for putting furniture together.  It is sort of like: Build a robot to insert & tighten all the screws.

Thus the concept of program in computing science comes with the idea of translating the program instruction set into another instruction set.

  • The translation provided above for Mathematica resembles translating the instruction set into another language. 
  • The two translations I suggested for C (the program and the definition of a list package to be used in the translation) are not like translating from English to another language.  They involve a conceptual reconstruction of the set of instructions.

Similarly, a compiler translates a program in a computer language into machine code, which involves automated conceptual reconstruction on a vast scale.

Other metaphors

In writing about this, I have brought in other metaphors, for example:

  • C or Mathematica as like a natural language in some ways 
  • Compiling (or interpreting) as translation

Computing science has used other VIMs (Very Important Metaphors) that I need to write about later:

  • Semantics (metaphor: meaning)
  • Program as text – this allows you to treat the program as a mathematical object
  • Program as machine, with states and actions like automata and Turing machines.
  • Specification of a program.  You can regard  "the product of the numbers from $1$ to $n$" as a specification.  Notice that saying "the product" instead of "multiply" changes the metaphor from "instruction" to "specification".
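
To make the difference between "instruction" and "specification" concrete, here is a rough sketch in C.  It rests on an assumption of mine: that "the product of the numbers from $1$ to $n$" can be pinned down by the two properties $f(0)=1$ and $f(n)=n\cdot f(n-1)$.  The names fac_loop and meets_product_spec are hypothetical.  The checker says nothing about how a candidate function computes its values, only what the values must be.

```c
#include <assert.h>
#include <stdbool.h>

/* An implementation in the "instruction" style. */
unsigned long long fac_loop(unsigned int n)
{
    unsigned long long j = 1;

    assert(n <= 20);               /* 21! would overflow 64 bits */
    for (unsigned int i = 2; i <= n; i++)
        j = j * i;
    return j;
}

/* Checks a candidate f against the properties f(0) = 1 and
   f(n) = n * f(n-1) for 1 <= n <= max_n.  It does not look at
   how f arrives at its values. */
bool meets_product_spec(unsigned long long (*f)(unsigned int),
                        unsigned int max_n)
{
    if (f(0) != 1)
        return false;
    for (unsigned int n = 1; n <= max_n; n++)
        if (f(n) != n * f(n - 1))
            return false;
    return true;
}
```

A call such as meets_product_spec(fac_loop, 20) tests the loop against the specification for every $n$ up to $20$.  This is testing, not proof: it checks finitely many cases and no more.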

References

Conceptual metaphors (Wikipedia)

Images and Metaphors (article in abstractmath)

Images and Metaphors for Sets (article in abstractmath)

Images and Metaphors for Functions (incomplete article in abstractmath)

 

 

The meaning of the word “superposition”

This is from the Wikipedia article on Hilbert's 13th Problem as it was on 31 March 2012:

[Hilbert's 13th Problem suggests this] question: can every continuous function of three variables be expressed as a composition  of finitely many continuous functions of two variables? The affirmative answer to this general question was given in 1957 by Vladimir Arnold, then only nineteen years old and a student of Andrey Kolmogorov. Kolmogorov had shown in the previous year that any function of several variables can be constructed with a finite number of three-variable functions. Arnold then expanded on this work to show that only two-variable functions were in fact required, thus answering Hilbert's question.  

In their paper A relation between multidimensional data compression and Hilbert’s 13th  problem,  Masahiro Yamada and Shigeo Akashi describe an example of Arnold's theorem this way: 

Let $f ( \cdot , \cdot, \cdot )$ be the function of three variable defined as \(f(x, y, z)=xy+yz+zx\), $x ,y , z\in \mathbb{C}$ . Then, we can easily prove that there do not exist functions of two variables $g(\cdot , \cdot )$ , $u(\cdot, \cdot)$ and $v(\cdot , \cdot )$ satisfying the following equality: $f(x, y, z)=g(u(x, y),v(x, z)) , x , y , z\in \mathbb{C}$ . This result shows us that $f$ cannot be represented any 1-time nested superposition constructed from three complex-valued functions of two variables. But it is clear that the following equality holds: $f(x, y, z)=x(y+z)+(yz)$ , $x,y,z\in \mathbb{C}$ . This result shows us that $f$ can be represented as a 2-time nested superposition.

The article about superposition in All about circuits says:

The strategy used in the Superposition Theorem is to eliminate all but one source of power within a network at a time, using series/parallel analysis to determine voltage drops (and/or currents) within the modified network for each power source separately. Then, once voltage drops and/or currents have been determined for each power source working separately, the values are all “superimposed” on top of each other (added algebraically) to find the actual voltage drops/currents with all sources active. 

Superposition Theorem in Wikipedia:

The superposition theorem for electrical circuits states that for a linear system the response (Voltage or Current) in any branch of a bilateral linear circuit having more than one independent source equals the algebraic sum of the responses caused by each independent source acting alone, while all other independent sources are replaced by their internal impedances.

Quantum superposition in Wikipedia:  

Quantum superposition is a fundamental principle of quantum mechanics. It holds that a physical system — such as an electron – exists partly in all its particular, theoretically possible states (or, configuration of its properties) simultaneously; but, when measured, it gives a result corresponding to only one of the possible configurations (as described in interpretation of quantum mechanics).

Mathematically, it refers to a property of solutions to the Schrödinger equation; since the Schrödinger equation is linear, any linear combination of solutions to a particular equation will also be a solution of it. Such solutions are often made to be orthogonal (i.e. the vectors are at right-angles to each other), such as the energy levels of an electron. By doing so the overlap energy of the states is nullified, and the expectation value of an operator (any superposition state) is the expectation value of the operator in the individual states, multiplied by the fraction of the superposition state that is "in" that state.

The CIO midmarket site says much the same thing as the first paragraph of the Wikipedia Quantum Superposition entry but does not mention the stuff in the second paragraph.

In particular, the Yamada & Akashi article describes the way the functions of two variables are put together as "superposition", whereas the Wikipedia article on Hilbert's 13th calls it composition.  Of course, superposition in the sense of the Superposition Principle is a composition of multivariable functions with the top function being addition.  Both of Yamada & Akashi's examples have addition at the top.  But the Arnold theorem allows any continuous function at the top (and anywhere else in the composite).  
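
The second form in the Yamada & Akashi example can be written out explicitly as a composite of functions of two variables.  The sketch below does this in C, with some assumptions of my own: it uses integers rather than complex numbers, and the names add, mul, f_direct, f_composite and check_forms_agree are hypothetical.  The two forms agree by the distributive law; the check over a small grid is only a sanity test.

```c
#include <assert.h>

/* Two functions of two variables. */
long add(long a, long b) { return a + b; }
long mul(long a, long b) { return a * b; }

/* f as first given: xy + yz + zx. */
long f_direct(long x, long y, long z)
{
    return x * y + y * z + z * x;
}

/* f as a composite of two-variable functions: x(y + z) + yz.
   Addition is the function at the top. */
long f_composite(long x, long y, long z)
{
    return add(mul(x, add(y, z)), mul(y, z));
}

/* Sanity check: the two forms agree at every integer point
   with coordinates between -bound and bound. */
void check_forms_agree(long bound)
{
    for (long x = -bound; x <= bound; x++)
        for (long y = -bound; y <= bound; y++)
            for (long z = -bound; z <= bound; z++)
                assert(f_direct(x, y, z) == f_composite(x, y, z));
}
```

In f_composite every function that appears takes exactly two arguments, and the outermost one is add, which matches the remark that the example has addition at the top.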

So one question is: is the word "superposition" ever used for general composition of multivariable functions? This requires the kind of research I proposed in the introduction of The Handbook of Mathematical Discourse, which I am not about to do myself.

The first Wikipedia article above uses "composition" where I would use "composite".  This is part of a general phenomenon of using the operation name for the result of the operation; for example, students, even college students, sometimes refer to the "plus of 2 and 3" instead of the "sum of 2 and 3". (See "name and value" in abstractmath.org.)  Using "composition" for "composite" is analogous to this, although the analogy is not perfect.  This may be a change in progress in the language which simplifies things without doing much harm.  Even so, I am irritated when "composition" is used for "composite".

Quantum superposition seems to be a separate idea.  The second paragraph of the Wikipedia article on quantum superposition probably explains the use of the word in quantum mechanics.