## Metaphors in computing science I

Michael Barr recently told me of a transcription of a talk by Edsger Dijkstra dissing the use of metaphors in teaching programming and advocating that every program be written together with a proof that it works.  This led me to think about the metaphors used in computing science, and that is what this post is about.  It is not a direct answer to what Dijkstra said.

We understand almost anything by using metaphors.  This is a broader sense of metaphor than that thing in English class where you had to say "my love is a red red rose" instead of "my love is like a red red rose".  Here I am talking about conceptual metaphors (see references at the end of the post).

### Metaphor: A program is a set of instructions

You can think of a program as a list of instructions that you can read and, if it is not very complicated, understand how to carry them out.  This metaphor comes from your experience with directions on how to do something (like directions from Google Maps or for assembling a toy).   In the case of a program, you can visualize doing what the program says to do and coming out with the expected output. This is one of the fundamental metaphors for programs.

Such a program may be informal text or it may be written in a computer language.

#### Example

A description of how to calculate $n!$ in English could be:  "Multiply the integers $1$ through $n$".  In Mathematica, you could define the factorial function this way:

fac[n_] := Apply[Times, Table[i, {i, 1, n}]]

This more or less directly copies the English definition, which could have been reworded as "Apply the Times function to the integers from $1$ to $n$ inclusive."  Mathematica programmers customarily use the abbreviation "@@" for Apply because it is more convenient:

Fac[n_]:=Times @@ Table[i, {i, 1, 6}]

As far as I know, C does not have list operations built in.  This simple program gives you the factorial function evaluated at $n$:

j=1;  for (i=2; i<=n; i++)   j=j*i; return j;

This does the calculation in a different way: it goes through the numbers $1, 2,\ldots,n$ and multiplies the result-so-far by the new number.  If you are old enough to remember Pascal or Basic, you will see that there you could use a DO loop to accomplish the same thing.

#### What this metaphor makes you think of

Every metaphor suggests both correct and incorrect ideas about the concept.

• If you think of a list of instructions, you typically think that you should carry out the instructions in order.  (If they are Ikea instructions, your experience may have taught you that you must carry out the instructions in order.)
• In fact, you don't have to "multiply the numbers from $1$ to $n$" in order at all: You could break the list of numbers into several lists and give each one to a different person to do, and they would give their answers to you and you would multiply them together.
• The instructions for calculating the factorial can be translated directly into Mathematica instructions, which does not specify an order.   When $n$ is large enough, Mathematica would in fact do something like the process of giving it to several different people (well, processors) to speed things up.
• I had hoped that Wolfram alpha would answer "720" if I wrote "multiply the numbers from $1$ to $6$" in its box, but it didn't work.  If it had worked, the instruction in English would not be translated at all. (Note added 7 July 2012:  Wolfram has repaired this.)
• The example program for C that I gave above explicitly multiplies the numbers together in order from little to big.  That is the way it is usually taught in class.  In fact, you could program a package for lists using pointers (a process taught in class!) and then use your package to write a C program that looks like the  "multiply the numbers from $1$ to $n$" approach.  I don't know much about C; a reader could probably tell me other better ways to do it.

So notice what happened:

• You can translate the "multiply the numbers from $1$ to $n$" directly into Mathematica.
•  For C, you have to write a program that implements multiplying the numbers from $1$ to $n$. Implementation in this sense doesn't seem to come up when we think about instruction sets for putting furniture together.  It is sort of like: Build a robot to insert & tighten all the screws.

Thus the concept of program in computing science comes with the idea of translating the program instruction set into another instruction set.

• The translation provided above for Mathematica resembles translating the instruction set into another language.
• The two translations I suggested for C (the program and the definition of a list package to be used in the translation) are not like translating from English to another language.  They involve a conceptual reconstruction of the set of instructions.

Similarly, a compiler translates a program in a computer language into machine code, which involves automated conceptual reconstruction on a vast scale.

#### Other metaphors

In writing about this, I have brought in other metaphors, for example:

• C or Mathematica as like a natural language in some ways
• Compiling (or interpreting) as translation

Computing science has used other VIM's (Very Important Metaphors) that I need to write about later:

• Semantics (metaphor: meaning)
• Program as text – this allows you to treat the program as a mathematical object
• Program as machine, with states and actions like automata and Turing machines.
• Specification of a program.  You can regard  "the product of the numbers from $1$ to $n$" as a specification.  Notice that saying "the product" instead of "multiply" changes the metaphor from "instruction" to "specification".

#### References

Conceptual metaphors (Wikipedia)

Images and Metaphors (article in abstractmath)

Images and Metaphors for Sets (article in abstractmath)

Images and Metaphors for Functions (incomplete article in abstractmath)

## Syntax Trees in Mathematicians’ Brains

Understanding the quadratic formula

In my last post I wrote about how a student’s pattern recognition mechanism can go awry in applying the quadratic formula.

The template for the quadratic formula says that the solution of a quadratic equation of the form $latex {ax^2+bx+c=0}&fg=000000$ is given by the formula

$latex \displaystyle x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}&fg=000000$

When you ask students to solve $latex {a+bx+cx^2=0}&fg=000000$ some may write

$latex \displaystyle x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}&fg=000000$

$latex \displaystyle x=\frac{-b\pm\sqrt{b^2-4ac}}{2c}&fg=000000$

That’s because they have memorized the template in terms of the letters $latex {a}&fg=000000$, $latex {b}&fg=000000$ and $latex {c}&fg=000000$ instead of in terms of their structural meaning — $latex {a}&fg=000000$ is the coefficient of the quadratic term, $latex {c}&fg=000000$ is the constant term, etc.

The problem occurs because there is a clash between the occurrences of the letters “a”, “b”, and “c” in the template and in the equation to solve. But maybe the confusion would occur anyway, just because of the ordering of the coefficients. As I asked in the previous post, what happens if students are asked to solve $latex {3+5x+2x^2=0}&fg=000000$ after having learned the quadratic formula in terms of $latex {ax^2+bx+c=0}&fg=000000$? Some may make the same kind of mistake, getting $latex {x=-1}&fg=000000$ and $latex {x=-\frac{2}{3}}&fg=000000$ instead of $latex {x=-1}&fg=000000$ and $latex {x=-\frac{3}{2}}&fg=000000$. Has anyone ever investigated this sort of thing?

People do pattern recognition remarkably well, but how they do it is mysterious. Just as mistakes in speech may give the linguist a clue as to how the brain processes language, students’ mistakes may tell us something about how pattern recognition works in parsing symbolic statements as well as perhaps suggesting ways to teach them the correct understanding of the quadratic formula.

Syntactic Structure

“Structural meaning” refers to the syntactic structure of a mathematical expression such as $latex {3+5x+2x^2}&fg=000000$. It can be represented as a tree:

(1)

This is more or less the way a program compiler or interpreter for some language would represent the polynomial. I believe it corresponds pretty well to the organization of the quadratic-polynomial parser in a mathematician’s brain. This is not surprising: The compiler writer would have to have in mind the correct understanding of how polynomials are evaluated in order to write a correct compiler.

Linguists represent English sentences with syntax trees, too. This is a deep and complicated subject, but the kind of tree they would use to represent a sentence such as “My cousin saw a large ship” would look like this:

### Parsing by mathematicians

Presumably a mathematician has constructed a parser that builds a structure in their brain corresponding to a quadratic polynomial using the same mechanisms that as a child they learned to parse sentences in their native language. The mathematician learned this mostly unconsciously, just as a child learns a language. In any case it shouldn’t be surprising that the mathematicians’s syntax tree for the polynomial is similar to the compiler’s.

Students who are not yet skilled in algebra have presumably constructed incorrect syntax trees, just as young children do for their native language.

Lots of theoretical work has been done on human parsing of natural language. Parsing mathematical symbolism to be compiled into a computer program is well understood. You can get a start on both of these by reading the Wikipedia articles on parsing and on syntax trees.

There are papers on students’ misunderstandings of mathematical notation. Two articles I recently turned up in a Google search are:

Both of these papers talk specifically about the syntax of mathematical expressions. I know I have read other such papers in the past, as well.

What I have not found is any study of how the trained mathematician parses mathematical expression.

For one thing, for my parsing of the expression $latex {3+5x+2x^2}&fg=000000$, the branching is wrong in (1). I think of $latex {3+5x+2x^2}&fg=000000$ as “Take 3 and add $latex {5x}&fg=000000$ to it and then add $latex {2x^2}&fg=000000$ to that”, which would require the shape of the tree to be like this:

I am saying this from introspection, which is dangerous!

Of course, a compiler may group it that way, too, although my dim recollection of the little bit I understand about compilers is that they tend to group it as in (1) because they read the expression from left to right.

This difference in compiling is well-understood.  Another difference is that the expression could be compiled using addition as an operator on a list, in this case a list of length 3.  I don’t visualize quadratics that way but I certainly understand that it is equivalent to the tree in Diagram (1).  Maybe some mathematicians do think that way.

But these observations indicate what might be learned about mathematicians’ understanding of mathematical expressions if linguists and mathematicians got together to study human parsing of expressions by trained mathematicians.

Some educational constructivists argue against the idea that there is only one correct way to understand a mathematical expression.  To have many metaphors for thinking about math is great, but I believe we want uniformity of understanding of the symbolism, at least in the narrow sense of parsing, so that we can communicate dependably.  It would be really neat if we discovered deep differences in parsing among mathematicians.  It would also be neat if we discovered that mathematicians parsed in generally the same way!