## The meaning of the word “superposition”

This is from the Wikipedia article on Hilbert's 13th Problem as it was on 31 March 2012:

[Hilbert's 13th Problem suggests this] question: can every continuous function of three variables be expressed as a composition of finitely many continuous functions of two variables? The affirmative answer to this general question was given in 1957 by Vladimir Arnold, then only nineteen years old and a student of Andrey Kolmogorov. Kolmogorov had shown in the previous year that any function of several variables can be constructed with a finite number of three-variable functions. Arnold then expanded on this work to show that only two-variable functions were in fact required, thus answering Hilbert's question.

In their paper “A relation between multidimensional data compression and Hilbert’s 13th problem”, Masahiro Yamada and Shigeo Akashi describe an example of Arnold's theorem this way:

Let $f(\cdot, \cdot, \cdot)$ be the function of three variables defined by $f(x, y, z) = xy + yz + zx$, $x, y, z \in \mathbb{C}$. Then we can easily prove that there do not exist functions of two variables $g(\cdot, \cdot)$, $u(\cdot, \cdot)$ and $v(\cdot, \cdot)$ satisfying the following equality: $f(x, y, z) = g(u(x, y), v(x, z))$, $x, y, z \in \mathbb{C}$. This result shows us that $f$ cannot be represented as any 1-time nested superposition constructed from three complex-valued functions of two variables. But it is clear that the following equality holds: $f(x, y, z) = x(y+z) + (yz)$, $x, y, z \in \mathbb{C}$. This result shows us that $f$ can be represented as a 2-time nested superposition.
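The positive half of Yamada & Akashi's example can at least be spot-checked numerically. This is only a sanity check of the algebraic identity behind the 2-time nested superposition, not a proof (and certainly not a proof of their nonexistence claim); the function names are mine:

```python
# f(x, y, z) = xy + yz + zx, rewritten as the nested superposition
# g(u, v) = u + v  with  u = x*(y + z)  and  v = y*z.

def f(x, y, z):
    return x*y + y*z + z*x

def nested(x, y, z):
    # outer function is addition; inner functions each take two inputs
    return x*(y + z) + y*z

# a few sample points in C (including a complex one)
samples = [(1, 2, 3), (0, -1, 5), (2.5, -4.0, 0.5), (1j, 2, 3)]
assert all(f(x, y, z) == nested(x, y, z) for (x, y, z) in samples)
```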

The strategy used in the Superposition Theorem is to eliminate all but one source of power within a network at a time, using series/parallel analysis to determine voltage drops (and/or currents) within the modified network for each power source separately. Then, once voltage drops and/or currents have been determined for each power source working separately, the values are all “superimposed” on top of each other (added algebraically) to find the actual voltage drops/currents with all sources active.
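The strategy above can be sketched in a few lines. This is a toy example with made-up component values: two voltage sources V1 and V2 feed a common node A through resistors R1 and R2, with R3 from A to ground. Because the node equation is linear, the voltage at A with both sources active equals the sum of the voltages with each source acting alone (the other replaced by a short, i.e. zero internal impedance):

```python
# Node equation at A:  (Va - V1)/R1 + (Va - V2)/R2 + Va/R3 = 0

def node_voltage(V1, V2, R1=100.0, R2=200.0, R3=300.0):
    G = 1/R1 + 1/R2 + 1/R3          # total conductance seen at node A
    return (V1/R1 + V2/R2) / G

both  = node_voltage(10.0, 5.0)     # both sources active
only1 = node_voltage(10.0, 0.0)     # V2 replaced by a short
only2 = node_voltage(0.0, 5.0)      # V1 replaced by a short

# superposition: the separate responses add up to the full response
assert abs(both - (only1 + only2)) < 1e-9
```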

Superposition Theorem in Wikipedia:

The superposition theorem for electrical circuits states that for a linear system the response (Voltage or Current) in any branch of a bilateral linear circuit having more than one independent source equals the algebraic sum of the responses caused by each independent source acting alone, while all other independent sources are replaced by their internal impedances.

Quantum superposition is a fundamental principle of quantum mechanics. It holds that a physical system — such as an electron — exists partly in all its particular, theoretically possible states (or configurations of its properties) simultaneously; but when measured, it gives a result corresponding to only one of the possible configurations (as described in the interpretation of quantum mechanics).

Mathematically, it refers to a property of solutions to the Schrödinger equation; since the Schrödinger equation is linear, any linear combination of solutions to a particular equation will also be a solution of it. Such solutions are often made to be orthogonal (i.e. the vectors are at right angles to each other), such as the energy levels of an electron. By doing so the overlap energy of the states is nullified, and the expectation value of an operator (in any superposition state) is the expectation value of the operator in the individual states, multiplied by the fraction of the superposition state that is "in" that state.
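That last claim about expectation values can be illustrated in a toy two-dimensional state space. Everything here (the basis, the energies, the coefficients) is made up for illustration:

```python
import numpy as np

# Two orthonormal energy eigenstates and a diagonal Hamiltonian.
e1 = np.array([1.0, 0.0])          # eigenstate with energy E1
e2 = np.array([0.0, 1.0])          # eigenstate with energy E2
E1, E2 = 1.0, 4.0
H = np.diag([E1, E2])

# A normalized superposition: |c1|^2 + |c2|^2 = 1.
c1, c2 = 0.6, 0.8
psi = c1*e1 + c2*e2

expectation = psi @ H @ psi
# Each eigenvalue is weighted by the fraction of the superposition
# "in" that state, exactly as the quoted paragraph says.
assert np.isclose(expectation, c1**2 * E1 + c2**2 * E2)
```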

The CIO midmarket site says much the same thing as the first paragraph of the Wikipedia Quantum Superposition entry but does not mention the stuff in the second paragraph.

In particular, the Yamada & Akashi article describes the way the functions of two variables are put together as "superposition", whereas the Wikipedia article on Hilbert's 13th calls it composition. Of course, superposition in the sense of the Superposition Theorem is a composition of multivariable functions with the top function being addition. Both of Yamada & Akashi's examples have addition at the top. But the Arnold theorem allows any continuous function at the top (and anywhere else in the composite).

So one question is: is the word "superposition" ever used for general composition of multivariable functions? This requires the kind of research I proposed in the introduction of The Handbook of Mathematical Discourse, which I am not about to do myself.

The first Wikipedia article above uses "composition" where I would use "composite". This is part of a general phenomenon of using the operation name for the result of the operation; for example, students, even college students, sometimes refer to the "plus of 2 and 3" instead of the "sum of 2 and 3". (See "name and value" in abstractmath.org.) Using "composition" for "composite" is analogous to this, although the analogy is not perfect. This may be a change in progress in the language which simplifies things without doing much harm. Even so, I am irritated when "composition" is used for "composite".

Quantum superposition seems to be a separate idea.  The second paragraph of the Wikipedia article on quantum superposition probably explains the use of the word in quantum mechanics.

## Syntax Trees in Mathematicians’ Brains

In my last post I wrote about how a student’s pattern recognition mechanism can go awry in applying the quadratic formula.

The template for the quadratic formula says that the solutions of a quadratic equation of the form $latex {ax^2+bx+c=0}&fg=000000$ are given by the formula

$latex \displaystyle x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}&fg=000000$

When you ask students to solve $latex {a+bx+cx^2=0}&fg=000000$ some may write

$latex \displaystyle x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}&fg=000000$

instead of the correct

$latex \displaystyle x=\frac{-b\pm\sqrt{b^2-4ac}}{2c}&fg=000000$

That’s because they have memorized the template in terms of the letters $latex {a}&fg=000000$, $latex {b}&fg=000000$ and $latex {c}&fg=000000$ instead of in terms of their structural meaning — $latex {a}&fg=000000$ is the coefficient of the quadratic term, $latex {c}&fg=000000$ is the constant term, etc.

The problem occurs because there is a clash between the occurrences of the letters “a”, “b”, and “c” in the template and in the equation to solve. But maybe the confusion would occur anyway, just because of the ordering of the coefficients. As I asked in the previous post, what happens if students are asked to solve $latex {3+5x+2x^2=0}&fg=000000$ after having learned the quadratic formula in terms of $latex {ax^2+bx+c=0}&fg=000000$? Some may make the same kind of mistake, getting $latex {x=-1}&fg=000000$ and $latex {x=-\frac{2}{3}}&fg=000000$ instead of $latex {x=-1}&fg=000000$ and $latex {x=-\frac{3}{2}}&fg=000000$. Has anyone ever investigated this sort of thing?
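The arithmetic behind the two answers in the previous paragraph is easy to check. A small sketch (the function and its argument names are mine): the structural reading of $latex {3+5x+2x^2=0}&fg=000000$ takes the quadratic coefficient to be 2 and the constant to be 3, while the letter-matching mistake plugs a=3, b=5, c=2 straight into the memorized template:

```python
import math

def roots(quad, lin, const):
    """Quadratic formula with arguments named by structural role."""
    d = math.sqrt(lin**2 - 4*quad*const)
    return sorted([(-lin + d) / (2*quad), (-lin - d) / (2*quad)])

correct  = roots(2, 5, 3)   # structural reading: x = -3/2 and x = -1
mistaken = roots(3, 5, 2)   # letters matched by position: x = -1 and x = -2/3

assert correct == [-1.5, -1.0]
assert mistaken[0] == -1.0
```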

People do pattern recognition remarkably well, but how they do it is mysterious. Just as mistakes in speech may give the linguist a clue as to how the brain processes language, students’ mistakes may tell us something about how pattern recognition works in parsing symbolic statements as well as perhaps suggesting ways to teach them the correct understanding of the quadratic formula.

### Syntactic Structure

“Structural meaning” refers to the syntactic structure of a mathematical expression such as $latex {3+5x+2x^2}&fg=000000$. It can be represented as a tree:

(1) [diagram: syntax tree for $latex {3+5x+2x^2}&fg=000000$]

This is more or less the way a program compiler or interpreter for some language would represent the polynomial. I believe it corresponds pretty well to the organization of the quadratic-polynomial parser in a mathematician’s brain. This is not surprising: The compiler writer would have to have in mind the correct understanding of how polynomials are evaluated in order to write a correct compiler.
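One concrete way to see such a tree is to ask Python's own parser, which serves here as a stand-in for "some language's" compiler (this is my illustration, not anything from the sources above). Note the left-to-right grouping: the expression parses as (3 + 5*x) + 2*x**2, a nested binary tree rather than a three-way sum:

```python
import ast

tree = ast.parse("3 + 5*x + 2*x**2", mode="eval")
print(ast.dump(tree.body, indent=2))

# The outer node is a BinOp for the final addition, and its left child
# is itself a BinOp for 3 + 5*x — i.e. the tree is left-grouped.
assert isinstance(tree.body, ast.BinOp)
assert isinstance(tree.body.left, ast.BinOp)
```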

Linguists represent English sentences with syntax trees, too. This is a deep and complicated subject, but the kind of tree they would use to represent a sentence such as “My cousin saw a large ship” would look like this: [diagram: syntax tree of the sentence]

### Parsing by mathematicians

Presumably a mathematician has constructed a parser that builds a structure in their brain corresponding to a quadratic polynomial, using the same mechanisms they used as a child to learn to parse sentences in their native language. The mathematician learned this mostly unconsciously, just as a child learns a language. In any case it shouldn’t be surprising that the mathematician’s syntax tree for the polynomial is similar to the compiler’s.

Students who are not yet skilled in algebra have presumably constructed incorrect syntax trees, just as young children do for their native language.

Lots of theoretical work has been done on human parsing of natural language. Parsing mathematical symbolism to be compiled into a computer program is well understood. You can get a start on both of these by reading the Wikipedia articles on parsing and on syntax trees.

There are papers on students’ misunderstandings of mathematical notation. Two articles I recently turned up in a Google search are:

Both of these papers talk specifically about the syntax of mathematical expressions. I know I have read other such papers in the past, as well.

What I have not found is any study of how the trained mathematician parses mathematical expressions.

For one thing, in my parsing of the expression $latex {3+5x+2x^2}&fg=000000$, the branching in (1) is wrong. I think of $latex {3+5x+2x^2}&fg=000000$ as “Take 3 and add $latex {5x}&fg=000000$ to it, and then add $latex {2x^2}&fg=000000$ to that”, which would require the shape of the tree to be like this: [diagram: tree with the alternative branching]

I am saying this from introspection, which is dangerous!

Of course, a compiler may group it that way, too, although my dim recollection of the little bit I understand about compilers is that they tend to group it as in (1) because they read the expression from left to right.

This difference in compiling is well understood.  Another difference is that the expression could be compiled using addition as an operator on a list, in this case a list of length 3.  I don’t visualize quadratics that way, but I certainly understand that it is equivalent to the tree in Diagram (1).  Maybe some mathematicians do think that way.
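The two shapes can be made explicit with a toy evaluator (my own sketch; the tuple encoding is arbitrary). The "add" node below accepts any number of summands, so it covers both the nested binary tree of Diagram (1) and the flat list-of-terms representation, which is roughly how computer algebra systems such as SymPy store sums:

```python
def ev(node, x):
    """Evaluate a small expression tree at the value x."""
    op, *args = node
    if op == "num":
        return args[0]
    if op == "var":
        return x
    if op == "mul":
        return ev(args[0], x) * ev(args[1], x)
    if op == "pow":
        return ev(args[0], x) ** args[1]
    if op == "add":
        return sum(ev(a, x) for a in args)   # n-ary: any number of summands

term1 = ("num", 3)
term2 = ("mul", ("num", 5), ("var",))
term3 = ("mul", ("num", 2), ("pow", ("var",), 2))

binary = ("add", ("add", term1, term2), term3)   # nested binary grouping
flat   = ("add", term1, term2, term3)            # addition as a list operator

# Both trees denote the same polynomial 3 + 5x + 2x^2.
assert ev(binary, 2) == ev(flat, 2) == 21
```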

But these observations indicate what might be learned about mathematicians’ understanding of mathematical expressions if linguists and mathematicians got together to study human parsing of expressions by trained mathematicians.

Some educational constructivists argue against the idea that there is only one correct way to understand a mathematical expression.  To have many metaphors for thinking about math is great, but I believe we want uniformity of understanding of the symbolism, at least in the narrow sense of parsing, so that we can communicate dependably.  It would be really neat if we discovered deep differences in parsing among mathematicians.  It would also be neat if we discovered that mathematicians parsed in generally the same way!