
Semantics of algebra I

Note: This post uses MathJax. If you see mathematical formulas with dollar signs around them, or badly formatted formulas, try refreshing the screen. Sometimes you have to do it two or three times.

In the post Algebra is a difficult foreign language  I listed some of the difficulties of the syntax of the symbolic language of math (which includes high school algebra and precalculus).  The semantics causes difficulties as well.  Again I will list some examples without any attempt at completeness.

The status of the symbolic language as a language

There is a sharp distinction between the symbolic language of math and mathematical English, which I have written about in The languages of math and in the Handbook of mathematical discourse. Other authors do not make this sharp distinction (see the list of references at the end of this post). The symbolic language occurs embedded in mathematical English and the embedding has its own semantics which may cause great difficulty for students.

The symbolic language of math can be described as a natural formal language. Pieces of it were invented by mathematicians and others over the course of the last several hundred years. Individual pieces (notation such as "$3x+1=2y$") can be given a strictly formal syntax, but the whole system is ambiguous, inconsistent, and context-sensitive.  When you get to the research level, it has many dialects: Research mathematicians in one field may not be able to read research articles in a very different field.

Examples

I think the examples below will make these claims plausible.  This should be the subject of deep research.

Superscripts and functions

  • A superscript, as in $5^2$ or $x^3$, has a pretty standard meaning denoting a power, at least until you get to higher level stuff such as tensors.  
  • A function can be denoted by a letter, symbol, or string, and the notation $f(x)$ refers to the value of the function at input $x$.  

For functions defined on numbers, it is common in precalculus and higher to write $f^2(x)$ to denote $(f(x))^2=f(x)\,f(x)$.  Since the values of certain multiletter functions are commonly written without parentheses (for example, $\sin\,x$), one writes $\sin^2x$ to mean $(\sin\,x)^2$.

The notation $f^n$ is also widely used to mean the $n$th iterate of $f$ (if it exists), so $f^3(x)=f(f(f(x)))$ and so on.  This leads naturally to writing $f^{-1}(x)$ for the inverse function of $f$; this is common notation whether the function $f$ is bijective or not (in which case $f^{-1}$ is set-valued).  Thus $\sin^{-1}x$ means $\arcsin\,x$.

It is notorious that words in mathematical English have different meanings in different texts.  This is an example in the symbolic language (and not just at the research level) of a systematic construction that can give expressions that have ambiguous meanings.

This phenomenon is an example of why I say the symbolic language of math is a natural formal language: superscript notation that began with multiplication of values was extended in a natural way to the binary operation of composition.  And that leads to students thinking that $\sin^{-1}x$ means $\frac{1}{\sin\,x}$.
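The two readings of a superscript on a function can be made concrete in a few lines of Python. This is only an illustrative sketch: the helper names `power` and `iterate` are made up for this example and are not standard notation or library functions.

```python
import math

def power(f, n):
    """Read f^n as a power of the value: x -> (f(x))**n."""
    return lambda x: f(x) ** n

def iterate(f, n):
    """Read f^n as the n-th iterate: x -> f(f(...f(x)...)), for n >= 0."""
    def g(x):
        for _ in range(n):
            x = f(x)
        return x
    return g

x = 0.5
sin2_as_power = power(math.sin, 2)(x)      # (sin x)^2, the usual reading of sin^2 x
sin2_as_iterate = iterate(math.sin, 2)(x)  # sin(sin x), the "iterate" reading
inv_as_power = power(math.sin, -1)(x)      # 1/sin x, what some students expect
inv_as_inverse = math.asin(x)              # arcsin x, what sin^{-1} x usually means
```

The same superscript gives different numbers under the two readings, and nothing in the notation itself says which reading is intended.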

History can overtake notation, too: Mathematicians probably took to writing $\sin\,x$ instead of $\sin(x)$ because it saves writing.  That was not very misleading in the old days when mathematical variables were always single symbols.  But students see multiletter variable names all the time these days (in programming languages, Excel and elsewhere), so of course some of them think $\sin\,x$ means $\sin$ times $x$. People who do this are not idiots.

Juxtaposition

Juxtaposition of two symbols means many different things.

  • If $m$ and $n$ are numbers, $mn$ denotes the product of the two numbers.
    • Multiplication is commutative, so $mn$ and $nm$ denote the same number, but they correspond to different calculations.  
  • If $M$ and $N$ are matrices, $MN$ denotes the matrix product of the two matrices.
    • This is a binary operation but it is not the same operation denoted by juxtaposition of numbers. (In fact it involves both addition and multiplication of numbers.)
    • Now $MN$ may not be the same matrix as $NM$.
  • If $A$ and $B$ are points in a geometric drawing, $AB$ denotes the line segment from $A$ to $B$.
    • This is a function of two variables denoting points whose value is a line segment.  
    • It is not what is usually called a binary operation, although as an opinionated category theorist I would call it a multisorted binary operation.
    • It is commutative, but it doesn't make sense to ask if it is associative.

This phenomenon is called overloaded notation.  

  • In order to understand the meaning of the juxtaposition of symbols, you have to know the type of the variables.
  • The surrounding text may tell you specifically the variables denote matrices or whatever. So this is an instance of context-sensitive semantics. 
    • Students tend to expect that they know what any formula means in isolation from the text.  It may make them very sad to discover that this doesn't work — once they believe it, which can take quite a while.
  • In many cases the problem is alleviated by the use of convention.
    • Matrices are usually denoted by capital letters, numbers by lower case letters.
    • But points in geometry are usually denoted by capital letters too.  So you have to know that referring to a geometric diagram is significant to understanding the notation. This is an indirect form of context-sensitivity.  Did any teacher ever point this out to students?  Does it appear anywhere in print?
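Programming languages have overloading too, and a small Python sketch may make the parallel clearer. Python has no juxtaposition operator, so `*` stands in for it here; the classes `Matrix2` and `Point` are hypothetical helpers written for this illustration, and modeling a segment as an unordered pair of endpoints is my assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Matrix2:
    """A 2x2 matrix of numbers, stored row by row."""
    a: float
    b: float
    c: float
    d: float

    def __mul__(self, other):
        # Matrix product: uses both addition and multiplication of numbers.
        return Matrix2(
            self.a * other.a + self.b * other.c,
            self.a * other.b + self.b * other.d,
            self.c * other.a + self.d * other.c,
            self.c * other.b + self.d * other.d,
        )

@dataclass(frozen=True)
class Point:
    x: float
    y: float

    def __mul__(self, other):
        # "AB": the segment with endpoints A and B, as an unordered pair.
        return frozenset({self, other})

m, n = 3, 5
M, N = Matrix2(1, 1, 0, 1), Matrix2(1, 0, 1, 1)
A, B = Point(0, 0), Point(1, 2)

numbers_commute = (m * n == n * m)    # True
matrices_commute = (M * N == N * M)   # False for this M and N
segments_commute = (A * B == B * A)   # True
```

As with the written notation, what `*` means is determined entirely by the types of its operands, and the result of `A * B` is not even a point, so asking whether it is associative makes no sense.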

The earlier example of $\sin^{-1}x$ is a case which is not context-sensitive.  Knowing the types of the variables won't help.  Of course, if the author explains which meaning is meant, that explanation is within the context of the book!  That is not a lot of help for grasshoppers like me who look back and forth at different parts of a math book instead of reading it straight through.

Equations

Consider the expressions

  1. $x^2-5x+4=0$
  2. $x^2+y^2=1$
  3. $x^2+2x+1=(x+1)^2$

They are assertions that two expressions have the same value. A strictly logical view of an equation containing variables is that it puts a constraint on the variables.  It is true of some numbers (or pairs of numbers) and false of others.  That is the defining property of an equation. Equation 1 requires that $x=1$ or $x=4$.  Equation 2 imposes a constraint which is satisfied by uncountably many pairs of real numbers, and is also not true of uncountably many pairs. But equation 3 puts no constraint on the variable.  It is true of every number $x$.
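The "truth set" view can be checked mechanically on a finite sample. The following Python sketch tests the three equations on the integers from $-10$ to $10$ only; the sample range is an arbitrary choice of mine, and a finite check of integers says nothing about the uncountably many real solutions of equation 2.

```python
def eq1(x):
    return x**2 - 5*x + 4 == 0

def eq2(x, y):
    return x**2 + y**2 == 1

def eq3(x):
    return x**2 + 2*x + 1 == (x + 1)**2

sample = range(-10, 11)

# Equation 1: a constraint satisfied by exactly two of the sampled numbers.
truth_set_1 = {x for x in sample if eq1(x)}

# Equation 2: a constraint on pairs; only its integer solutions show up here.
truth_set_2 = {(x, y) for x in sample for y in sample if eq2(x, y)}

# Equation 3: an identity, so it holds for every sampled number.
holds_everywhere_3 = all(eq3(x) for x in sample)
```

Here `truth_set_1` is `{1, 4}`, `truth_set_2` contains the four integer points on the unit circle, and `holds_everywhere_3` is `True`.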

A strictly logical view of symbolic notation does math a disservice.  Here, the notion that an equation is by definition a symbolic statement that has a truth set and a falsity set may be correct but it is not the important thing about any particular equation. When we read and do math we have many different metaphors and images about a concept.  The definition of a kind of object is often in terms of things that may not be the most important things to know about it.  (One of the most important facts about groups is that a group is an abstraction of symmetries, which the axioms don't mention at all.)

Equation 1 is something that would make most people set out to discover the truth set.  Equation 2 calls out for drawing its graph.  Equation 3, being an identity, is useful in algebraic reasoning.  The images they call up are different and what you do with them is different.  The images and metaphors that cluster around a concept are an important part of the semantics of the symbolic language.

I expect to post separately about the semantics of variables and about the semantics of symbolic language embedded in mathematical English.

References


Metaphors in computing science 2

In Metaphors in Computer Science 1, I discussed some metaphors used when thinking about various aspects of computing.  This is a continuation of that post.

Metaphor: A program is a list of instructions.

  • I discussed this metaphor in detail in the earlier post.
  • Note particularly that the instructions can be in a natural or a programming language. (Is that a zeugma?)  Many writers would call instructions in a natural language an algorithm.
  • I will continue to use “program” in the broader sense.

Metaphor: A programming language is a language.

  • This metaphor is a specific conceptual blend that associates the strings of symbols that constitute a program in a computer language with text in a natural language.
  • The metaphor is based on some similarities between expressions in a programming language and expressions in a natural language.
    • In both, the expressions have a meaning.
    • Both natural and programming languages have specific rules for constructing well-formed expressions.
  • This way of thinking ignores many deep differences between programming languages and natural languages. In particular, they don’t talk about the same things!
  • The metaphor has been powerful in suggesting ways of thinking about computer programs, for example semantics (below) and ambiguity.

Metaphor: A computer program is a list of statements

  • A consequence of this metaphor is that a computer program is a list of symbols that can be stored in a computer’s memory.
  • This metaphor comes with the assumption that if the program is written in accordance with the language’s rules, a computer can execute the program and perhaps produce an output.
  • This is the profound discovery, probably by Alan Turing, that made the computer revolution possible. (You don’t have to have different physical machines to do different things.)
  • You may want me to say more in the heading above: “A computer program is a list of statements in a programming language that satisfies the well-formedness requirements of the language.”  But the point of the metaphor is only that a program is a list of statements.  The metaphor is not intended to define the concept of “program”.
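A toy illustration of the stored-program idea, in Python: the "program" below is just a list held in memory, and one fixed machine (the function `run`) does different things depending on which list it is given. The two-statement language and the name `run` are invented for this sketch and do not come from any real system.

```python
def run(program, x):
    """A tiny hypothetical machine: execute a list of (operation, operand) statements."""
    for op, arg in program:
        if op == "add":
            x = x + arg
        elif op == "mul":
            x = x * arg
        else:
            raise ValueError(f"unknown statement: {op}")
    return x

# Two programs are two pieces of data; the machine itself does not change.
triple_then_add_one = [("mul", 3), ("add", 1)]
add_one_then_triple = [("add", 1), ("mul", 3)]

result_a = run(triple_then_add_one, 2)  # 3*2 + 1 = 7
result_b = run(add_one_then_triple, 2)  # (2 + 1)*3 = 9
```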

Metaphor: A program in a computer language has meanings.

A program is intended to mean something to a human reader.

  • Some languages are designed to be easily read by a human reader: Cobol, Basic, SQL.
    • Their instructions look like English.
    • The algorithm can nevertheless be difficult to understand.
  • Some languages are written in a dense symbolic style.
    • In many cases the style is an extension of the style of algebraic formulas: C, Fortran.
    • Other languages are written in a notation not based on algebra:  Lisp, APL, Forth.
  • The boundary between “easily read” and “dense symbolic” is a matter of opinion!

A program is intended to be executed by a computer.

  • The execution always involves translation into intermediate languages. 
    • Most often the execution requires repeated translation into a succession of intermediate languages.
    • Each translation requires the preservation of the intended meaning of the program.
  • The preservation of intended meaning is what is usually called the semantics of a programming language.
    • In fact, the meaning of the program to a person could be called semantics, too.
    • And the human semantics had better correspond in “meaning” to the machine semantics!
  • The actual execution of the program requires successive changes in the state of the computer.
    • By “state” I mean a list of the form of the electrical charges of each unit of memory in the computer.
    • Or you can restrict it to the relevant units of memory, but spelling that out is horrifying to contemplate.
    • Just like each of the intermediate translations, the resulting state of the machine after the program is run is required to preserve the intended meaning.
    • Notice that the actual execution is a series of physical events.  You can describe the execution in English or in some notation, but that notation is not the actual execution.
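The chain of translations can be glimpsed from inside Python itself, using the standard `ast` and `dis` modules. This is only a sketch of two of the layers in one implementation (CPython): source text becomes a syntax tree, the tree becomes bytecode, and the bytecode is then executed by the interpreter, which involves further layers not shown here. The exact instruction names vary between Python versions.

```python
import ast
import dis

source = "3 * x + 1"

# First translation: text -> abstract syntax tree.
tree = ast.parse(source, mode="eval")

# Second translation: syntax tree -> bytecode for the Python virtual machine.
code = compile(tree, filename="<example>", mode="eval")

# The bytecode is itself a list of instructions in an intermediate language.
instructions = [ins.opname for ins in dis.get_instructions(code)]

# Executing the translated program should agree with the meaning a human
# reads into the source: 3*2 + 1.
result = eval(code, {"x": 2})
```

If the human semantics and the machine semantics correspond, `result` is `7`, the value a reader of the source would expect.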

References

Conceptual blend (Wikipedia)

Conceptual metaphors (Wikipedia)

Images and Metaphors (article in abstractmath)

Semantics in computer science (Wikipedia)
