Explaining math

To manipulate the demos in this post, you must have Wolfram CDF Player installed on your computer. It is available free from the Wolfram website. The source for the demos is the Mathematica notebook SolvEq.nb.

This post explains some basic distinctions that need to be made about the process of writing and explaining math.  Everyone who teaches math knows subconsciously what is happening here; I am trying to raise your consciousness.  For simplicity, I have chosen a technique used in elementary algebra, but much of what I suggest also applies to more abstract college level math.

An algebra problem

Solve the equation "$ax=b$" ($a\neq0$).

Understanding the statement of this problem requires a lot of Secret Knowledge (the language of ninth grade algebra) that most people don't have.

  • The expression "$ax$" means that $a$ and $x$ are numbers and $ax$ is their product. It is not the word "ax". You have to know that writing two symbols next to each other means multiply them, except when it doesn't mean multiply them as in "$\sin\,x$".

  • The whole expression "$ax=b$" ostensibly says that the number $ax$ is the same number as $b$.  In fact, it means more than that. The phrase "solve the equation" tells you that in fact you are supposed to find the value of $x$ that makes $ax$ the same number as $b$.

  • How do you know that "solve the equation" doesn't mean find the value of $a$ that makes $ax$ the same number as $b$? Answer: The word "solve" triggers a convention that $x$, $y$ and $z$ are numbers you are trying to find and $a$, $b$, $c$ stand for numbers that you are allowed to plug in to the equation.

  • The conventions of symbolic math require that you give a solution for any nonzero value of $a$ and any value of $b$.  You specifically are not allowed to pick $a=1$ and $b=33$ and find the value just for those numbers.  (Some college calculus students do this with problems involving literal coefficients.)

  • The little thingy "$(a\neq0)$" must be read as a constraint on $a$.  It does not mean that $a\neq0$ is a fact that you ought to know. ( I've seen college math students make this mistake, admittedly in more complex situations). Nor does it mean that you can't solve the problem if $a=0$ (you can if $b$ is also zero!).

So understanding what this problem asks, as given, requires (fairly sophisticated in some cases) pattern recognition both to understand the symbolic language it uses, and also to understand the special conventions of the mathematical English that it uses.

Explicit descriptions

This problem could be reworded so that it gives an explicit description of the problem, not requiring pattern recognition.  (Warning: "Not requiring pattern recognition" is a fuzzy concept.)  Something like this:  

You have two numbers $a$ and $b$.  Find a number $c$ for which if you multiply $a$ by $c$ you get $b$.

This version is not completely explicit.  It still requires understanding the idea of referring to a number by a letter, and it still requires pattern recognition to catch on that the two occurrences of each letter means that their meanings have to match. Also, I know from experience that some American first year college students have trouble with the syntax of the sentence ("for which…", "if…").

The following version is more explicit, but it cheats by creating an ad hoc way to distinguish the numbers.

Alice and Bob each give you a number.  How do you find a number with the property that Alice's number times your number is equal to Bob's number? 

If the problem had a couple more variables it would be so difficult to understand in an explicit form that most people would have to draw a picture of the relationships between them.  That is why algebraic notation was invented.

Visual descriptions

Algebra is a difficult foreign language.  Showing the problem visually makes it easier to understand for most people. Our brain's visual processing unit is the most powerful tool the brain has to understand things.  There are various ways to do this.  

Visualization can help someone understand algebraic notation better.  

You can state the problem by producing examples such as

  • $\boxed{3}\times\boxed{\text{??}}=\boxed{6}$ 
  • $\boxed{5}\times\boxed{\text{??}}=\boxed{2}$ 
  • $\boxed{42}\times\boxed{\text{??}}=\boxed{24}$

where the reader has to know the multiplication symbol and, one hopes, will recognize "$\boxed{\text{??}}$" as "What's the value?". But the reader does not have to understand what it means to use letters for numbers, or that "$x$ means you are suppose to discover what it is".  This way of writing an algebra problem is used in some software aimed at K-12 students.  Some of them use a blank box instead of "$\boxed{\text{??}}$".

Such software often shows the algorithm for solving the problem visually, using algebraic notation like this:

I have put in some buttons to show numbers as well as $a$ and $b$.  If you have access to Mathematica instead of just to CDF player, you can load SolvEq.nb and put in any numbers you want, but CDF's don't allow input data. 

You can also illustrate the algorithm using the tree notation for algebra I used in Monads for high school I  (and other posts). The demo below shows how to depict the value-preserving transformation given by the algorithm.  (In this case the value is the truth since the root operation is equals.)

This demo is not as visually satisfactory as the one illustrating the use of the associative law in Monads for high school I.  For one thing, I had to cheat by reversing the placement of $a$ and $x$.  Note that I put labels for the numerator and denominator legs, a practice I have been using in demos for a while for noncommutative operations.  I await a new inspiration for a better presentation of this and other equation-solving algorithms.

Another advantage of using pictures is that you can often avoid having to code things as letters which then has to be remembered.  In Monads for high school I, I used drawings of the four functions from a two-element set to itself instead of assigning them letters.  Even mnemonic letters such as $s$ for "switch" and $\text{id}$ for the identity element carry a burden that the picture dispenses with.

Naming mathematical objects

Commonword names confuse

Many technical words and phrases in math are ordinary English words ("commonwords") that are assigned a different and precisely defined mathematical meaning.  

  • Group  This sounds to the "layman" as if it ought to mean the same things as "set".  You get no clue from the name that it involves a binary operation with certain properties.  
  • Formula  In some texts on logic, a formula is a precisely defined expression that becomes a true-or-false sentence (in the semantics) when all its variables are instantiated.  So $(\forall x)(x>0)$ is a formula.  The word "formula" in ordinary English makes you think of things like "$\textrm{H}_2\textrm{O}$", which has no semantics that makes it true or false — it is a symbolic expression for a name.
  • Simple group This has a technical meaning: a group with no nontrivial normal subgroup.  The Monster Group is "simple".  Yes, the technical meaning is motivated by the usual concept of "simple", but to say the Monster Group is simple causes cognitive dissonance.

Beginning students come with the (generally subconscious) expectation that they will pick up clues about the meanings of words from connotations they are already familiar with, plus things the teacher says using those words.  They think in terms of refining an understanding they already have.  This is more or less what happens in most non-math classes.  They need to be taught what definition means to a mathematician.

Names that don't confuse but may intimidate

Other technical names in math don't cause the problems that commonwords cause.

Named after somebody The phrase "Hausdorff space" leads a math student to understand that it has a technical meaning.  They may not even know it is named after a person, but it screams "geek word" and "you don't know what it means".  That is a signal that you can find out what it means.  You don't assume you know its meaning. 

New made-up words  Words such as "affine", "gerbe"  and "logarithm" are made up of words from other languages and don't have an ordinary English meaning.  Acronyms such as "QED", "RSA" and "FOIL" don't occur often.  I don't know of any math objects other than "RSA algorithm" that have an acronymic name.  (No doubt I will think of one the minute I click the Publish button.)  Whole-cloth words such as "googol" are also rare.  All these sorts of words would be good to name new things since they do not fool the readers into thinking they know what the words mean.

Both types of words avoid fooling the student into thinking they know what the words mean, but some students are intimidated by the use of words they haven't seen before.  They seem to come to class ready to be snowed.  A minority of my students over my 35 years of teaching were like that, but that attitude was a real problem for them.

Audience

You can write for several different audiences.

Math fans (non-mathematicians who are interested in math and read books about it occasionally) In my posts Explaining higher math to beginners and in Renaming technical conceptsI wrote about several books aimed at explaining some fairly deep math to interested people who are not mathematicians.  They renamed some things. For example, Mark Ronan in Symmetry and the Monster used the phrase "atom" for "simple group" presumably to get around the cognitive dissonance.  There are other examples in my posts.  

Math newbies  (math majors and other students who want to understand some aspect of mathematics).  These are the people abstractmath.org is aimed at. For such an audience you generally don't want to rename mathematical objects. In fact, you need to give them a glossary to explain the words and phrases used by people in the subject area.   

Postsecondary math students These people, especially the math majors, have many tasks:

  • Gain an intuitive understanding of the subject matter.
  • Understand in practice the logical role of definitions.
  • Learn how to come up with proofs.
  • Understand the ins and outs of mathematical English, particularly the presence of ordinary English words with technical definitions.
  • Understand and master the appropriate parts of the symbolic language of math — not just what the symbols mean but how to tell a statement from a symbolic name.

It is appropriate for books for math fans and math newbies to try to give an understanding of concepts without necessary proving theorems.  That is the aim of much of my work, which has more an emphasis on newbies than on fans. But math majors need as well the traditional emphasis on theorem and proof and clear correct explanations.

Lately, books such as Visual Group Theory have addressed beginning math majors, trying for much more effective ways to help the students develop good intuition, as well as getting into proofs and rigor. Visual Group Theory uses standard terminology.  You can contrast it with Symmetry and the Monster and The Mystery of the Prime Numbers (read the excellent reviews on Amazon) which are clearly aimed at math fans and use nonstandard terminology.  

Terminology for algebraic structures

I have been thinking about the section of Abstracting Algebra on binary operations.  Notice this terminology:

boptable

The "standard names" are those in Wikipedia.  They give little clue to the meaning, but at least most of them, except "magma" and "group", sound technical, cluing the reader in to the fact that they'd better learn the definition.

I came up with the names in the right column in an attempt to make some sense out of them.  The design is somewhat like the names of some chemical compounds.  This would be appropriate for a text aimed at math fans, but for them you probably wouldn't want to get into such an exhaustive list.

I wrote various pieces meant to be part of Abstracting Algebra using the terminology on the right, but thought better of it. I realized that I have been vacillating between thinking of AbAl as for math fans and thinking of it as for newbies. I guess I am plunking for newbies.

I will call groups groups, but for the other structures I will use the phrases in the middle column.  Since the book is for newbies I will include a table like the one above.  I also expect to use tree notation as I did in Visual Algebra II, and other graphical devices and interactive diagrams.

Magmas

In the sixties magmas were called groupoids or monoids, both of which now mean something else.  I was really irritated when the word "magma" started showing up all over Wikipedia. It was the name given by Bourbaki, but it is a bad name because it means something else that is irrelevant.  A magma is just any binary operation. Why not just call it that?  

Well, I will tell you why, based on my experience in Ancient Times (the sixties and seventies) in math. (I started as an assistant professor at Western Reserve University in 1965). In those days people made a distinction between a binary operation and a "set with a binary operation on it".  Nowadays, the concept of function carries with it an implied domain and codomain.  So a binary operation is a function $m:S\times S\to S$.  Thinking of a binary operation this way was just beginning to appear in the common mathematical culture in the late 60's, and at least one person remarked to me: "I really like this new idea of thinking of 'plus' and 'times' as functions."  I was startled and thought (but did not say), "Well of course it is a function".  But then, in the late sixties I was being indoctrinated/perverted into category theory by the likes of John Isbell and Peter Hilton, both of whom were briefly at Case Western Reserve University.  (Also Paul Dedecker, who gave me a glimpse of Grothendieck's ideas).

Now, the idea that a binary operation is a function comes with the fact that it has a domain and a codomain, and specifically that the domain is the Cartesian square of the codomain.  People who didn't think that a binary operation was a function had to introduce the idea of the universe (universal algebraists) or the underlying set (category theorists): you had to specify it separately and introduce terminology such as $(S,\times)$ to denote the structure.   Wikipedia still does it mostly this way, and I am not about to start a revolution to get it to change its ways.

Groups

In the olden days, people thought of groups in this way:

  • A group is a set $G$ with a binary operation denoted by juxtaposition that is closed on $G$, meaning that if $a$ and $b$ are any elements of $G$, then $ab$ is in $G$.
  • The operation is associative, meaning that if $a,\ b,\ c\in G$, then $(ab)c=a(bc)$.
  • The operation has a unity element, meaning an element $e$ for which for any element $a\in G$, $ae=ea=a$.
  • For each element $a\in G$, there is an element $b$ for which $ab=ba=e$.

This is a better way to describe a group:

  • A group consist of a nullary operation e, a unary operation inv,  and a binary operation denoted by juxtaposition, all with the same codomain $G$. (A nullary operation is a map from a singleton set to a set and a unary operation is a map from a set to itself.)
  • The value of e is denoted by $e$ and the value of inv$(a)$ is denoted by $a^{-1}$.
  • These operations are subject to the following equations, true for all $a,\ b,\ c\in G$:

     

    • $ae=ea=a$.
    • $aa^{-1}=a^{-1}a=e$.
    • $(ab)c=a(bc)$.

This definition makes it clear that a group is a structure consisting of a set and three operations whose axioms are all equations.  It was formulated by people in universal algebra but you still see the older form in texts.

The old form is not wrong, it is merely inelegant.  With the old form, you have to prove the unity and inverses are unique before you can introduce notation, and more important, by making it clear that groups satisfy equational logic you get a lot of theorems for free: you construct products on the cartesian power of the underlying set, quotients by congruence relations, and other things. (Of course, in AbAl those theorem will be stated later than when groups are defined because the book is for newbies and you want lots of examples before theorems.)

References

  1. Three kinds of mathematical thinkers (G&G post)
  2. Technical meanings clash with everyday meanings (G&G post)
  3. Commonword names for technical concepts (G&G post)
  4. Renaming technical concepts (G&G post)
  5. Explaining higher math to beginners (G&G post)
  6. Visual Algebra II (G&G post)
  7. Monads for high school II: Lists (G&G post)
  8. The mystery of the prime numbers: a review (G&G post)
  9. Hersh, R. (1997a), "Math lingo vs. plain English: Double entendre". American Mathematical Monthly, volume 104, pages 48–51.
  10. Names (in abmath)
  11. Cognitive dissonance (in abmath)

Monads for high school I

 

Notes for viewing

To manipulate the demos in this post, you must have Wolfram CDF Player installed on your computer. It is available free from the Wolfram website. The source for the demos is at associative.nb.

Monads in Abstracting Algebra

I've been working on first drafts (topic posts) of several sections of my proposed book Abstracting algebra (AbAl), concentrating on the ideas leading up to monads.  This is going slowly because I want the book to be full of illustrations and interactive demos.  I am writing the demos in Mathematica simultaneously with writing the text, and designing demos is very s l o w work. It occurred to me that I should write an outline of the path leading up to monads, using some of the demos I have already produced. This is the first of probably two posts about the thread.

  • AbAl is intended to give people with a solid high school math background a mental picture of or way of thinking about the various levels of abstraction of high school algebra.
  • This outline is not a "Topic post" as described in the AbAl page. In particular, it is not aimed at high school students! It is a guided tour of my current thoughts about a particular thread through the book.
  • The AbAl page has a brief outline of the topics to be covered in the whole book.  Perhaps it should also have a list of threads like this post.

Associativity

AbAl will have sections introducing functions and binary operations using pictures and demos (not outlined in this thread).  The section on binary operations will introduce infix, prefix and postfix notation but will use trees (illustrated below) as the main display method.  Then it will introduce associativity, using demos such as this one: 

Using this computingscienceish tree notation makes it much easier to visualize what is happening (see Visible Algebra II), compared to, for example, \[(ab)(cd)=a(b(cd))=a((bc)d)=((ab)c)d=(a(bc))d\]  In this equation, the abstract structure is hidden.  You have to visualize doing the operation starting with the innermost parentheses and moving out.  With the trees you can see the computation going up the tree.

I will give examples of associative functions that are not commutative using $2\times2$ matrices and endofunctions on finite sets such as the one below, which gives all the functions from a two element set to itself. 

 

  • Note that each function is shown by a diagram, not by an arbitrary name such as "id" or "sw", which would add a burden to the memory for an example that occurs in one place in the book. (See structural notation in the Handbook.) 
  • The section on composition of functions will also look in some depth at permutations of a three-element set, anticipating a section on groups.

 By introducing a mechanism for transforming trees of associative binary operations, you can demonstrate (as in the demo below) that any associative binary operation is defined on any list of two or more elements of its domain.

For example, applying addition to three numbers $a$, $b$ and $c$ is uniquely defined. This sort of demo gives an understanding of why you get that unique definition but it is not a proof, which requires formal induction. AbAl is not concerned with showing the reader how to prove math statements.

In this section I will also introduce the oneidentity concept: the value of an associative binary operation on a an element $a$ is $a$.  Thus applying addition or multiplication to $a$ gives $a$.  (The reason for this is that you want an associative binary operation to be a unique quotient of the free associative binary operation.  That will come up after we talk about some examples of monads.)  

The oneidentity property also implies that for an associative binary operation with identity element, applying the operation to the empty set gives the identity element.  Now we can say:

An associative binary operation with identity element is uniquely defined on any finite list of elements of its domain.

Thus, in prefix notation,$+(2,3)=5$, $+(2,3,5)=10$, $+(2)=2$ and $+()=0$.  Similarly $\times(2)=2$ and $\times()=1$.

This fact suggests that the natural definition of addition, multiplication, and other associative binary operations is as functions from lists of elements of the domain to elements of the domain.   This fits with our early intuition of addition from grade school, not to mention from Excel:  Addition is something you do to lists.  That feeling (for me) is not so strong for multiplication; for many common business applications you generally multiply two things like price and number sold. That's because multiplication is usually for things of different data types, but you usually add things of the same data type (not apples and oranges?).   

That raises the question: Does every function taking lists to elements come from an associative binary operation?  I will give an example that says no.  But the next thing is to introduce joining lists (concatenation), where we discover that joining lists is an associative binary operation.  So it is really an operation on lists of lists.  This will turn out to give us a systematic way to define all associative binary operations by one mechanism, because it is an example of a monad.  That is for the second installment of this outline.

Abstracting algebra

This post has been turned into a page on WordPress, accessible in the upper right corner of the screen.  The page will be referred to by all topic posts for Abstracting Algebra.

 

Discussions about algebra

I want to call your attention to the following items about ways to avoid algebra and why you should and shouldn't.

Don't kill math, by Evan Miller

Inventing on principle, video by Bret Victor

 

Explaining “higher” math to beginners


Notes on viewing

Explaining math

I am in the process of writing an explanation of monads for people with not much math background.  In that article, I began to explain my ideas about exposition for readers at that level and after I had written several paragraphs decided I needed a separate article about exposition.  This is that article. It is mostly about language.

Who is it written for?

Interested laypeople

There are many recent books explaining some aspect of math for people who are not happy with high school algebra; some of them are listed in the references.  They must be smart readers who know how to concentrate, but for whom algebra and logic and definition-theorem-proof do not communicate.  They could be called interested laypeople, but that is a lousy name and I would appreciate suggestions for a better name. 

Math newbies

My post on monads is aimed at people who have some math, and who are interested in "understanding" some aspect of "higher math"; not understanding in the sense of being able to prove things about monads, but merely how to think about them.   I will call them math newbies.  Of course, I am including math majors, but I want to make it available to other people who are willing to tackle mathematical explanations and who are interested in knowing more about advanced stuff. 

These "other people" may include people (students and practitioners) in other science & technology areas as well as liberal-artsy people.  There are such people, I have met them.  I recall one theologian who asked me about what was the big deal about ruler-and-compass construction and who seemed to feel enlightened when I told him that those constructions preserve exactly the ideal nature of geometric objects.  (I later found out he was a famous theologian I had never heard of, just like Ngô Bảo Châu is a famous mathematician nonmathematicians have never heard of.)

Algebra and other foreign languages

If you are aiming at interested laypeople you absolutely must avoid algebra.  It is a foreign language that simply does not communicate to most of the educated people in the world.  Learning a foreign language is difficult. 

So how do you avoid algebra?  Well, you have to be clever and insightful.  The book by Matthew Watkins (below) has absolutely wonderful tricks for doing that, and I think anyone interested in math exposition ought to read it.  He uses metaphors, pictures and saying the same thing in different words. When you finish reading his book, you won't know how to prove statements related to the prime number theorem (unless you already knew how) but you have a good chance of understanding the statement of some theorem in that subject. See my review of the book for more details.

If your article is for math newbies, you don't have to avoid algebra completely.  But remember they are newbies and not as fluent as you are — they do things analogous to "Throw Mama from the train a kiss" and "I can haz cheeseburger?".  But if you are trying to give them some way of thinking about a concept, you need many other things (metaphors, illustrative applications, diagrams…)  You don't need the definition-theorem-proof style too common in "exposition".  (You do need that for math majors who want to become professional mathematicians.) 

Unfamiliar notation

In writing expositions for interested laypeople or math newbies, you should not introduce an unfamiliar notation system (which is like a minilanguage).  I expect to write the monad article without commutative diagrams.  Now, commutative diagrams are a wonderful invention, the best way of writing about categories, and they are widely used by other than category theorists.  But to explain monads to a newbie by introducing and then using commutative diagrams is like incorporating a short grammar of Spanish which you will then use in an explanation of Sancho Panza's relationship with Don Quixote. 

The abstractmath article on and, or and not does not use any of the several symbolic notations for logic that are in use.  The explanations simply use "and", "or" and "not".  I did introduce the notation, but didn't use it in the explanations.  When I rewrite the article I expect to put the notation at the end of the article instead of in the middle.  I expect to rewrite the other articles on mathematical reasoning to follow that practice, too.

Technical terminology

This is about the technical terminology used in math.  Technical terminology belongs to the math dialect (or register) of English, which is not a foreign language in the same sense as algebra.  One big problem is changing the meaning of ordinary English words to a technical meaning.  This requires a definition, and definitions are not something most people take seriously until they have been thoroughly brainwashed into using mathematical methodology.  Math majors have to be brainwashed in this way, but if you are writing for laypeople or newbies you cannot use the technology of formal definition.

Groups, simple groups

"You say the Monster Group is SIMPLE???  You must be a GENIUS!"  So Mark Ronan in his book (below) referred to simple groups as atoms.  Marcus du Sautoy calls them building blocks.  The mathematical meaning of "simple group" is not a transparent consequence of the meanings of "simple" and "group". Du Sautoy usually writes "group of symmetries" instead of just "group", which gives you an image of what he is talking about without having to go into the abstract definition of group. So in that usage, "group" just means "collection", which is what some students continue to think well after you give the definition.  

A better, but ugly, name for "group" might be "symmetroid". It sounds technical, but that might be an advantage, not a disadvantage. "Group" obviously means any collection, as I've known since childhood. "Symmetroid" I've never heard of so maybe I'd better find out what it means.

In beginning abstract math courses my students fervently (but subconsciously) believe that they can figure out what a word means by what it means already, never mind the "definition" which causes their eyes to glaze over. You have to be really persuasive to change their minds.

Prime factorization

Matthew Watkins referred to the prime factorization of an integer as a cluster. I am not sure why Watkins doesn't like "prime factorization", which usually refers to an expression such as  $p^{n_1}_1p^{n_2}_2\ldots p^{n_k}_k$.  This (as he says) has a spurious ordering that makes you have to worry about what the uniqueness of factorization means. The prime factorization is really a multiset of primes, where the order does not matter. 

Watkins illustrates a cluster of primes as a bunch of pingpong balls stuck together with glue, so the prime factorization of 90 would be four smushed together balls marked 2, 3, 3 and 5. Below is another way of illustrating the prime factorization of 90. Yes, the random movement programming could be improved, but Mathematica seduces you into infinite playing around and I want to finish this post. (Actually, I am beginning to think I like smushed pingpong balls better. Even better would be a smushed pingpong picture that I could click on and look at it from different angles.)

Metaphors, pictures, graphs, animation

Any exposition of math should use metaphors, pictures and graphs, especially manipulable pictures (like the one above) and graphs.  This applies to expositions for math majors as well as laypeople and newbies.  Calculus and other texts nowadays have begun doing this, more with pictures than with metaphors. 

I was turned on to these ideas as far back as 1967 (date not certain) when I found an early version of David Mumford's "Red Book", which I think evolved into the book The Red Book of Varieties and Schemes.  The early version was a revelation to me both about schemes and about exposition. I have lost the early book and only looked at the published version briefly when it appeared (1999).  I remember (not necessarily correctly) that he illustrated the spectrum as a graph whose coordinates were primes, and generic points were smudges.  Writing this post has motivated me to go to the University of Minnesota math library and look at the published version again.

References

Expositions for educated non-mathematicians

Previous posts in G&G

Relevant abmath articles

Notes on Viewing  

  • This post uses MathJax. If you see mathematical formulas with dollar signs around them, or badly formatted formulas, try refreshing the screen. Sometimes you have to do it two or three times.
  • To manipulate the demos in this post, you must have Wolfram CDF Player installed on your computer. It is available free from the Wolfram website. The code for the demos is in the Mathematica notebook algebra2.nb.

Visible algebra II

Notes on viewing:

  • This post uses MathJax. If you see mathematical formulas with dollar signs around them, or badly formatted formulas, try refreshing the screen. Sometimes you have to do it two or three times.
  • To manipulate the demos in this post, you must have Wolfram CDF Player installed on your computer. It is available free from the Wolfram website. The code for the demos is in the Mathematica notebook algebra2.nb.

More about visible algebra

I have written about visible algebra in previous posts (see References). My ideas about the interface are constantly changing. Some new ideas are described here.

In the first place I want to make it clear that what I am showing in these posts is a simulation of a possible visual algebra system.  I have not constructed any part of the system; these posts only show something about what the interface will look like.  My practice in the last few years is to throw out ideas, not construct completed documents or programs.  (I am not saying how long I will continue to do this.)  All these posts, Mathematica programs and abstractmath.org are available to reuse under a Creative Commons license.

Commutative and associative operations

Times and Plus are commutative and associative operations.  They are usually defined as binary operations.  A binary operation $*$ is said to be commutative if for all $x$ and $y$ in the underlying set of the operation, $x*y=y*x$, and it is associative if for all $x$,$y$ and $z$ in the underlying set of the operation, $(x*y)*z=x*(y*z)$. 

It is far better to define a commutative and associative operation $*$ on some underlying set $S$ as an operation on any multiset of elements of $S$.  A multiset is like a set, in particular elements can be rearranged in any way, but it is not like a set in that elements can be repeated and a different number of repetitions of an element makes a different multiset.  So for any particular multiset, the number of repetitions of each element is fixed.  Thus $\{a,a,b,b,c\} = \{c,b,a,b,a\}$ but $\{a,a,b,b,c\}\neq\{c,b,a,b,c\}$. This means that the function (operation) Plus, for example, is defined on any multiset of numbers, and \[\mathbf{Plus}\{a,a,b,b,c\}=\mathbf{Plus} \{c,b,a,b,a\}\] but $\mathbf{Plus}\{a,a,b,b,c\}$ might not be equal to $\mathbf{Plus} \{c,b,a,b,c\}$.

This way of defining (any) associative and commutative operation comes from the theory of monads.  An operation defined on all the multisets drawn from a particular set is necessarily commutative and associative if it satisfies some basic monad identities, the main one being it commutes with union of multisets (which is defined in the way you would expect, and if this irritates you, read the Wikipedia article on multisets.). You don't have to impose any conditions specifically referring to commutativity or associativity.  I expect to write further about monads in a later post. 

The input process for a visible algebra system should allow the full strength of this fact. You can attach as many inputs as you want to Times or Plus and you can move them around.  For example, you can click on any input and move it to a different place in the following demo.

Other input notations might be suitable for different purposes.  The example below shows how the inputs can be placed randomly in two dimensions (but preserving multiplicity).  I experimented with making it show the variables slowly moving around inside the circle the way the fish do in that screensaver (which mesmerizes small children, by the way — never mind what it does to me), but I haven't yet made it work.

A visible algebra system might well allow directly input tables to be added up (or multiplied), like the one below. Spreadsheets have such an operation In particular, the spreadsheet operation does not insist that you apply it only as a binary operation to columns with two entries.  By far the most natural way to define addition of numbers is as an operation on multisets of numbers.

Other operations

Operations that are associative but not commutative, such as matrix multiplication, can be defined the monad way as operations on finite lists (or tuples or vectors) of numbers.  The operation is automatically associative if you require it to preserve concatenation of lists and some other monad requirements.

Some binary operations are neither commutative nor associative.  Two such operations on numbers are Subtract and Power.  Such operations are truly binary operations; there is no obvious way to apply them to other structures.  They are only binary because the two inputs have different roles.  This suggests that the inputs be given names, as in the examples below.

Later, I will write more about simplifying trees, solving the max area problem for rectangles surmounted by semicircles, and other things concerning this system of doing algebra.

References

Previous posts about visible algebra

Other references

 

Visible algebra I supplement

Note: This post uses MathJax. If you see mathematical formulas with dollar signs around them, or badly formatted formulas, try refreshing the screen. Sometimes you have to do it two or three times.

To manipulate the demo in this post, you must have Wolfram CDF Player installed on your computer. It is available free from the Wolfram website.

Active calculation of area

In my previous post Visible algebra I constructed a computation tree for calculating the area of a window consisting of a rectangle surmounted by a semicircle. The visual algebra system described there constructs a computation by selecting operations and attaching them to a tree, which can then be used to calculate the area of the window. 

I promised to produce a live computation tree later; it is below.  The Mathematica code for this tree is in the notebook algebra1.nb.

Press the buttons from left to right to simulate the computation that would take place in a genuine algebra system.  Note that if you skip button 2 you get the effect of parallel computation (the only place in the calculation that can be parallelized).

In Visual Algebra I the tree was put together step by step by reasoning out how you would calculate the area of the window: (1) the area is the sum of the areas of the semidisk and the rectangle, (2) the rectangle is width times height, (3) the semidisk has half the area of a disk of radius half the width of the rectangle, and so on.  So the resulting tree is a transparent construction that lets you see the reasoning that created it.  

The resulting tree could obviously be simplified.  But if you were designing a few such windows, why should you simplify it?  You certainly don't need to simplify it to speed up the computation.  On the other hand, if you are going on to solve the problem of finding the maximum area you can get if the perimeter is fixed, you will have to do some algebraic manipulation and so you do want a simplified expression.    

Later, I will write more about simplifying trees, solving the max area problem, and other things concerning this system of doing algebra.

Remark

What I am showing in these posts is a simulation of a possible visible algebra system.  I have not constructed any part of the system; these posts only show something about what the interface will look like.  My practice in the last few years is to throw out ideas, not construct completed documents or programs.  (I am not saying how long I will continue to do this.)  All these posts, Mathematica programs and abstractmath.org are available to reuse under a Creative Commons license.

Semantics of algebra I

Note: This post uses MathJax. If you see mathematical formulas with dollar signs around them, or badly formatted formulas, try refreshing the screen. Sometimes you have to do it two or three times.

In the post Algebra is a difficult foreign language  I listed some of the difficulties of the syntax of the symbolic language of math (which includes high school algebra and precalculus).  The semantics causes difficulties as well.  Again I will list some examples without any attempt at completeness.

The status of the symbolic language as a language

There is a sharp distinction between the symbolic language of math and mathematical English, which I have written about in The languages of math and in the Handbook of mathematical discourse. Other authors do not make this sharp distinction (see the list of references at the end of this post). The symbolic language occurs embedded in mathematical English and the embedding has its own semantics which may cause great difficulty for students.

The symbolic language of math can be described as a natural formal language. Pieces of it were invented by mathematicians and others over the course of the last several hundred years. Individual pieces (notation such as "$3x+1=2y$") can be given a strictly formal syntax, but the whole system is ambiguous, inconsistent, and context-sensitive.  When you get to the research level, it has many dialects: Research mathematicians in one field may not be able to read research articles in a very different field.

Examples

I think the examples below will make these claims plausible.  This should be the subject of deep research.

Superscripts and functions

  • A superscript, as in $5^2$ or $x^3$, has a pretty standard meaning denoting a power, at least until you get to higher level stuff such as tensors.  
  • A function can be denoted by a letter, symbol, or string, and the notation $f(x)$ refers to the value of the function at input $x$.  

For functions defined on numbers, it is common in precalculus and higher to write $f^2(x)$ to denoted $(f(x))^2=f(x)\,f(x)$.  Since the value of certain multiletter functions are commonly written without the parentheses (for example, $\sin\,x$), one writes $\sin^2x$ to mean $(\sin\,x)^2$.

The notation $f^n$ is also widely used to mean the $n$th iterate of $f$ (if it exists), so $f^3(x)=f(f(f(x)))$ and so on.  This leads naturally to writing $f^{-1}(x)$ for the inverse function of $f$; this is common notation whether the function $f$ is bijective or not (in which case $f^{-1}$ is set-valued).  Thus $\sin^{-1}x$ means $\arcsin\,x$.

It is notorious that words in mathematical English have different meanings in different texts.  This is an example in the symbolic language (and not just at the research level) of a systematic construction that can give expressions that have ambiguous meanings.

This phenomenon is an example of why I say the symbolic language of math is a natural formal language: I have described a natural extension of notation used with multiplication of values that has been extended to being used for the binary operation of composition.  And that leads to students thinking that $\sin^{-1}x$ means $\frac{1}{\sin\,x}$. 

History can overtake notation, too: Mathematicians probably took to writing $\sin\,x$ instead of $\sin(x)$ because it saves writing.  That was not very misleading in the old days when mathematical variables were always single symbols.  But students see multiletter variable names all the time these days (in programming languages, Excel and elsewhere), so of course some of them think $\sin\,x$ means $\sin$ times $x$. People who do this are not idiots.

Juxtaposition

Juxtaposition of two symbols means many different things.

  • If $m$ and $n$ are numbers, $mn$ denotes the product of the two numbers.
    • Multiplication is commutative, so $mn$ and $nm$ denote the same number, but they correspond to different calculations.  
  • If $M$ and $N$ are matrices, $MN$ denotes the matrix product of the two matrices.
    • This is a binary operation but it is not the same operation denoted by juxtaposition of numbers. (In fact it involves both addition and multiplication of numbers.)
    • Now $MN$ may not be the same matrix as $NM$.
  • If $A$ and $B$ are points in a geometric drawing, $AB$ denotes the line segment from $A$ to $B$.
    • This is a function of two variables denoting points whose value is a line segment.  
    • It is not what is usually called a binary operation, although as an opinionated category theorist I would call it a multisorted binary operation.
    • It is commutative, but it doesn't make sense to ask if it is associative.

This phenomenon is called overloaded notation.  

  • In order to understand the meaning of the juxtaposition of symbols, you have to know the type of the variables.
  • The surrounding text may tell you specifically the variables denote matrices or whatever. So this is an instance of context-sensitive semantics. 
    • Students tend to expect that they know what any formula means in isolation from the text.  It may make them very sad to discover that this doesn't work — once they believe it, which can take quite a while.
  • In many cases the problem is alleviated by the use of convention.
    • Matrices are usually denoted by capital letters, numbers by lower case letters.
    • But points in geometry are usually denoted by capital letters too.  So you have to know that referring to a geometric diagram is significant to understanding the notation. This is an indirect form of context-sensitivity.  Did any teacher every point this out to students?  Does it appear anywhere in print?

The earlier example of $\sin^{-1}x$ is a case which is not context-sensitive.  Knowing the types of the variables won't help.  Of course, if the author explains which meaning is meant, that explanation is within the context of the book!  That is not a lot of help for grasshoppers like me that look back and forth at different parts of a math book instead of reading it straight through..  

Equations

Consider the expressions

  1. $x^2-5x+4=0$
  2. $x^2+y^2=1$
  3. $x^2+2x+1=(x+1)^2$

They are assertions that two expressions have the same value. A strictly logical view of an equation containing variables is that it puts a constraint on the variables.  It is true of some numbers (or pairs of numbers) and false of others.  That is the defining property of an equation. Equation 1 requires that $x=1$ or $x=4$.  Equation 2 imposes a constraint which is satisfied by uncountably many pairs of real numbers, and is also not true of uncountably many pairs. But equation 3 puts no constraint on the variable.  It is true of every number $x$.

A strictly logical view of symbolic notation does math a disservice.  Here, the notion that an equation is by definition a symbolic statement that has a truth set and a falsity set may be correct but it is not the important thing about any particular equation. When we read and do math we have many different metaphors and images about a concept.  The definition of a kind of object is often in terms of things that may not be the most important things to know about it.  (One of the most important fact about groups is that it is an abstraction of symmetries, which the axioms don't mention at all.)

Equation 1. is something that would make most people set out to discover the truth set.  Equation 2. calls out for drawing its graph.  Equation 3. being an identity means that is useful in algebraic reasoning.  The images they call up are different and what you do with them is different.  The images and metaphors that cluster around a concept are an important part of the semantics of the symbolic language.

I expect to post separately about the semantics of variables and about the semantics of symbolic language embedded in mathematical English.

References

Algebra is a difficult foreign language

Note: This post uses MathJax.  If you see mathematical formulas with dollar signs around them, or badly formatted formulas, try refreshing the screen. Sometimes you have to do it two or three times.

Algebra

In a previous post, I said that the symbolic language of mathematics is difficult to learn and that we don't teach it well. (The symbolic language includes as a subset the notation used in high school algebra, precalculus, and calculus.) I gave some examples in that post but now I want to go into more detail.  This discussion is an incomplete sketch of some aspects of the syntax of the symbolic language.  I will write one or more posts about the semantics later.

The languages of math

First, let's distinguish between mathematical English and the symbolic language of math. 

  • Mathematical English is a special register or jargon of English. It has not only its special vocabulary, like any jargon, but also used ordinary English words such as "If…then", "definition" and "let" in special ways. 
  • The symbolic language of math is a distinct, special-purpose written language which is not a dialect of the English language and can in fact be read by mathematicians with little knowledge of English.
    • It has its own symbols and rules that are quite different from spoken languages. 
    • Simple expressions can be pronounced, but complicated expressions may only be pointed to or referred to.
  • A mathematical article or book is typically written using mathematical English interspersed with expressions in the symbolic language of math.

Symbolic expressions

A symbolic noun (logicians call it a term) is an expression in the symbolic language that names a number or other mathematical object, and may carry other information as well.

  • "3" is a noun denoting the number 3.
  • "$\text{Sym}_3$" is a noun denoting the symmetric group of order 3.
  • "$2+1$" is a noun denoting the number 3.  But it contains more information than that: it describes a way of calculating 3 as a sum.
  • "$\sin^2\frac{\pi}{4}$" is a noun denoting the number $\frac{1}{2}$, and it also describes a computation that yields the number $\frac{1}{2}$.  If you understand the symbolic language and know that $\sin$ is a numerical function, you can recognize "$\sin^2\frac{\pi}{4}$" as a symbolic noun representing a number even if you don't know how to calculate it.
  • "$2+1$" and "$\sin^2\frac{\pi}{4}$" are said to be encapsulated computations.
    • The word "encapsulated" refers to the fact that to understand what the expressions mean, you must think of the computation not as a process but as an object.
    • Note that a computer program is also an object, not a process.
  • "$a+1$" and "$\sin^2\frac{\pi x}{4}$" are encapsulated computations containing variables that represent numbers. In these cases you can calculate the value of these computations if you give values to the variables.  

symbolic statement is a symbolic expression that represents a statement that is either true or false or free, meaning that it contains variables and is true or false depending on the values assigned to the variables.

  • $\pi\gt0$ is a symbolic assertion that is true.
  • $\pi\lt0$ is a symbolic assertion that it is false.  The fact that it is false does not stop it from being a symbolic assertion.
  • $x^2-5x+4\gt0$ is an assertion that is true for $x=5$ and false for $x=1$.
  • $x^2-5x+4=0$ is an assertion that is true for $x=1$ and $x=4$ and false for all other numbers $x$.
  • $x^2+2x+1=(x+1)^2$ is an assertion that is true for all numbers $x$. 

Properties of the symbolic language

The constituents of a symbolic expression are symbols for numbers, variables and other mathematical objects. In a particular expression, the symbols are arranged according to conventions that must be understood by the reader. These conventions form the syntax or grammar of symbolic expressions. 

The symbolic language has been invented piecemeal by mathematicians over the past several centuries. It is thus a natural language and like all natural languages it has irregularities and often results in ambiguous expressions. It is therefore difficult to learn and requires much practice to learn to use it well. Students learn the grammar in school and are often expected to understand it by osmosis instead of by being taught specifically.  However, it is not as difficult to learn well as a foreign language is.

In the basic symbolic language, expressions are written as strings of symbols.

  • The symbolic language gives (sometimes ambiguous) meaning to symbols placed above or below the line of symbols, so the strings are in some sense more than one dimensional but less than two-dimensional.
  • Integral notation, limit notation, and others, are two-dimensional enough to have two or three levels of symbols. 
  • Matrices are fully two-dimensional symbols, and so are commutative diagrams.
  • I will not consider graphs (in both senses) and geometric drawings in this post because I am not sure what I want to write about them.

Syntax of the language

One of the basic methods of the symbolic language is the use of constructors.  These can usually be analyzed as functions or operators, but I am thinking of "constructor" as a linguistic device for producing an expression denoting a mathematical object or assertion. Ordinary languages have constructors, too; for example "-ness" makes a noun out of a verb ("good" to "goodness") and "and" forms a grouping ("men and women").

Special symbols

The language uses special symbols both as names of specific objects and as constructors.

  • The digits "0", "1", "2" are named by special symbols.  So are some other objects: "$\emptyset$", "$\infty$".
  • Certain verbs are represented by special symbols: "$=$", "$\lt$", "$\in$", "$\subseteq$".
  • Some constructors are infixes: "$2+3$" denotes the sum of 2 and 3 and "$2-3$" denotes the difference between them.
  • Others are placed before, after, above or even below the name of an object.  Examples: $a'$, which can mean the derivative of $a$ or the name of another variable; $n!$ denotes $n$ factorial; $a^\star$ is the dual of $a$ in some contexts; $\vec{v}$ constructs a vector whose name is "$v$".
  • Letters from other alphabets may be used as names of objects, either defined in the context of a particular article, or with more nearly global meaning such as "$\pi$" (but "$\pi$" can denote a projection, too).

This is a lot of stuff for students to learn. Each symbol has its own rules of use (where you put it, which sort of expression you may it with, etc.)  And the meaning is often determined by context. For example $\pi x$ usually means $\pi$ multiplied by $x$, but in some books it can mean the function $\pi$ evaluated at $x$. (But this is a remark about semantics — more in another post.)

"Systematic" notation

  • The form "$f(x)$" is systematically used to denote the value of a function $f$ at the input $x$.  But this usage has variations that confuse beginning students:
    • "$\sin\,x$" is more common than "$\sin(x)$".
    • When the function has just been named as a letter, "$f(x)$" is more common that "$fx$" but many authors do use the latter.
  • Raising a symbol after another symbol commonly denotes exponentiation: "$x^2$" denotes $x$ times $x$.  But it is used in a different meaning in the case of tensors (and elsewhere).
  • Lowering a symbol after another symbol, as in "$x_i$"  may denote an item in a sequence.  But "$f_x$" is more likely to denote a partial derivative.
  • The integral notation is quite complicated.  The expression \[\int_a^b f(x)\,dx\] has three parameters, $a$, $b$ and $f$, and a bound variable $x$ that specifies the variable used in the formula for $f$.  Students gradually learn the significance of these facts as they work with integrals. 

Variables

Variables have deep problems concerned with their meaning (semantics). But substitution for variables causes syntactic problems that students have difficulty with as well.

  • Substituting $4$ for $x$ in the expression $3+x$ results in $3+4$. 
  • Substituting $4$ for $x$ in the expression $3x$ results in $12$, not $34$. 
  • Substituting "$y+z$" in the expression $3x$ results in $3(y+z)$, not $3y+z$.  Some of my calculus students in preforming this substitution would write $3\,\,y+z$, using a space to separate.  The rules don't allow that, but I think it is a perfectly natural mistake. 

Using expressions and writing about them

  • If I write "If $x$ is an odd integer, then $3+x$ is odd", then I am using $3+x$ in a sentence. It is a noun denoting an unspecified number which can be constructed in a specified way.
  • When I mention substituting $4$ for $x$ in "$3+x$", I am talking about the expression $3+x$.  I am not writing about a number, I am writing about a string of symbols.  This distinction causes students major difficulties and teacher hardly ever talk about it.
  • In the section on variables, I wrote "the expression $3+x$", which shows more explicitly that I am talking about it as an expression.
    • Note that quotes in novels don't mean you are talking about the expression inside the quotes, it means you are describing the act of a person saying something.
  • It is very common to write something like, "If I substitute $4$ for $x$ in $3x$ I get $3 \times 4=12$".  This is called a parenthetic assertion, and it is literally nonsense (it says I get an equation).
  • If I pronounce the sentence "We know that $x\gt0$" we pronounce "$x\gt0$" as "$x$ is greater than zero",  If I pronounce the sentence "For any $x\gt0$ there is $y\gt0$ for which $x\gt y$", then I pronounce the expression "$x\gt0$" as "$x$ greater than zero$",  This is an example of context-sensitive pronunciation
  • There is a lot more about parenthetic assertions and context-sensitive pronunciation in More about the languages of math.

Conclusion

I have described some aspects of the syntax of the symbolic language of math. Learning that syntax is difficult and requires a lot of practice. Students who manage to learn the syntax and semantics can go on to learn further math, but students who don't are forever blocked from many rewarding careers. I heard someone say at the MathFest in Madison that about 25% of all high school students never really understand algebra.  I have only taught college students, but some students (maybe 5%) who get into freshman calculus in college are weak enough in algebra that they cannot continue. 

I am not proposing that all aspects of the syntax (or semantics) be taught explicitly.  A lot must be learned by doing algebra, where they pick up the syntax subconsciously just as they pick up lots of other behavior-information in and out of school. But teachers should explicitly understand the structure of algebra at least in some basic way so that they can be aware of the source of many of the students' problems. 

It is likely that the widespread use of computers will allow some parts of the symbolic language of math to be replaced by other methods such as using Excel or some visual manipulation of operations as suggested in my post Mathematical and linguistic ability.  It is also likely that the symbolic language will gradually be improved to get rid of ambiguities and irregularities.  But a deliberate top-down effort to simplify notation will not succeed. Such things rarely succeed.

References