Category Archives: representations

Every post that talks about representation of mathematical objects in the most general sense.

The only axiom of algebra

This is one of a series of posts I am writing to help me develop my thoughts about how particular topics in my book Abstracting Algebra (“AbAl“) should be organized. This post concerns the relation between substitution and evaluation that essentially constitutes the definition of algebra. The Mathematica code for the diagrams is in Subs Eval.nb.

Substitution and evaluation

This post depends heavily on your understanding of the ideas in the post Presenting binary operations as trees.

Notation for evaluation

I have been denoting evaluation of an expression represented as a tree like this:



In standard algebra notation this would be written:\[(6-4)-1=2-1=1\]

Comments

This treatment of evaluation is intended to give you an intuition about evaluation that is divorced from the usual one-dimensional (well, nearly) notation of standard algebra. So it is sloppy. It omits fine points that will have to be included in AbAl.

  • The evaluation goes from bottom up until it reaches a single value.
  • If you reach an expression with an empty box, evaluation stops. Thus $(6-3)-a$ evaluates only to $3-a$.
  • $(6-a)-1$ doesn’t evaluate further at all, although you can use properties peculiar to “minus” to change it to $5-a$.
  • I used the boxed “1” to show that the value is represented as a trivial tree, not a number. That’s so it can be substituted into another tree.
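The evaluation rules in this list can be sketched in Python. This is only an illustration under my own assumptions, not the notation planned for AbAl: the names `Box`, `Node`, `OPS` and `evaluate` are hypothetical, and a plain integer stands in for a trivial one-node tree.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Box:
    """An empty box (a variable); the name is only a label."""
    name: str


@dataclass(frozen=True)
class Node:
    """A binary operation applied to a left and a right subtree."""
    op: str
    left: "Tree"
    right: "Tree"


# A tree is a number (a trivial tree), an empty box, or an operation node.
Tree = Union[int, Box, Node]

OPS = {"+": lambda x, y: x + y, "-": lambda x, y: x - y, "*": lambda x, y: x * y}


def evaluate(t: Tree) -> Tree:
    """Evaluate bottom-up; a node with an empty box below it is left as a tree."""
    if isinstance(t, Node):
        left, right = evaluate(t.left), evaluate(t.right)
        if isinstance(left, int) and isinstance(right, int):
            return OPS[t.op](left, right)
        return Node(t.op, left, right)
    return t
```

With these definitions, `evaluate(Node("-", Node("-", 6, 4), 1))` gives `1`, the tree for $(6-3)-a$ evaluates to the tree for $3-a$, and the tree for $(6-a)-1$ comes back unchanged.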

Notation for substitution

I will use a configuration like this

to indicate the data needed to substitute the lower tree into the upper one at the variable (blank box). The result of the substitution is the tree

In standard algebra one would say, “Substitute $3\times 4$ for $a$ in the expression $a+5$.” Note that in doing this you have to name the variable.

Example

“If you substitute $12$ for $a$ in $a+5$ you get $12+5$”:

results in

Example

“If you substitute $3\times 4$ for $a$ in $a+b$ you get $3\times4+b$”:

results in

Comments

Like evaluation, this treatment of substitution omits details that will have to be included in AbAl.

  • You can also substitute on the right side.
  • Substitution in standard algebraic notation often requires sudden syntactic changes because the standard notation is essentially one-dimensional. Example: “If you substitute $3+ 4$ for $a$ in $a\times b$ you get $(3+4)\times b$”.
  • The allowed renaming of free variables except when there is a clash causes students much trouble. This has to be illustrated and contrasted with the “binop is tree” treatment which is context-free. Example: The variable $b$ in the expression $(3\times 4)+b$ by itself could be changed to $a$ or $c$, but in the sentence “If you substitute $3+ 4$ for $a$ in $a\times b$ you get $(3+4)\times b$”, the $b$ is bound. It is going to be difficult to decide how much of this needs explaining.

The axiom

The Axiom for Algebra says that the operations of substitution and evaluation commute: if you apply them in either order, you get the same resulting tree. That says that for the current example, this diagram commutes:

The Only Axiom for Algebra

In standard algebra notation, this becomes:

  • Substitute, then evaluate: If $a=3\times 4$, then $a+5=3\times 4+5=12+5$.
  • Evaluate, then substitute: If $a=3\times 4$, then $a=12$, so $a+5=12+5$.

Well, how underwhelming. In ordinary algebra notation my so-called Only Axiom amounts to a mere rewording. But that’s the point:


The Only Axiom of Algebra is what makes algebraic manipulation work.
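The commuting of substitution and evaluation can be checked on small examples with a short Python sketch. The encoding is an assumption made for illustration only: a tree is an integer, a variable name (a string), or a triple `(op, left, right)`, and the function names `evaluate` and `substitute` are hypothetical.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(t):
    """Evaluate a tree bottom-up, stopping wherever a variable blocks the way."""
    if isinstance(t, tuple):
        op, left, right = t
        left, right = evaluate(left), evaluate(right)
        if isinstance(left, int) and isinstance(right, int):
            return OPS[op](left, right)
        return (op, left, right)
    return t


def substitute(t, var, s):
    """Graft the tree s onto every leaf of t that is labeled var."""
    if isinstance(t, tuple):
        op, left, right = t
        return (op, substitute(left, var, s), substitute(right, var, s))
    return s if t == var else t


lower = ("*", 3, 4)        # 3 x 4
upper = ("+", "a", "b")    # a + b

# Substitute, then evaluate.
one_way = evaluate(substitute(upper, "a", lower))
# Evaluate, then substitute.
other_way = substitute(upper, "a", evaluate(lower))
```

Both `one_way` and `other_way` are the tree `("+", 12, "b")`, that is, $12+b$. For $a+5$ the two orders also agree once everything is evaluated: each gives $17$.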

Miscellaneous comments

  • In functional notation, the Only Axiom says precisely that $\text{eval}∘\text{subst}=\text{subst}∘(\text{eval},\text{id})$.
  • The Only Axiom has a symmetric form: $\text{eval}∘\text{subst}=\text{subst}∘(\text{id},\text{eval})$ for the right branch.
  • You may expostulate: “What about associativity and commutativity? They are axioms of algebra.” But they are axioms of particular parts of algebra. That’s why I include examples using operations such as subtraction. The Only Axiom is the (ahem) only one that applies to all algebraic expressions.
  • You may further expostulate: Using monads requires the unitary or oneidentity axiom. Here that means that a binary operation $\Delta$ can be applied to one element $a$, and the result is $a$. My post Monads for high school III shows how it is used for associative operations. The unitary axiom is necessary for representing arbitrary binary operations as a monad, which is a useful way to give a theoretical treatment of algebra. I don’t know if anyone has investigated monads-without-the-unitary-axiom. It sounds icky.
  • The Only Axiom applies to things such as single valued functions, which are unary operations, and to ternary and higher operations. It also applies to algebraic expressions involving many different operations of different arities. In that sense, my presentation of the Only Axiom only gives a special case.
  • In the case of unary operations, evaluation is what we usually call evaluation. If you think about sets the way I do (as a special kind of category), evaluation is the same as composition. See “Rethinking Set Theory”, by Tom Leinster, American Mathematical Monthly, May, 2014.
  • Calculus functions such as sine and the exponential are unary operations. But not all of calculus is algebra, because substitution in the differential and integral operators is context-sensitive.

References


Remarks concerning these posts
  • Each of the posts in this series discusses how I will present a small part of AbAl.
  • The wording of some parts of the posts may look like a first draft, and such wording may indeed appear in the text.
  • In many places I will talk about how I should present the topic, since I am not certain about it.


Creative Commons License

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.



Presenting binops as trees

Binary operations as trees

This is one of a series of posts I am writing to help me develop my thoughts about how particular topics in my book Abstracting Algebra (“AbAl“) should be organized. In some parts, I present various options that I have not decided between.

This post concerns the presentation of binary operations as trees. The Mathematica code for the diagrams is in Substitution in algebra.nb.

Binary operations as functions

A binary operation or binop $\Delta$ is a function of two variables whose value at $(a,b)$ is traditionally denoted by $a\Delta b$. Most commonly, the function is restricted to having inputs and outputs in the same set. In other words, a binary operation is a function $\Delta:S\times S\to S$ defined on some set $S$. $S$ is the underlying set of the operation. For now, this will be the definition, although binops may be generalized to multiple sets later in the book.

In AbAl:

  • Binops will be defined as functions in the way just described.
  • Algebraic expressions will be represented as trees, which exhibit more clearly the structure of the expressions that is encoded in algebraic notation.
  • They will also be represented using the usual infix expressions such as “$3\times 5$” and “$3-5$”.

Fine points

The definition of a binop as a function has terminological consequences. The correct point of view concerning a function is that it determines its domain and its codomain. In particular:


A binary operation determines its underlying set.

Thus if we talk about an arbitrary binop $\Delta$, we don’t have to give a name to its underlying set. We can just say “the underlying set of $\Delta$” or “$U(\Delta)$”.

Examples

“$+$” is not one binary operation.

  • $+:\mathbb{Z}\times\mathbb{Z}\to\mathbb{Z}$ is a binary operation.
  • $+:\mathbb{R}\times\mathbb{R}\to\mathbb{R}$ is another binary operation.

Mathematicians commonly refer to these particular binops as “addition on the integers” and “addition on the reals”.

Remark

You almost never see this attitude in textbooks on algebra. It is required by both category theory and type theory, two Waves flooding into math. Category theory is a middle-aged Wave and type theory, in the version of homotopy type theory, is a brand new baby Wave. Both Waves have changed and will change our understanding of math in deep ways.

Trees

An arbitrary binop $\Delta$ can be represented as a binary tree in this way:

generic binop

This tree represents the expression that in standard algebraic notation is “$a\Delta b$”.

In more detail, the tree is an ordered rooted binary tree. The “ordered” part means that the leaves (nodes with no descendants) are in a specific left to right order. In AbAl, I will define trees in some detail, with lots of pictures.

The root shows the operation and the two leaves show elements of the underlying set. I follow the custom in computing science of putting the root at the top.

Metaphors should not dictate your life by being taken literally.

Remark

The Wikipedia treatment of trees is scattered over many articles and they almost always describe things mostly in words, not pictures. Describing math objects in words when you could use pictures is against my religion. Describing is not the same as defining, which usually requires words.

Some concrete examples:



    
    

3trees

These are representations of the expressions “$3+5$”, “$3\times5$”, and “$3-5$”.

Just as “$5+3$” is a different expression from “$3+5$”, the left tree in 3trees above is a different expression from this one:



    

switch

They have the same value, but they are distinct as expressions — otherwise, how could you state the commutative law?

Fine points

I regard an expression as an abstract math object that can have many representations. For example “$3+5$” and the left tree in 3trees are two different representations of the same (abstract) expression. This deviates from the usual idea that “expression” refers to a typographical construction.

In previous posts, when the operation is not commutative, I have sometimes labeled the legs like this:


I have thought about using this notation consistently in AbAl, but I suspect it would be awkward in places.

Evaluation and substitution


The two basic operations on algebraic expressions
are evaluation and substitution.

They and the Only Axiom of Algebra, which I will discuss in a later post, are all that is needed to express the true nature of algebra.

Evaluation

  • If you evaluate $3+5$ you get $8$.
  • If you evaluate $3\times 5$ you get $15$.
  • If you evaluate $3-5$ you get $-2$.

I will show evaluation on trees like this:




Evaluation with trace

A more elaborate version, evaluation with trace, would look like this. This allows you to keep track of where the values come from.




You could also keep track of the operation used at each node. An interactive illustration of this is in the post Visible algebra I supplement. That illustration requires CDF Player to be installed on your computer. You can get it free from the Mathematica website.

Variables

In the tree above, the $a$ and $b$ are variables, just as they are in the equivalent expression $a\Delta b$. Algebra beginners have a hard time understanding variables.

  • You can’t evaluate an expression until you substitute numbers for the letters, which produces an instance of the expression. (“Instance” is the preferable name for this, but I often refer to such a thing as an “example”.)
  • If a variable is repeated you have to substitute the same value for each occurrence. So $a\Delta b$ is a different expression from $a\Delta a$: $2+3$ is an instance of $a+b$ but it is not an instance of $a+a$. But $a\Delta a$ and $b\Delta b$ are the same expression: any instance of one is an instance of the other.
  • Substitute $a\Delta b$ for $a$ in $a\Delta b$ and you get $(a\Delta b)\Delta b$. You may have committed variable clash. You might have meant $(a\Delta b)\Delta c$. (Somebody please tell me a good link that describes variable clash.)
  • Later, you will deal with multiplication tables for algebraic structures. There the elements are denoted by letters of the alphabet. They can’t be substituted for.

Empty boxes

A straightforward way to denote variables would be to use empty boxes:

The idea is that a number (element of the underlying set) can be inserted in each box. If $3$ (left) and $5$ (right) are placed in the boxes, evaluation would place the value of $3\Delta5$ in the root. Each empty box represents a separate variable.

Empty boxes could also be used in the standard algebraic notation: $\Box\Delta\Box$ or $\Box+\Box$ or $\Box-\Box$. I have seen that notation in texts explaining variables, but I don’t know a reference. I expect to use this notation with trees in AbAl.

To achieve the effect of one variable in two different places, as in

we can cause it to repeat, as below, where “$\text{id}$” denotes the identity function on the underlying set:

To evaluate at a number (member of the underlying set) you insert a number into the only empty box

which evaluates to

which of course evaluates to $3\Delta3$.

This way of treating repeated variables exhibits the nature of repeated variables explicitly and naturally, putting the values automatically in the correct places. This process, like everything in this section, comes from monad theory. It also reminds me of linear logic in that it shows that if you want to use a value more than once you have to copy it.

Substitution

Given two binary trees



      

you could attach the root of the first one to one of the leaves of the second one, in two different ways, to get these trees:



      


2trees

which in standard algebra notation would be written $(a-b)-c$ and $a-(b-c)$ respectively. Note that this tree



would be represented in algebra as $(a-b)-b$.

In general, substituting a tree for an input (variable or empty box) consists of replacing the empty box by the whole tree, identifying the root of the new tree with the empty box. In graph theory, “substitution” may be called “grafting”, which is a good metaphor.
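Grafting is easy to imitate in Python. The sketch below is an illustration under assumptions of my own: trees are encoded as nested triples `(op, left, right)` with strings for variables, and the names `graft`, `infix` and `value` are hypothetical. It grafts one subtraction tree onto another in the two possible ways and evaluates both results at $a=6$, $b=4$, $c=1$.

```python
import operator

OPS = {"-": operator.sub, "+": operator.add, "*": operator.mul}


def graft(tree, side, subtree):
    """Replace the left or right leaf of a one-operation tree by subtree."""
    op, left, right = tree
    if side == "left":
        return (op, subtree, right)
    return (op, left, subtree)


def infix(t):
    """Write a tree in fully parenthesized standard notation."""
    if isinstance(t, tuple):
        op, left, right = t
        return f"({infix(left)}{op}{infix(right)})"
    return str(t)


def value(t, env):
    """Evaluate a tree after replacing each variable by its value in env."""
    if isinstance(t, tuple):
        op, left, right = t
        return OPS[op](value(left, env), value(right, env))
    return env.get(t, t)


# "x" marks the leaf that receives the graft.
left_tree = graft(("-", "x", "c"), "left", ("-", "a", "b"))     # (a-b)-c
right_tree = graft(("-", "a", "x"), "right", ("-", "b", "c"))   # a-(b-c)
env = {"a": 6, "b": 4, "c": 1}
```

Here `value(left_tree, env)` is $1$ and `value(right_tree, env)` is $3$. The two graftings are different trees with different values because subtraction is not associative.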

You can substitute particular numbers for the variables in the left tree in 2trees and evaluate it in two stages:



Of course, evaluating the right one at the same values would give you a different answer, since subtraction is not associative. Here is another example:


Binary trees in general

By repeated substitution, you can create general binary trees built up of individual trees of this form:

In AbAl I will give examples of such things and their counterparts in algebraic notation. This will include binary trees involving more than one binop, as well. I showed an example in the previous post, which example I repeat here:

It represents the precise unsimplified expression

\[A=wh+\frac{1}{2}\left(\pi(\frac{1}{2}w)^2\right)\]

Some of the operations in that tree are associative and commutative, which is why the expression can be simplified. The collection of all (finite) binary trees built out of a single binop with no assumption that it satisfies laws (associative, commutative and so on) is the free algebra on that binary operation. It is the mother of all binary operations, so it plays the same role for an arbitrary binop that the set of lists plays for associative operations, as described in Monads for High School III: Algebras. All this will be covered in later chapters of AbAl.


Dysfunctions in doing math III

This post concludes the work begun in Dysfunctions in doing math I and Dysfunctions in doing math II, with more revisions to the article in abstractmath on dysfunctions.

False symmetry

Bases of vector spaces

In a finite dimensional vector space $V$ with subspace $W$, every basis of $W$ can be extended to a basis of $V$. But in general there are bases of $V$ that do not contain a subset that is a basis of $W$. A tragic lack of symmetry that causes innocent students to lose points in linear algebra.

Example

The plane $P$ defined by $x=y$ is a two-dimensional subspace of the three dimensional Euclidean space with axes $x,y,z$. One basis of $P$ is $\{(1,1,0),(0,0,1)\}$. It can be extended to the basis $\{(1,1,0),(0,0,1),(0,1,0)\}$ of $\mathbb{R}^3$. But the basis $\{(1,0,0),(0,1,0),(0,0,1)\}$ of $\mathbb{R}^3$ does not contain a subset that is a basis of $P$.
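The two claims in this example can be checked mechanically. The following Python sketch is only an illustration; the helper names `det3` and `in_plane` are hypothetical. A nonzero determinant shows that the extended set is a basis of $\mathbb{R}^3$, and the membership test shows that only one standard basis vector lies in $P$, so no subset of the standard basis can be a basis of the two-dimensional subspace $P$.

```python
def det3(u, v, w):
    """Determinant of the 3x3 matrix with rows u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))


def in_plane(v):
    """True if v lies in the plane P defined by x = y."""
    return v[0] == v[1]


extended = [(1, 1, 0), (0, 0, 1), (0, 1, 0)]
standard = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

extended_is_basis = det3(*extended) != 0
standard_vectors_in_P = [v for v in standard if in_plane(v)]
```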

Normal subgroups

Every subgroup $B$ of a commutative group $A$ is a normal subgroup of $A$. But if $B$ is a commutative subgroup of a non-commutative group $S$, then $B$ may not be a normal subgroup of $S$. For example, $\text{Sym}_3$ (the group of symmetries of an equilateral triangle) has three subgroups with two elements each. Each subgroup is commutative, but is not a normal subgroup of $\text{Sym}_3$.
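For a group this small the claim can be verified by brute force. The Python sketch below is an illustration under my own conventions: the elements of $\text{Sym}_3$ are written as permutations of $\{0,1,2\}$, and the names `compose`, `inverse` and `is_normal` are hypothetical.

```python
from itertools import permutations


def compose(p, q):
    """The permutation 'first q, then p' on {0, 1, 2}."""
    return tuple(p[q[i]] for i in range(3))


def inverse(p):
    """The inverse permutation of p."""
    inv = [0, 0, 0]
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


identity = (0, 1, 2)
group = list(permutations(range(3)))


def is_normal(subgroup):
    """True if g h g^-1 stays in the subgroup for all g in the group."""
    return all(compose(compose(g, h), inverse(g)) in subgroup
               for g in group for h in subgroup)


# The two-element subgroups: the identity plus an element of order 2.
two_element = [{identity, p} for p in group
               if p != identity and compose(p, p) == identity]
```

The list `two_element` has three members and `is_normal` is false for each of them. For contrast, the three-element subgroup of rotations `{(0, 1, 2), (1, 2, 0), (2, 0, 1)}` passes the test.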

Jump the fence

If you are working with an expression whose variables are constrained to certain values, and you substitute a value in the expression that violates the constraint, you jump the fence (also called a fencepost error).

Example

The Fibonacci numbers (MW, Wi) are usually defined inductively like this:

\[F(n)=\left\{ \begin{align}
& 0\text{ if }n=0 \\
& 1\text{ if }n=1 \\
& F(n-1)+F(n-2)\text{ if }n\gt 1 \\
\end{align} \right.\]

In calculating a sum of Fibonacci numbers, you might write

\[\sum_{k=0}^{n}{F(k)=}\sum_{k=0}^{n}{F(k-1)+}\sum_{k=0}^{n}{F(k-2)}\]
This contains errors: the sums on the right involve $F(-1)$ and $F(-2)$, which are not defined by the definition above. You could add
\[F(n)=0\text{ if }n\lt 0\]
to the definition to get around this, or keep better track of the fence by writing

\[\sum_{k=0}^{n}F(k)=F(1)+\sum_{k=1}^{n}F(k-1)+\sum_{k=2}^{n}F(k-2)\qquad (n\gt 1)\]

Here $F(1)$ is kept as a separate term because the recurrence $F(k)=F(k-1)+F(k-2)$ applies only when $k\gt 1$.

(The notation “$(n \gt 1)$” means “for all $n$ greater than $1$.” See here.)
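A small Python sketch makes the fence visible. It is an illustration only, and the function names are hypothetical. `F` refuses negative arguments, just as the definition does. `split_sum` applies the recurrence only to the terms with $k\geq 2$ and keeps $F(0)$ and $F(1)$ as they are, while `naive_sum` applies it to every term and so jumps the fence.

```python
def F(n):
    """Fibonacci numbers as defined above; undefined for n < 0."""
    if n < 0:
        raise ValueError("F(n) is not defined for negative n")
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def split_sum(n):
    """F(0) + ... + F(n), using the recurrence only for the terms with k >= 2."""
    return (F(0) + F(1)
            + sum(F(k - 1) for k in range(2, n + 1))
            + sum(F(k - 2) for k in range(2, n + 1)))


def naive_sum(n):
    """The fence-jumping version: it asks for F(-1) and F(-2)."""
    return (sum(F(k - 1) for k in range(0, n + 1))
            + sum(F(k - 2) for k in range(0, n + 1)))
```

For every $n\gt 1$, `split_sum(n)` agrees with `sum(F(k) for k in range(n + 1))`, whereas `naive_sum(n)` raises a `ValueError` as soon as it reaches $F(-1)$.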

Literalism

Every type of math object has to have a definition. In giving a definition, a few of the many ingredients that are involved in that type of object are selected as a basis for the definition. They are not necessarily the most important parts. People who make definitions try to use as little as possible in the definition so that it is easier to verify that something is an example of the thing being defined.

A definitional literalist is someone who insists on thinking about a type of math object primarily in terms of what the definition says it is.


Definitional literalism inhibits your understanding of abstract math.

Ordered pairs

One of the major tools in the study of the foundations of mathematics is to try to define all mathematical objects in terms of as few objects as possible. The most common form this takes is to define everything in terms of sets. For example, the ordered pair $(a,b)$ can be defined to be the set $\{a, \{a, b\}\}$ (see Wi). A definitional literalist will conclude that the ordered pair $(a,b)$ is the set $\{a, \{a, b\}\}$.

This would mean that it makes sense to say that $a\in(a,b)$ but $b\notin(a,b)$.
No mathematician would ever think of saying such things.

What is important about an ordered pair is its specification:

  • An ordered pair has a first coordinate and a second coordinate.
  • The first and second coordinates completely determine the ordered pair.

It is ludicrous to say something like “$a\in (a,b)$”. The “definition” that $(a,b)$ is the set $\{a,\{a,b\}\}$ is done purely for the purpose of showing that the study of ordered pairs can be reduced to the study of sets. It is not a fact about ordered pairs that we can use.

Equivalence relations

An equivalence relation on a set S is a relation on S with certain properties. A partition on S is a set of subsets with certain properties. The two definitions can be proven to give the same structure (that is done here).

I have personally heard literalists say, “How can they give the same structure? One is a relation and one is a partition.” The point is that an equivalence relation/partition has a total structure which can be described either by starting with a relation and imposing axioms, or by giving a set of subsets and imposing axioms. Each set of axioms describes exactly the same structure; every theorem that can be deduced from the axioms for an equivalence relation can be deduced from the axioms for a partition.

Functions

The (less strict) definition of function says that a function is a set of ordered pairs with the functional property.

This does not mean that if your function is $F ( x ) = 2 x + 1$, then you would say “$\left( 3,\,7 \right)\in F$”. The most common practice is to say that “$F (3) = 7$” or “the value of $F$ at $3$ is $7$” or something of the sort.

I do know mathematicians who tell me that they really do think of a function as a set of ordered pairs and would indeed say “$\left( 3,\,7 \right)\in F$”.

Vanishing

Many years ago I had a math professor who hated it with a purple passion if anyone said a function vanishes at some number $a$, meaning its value at $a$ is $0$. If you said, “The function $x^2-1$ vanishes at $1$”, he would say, “Pah! The function is still there, isn’t it?”

There are in fact two different points a literalist can make about such a statement.

  • The function’s value at $1$ is $0$. The function is not zero anywhere, it is $x^2-1$, or if you have other literalness attitudes, it is “the function $f(x)$ defined by $f(x)=x^2-1$”.
  • Even its value doesn’t literally “vanish”. The value is written as “$0$”. Look at it closely. You can see it. It has not vanished.

The phrase “the function vanishes at $a$” is a metaphor. Mathematicians use metaphors in writing and talking about math all the time, just as people do in writing and talking about anything. Nevertheless, being occasionally the obnoxious literalist sometimes clears up misunderstanding. That is why mathematicians have a reputation for literalism.

Method addiction

Beginners at abstract math sometimes have the attitude that a problem must be solved or a proof constructed by a specific procedure. They become quite uncomfortable when faced with problem solutions that involve guessing or conceptual proofs that involve little or no calculation.

Example

Once I gave a problem in my Theoretical Computer Science class whose solution required finding the largest integer $n$ for which $n!\lt 10^9$. Most students solved it correctly, but several wrote apologies on their paper for doing it by trial and error. Of course:


Trial and error is a perfectly valid method.
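Trial and error is also what a program would do here. The Python sketch below is an illustration only; the function name and its `bound` parameter are hypothetical. It tries $n=0,1,2,\ldots$ until the next factorial would reach the bound.

```python
def largest_n_with_factorial_below(bound):
    """Return the largest n >= 0 with n! < bound, by trying n = 0, 1, 2, ..."""
    if bound <= 1:
        raise ValueError("no n has n! < bound, since 0! = 1")
    n, factorial = 0, 1
    # Invariant: factorial == n! and n! < bound.
    while factorial * (n + 1) < bound:
        n += 1
        factorial *= n
    return n
```

For a bound of `10**9` the function returns `12`, since $12!=479001600$ and $13!=6227020800$.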

Example

Students at a more advanced level may feel insecure in the case where they are faced with solving a problem for which they know there is no known feasible algorithm, a situation that occurs mostly in senior and graduate level classes. For example, there are no known feasible general algorithms for determining if two finite groups given by their multiplication tables are isomorphic, and there is no algorithm at all to determine if two presentations (generators and relations) give the same group. Even so, the question, “Are the dihedral group of order 8 and the quaternion group isomorphic?” is not hard. (Answer: No, they have different numbers of elements of order 2 and 4.)


Sometimes you can solve special cases of unsolvable problems.
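The special case of the dihedral group of order 8 and the quaternion group can even be settled by machine. The Python sketch below is an illustration under my own assumptions: the dihedral group is generated by a rotation and a reflection written as permutations of the four corners of a square, the quaternion group by $i$ and $j$ written as integer 4-tuples, and all function names are hypothetical.

```python
from collections import Counter


def closure(generators, multiply):
    """All products of the generators (the group they generate, if finite)."""
    elements = set(generators)
    frontier = list(generators)
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = multiply(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def order_counts(elements, multiply, identity):
    """Count how many elements have each order."""
    counts = Counter()
    for x in elements:
        power, order = x, 1
        while power != identity:
            power = multiply(power, x)
            order += 1
        counts[order] += 1
    return dict(counts)


def perm_mul(p, q):
    """Compose permutations of {0, 1, 2, 3}: first q, then p."""
    return tuple(p[q[i]] for i in range(4))


def quat_mul(p, q):
    """Multiply quaternions given as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


# Dihedral group of order 8: a rotation and a reflection of the square.
d8 = closure([(1, 2, 3, 0), (0, 3, 2, 1)], perm_mul)
# Quaternion group: generated by i and j.
q8 = closure([(0, 1, 0, 0), (0, 0, 1, 0)], quat_mul)
```

Both groups have eight elements, but `order_counts(d8, perm_mul, (0, 1, 2, 3))` is `{1: 1, 2: 5, 4: 2}` while `order_counts(q8, quat_mul, (1, 0, 0, 0))` is `{1: 1, 2: 1, 4: 6}`, so the groups cannot be isomorphic.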

See also look ahead and conceptual.

Proof by Example

Definition: An integer is even if it is divisible by 2.

Theorem: If $n$ is an even integer then so is $n^2$.

This is proved by universal generalization.

One type of mistake made by beginners for proofs like this would be the following:

“Proof: Let $n = 8$. Then ${{n}^{2}}=64$ and $64$ is even.”

This violates the requirement of universal generalization that you have “made no restrictions on $c$”: you have restricted it to being a particular even integer!

It may be that some people who make this kind of mistake don’t understand universal generalization (see also bound variable). But for others, the mistake is caused by misreading the phrase “An integer is even if…” to read that you can prove the statement by picking an integer and showing that it is true for that integer. But in fact, “an” in a statement like this means “any”. See indefinite article.

Reading variable names as labels

An assertion such as “There are six times as many students as professors” is translated by some students as $6s = p$ instead of $6p = s$ (where $p$ and $s$ have the obvious meanings). This sort of thing can be avoided by plugging in numbers for the variables to see if the resulting equations make sense. You know it’s wrong to say that if you have $12$ professors then you have $2$ students!

Math ed people have referred to this as the “student-professor problem”. But it is not the real student-professor problem.

The representation is the object

Many newbies at abstract mathematics firmly believe that the number $735$ is the expression “735”. In fact, the number $735$ is an abstract math object, not a string of symbols that represents the number. This attitude inhibits your ability to use whatever representation of an object is best for the purpose.

Example

Someone faced with a question such as “Does $21$ divide $3 \cdot5\cdot72$?” may immediately multiply the expression out to get $1080$ and then carry out long division to see if indeed $21$ divides $1080$. They will say things such as, “I can’t tell what the number is until I multiply it out.”

In this example, it is easy to see that $21$ does not divide $3 \cdot5\cdot72$, because if it did, $7$ would be a prime factor, but $7$ does not divide $72$.

Integers have many representations: decimal, binary, the prime factorization, and so on. Clearly the prime factorization is the best form for determining divisors, whereas for example the decimal notation is a good form for determining which of two integers is the larger. For example, is $3 \cdot5\cdot72$ bigger or smaller than $2\cdot 11\cdot49$?
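The difference between the two representations shows up clearly in code. In the Python sketch below, which is only an illustration with hypothetical function names, a factorization is stored as a dictionary from primes to exponents. Divisibility is then a comparison of exponents, while comparing sizes is easiest with the ordinary integers.

```python
def prime_factors(n):
    """Prime factorization of n >= 2 as a dict {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def merge(*factorizations):
    """Factorization of a product, from the factorizations of its factors."""
    total = {}
    for f in factorizations:
        for p, e in f.items():
            total[p] = total.get(p, 0) + e
    return total


def divides(d, f):
    """True if the number with factorization d divides the one with f."""
    return all(f.get(p, 0) >= e for p, e in d.items())


product = merge(prime_factors(3), prime_factors(5), prime_factors(72))
```

Here `product` is $2^3\cdot 3^3\cdot 5$, which contains no $7$, so `divides(prime_factors(21), product)` is false without any long division. In decimal notation the two numbers in the last question are $1080$ and $1078$, and the comparison is immediate.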

Unique

By definition, a set $R$ of ordered pairs has the functional property if two pairs in $R$ with the same first coordinate have to have the same second coordinate.

It is wrong to rephrase the definition this way: “The first coordinate determines a unique second coordinate.” That use of “unique” is ambiguous. It could be read as saying that the set \[\{(1,2), (2,4), (3,2), (5,8)\}\] does not have the functional property because the first coordinate in $(1,2)$ determines $2$ and the first coordinate in $(3,2)$ determines $2$, so it is “not unique”. This statement is wrong. The set does have the functional property.

A related error is to reword the definition of injective by saying, “For each input there is a unique output.” It is easy to read this and think injectivity is merely the functional property.
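The two properties are easy to tell apart once they are written as tests on a set of pairs. The Python sketch below is an illustration only, with hypothetical function names. The functional property looks at repeated first coordinates, and injectivity looks at repeated second coordinates.

```python
def has_functional_property(pairs):
    """Two pairs with the same first coordinate have the same second one."""
    seen = {}
    for first, second in pairs:
        if first in seen and seen[first] != second:
            return False
        seen[first] = second
    return True


def is_injective(pairs):
    """Two pairs with the same second coordinate have the same first one."""
    return has_functional_property((second, first) for first, second in pairs)


R = {(1, 2), (2, 4), (3, 2), (5, 8)}
```

The set `R` has the functional property, so `has_functional_property(R)` is true. It is not injective, because $2$ is the second coordinate of both $(1,2)$ and $(3,2)$.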

It seemed to me that during the 35 years I taught calculus and discrete math, students fell into this trap about 100,000 times. Of course, this could be a slight exaggeration.


Avoid rewording any definition that does not use the word unique
so that it DOES use the word unique.
Such activity fries your brain and turns A’s into B’s.


Unnecessarily weak assertion

Examples

  • The statement “Either $x \gt 0$ or $x \lt 2$” is true (for real numbers). Yes, you could make a stronger statement, for example “Either $x\le 0$ or $x \gt 0$”. But the statement “Either $x \gt 0$ or $x \lt 2$” is still true.
  • Some students have problems with the true statements “$2\le 2$” and with “$2\le 3$” for a similar reason, since in fact $2 = 2$ and $2 \lt 3$.
  • You may get a twinge if someone says “Many primes are odd”, since in fact there is only one that is not odd. But it is still true that many primes are odd.

An unnecessarily weak assertion may occur in math texts because it is the form your proof gives you, or it is the form you need for a proof. In the latter case you may feel the author has pulled a rabbit out of a hat.

There is another example here.


It is not wrong for an author to make an unnecessarily weak assertion.




Rabbits

Sometimes when you are reading or listening to a proof you will find yourself following each step but with no idea why these steps are going to give a proof. This can happen with the whole structure of the proof or with the sudden appearance of a step that seems like the prover pulled a rabbit out of a hat. You feel as if you are walking blindfolded.

Example (mysterious proof structure)

The lecturer says he will prove that for an integer $n$, if $n^2$ is even then $n$ is even. He begins the proof: “Let $n$ be odd” and then continues to the conclusion, “Therefore $n^2$ is odd.”

Why did he begin a proof about being even with the assumption that $n$ is odd?

The answer is that in this case he is doing a proof by contrapositive. If you don’t recognize the pattern of the proof you may be totally lost. This can happen if you don’t recognize other forms, for example contradiction and induction.

Example (rabbit)

You are reading a proof that $\lim\limits_{x\to 2}x^2=4$. It is an $\varepsilon \text{-}\delta$ proof, so what must be proved is:

  • (*) For any positive real number $\varepsilon $,
  • there is a positive real number $\delta $ for which:
  • if $\left| x-2 \right|\lt\delta$ then
  • $\left| x^2-4 \right|\lt\varepsilon$.

Proof

Here is the proof, with what I imagine might be your agitated reaction to certain steps. Below is a proof with detailed explanations.

1) Suppose $\varepsilon \gt0$ is given.

2) Let $\delta =\text{min}\,(1,\,\frac{\varepsilon }{5})$ (the minimum of the two numbers 1 and $\frac{\varepsilon}{5}$ ).

Where the *!#@! did that come from? They pulled it out of thin air! I can’t see where we are going with this proof!

3) Suppose that $\left| x-2 \right|\lt\delta$.

4) Then $\left| x-2 \right|\lt1$ by (2) and (3).

5) By (4) and algebra, $\left|x+2 \right|\lt5$.

Well, so what? We know that $\left| x+39 \right|\lt42$ and lots of other things, too. Why did they do this?

6) Also $\left| x-2 \right|\lt\frac{\varepsilon }{5}$ by (2) and (3).

7) Then $\left| x^2-4 \right|=\left| (x-2)(x+2) \right|\lt\frac{\varepsilon }{5}\cdot 5=\varepsilon$ by (5) and (6). End of Proof.
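A numerical spot check is no substitute for the proof, but it can make the choice of $\delta$ in step 2 less mysterious. The Python sketch below is an illustration only; the function names and the sampling scheme are my own assumptions. It samples points $x$ with $0\lt\left|x-2\right|\lt\delta$ and tests whether $\left|x^2-4\right|\lt\varepsilon$ for each of them.

```python
def delta_for(epsilon):
    """The delta chosen in step 2 of the proof."""
    return min(1.0, epsilon / 5.0)


def check(epsilon, samples=1000):
    """Sample x with |x - 2| < delta and test |x^2 - 4| < epsilon."""
    delta = delta_for(epsilon)
    for i in range(1, samples):
        offset = delta * i / samples          # 0 < offset < delta
        for x in (2 - offset, 2 + offset):
            if not abs(x * x - 4) < epsilon:
                return False
    return True
```

For example, `check(0.1)` uses $\delta=0.02$ and finds no sample that violates the inequality. The same holds for a large $\varepsilon$ such as $10$, where $\delta$ is capped at $1$.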

Remarks

This proof is typical of proofs in texts.

  • Steps 2) and 5) look like they were rabbits pulled out of a hat.
  • The author gives no explanation of where they came from.
  • Even so, each step of the proof follows from previous steps, so the proof is correct.
  • Whether you are surprised or not has nothing to do with whether it is correct.
  • In order to understand a proof, you do not have to know where the rabbits came from.
  • In general, the author did not think up the proof steps in the order they occur in the proof. (See this remark in the section on Forms of Proofs.)
  • See also look ahead.

Acknowledgments

Thanks to Robert Burns for corrections and suggestions.


Guest post by F. Kafi

Before I posted Extensional and Intensional, I had emailed a draft to F. Kafi.  The following was his response.  –cw

 

In your example, “Suppose you set out to prove that if $f(x)$ is a differentiable function and $f(a)=0$ and the graph going from left to right goes UP to $f(a)$ and then DOWN after that then $a$ has to be a maximum of the function”, could we have the graph of the function $f(x)$ without being aware of the internal structure of the function; i.e., the mathematical formulation of $f(x)$ such as $f(x):=-(x-a)^2$ or simply its intensional meaning? Certainly not.

Furthermore, what paves the way for the comparison with our real world experiences leading to the metaphoric thinking is nothing but the graph of the function. Therefore, it is the intensional meaning of the function which makes the metaphoric mode of thinking possible.

The intensional meaning is especially required if we are using a grounding metaphor. A grounding metaphor uses concepts from our physical and real world life. As a result we require a medium to connect such real life concepts like “going up” and “going down” to mathematical concepts like the function $f(x)$. The intensional meaning of function $f(x)$ through providing numbers opens the door of the mind to the outer world. This is possible because numbers themselves are the result of a kind of abstraction process which the famous educational psychologist Piaget calls empirical abstraction. In fact, through empirical abstraction we transform the real world experience to numbers.

 

 

Let’s consider an example. We see some racing cars in the picture above, a real world experience if you are a spectator at a car race. The empirical abstraction works something like this:

 

 

Now we may choose a symbol like "$5$" to denote our understanding of "|||||".

It is now clear that the metaphoric mode of thinking is the reverse process of “empirical abstraction”. For example, in comparing “|||||||||||” with “||||” we may say “A car race with more competing cars is much more exciting than a much less crowded one.” Therefore, “|||||||||||”>“||||”, where “>” is the abstraction of “much more exciting than”.

In the rigorous mode of thinking, the idea is almost similar. However, there is an important difference. Here again we have a metaphor. But this time, the two concepts are mathematical. There is no outer world concept. For example, we want to prove a differentiable function is also a continuous one. Both concepts of “differentiability” and “continuity” have rigorous mathematical definitions. Actually, we want to show that differentiability is similar to continuity, a linking metaphor. As a result, we again require a medium to connect the two mathematical concepts. This time there is no need to open the door of the mind to the outer world because the two concepts are in the mind. Hence, the intensional meaning of function $f(x)$ through providing numbers is not helpful. However, we need the intensional meanings of differentiability and continuity of $f(x)$; i.e., the logical definitions of differentiability and continuity.

In the case of comparing the graph of $f(x)$ with a real hill we associated dots on the graph with the path on the hill. Right? Here we need to do the same. We need to associate the $f(x)$’s in the definition of differentiability to the $f(x)$’s used in the definition of continuity. The $f(x)$’s play the role of dots on the graph. As the internal structure of dots on the graph are unimportant to the association process in the grounding metaphor, the internal structure of $f(x)$’s in the logical definition are unimportant to the association process in the linking metaphor. Therefore, we only need the extensional meaning of the function $f(x)$; i.e., syntactically valid roles it can play in expressions.


Thinking about a function as a mathematical object

A mathematician’s mental representation of a function is generally quite rich and may involve many different metaphors and images kept in mind simultaneously. The abmath article on metaphors and images for functions discusses many of these representations, although the article is incomplete. This post is a fairly thorough rewrite of the discussion in that article of the representation of the concept of “function” as a mathematical object. You must think of functions as math objects when you are taking the rigorous view, which happens when you are trying to prove something about functions (or large classes of functions) in general.

What often happens is that you visualize one of your functions in many of the ways described in this article (it is a calculation, it maps one space to another, its graph is bounded, and so on) but those images can mislead you. So when you are completely stuck, you go back to thinking of the function as an axiomatically-defined mathe­matical structure of some sort that just sits there, like a complicated machine where you can see all the parts and how they relate to each other. That enables you to prove things by strict logical deduction. (Mathematicians mostly only go this far when they are desperate. We would much rather quote somebody’s theorem.) This is what I have called the dry bones approach.

The “mathematical structure” is most commonly a definition of function in terms of sets and axioms. The abmath article Specification and definition of “function” discusses the usual definitions of “function” in detail.

Example

This example is intended to raise your consciousness about the possibilities for functions as objects.

Consider the function $f:\mathbb{R}\to\mathbb{R}$ defined by $f(x)=2{{\sin }^{2}}x-1$. Its value can be computed at many different numbers but it is a single, static math object.

You can apply operators to it

  • Just as you can multiply a number by $2$, you can multiply $f$ by $2$. You can say “Let $g(x)=2f(x)$” or “Let $g=2f$”. Multiplying a numerical function by $2$ is an operator that takes the function $f$ to $2f$. Its input is a function and its output is another function. Then the value of $g$ (which is $2f$) at any real $x$ is $g(x)=2f(x)=4{{\sin }^{2}}x-2$. The notation “$g=2f$” reveals that mathematicians think of $f$ as a single math object just as the $3$ in the expression “$2\times 3$” represents the number $3$ as a single object.
  • But you can’t do arithmetic operations to functions that don’t have numerical output, such as the function $\text{FL}$ that takes an English word to its first letter, so $\text{FL}(`\text{wolf’})=`\text{w’}$. (The quotes mean that I am writing about the word ‘wolf’ and the letter ‘w’.) The expression $2\times \text{FL}(`\text{wolf’})$ doesn’t make sense because ‘w’ is a letter, not a number.
  • You can find the derivative. The derivative operator is a function from differentiable functions to functions. Such a thing is usually called an operator. The derivative operator is sometimes written as $D$, so $Df$ is the function defined by: “$(Df)(x)$ is the slope of the tangent line to $f$ at the point $(x,f(x))$.” That is a perfectly good definition. In calculus class you learn formulas that allow you to calculate $(Df)(x)$ (usually called “$f'(x)$”) to be $4 \sin (x) \cos (x)$.
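Here is a minimal Python sketch of the idea that operators take whole functions as input. The names `scale` and `D` are my own, and `D` only approximates the derivative with a central difference (the step `h` is an arbitrary choice), so treat it as an illustration rather than a definition.

```python
import math

def f(x):
    """The example function f(x) = 2 sin^2 x - 1."""
    return 2 * math.sin(x) ** 2 - 1

def scale(c, func):
    """Operator: takes a function and returns the function c * func."""
    return lambda x: c * func(x)

def D(func, h=1e-6):
    """Operator: takes a function and returns a numerical approximation
    to its derivative (central difference with an arbitrary small step h)."""
    return lambda x: (func(x + h) - func(x - h)) / (2 * h)

g = scale(2, f)   # g = 2f is a new function object
Df = D(f)         # approximately the function x -> 4 sin x cos x

x = 0.7
print(g(x), 4 * math.sin(x) ** 2 - 2)          # the two numbers agree
print(Df(x), 4 * math.sin(x) * math.cos(x))    # the two numbers nearly agree
```

Notice that `scale(2, f)` and `D(f)` never mention a particular input value: the thing being operated on is `f` itself.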

Like all math objects, functions may have properties

  • The function defined by $f(x)=2{{\sin}^{2}}x-1$ is differentiable, as noted above. It is also continuous.
  • But $f$ is not injective. This means that two different inputs can give the same output. For example, $f(\frac{\pi}{3})=f(\frac{4\pi}{3})=\frac{1}{2}$. This is a property of the whole function, not individual values. It makes no sense to say that $f(\frac{\pi}{3})$ is injective.
  • The function $f$ is periodic with period $2\pi$, meaning that for any $x$, $f(x+2\pi)=f(x)$. (Its smallest period is actually $\pi$, since $f(x)=-\cos 2x$.) It is the function itself that has period $2\pi$, not any particular value of it.
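These properties can be spot-checked numerically. The sketch below is only a sanity check on a finite sample of inputs (the helper name `looks_periodic` and the sample points are my own choices); a finite check can refute periodicity but cannot prove it.

```python
import math

def f(x):
    return 2 * math.sin(x) ** 2 - 1

# Two different inputs with the same output, so f is not injective.
a, b = math.pi / 3, 4 * math.pi / 3
print(f(a), f(b))

def looks_periodic(func, period, samples):
    """True if func(x + period) agrees with func(x) at every sample point."""
    return all(math.isclose(func(x + period), func(x), abs_tol=1e-9)
               for x in samples)

samples = [k / 10 for k in range(-50, 51)]
print(looks_periodic(f, 2 * math.pi, samples))   # consistent with period 2*pi
print(looks_periodic(f, 1.0, samples))           # 1 is not a period
```

Both checks take the whole function `f` as their argument, which matches the point that injectivity and periodicity belong to the function and not to any one value.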

As a math object, a function can be an element of a set

  • For example, $f$ is an element of the set ${{C}^{\infty }}(\mathbb{R})$ of real-valued functions that have derivatives of all orders.
  • On ${{C}^{\infty }}(\mathbb{R})$, differentiation is an operator that takes a function in that set to another function in the set.   It takes $f(x)$ to the function $4\sin x\cos x$.
  • If you restrict $f$ to the unit interval, it is an element of the function space ${{\text{L}}^{2}}[0,1]$. As such it is convenient to think of it as a point in the space (the whole function is the point, not just values of it). In this particular space, you can think of the points as vectors in an uncountably-infinite-dimensional space. (Ideas like that weird some people out. Do not worry if you are one of them. If you keep on doing math, function spaces will seem ordinary. They are OK by me, except that I think they come in entirely too many different kinds which I can never keep straight.) As a vector, $f$ has a norm, which you can think of as its length. The norm of $f$ is the square root of $\int_0^1 f(x)^2\,dx=\frac{1}{2}+\frac{\sin 4}{8}$, which is about $0.64$.
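The norm in ${{\text{L}}^{2}}[0,1]$ is the square root of the integral of $f(x)^2$ over the unit interval, and it can be estimated numerically. This is a rough sketch using the midpoint rule (the helper `l2_norm` and the number of subintervals are my own choices). Since $f(x)=-\cos 2x$, the integral also has the closed form $\frac{1}{2}+\frac{\sin 4}{8}$, which the code uses as a cross-check.

```python
import math

def f(x):
    return 2 * math.sin(x) ** 2 - 1

def l2_norm(func, a, b, n=100_000):
    """Approximate sqrt(integral of func^2 over [a, b]) by the midpoint rule."""
    h = (b - a) / n
    total = sum(func(a + (i + 0.5) * h) ** 2 for i in range(n))
    return math.sqrt(total * h)

numeric = l2_norm(f, 0.0, 1.0)
closed_form = math.sqrt(0.5 + math.sin(4) / 8)
print(numeric, closed_form)
```

The input to `l2_norm` is the function itself, and the output is a single number attached to the function as a whole.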

The discussion above shows many examples of thinking of a function as an object. You are thinking about it as an undivided whole, as a chunk, just as you think of the number $3$ (or $\pi$) as just a thing. You think the same way about your bicycle as a whole when you say, “I’ll ride my bike to the library”. But if the transmission jams, then you have to put it down on the grass and observe its individual pieces and their relation to each other (the chain came off a gear or whatever), in much the same way as noticing that the function $g(x)=x^3$ goes through the origin and looks kind of flat there, but at $(2,8)$ it is really rather steep. Phrases like “steep” and “goes through the origin” are a clue that you are thinking of the function as a curve that goes left to right and levels off in one place and goes up fast in another — you are thinking in a dynamic, not a static way like the dry bones of a math object.


The definition of “function”

 

This is the new version of the abstractmath article on the definition of function. I had to adapt the formatting and some of it looks weird, but legible. It is prettier on abstractmath.org.

I expect to announce new revisions of other abmath articles on this blog, with links, but not to publish them here. This article brings out a new point of view about defining functions that I wanted to call attention to, so I am publishing it here, as well.

 

FUNCTIONS: SPECIFICATION AND DEFINITION

It is essential that you understand many of the images, metaphors and terminology that mathe­maticians use when they think and talk about functions. For many purposes, the precise mathematical definition of "function" does not play much of a role when you are trying to understand particular kinds of functions. But there is one point of view about functions that has resulted in fundamental progress in math:

 

 

A function is a mathematical object.

To deal with functions in that way you need a precise definition of "function". That is what this article gives you.

  • The article starts by giving a specification of "function".
  • After that, we get into the technicalities of the definitions of the general concept of function.
  • Things get complicated because there are several inequivalent definitions of "function" in common use.

Specification of "function"

A function $f$ is a mathematical object which determines and is completely determined by the following data:

(DOM) $f$ has a domain, which is a set. The domain may be denoted by $\text{dom} f$.

(COD) $f$ has a codomain, which is also a set and may be denoted by $\text{cod} f$.

(VAL) For each element $a$ of the domain of $f$, $f$ has a value at $a$, denoted by $f(a)$.

(FP) The value of $f$ at $a$ is completely determined by $a$ and $f$.

(VIC) The value of $f$ at $a$ must be an element of the codomain of $f$.

  • The operation of finding $f(a)$ given $f$ and $a$ is called evaluation.
  • "FP" means functional property.
  • "VIC" means "value in codomain".

Examples

The examples of functions chapter contains many examples. The two I give here provide immediate examples.

A finite function

Let $F$ be the function defined on the set $\left\{1,\,2,\,3,\,6 \right\}$ as follows: $F(1)=3,\,\,\,F(2)=3,\,\,\,F(3)=2,\,\,\,F(6)=1$. This is the function called "Finite" in the chapter on examples of functions.

  • The definition of $F$ says "$F$ is defined on the set $\left\{1,\,2,\,3,\,6 \right\}$". That phrase means that the domain is that set.
  • The value of $F$ at each element of the domain is given explicitly. The value at 3, for example, is 2, because the definition says that $F(3) = 2$. No other reason needs to be given. Mathematical definitions can be arbitrary.
  • The codomain of $F$ is not specified, but must include the set $\{1,2,3\}$. The codomain of a function is often not specified when it is not important — which is most of the time in freshman calculus (for example).
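One way to picture a finite function like $F$ is as a lookup table. The following is a small Python sketch, assuming we store $F$ as a dictionary; that storage choice is mine and is not part of the definition.

```python
# The finite function F, stored as a table of values.
F = {1: 3, 2: 3, 3: 2, 6: 1}

domain = set(F)            # the set F is defined on
image = set(F.values())    # every possible codomain must contain this set

print(F[3])                # evaluation: the value of F at 3
print(domain, image)
```

The dictionary records the domain and the values, but it says nothing about the codomain, just as the definition of $F$ does not.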

A real-valued function

Let $G$ be the real-valued function defined by the formula $G(x)={{x}^{2}}+2x+5$.

  • The definition of $G$ gives the value at each element of the domain by a formula. The value at $3$, for example, is $G(3)=3^2+2\cdot3+5=20$.
  • The definition of $G$ does not specify the domain. The convention in the case of functions defined on the real numbers by a formula is to take the domain to be all real numbers at which the formula is defined. In this case, that is every real number, so the domain is $\mathbb{R}$.
  • The definition does not specify the codomain, either. However, it must include all real numbers greater than or equal to 4. (Why?)

What the specification means

  • The specification guarantees that a function satisfies all five of the properties listed.
  • The specification does not define a mathematical structure in the way mathematical structures have been defined in the past: In particular, it does not require a function to be one or more sets with structure.
  • Even so, it is useful to have the specification, because:

     

     

    Many mathematical definitions
    introduce extraneous technical elements
    which clutter up your thinking
    about the object they define.

     

     

    I will say more about this when I give the various definitions that are in use.

History

Until late in the nineteenth century, functions were usually thought of as defined by formulas (including infinite series). Problems arose in the theory of harmonic analysis which made mathematicians require a more general notion of function. They came up with the concept of function as a set of ordered pairs with the functional property (discussed below), and that concept revolutionized our understanding of math.

This discussion is an over­simpli­fication of the history of mathe­matics, which many people have written thick books about. A book relevant to these ideas is Plato's Ghost, by Jeremy Gray.

In particular, this definition, along with the use of set theory, enabled abstract math (ahem) to become a common tool for understanding math and proving theorems. It is conceivable that some of you may wish it hadn't. Well, tough.

The more modern definition of function given here (which builds on the older definition) came into use beginning in the 1950's. The strict version became necessary in algebraic topology and is widely used in many fields today.

The concept of function as a formula never disappeared entirely, but was studied mostly by logicians who generalized it to the study of function-as-algorithm. Of course, the study of algorithms is one of the central topics of modern computing science, so the notion of function-as-formula (updated to function-as-algorithm) has achieved a new importance in recent years.

To state both the old abstract definition and the modern one, we need a preliminary idea.

The functional property

A set $P$ of ordered pairs has the functional property if two pairs in $P$ with the same first coordinate have to have the same second coordinate (which means they are the same pair). In other words, if $(x,a)$ and $(x,b)$ are both in $P$, then $a=b$.

How to think about the functional property

The point of the functional property is that for any pair in the set of ordered pairs, the first coordinate determines what the second one is. That's why you can write "$G(x)$" for any $x$ in the domain of $G$ and not be ambiguous.

Examples

  • The set $\{(1,2), (2,4), (3,2), (5,8)\}$ has the functional property, since no two different pairs have the same first coordinate. Note that there are two different pairs with the same second coordinate. This is irrelevant to the functional property.
  • The set $\{(1,2), (2,4), (3,2), (2,8)\}$ does not have the functional property. There are two different pairs with first coordinate 2.
  • The empty set $\emptyset$ has the functional property vacuously.
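For a finite set of pairs, the functional property can be checked mechanically. Here is a short Python sketch (the function name `has_functional_property` is my own) applied to the three examples above.

```python
def has_functional_property(pairs):
    """True if no two pairs share a first coordinate but differ in the second."""
    seen = {}
    for first, second in pairs:
        if first in seen and seen[first] != second:
            return False
        seen[first] = second
    return True

print(has_functional_property({(1, 2), (2, 4), (3, 2), (5, 8)}))
print(has_functional_property({(1, 2), (2, 4), (3, 2), (2, 8)}))
print(has_functional_property(set()))
```

For the empty set the loop body never runs, so the check succeeds with nothing to test, which is what "vacuously" means.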

Graph of a function.

Example: graph of a function defined by a formula

In calculus books, a picture like this one (of part of $y=x^2+2x+5$) is called a graph. Here I use the word "graph" to denote the set of ordered pairs \[\left\{ (x,{{x}^{2}}+2x+5)\,\mathsf{|}\,x\in \mathbb{R } \right\}\] which is a mathematical object rather than some ink on a page or pixels on a screen.

The graph of any function studied in beginning calculus has the functional property. For example, the set of ordered pairs above has the functional property because if $x$ is any real number, the formula ${{x}^{2}}+2x+5$ defines a specific real number.

  • if $x = 0$, then ${{x}^{2}}+2x+5=5$, so the pair $(0, 5)$ is an element of the graph of $G$. Each time you plug in $0$ in the formula you get 5.
  • if $x = 1$, then ${{x}^{2}}+2x+5=8$.
  • if $x = -2$, then ${{x}^{2}}+2x+5=5$.

You can measure where the point $(-2,5)$ is on the (picture of the) graph and see that it is on the blue curve as it should be. No other pair whose first coordinate is $-2$ is in the graph of $G$, only $(-2, 5)$. That is because when you plug $-2$ into the formula ${{x}^{2}}+2x+5$, you get $5$ and nothing else. Of course, $(0, 5)$ is in the graph, but that does not contradict the functional property. $(0, 5)$ and $(-2, 5)$ have the same second coordinate, but that is OK.

Modern mathematical definition of function

A function $f$ is a mathematical structure consisting of the following objects:

  • A set called the domain of $f$, denoted by $\text{dom} f$.
  • A set called the codomain of $f$, denoted by $\text{cod} f$.
  • A set of ordered pairs called the graph of $f$, with the following properties:
      • $\text{dom} f$ is the set of all first coordinates of pairs in the graph of $f$.
      • Every second coordinate of a pair in the graph of $f$ is in $\text{cod} f$ (but $\text{cod} f$ may contain other elements).
      • The graph of $f$ has the functional property.

Using arrow notation, this implies that $f:A\to B$, where $A=\text{dom} f$ and $B=\text{cod} f$.

Remark

The main difference between the specification of function given previously and this definition is that the definition replaces the statement "$f$ has a value at $a$" by introducing a set of ordered pairs (the graph) with the functional property.

  • This set of ordered pairs is extra structure introduced by the definition mainly in order to make the definition a classical sets-with-structure, which makes the graph, which should be a concept derived from the concept of function, into an apparently necessary part of the function.
  • That suggests incorrectly that the graph is more of a primary intuition than other intuitions such as function as relocator, function as transformer, and other points of view discussed in the article Intuitions and metaphors for functions.

Examples

  • Let $F$ have graph $\{(1,2), (2,4), (3,2), (5,8)\}$ and define $A = \{1, 2, 3, 5\}$ and $B = \{2, 4, 8\}$. Then $F:A\to B$ is a function. In speaking, we would usually say, "$F$ is a function from $A$ to $B$."
  • Let $G$ have graph $\{(1,2), (2,4), (3,2), (5,8)\}$ (same as above), and define $A = \{1, 2, 3, 5\}$ and $C = \{2, 4, 8, 9, 11, \pi, 3/2\}$. Then $G:A\to C$ is an (admittedly ridiculous) function. Note that all the second coordinates of the graph are in $C$, along with a bunch of miscellaneous suspicious characters that are not second coordinates of pairs in the graph.
  • Let $H$ have graph $\{(1,2), (2,4), (3,2), (5,8)\}$. Then $H:A\to \mathbb{R}$ is a function, since $2$, $4$ and $8$ are all real numbers.
  • Let $D = \{1, 2, 5\}$ and $E = \{1, 2, 3, 4, 5\}$. Then there is no function $D\to A$ and no function $E\to A$ with graph $\{(1,2), (2,4), (3,2), (5,8)\}$. Neither $D$ nor $E$ has exactly the same elements as the first coordinates of the graph.
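For finite sets, the three conditions in the modern definition can be tested directly. This is a sketch in Python, with `is_function` as a made-up name. It covers only finite domains and codomains, so the example with codomain $\mathbb{R}$ is left out, and the last two checks use $B$ as codomain so that only the domain condition is being tested.

```python
import math

def is_function(domain, codomain, graph):
    """Check the three conditions of the definition for finite sets."""
    firsts = {x for x, _ in graph}
    seconds = {y for _, y in graph}
    # Each first coordinate occurs in exactly one pair.
    functional = len(firsts) == len(set(graph))
    return firsts == set(domain) and seconds <= set(codomain) and functional

graph = {(1, 2), (2, 4), (3, 2), (5, 8)}
A = {1, 2, 3, 5}
B = {2, 4, 8}
C = {2, 4, 8, 9, 11, math.pi, 3 / 2}
D = {1, 2, 5}
E = {1, 2, 3, 4, 5}

print(is_function(A, B, graph))   # F: A -> B
print(is_function(A, C, graph))   # G: A -> C, with extra codomain elements
print(is_function(D, B, graph))   # D is missing the first coordinate 3
print(is_function(E, B, graph))   # E contains 4, which is not a first coordinate
```

The first two calls use the same graph but describe different functions, because the codomains differ.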

Identity and inclusion

Suppose we have two sets $A$ and $B$ with $A\subseteq B$.

  • The identity function on $A$ is the function ${{\operatorname{id}}_{A}}:A\to A$ defined by ${{\operatorname{id}}_{A}}(x)=x$ for all $x\in A$. (Many authors call it ${{1}_{A}}$.)
  • When $A\subseteq B$, the inclusion function from $A$ to $B$ is the function $i:A\to B$ defined by $i(x)=x$ for all $x\in A$. Note that there is a different function for each pair of sets $A$ and $B$ for which $A\subseteq B$. Some authors call it ${{i}_{A,\,B}}$ or $\text{in}{{\text{c}}_{A,\,B}}$.

The identity function and an inclusion function for the same set $A$ have exactly the same graph, namely $\left\{ (a,a)|a\in A \right\}$. More about this below.
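A small Python sketch makes the point concrete. Here a function is recorded as a triple (domain, codomain, graph), following the modern definition; the particular sets `A` and `B` are hypothetical examples of my own.

```python
A = {1, 2, 3}
B = {1, 2, 3, 4, 5}

# Each function is recorded as (domain, codomain, graph).
identity_A = (frozenset(A), frozenset(A), frozenset((a, a) for a in A))
inclusion_AB = (frozenset(A), frozenset(B), frozenset((a, a) for a in A))

print(identity_A[2] == inclusion_AB[2])   # True: the graphs are the same set
print(identity_A == inclusion_AB)         # False: the codomains differ
```

So under the modern definition the graph alone does not tell you which function you have.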

Other definitions of function

Original abstract definition of function

Definition

Remarks

Possible confusion

Some confusion can result because of the presence of these two different definitions.

Multivalued function

Some older mathematical papers in com­plex func­tion theory do not tell you that their functions are multi­valued. There was a time when com­plex func­tion theory was such a Big Deal in research mathe­matics that the phrase "func­tion theory" meant complex func­tion theory and all the cogno­scenti knew that their functions were multi­valued.

The phrase multivalued function refers to an object that is like a function $f:S\to T$ except that for $s\in S$, $f(s)$ may denote more than one value.

Examples

  • Multivalued functions arose in considering complex functions. In common practice, the symbol $\sqrt{4}$ denotes $2$, although $-2$ is also a square root of $4$. But in complex function theory, the square root function takes on both the values $2$ and $-2$. This is discussed in detail in Wikipedia.
  • The antiderivative is an example of a multivalued operator. For any constant $C$, $\frac{x^3}{3}+C$ is an antiderivative of $x^2$.

A multivalued function $f:S\to T$ can be modeled as a function with domain $S$ and codomain the set of all subsets of $T$. The two meanings are equivalent in a strong sense (naturally equivalent). Even so, it seems to me that they represent two different ways of thinking about multivalued functions. ("The value may be any of these things…" as opposed to "The value is this whole set of things.")
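The "value is this whole set of things" point of view is easy to sketch in Python: the multivalued square root becomes an ordinary function whose value is a set. The name `square_roots` is my own; `cmath.sqrt` returns the principal square root.

```python
import cmath

def square_roots(z):
    """Return the set of all complex square roots of z."""
    r = cmath.sqrt(z)   # the principal square root
    return {r, -r}

print(square_roots(4))    # a set with two elements, 2 and -2
print(square_roots(-1))   # a set with two elements, i and -i
print(square_roots(0))    # a set with one element
```

Each input now determines exactly one output (a set), so the functional property holds for the set-valued version.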

The phrases "multivalued function" and "partial function" upset some picky types who say things like, "But a multi­valued func­tion is not a func­tion!". A step­mother is not a mother, either. See the Hand­book article on radial category.

Partial function

A partial function $f:S\to T$ is just like a function except that it may be defined on only a subset of $S$. For example, the function $f(x)=\frac{1}{x}$ is a partial function from the real numbers to the real numbers, since it is not defined at $0$.

This models the behavior of computer programs (algorithms): if you consider a program with one input and one output as a function, it may not be defined on some inputs because for them it runs forever (or gives an error message).
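Here is one common way to represent a partial function in a program, sketched in Python. Returning `None` for inputs where the function is undefined is a convention I am assuming for the illustration; raising an exception is another.

```python
from typing import Optional

def reciprocal(x: float) -> Optional[float]:
    """Partial function on the reals: undefined at 0, signalled here by None."""
    if x == 0:
        return None
    return 1 / x

inputs = [-2.0, 0.0, 4.0]
defined_at = [x for x in inputs if reciprocal(x) is not None]
print(defined_at)          # the inputs where the partial function has a value
print(reciprocal(4.0))
```

Restricted to the inputs in `defined_at`, `reciprocal` is an ordinary (total) function.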

In some texts in computing science and mathematical logic, a function is by convention a partial function, and this fact may not be mentioned explicitly, especially in research papers.

New approaches to functions

All the definitions of function given here produce mathematical structures, using the traditional way to define mathematical objects in terms of sets. Such definitions have disadvantages.

Mathematicians have many ways to think about functions. That a function is a set of ordered pairs with a certain property (functional) and possibly some ancillary ideas (domain, codomain, and others) is not the way we usually think about them $\ldots$ except when we need to reduce the thing we are studying to its absolutely most abstract form to make sure our proofs are correct. That most abstract form is what I have called the rigorous view or the dry bones and it is when that reasoning is needed that the sets-with-structure approach has succeeded.

Our practice of abstraction has led us to new approaches to talking about functions. The most important one currently is category theory. Roughly, a category is a bunch of objects together with some arrows going between them that can be composed head to tail. Functions between sets are examples of this: the sets are the objects and the functions the arrows.

This abstracts the idea of function in a way that brings out common ideas in various branches of math. Research papers in many branches of mathematics now routinely use the language of category theory. Categories now appear in some undergraduate math courses, meaning that Someone needs to write a chapter on category theory for abstractmath.org.

Besides category theory, computing scientists have come up with other abstract ways of dealing with functions, for example type theory. It has not come as far along as category theory, but has shown recent signs of major progress.

Both category theory and type theory define math objects in terms of their effect on and relationship with other math objects. This makes it possible to do abstract math entirely without using sets-with-structure as a means of defining concepts.

 


The power of being naive

The interactive examples in this post require installing Wolfram CDF player, which is free and works on most desktop computers using Firefox, Safari and Internet Explorer, but not Chrome. The source code is the Mathematica Notebook MM Def Deriv.nb, which is available for free use under a Creative Commons Attribution-ShareAlike 2.5 License. See How to manipulate the diagrams for more information on what you can do with them. The notebook can be read by CDF Player if you cannot make the embedded versions in this post work.

Learning about the derivative as a concept

The derivative $f'(x)$ of $f(x)$ is the function whose value at $a$ is the slope of the line tangent to the graph $y=f(x)$ at the point $(a,f(a))$.

To gain understanding of the concept of derivative the student needs to see and play with the pictures that illustrate the definition. This can be done in stages:

  • Give an intuitive, pictorial explanation of the tangent line.
  • Show in pictures what the slope of a line is.
  • Show in pictures how you can approximate the tangent line with secant lines.

Of course, many teachers and textbooks do this. I propose that:

The student will benefit in the long run by spending a whole class session on the intuitive ideas I just described and doing a homework set based only on intuition. Then you can start doing the algebraic stuff.

This post provides some ideas about manipulable diagrams that students can play with to gain intuition about derivatives. Others are possible. There are many on the Mathematica Demonstrations website. There are others written in Java and other languages, but I don't know of a site that tries to collect them in one place.

My claim that the student will benefit in the long run is not something I can verify, since I no longer teach.

Present the tangent line conceptually

The tangent line to a curve

  • is a straight line that touches the curve at a point on the curve,
  • and it goes in the same direction that the curve is going, like the red line in the picture below. (See How to manipulate the diagrams.)

 

My recommendation is that you let the students bring up some of the fine points.

  • The graph of $y=x^3-x$ has places where the tangent line cuts the curve at another point without being parallel to the curve there. Move the slider to find these places.
  • The graph of $y=\cos(\pi x)$ has places where the same line is tangent at more than one point on the curve. (This may require stepping the slider using the incrementers.)
  • Instigate a conversation about the tangent line to a given straight line.
  • My post Tangents has other demos intended to bother the students.
  • Show the unit circle with some tangent lines and make them stare at it until they notice something peculiar.
  • "This graph shows the tangent line but how do you calculate it?" You can point out that if you draw the curve carefully and then slide a ruler around it so that it is tangent at the point you are interested in, then you can draw the tangent carefully and measure the rise and run with the ruler. This is a perfectly legitimate way to estimate the value of the slope there.

Slope of the tangent line conceptually

This diagram shows the slope of the tangent line as height over width.

  • Slide the $x$ slider back and forth. The width does not change. The height is measured from the tangent line to the corner, so the height does change; in particular, it changes sign appropriately.
  • This shows that the standard formula for the derivative of the curve gives the same value as the calculated slope of the tangent. (If you are careful you can find a place where the last decimal places differ.) You may want to omit the "derivative value" info line, but most students in college calculus already know how to calculate the formulas for the derivative of a polynomial, or you can just tell them what it is in this case and promise to show how to calculate the formula later.
  • Changing the width while leaving $x$ fixed does not change the slope of the tangent line (up to roundoff error).
  • In fact I could add another parameter that allows you to calculate height over width at other places on the tangent line. But that is probably excessive. (You could do that in a separate demo that shows the basic property that the slope of a straight line does not change depending on where you measure it; that is what a curve being a straight line means.)
  • This graph provides a way to estimate the slope, but does not suggest a way to come up with a formula for the slope, in other words, a formula for the derivative.

Conceptual calculation of the slope

This diagram shows how to calculate the value of the slope at a point using secant lines to approximate the tangent line. If you have a formula for the function, you can calculate the limit of the slope of the secant line and get a formula for the derivative.

 

  • The function is $f(x)=x^3-x$.
  • The secant points are $(x-h,f(x-h))$ and $(x+h, f(x+h))$. $h$ is called "width" in the diagram.
  • Moving $x$ with the slider shows how the tangent line and secant line have similar slopes.
  • Moving the width to the left, to $0$ (almost), makes the secant line coincide with the tangent line. So intuitively the limit of the slope of the secant line is the slope of the tangent line.
  • The distance between the secant points is the Euclidean distance. (It may be that including this information does not help, so maybe it should be left out.)
  • The slope of the secant line is $\frac{f(x+h)-f(x-h)}{(x+h)-(x-h)}$ when $h\neq0$. This simplifies to $3x^2+h^2-1$, so the limit when $h\to0$ is $3x^2-1$, which is therefore a formula for the derivative function.
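The simplification of the secant slope to $3x^2+h^2-1$ can be spot-checked with a few numbers. This is only a numerical sanity check in Python (the function name `secant_slope` and the sample values of $x$ and $h$ are my own), not a substitute for doing the algebra.

```python
def f(x):
    return x ** 3 - x

def secant_slope(x, h):
    """Slope of the line through (x-h, f(x-h)) and (x+h, f(x+h)), for h != 0."""
    return (f(x + h) - f(x - h)) / ((x + h) - (x - h))

x = 1.5
for h in (1.0, 0.1, 0.001):
    # The two columns agree up to roundoff, and both approach 3x^2 - 1.
    print(h, secant_slope(x, h), 3 * x ** 2 + h ** 2 - 1)

print(3 * x ** 2 - 1)   # the value of the derivative at x
```

As the width $h$ shrinks, the $h^2$ term disappears and the secant slope settles on the derivative value.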

 

Testing intuitive concepts

Most of the work students do when studying derivatives is to solve some word problems (rate of change, maximization) in which the student is expected to come up with an appropriate function $f(x)$ and then know or find out the formula for $f'(x)$ in the process of solving the problem. In other words there is a heavy emphasis on computation and much less on concept.

The student in the past has had to do very few homework problems that test for understanding the concept. Lately some texts do have problems that test the concept, for example:

This is the graph of a function and its derivative. Which one is the function and which is its derivative?

Concept Prob

Note that the problem does not give you the formula for the function, nor does it have to.

Many variations are possible, all involving calculating parameters directly from the graph:

  • "These are the first and second derivatives of a function. Where (within the bounds of the graph) is the function concave up?"
  • "These are the first and second derivatives of a function. Where (within the bounds of the graph) are its maxima and minima?"
  • "This straight line is the derivative of a function. Show that the function is a quadratic function and measure the slope of the line in order to estimate some of the coefficients of the quadratic."

 

How to manipulate the diagrams

 

  • You can move the sliders back and forth to move to different points on the curve.
  • In the first diagram, you can click on one of the four buttons to see how it works for various curves.
  • The arrow at the upper right makes it run automatically in a not very useful sort of way.
  • The little plus sign below the arrow opens up some other controls and a box showing the value of $a$, including step by step operation (plus and minus signs).
  • If you are using Mathematica, you can enter values into the box, but if you are using CDF Player, you can only manipulate the number using the slider or the plus and minus incrementers.

 


Monads for High School III: Algebras

The interactive examples in this post require installing Wolfram CDF player, which is free and works on most desktop computers using Firefox, Safari and Internet Explorer, but not Chrome. The source code is the Mathematica Notebook MonadAlg.nb, which is available for free use under a Creative Commons Attribution-ShareAlike 2.5 License. The notebook can be read by CDF Player if you cannot make the embedded versions in this post work.

This is a continuation of Monads for high school I and Monads for High School II: Lists. This post covers the concept of algebras for the monad for lists.

Lists

$\textrm{Lists}(S)$ is the set of all lists of finite length whose entries are elements of $S$.

  • $\boxed{2\; 2\; 4}$ is the way I denote the list of length $3$ whose first and second entries are each $2$ and whose third entry is $4$.
  • A list with only one entry, such as $\boxed{2}$, is called a singleton list.
  • The empty list $\boxed{\phantom{2}}$ has no entries.
  • $\textrm{Lists}^*(S)$ is the set of all nonempty lists of finite length whose entries are elements of $S$.
  • $\textrm{Lists}(\textrm{Lists}(S))$ is the set of all lists whose entries are lists with entries from $S$.
  • For example, $\boxed{\boxed{5\; 7}\; \boxed{2\; 12\; 7}}$ and $\boxed{\boxed{5\; 7\; 2\; 12\; 7}}$ are both elements of $\textrm{Lists}^*(\textrm{Lists}^*(\mathbb{Z}))$. The second one is a singleton list!
  • $\boxed{\boxed{\phantom{3}}\; \boxed{2}}$ and $\boxed{\boxed{\phantom{3}}}$ are elements of $\textrm{Lists}^*(\textrm{Lists}(\mathbb{Z}))$.
  • The empty list $\boxed{\phantom{2}}$ is an element of $\textrm{Lists}(\mathbb{Z})$, of $\textrm{Lists}(\textrm{Lists}^*(\mathbb{Z}))$ and of $\textrm{Lists}(\textrm{Lists}(\mathbb{Z}))$. If you have stared at this for more than ten minutes, do something else and come back to it later.

The star notation is used widely in math and computing science to imply that you are including everything except some insignificant shrimp of a thing such as the empty list, the empty set, or $0$. For example, $\mathbb{R}^*$ denotes the set of all nonzero real numbers.

More details about lists are in Monads for High School II: Lists.

Join

The function join (or concatenation) takes two lists and creates a third list. For example, if you join $\boxed{5\; 7}$ to $\boxed{2\; 12\; 7 }$ in that order you get $\boxed{5\; 7\; 2\; 12\; 7}$.

  • I will use this notation: join$\boxed{\boxed{5\; 7}\; \boxed{2\; 12\; 7}}=\boxed{5\; 7\; 2\; 12\; 7}$.
  • This notation means that I am regarding join as a function that takes a two-element list in $\textrm{Lists}(\textrm{Lists}(S))$ to an element of $\textrm{Lists}(S)$.
  • join removes one level of lists
  • join is not commutative: join$\boxed{\boxed{2\; 12\; 7}\; \boxed{5\; 7}}=\boxed{2\; 12\; 7\; 5\; 7}$
  • Join is associative, and as for any associative binary operation, join is defined on any finite list of lists of elements of $S$. So for example, join$\boxed{\boxed{5\; 7}\; \boxed{2\; 12\; 7}\; \boxed{1}}=\boxed{5\; 7\; 2\; 12\; 7\; 1}$.
  • For any single list $\boxed{a\; b\; c}$, join$\boxed{\boxed{a\; b\; c}}=\boxed{a\; b\; c}$. This is required to make the theory work. It is called the oneidentity property.
  • If the empty list $\boxed{\phantom{2}}$ occurs in a list of lists, it disappears when join is applied: join $\boxed{\boxed{2\; 3}\; \boxed{\phantom{2}}\; \boxed{4\; 5\; 6}}=\boxed{2\; 3\; 4\; 5\; 6}$.
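The properties of join listed above can be tried out in a minimal Python sketch. Python is an assumption here (the demos in this post are written in Mathematica), Python lists stand in for the boxed lists, and the function name `join` is chosen only for illustration.

```python
from typing import List, TypeVar

T = TypeVar("T")


def join(list_of_lists: List[List[T]]) -> List[T]:
    """Concatenate the entries of a list of lists, removing one level of lists."""
    result: List[T] = []
    for inner in list_of_lists:
        result.extend(inner)
    return result


print(join([[5, 7], [2, 12, 7]]))       # [5, 7, 2, 12, 7]
print(join([[2, 12, 7], [5, 7]]))       # [2, 12, 7, 5, 7]  (not commutative)
print(join([[5, 7], [2, 12, 7], [1]]))  # [5, 7, 2, 12, 7, 1]
print(join([["a", "b", "c"]]))          # ['a', 'b', 'c']  (oneidentity)
print(join([[2, 3], [], [4, 5, 6]]))    # [2, 3, 4, 5, 6]  (empty list disappears)
```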

More details about join are in Monads for High School II: Lists.

The main monad diagram

When you have a list of lists of lists, join can be applied in two different ways, "inside" and "outside" as illustrated in the diagram below. It gives you several different inputs to try out as a way to understand what is happening.

This is the special case of the main diagram for all monads as it applies to the List monad.

As you can see, after doing either of "inside" and "outside", if you then apply join, you get the same list. That list is simply the list of entries in the beginning list (and the two intermediate ones) in the same order, disregarding groupings.

From what I have just written, you must depend on your pattern recognition abilities to learn what inside and outside mean. But both can also be described in words.

  • The lists outlined in black are lists of elements of $\mathbb{Z}$. In other words, they are elements of $\textrm{Lists}(\mathbb{Z})$.
  • The lists outlined in blue are lists of elements of $\textrm{Lists}(\mathbb{Z})$. In other words, they are lists of lists of elements of $\mathbb{Z}$. Those are the kinds of things you can apply join to.
  • The leftmost list in the diagram, outlined in green, is a list in $\textrm{Lists}(\textrm{Lists}(\textrm{Lists}(\mathbb{Z})))$: a list of lists of lists of integers. This means you can apply join in two different ways:
    • Each list boxed in blue is a list of lists of integers (two of them are singletons!) so you can apply join to each of them. This is joining inside first.
    • You can apply join directly to the leftmost list, which is a list of lists (of lists, but forget that for the moment), so join concatenates the blue lists. This is joining outside first.
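The two routes can also be checked in a short Python sketch. The names `join_inside` and `join_outside` and the sample input are made up for illustration; they are not taken from the interactive diagram.

```python
def join(list_of_lists):
    """Concatenate the entries of a list of lists (removes one level of lists)."""
    return [entry for inner in list_of_lists for entry in inner]


def join_inside(list_of_lists_of_lists):
    """Apply join to each entry (each 'blue' list) of the outermost list."""
    return [join(blue) for blue in list_of_lists_of_lists]


def join_outside(list_of_lists_of_lists):
    """Apply join to the outermost ('green') list itself."""
    return join(list_of_lists_of_lists)


start = [[[1, 2], [3]], [[4]], [[5, 6], [], [7]]]

inside = join_inside(start)    # [[1, 2, 3], [4], [5, 6, 7]]
outside = join_outside(start)  # [[1, 2], [3], [4], [5, 6], [], [7]]

# Either route followed by join gives the entries in order, groupings forgotten.
print(join(inside))   # [1, 2, 3, 4, 5, 6, 7]
print(join(outside))  # [1, 2, 3, 4, 5, 6, 7]
```

The intermediate lists differ, but the final lists agree, which is what the diagram asserts.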

To understand this diagram, most people do best by staring at it first, letting the visual pattern recognition part of the brain (which uses over a fifth of the energy used by your brain) work out what inside and outside mean, and then checking that understanding by reading the verbal description. Starting with the verbal description does not work as well for most people.

The unit monad diagram

There is a second diagram, the unit diagram, for all monads:

The two right hand entries are always the same. Again, I am asking you to use your pattern recognition abilities to learn what singleton list and singleton each mean.
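Here is one possible reading of the unit diagram as a Python sketch. It assumes that "singleton list" wraps the whole list in one more box and that "singleton each" wraps every entry in its own box; the function names are chosen for illustration.

```python
def join(list_of_lists):
    """Concatenate the entries of a list of lists (removes one level of lists)."""
    return [entry for inner in list_of_lists for entry in inner]


def singleton_list(entries):
    """Wrap the whole list as the single entry of a new list."""
    return [entries]


def singleton_each(entries):
    """Replace every entry by the singleton list containing it."""
    return [[entry] for entry in entries]


xs = [5, 7, 2]
print(singleton_list(xs))        # [[5, 7, 2]]
print(singleton_each(xs))        # [[5], [7], [2]]
print(join(singleton_list(xs)))  # [5, 7, 2]
print(join(singleton_each(xs)))  # [5, 7, 2]
```

Under this reading, joining after either operation gives back the list you started with.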

The main and unit monad diagrams will be used as axioms to give the general definition of monad. To give those axioms, we also need the concepts of functor and natural transformation, which I will define later after I have finished the monad algebra diagrams for Lists and several other examples.

Algebras for the List monad

If you have any associative binary operation on a set $S$, its definition can be extended to any nonempty list of elements (see Monads for High School I).

Plus and Times are like that:

  • $(3+2)+4$ and $3+(2+4)$ have the same value $9$, so you can write $3+2+4$ and it means $9$ no matter how you calculate it.
  • I will be using the notation Plus$\boxed{3\; 2\; 4}$ instead of $3+2+4$.
  • Times is also associative, so for example we can write Times$\boxed{3\; 2\; 4}=24$.
  • As with join, we require that these operations satisfy oneidentity, so we know Plus$\boxed{3}=3$ and Times$\boxed{3}=3$.
  • When the associative binary operation has an identity element, you can also define its value on the empty list as the identity element: Plus$\boxed{\phantom{3}}=0$ and Times$\boxed{\phantom{3}}=1$. I recommend that you experiment with examples to see why it works.
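One way to experiment is with a small Python sketch. The functions `plus` and `times` below are stand-ins for Plus and Times, built with `functools.reduce`; the starting values `0` and `1` are the identity elements, which is what makes the empty list work.

```python
from functools import reduce
import operator


def plus(numbers):
    """Plus on a list of numbers; the empty list gives the identity element 0."""
    return reduce(operator.add, numbers, 0)


def times(numbers):
    """Times on a list of numbers; the empty list gives the identity element 1."""
    return reduce(operator.mul, numbers, 1)


print(plus([3, 2, 4]), times([3, 2, 4]))  # 9 24
print(plus([3]), times([3]))              # 3 3   (oneidentity)
print(plus([]), times([]))                # 0 1

# Why the identity is the right value on the empty list: tacking an empty
# list onto [3, 2] must not change the result.
print(plus([plus([3, 2]), plus([])]) == plus([3, 2]))      # True
print(times([times([3, 2]), times([])]) == times([3, 2]))  # True
```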

An algebra for the List monad is a function algop:$\textrm{Lists}(S)\to S$ with certain properties: It must satisfy the Main Monad Algebra Diagram and the Unit Monad Algebra Diagram, discussed below.

The main monad algebra diagram

Example using Plus and Times

The following interactive diagram allows you to see what happens with Plus and Times. Afterwards, I will give the general definition.

Plus insides replaces each inside list with the result of applying Plus to it, and the other operation Join is the same operation I have used before.

Another example

The main monad algebra diagram requires the following: if you have a list of lists of numbers such as the one below, and you add up each list (Plus insides) and then add up the list of totals (top list in the diagram), you must get the same answer that you get when you join all the lists of numbers together into one list (bottom list in the diagram) and then add up that list.
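The same requirement can be checked in a few lines of Python. The input list is made up for illustration and is not the one in the diagram; `plus_insides` is a hypothetical name for "Plus insides".

```python
def join(list_of_lists):
    """Concatenate the entries of a list of lists (removes one level of lists)."""
    return [entry for inner in list_of_lists for entry in inner]


def plus(numbers):
    """Plus applied to a list of numbers."""
    return sum(numbers)


def plus_insides(list_of_lists):
    """Replace each inside list by the result of applying Plus to it."""
    return [plus(inner) for inner in list_of_lists]


lists = [[1, 2], [3, 4, 5], [6]]

top = plus_insides(lists)  # [3, 12, 6]
bottom = join(lists)       # [1, 2, 3, 4, 5, 6]

print(plus(top), plus(bottom))  # 21 21
```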

This is illustrated by this special case of the main monad algebra diagram for Plus:

General statement of the main monad algebra diagram

Suppose we have any function $\blacksquare$ $:\textrm{Lists}(S)\to S$ for any set $S$.
If we want to give the main monad algebra diagram for $\blacksquare$, we have a problem. We know for example that Plus$\boxed{1\; 2}=3$. But for some elements $a$ and $b$ of $S$, we don’t know what $\blacksquare\boxed{a\; b}$ is. One way to write it is simply to write $\blacksquare\boxed{a\; b}$ (the usual way we write a function). Or we could use tree notation and write

newalopdouble.

I will use tree notation mostly, but it is a good exercise to redraw the diagrams with functional notation.

Main monad diagram in prose

Below is a presentation of the general main monad algebra diagram using (gasp!) English phrases to describe the nodes.

genalgdiag

The unit monad algebra diagram

Suppose $\blacksquare$ is any function from $\textrm{Lists}(S)$ to $S$ for any set $S$. Then the diagram is

UnitMAdiag

This says that if you apply $\blacksquare$ to a singleton you get the unique entry of the singleton. This is not surprising: I defined above what it means when you apply an operation to a singleton just so this would happen!

A particular example

These are specific examples of the general main monad algebra diagram for an arbitrary operation $\blacksquare$:

stalgdiagleft

staldiagright

These examples show that if $\blacksquare$ is any function from $\textrm{Lists}(S)$ to $S$ for any set $S$, then

newalopleft

equals

newaloptriple

and

newalopright

equals

newaloptriple

Well, according to some ancient Greek guy, that means

newalopleft

equals

newalopright

which says that
newalopdouble
is an associative binary operation!
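The same argument can be sketched in functional notation rather than trees. This assumes that the pictures above show the trees for $\blacksquare\boxed{\blacksquare\boxed{a\; b}\;\; c}$, $\blacksquare\boxed{a\; b\; c}$ and $\blacksquare\boxed{a\;\; \blacksquare\boxed{b\; c}}$. Start with the list of lists $\boxed{\boxed{a\; b}\; \boxed{c}}$. The unit monad algebra diagram gives $\blacksquare\boxed{c}=c$, and the main monad algebra diagram says that applying $\blacksquare$ to the insides and then applying $\blacksquare$ gives the same value as joining and then applying $\blacksquare$:
\[\blacksquare\boxed{\blacksquare\boxed{a\; b}\;\; c}=\blacksquare\boxed{\blacksquare\boxed{a\; b}\;\; \blacksquare\boxed{c}}=\blacksquare\left(\textrm{join}\,\boxed{\boxed{a\; b}\; \boxed{c}}\right)=\blacksquare\boxed{a\; b\; c}\]
Starting instead with $\boxed{\boxed{a}\; \boxed{b\; c}}$ gives
\[\blacksquare\boxed{a\;\; \blacksquare\boxed{b\; c}}=\blacksquare\boxed{\blacksquare\boxed{a}\;\; \blacksquare\boxed{b\; c}}=\blacksquare\left(\textrm{join}\,\boxed{\boxed{a}\; \boxed{b\; c}}\right)=\blacksquare\boxed{a\; b\; c}\]
Writing $x\,\blacksquare\,y$ for $\blacksquare\boxed{x\; y}$, the two lines together say that $(a\,\blacksquare\,b)\,\blacksquare\,c=a\,\blacksquare\,(b\,\blacksquare\,c)$.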

The mother of all associative operations

We also know that any associative binary operation $\blacksquare$ on any set $S$ can be extended to a function on all finite nonempty lists of elements of $S$. This is the general associative law and was discussed (without using that name) in Monads for High School I.

Let’s put what we’ve done together into one statement:

Every associative binary operation $\blacksquare$ on a set $S$ can be extended uniquely to a function $\blacksquare:\textrm{Lists}^*(S)\to S$ that satisfies both the main monad algebra diagram and the unit monad algebra diagram. Furthermore, any function $\blacksquare:\textrm{Lists}^*(S)\to S$ that satisfies both the main monad algebra diagram and the unit monad algebra diagram is an associative binary operation when applied to lists of length $2$ of elements of $S$.
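The extension in the first half of that statement can be sketched in Python as a left-to-right fold. This is an illustration on sample inputs, not a proof; `extend` and the sample operations are hypothetical names chosen here.

```python
from functools import reduce
import operator


def join(list_of_lists):
    """Concatenate the entries of a list of lists (removes one level of lists)."""
    return [entry for inner in list_of_lists for entry in inner]


def extend(op):
    """Extend a binary operation to nonempty lists by folding from the left."""
    def extended(entries):
        if not entries:
            raise ValueError("defined only on nonempty lists")
        return reduce(op, entries)
    return extended


def satisfies_main_diagram(alg, list_of_lists):
    """Check 'alg insides, then alg' against 'join, then alg' on one input."""
    return alg([alg(inner) for inner in list_of_lists]) == alg(join(list_of_lists))


# String concatenation is associative (but not commutative).
concat = extend(operator.add)
words = [["ab", "c"], ["de"], ["f", "g", "h"]]
print(concat(["ab"]))                         # ab   (unit diagram)
print(satisfies_main_diagram(concat, words))  # True

# Subtraction is not associative, and its fold fails the main diagram.
minus = extend(operator.sub)
print(satisfies_main_diagram(minus, [[6], [4, 1]]))  # False
```

The subtraction example fails because folding the insides gives $6-(4-1)=3$ while joining first gives $(6-4)-1=1$.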

That is why I claim that the NonemptyList monad is the mother of all associative binary operations.

I have not proved this, but the work in this and preceding posts provides (I think) a good intuitive understanding of this fundamental relationship between lists and associative binary operations.

Things to do in upcoming posts

  • I have to give a proper definition of monads using the concepts of functor and natural transformation. I expect to do this just for set functors, not mentioning categories.
  • Every type of binary operation that is defined by equations corresponds to a monad which is the mother of all binary operations of that type. I will give examples, but not prove the general case.

Other examples of monads

  • Associative binary operations on $S$ with identity element (monoids) correspond to all lists, including the empty list, with entries from $S$.
  • Commutative, associative and idempotent binary operations, like "and" and "or" in Boolean algebra, correspond to the set monad: $\text{Sets}(S)$ is the set of all finite and countably infinite sets of elements of $S$. (You can change the cardinality restrictions, but you have to have some cardinality restrictions.) Join is simply union.
  • Commutative and associative binary operations correspond to the multiset monad (with a proper definition of join) and appropriate cardinality restrictions. You have to fuss about identity elements here, too.
  • Various kinds of nonassociative operations get much more complicated, involving tree structures with equivalence relations on them. I expect to work out a few of them.
  • There are lots of monads in computing science that you never heard of (unless you are a computing scientist). I will mention a few of them.

  • Every type of binary operation defined by equations corresponds to a monad. But some of them are unsolvable, meaning you cannot describe the monad precisely.
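As a small illustration of the set monad example, here is a Python sketch restricted to finite sets, with "or" as the commutative, associative and idempotent operation. The names `join_sets`, `or_all` and `or_insides` are made up for this sketch, and `frozenset` is used because Python sets cannot contain ordinary mutable sets.

```python
def join_sets(set_of_sets):
    """Union of a finite set of finite sets: removes one level of sets."""
    return frozenset().union(*set_of_sets)


def or_all(truth_values):
    """'Or' applied to a finite set of truth values (False on the empty set)."""
    return any(truth_values)


def or_insides(set_of_sets):
    """Replace each inside set by the result of applying 'or' to it."""
    return frozenset(or_all(inner) for inner in set_of_sets)


numbers = frozenset({frozenset({1, 2}), frozenset({2, 3})})
print(sorted(join_sets(numbers)))  # [1, 2, 3]

sets = frozenset({frozenset({True, False}), frozenset({False}), frozenset()})
print(or_all(or_insides(sets)) == or_all(join_sets(sets)))  # True
```

The repeated $2$ in the first example disappears in the union, which matches the idempotence of the operation.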

There will probably be a long delay before I get back to this project. There are too many other things I want to do!


Explaining math

The interactive examples in this post require installing Wolfram CDF player, which is free and works on most desktop computers using Firefox, Safari and Internet Explorer, but not Chrome. The source code is the Mathematica Notebook SolvEq.nb, which is available for free use under a Creative Commons Attribution-ShareAlike 2.5 License. The notebook can be read by CDF Player if you cannot make the embedded versions in this post work.

This post explains some basic distinctions that need to be made about the process of writing and explaining math.  Everyone who teaches math knows subconsciously what is happening here; I am trying to raise your consciousness.  For simplicity, I have chosen a technique used in elementary algebra, but much of what I suggest also applies to more abstract college level math.

An algebra problem

Solve the equation "$ax=b$" ($a\neq0$).

Understanding the statement of this problem requires a lot of Secret Knowledge (the language of ninth grade algebra) that most people don't have.

  • The expression "$ax$" means that $a$ and $x$ are numbers and $ax$ is their product. It is not the word "ax". You have to know that writing two symbols next to each other means multiply them, except when it doesn't mean multiply them as in "$\sin\,x$".

  • The whole expression "$ax=b$" ostensibly says that the number $ax$ is the same number as $b$.  In fact, it means more than that. The phrase "solve the equation" tells you that in fact you are supposed to find the value of $x$ that makes $ax$ the same number as $b$.

  • How do you know that "solve the equation" doesn't mean find the value of $a$ that makes $ax$ the same number as $b$? Answer: The word "solve" triggers a convention that $x$, $y$ and $z$ are numbers you are trying to find and $a$, $b$, $c$ stand for numbers that you are allowed to plug in to the equation.

  • The conventions of symbolic math require that you give a solution for any nonzero value of $a$ and any value of $b$.  You specifically are not allowed to pick $a=1$ and $b=33$ and find the value just for those numbers.  (Some college calculus students do this with problems involving literal coefficients.)

  • The little thingy "$(a\neq0)$" must be read as a constraint on $a$.  It does not mean that $a\neq0$ is a fact that you ought to know. (I've seen college math students make this mistake, admittedly in more complex situations.) Nor does it mean that you can't solve the problem if $a=0$ (you can if $b$ is also zero!).

So understanding what this problem asks, as given, requires pattern recognition (fairly sophisticated in some cases), both to understand the symbolic language it uses and to understand the special conventions of the mathematical English that it uses.

Explicit descriptions

This problem could be reworded so that it gives an explicit description of the problem, not requiring pattern recognition.  (Warning: "Not requiring pattern recognition" is a fuzzy concept.)  Something like this:  

You have two numbers $a$ and $b$.  Find a number $c$ for which if you multiply $a$ by $c$ you get $b$.

This version is not completely explicit.  It still requires understanding the idea of referring to a number by a letter, and it still requires pattern recognition to catch on that the repeated occurrences of each letter mean that their meanings have to match. Also, I know from experience that some American first year college students have trouble with the syntax of the sentence ("for which…", "if…").

The following version is more explicit, but it cheats by creating an ad hoc way to distinguish the numbers.

Alice and Bob each give you a number.  How do you find a number with the property that Alice's number times your number is equal to Bob's number? 

If the problem had a couple more variables it would be so difficult to understand in an explicit form that most people would have to draw a picture of the relationships between them.  That is why algebraic notation was invented.

Visual descriptions

Algebra is a difficult foreign language.  Showing the problem visually makes it easier to understand for most people. Our brain's visual processing unit is the most powerful tool the brain has to understand things.  There are various ways to do this.  

Visualization can help someone understand algebraic notation better.  

You can state the problem by producing examples such as

  • $\boxed{3}\times\boxed{\text{??}}=\boxed{6}$ 
  • $\boxed{5}\times\boxed{\text{??}}=\boxed{2}$ 
  • $\boxed{42}\times\boxed{\text{??}}=\boxed{24}$

where the reader has to know the multiplication symbol and, one hopes, will recognize "$\boxed{\text{??}}$" as "What's the value?". But the reader does not have to understand what it means to use letters for numbers, or that "$x$ means you are supposed to discover what it is".  This way of writing an algebra problem is used in some software aimed at K-12 students.  Some of them use a blank box instead of "$\boxed{\text{??}}$".

Such software often shows the algorithm for solving the problem visually, using algebraic notation like this:

I have put in some buttons to show numbers as well as $a$ and $b$.  If you have access to Mathematica instead of just to CDF player, you can load SolvEq.nb and put in any numbers you want, but CDFs don't allow input data. 
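For readers without Mathematica, the algorithm itself (divide both sides by $a$) can be sketched in a few lines of Python. This is an illustration only and is not the code in SolvEq.nb; the function name `solve_linear` is made up, and exact fractions are used so that answers such as $2/5$ are not rounded.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Return the number x with a * x == b, for integers a and b with a nonzero."""
    if a == 0:
        raise ValueError("the constraint a != 0 is part of the problem")
    return Fraction(b, a)  # dividing both sides of ax = b by a


# The three boxed examples from earlier in this post.
for a, b in [(3, 6), (5, 2), (42, 24)]:
    x = solve_linear(a, b)
    print(a, "times", x, "equals", a * x)
```

The function accepts any nonzero $a$ and any $b$, matching the convention that a solution must work for all allowed values of the literal coefficients.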

You can also illustrate the algorithm using the tree notation for algebra I used in Monads for high school I  (and other posts). The demo below shows how to depict the value-preserving transformation given by the algorithm.  (In this case the value is the truth since the root operation is equals.)

This demo is not as visually satisfactory as the one illustrating the use of the associative law in Monads for high school I.  For one thing, I had to cheat by reversing the placement of $a$ and $x$.  Note that I put labels for the numerator and denominator legs, a practice I have been using in demos for a while for noncommutative operations.  I await a new inspiration for a better presentation of this and other equation-solving algorithms.

Another advantage of using pictures is that you can often avoid having to code things as letters which then have to be remembered.  In Monads for high school I, I used drawings of the four functions from a two-element set to itself instead of assigning them letters.  Even mnemonic letters such as $s$ for "switch" and $\text{id}$ for the identity element carry a burden that the picture dispenses with.


Naming mathematical objects

Commonword names confuse

Many technical words and phrases in math are ordinary English words ("commonwords") that are assigned a different and precisely defined mathematical meaning.  

  • Group  This sounds to the "layman" as if it ought to mean the same thing as "set".  You get no clue from the name that it involves a binary operation with certain properties.  
  • Formula  In some texts on logic, a formula is a precisely defined expression that becomes a true-or-false sentence (in the semantics) when all its variables are instantiated.  So $(\forall x)(x>0)$ is a formula.  The word "formula" in ordinary English makes you think of things like "$\textrm{H}_2\textrm{O}$", which has no semantics that makes it true or false — it is a symbolic expression for a name.
  • Simple group This has a technical meaning: a nontrivial group with no normal subgroups other than itself and the trivial subgroup.  The Monster Group is "simple".  Yes, the technical meaning is motivated by the usual concept of "simple", but to say the Monster Group is simple causes cognitive dissonance.

Beginning students come with the (generally subconscious) expectation that they will pick up clues about the meanings of words from connotations they are already familiar with, plus things the teacher says using those words.  They think in terms of refining an understanding they already have.  This is more or less what happens in most non-math classes.  They need to be taught what definition means to a mathematician.

Names that don't confuse but may intimidate

Other technical names in math don't cause the problems that commonwords cause.

Named after somebody The phrase "Hausdorff space" leads a math student to understand that it has a technical meaning.  They may not even know it is named after a person, but it screams "geek word" and "you don't know what it means".  That is a signal that you can find out what it means.  You don't assume you know its meaning. 

New made-up words  Words such as "affine", "gerbe"  and "logarithm" are made up of words from other languages and don't have an ordinary English meaning.  Acronyms such as "QED", "RSA" and "FOIL" don't occur often.  I don't know of any math objects other than "RSA algorithm" that have an acronymic name.  (No doubt I will think of one the minute I click the Publish button.)  Whole-cloth words such as "googol" are also rare.  All these sorts of words would be good to name new things since they do not fool the readers into thinking they know what the words mean.

Both types of words avoid fooling the student into thinking they know what the words mean, but some students are intimidated by the use of words they haven't seen before.  They seem to come to class ready to be snowed.  A minority of my students over my 35 years of teaching were like that, but that attitude was a real problem for them.

Audience

You can write for several different audiences.

Math fans (non-mathematicians who are interested in math and read books about it occasionally) In my posts Explaining higher math to beginners and Renaming technical concepts I wrote about several books aimed at explaining some fairly deep math to interested people who are not mathematicians.  They renamed some things. For example, Mark Ronan in Symmetry and the Monster used the word "atom" for "simple group", presumably to get around the cognitive dissonance.  There are other examples in my posts.  

Math newbies  (math majors and other students who want to understand some aspect of mathematics).  These are the people abstractmath.org is aimed at. For such an audience you generally don't want to rename mathematical objects. In fact, you need to give them a glossary to explain the words and phrases used by people in the subject area.   

Postsecondary math students These people, especially the math majors, have many tasks:

  • Gain an intuitive understanding of the subject matter.
  • Understand in practice the logical role of definitions.
  • Learn how to come up with proofs.
  • Understand the ins and outs of mathematical English, particularly the presence of ordinary English words with technical definitions.
  • Understand and master the appropriate parts of the symbolic language of math — not just what the symbols mean but how to tell a statement from a symbolic name.

It is appropriate for books for math fans and math newbies to try to give an understanding of concepts without necessarily proving theorems.  That is the aim of much of my work, which has more of an emphasis on newbies than on fans. But math majors also need the traditional emphasis on theorem and proof and clear, correct explanations.

Lately, books such as Visual Group Theory have addressed beginning math majors, trying for much more effective ways to help the students develop good intuition, as well as getting into proofs and rigor. Visual Group Theory uses standard terminology.  You can contrast it with Symmetry and the Monster and The Mystery of the Prime Numbers (read the excellent reviews on Amazon) which are clearly aimed at math fans and use nonstandard terminology.  

Terminology for algebraic structures

I have been thinking about the section of Abstracting Algebra on binary operations.  Notice this terminology:

boptable

The "standard names" are those in Wikipedia.  They give little clue to the meaning, but at least most of them, except "magma" and "group", sound technical, cluing the reader in to the fact that they'd better learn the definition.

I came up with the names in the right column in an attempt to make some sense out of them.  The design is somewhat like the names of some chemical compounds.  This would be appropriate for a text aimed at math fans, but for them you probably wouldn't want to get into such an exhaustive list.

I wrote various pieces meant to be part of Abstracting Algebra using the terminology on the right, but thought better of it. I realized that I have been vacillating between thinking of AbAl as for math fans and thinking of it as for newbies. I guess I am plunking for newbies.

I will call groups groups, but for the other structures I will use the phrases in the middle column.  Since the book is for newbies I will include a table like the one above.  I also expect to use tree notation as I did in Visual Algebra II, and other graphical devices and interactive diagrams.

Magmas

In the sixties magmas were called groupoids or monoids, both of which now mean something else.  I was really irritated when the word "magma" started showing up all over Wikipedia. It was the name given by Bourbaki, but it is a bad name because it means something else that is irrelevant.  A magma is just any binary operation. Why not just call it that?  

Well, I will tell you why, based on my experience in Ancient Times (the sixties and seventies) in math. (I started as an assistant professor at Western Reserve University in 1965). In those days people made a distinction between a binary operation and a "set with a binary operation on it".  Nowadays, the concept of function carries with it an implied domain and codomain.  So a binary operation is a function $m:S\times S\to S$.  Thinking of a binary operation this way was just beginning to appear in the common mathematical culture in the late 60's, and at least one person remarked to me: "I really like this new idea of thinking of 'plus' and 'times' as functions."  I was startled and thought (but did not say), "Well of course it is a function".  But then, in the late sixties I was being indoctrinated/perverted into category theory by the likes of John Isbell and Peter Hilton, both of whom were briefly at Case Western Reserve University.  (Also Paul Dedecker, who gave me a glimpse of Grothendieck's ideas).

Now, the idea that a binary operation is a function comes with the fact that it has a domain and a codomain, and specifically that the domain is the Cartesian square of the codomain.  People who didn't think that a binary operation was a function had to introduce the idea of the universe (universal algebraists) or the underlying set (category theorists): you had to specify it separately and introduce terminology such as $(S,\times)$ to denote the structure.   Wikipedia still does it mostly this way, and I am not about to start a revolution to get it to change its ways.

Groups

In the olden days, people thought of groups in this way:

  • A group is a set $G$ with a binary operation denoted by juxtaposition that is closed on $G$, meaning that if $a$ and $b$ are any elements of $G$, then $ab$ is in $G$.
  • The operation is associative, meaning that if $a,\ b,\ c\in G$, then $(ab)c=a(bc)$.
  • The operation has a unity element, meaning an element $e$ for which for any element $a\in G$, $ae=ea=a$.
  • For each element $a\in G$, there is an element $b$ for which $ab=ba=e$.

This is a better way to describe a group:

  • A group consists of a nullary operation e, a unary operation inv, and a binary operation denoted by juxtaposition, all with the same codomain $G$. (A nullary operation is a map from a singleton set to a set and a unary operation is a map from a set to itself.)
  • The value of e is denoted by $e$ and the value of inv$(a)$ is denoted by $a^{-1}$.
  • These operations are subject to the following equations, true for all $a,\ b,\ c\in G$:

    • $ae=ea=a$.
    • $aa^{-1}=a^{-1}a=e$.
    • $(ab)c=a(bc)$.

This definition makes it clear that a group is a structure consisting of a set and three operations whose axioms are all equations.  It was formulated by people in universal algebra but you still see the older form in texts.
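To make the "set and three operations" view concrete, here is a Python sketch that packages the operations as data and checks the three equations by brute force on a small finite example. The class `Group`, its field names and the checker are hypothetical, and the example (integers mod $5$ under addition) is chosen here for illustration.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class Group:
    elements: FrozenSet[int]       # the underlying set G
    e: int                         # value of the nullary operation
    inv: Callable[[int], int]      # the unary operation
    op: Callable[[int, int], int]  # the binary operation


def satisfies_group_equations(g: Group) -> bool:
    """Check the three equations for all elements of a finite structure."""
    G = g.elements
    identity = all(g.op(a, g.e) == a == g.op(g.e, a) for a in G)
    inverses = all(g.op(a, g.inv(a)) == g.e == g.op(g.inv(a), a) for a in G)
    associative = all(
        g.op(g.op(a, b), c) == g.op(a, g.op(b, c))
        for a, b, c in product(G, repeat=3)
    )
    return identity and inverses and associative


z5 = Group(frozenset(range(5)), 0, lambda a: (-a) % 5, lambda a, b: (a + b) % 5)
print(satisfies_group_equations(z5))  # True

# Same set, but subtraction mod 5 as the binary operation: the equations fail.
not_a_group = Group(frozenset(range(5)), 0, lambda a: a, lambda a, b: (a - b) % 5)
print(satisfies_group_equations(not_a_group))  # False
```

Because every axiom is an equation, the checker needs no search for an identity or for inverses; it only tests the equations on the given operations.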

The old form is not wrong, it is merely inelegant.  With the old form, you have to prove the unity and inverses are unique before you can introduce notation, and more important, by making it clear that groups satisfy equational logic you get a lot of theorems for free: you can construct products on the Cartesian product of the underlying sets, quotients by congruence relations, and other things. (Of course, in AbAl those theorems will be stated later than when groups are defined because the book is for newbies and you want lots of examples before theorems.)

References

  1. Three kinds of mathematical thinkers (G&G post)
  2. Technical meanings clash with everyday meanings (G&G post)
  3. Commonword names for technical concepts (G&G post)
  4. Renaming technical concepts (G&G post)
  5. Explaining higher math to beginners (G&G post)
  6. Visual Algebra II (G&G post)
  7. Monads for high school II: Lists (G&G post)
  8. The mystery of the prime numbers: a review (G&G post)
  9. Hersh, R. (1997a), "Math lingo vs. plain English: Double entendre". American Mathematical Monthly, volume 104, pages 48–51.
  10. Names (in abmath)
  11. Cognitive dissonance (in abmath)