Abstraction and axiomatic systems

Abstraction and the axiomatic method

This post will become an article in abstractmath.org.

Abstraction

An abstraction of a concept $C$ is a concept $C’$ with these properties:

  • $C’$ includes all instances of $C$ and
  • $C’$ is constructed by taking as axioms certain assertions that are true of all instances of $C$.

There are two major situations where abstraction is used in math.

  • $C$ may be a familiar concept or property that has not yet been given a math definition.
  • $C$ may already have a mathematical definition using axioms. In that case the abstraction will be a generalization of $C$.

In both cases, the math definition may allow instances of $C’$ that were not originally thought of as being part of $C$.

Example: Relations

Mathematicians have made use of relations between math objects since antiquity.

  • For real numbers $r$ and $s$, “$r\lt s$” means that $r$ is less than $s$. So the statement “$5\lt 7$” is true, but the statement “$7\lt 5$” is false. We say that “$\lt$” is a relation on the real numbers. Other relations on real numbers denoted by symbols are “$=$” and “$\leq$”.
  • Suppose $m$ and $n$ are positive integers. $m$ and $n$ are said to be relatively prime if the greatest common divisor of $m$ and $n$ is $1$. So $5$ and $7$ are relatively prime, but $15$ and $21$ are not relatively prime. So being relatively prime is a relation on positive integers. This is a relation that does not have a commonly used symbol.
  • The concept of congruence of triangles has been used for a couple of millennia. In recent centuries it has been denoted by the symbol “$\cong$”. Congruence is a relation on triangles.

One could say that a relation is a true-or-false statement that can be made about a pair of math objects of a certain type. Logicians have in fact made that a formal definition. But when set theory came to be used around 100 years ago as a basis for all definitions in math, we started using this definition:

A relation on a set $S$ is a set $\alpha$ of ordered pairs of elements of $S$.

“$\alpha$” is the Greek letter alpha.

The idea is that if $s$ is related by $\alpha$ to $t$, then $(s,t)$ is an element of $\alpha$, and if $s$ is not related by $\alpha$ to $t$, then $(s,t)$ is not an element of $\alpha$. That abstracts the everyday concept of relationship by focusing on the property that a relation either holds or doesn’t hold between two given objects.

For example, the less-than relation on the set of all real numbers $\mathbb{R}$ is the set \[\alpha:=\{(r,s)|r\in\mathbb{R}\text{ and }s\in\mathbb{R}\text{ and }r\lt s\}\] In other words, $r\lt s$ if and only if $(r,s)\in \alpha$.

Example

A consequence of this definition is that any set of ordered pairs is a relation. Example: Let $\xi:=\{(2,3),(2,9),(9,1),(9,2)\}$. Then $\xi$ is a relation on the set $\{1,2,3,9\}$. Your reaction may be: What relation IS it? Answer: just that set of ordered pairs. You know that $2\xi3$ and $2\xi9$, for example, but $9\xi3$ is false. There is no other definition of $\xi$.
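The set-of-ordered-pairs definition can be tried out directly. The following is a minimal sketch, not part of the original article: the helper `related` and the names `less_than` and `relatively_prime` are made up for illustration, and the relations with verbal descriptions are cut down to the finite set $\{1,2,3,9\}$ so that they can be listed.

```python
from math import gcd

# The arbitrary relation xi from the text: nothing more than a set of ordered pairs.
xi = {(2, 3), (2, 9), (9, 1), (9, 2)}

def related(relation, s, t):
    """Return True when s is related to t, i.e. when (s, t) is in the relation."""
    return (s, t) in relation

# Relations that do have a verbal description, restricted to a small finite set
# (the set S is an assumption made only for this illustration).
S = {1, 2, 3, 9}
less_than = {(s, t) for s in S for t in S if s < t}
relatively_prime = {(m, n) for m in S for n in S if gcd(m, n) == 1}

print(related(xi, 2, 3))                # (2, 3) is listed in xi
print(related(xi, 9, 3))                # (9, 3) is not listed in xi
print(related(less_than, 2, 9))         # 2 < 9
print(related(relatively_prime, 3, 9))  # gcd(3, 9) is 3, not 1
```

In this representation there is no difference in kind between $\xi$ and the less-than relation: each is a set of pairs, and "is related to" is a membership test.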

Yes, the relation $\xi$ is weird. It is an arbitrary definition. It does not have any verbal description other than listing the elements of $\xi$. It is probably useless. Live with it.

The symbol “$\xi$” is a Greek letter. It looks weird, so I used it to name a weird relation. Its upper case version is “$\Xi$”, which is even weirder. I pronounce “$\xi$” as “ksee” but most mathematicians call it “si” or “zi” (rhyming with “pie”).

Defining a relation as any old set of ordered pairs is an example of a reconstructive generalization.

$n$-ary relations

Years ago, mathematicians started coming up with things that were like relations but which involved more than two elements of a set.

Example

Let $r$, $s$ and $t$ be real numbers. We say that “$s$ is between $r$ and $t$” if $r\lt s$ and $s\lt t$. Then betweenness is a relation that is true or false about three real numbers.

Mathematicians now call this a ternary relation. The abstract definition of a ternary relation is this: A ternary relation on a set $S$ is a set of ordered triples of elements of $S$. This is a reconstructive generalization of the concept of relation that allows ordered triples of elements as well as ordered pairs of elements.

In the case of betweenness, we have to decide on the ordering. Let us say that the betweenness relation holds for the triple $(r,s,t)$ if $r\lt s$ and $s\lt t$. So $(4,5,7)$ is in the betweenness relation and $(4,7,5)$ is not.
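As a rough sketch of that convention, the betweenness relation can be built as a set of ordered triples. The finite set `S` below stands in for the real numbers and is an assumption made only so the relation can be listed.

```python
from itertools import product

# A small finite set standing in for the real numbers (an assumption for illustration).
S = {4, 5, 7}

# The betweenness relation on S: all triples (r, s, t) with r < s and s < t.
between = {(r, s, t) for r, s, t in product(S, repeat=3) if r < s < t}

print((4, 5, 7) in between)  # 5 lies between 4 and 7
print((4, 7, 5) in between)  # 7 does not lie between 4 and 5
```

Changing the convention for the order of the triple would change which triples belong to the set, but not the fact that the relation is a set of triples.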

You could argue that in the sentence, “$s$ is between $r$ and $t$”, the $s$ comes first, so that we should say that the betweenness relation (meaning $r$ is between $s$ and $t$) holds for $(r,s,t)$ if $s\lt r$ and $r\lt t$. Well, when you write an article you can write it that way. But I am writing this article.

Nowadays we talk about $n$-ary relations for any positive integer $n$. One consequence of this is that if we want to talk just about sets of ordered pairs we must call them binary relations.

When I was a child there was only one kind of guitar and it was called “a guitar”. (My older cousin Junior had a guitar, but I had only a plastic ukulele.) Some time in the fifties, electrically amplified guitars came into being, so we had to refer to the original kind as “acoustic guitars”. I was a teenager when this happened, and being a typical teenager, I was completely contemptuous of the adults who reacted with extreme irritation at the phrase “acoustic guitar”.

The axiomatic method

The axiomatic method is a technique for studying math objects of some kind by formulating them as a type of math structure. You take some basic properties of the kind of structure you are interested in and set them down as axioms, then deduce other properties (that you may or may not have already known) as theorems. The point of doing this is to make your reasoning and all your assumptions completely explicit.

Nowadays research papers typically state and prove their theorems in terms of math structures defined by axioms, although a particular paper may not mention the axioms but merely refer to other papers or texts where the axioms are given. For some common structures such as the real numbers and sets, the axioms are not only not referenced, but the authors clearly don’t even think about them in terms of axioms: they use commonly-known properties (of real numbers or sets, for example) without reference.

The axiomatic method in practice

Typically when using the axiomatic method some of these things may happen:

  • You discover that there are other examples of this system that you hadn’t previously known about.  This makes the axioms more broadly applicable.
  • You discover that some properties that your original examples had don’t hold for some of the new examples.  Depending on your research goals, you may then add some of those properties to the axioms, so that the new examples are not examples any more.
  • You may discover that some of your axioms follow from others, so that you can omit them from the system.

Example: Continuity

A continuous function (from the set of real numbers to the set of real numbers) is sometimes described as a function whose graph you can draw without lifting your chalk from the board. This is a physical description, not a mathematical definition.

In the nineteenth century, mathematicians talked about continuous functions but became aware that they needed a rigorous definition. One possibility was functions given by formulas, but that didn’t work: some formulas give discontinuous functions and they couldn’t think of formulas for some continuous functions.

This description of nineteenth century math is an oversimplification.

Cauchy produced the definition we now use (the epsilon-delta definition) which is a rigorous mathematical version of the no-lifting-chalk idea and which included the functions they thought of as continuous.

To their surprise, some clever mathematicians produced examples of some weird continuous functions that you can’t draw, for example the sine blur function. In the terminology in the discussion of abstraction above, the abstraction $C’$ (epsilon-delta continuous functions) had functions in it that were not in $C$ (no-chalk-lifting functions). On the other hand, their definition now applied to functions between some spaces besides the real numbers, for example the complex numbers, for which drawing the graph without lifting the chalk doesn’t even make sense.

Example: Rings

Suppose you are studying the algebraic properties of numbers.  You know that addition and multiplication are both associative operations and that they are related by the distributive law:  $x(y+z)=xy+xz$. Both addition and multiplication have identity elements ($0$ and $1$) and satisfy some other properties as well: addition forms a commutative group for example, and if $x$ is any number, then $0\cdot x=0$.

One way to approach this problem is to write down some of these laws as axioms on a set with two binary operations without assuming that the elements are numbers. In doing this, you are abstracting some of the properties of numbers.

Certain properties such as those in the first paragraph of this example were chosen to define a type of math structure called a ring. (The precise set of axioms for rings is given in the Wikipedia article.)

You may then prove theorems about rings strictly by logical deduction from the axioms without calling on your familiarity with numbers.

When mathematicians did this, the following events occurred:

  • They discovered systems such as matrices whose elements are not numbers but which obey most of the axioms for rings.
  • Although multiplication of numbers is commutative, multiplication of matrices is not commutative.
  • Now they had to decide whether to require commutativity of multiplication as an axiom for rings or not. In this example, historically, mathematicians decided not to require multiplication to be commutative, so (for example) the set of all $2\times 2$ matrices with real entries is a ring.
  • They then defined a commutative ring to be a ring in which multiplication is commutative.
  • So the name “commutative ring” means the multiplication is commutative, because addition in rings is always commutative. Mathematical names are not always transparent.

  • You can prove from the axioms that in any ring, $0 x=0$ for all $x$, so you don’t need to include it as an axiom.
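For the last item, one standard derivation goes as follows. It is a sketch that assumes the usual ring axioms: $0$ is the additive identity, every element has an additive inverse, addition is associative, and multiplication distributes over addition on both sides. First,

\[0x=(0+0)x=0x+0x.\]

Adding the additive inverse $-(0x)$ to both ends and using associativity of addition gives

\[0=0x+\bigl(-(0x)\bigr)=(0x+0x)+\bigl(-(0x)\bigr)=0x+\Bigl(0x+\bigl(-(0x)\bigr)\Bigr)=0x+0=0x.\]

Nothing in the argument uses the fact that $x$ is a number, so it applies to matrices and to any other ring.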

Nowadays, all math structures are defined by axioms.

Other examples

  • Historically, the first example of something like the axiomatic method is Euclid’s axiomatization of geometry.  The axiomatic method began to take off in the late nineteenth century and now is a standard tool in math.  For more about the axiomatic method see the Wikipedia article.
  • Partitions and equivalence relations are two other concepts that have been axiomatized. Remarkably, although the axioms for the two types of structures are quite different, every partition is in fact an equivalence relation in exactly one way, and any equivalence relation is a partition in exactly one way.
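The correspondence between partitions and equivalence relations can be illustrated on a small set. This is a minimal sketch: the function names `relation_from_partition` and `partition_from_relation` and the example set are invented for illustration.

```python
def relation_from_partition(blocks):
    """Equivalence relation (a set of ordered pairs) determined by a partition."""
    return {(a, b) for block in blocks for a in block for b in block}

def partition_from_relation(S, relation):
    """Partition of S into the equivalence classes of an equivalence relation on S."""
    return {frozenset(b for b in S if (a, b) in relation) for a in S}

S = {1, 2, 3, 4, 5}
partition = {frozenset({1, 2}), frozenset({3}), frozenset({4, 5})}

relation = relation_from_partition(partition)
print(sorted(relation))

# Going from the partition to the relation and back returns the original partition.
print(partition_from_relation(S, relation) == partition)
```

Two elements are related exactly when they lie in the same block, and the blocks are recovered as the equivalence classes.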

Remark

Many articles on the web about the axiomatic method emphasize the representation of the axiom system as a formal logical theory (formal system). 
In practice, mathematicians create and use a particular axiom system as a tool for research and understanding, and state and prove theorems of the system in semi-formal narrative form rather than in formal logic.



Creative Commons License

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.

The great math mystery

Last night Nova aired The great math mystery, a documentary that describes mathematicians’ ideas about whether math is discovered or invented, whether it is “out there” or “in our head”. It was well-done. Things were explained clearly using images and metaphors, although they did show Maxwell’s equations as algebra (without explaining it). The visual illustrations of connections between Maxwell’s equations and music and electromagnetic waves were one of the best parts of the documentary.

In my opinion they made good choices of mathematical ideas to cover, but I imagine a lot of research mathematicians will have a hissy that they didn’t cover XXX (their subject).

The applications to physics dominated the show (that is not a complaint), but someone did mention the remarkable depth of number theory. Number theory is deep pure math that has indeed had some applications, but that’s not why some of the greatest mathematicians in the world have spent their lives on the subject. I believe logic and proof were never mentioned, and that is completely appropriate for a video made for the general public. Some mathematicians will disagree with that last sentence.

Where does math live?

The question,

Does math live

  • In an ideal world separate from the physical world,
  • in the physical world, or
  • in our brains?

has a perfectly clear answer: It exists in our brains.

Ideal world

The notion that math lives in an ideal world, as Plato supposedly believed, has no evidence for it at all.

I suppose you could say that Plato’s ideal world does exist — in our brains. But that wouldn’t be quite correct: We have a mental image of Plato’s ideal world in our brains, but that image is not the whole ideal world: If we know about triangles, we can imagine the Ideal Triangle to be in his world, but we have to know about the zeta function or the monster group to visualize them to be in his world. Even then, the monster group in our brain is just a collection of neurons connected to concepts such as “largest sporadic simple group” or “contains\[2^{46} \cdot 3^{20} \cdot 5^9 \cdot 7^6 \cdot 11^2 \cdot 13^3 \cdot 17 \cdot 19 \cdot 23 \cdot 29 \cdot 31 \cdot 41 \cdot 47 \cdot 59 \cdot 71\]elements” — but there is not a neuron for each element! We don’t have that many neurons.

The size of the monster group does not live in my brain. I copied it from Wikipedia.

Real world

Our collective experience is that math is extraordinarily useful for modeling many aspects of the real world. But in what sense does that mean it exists in the real world?

There is a sense in which a model of the real world exists in our brains. If we know some of the math that explains certain aspects of the real world, our brains have neuron connections that make that math live in our brain and in some sense in the model of the real world that is in our brain. But does that mean the math is “out there”? I don’t see why.

Math is a social endeavor

One point that usually gets left out of discussions of Platonism is this: Some math exists in any individual person’s brain. But math also exists in society. The math floating around in the individual brains of people is subject to frequent amendments to those people’s understanding because they interact with the real world and in particular with other people.

In particular, theoretical math exists in the society of mathematicians. It is constantly fluctuating because mathematicians talk to each other. They also explain it to non-mathematicians, which as everyone knows can bring new insights into the brain of the person doing the explaining.

So I think that the best answer to the question, where does math live? is that math is a bunch of memes that live in our social brain.

References

I have written about these issues before:

Functions: Metaphors, Images and Representations

Please read this post at abstractmath.org. I originally posted the document here but some of the diagrams would not render, and I haven’t been able to figure out why. Sorry for having to redirect.

Problems caused for students by the two languages of math

The two languages of math

Mathematics is communicated using two languages: Mathematical English and the symbolic language of math (more about them in two languages).

This post is a collection of examples of the sorts of trouble that the two languages cause beginning abstract math students. I have gathered many of them here since they are scattered throughout the literature. I would welcome suggestions for other references to problems caused by the languages of math.

In many of the examples, I give links to the literature and leave you to fish out the details there. Almost all of the links are to documents on the internet.

There is an extensive list of references.

Conjectures

Scattered through this post are conjectures. Like most of my writing about difficulties students have with math language, these conjectures are based on personal observation over 37 years of teaching mostly computer engineering and math majors. The only hard research of any sort I have done in math ed consists of the 426 citations of written mathematical writing included in the Handbook of Mathematical Discourse.

Disclaimer

This post is an attempt to gather together the ways in which math language causes trouble for students. It is even more preliminary and rough than most of my other posts.

  • The arrangement of the topics is unsatisfactory. Indeed, the topics are so interrelated that it is probably impossible to give a satisfactory linear order to them. That is where writing on line helps: Lots of forward and backward references.
  • Other people and I have written extensively about some of the topics, and they have lots of links. Other topics are stubs and need to be filled out. I have probably missed important points about and references to many of them.
  • Please note that many of the most important difficulties that students have with understanding mathematical ideas are not caused by the languages of math and are not represented here.

I expect to revise this article periodically as I find more references and examples and understand some of the topics better. Suggestions would be very welcome.

Intricate symbolic expressions

I have occasionally had students tell me that they have great difficulty understanding a complicated symbolic expression. They can’t just look at it and learn something about what it means.

Example

Consider the symbolic expression \[\displaystyle\left(\frac{x^3-10}{3 e^{-x}+1}\right)^6\]

Now, I could read this expression aloud as if it were text, or more precisely describe it so that someone else could write it down. But if I am in math mode and see this expression I don’t “read” it, even to myself.

I am one of those people who much of the time think in pictures or abstractions without words. (See references here.)

In this case I would look at the expression as a structured picture. I could determine a number of things about it, and when I was explaining it I would point at the board, not try to pronounce it or part of it:

  • The denominator is always positive so the expression is defined for all reals.
  • The exponent is even so the value of the expression is always nonnegative. I would say, “This (pointing at the exponent) is an even power so the expression is never negative.”
  • It is zero in exactly one place, namely $x=\sqrt[3]{10}$.
  • Its derivative is also $0$ at $\sqrt[3]{10}$. You can see this without calculating the formula for the derivative (ugh).
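The last observation can be checked with the chain rule alone. As a sketch, with the names $f$ and $g$ introduced here only for illustration, write the expression as $f(x)=g(x)^6$, where

\[g(x)=\frac{x^3-10}{3e^{-x}+1}.\]

Then

\[f'(x)=6\,g(x)^5\,g'(x).\]

The denominator of $g$ is never zero, so $g$ is differentiable everywhere, and $g(\sqrt[3]{10})=0$ because the numerator vanishes there. Hence

\[f'(\sqrt[3]{10})=6\cdot 0^5\cdot g'(\sqrt[3]{10})=0,\]

and the formula for $g'$ is never needed.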

There is much more about this example in Zooming and Chunking.

Algebra in high school

Many high school students are stymied by algebra, never do well at it, and hate math as a result. I have known many such people over the years. A revealing remark that I have heard many times is that “algebra is totally meaningless to me”. This is sometimes accompanied by a remark that geometry is “obvious” or something similar. This may be because they think they have to “read” an algebraic expression instead of studying it as they would a graph or a diagram.

Conjecture

Many beginning abstract math students have difficulty understanding a symbolic expression like the one above. Could this be caused by resistance to treating the expression as a structure to be studied?

Context-sensitive pronunciation

A symbolic assertion (“formula” to logicians) can be embedded in a math English sentence in different ways, requiring the symbolic assertion to be pronounced in different ways. The assertion itself is not modified in any way in these different situations.

I used the phrase “symbolic assertion” in abstractmath.org because students are confused by the logicians’ use of “formula”.
In everyday English, “$\text{H}_2\text{O}$” is the “formula” for water, but it is a term, not an assertion.

Example

“For every real number $x\gt0$ there is a real number $y$ such that $x\gt y\gt0$.”

  • In the sentence above, the assertion “$x\gt0$” must be pronounced “$x$ that is greater than $0$” or something similar.
  • The standalone assertion “$x\gt0$” is pronounced “$x$ is greater than $0$.”
  • The sentence “Let $x\gt0$” must be pronounced “Let $x$ be greater than $0$”.

The consequence is that the symbolic assertion, in this case “$x\gt0$”, does not reveal the role it plays in the math English sentence that it is embedded in.

Many of the examples occurring later in the post are also examples of context-sensitive pronunciation.

Conjectures

Many students are subconsciously bothered by the way the same symbolic expression is pronounced differently in different math English sentences.

This probably impedes some students’ progress. Teachers should point this phenomenon out with examples.

Students should be discouraged from pronouncing mathematical expressions.

For one thing, this could get you into trouble. Consider pronouncing “$\sqrt{3+5}+6$”. In any case, when you are reading any text you don’t pronounce the words, you just take in their meaning. Why not take in the meaning of algebraic expressions in the same way?
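One plausible reading of the trouble (an assumption, since the text does not spell it out) is that the spoken words “the square root of three plus five plus six” do not say where the radical ends. A small sketch of three expressions a listener might reconstruct; the variable names are invented for illustration:

```python
import math

# Three expressions a listener might reconstruct from the same spoken words,
# "the square root of three plus five plus six".
intended = math.sqrt(3 + 5) + 6    # the expression as written: sqrt(3+5) + 6
whole_sum = math.sqrt(3 + 5 + 6)   # radical taken over everything
first_only = math.sqrt(3) + 5 + 6  # radical taken over the 3 alone

print(intended, whole_sum, first_only)
```

The three values are all different, so the pronunciation alone does not determine the expression, while the written form does.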

Parenthetic assertions

A parenthetic assertion is a symbolic assertion embedded in a sentence in math English in such a way that it is a subordinate clause.

Example

In the math English sentence

“For every real number $x\gt0$ there is a real number $y$ such that $x\gt y\gt0$”

mentioned above, the symbolic assertion “$x\gt0$” plays the role of a subordinate clause.

It is not merely that the pronunciation is different compared to that of the independent statement “$x\gt0$”. The math English sentence is hard to parse. The obvious (to an experienced mathematician) meaning is that the beginning of the sentence can be read this way: “For every real number $x$, which is bigger than $0$…”.

But a new student might try to read it as “For every real number $x$ is greater than $0$ …” by literally substituting the standalone meaning of “$x\gt0$” where it occurs in the sentence. This makes the text what linguists call a garden path sentence. The student has to stop and start over to try to make sense of it, and the symbolic expression lacks the natural language hints that help understand how it should be read.

Note that the other two symbolic expressions in the sentence are not parenthetic assertions. The phrase “real number” needs to be followed by a term, and it is, and the phrase “such that” must be followed by a clause, and it is.

More examples

  • “Consider the circle $S^1\subseteq\mathbb{C}=\mathbb{R}^2$.” This has subordinate clauses to depth 2.
  • “The infinite series $\displaystyle\sum_{k=1}^\infty\frac{1}{k^2}$ converges to $\displaystyle\zeta(2)=\frac{\pi^2}{6}\approx1.645$.”
  • “We define a null set in $I:=[a,b]$ to be a set that can be covered by a countable collection of intervals with arbitrarily small total length.” This shows a parenthetical definition.
  • “Let $F:A\to B$ be a function.”
    A type declaration is a function? In any case, it would be better to write this sentence simply as “Let $F:A\to B$”.

David Butler’s post Contrapositive grammar has other good examples.

Math texts are in general badly written. Students need to be taught how to read badly written math as well as how to write math clearly. Those that succeed (in my observation) in being able to read math texts often solve the problem by glancing at what is written and then reconstructing what the author is supposedly saying.

Conjectures

Some students are baffled, or at least bothered consciously or unconsciously, by parenthetic assertions, because the clues that would exist in a purely English statement are missing.

Nevertheless, many if not most math students read parenthetic assertions correctly the first time and never even notice how peculiar they are.

What makes the difference between them and the students who are stymied by parenthetic assertions?

There is another conjecture concerning parenthetic assertions below.

Context-sensitive meaning

“If” in definitions

Example

The word “if” in definitions does not mean the same thing that it means in other math statements.

  • In the definition “An integer is even if it is divisible by $2$,” “if” means “if and only if”. In particular, the definition implies that an integer is not even if it is not divisible by $2$.
  • In a theorem, for example “If a function is differentiable, then it is continuous”, the word “if” has the usual one-way meaning. In particular, in this case, a continuous function might not be differentiable.

Context-sensitive meaning occurs in ordinary English as well. Think of a strike in baseball.

Conjectures

The nearly universal custom of using “if” to mean “if and only if” in definitions makes it harder for students to understand implication.

This custom is not the major problem in understanding the role of definitions. See my article Definitions.

Underlying sets

Example

In a course in group theory, a lecturer may say at one point, “Let $F:G\to H$ be a homomorphism”, and at another point, “Let $g\in G$”.

In the first sentence, $G$ refers to the group, and in the second sentence it refers to the underlying set of the group.

This usage is almost universal. I think the difficulty it causes is subtle. When you refer to $\mathbb{R}$, for example, you (usually) are referring to the set of real numbers together with all its canonical structure. The way students think of it, a real number comes with its many relations and connections with the other real numbers, ordering, field properties, topology, and so on.

But in a group theory class, you may define the Klein $4$-group to be $\mathbb{Z}_2\times\mathbb{Z}_2$. Later you may say “the symmetry group of a rectangle that is not a square is the Klein $4$-group.” Almost invariably some student will balk at this.
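The claim the student balks at can be checked by hand or by machine. The following sketch models the symmetries of the rectangle as permutations of its corners and compares them with $\mathbb{Z}_2\times\mathbb{Z}_2$. The corner labeling, the names `flip_lr`, `flip_tb` and `half_turn`, and the particular map `phi` are choices made here for illustration.

```python
from itertools import product

# Klein 4-group as Z_2 x Z_2: pairs of bits under componentwise addition mod 2.
Z2xZ2 = list(product((0, 1), repeat=2))

def add(p, q):
    return ((p[0] + q[0]) % 2, (p[1] + q[1]) % 2)

# Symmetries of a non-square rectangle as permutations of its corners,
# numbered 0, 1, 2, 3 clockwise from the top left (a labeling chosen here).
# A permutation p sends corner i to corner p[i].
identity = (0, 1, 2, 3)
flip_lr = (1, 0, 3, 2)    # reflection in the vertical axis
flip_tb = (3, 2, 1, 0)    # reflection in the horizontal axis
half_turn = (2, 3, 0, 1)  # rotation by 180 degrees

def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[i]] for i in range(4))

# A candidate isomorphism between the two groups.
phi = {(0, 0): identity, (1, 0): flip_lr, (0, 1): flip_tb, (1, 1): half_turn}

is_bijection = len(set(phi.values())) == len(Z2xZ2)
preserves_operation = all(
    phi[add(a, b)] == compose(phi[a], phi[b]) for a in Z2xZ2 for b in Z2xZ2
)
print(is_bijection and preserves_operation)
```

The underlying sets are different (pairs of bits on one side, motions of a rectangle on the other), yet the map respects the operations, which is all that “is the Klein $4$-group” is meant to assert.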

Referring to a group by naming its underlying set is also an example of synecdoche.

Conjecture

Students expect every important set in math to have a canonical structure. When they get into a course that is a bit more abstract, suddenly the same set can have different structures, and math objects with different underlying sets can have the same structure. This catastrophic shift in a way of thinking should be described explicitly with examples.

Way back when, it got mighty upsetting when the earth started going around the sun instead of vice versa. Remind your students that these upheavals happen in the math world too.

Overloaded notation

Identity elements

A particular text may refer to the identity element of any group as $e$.

This is as far as I know not a problem for students. I think I know why: There is a generic identity element. The identity element in any group is an instantiation of that generic identity element. The generic identity element exists in the sketch for groups; every group is a functor defined on that sketch. (Or if you insist, the generic identity element exists in the first order theory for groups.) I suspect mathematicians subconsciously think of identity elements in this way.

Matrix multiplication

Matrix multiplication is not commutative. A student may forget this and write $A^2B^2=(AB)^2$. This also happens in group theory courses.

This problem occurs because the symbolic language uses the same symbol for many different operations, in this case the juxtaposition notation for multiplication. This phenomenon is called overloaded notation and is discussed in abstractmath.org here.
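A concrete pair of matrices shows how the reflex fails. This is a minimal sketch in plain Python; the helper `matmul` and the matrices `A` and `B` are chosen here for illustration.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))

AB = matmul(A, B)
BA = matmul(B, A)
A2B2 = matmul(matmul(A, A), matmul(B, B))
AB_squared = matmul(AB, AB)

print(AB, BA)            # the two products differ
print(A2B2, AB_squared)  # so A^2 B^2 and (AB)^2 differ as well
```

The step that fails is the rewriting of $ABAB$ as $AABB$, which silently swaps the middle $B$ and $A$.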

Conjecture

Noncommutative binary operations written using juxtaposition cause students trouble because going to noncommutative operations requires abandoning some overlearned reflexes in doing algebra.

Identity elements seem to behave the same in any binary operation, so there are no reflexes to unlearn. There are generic binary operations of various types as well. That’s why mathematicians are comfortable overloading juxtaposition. But to get to be a mathematician you have to unlearn some reflexes.

Negation

Sometimes you need to reword a math statement that contains symbolic expressions. This particularly causes trouble in connection with negation.

Ordinary English

The English language is notorious among language learners for making it complicated to negate a sentence. The negation of “I saw that movie” is “I did not see that movie”. (You have to put “do not” (using the appropriate form of “do”) before the verb and then modify the verb appropriately.) You can’t just say “I not saw that movie” (as in Spanish) or “I saw not that movie” (as in German).

Conjecture

The method in English used to negate a sentence may cause problems with math students whose native language is not English. (But does it cause math problems with those students?)

Negating symbolic expressions

Examples

  • The negation of “$n$ is even and a prime” is “$n$ is either odd or it is not a prime”. The negation should not be written “$n$ is not even and a prime” because that sentence is ambiguous. In the heat of doing a proof students may sometimes think the negation is “$n$ is odd and $n$ is not a prime,” essentially forgetting about DeMorgan. (He must roll over in his grave a lot.)
  • The negation of “$x\gt0$” is “$x\leq0$”. It is not “$x\lt0$”. This is a very common mistake.
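Both examples can be checked by brute force over a range of integers. A small sketch; the helper functions and the range are chosen here for illustration:

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def is_even(n):
    return n % 2 == 0

numbers = range(1, 50)

# Correct negation of "n is even and a prime": "n is odd or n is not a prime".
de_morgan_ok = all(
    (not (is_even(n) and is_prime(n))) == ((not is_even(n)) or (not is_prime(n)))
    for n in numbers
)

# The tempting wrong negation, "n is odd and n is not a prime", fails somewhere.
counterexamples = [
    n for n in numbers
    if (not (is_even(n) and is_prime(n))) != ((not is_even(n)) and (not is_prime(n)))
]

print(de_morgan_ok)
print(counterexamples[:3])  # → [3, 4, 5]

# The negation of x > 0 must still hold at x = 0, which rules out x < 0.
x = 0
print(not (x > 0), x <= 0, x < 0)
```

For instance, $3$ is not “even and a prime”, so the correct negation holds for it, but “$3$ is odd and $3$ is not a prime” is false.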

These examples are difficulties caused by not understanding the math. They are not directly caused by difficulties with the languages of math.

Negating expressions containing parenthetic assertions

Suppose you want to prove:

“If $f:\mathbb{R}\to\mathbb{R}$ is differentiable, then $f$ is continuous”.

A good way to do this is by using the contrapositive. A mechanical way of writing the contrapositive is:

“If $f$ is not continuous, then $f:\mathbb{R}\to\mathbb{R}$ is not differentiable.”

That is not good. The sentence needs to be massaged:

“If $f:\mathbb{R}\to\mathbb{R}$ is not continuous, then $f$ is not differentiable.”

Even better would be to write the original sentence as:

“Suppose $f:\mathbb{R}\to\mathbb{R}$. Then if $f$ is differentiable, then $f$ is continuous.”

This is discussed in detail in David Butler’s post Contrapositive grammar.

Conjecture

Students need to be taught to understand parenthetic assertions that occur in the symbolic language and to learn to extract a parenthetic assertion and write it as a standalone assertion ahead of the statement it occurs in.

Scope

The scope of a word or variable consists of the part of the text for which its current definition is in effect.

Examples

  • “Suppose $n$ is divisible by $4$.” The scope is probably the current paragraph or perhaps the current proof. This means that the properties of $n$ are constrained in that section of the text.
  • “In this book, all rings are unitary.” This will hold for the whole book.

There are many more examples in the abstractmath.org article Scope.

If you are a grasshopper (you like to dive into the middle of a book or paper to find out what it says), the scope of a variable can be hard to determine. It is particularly difficult for commonly used words or symbols that have been defined differently from the usual usage. You may not suspect that this has happened since it might be defined once early in the text. Some books on writing mathematics have urged writers to keep global definitions to a minimum. This is good advice.

Finding the scope is considerably easier when the text is online and you can search for the definition.

Conjecture

Knowing the scope of a word or variable can be difficult. It is particularly hard when the word or variable has a large scope (a chapter or the whole book).

Variables

Variables are often introduced in math writing and then used in the subsequent discussion. In a complicated discussion, several variables may be referred to that have different statuses, some of them introduced several pages before. Variables can cause trouble for students in many particular ways, which are discussed below. This post is restricted to trouble in connection with the languages of math. The concept of variable is difficult in itself, not just because of the way the math languages represent them, but that is not covered here.

Much of this part of the post is based on work of Susanna Epp, including three papers listed in the references. Her papers also include many references to other work in the math ed literature that has to do with understanding variables.

See also Variables in abstractmath.org and Variables in Wikipedia.

Types

Students blunder by forgetting the type of the variable they are dealing with. The example given previously of problems with matrix multiplication is occasioned by forgetting the type of a variable.

Conjecture

Students sometimes have problems because they forget the data type of the variables they are dealing with. This is primarily caused by overloaded notation.

Dependent and independent

If you define $y=x^2+1$, then $x$ is an independent variable and $y$ is a dependent variable. But dependence and independence of variables are more general than that example suggests. In an epsilon-delta proof of the limit of a function (example below), $\varepsilon$ is independent and $\delta$ is dependent on $\varepsilon$, although not functionally dependent.

Conjecture

Distinguishing dependent and independent variables causes problems, particularly when the dependence is not clearly functional.

I recently ran across a discussion of this on the internet but failed to record where I saw it. Help!

Bound and free

This causes trouble with integration, among other things. It is discussed in abstractmath.org in Variables and Substitution. I expect to add some references to the math ed literature soon.

Instantiation

Some of these variables may be given by existential instantiation, in which case they are dependent on variables that define them. Others may be given by universal instantiation, in which case the variable is generic; it is independent of other variables, and you can’t impose arbitrary restrictions on it.

Existential instantiation

A theorem that an object exists under certain conditions allows you to name it and use it by that name in further arguments.

Example

Suppose $m$ and $n$ are integers. Then by definition, $m$ divides $n$ if there is an integer $q$ such that $n=qm$. Then you can use “$q$” in further discussion, but $q$ depends on $m$ and $n$. You must not use it with any other meaning unless you start a new paragraph and redefine it.

So the following (start of a) “proof” blunders by ignoring this restriction:

Theorem: Prove that if an integer $m$ divides both integers $n$ and $p$, then $m$ divides $n+p$.

“Proof”: “Let $n = qm$ and $p = qm$…”
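A small numeric check makes the blunder concrete. The sketch below is in Python; the helper name `quotient` and the sample values of $m$, $n$ and $p$ are assumptions chosen for illustration and are not part of the original argument.

```python
def quotient(m, n):
    """Return the integer q with n == q*m, or None if m does not divide n."""
    if m != 0 and n % m == 0:
        return n // m
    return None

m, n, p = 3, 6, 15

q = quotient(m, n)   # witness for "m divides n"
r = quotient(m, p)   # a separate witness for "m divides p"

# The two witnesses differ, so writing p = q*m would be wrong here.
print(q, r)

# With separate names the intended proof goes through: n + p = (q + r)*m.
print(n + p == (q + r) * m)
```

The point of using two names is that each existential instantiation produces its own witness, and nothing says the two witnesses are equal.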

Universal instantiation

It is a theorem that for any integer $n$, there is no integer strictly between $n$ and $n+1$. So if you are given an arbitrary integer $k$, there is no integer strictly between $k$ and $k+1$. There is no integer between $42$ and $43$.

By itself, universal instantiation does not seem to cause problems, provided you pay attention to the types of your variables. (“There is no integer between $\pi$ and $\pi+1$” is false.)

However, when you introduce variables using both universal and existential quantification, students can get confused.

Example

Consider the definition of limit:

Definition: $\lim_{x\to a} f(x)=L$ if and only if for every $\epsilon\gt0$ there is a $\delta\gt0$ for which if $|x-a|\lt\delta$ then $|f(x)-L|\lt\epsilon$.

A proof for a particular instance of this definition is given in detail in Rabbits out of a Hat. In this proof, you may not put constraints on $\epsilon$ except the given one that it is positive. On the other hand, you have to come up with a definition of $\delta$ and prove that it works. The $\delta$ depends on what $f$, $a$ and $L$ are, but there are always infinitely many values of $\delta$ which fit the constraints, and you have to come up with only one. So in general, two people doing this proof will not get the same answer.

Reference

Susanna Epp’s paper Proof issues with existential quantification discusses the problems that students have with both existential and universal quantification with excellent examples. In particular, that paper gives examples of problems students have that are not hinted at here.

References

A nearly final version of The Handbook of Mathematical Discourse is available on the web with links, including all the citations. This version contains some broken links. I am unable to recompile it because TeX has evolved enough since 2003 that the source no longer compiles. The paperback version (without the citations) can be bought as a book here. (There are usually cheaper used versions on Amazon.)

Abstractmath.org is a website for beginning students in abstract mathematics. It includes most of the material in the Handbook, but not the citations. The Introduction gives you a clue as to what it is about.

Two languages

My take on the two languages of math is discussed in these articles:

The Language of Mathematics, by Mohan Ganesalingam, covers these two languages in more detail than any other book I know of. He says right away on page 18 that mathematical language consists of “textual sentences with symbolic material embedded like ‘islands’ in the text.” So for him, math language is one language.

I have envisioned two separate languages for math in abstractmath.org and in the Handbook, because in fact you can in principle translate any mathematical text into either English or logical notation (first order logic or type theory), although the result in either case would be impossible to understand for any sizeable text.

Topics in abstractmath.org

Context-sensitive interpretation.

“If” in definitions.

Mathematical English.

Parenthetic assertion.

Scope

Semantic contamination.

Substitution.

The symbolic language of math

Variables.

Zooming and Chunking.

Topics in the Handbook of mathematical discourse.

These topics have a strong overlap with the topics with the same name in abstractmath.org. They are included here because the Handbook contains links to citations of the usage.

Context-sensitive.

“If” in definitions.

Parenthetic assertion.

Substitution.

Posts in Gyre&Gimble

Names

Naming mathematical objects

Rabbits out of a Hat.

Semantics of algebra I.

Syntactic and semantic thinkers

Technical meanings clash with everyday meanings

Thinking without words.

Three kinds of mathematical thinkers

Variations in meaning in math.

Other references

Contrapositive grammar, blog post by David Butler.

Proof issues with existential quantification, by Susanna Epp.

The role of logic in teaching proof, by Susanna Epp (2003).

The language of quantification in mathematics instruction, by Susanna Epp (1999).

The Language of Mathematics: A Linguistic and Philosophical Investigation, by Mohan Ganesalingam, 2013. (Not available from the internet.)

On the communication of mathematical reasoning, by Atish Bagchi and Charles Wells (1998a), PRIMUS, volume 8, pages 15–27.

Variables in Wikipedia.

Creative Commons License

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.

Notation for sets

This is a revision of the section of abstractmath.org on notation for sets.

Sets of numbers

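The following notation for sets of numbers is fairly standard.

  • $\mathbb{N}$ is the set of all positive integers.
  • $\mathbb{Z}$ is the set of all integers.
  • $\mathbb{Q}$ is the set of all rational numbers.
  • $\mathbb{R}$ is the set of all real numbers.
  • $\mathbb{C}$ is the set of all complex numbers.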

Remarks

  • Some authors use $\mathbb{I}$ for $\mathbb{Z}$, but $\mathbb{I}$ is also used for the unit interval.
  • Many authors use $\mathbb{N}$ to denote the nonnegative integers instead of the positive ones.
  • To remember $\mathbb{Q}$, think “quotient”.
  • $\mathbb{Z}$ is used because the German word for “number” is “Zahl” (the integers are “die ganzen Zahlen”).

Until the 1930’s, Germany was the world center for scientific and mathematical study, and at least until the 1960’s, being able to read scientific German was required of anyone who wanted a degree in science. A few years ago I was asked to transcribe some hymns from a German hymnbook: not into English, but merely from fraktur (the old German alphabet) into the Roman alphabet. I sometimes feel that I am the last living American to be able to read fraktur easily.

Element notation

The expression “$x\in A$” means that $x$ is an element of the set $A$. The expression “$x\notin A$” means that $x$ is not an element of $A$.

“$x\in A$” is pronounced in any of the following ways:

  • “$x$ is in $A$”.
  • “$x$ is an element of $A$”.
  • “$x$ is a member of $A$”.
  • “$A$ contains $x$”.
  • “$x$ is contained in $A$”.

Remarks

  • Warning: The math English phrase “$A$ contains $B$” can mean either “$B\in A$” or “$B\subseteq A$”.
  • The Greek letter epsilon occurs in two forms in math, namely $\epsilon$ and $\varepsilon$. Neither of them is the symbol for “element of”, which is “$\in$”. Nevertheless, it is not uncommon to see either “$\epsilon$” or “$\varepsilon$” being used to mean “element of”.
Examples
  • $4$ is an element of all the sets $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$.
  • $-5\notin \mathbb{N}$ but it is an element of all the others.

List notation

Definition: list notation

A set with a small number of elements may be denoted by listing the elements inside braces (curly brackets). The list must include exactly all of the elements of the set and nothing else.

Example

The set $\{1,\,3,\,\pi \}$ contains the numbers $1$, $3$ and $\pi $ as elements, and no others. So $3\in \{1,3,\pi \}$ but $-3\notin \{1,\,3,\,\pi \}$.

Properties of list notation

List notation shows every element and nothing else

If $a$ occurs in a list notation, then $a$ is in the set the notation defines.  If it does not occur, then it is not in the set.

Be careful

When I say “$a$ occurs” I don’t mean it necessarily occurs using that name. For example, $3\in\{3+5,2+3,1+2\}$.

The order in which the elements are listed is irrelevant

For example, $\{2,5,6\}$ and $\{5,2,6\}$ are the same set.

Repetitions don’t matter

$\{2,5,6\}$, $\{5,2,6\}$, $\{2,2,5,6 \}$ and $\{2,5,5,5,6,6\}$ are all different representations of the same set. That set has exactly three elements, no matter how many numbers you see in the list notation.

Multisets may be written with braces and repeated entries, but then the repetitions mean something.
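Python's built-in set type happens to follow the same conventions, so the claims above can be checked directly. This is only a sketch: using `collections.Counter` as a stand-in for multisets is an assumption made for illustration.

```python
from collections import Counter

# Order and repetition do not affect which set a listing denotes.
a = {2, 5, 6}
b = {5, 2, 6}
c = {2, 2, 5, 6}
d = {2, 5, 5, 5, 6, 6}

print(a == b == c == d)   # all four listings give the same set
print(len(d))             # number of distinct elements

# In a multiset the repetitions do mean something.
print(Counter([2, 2, 5, 6]) == Counter([2, 5, 6]))
```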

When elements are sets

When (some of) the elements in list notation are themselves sets (more about that here), care is required.  For example, the numbers $1$ and $2$  are not elements of the set \[S:=\left\{ \left\{ 1,\,2,\,3 \right\},\,\,\left\{ 3,\,4 \right\},\,3,\,4 \right\}\]The elements listed include the set $\{1, 2, 3\}$ among others, but not the number $2$.  The set $S$ contains four elements, two sets and two numbers. 

Another way of saying this is that the element relation is not transitive: The facts that $A\in B$ and $B\in C$ do not imply that $A\in C$. 
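The set $S$ above can be modeled in Python to see the distinction between an element and an element of an element. In this sketch `frozenset` is used for the inner sets only because Python requires set elements to be hashable; the names `A`, `B` and `C` are chosen for illustration.

```python
# frozenset is used because Python set elements must be hashable.
S = {frozenset({1, 2, 3}), frozenset({3, 4}), 3, 4}

print(len(S))                      # four elements: two sets and two numbers
print(frozenset({1, 2, 3}) in S)   # the set {1, 2, 3} is an element of S
print(2 in S)                      # the number 2 is not an element of S
print(3 in S)                      # the number 3 is listed separately, so it is

# Membership is not transitive: A in B and B in C, yet A not in C.
A = 1
B = frozenset({1, 2, 3})
C = S
print(A in B and B in C and A not in C)
```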

Sets are arbitrary

  • Any mathematical object can be the element of a set.
  • The elements of a set do not have to have anything in common.
  • The elements of a set do not have to form a pattern.
Examples
  • $\{1,3,5,6,7,9,11,13,15,17,19\}$ is a set. There is no point in asking, “Why did you put that $6$ in there?” (Sets can be arbitrary.)
  • Let $f$ be the function on the reals for which $f(x)=x^3-2$. Then \[\left\{\pi^3,\mathbb{Q},f,42,\{1,2,7\}\right\}\] is a set. Sets do not have to be homogeneous in any sense.


Setbuilder notation

Definition:

Suppose $P$ is an assertion. Then the expression “$\left\{x|P(x) \right\}$” denotes the set of all objects $x$ for which $P(x)$ is true. It contains no other elements.

  • The notation “$\left\{ x|P(x) \right\}$” is called setbuilder notation.
  • The assertion $P$ is called the defining condition for the set.
  • The set $\left\{ x|P(x) \right\}$ is called the truth set of the assertion $P$.
Examples

In these examples, $n$ is an integer variable and $x$ is a real variable.

  • The expression “$\{n| 1\lt n\lt 6 \}$” denotes the set $\{2, 3, 4, 5\}$. The defining condition is “$1\lt n\lt 6$”.  The set $\{2, 3, 4, 5\}$ is the truth set of the assertion “$n$ is an integer and $1\lt n\lt 6$”.
  • The notation $\left\{x|{{x}^{2}}-4=0 \right\}$ denotes the set $\{2,-2\}$.
  • $\left\{ x|x+1=x \right\}$ denotes the empty set.
  • $\left\{ x|x+0=x \right\}=\mathbb{R}$.
  • $\left\{ x|x\gt6 \right\}$ is the infinite set of all real numbers bigger than $6$.  For example, $6\notin \left\{ x|x\gt6 \right\}$ and $17\pi \in \left\{ x|x\gt6 \right\}$.
  • The set $\mathbb{I}$ defined by $\mathbb{I}=\left\{ x|0\le x\le 1 \right\}$ has among its elements $0$, $1/4$, $\pi /4$, $1$, and an infinite number of other numbers. $\mathbb{I}$ is fairly standard notation for this set – it is called the unit interval.
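Set comprehensions in Python are modeled on setbuilder notation, so some of the examples above can be tried out. This is a rough sketch under two assumptions: the variable ranges over the finite collection `universe` instead of all integers or all reals, and the helper name `truth_set` is made up for illustration.

```python
# A finite stand-in for the integers; Python cannot enumerate all of Z.
universe = range(-100, 101)

def truth_set(P, universe):
    """Return the set of all x in the universe for which P(x) is true."""
    return {x for x in universe if P(x)}

print(truth_set(lambda n: 1 < n < 6, universe))        # the set with elements 2, 3, 4, 5
print(truth_set(lambda n: n * n - 4 == 0, universe))   # the set with elements 2 and -2
print(truth_set(lambda n: n + 1 == n, universe))       # the empty set
```

Sets such as $\left\{ x|x\gt6 \right\}$ with $x$ real cannot be listed this way, which is one reason setbuilder notation is needed in the first place.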

Usage and terminology

  • A colon may be used instead of “|”. So $\{x|x\gt6\}$ could be written $\{x:x\gt6\}$.
  • Logicians and some mathematicians call the truth set of $P$ the extension of $P$. This is not connected with the usual English meaning of “extension” as an add-on.
  • When the assertion $P$ is an equation, the truth set of $P$ is usually called the solution set of $P$. So $\{2,-2\}$ is the solution set of $x^2=4$.
  • The expression “$\{n|1\lt n\lt6\}$” is commonly pronounced as “The set of integers $n$ such that $1\lt n$ and $n\lt6$.” This means exactly the set $\{2,3,4,5\}$. Students whose native language is not English sometimes assume that a set such as $\{2,4,5\}$ fits the description.

Setbuilder notation is tricky

Looking different doesn’t mean they are different.

A set can be expressed in many different ways in setbuilder notation. For example, $\left\{ x|x\gt6 \right\}=\left\{ x|x\ge 6\text{ and }x\ne 6 \right\}$. Those two expressions denote exactly the same set. (But $\left\{x|x^2\gt36 \right\}$ is a different set.)

Russell’s Paradox

In certain areas of math research, setbuilder notation can go seriously wrong. See Russell’s Paradox if you are curious.

Variations on setbuilder notation

An expression may be used left of the vertical line in setbuilder notation, instead of a single variable.

Giving the type of the variable

You can use an expression on the left side of setbuilder notation to indicate the type of the variable.

Example

The unit interval $\mathbb{I}$ could be defined as \[\mathbb{I}=\left\{x\in \mathbb{R}\,|\,0\le x\le 1 \right\}\]making it clear that it is a set of real numbers rather than, say, rational numbers.  You can always get rid of the type expression to the left of the vertical line by complicating the defining condition, like this:\[\mathbb{I}=\left\{ x|x\in \mathbb{R}\text{ and }0\le x\le 1 \right\}\]

Other expressions on the left side

Other kinds of expressions occur before the vertical line in setbuilder notation as well.

Example

The set\[\left\{ {{n}^{2}}\,|\,n\in \mathbb{Z} \right\}\]consists of all the squares of integers; in other words its elements are $0,1,4,9,16,\ldots$.  This definition could be rewritten as $\left\{m|\text{ there is an }n\in \mathbb{Z}\text{ such that }m={{n}^{2}} \right\}$.

Example

Let $A=\left\{1,3,6 \right\}$.  Then $\left\{ n-2\,|\,n\in A\right\}=\left\{ -1,1,4 \right\}$.

Warning

Be careful when you read such expressions.

Example

The integer $9$ is an element of the set \[\left\{{{n}^{2}}\,|\,n\in \mathbb{Z}\text{ and }n\ne 3 \right\}\]It is true that $9={{3}^{2}}$ and that $3$ is excluded by the defining condition, but it is also true that $9={{(-3)}^{2}}$ and the integer $-3$ is not ruled out by the defining condition.
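The warning can be checked with a Python set comprehension that has an expression to the left of the condition. This is a sketch only: the finite range `Z` is an assumed stand-in for the integers, and the variable names are chosen for illustration.

```python
# A finite stand-in for Z, symmetric about 0 so that both 3 and -3 are present.
Z = range(-10, 11)

squares = {n**2 for n in Z}
squares_without_3 = {n**2 for n in Z if n != 3}

print(9 in squares_without_3)          # 9 still arises as (-3)**2
print(squares == squares_without_3)    # excluding n = 3 removes nothing

# The earlier example: A = {1, 3, 6} and the set of all n - 2 with n in A.
A = {1, 3, 6}
print({n - 2 for n in A} == {-1, 1, 4})
```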

Reference

Sets. Previous post.

Acknowledgments

Toby Bartels for corrections.

Creative Commons License

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.

Rabbits out of a hat

This is a revision and expansion of the entry on rabbits in the abstractmath article Dysfunctional attitudes and behaviors.

Rabbits

Sometimes when you are reading or listening to a proof you will find yourself following each step but with no idea why these steps are going to give a proof. This can happen with the whole structure of the proof or with the sudden appearance of a step that seems like the prover pulled a rabbit out of a hat. You feel as if you are walking blindfolded.

Example (mysterious proof structure)

The lecturer says he will prove that for an integer $n$, if $n^2$ is even then $n$ is even. He begins the proof: “Let $n$ be odd” and then continues to the conclusion, “Therefore $n^2$ is odd.”

Why did he begin a proof about being even with the assumption that $n$ is odd?

The answer is that in this case he is doing a proof by contrapositive. If you don’t recognize the pattern of the proof you may be totally lost. This can happen if you don’t recognize other forms, for example contradiction and induction.

Example (rabbit)

You are reading a proof that $\underset{x\to2}{\mathop{\lim }}{{x}^{2}}=4$. It is an $\varepsilon \text{-}\delta$ proof, so what must be proved is:

  • For any positive real number $\varepsilon $,
  • there is a positive real number $\delta $ for which:
  • if $\left| x-2 \right|\lt\delta$ then
  • $\left| x^2-4 \right|\lt\varepsilon$.

Proof

Here is the proof, with what I imagine might be your agitated reaction to certain steps. Below is a proof with detailed explanations.

1) Suppose $\varepsilon \gt0$ is given.

2) Let $\delta =\text{min}\,(1,\,\frac{\varepsilon }{5})$ (the minimum of the two numbers 1 and $\frac{\varepsilon}{5}$ ).

Where the *!#@! did that come from? They pulled it out of thin air! I can’t see where we are going with this proof!

3) Suppose that $\left| x-2 \right|\lt\delta$.

4) Then $\left| x-2 \right|\lt1$ by (2) and (3).

5) By (4) and algebra, $\left|x+2 \right|\lt5$.

Well, so what? We know that $\left| x+39\right|\lt42$ and lots of other things, too. Why did they do this?

6) Also $\left| x-2 \right|\lt\frac{\varepsilon }{5}$ by (2) and (3).

7) Then $\left| {{x}^{2}}-4\right|=\left| (x-2)(x+2) \right|\lt\frac{\varepsilon }{5}\cdot 5=\varepsilon$ by (5) and (6). End of Proof.

Remarks

This proof is typical of proofs in texts.

  • Steps 2) and 5) look like they were rabbits pulled out of a hat.
  • The author gives no explanation of where they came from.
  • Even so, each step of the proof follows from previous steps, so the proof is correct.
  • Whether you are surprised or not has nothing to do with whether it is correct.
  • In order to understand a proof, you do not have to know where the rabbits came from.
  • In general, the author did not think up the proof steps in the order they occur in the proof. (See this remark in the section on Forms of Proofs.) See also look ahead.

Proof with detailed explanations

  1. Suppose $\varepsilon >0$ is given. (We are starting a proof by universal generalization.)
  2. Let $\delta$ be the minimum of the two numbers $1$ and $\frac{\varepsilon}{5}$. (Rabbit out of the hat. You can “let” any symbol mean anything you want, so this is a legitimate thing to do even if you don’t see where this is all going.)
  3. Suppose $\left|x-2\right|\lt\delta$. (We are about to prove the conditional statement “If $\left| x-2 \right|\lt\delta$ then $\left| {{x}^{2}}-4 \right|\lt\varepsilon$” and we are proceeding by the direct method.)
  4. Then $\left| x-2 \right|\lt 1$ by (2) and (3). (The fact that $\delta =\text{min}\,(1,\,\frac{\varepsilon }{5})$ means that $\delta \le 1$ and that $\delta \le \frac{\varepsilon }{5}$. Since $\left| x-2 \right|\lt \delta $, the statement $\left| x-2 \right|\lt 1$ follows by transitivity of “$\lt $”. This is another rabbit. WHY do we want $\left| x-2 \right|\lt 1$? Be Patient.)
  5. By (4) and algebra, $\left| x+2 \right|\lt 5$. ($\left| x-2 \right|\lt 1$ means that $-1\lt x-2\lt 1$. Add $4$ to each term in this inequality to get $3\lt x+2\lt 5$, so in particular $-5\lt x+2\lt 5$. This is another rabbit, but it is a correct statement!)
  6. Also $\left| x-2 \right|\lt \frac{\varepsilon }{5}$ by (2) and (3). ((2) says that $\delta\le\frac{\varepsilon }{5}$ and (3) says that $\left| x-2 \right|\lt\delta$, so $\left| x-2 \right|\lt \frac{\varepsilon }{5}$ follows by transitivity.)
  7. Then $\left| {{x}^{2}}-4\right|=\left| (x-2)(x+2) \right|\lt\frac{\varepsilon }{5}\cdot 5=\varepsilon$ by (5) and (6). End of Proof. (This last statement actually shows the algebra.)
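If you want some reassurance that the rabbit in step 2 really works, you can test it numerically. The sketch below is in Python; the function name `delta_for`, the sample values of $\varepsilon$ and the random sampling scheme are all assumptions made for illustration. A sampled check like this is evidence, not a proof.

```python
import random

def delta_for(eps):
    """The choice made in step 2: delta = min(1, eps/5)."""
    return min(1.0, eps / 5.0)

random.seed(0)  # fixed seed so the run is repeatable
for eps in (10.0, 1.0, 0.1, 0.001):
    delta = delta_for(eps)
    for _ in range(10_000):
        # Sample x with |x - 2| < delta (scaled slightly to stay strictly inside).
        x = 2.0 + random.uniform(-delta, delta) * 0.999
        assert abs(x - 2.0) < delta
        assert abs(x * x - 4.0) < eps
print("no counterexample found")
```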

Coming up with that proof

The author did not think up the proof steps in the order they occur in the proof. She looked ahead at the goal of proving that \[\left| {{x}^{2}}-4\right|\lt\varepsilon\] and thought of factoring the left side. Now she must prove that \[\left| (x-2)(x+2) \right|\lt\varepsilon\]

But if $\left|x-2\right|$ is small then $x$ has to be close to $2$, so that $x + 2$ can’t be too big. Since the only restriction on $\delta$ is that it has to be positive, let’s restrict it to being no larger than $1$. (The choice of $1$ is purely arbitrary. Any positive real number would do.)

In that case step (5) shows that $\left|x+2\right|\lt5$. So how small do you have to make $\left|x-2\right|$ to make $\left|(x-2)(x+2)\right|\lt\varepsilon$? In other words, how small do you have to make $\delta $ to make $\left| 5(x-2) \right|\lt\varepsilon$ (remembering that $\left| x-2 \right|\lt\delta $)? Well, clearly $\frac{\varepsilon }{5}$ will do!

That explains her choice of $\delta$ as the minimum of the two numbers $1$ and $\frac{\varepsilon}{5}$. Notice that that choice is made very early in the proof but it was made only after experimenting with the sizes of $\left|x-2\right|$ and $\left|x+2\right|$.

You can check that if she had chosen to restrict $\delta $ to being no larger than $42$, then $\left|x+2\right|\lt46$ and she would need $\delta =\text{min}\,(42,\,\frac{\varepsilon }{46})$.
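The general pattern can be written out as a short calculation. In this sketch the symbol $c$ is an assumption introduced for illustration: it stands for whatever positive bound she decides to place on $\delta$. If $\left|x-2\right|\lt\delta\le c$, then \[-c\lt x-2\lt c,\quad\text{so}\quad 4-c\lt x+2\lt c+4,\quad\text{and therefore}\quad \left|x+2\right|\lt c+4.\] Consequently \[\left|x^2-4\right|=\left|x-2\right|\,\left|x+2\right|\lt\delta\,(c+4)\le\varepsilon\quad\text{provided}\quad\delta\le\frac{\varepsilon}{c+4}.\] So $\delta=\text{min}\,(c,\,\frac{\varepsilon}{c+4})$ works. Taking $c=1$ gives the bound $\frac{\varepsilon}{5}$ used in the proof, and taking $c=42$ gives the bound $\frac{\varepsilon}{46}$; any smaller positive value also works.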

Acknowledgments

Thanks to Robert Burns for corrections and suggestions

Forms of proofs

Abstractmath.org is a website I have been maintaining since 2005. It is intended for people beginning the study of abstract math, often a course that requires proofs and thinking about mathematical structures. The Introduction to the website and the article Attitude explain the website in more detail.

One of the chapters in abstractmath.org covers Proofs. As everywhere in abstractmath.org, there is no attempt at complete coverage: the emphasis is on aspects that cause difficulty for abstraction-newbies. In the case of proofs, this includes sections on how proofs are written (math language is a big emphasis all over abstractmath.org). One of those sections is Forms of Proof. This post is a fairly extensive revision of that section.

More than half of the section on Proofs has already been revised (the ones entitled “abstractmath.org 2.0”), and my current task is to finish that revision.

Normally, I post the actual article here on Gyre&Gimble, but something has changed in the operation of WordPress which causes the html processor to obey linebreaks in the input, which would make the article look chaotic.

So this time, I have to ask you to click a button to read the revised section on Forms of Proof. I apologize for the excessive effort by your finger.
 

Creative Commons License

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.

The many boobytraps of “if…then”

CONTENTS

The truth table for conditionals

Conditionals and truth sets

Vacuous truth

Universal conditional assertions

Related assertions

Understanding conditionals

Modus ponens

CONDITIONAL ASSERTIONS

This section is concerned with logical constructions made with the connective called the conditional operator. In mathematical English, applying the conditional operator to $P$ and $Q$ produces a sentence that may be written, “If $P$, then $Q$”, or “$P$ implies $Q$”. Sentences of this form are conditional assertions.

Conditional assertions are at the very heart of mathematical reasoning. Mathematical proofs typically consist of chains of conditional assertions.

Some of the narrative formats used for proving conditional assertions are discussed in Forms of Proof.

The truth table for conditional assertions

A conditional assertion “If $P$ then $Q$” has the precise truth table shown here.

 

$P$ $Q$ If $P$ then $Q$
T T T
T F F
F T T
F F T
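The table can be reproduced mechanically. In the Python sketch below, the function name `implies` is an assumption chosen for illustration; it encodes “If $P$ then $Q$” as “(not $P$) or $Q$”, which has the same truth table.

```python
from itertools import product

def implies(p, q):
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q

# Rows appear in the same order as in the table: TT, TF, FT, FF.
for p, q in product([True, False], repeat=2):
    print(p, q, implies(p, q))
```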

The meaning of “If $P$ then $Q$” is determined entirely by the truth values of $P$ and $Q$ and this truth table. The meaning is not determined by the usual English meanings of the words “if” and “then”.

The truth table is summed up by this purple pronouncement:

The Prime Directive of conditional assertions:
A conditional assertion is true unless
the hypothesis is true and the conclusion is false.
That means that to prove “If $P$ then $Q$” is  FALSE
you must show that $P$ is TRUE(!) and $Q$ is FALSE.

The Prime Directive is harder to believe in than leprechauns. Some who are new to abstract math get into an enormous amount of difficulty because they don’t take it seriously.

Example

The statement “if $n\gt 5$, then $n\gt 3$” is true for all integers $n$.

  • This means that “If $7\gt 5$ then $7\gt 3$” is true.
  • It also means that “If $2\gt 5$ then $2\gt 3$” is true!  If you really believe that “If $n\gt 5$, then $n\gt 3$” is true for all integers $n$, then you must in particular believe that  “If $2\gt 5$ then $2\gt 3$” is true.  That’s why the truth table for conditional assertions takes the form it does.
  • On the other hand, “If $n\gt 5$, then $n\gt 8$” is not true for all integers $n$.  In particular, “If $7\gt 5$, then $7\gt 8$” is false. This fits what the truth table says, too.
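The three bullet points can be checked instance by instance. This Python sketch makes two assumptions for illustration: the helper `implies` encodes the truth table as “(not $P$) or $Q$”, and the finite range `integers` stands in for “all integers”.

```python
def implies(p, q):
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q

integers = range(-50, 51)   # finite sample standing in for "all integers"

# "If n > 5 then n > 3" holds for every sampled n, including n = 2.
print(all(implies(n > 5, n > 3) for n in integers))

# "If n > 5 then n > 8" fails; each counterexample has a true hypothesis
# and a false conclusion.
counterexamples = [n for n in integers if not implies(n > 5, n > 8)]
print(counterexamples)
```

Notice that every instance with a false hypothesis, such as $n=2$, counts as true; only the instances listed as counterexamples make the universal claim fail.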

For more about this, see Understanding conditionals.

Remark

Most of the time in mathematical writing the conditional assertions which are actually stated involve assertions containing variables, and the claim is typically that the assertion is true for all instances of the variables. Assertions involving statements without variables occur only implicitly in the process of checking instances of the assertions. That is why a statement such as, “If $2\gt 5$ then $2\gt 3$” seems awkward and unfamiliar.

It is unfamiliar and occurs rarely. I mention it here because of the occurrence of vacuous truths, which do occur in mathematical writing.

Conditionals and Truth Sets

The set $\{x|P(x)\}$ is the set of exactly all $x$ for which $P(x)$ is true. It is called the truth set of $P(x)$.

Examples
  • If $n$ is an integer variable, then the truth set of “$3\lt n\lt9$” is the set $\{4,5,6,7,8\}$.
  • The truth set of “$n\gt n+1$” is the empty set.

Weak and strong

“If $P(x)$ then $Q(x)$” means that $\{x|P(x)\}\subseteq \{x|Q(x)\}$.  We say $P(x)$ is stronger than $Q(x)$, meaning that $P$ puts more requirements on $x$ than $Q$ does.  The objects $x$ that make $P$ true necessarily make $Q$ true, but there might be objects making $Q$ true that don’t make $P$ true.

Example

The statement “$x\gt4$” is stronger than the statement “$x\gt\pi$”. That means that $\{x|x\gt4\}$ is a proper subset of $\{x|x\gt\pi\}$. In other words, $\{x|x\gt4\}$ is “smaller” than $\{x|x\gt\pi\}$ in the sense of subsets. For example, $3.5\in\{x|x\gt\pi\}$ but $3.5\notin\{x|x\gt4\}$. This is a kind of reversal (a Galois correspondence) that confused many of my students.

“Smaller” means the truth set of the stronger statement omits elements that are in the truth set of the weaker statement. In the case of finite truth sets, “smaller” also means it has fewer elements, but that does not necessarily work for infinite sets, such as in the example above, because the two truth sets $\{x|x\gt4\}$ and $\{x|x\gt\pi\}$ have the same cardinality.

Making a statement stronger
makes its truth set “smaller”.
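Here is a small Python sketch of that slogan. It assumes, for illustration only, that the variable ranges over a finite set of integers called `universe` (so the number $3.5$ from the example above does not appear), and the names `truth_set`, `stronger` and `weaker` are made up.

```python
import math

universe = range(-20, 21)   # finite stand-in for the integers

def truth_set(P):
    """Set of all n in the universe for which P(n) is true."""
    return {n for n in universe if P(n)}

stronger = truth_set(lambda n: n > 4)
weaker = truth_set(lambda n: n > math.pi)

print(stronger <= weaker)            # subset: "n > 4" implies "n > pi"
print(stronger < weaker)             # proper subset
print(sorted(weaker - stronger))     # elements the stronger statement omits
```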

Terminology and usage

Hypothesis and conclusion

In the assertion “If $P$, then $Q$”:

  • $P$ is the hypothesis or antecedent of the assertion.  It is a constraint or condition that holds in the very narrow context of the assertion.  In other words, the assertion, “If $P$, then $Q$” does not say that $P$ is true. The idea of the direct method of proof is to assume that $P$ is true during the proof.
  • $Q$ is the conclusion or consequent. It is also incorrect to assume that $Q$ is true anywhere else except in the assertion “If $P$, then $Q$”.

“Implication”

Conditionals such as “If $P$ then $Q$” are also called implications, but be wary: “implication” is a technical term and does not fit the meaning of the word in conversational English.

  • In ordinary English, you might ask, “What are the implications of knowing that $x\gt4$?” Answer: “Well, for one thing, $x$ is bigger than $\pi$.”
  • In the terminology of math and logic, the whole statement “If $x\gt4$ then $x\gt\pi$” is called an “implication”.

Vacuous truth

The last two lines of the truth table for conditional assertions mean that if the hypothesis of the assertion is false, then the assertion is automatically true.
In the case that “If $P$ then $Q$” is true because $P$ is false, the assertion is said to be vacuously true.

The word “vacuous” refers to the fact that in the vacuous case the conditional assertion says nothing interesting about either $P$ or $Q$. In particular, the conditional assertion may be true even if the conclusion is false (because of the last line of the truth table).

Example

Both these statements are vacuously true!

  • If $4$ is odd, then $3 = 3$.
  • If $4$ is odd, then $3\neq3$.
Example

If $A$ is any set then $\emptyset\subseteq A.$ Proof (rewrite by definition): You have to prove that if $x\in\emptyset$, then $x\in A$. But the statement “$x\in\emptyset$” is false no matter what $x$ is, so the statement “$\emptyset\subseteq A$” is vacuously true.

Definitions involving vacuous truth

Vacuous truth can cause surprises in connection with certain concepts which are defined using a conditional assertion.

Example
  • Suppose $R$ is a relation on a set $S$. Then $R$ is antisymmetric if the following statement is true: For all $x,y\in S$, if $xRy$ and $yRx$, then $x=y$.
  • For example, the relation “$\leq$” on the real numbers is antisymmetric, because if $x\leq y$ and $y\leq x$, then $x=y$.
  • The relation “$\lt$” on the real numbers is also antisymmetric. It is vacuously antisymmetric, because the statement

    (AS) “if $x\lt y$ and $y\lt x$, then $x = y$”

    is vacuously true. If you say it can’t happen that $x\lt y$ and $y\lt x$, you are correct, and that means precisely that (AS) is vacuously true.
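The definition can be tested on a finite sample. In this Python sketch the function name `is_antisymmetric` and the sample set `S` are assumptions made for illustration; the relations in the example are on all the real numbers, which a program cannot enumerate.

```python
from itertools import product
import operator

def is_antisymmetric(rel, S):
    """Check: for all x, y in S, if x rel y and y rel x, then x == y."""
    return all(
        (not (rel(x, y) and rel(y, x))) or x == y
        for x, y in product(S, repeat=2)
    )

S = range(-5, 6)   # a finite sample of integers

print(is_antisymmetric(operator.le, S))   # hypothesis holds only when x == y
print(is_antisymmetric(operator.lt, S))   # hypothesis never holds: vacuous
print(is_antisymmetric(operator.ne, S))   # fails, for example with x = 0, y = 1

# The hypothesis of (AS) is never satisfied for "<".
print(any(x < y and y < x for x, y in product(S, repeat=2)))
```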

Remark

Although vacuous truth may be disturbing when you first see it, making either statement in the example false would result in even more peculiar situations. For example, if you decided that “If $P$ then $Q$” must be false when $P$ and $Q$ are both false, you would then have to say that this statement

“For any integers $m$ and $n$, if $m\gt 5$ and $5\gt n$, then $m\gt n$”
 

 

is not always true (substitute $3$ for $m$ and $4$ for $n$ and you get both $P$ and $Q$ false). This would surely be an unsatisfactory state of affairs.

How conditional assertions are worded

A conditional assertion may be worded in various ways.  It takes some practice to get used to understanding all of them as conditional.

Our habit of swiping English words and phrases and changing their meaning in an unintuitive way causes many problems for new students, but I am sure that the worst problem of that kind is caused by the way conditional assertions are worded.

In math English

The most common ways of wording a conditional assertion with hypothesis $P$ and conclusion $Q$ are:

  • If $P$, then $Q$.
  • $P$ implies $Q$.
  • $P$ only if $Q$.
  • $P$ is sufficient for $Q$.
  • $Q$ is necessary for $P$.

In the symbolic language

  • $P(x)\to Q(x)$
  • $P(x)\Rightarrow Q(x)$
  • $P(x)\supset Q(x)$

Math logic is notorious for the many different symbols used by different authors with the same meaning. This is in part because it developed separately in three different academic areas: Math, Philosophy and Computing Science.

Example

All the statements below mean the same thing. In these statements $n$ is an integer variable.

  • If $n\lt5$, then $n\lt10$.
  • $n\lt5$ implies $n\lt10$.
  • $n\lt5$ only if $n\lt10$.
  • $n\lt5$ is sufficient for $n\lt10$.
  • $n\lt10$ is necessary for $n\lt5$.
  • $n\lt5\to n\lt10$
  • $n\lt5\Rightarrow n\lt10$
  • $n\lt5\supset n\lt10$

Since “$P(x)\supset Q(x)$” means that $\{x\mid P(x)\}\subseteq\{x\mid Q(x)\}$, there is a notational clash between implication written “$\supset$” and inclusion written “$\subseteq$”: the two symbols point in opposite directions. This is exacerbated by the two meanings of the inclusion symbol “$\subset$”.
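The connection between the conditional and set inclusion can be checked on a finite universe. This is a sketch under the assumption that a small range of integers stands in for all integers; the variable names are mine.

```python
universe = range(-20, 21)  # finite stand-in for the integers (assumption)

P = {n for n in universe if n < 5}   # truth set of "n < 5"
Q = {n for n in universe if n < 10}  # truth set of "n < 10"

# "If n < 5 then n < 10" holds for every n in the universe ...
conditional_holds = all((not (n < 5)) or (n < 10) for n in universe)

# ... exactly when the truth set of the hypothesis is included in
# the truth set of the conclusion.
inclusion_holds = P <= Q

# The inclusion does not go the other way, matching the false converse.
converse_inclusion_holds = Q <= P

print(conditional_holds, inclusion_holds, converse_inclusion_holds)
# True True False
```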

These ways of wording conditionals cause problems for students, some of them severe. They are discussed in the section Understanding conditionals.

Usage of symbols

The logical symbols “$\to$”, “$\Rightarrow$”, “$\supset$” are frequently used when writing on the blackboard, but are not common in texts, except for texts in mathematical logic.

More about implication in logic

If you know some logic, you may know that there is a subtle difference between the statements

  • “If $P$ then $Q$”
  • “$P$ implies $Q$”.

Here is a concrete example:

  1. “If $x\gt2$,  then $x$ is positive.”
  2. “$x\gt2$ implies that $x$ is positive.”

Note that the subject of sentence (1) is the (variable) number $x$, but the subject of sentence (2) is the assertion “$x\gt2$”. Behind this is a distinction made in formal logic between the material conditional “if $P$ then $Q$” (which means that $P$ and $Q$ obey the truth table for “If..then”) and logical consequence ($Q$ can be proved given $P$). I will ignore the distinction here, as most mathematicians do except when they are proving things about logic.

In some texts, $P\Rightarrow Q$ denotes the material conditional and $P\to Q$ denotes logical consequence.

Universal conditional assertions

A conditional assertion containing a variable that is true for any value of the correct type of that variable is a universally true conditional assertion. It is a special case of the general notion of universally true assertion.

Examples
  1. For all $x$, if $x\lt5$, then $x\lt10$.
  2. For any integer $n$, if $n^2$ is even, then $n$ is even.
  3. For any real number $x$, if $x$ is an integer, then $x^2$ is an integer.

These are all assertions of the form “If $P(x)$, then $Q(x)$”. In (1), the hypothesis is the assertion “$x\lt5$”; in (2), it is the assertion “$n^2$ is even”, using an adjective to describe the property that $n^2$ is even; in (3), it is the assertion “$x$ is an integer”, using a noun to assert that $x$ has the property of being an integer. (See integral.)

Expressing universally true conditionals in math English

The sentences listed in the example above provide ways of expressing universally true conditionals in English. They use “for all” or “for any”. You may also use these forms (compare this discussion of universal assertions in general).

  • For all functions $f$, if $f$ is differentiable then it is continuous.
  • For (every, any, each) function $f$, if $f$ is differentiable then it is continuous.
  • If $f$ is differentiable then it is continuous, for any function $f$.
  • If $f$ is differentiable then it is continuous, where $f$ is any function.
  • If a function $f$ is differentiable, then it is continuous. (See indefinite article.)

Sometimes mathematicians write, “If the function $f$ is differentiable, then it is continuous.” At least sometimes, they mean that every function that is differentiable is continuous. I suspect that this usage occurs in texts written by non-native-English speakers.

Disguised conditionals

There are other ways of expressing universal conditionals that are disguised, because they are not conditional assertions in English.

Let $C(f)$ mean that $f$ is continuous and $D(f)$ mean that $f$ is differentiable. The (true) assertion

“For all $f$, if $D(f)$, then $C(f)$”

can be said in the following ways:

  1. Every (any, each) differentiable function is continuous.
  2. All differentiable functions are continuous.
  3. Differentiable functions are continuous. Or: “…are always continuous.”
  4. A differentiable function is continuous.
  5. The differentiable functions are continuous.

Notes

  • Watch out for (4). Beginning abstract math students sometimes don’t recognize it as universal. They may read it as “Some differentiable function is continuous.” Authors often write, “A differentiable function is necessarily continuous.”
  • I believe that (5) is obsolescent. I don’t think younger native-English-speaking Americans would use it. (Warning: This claim is not based on lexicographical research.)

Assertions related to a conditional assertion

Converse

The converse of a conditional assertion “If $P$ then $Q$” is “If $Q$ then $P$”.

Whether a conditional assertion is true has no bearing on whether its converse is true.

Examples
  • The converse of “If it’s a cow, it eats grass” is “If it eats grass, it’s a cow”. The first statement is true (let’s ignore the Japanese steers that drink beer or whatever), but the second statement is definitely false. Sheep eat grass, and they are not cows.
  • The converse of “For all real numbers $x$, if $x > 3$, then $x > 2$.” is “For all real numbers $x$, if $x > 2$, then $x > 3$.” The first is true and the second one is false.
  • “For all integers $n$, if $n$ is even, then $n^2$ is even.” Both this statement and its converse are true.
  • “For all integers $n$, if $n$ is divisible by $2$, then $2n +1$ is divisible by $3$.” Both this statement and its converse are false.
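You can hunt for counterexamples to the numerical examples above with a few lines of Python. This is a sketch, not a proof: an empty list only says that no counterexample was found in the finite sample searched. The helper `counterexamples` and the sample ranges are my own choices.

```python
def counterexamples(hypothesis, conclusion, domain):
    """Values in domain where the hypothesis holds but the conclusion fails."""
    return [x for x in domain if hypothesis(x) and not conclusion(x)]


ints = range(-10, 11)                      # sample of integers (assumption)
halves = [k / 2 for k in range(-20, 21)]   # sample of real numbers (assumption)


def gt3(x): return x > 3
def gt2(x): return x > 2
def is_even(n): return n % 2 == 0
def square_is_even(n): return (n * n) % 2 == 0
def two_n_plus_1_div_by_3(n): return (2 * n + 1) % 3 == 0


# "If x > 3 then x > 2": no counterexample found; its converse fails at 2.5.
print(counterexamples(gt3, gt2, halves))         # []
print(2.5 in counterexamples(gt2, gt3, halves))  # True

# "If n is even then n^2 is even": neither direction has a counterexample here.
print(counterexamples(is_even, square_is_even, ints))  # []
print(counterexamples(square_is_even, is_even, ints))  # []

# "If n is divisible by 2 then 2n + 1 is divisible by 3": both directions fail.
print(2 in counterexamples(is_even, two_n_plus_1_div_by_3, ints))  # True
print(1 in counterexamples(two_n_plus_1_div_by_3, is_even, ints))  # True
```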

Contrapositive

The contrapositive of a conditional assertion “If $P$ then $Q$” is “If not $Q$ then not $P$.”

A conditional assertion and its contrapositive are both true or both false.

Example

The contrapositive of “If $x > 3$, then $x > 2$” is (after a little translation) “If $x\leq2$ then $x\leq3$.” For any number $x$, these two statements are both true or both false.

This means that if you prove “If not $Q$ then not $P$”, then you have also proved “If $P$ then $Q$.”

You can prove an assertion by proving its contrapositive.

This is called the contrapositive method and is discussed in detail in this section.

So a conditional assertion and its contrapositive have the same truth value. Two assertions that have the same truth value are said to be equivalent. Equivalence is discussed with examples in the Wikipedia article on necessary and sufficient.
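Because there are only four combinations of truth values for $P$ and $Q$, the claims about the contrapositive and the converse can be checked exhaustively. A minimal sketch, with `implies` standing for the truth table of “If..then”:

```python
from itertools import product


def implies(p, q):
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q


rows = [
    (p, q, implies(p, q), implies(not q, not p), implies(q, p))
    for p, q in product([True, False], repeat=2)
]

# Column 3 is "If P then Q", column 4 its contrapositive, column 5 its converse.
contrapositive_equivalent = all(row[2] == row[3] for row in rows)
converse_equivalent = all(row[2] == row[4] for row in rows)

print(contrapositive_equivalent)  # True
print(converse_equivalent)        # False
```

The converse disagrees with the original assertion exactly on the rows where $P$ and $Q$ have different truth values.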

Understanding conditional assertions

As you can see from the preceding discussions, statements of the form “If $P$ then $Q$” don’t mean the same thing in math that they do in ordinary English. This causes semantic contamination.

Examples

Time

In ordinary English, “If $P$ then $Q$” can suggest order of occurrence. For example, “If we go outside, then the neighbors will see us” implies that the neighbors will see us after we go outside.

Consider “If $n\gt7$, then $n\gt5$.” If $n\gt7$, that doesn’t mean $n$ suddenly gets greater than $7$ earlier than $n$ gets greater than $5$. On the other hand, “$n\gt5$ is necessary for $n\gt7$” (which, remember, means the same thing as “If $n\gt7$, then $n\gt5$”) doesn’t mean that $n\gt5$ happens earlier than $n\gt7$. Since we are used to “if…then” having a timing implication, I suspect we get subconscious dissonance between “If $P$ then $Q$” and “$Q$ is necessary for $P$” in mathematical statements, and this dissonance makes it difficult to believe that they can mean the same thing.

Causation

“If $P$ then $Q$” can also suggest causation. The sentence, “If we go outside, the neighbors will see us” has the connotation that the neighbors will see us because we went outside.

The contrapositive is “If the neighbors won’t see us, then we don’t go outside.” This English sentence seems to me to mean that if the neighbors are not around to see us, then that causes us to stay inside. In contrast to contrapositive in math, this means something quite different from the original sentence.

Wrong truth table

For some instances of the use of “if…then” in English, the truth table is different.

Consider: “If you eat your vegetables, you can have dessert.” Every child knows that this means they will get dessert if they eat their vegetables and not otherwise. So the truth table is:

$P$   $Q$   If $P$ then $Q$
T     T     T
T     F     F
F     T     F
F     F     T

In other words, $P$ is equivalent to $Q$. It appears to me that this truth table corresponds to English “if…then” when a rule is being asserted.
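To see exactly where the “dessert” reading departs from the mathematical one, compare the two truth tables row by row. A small sketch; the function names `implies` and `iff` are mine.

```python
from itertools import product


def implies(p, q):
    """Mathematical "If P then Q": false only when p is true and q is false."""
    return (not p) or q


def iff(p, q):
    """The "dessert" reading: true exactly when p and q agree."""
    return p == q


differences = [
    (p, q)
    for p, q in product([True, False], repeat=2)
    if implies(p, q) != iff(p, q)
]

print(differences)  # [(False, True)]
```

The only disagreement is the row where $P$ is false and $Q$ is true: the child who skips the vegetables and gets dessert anyway.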

These examples show:

The different ways of expressing conditional assertions may mean different things in English.

How can you get to the stage where you automatically understand the meaning of conditional assertions in math English?

You need to understand the equivalence of these formulations so well that it is part of your unconscious reaction to conditionals.

How can you gain that intuitive understanding? One way is by doing abstract math regularly for several years! (Of course, this is how you gain expertise in anything.) In other words, Practice, Practice!

Rigor

But it may help to remember that when doing proofs, we must take the rigorous view of mathematical objects:

  • Math objects don’t change.
  • Math objects don’t cause anything to happen.

The integers (like all math objects) just sit there, not doing anything and not affecting anything. $10$ is not greater than $4$ “because” it is greater than $7$. There is no “because” in rigorous math. Both facts, $10\gt4$ and $10\gt7$, are eternally true.

Eternal is how we think of them – I am not making a claim about “reality”.

  • When you look at the integers, every time you find one that is greater than $7$ it turns out to be greater than $4$. That is how to think about “If $n > 7$, then $n > 4$”.
  • You can’t find one that is greater than $7$ unless it is greater than $4$: It happens that $n > 7$ only if $n > 4$.
  • Every time you look at one less than or equal to $4$ it turns out to be less than or equal to $7$ (contrapositive).

These three observations describe the same set of facts about a bunch of things (integers) that just sit there in their various relationships without changing, moving or doing anything. If you keep these remarks in mind, you will eventually have a natural, unforced understanding of conditionals in math.

Remark

None of this means you have to think of mathematical objects as dead and fossilized all the time. Feel free to think of them using all the metaphors and imagery you know, except when you are reading or formulating a proof written in mathematical English. Then you have to be rigorous!

Modus ponens

The truth table for conditional assertions may be summed up by saying: The conditional assertion “If $P$, then $Q$” is true unless $P$ is true and $Q$ is false.

This fits with the major use of conditional assertions in reasoning:

Modus Ponens

  • If you know that a conditional assertion is true
  • and you know that its hypothesis is true,
  • then you know its conclusion is true.

In symbols:

(1) When “If $P$ then $Q$” and $P$ are both true,

(2) then $Q$ must be true as well.

Modus Ponens is the most used method of deduction of all.

Remark

Modus ponens is not a method of proving conditional assertions. It is a method of using a conditional assertion in the proof of another assertion. Methods for proving conditional assertions are found in the chapter Forms of proof.
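That modus ponens agrees with the truth table can be confirmed by checking all four cases. The sketch below also tries the tempting but invalid pattern of reasoning from the conclusion back to the hypothesis, which I include only for contrast.

```python
from itertools import product


def implies(p, q):
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q


cases = list(product([True, False], repeat=2))

# Modus ponens: whenever "If P then Q" and P are both true, Q is true.
modus_ponens_valid = all(implies(implies(p, q) and p, q) for p, q in cases)

# Contrast: from "If P then Q" and Q you cannot conclude P.
reverse_valid = all(implies(implies(p, q) and q, p) for p, q in cases)

print(modus_ponens_valid)  # True
print(reverse_valid)       # False
```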

Creative Commons License

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.

 


The role of proofs in mathematical writing

This post outlines the way that proofs are used in mathematical writing. I have been revising the chapter on Proofs in abstractmath.org, and I felt that giving an overview would keep my mind organized when I was enmeshed in writing up complicated details.

Proofs are the sole method for ensuring that a math statement is correct.

  • Evidence that something is true gooses us into trying to prove it, but as all research mathematicians know, evidence only means that some instances are true, nothing else.
  • Intuition, metaphors, analogies may lead us to come up with conjectures. If the gods are smiling that day, they may even suggest a method of proof. And that method may even (miracle) work. Sometimes. If it does, we get a theorem, but not a Fields medal.
  • Students may not know these facts about proof. Indeed, students at the very beginning probably don’t know what a proof actually is: “Proof” in math is not at all the same as “proof” in science or “proof” in law.

A proof has two faces: Its logical structure and its presentation.

The logical structure of a proof consists of methods of compounding and quantifying assertions and methods of deduction.

  • The logical structure is usually expressed as a mathematical object.
  • The most familiar such math objects are the predicate calculus and type theory.
  • Mathematical logic does not have standard terminology (see Math reasoning.) Because of that, the chapter on Proofs uses English words, for example “or” instead of symbols such as $P\lor Q$ or $P+Q$ or $P||Q$.
  • For beginning students, throwing large chunks of mathematical logic at them doesn’t work. The expressions and the rules of deduction need to be introduced to them in context, and in my opinion using few or no logical symbols.
  • Students vary widely in their ability to grasp foreign languages, and symbolic logic in any of its forms is a foreign language. (So is algebra; see my rant.)
  • The rules of deduction do not come naturally to the students, and yet they need to have the rules operate automatically and subconsciously. They should know the names of the nonobvious rules, like “proof by contradiction” and “induction”, but teaching them to be fluent with logical notation is probably a waste of time, since they would have to learn the rules of deduction and a new foreign language at the same time.
  • I hasten to add, a waste of time for beginning students. There are good reasons for students aiming at certain careers to be proficient in type theory, and maybe even in predicate calculus.

Presentation of proofs

  • Proofs are usually written in narrative form.
  • A major source of difficulties is that the presentation of a proof (the way it is written in narrative form) omits the reasons that most of the proof steps follow from preceding ones.
  • Some of the omitted reasons may depend on knowledge the reader does not have. “Let $S\subset\mathbb{Q}\times\mathbb{Q}$. Let $i:S\to\mathbb{N}$ be a bijection…” Note: I am not criticizing someone who writes an argument like this, I am just saying that it is a problem for many beginning students.
  • Some reasons are given for some of the steps, presumably ones that the writer thinks might not be obvious to the reader.
  • Sometimes the narrative form gives a clue to the form of proof to be used. Example: “Prove that the length $C$ of the hypotenuse of a right triangle is less than the sum of the lengths $A$ and $B$ of the other two sides. Proof: Assume $C\geq A+B$…” So you immediately know that this is going to be a proof by contradiction. But you have to teach the student to recognize this.
  • Another example: in proving that $P$ implies $Q$, the author will prove that “$Q$ is false” implies “$P$ is false” without further comment. The reader is supposed to recognize the proof by contrapositive.

Translation problem

  • The Translation problem is the problem of translating a narrative proof into the logical reasoning needed to see that it really is a proof.
  • Many experienced professional mathematicians say it is so hard for them to read a narrative proof that they read the theorem and then try to recreate the proof by thinking about it and glancing at the written proof for hints from time to time. That is a sign of how difficult the translation problem really is.
  • Nevertheless, the students need to learn the unfamiliar proof techniques such as contrapositive and contradiction and the wording tricks that communicate proof methodology. Learning this is hard work. It helps for teachers to be more explicit about the techniques and tricks with students who are beginning math major courses.

References

Added 2014-12-19


Those monks

In my long post, Proofs without dry bones, I discussed the Monk Theorem (my name) in the context of my ideas about rigorous proof. Here, I want to amplify some of my remarks in the post.

This post was stimulated by Mark Turner’s new book on conceptual blending. That book has many examples of conceptual blending, including the monk theorem, that go into deep detail about how they work. I highly recommend reading his analysis of the monk theorem. Note: I haven’t finished reading the book.

The Monk Theorem

A monk starts at dawn at the bottom of a mountain and goes up a path to the top, arriving there at dusk. The next morning at dawn he begins to go down the path, arriving at dusk at the place he started from on the previous day. Prove that there is a time of day at which he is at the same place on the path on both days.

Proof: Envision both events occurring on the same day, with a monk starting at the top and another starting at the bottom at the same time and doing the same thing the original monk did on different days. They are on the same path, so they must meet each other. The time at which they meet is the time required.

The proof

One of the points in Proofs without dry bones was that the proof above is a genuine mathematical proof, in spite of the fact that it uses no recognizable math theorems or math objects. It does contain unspoken assumptions, but so does any math proof. Some of the assumptions:

  • A path has the property that if two people, one at each end, start walking to the opposite end, they will meet each other at a certain time.
  • A day is a period of time which contains a time “dawn” and a later time “dusk”.

From a mathematician’s point of view, the words “people”, “walking”, “meet”, “path”, “day”, “dawn” and “dusk” could be arbitrary names having the properties stipulated by the assumptions. This is typical mathematical behavior. “Time” is assumed to behave as we commonly perceive it.

If you think closely about the proof, you will probably come up with some refinements that are necessary to reveal other hidden assumptions (particularly about time). That is also typical mathematical behavior. (Remember Hilbert refining Euclid’s postulates about geometry after thousands of years of people not noticing the enthymemes in the postulates.)

This proof does not require that walking on the path be modeled by a function \[t\mapsto (x,y,z):\mathbb{R}\to\mathbb{R}\times\mathbb{R}\times\mathbb{R}\] followed by an appeal to the intermediate value theorem, which I mentioned in “Proofs without dry bones”.

You could simply proceed to make your assumptions about “path”, “meet”, “time”, and so on more explicit until you (or the mathematician you are arguing with) are satisfied. It is in that sense that I claim the proof given above is a genuine mathematical proof.
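For comparison only, here is what the function-based version looks like when carried out numerically. Everything specific in this sketch is an assumption of mine: position is a single number between $0$ (bottom) and $1$ (top) rather than a point $(x,y,z)$, time runs from $0$ (dawn) to $1$ (dusk), and the two walking schedules `up` and `down` are invented. Bisection is the computational shadow of the intermediate value theorem.

```python
def up(t):
    """Hypothetical ascent: position on the path at time t, from 0 to 1."""
    return t ** 2


def down(t):
    """Hypothetical descent: position on the path at time t, from 1 to 0."""
    return 1 - t


def meeting_time(ascent, descent, tol=1e-9):
    """Bisection on ascent(t) - descent(t), which is negative at dawn
    and positive at dusk, assuming both functions are continuous."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if ascent(mid) - descent(mid) < 0:
            lo = mid   # the climber is still below the descender
        else:
            hi = mid   # they have met or passed each other
    return (lo + hi) / 2


t = meeting_time(up, down)
print(round(t, 6), round(up(t), 6), round(down(t), 6))
```

The two-monk argument reaches the same conclusion without choosing any particular functions, which is the point of the discussion above.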

References

  • The Origin of Ideas: Blending, Creativity, and the Human Spark, by Mark Turner. Oxford University Press, 2014.
  • The Way We Think: Conceptual Blending And The Mind’s Hidden Complexities, by Gilles Fauconnier and Mark Turner. Basic Books, 2003.
  • Proofs without dry bones. Blog post.
  • The rigorous view: inertness. Article on abstractmath.org.
  • Conceptual blending in Wikipedia.
