This post has been replaced by the post A slow introduction to category theory.
Introducing abstract topics
I have been busy for the past several years revising abstractmath.org (abmath). Now I believe, perhaps foolishly, that most of the articles in abmath have reached beta, so now it is time for something new.
For some time I have been considering writing introductions to topics in abstract math, some typically studied by undergraduates and some taken by scientists and engineers. The topics I have in mind to do first include group theory and category theory.
The point of these introductions is to get the student started at the very beginning of the topic, when some students give up in total confusion. They meet and fall off of what I have called the abstraction cliff, which is discussed here and also in my blog posts Very early difficulties and Very early difficulties II.
I may have stolen the phrase “abstraction cliff” from someone else.
Group theory
Group theory sets several traps for beginning students.
Multiplication table
- A student may balk when a small finite group is defined using a set of letters in a multiplication table. “But you didn’t say what the letters are or what the multiplication is?”
- Such a definition is an abstract definition, in contrast to the definition of “prime”, for example, which is stated in terms of already known entities, namely the integers.
- The multiplication table of a group tells you exactly what the binary operation is and any set with an operation that makes such a table correct is an example of the group being defined.
- A student who has no understanding of abstraction is going to be totally lost in this situation. It is quite possible that the professor has never even mentioned the concept of abstract definition. The professor is probably like most successful mathematicians: when they were students, they understood abstraction without having to have it explained, and possibly without even noticing they did so.
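The point about multiplication tables can be made concrete with a short sketch. The letters `a`, `b`, `c`, the table itself, and the function names below are all hypothetical choices made for illustration. The code checks the group axioms using nothing but the table, and then checks that one particular set with an operation (the numbers 0, 1, 2 under addition mod 3) makes the table correct.

```python
from itertools import product

# Hypothetical example: a table on the letters a, b, c, given with no
# statement of what the letters "are".
ELEMENTS = ["a", "b", "c"]
TABLE = {
    ("a", "a"): "a", ("a", "b"): "b", ("a", "c"): "c",
    ("b", "a"): "b", ("b", "b"): "c", ("b", "c"): "a",
    ("c", "a"): "c", ("c", "b"): "a", ("c", "c"): "b",
}

def is_group(elements, table):
    """Check the group axioms using only the multiplication table."""
    def op(x, y):
        return table[(x, y)]
    # Closure: every product is again one of the listed elements.
    if any(op(x, y) not in elements for x, y in product(elements, repeat=2)):
        return False
    # Associativity: (xy)z == x(yz) for all triples.
    if any(op(op(x, y), z) != op(x, op(y, z))
           for x, y, z in product(elements, repeat=3)):
        return False
    # Identity: some e with ex == xe == x for all x.
    identities = [e for e in elements
                  if all(op(e, x) == x and op(x, e) == x for x in elements)]
    if not identities:
        return False
    e = identities[0]
    # Inverses: each x has some y with xy == yx == e.
    return all(any(op(x, y) == e and op(y, x) == e for y in elements)
               for x in elements)

# One concrete reading of the letters: a, b, c as 0, 1, 2 under addition mod 3.
MODEL = {"a": 0, "b": 1, "c": 2}

def model_matches(table, model, n=3):
    """Does addition mod n on the model's numbers reproduce the table?"""
    return all((model[x] + model[y]) % n == model[z]
               for (x, y), z in table.items())
```

Here `is_group(ELEMENTS, TABLE)` never asks what the letters stand for, which is exactly what makes the definition abstract; `model_matches(TABLE, MODEL)` shows that one familiar structure is an example of the group the table defines.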
Cosets
- Cosets are a real killer. Some students at this stage are nowhere near thinking of a set as an object or a thing. The concept of applying a binary operation on a pair of sets (or any other mathematical objects with internal structure) is completely foreign to them. Did anyone ever talk to them about mathematical objects?
- The consequence of this early difficulty is that such a student will find it hard to understand what a quotient group is, and that is one of the major concepts you get early in a group theory course.
- The conceptual problems with multiplication of cosets are similar to those with pointwise addition of functions. Given two functions $f,g:\mathbb{R}\to\mathbb{R}$, you define $f+g$ to be the function \[(f+g)(x):=f(x)+g(x)\] Along with pointwise multiplication, this makes the space of functions $\mathbb{R}\to\mathbb{R}$ a ring with nice properties.
- But you have to understand that each element of the ring is a function thought of as a single math object. The values of the function are properties of the function, but they are not elements of the ring. (You can include the real numbers in the ring as constant functions, but don’t confuse me with facts.)
- Similarly the elements of the quotient group are math objects called cosets. They are not elements of the original group. (To add to the confusion, they are also blocks of a congruence.)
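A small sketch may help with the idea that a coset is a single object that can be fed to a binary operation. It assumes, purely for illustration, the additive group of integers mod 6 and its subgroup $\{0,3\}$, so the operation on cosets is written as addition rather than multiplication; the names `coset`, `add_cosets` and `QUOTIENT` are hypothetical.

```python
N = 6                      # work in the integers mod 6 under addition
H = frozenset({0, 3})      # a subgroup of Z_6

def coset(g):
    """The coset g + H, packaged as one frozen set."""
    return frozenset((g + h) % N for h in H)

def add_cosets(A, B):
    """Add two cosets by adding their members in all possible ways."""
    return frozenset((a + b) % N for a in A for b in B)

# The elements of the quotient group are the cosets themselves.
QUOTIENT = {coset(g) for g in range(N)}

# Adding the coset of 1 to the coset of 2 gives the coset of 3, which is H.
example_sum = add_cosets(coset(1), coset(2))
```

The set `QUOTIENT` has three elements, each of which is a set of two numbers. The number `1` is not an element of `QUOTIENT`, although the coset `coset(1)` is.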
Isomorphic groups
- Many books, and many professors (including me) regard two isomorphic groups as the same. I remember getting anguished questions: “But the elements of $\mathbb{Z}_2$ are equivalence classes and the elements of the group of permutations of $\{1,2\}$ are functions.”
- I admit that regarding two isomorphic groups as the same needs to be treated carefully when, unlike $\mathbb{Z}_2$, the group has a nontrivial automorphism group. ($\mathbb{Z}_3$ is “the same as itself” in two different ways.) But you don’t have to bring that up the first time you attack that subject, any more than you have to bring up the fact that the category of sets does not have a set of objects on the first day you define categories.
Category theory
Category theory causes similar troubles. Beginning college math majors don’t usually meet it early. But category theory has begun to be used in other fields, so plenty of computer science students, people dealing with databases, and so on are suddenly trying to understand categories and failing to do so at the very start.
The G&G post A new kind of introduction to category theory constitutes an alpha draft of the first part of an article introducing category theory following the ideas of this post.
Objects and arrows are abstract
- Every once in a while someone asks a question on Math StackExchange that shows they have no idea that an object of a category need not have elements and that morphisms need not be functions that take elements to elements.
- One questioner understood that the claim that a morphism need not be a function meant that it might be a multivalued function.
Duality
- That misunderstanding comes up with duality. The definition of dual category requires turning the arrows around. Even if the original morphism takes elements to elements, the opposite morphism does not have to take elements to elements. In the case of the category of sets, an arrow in $\text{Set}^{op}$ cannot take elements to elements — for example, the opposite of the function $\emptyset\to\{1,2\}$.
- The fact that there is a concrete category equivalent to $\text{Set}^{op}$ is a red herring. It involves different sets: the function corresponding to the function just mentioned goes from a four-element set to a singleton. But in the category $\text{Set}^{op}$ as defined it is simply an arrow, not a function.
Not understanding how to use definitions
- Some of the questioners on Math Stack Exchange ask how to prove a statement that is quite simple to prove directly from the definitions of the terms involved, but what they ask and what they are obviously trying to do is to gain an intuition in order to understand why the statement is true. This is backward: the first thing you should do is use the definition (at least in the first few days of a math class; after that you have to use theorems as well!).
- I have discussed this in the blog post Insights into mathematical definitions (which gives references to other longer discussions by math ed people). See also the abmath section Rewrite according to the definitions.
How an introduction to a math topic needs to be written
The following list shows some of the tactics I am thinking of using in the math topic introductions. It is quite likely that I will conclude that some tactics won’t work, and I am sure that tactics I haven’t mentioned here will be used.
- The introductions should not go very far into the subject. Instead, they should bring an exhaustive and explicit discussion of how to get into the very earliest part of the topic, perhaps the definition, some examples, and a few simple theorems. I doubt that a group theory student who hasn’t mastered abstraction and what proofs are about will ever be ready to learn the Sylow theorems.
- You can’t do examples and definitions simultaneously, but you can come close by going through an example step by step, checking each part of the definition.
- When you introduce an axiom, give an example of how you would prove that some binary operation satisfies the axiom. For example, if the axiom is that every element of a group must have an inverse, right then and there prove that addition on the integers satisfies the axiom and disprove that multiplication on integers satisfies it.
- When the definition uses some undefined math objects, point out immediately with examples that you can’t have any intuition about them except what the axioms give you. (In contrast to definition of division of integers, where you and the student already have intuitions about the objects.)
- Make explicit the possible problems with abstractmath.org and Gyre&Gimble) will indeed find it difficult to become mathematical researchers — but not impossible!
- But that is not the point. All college math professors will get people who will go into theoretical computing science, and therefore need to understand category theory, or into particle physics, and need to understand groups, and so on.
- By being clear at the earliest stages of how mathematicians actually do math, they will produce more people in other fields who actually have some grasp of what is going on with the topics they have studied in math classes, and hopefully will be willing to go back and learn some more math if some type of math rears its head in the theories of their field.
- Besides, why do you want to alienate huge numbers of people from math, as our way of teaching in the past has done?
- “Our” means grammar school teachers, high school teachers and college professors.
There is a real split between students who want the definitions first
(most of whom don’t have the abstraction problems I am trying to overcome)
and those who really really think they need examples first (the majority)
because they don’t understand abstraction.
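As a sketch of the tactic of checking an axiom the moment it is introduced, here is one way the inverse axiom might be tested on the integers. For addition, the identity is $0$, and every integer $n$ has the inverse $-n$: \[n+(-n)=0=(-n)+n.\] For multiplication, the identity is $1$, and an inverse of $2$ would have to be an integer $m$ with \[2m=1,\] which is impossible because $2m$ is even and $1$ is odd. So the integers with addition satisfy the inverse axiom and the integers with multiplication do not.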
Acknowledgment
Thanks to Kevin Clift for corrections.
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.
Insights into mathematical definitions
My general practice with abstractmath.org has been to write about the problems students have at the point where they first start studying abstract math, with some emphasis on the languages of math. I have used my own observations of students, lexicographical work I did in the early 2000’s, and papers written by workers in math ed at the college level.
A few months ago, I finished revising and updating abstractmath.org. This took rather more than a year because among other things I had to reconstitute the files so that the html could be edited directly. During that time I just about quit reading the math ed literature. In the last few weeks I have found several articles that have changed my thinking about some things I wrote in abmath, so now I need to go back and revise some more!
In this post I will make some points about definitions that I learned from the paper by Edwards and Ward and the paper by Selden and Selden.
I hope math ed people will read the final remarks.
Peculiarities of math definitions
“When I use a word, it means just what I choose it to mean–neither more nor less.” — Humpty Dumpty
A mathematical definition is fundamentally different from other sorts of definitions in two different ways. These differences are not widely appreciated by students or even by mathematicians. The differences cause students a lot of trouble.
List of properties
One of the ways in which a math definition is different from other kinds is that the definition of a math object is given by accumulation of attributes, that is, by listing properties that the object is required to have. Any object defined by the definition must have all those properties, and conversely any object with all the properties must be an example of the type of object being defined. Furthermore, there is no other criterion than the list of attributes.
Definitions in many fields, including some sciences, don’t follow this rule. Those definitions may list some properties the objects defined may have, but exceptions may be allowed. They also sometimes give prototypical examples. Dictionary definitions are generally based on observation of usage in writing and speech.
Imposed by decree
One thing that Edwards and Ward pointed out is that, unlike definitions in most other areas of knowledge, a math definition is stipulated. That means that the meaning of (the name of) a math object is imposed on the reader by decree, rather than being determined by studying the way the word is used, as a lexicographer would do. Mathematicians have the liberty of defining (or redefining) a math object in any way they want, provided it is expressed as a compulsory list of attributes. (When I read the paper by Edwards and Ward, I realized that the abstractmath.org article on math definitions did not spell that out, although it was implicit. I have recently revised it to say something about this, but it needs further work.)
An example is the fact that in the nineteenth century some mathematicians allowed $1$ to be a prime. Eventually they restricted the definition to exclude $1$ because including it complicated the statement of the Fundamental Theorem of Arithmetic.
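A small illustration of the complication: if $1$ counted as a prime, then \[6=2\cdot3=1\cdot2\cdot3=1\cdot1\cdot2\cdot3\] would be three different factorizations of $6$ into primes, so the uniqueness part of the theorem would have to say something like “unique except for factors of $1$”.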
Another example is that it has become common to stipulate codomains as well as domains for functions.
Student difficulties
Giving the math definition low priority
Some beginning abstract math students don’t give the math definition the absolute dictatorial power that it has. They may depend on their understanding of some examples they have studied and actively avoid referring to the definition. Examples of this are given by Edwards and Ward.
Arbitrary bothers them
Students are bothered by definitions that seem arbitrary. This includes the fact that the definition of “prime” excludes $1$. There is of course no rule that says definitions must not seem arbitrary, but the students still need an explanation (when we can give it) about why definitions are specified in the way they are.
What do you DO with a definition?
Some students don’t realize that a definition gives a magic formula — all you have to do is say it out loud.
More generally, the definition of a kind of math object, and also each theorem about it, gives you one or more methods to deal with the type of object.
For example, $n$ is a prime by definition if $n\gt 1$ and the only positive integers that divide $n$ are $1$ and $n$. Now if you know that $p$ is a prime bigger than $10$ then you can say that $p$ is not divisible by $3$ because the definition of prime says so. (In Hogwarts you have to say it in Latin, but that is no longer true in math!) Likewise, if $n\gt10$ and $3$ divides $n$ then you can say that $n$ is not a prime by definition of prime.
The paper by Bills and Tall calls this sort of thing an operable definition.
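The operable character of the definition of prime can be sketched in code. The function below is a direct, deliberately inefficient transcription of the definition just quoted; the names `divides` and `is_prime` are hypothetical.

```python
def divides(d, n):
    """True if the positive integer d divides n."""
    return n % d == 0

def is_prime(n):
    """n > 1 and the only positive integers that divide n are 1 and n."""
    return n > 1 and all(d in (1, n)
                         for d in range(1, n + 1) if divides(d, n))

# Both uses of the definition, checked on a finite range:
# a prime bigger than 10 is not divisible by 3 ...
primes_over_10 = [p for p in range(11, 200) if is_prime(p)]
no_prime_divisible_by_3 = all(not divides(3, p) for p in primes_over_10)

# ... and a number bigger than 10 that 3 divides is not a prime.
multiples_of_3 = [n for n in range(11, 200) if divides(3, n)]
no_multiple_is_prime = all(not is_prime(n) for n in multiples_of_3)
```

The finite checks only illustrate the two deductions. The deductions themselves need no computation, since each follows by saying “by definition of prime”.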
The paper by Selden and Selden gives a more substantial example using the definition of inverse image. If $f:S\to T$ and $T’\subseteq T$, then by definition, the inverse image $f^{-1}T’$ is the set $\{s\in S\,|\,f(s)\in T’\}$. You now have a magic spell — just say it and it makes something true:
- If you know $x\in f^{-1}T’$ then you can state that $f(x)\in T’$, and all you need to justify that statement is to say “by definition of inverse image”.
- If you know $f(x)\in T’$ then you can state that $x\in f^{-1}T’$, using the same magic spell.
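For finite sets, the definition of inverse image and the two uses of it listed above can be sketched as follows. The function `square`, the set `S` and the subset `T_prime` are assumptions chosen only to have something to compute with.

```python
def inverse_image(f, S, T_prime):
    """The set of all s in S for which f(s) is in T_prime."""
    return {s for s in S if f(s) in T_prime}

def square(x):
    return x * x

S = set(range(-3, 4))      # the domain {-3, ..., 3}
T_prime = {1, 4}           # a subset of the codomain

preimage = inverse_image(square, S, T_prime)

# First use: if x is in the inverse image, then f(x) is in T_prime.
forward_ok = all(square(x) in T_prime for x in preimage)

# Second use: if f(x) is in T_prime, then x is in the inverse image.
backward_ok = all(x in preimage for x in S if square(x) in T_prime)
```

Both checks succeed for the same reason the two bullet points are valid: the set comprehension in `inverse_image` is the definition, read in one direction or the other.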
Theorems can be operable, too. Wiles’ Theorem wipes out the possibility that there is an integer $n$ for which $n^{42}=365^{42}+666^{42}$. You just quote Wiles’ Theorem — you don’t have to calculate anything. It’s a spell that reveals impossibilities.
What the operability of definitions and theorems means is:
A definition or theorem is not just a static statement; it is a weapon for deducing truth.
Some students do not realize this. The students need to be told what is going on. They do not have to be discarded to become history majors just because they may not have the capability of becoming another Andrew Wiles.
Final remarks
I have a wish that more math ed people would write blog posts or informal articles (like the one by Edwards and Ward) about what they have learned about students learning math at the college level. Math ed people do write scholarly articles, but most of the articles are behind paywalls. We need accessible articles and blog posts aimed at students and others aimed at math teachers.
And feel free to steal other math ed people’s ideas (and credit them in a footnote). That’s what I have been doing in abstractmath.org and in this blog for the last ten years.
References
Things. University of Chicago Press, 1990. See his discussion of concepts and prototypes.
Very early difficulties II
This is the second part of a series of posts about certain difficulties math students have in the very early stages of studying abstract math. The first post, Very early difficulties in studying abstract math, gives some background to the subject and discusses one particular difficulty: Some students do not know that it is worthwhile to try starting a proof by rewriting what is to be proved using the definitions of the terms involved.
Math StackExchange
The website Math StackExchange is open to any questions about math, even very easy ones. This contrasts with MathOverflow, which is aimed at professional mathematicians asking questions in their own field.
Math SE contains many examples of the early difficulties discussed in this series of posts. I recommend that math ed people (not just RUME people, since some abstract math occurs in advanced high school courses) read through questions on Math SE for examples of misunderstandings that students have.
There are two caveats:
- Most questions on Math SE are at a high enough level that they don’t really concern these early difficulties.
- Many of the questions are so confused that it is hard to pinpoint what is causing the difficulty that the questioner has.
Connotations of English words
The term(s) defined in a definition are often given ordinary English words as names, and the beginner automatically associates the connotations of the meaning of the English word with the objects defined in the definition.
Infinite cardinals
If $A$ is a finite set, the cardinality of $A$ is simply a natural number (including $0$). If $A$ is a proper subset of another finite set $B$, then the cardinality of $A$ is strictly less than the cardinality of $B$.
In the nineteenth century, mathematicians extended the definition of cardinality to infinite sets, and for the most part cardinality has the same behavior as for finite sets. For example, the cardinal numbers are well-ordered. However, for infinite sets it is possible for a set and a proper subset of the set to have the same cardinality. For example, the cardinality of the set of natural numbers is the same as the cardinality of the set of rational numbers. This phenomenon causes major cognitive dissonance.
Question 1331680 on Math Stack Exchange shows an example of this confusion. I have also discussed the problem with cardinality in the abstractmath.org section Cardinality.
Morphism in category theory
The concept of category is defined by saying there is a bunch of objects called objects (sorry bout that) and a bunch of objects called morphisms, subject to certain axioms. One requirement is that there are functions from morphisms to objects choosing a “domain” and a “codomain” of each morphism. This is spelled out in Category Theory in Wikibooks, and in any other book on category theory.
The concepts of morphism, domain and codomain in a category are therefore defined by abstract definitions, which means that any property of morphisms and their domains and codomains that is true in every category must follow from the axioms. However, the word “morphism” and the talk about domains and codomains naturally suggests to many students that a morphism must be a function, so they immediately and incorrectly expect to evaluate it at an element of its domain, or to treat it as a function in other ways.
Example
If $\mathcal{C}$ is a category, its opposite category $\mathcal{C}^{op}$ is defined this way:
- The objects of $\mathcal{C}^{op}$ are the objects of $\mathcal{C}$.
- A morphism $f:X\to Y$ of $\mathcal{C}^{op}$ is a morphism from $Y$ to $X$ of $\mathcal{C}$ (swap the domain and codomain).
In Question 980933 on Math SE, the questioner is saying (among other things) that in $\text{Set}^{op}$, this would imply that there has to be a morphism from a nonempty set to the empty set. This of course is true, but the questioner is worried that you can’t have a function from a nonempty set to the empty set. That is also true, but what it implies is that in $\text{Set}^{op}$, the morphism from $\{1,2,3\}$ to the empty set is not a function from $\{1,2,3\}$ to the empty set. The morphism exists, but it is not a function. This does not in any sense make the definition of $\text{Set}^{op}$ incorrect.
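One way to picture a morphism that is not a function is as a bare record carrying a name, a domain and a codomain. The sketch below is only a data-level illustration, not a full implementation of a category (it has no composition or identities), and the class `Morphism` and the function `opposite` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Morphism:
    """A morphism as pure data: a name, a domain and a codomain."""
    name: str
    dom: str
    cod: str

def opposite(m):
    """The corresponding morphism of the opposite category: swap dom and cod."""
    return Morphism(m.name + "^op", dom=m.cod, cod=m.dom)

# In Set there is exactly one function from the empty set to {1,2,3}.
# Here only its name, domain and codomain are recorded.
f = Morphism("f", dom="{}", cod="{1,2,3}")

# Its opposite goes from {1,2,3} to the empty set.
f_op = opposite(f)
```

The object `f_op` has domain `{1,2,3}` and codomain `{}`, and nothing in it can be applied to an element. It is an arrow in the opposite category without being a function.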
Student confusion like this tends to make the teacher want to have a one foot by six foot billboard in his classroom saying
A MORPHISM DOESN’T HAVE TO BE A FUNCTION!
However, even that statement causes confusion. The questioner who asked Question 1594658 essentially responded to the statement in purple prose above by assuming a morphism that is “not a function” must have two distinct values at some input!
That questioner is still allowing the connotations of the word “morphism” to lead them to assume something that the definition of category does not give: that the morphism can evaluate elements of the domain to give elements of the codomain.
So we need a more elaborate poster in the classroom:
The definition of “category” makes no requirement
that an object has elements
or that morphisms evaluate elements.
As was remarked long long ago, category theory is pointless.
English words implementing logic
There are lots of questions about logic that show that students really do not think that the definition of some particular logical construction can possibly be correct. That is why in the abstractmath.org chapter on definitions I inserted this purple prose:
A definition is a totalitarian dictator.
It is often the case that you can explain why the definition is worded the way it is, and of course when you can you should. But it is also true that the student has to grovel and obey the definition no matter how weird they think it is.
Formula and term
In logic you learn that a formula is a statement with variables in it, for example “$\exists x((x+5)^3\gt2)$”. The expression “$(x+5)^3$” is not a formula because it is not a statement; it is a “term”. But in English, $H_2O$ is a formula, the formula for water. As a result, some students have a remarkably difficult time understanding the difference between “term” and “formula”. I think that is because those students don’t really believe that the definition must be taken seriously.
Exclusive or
Question 804250 in MathSE says:
“Consider $P$ and $Q$. Let $P+Q$ denote exclusive or. Then if $P$ and $Q$ are both true or are both false then $P+Q$ is false. If one of them is true and one of them is false then $P+Q$ is true. By exclusive or I mean $P$ or $Q$ but not both. I have been trying to figure out why the truth table is the way it is. For example if $P$ is true and $Q$ is true then no matter what would it be true?”
I believe that the questioner is really confused by the plus sign: $P+Q$ ought to be true if $P$ and $Q$ are both true because that’s what the plus sign ought to mean.
Yes, I know this is about a symbol instead of an English word, but I think the difficulty has the same dynamics as the English-word examples I have given.
If I have understood this difficulty correctly, it is similar to the students who want to know why $1$ is not a prime number. In that case, there is a good explanation.
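The truth table for exclusive or can be generated directly from the questioner's own wording, “$P$ or $Q$ but not both”. The sketch below also compares the result with addition mod 2, which is one reading under which the plus sign makes sense; whether that is the reason the questioner's source used a plus sign is an assumption.

```python
from itertools import product

def xor(p, q):
    """P or Q but not both."""
    return (p or q) and not (p and q)

# The full truth table, computed from the definition.
XOR_TABLE = {(p, q): xor(p, q) for p, q in product([True, False], repeat=2)}

# Reading True as 1 and False as 0, exclusive or agrees with addition mod 2.
agrees_with_addition_mod_2 = all(
    (int(p) + int(q)) % 2 == int(result)
    for (p, q), result in XOR_TABLE.items()
)
```

In particular `XOR_TABLE[(True, True)]` is `False`: the table is the way it is because the definition says “but not both”, and for no other reason.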
Only if
The phrase “only if” simply does not mean the same thing in math as it does in English. In Question 17562 in MathSE, a reader asks the question, why does “$P$ only if $Q$” mean the same as “if $P$ then $Q$” instead of “if $Q$ then $P$”?
Many answerers wasted a lot of time trying to convince us that “$P$ only if $Q$” means the same as “if $P$ then $Q$” in ordinary English, when in fact it does not. That’s because in English, clauses involving “if” usually connote causation, which does not happen in math English.
Consider these two pairs of examples.
1. “I take my umbrella only if it is raining.”
2. “If I take my umbrella, then it is raining.”
3. “I flip that switch only if a light comes on.”
4. “If I flip that switch, a light comes on.”
The average non-mathematical English speaker will easily believe that (1) and (4) are true, but will balk at (2) and (3). To me, (3) means that the light coming on makes me flip the switch. (2) is more problematical, but it does (to me) have a feeling of causation going the wrong way. It is this difference that causes students to balk at the equivalence in math of “$P$ only if $Q$” and “If $P$, then $Q$”. In math, there is no such thing as causation, and the truth tables for implication force us to live with the fact that these two sentences mean the same thing.
Henning Makholm’s answer to Question 17562 begins this way: “I don’t think there’s really anything to understand here. One simply has to learn as a fact that in mathematics jargon the words ‘only if’ invariably encode that particular meaning. It is not really forced by the everyday meanings of ‘only’ and ‘if’ in isolation; it’s just how it is.” That is the best way to answer the question. (Other answerers besides Makholm said something similar.)
I have also discussed this difficulty (and other difficulties with logic) in the abmath section on “only if”.
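The equivalence in math English can be checked mechanically. The sketch below encodes “if $P$ then $Q$” by the usual truth table for implication and encodes “$P$ only if $Q$” as “$P$ is never true while $Q$ is false”, which is the stipulated mathematical reading; the function names are hypothetical.

```python
from itertools import product

def implies(p, q):
    """'If P then Q', by the truth table for implication."""
    return (not p) or q

def only_if(p, q):
    """'P only if Q': P is never true while Q is false."""
    return not (p and not q)

CASES = list(product([True, False], repeat=2))

# 'P only if Q' and 'if P then Q' agree in every case ...
same_as_if_p_then_q = all(only_if(p, q) == implies(p, q) for p, q in CASES)

# ... but 'P only if Q' and 'if Q then P' differ in some case.
differs_from_if_q_then_p = any(only_if(p, q) != implies(q, p) for p, q in CASES)
```

The check involves no causation at all, only truth values, which is the point made above about the difference between math English and ordinary English.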
References
- Abstractmath.org (“abmath”)
- Cognitive dissonance in abmath
- Commonword names for technical concepts blog post
- Definitions in abmath
- Different names for the same thing blog post
- Handbook of Mathematical Discourse
- Languages of math in abmath
- Math majors attacked by cognitive dissonance blog post
- Math Stack Exchange.
- Names in math English in abmath
- Naming mathematical objects blog post
- Renaming technical concepts blog post
- Technical meanings clash with everyday meanings, blog post.
- Very early difficulties in studying abstract math
- Category Theory in Wikibooks.
Very early difficulties in studying abstract math
Introduction
There are some difficulties that students have at the very beginning of studying abstract math that are overwhelmingly important, not because they are difficult to explain but because too many teachers don’t even know the difficulties exist, or if they do, they think they are trivial and the students should know better without being told. These difficulties cause too many students to give up on abstract math and drop out of STEM courses altogether.
I spent my entire career in math at Case Western Reserve University. I taught many calculus sections, some courses taken by math majors, and discrete math courses taken mostly by computing science majors. I became aware that some students who may have been A students in calculus essentially fell off a cliff when they had to do the more abstract reasoning involved in discrete math, and in the initial courses in abstract algebra, linear algebra, advanced calculus and logic.
That experience led me to write the Handbook of Mathematical Discourse and to create the website abstractmath.org. Abstractmath.org in particular grew quite large. It does describe some of the major difficulties that caused good students to fall off the abstraction cliff, but also describes many, many minor difficulties. The latter are mostly about the peculiarities of the languages of math.
I have observed people’s use of language since I was like four or five years old. Not because I consciously wanted to — I just did. When I was a teenager I would have wanted to be a linguist if I had known what linguistics is.
I will describe one of the major difficulties here (failure to rewrite according to the definition) with an example. I am planning future posts concerning other difficulties that occur specifically at the very beginning of studying abstract math.
Rewrite according to the definition
To prove that a statement
involving some concepts is true,
start by rewriting the statement
using the definitions of the concepts.
Example
Definition
A function $f:S\to T$ is surjective if for any $t\in T$ there is an $s\in S$ for which $f(s)=t$.
Definition
For a function $f:S\to T$, the image of $f$ is the set \[\{t\in T\,|\,\text{there is an }s\in S\text{ for which }f(s)=t\}\]
Theorem
Let $f:S\to T$ be a function between sets. Then $f$ is surjective if and only if the image of $f$ is $T$.
Proof
If $f$ is surjective, then the statement “there is an $s\in S$ for which $f(s)=t$” is true for any $t\in T$ by definition of surjectivity. Therefore, by definition of image, the image of $f$ is $T$.
If the image of $f$ is $T$, then the definition of image means that there is an $s\in S$ for which $f(s)=t$ for any $t\in T$. So by definition of surjective, $f$ is surjective.
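For finite sets, the two definitions and the theorem can be transcribed almost word for word. In the sketch below the sets `S` and `T` and the functions `parity` and `constant_zero` are assumptions chosen for illustration, and each function is assumed to send every element of `S` into `T`.

```python
def is_surjective(f, S, T):
    """For any t in T there is an s in S for which f(s) == t."""
    return all(any(f(s) == t for s in S) for t in T)

def image(f, S, T):
    """The set of t in T for which there is an s in S with f(s) == t."""
    return {t for t in T if any(f(s) == t for s in S)}

S = {0, 1, 2, 3}
T = {0, 1}

def parity(s):
    return s % 2           # hits both 0 and 1

def constant_zero(s):
    return 0               # never hits 1

# The theorem: f is surjective if and only if the image of f is T.
theorem_holds = all(
    is_surjective(f, S, T) == (image(f, S, T) == T)
    for f in (parity, constant_zero)
)
```

The condition inside `is_surjective` and the condition inside the set comprehension in `image` are the same expression, `any(f(s) == t for s in S)`. Seeing that is the whole content of the proof, and it only becomes visible once both terms are rewritten according to their definitions.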
“This proof is trivial”
The response of many mathematicians I know is that this proof is trivial and a student who can’t come up with it doesn’t belong in a university math course. I agree that the proof is trivial. I even agree that such a student is not a likely candidate for getting a Ph.D. in math. But:
- Most math students in an American university are not going to get a Ph.D. in math. They may be going on in some STEM field or to teach high school math.
- Some students who are not math majors take courses in which simple proofs are required (particularly discrete math and linear algebra). Some of these students may simply be interested in math for its own sake!
A sizeable minority of students who are taking a math course requiring proofs need to be told the most elementary facts about how to do proofs. To refuse to explain these facts is a disservice to the mathematics community and adds to the fear and dislike of math that too many people already have.
These remarks may not apply to students in many countries other than the USA. See When these problems occur.
“This proof does not describe how mathematicians think”
The proof I wrote out above does not describe how I would come up with a proof of the statement, which would go something like this: I do math largely in pictures. I envision the image of $f$ as a kind of highlighted area of the codomain of $f$. If $f$ is surjective, the highlighting covers the whole codomain. That’s what the theorem says. I wouldn’t dream of writing out the proof I gave above just to verify that it is true.
More examples
Abstractmath.org and Gyre&Gimble contain several spelled-out theorems that start by rewriting according to the definition. In these examples one then goes on to use algebraic manipulation or to quote known theorems to put the proof together.
- Rewrite according to the definitions in abmath
- Direct method in abmath
- Detailed proof in abmath
- A proof by diagram chasing, blog post
Comments
This post contains testable claims
Herein, I claim that some things are true of students just beginning abstract math. The claims are based largely on my teaching experience and some statements in the math ed literature. These claims are testable.
When these problems occur
In the United States, the problems I describe here occur in the student’s first or second year, in university courses aimed at math majors and other STEM majors. Students typically start university at age 18, and when they start university they may not choose their major until the second year.
In much of the rest of the world, students are more likely to have one more year in a secondary school (sixth form in England lasts two years) or go to a “college” for a year or two before entering a university, and then they get their bachelor’s degree in three years instead of four as in the USA. Not only that, when they do go to university they enter a particular program immediately — math, computing science, etc.
These differences may mean that the abstract math cliff occurs early in a student’s university career in the USA and before the student enters university elsewhere.
In my experience at CWRU, some math majors fell off the cliff, but the percentage of computing science students having trouble was considerably greater. On the other hand, more of them survived the discrete math course when I taught it because the discrete math course contained less abstraction and more computation than the math major courses (except linear algebra, which had a balance similar to the discrete math course and was taken by a sizeable number of non-math majors).
References
- Abstractmath.org (“abmath”)
- Definitions in abmath
- Detailed proof in abmath
- Direct method in abmath
- Handbook of Mathematical Discourse
- Languages of math in abmath
- A proof by diagram chasing, blog post
- Rewrite according to the definitions in abmath
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.
The intent of mathematical assertions
An assertion in mathematical writing can be a claim, a definition or a constraint. It may be difficult to determine the intent of the author. That is discussed briefly here.
Assertions in math texts can play many different roles.
English sentences can state facts, ask questions, give commands, and do other things. The intent of an English sentence is often obvious, but sometimes it can be unexpectedly different from what is apparent in the sentence. For example, the statement “Could you turn the TV down?” is apparently a question expecting a yes or no answer, but in fact it may be a request. (See the Wikipedia article on speech acts.) Such things are normally understood by people who know each other, but people for whom English is a foreign language or who have a different culture have difficulties with them.
There are some problems of this sort in math English and the symbolic language, too. An assertion can have the intent of being a claim, a definition, or a constraint.
Most of the time the intent of an assertion in math is obvious. But there are conventions and special formats that newcomers to abstract math may not recognize, so they misunderstand the point of the assertion. This section takes a brief look at some of the problems.
Terminology
The way I am using the words “assertion”, “claim”, and “constraint” is not standard usage in math, logic or linguistics.
Claims
In most circumstances, you would expect that if a lecturer or author makes a math assertion, they are claiming that it is a true statement, and you would be right.
Examples
- “The $240$th digit of $\pi$ after the decimal point is $4$.”
- “If a function is differentiable, it must be continuous.”
- “$7\gt3$”
Remarks
- You don’t have to know whether these statements are true or not to recognize them as claims. An incorrect claim is still a claim.
- The assertion in (a) is a statement, in this case a false one. If it claimed the googolth digit was $4$ you would never be able to tell whether it is true or not, but it would still be an assertion intended as a claim.
- The assertion in (b) uses the standard math convention that an indefinite noun phrase (such as “a widget”) in the subject of a sentence is universally quantified (see also the article about “a” in the Glossary). In other words, “An integer divisible by $4$ must be even” claims that any integer divisible by $4$ is even. This statement is a claim, and it is true.
- (c) is a (true) claim in the symbolic language. (Note that “$3 + 4$” is not an assertion at all, much less a claim.)
Definitions
Definitions are discussed primarily in the chapter on definitions. A definition is not the same thing as a claim.
Example
The definition
“An integer is even if it is divisible by $2$”
makes the claim
“An integer is even if and only if it is divisible by $2$”
true.
(If you are surprised that the definition uses “if” but the claim uses “if and only if”, see the Glossary article on “if”.)
Unmarked definitions
Math texts sometimes define something without saying that it is a definition. Because of that, students may sometimes think a claim is a definition.
Example
Suppose that the concept of “even integer” was new to you and the book said, “A number is even if it is divisible by $4$.” Perhaps you thought that this was a definition. Later the book refers to $6$ as even and you pull your hair out wondering why. The statement is a correct claim but an incorrect definition. A good writer would write something like “Recall that a number is even if it is divisible by $2$, so that in particular it is even if it is divisible by $4$.”
On the other hand, you may think a definition is only a claim.
Example
A lecturer may say “By definition, an integer is even if it is divisible by $2$”, and you write down: “An integer is even if it is divisible by $2$”. Later, you get all panicky wondering How did she know that?? (This has happened to me.)
The confusion in the preceding example can also occur if a book says, “An integer is even if it is divisible by $2$” and you don’t know about the convention that when an author puts a word or phrase in boldface or italics it may mean that they are defining it.
A good writer always labels definitions.
Constraints
Here are two assertions that contain variables.
- “$n$ is even.”
- “$x\gt1$”.
Such an assertion is a constraint (or a condition) if the intent is that the assertion will hold in that part of the text (the scope of the constraint). The part of the text in which it holds is usually the immediate vicinity unless the author explicitly says it will hold in a larger part of the text such as “this chapter” or “in the rest of the book”.
Examples
- Sometimes the wording makes it clear that the phrase is a constraint. So a statement such as “Suppose $3x^2-2x-5\geq0$” is a constraint on the possible values of $x$.
- The statement “Suppose $n$ is even” is an explicit requirement that $n$ be even and an implicit requirement that $n$ be an integer.
- A condition for which you are told to find the solution(s) is a constraint. For example: “Solve the equation $3x^2-2x-5=0$”. This equation is a constraint on the variable $x$. “Solving” the equation means saying explicitly which numbers make the equation true.
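To make “saying explicitly which numbers make the equation true” concrete, here is that equation worked out by factoring. The factorization is my own illustration and is not part of the original example.
\[3x^2-2x-5=(3x-5)(x+1)=0\quad\text{exactly when}\quad x=\tfrac{5}{3}\text{ or }x=-1.\]
The same factorization shows that the constraint $3x^2-2x-5\geq0$ in the first example holds exactly when $x\leq-1$ or $x\geq\frac{5}{3}$.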
Postconditions
The constraint may appear in parentheses after the assertion as a postcondition on an assertion.
Example
“$x^2\gt x\qquad(\text{all }x\gt1)$”
which means that if the constraint “$x\gt1$” holds, then “$x^2\gt x$” is true. In other words, for all $x\gt1$, the statement $x^2\gt x$ is true. In this statement, “$x^2\gt x$” is not a constraint, but a claim which is true when the constraint is true.
The Mathematics Depository: A Proposal
Introduction
This post is about taking texts written in mathematical English and the symbolic language and encoding them in a formal language that could be tested by an automated proof verifier. This is a very difficult undertaking, but we could get closer and closer to a working system by a worldwide effort continuing over, probably, decades. The system would have to contain many components working together to create incremental improvements in the process.
This post, which is a first draft, outlines some suggestions as to how this could work. I do not discuss the encoding required, which is not my area of expertise. Yes, I understand that coding is the hard part!
Much work has been done by computing scientists in developing proof checking and proof-finding programs. Work has also been done, primarily by math education workers but also by some philosophers and computing scientists, in uncovering the many areas where ordinary math language is ambiguous and deviates from ordinary English usage. These characteristics confuse students and also make it hard to design a program that can interpret the language. I have been working in that area mostly from the math ed point of view for the last twenty years.
The Reference section lists many references to the problem of parsing mathematical English, some from the point of view of automatic translation of math language into code, but most from the point of view of helping students learn to understand it.
The Mathematics Depository
I imagine a system for converting documents written in math language into machine-readable language and testing their claims. An organization, call it the Mathematics Depository, would be developed that is supported by many countries, organizations and individual supporters. It should consist of several components listed below, no doubt with other components as we become aware of needing them. The organization would be tasked with supporting and improving these components over time.
The main parts of the system
Each component is linked to a more detailed description that is given later in this post.
- A Proof Verifier (PV), that inputs a proof and determines if it is correct.
- A specification of a supported subset of Mathematical English and the symbolic language, that I will call Strict Math English (SME).
- A Text-SME Converter, a program that would input a text written in ordinary math English that has been annotated by a knowledgeable person and convert it into SME.
- An SME-PV Converter that will convert text written in SME into code that can be directly read by the Proof Verifier.
- One or more Automatic Theorem Provers, that to begin with can take fairly simple conjectures written in SME and sometimes succeed in proving them.
- An Annotation System containing an Annotation Editor that would allow a person to use SME to annotate an article written in ordinary math English so that it could be read by the Text-SME Converter.
- A Data Base that would include the texts that have been collected in this endeavor, along with the annotations and the results of the proof checking.
- A Data Base Miner that would watch for patterns in the annotations as new papers were submitted. The operators might also program it to watch for patterns in other aspects of the operation.
These facilities would be organized so that the systems work together, with the result that the individual components I named improve over time, both automatically and via human intervention.
Flow of Work
- A math text is submitted.
- If it is already in Strict Math English (SME), it is input to the Proof Verifier (PV).
- Otherwise, the math text is input into the Annotation System.
- The resulting annotated text is input into the Text-SME Converter.
- The output of the Text-SME Converter, which is now in SME, is input into the Proof Verifier by way of the SME-PV Converter.
- The PV incorporates each definition in the text into the context of the math text. This is a specific meaning of the word “context”, including a list of the status of variables (bound, unbound, type, and so on), meanings of technical words, and other facts created in the text. “Context” is described informally in my article Context in abstractmath.org. That article gives references to the formal literature.
- Each mathematical assertion in the text is marked as a claim.
- The checking process records those claims occurring in the proof that are not proved in the text, along with any references given to other texts.
- If a reference to a result in another text is made, the PV looks for the result in the Database. If it does not find it, the PV incorporates the result and its location in the Database as an externally proven but untested claim.
- If no reference or proof for a claim is given, the PV checks the Database to see if it has already been proved.
- Any claim in the current text not shown as proven in the Database is submitted to the Automatic Theorem Prover (ATP). The output of the ATP is put in the database (proved, counterexample found, or unable to determine truth).
- If a segment of text is presented as a proof, it is input into the PV to be verified.
- The PV reports the result for each claimed proof, which can consist of several possibilities:
- A counterexample is found, so the claim that the proof was supposed to establish is false.
- The proof contains gaps, so the claim is unsettled.
- The proof is reported as correct.
- At the end of the process, all the information gathered is put into the Database:
- The original text showing all the annotations.
- The text in SME.
- All claims, with their status (proven true, proven false, truth unknown, reference if one was given).
- Every proof, with its status and the entire context at each step of the proof.
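The bookkeeping for claims in the list above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the names `Status`, `Claim`, `Database`, `toy_prover` and `settle` are hypothetical, the "prover" settles nothing, and the order in which the Database, the theorem prover and the citation are consulted is one possible reading of the steps, not a specification.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional, Set


class Status(Enum):
    PROVEN = "proven true"
    REFUTED = "proven false"
    UNKNOWN = "truth unknown"
    EXTERNAL = "externally proven but untested"


@dataclass
class Claim:
    statement: str
    reference: Optional[str] = None  # citation to another text, if one was given
    status: Status = Status.UNKNOWN


@dataclass
class Database:
    proven: Set[str] = field(default_factory=set)       # statements already verified
    records: List[Claim] = field(default_factory=list)  # every claim seen so far


def toy_prover(statement: str) -> Status:
    """Hypothetical stand-in for an Automatic Theorem Prover: it settles nothing."""
    return Status.UNKNOWN


def settle(claim: Claim, db: Database,
           prover: Callable[[str], Status] = toy_prover) -> Claim:
    """Give one claim a status and record it in the database."""
    if claim.statement in db.proven:
        claim.status = Status.PROVEN            # already verified elsewhere
    else:
        claim.status = prover(claim.statement)  # try the theorem prover
        if claim.status is Status.UNKNOWN and claim.reference is not None:
            claim.status = Status.EXTERNAL      # cited, but not checked here
        elif claim.status is Status.PROVEN:
            db.proven.add(claim.statement)      # remember the new result
    db.records.append(claim)
    return claim
```

A real system would of course store far more than a string per claim (the context at each step, the annotations, the proof itself), but even this toy shows that every claim ends up in the Database with some status, whether or not it was verified.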
In my experience mathematicians spend only a little time reading arguments step by step as described in the Context article. They usually look at a theorem and try to figure it out themselves, “cheating” occasionally by glancing at parts of the proof.
Details
The proof verifier
- Proof checking programs have been developed over the last thirty or so years. The MD should write or adapt one or more Proof Verifiers and improve it incrementally as a result of experience in running the system. In this post I have assumed the use of just one Proof Verifier.
- The Proof Verifier should be designed to read the output of the SME-PV converter.
- The PV must read a whole math text in SME, identify and record each claim and check each proof (among other things). This is different from current proof verifiers, which take exactly one proof as input.
- The PV must create the context of each proof and change it step by step as it reads each syntactic fragment of the math text.
- Typically the context for a claimed proof is built up in the whole math text, not just in the part called “Proof”.
- The PV should automatically query the Data Base for unproved steps in a proof in the input text to see if they have already been verified somewhere else. These results should be quoted in a proof verifier output.
- The PV should also automatically submit steps in the proof that haven’t been verified to the Automatic Theorem Provers and wait for the step to be verified or not.
- The Proof Verifier should output details of the result of the checking whether it succeeded in verifying the whole input text or not. In particular, it should list steps in proofs it failed to verify, including steps in proofs for which the input text cited the proof in some other paper, in the MD system or not.
- The Proof Verifier should be available online for anyone to submit, in SME, a mathematical text claiming to prove a theorem. Submission might require a small charge.
Strict Math English
- One of the most important aspects of the system would be the simultaneous incremental updating of the SME and the SME-PV Converter.
- The idea is that SME would get more and more inclusive of the phrases and clauses it allows.
Example: Universal Assertions
At the start SME might allow these statements to be recognized as the same universal assertion:
- “$\forall x(x^2+1\gt0)$”
- “For all [every, any] $x$, $x^2+1\gt0$.” (universality asserted using an English word.)
- “For all [every, any] $x$, $x^2+1$ is positive.”
As time goes on, a person or the Data Base Miner might detect that many annotators also recognized these statements as saying the same thing:
- “$x^2+1\gt0\qquad(\text{all } x)$” (as a displayed statement)
- “$x^2+1$ is positive for every $x$.” Universality asserted using an adjective in a postposited phrase.
- “$x^2+1$ is always positive.” Universality hidden in a postposited adverb that seems to be referring to time!
- There are more examples in my article Universally True Assertions. See also Susanna Epp’s article on quantification for other problems in this area.
These other variations would then be added to Strict Math English. (This is only an example of how the system would evolve. I have no doubt that in fact all the terminology mentioned above would be included at the outset, since it is all documented in the math ed literature.)
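As a rough illustration of what "recognized as the same universal assertion" could mean, here is a toy Python normalizer. It is a sketch under heavy assumptions: it works on plain-text stand-ins for the TeX in the examples, the name `normalize` and the three patterns are hypothetical, and a real SME parser would need far more than regular expressions.

```python
import re

# Each pattern captures the bound variable and the body of the assertion.
PATTERNS = [
    # "For all [every, any] x, BODY."
    re.compile(r"^For (?:all|every|any) (?P<var>\w+), (?P<body>.+?)\.?$"),
    # "BODY for all [every, any] x."
    re.compile(r"^(?P<body>.+?) for (?:all|every|any) (?P<var>\w+)\.?$"),
    # "BODY (all x)" as in a displayed statement
    re.compile(r"^(?P<body>.+?) \((?:all|every|any) (?P<var>\w+)\)$"),
]


def normalize(sentence):
    """Return a canonical 'forall x (BODY)' string, or None if unrecognized."""
    for pattern in PATTERNS:
        match = pattern.match(sentence.strip())
        if match:
            # Treat "is positive" as a synonym for "> 0".
            body = match.group("body").replace(" is positive", " > 0")
            return f"forall {match.group('var')} ({body})"
    return None
```

With these patterns, "For every x, x^2+1 is positive." and "x^2+1 > 0 (all x)" both normalize to the same string, while "x^2+1 is always positive." returns `None`. That last case is exactly the kind of variant that would have to be added later, once annotators or the Data Base Miner had shown it to be common.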
Even at the start, SME will include phrases and clauses in the English language as well as symbolic expressions. It is notorious that automatically parsing general English sentences is difficult and that the ubiquity of metaphors makes it essentially impossible to reliably construct the meaning of a sentence. That is why SME must start with a very narrow subset of math English. But even in early days, it should include some stereotyped metaphors, such as using “always” in universal assertions.
The SME-PV Converter
- The SME-PV Converter would read documents written in SME and convert them into code readable by the proof checking program, as well as by the automatic theorem provers.
- Such a program is essentially the subject of Ganesalingam’s book.
- Converting SME so that the Proof Verifier can handle it involves lots of subtleties. For example, if the text says, “For any $x$, $x^2+1\gt0$”, the translation has to recognize not only that this is a universally quantified statement with $x$ as the bound variable, but that $x$ must be a real number, since complex numbers don’t do greater-than.
- Frequent revisions of the SME-PV Converter will be necessary since its input language, the SME, will be constantly expanded.
- It may be that the output language of the SME-PV Converter (which the Proof Verifier and Automatic Theorem Provers read) will require only infrequent revisions.
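To spell out the subtlety in the “For any $x$, $x^2+1\gt0$” example above: the translation has to supply a type that the text never mentions. In ordinary notation (the actual output language is not specified in this post, so this is only an indication) the converter would have to produce something like
\[\forall x\,(x\in\mathbb{R}\Rightarrow x^2+1\gt0),\]
usually abbreviated to
\[\forall x\in\mathbb{R}\,(x^2+1\gt0).\]
The phrase “$x\in\mathbb{R}$” appears nowhere in the input sentence. It has to be inferred from the presence of “$\gt$”.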
The Automatic Theorem Provers
- The system could support several ATP’s, each one adapted to read the output of the SME-PV Converter.
- The Automatic Theorem Provers should provide output in such a way that the Proof Verifier can include in its report the positive or negative results of the Theorem Prover in detail.
The Annotation System
- The Annotation system would facilitate construction of a data structure that connects each annotation to the specific piece of text it rewrites. The linking should be facilitated by the Annotation Editor.
- For example, an annotation that is meant to explain that the statement (in the input text) “$x^2+1$ is always greater than $0$” is to be translated as “$\forall x(x^2+1\gt0)$” (which is presumably allowed by SME) should cause the first statement to be linked to the second statement. The first statement, the one in the input text, should not be changed. This will enable the Data Base Miner to find patterns of similar text being annotated in similar ways.
- The annotations should clarify words, symbolic expressions and sentences in the input text to allow the Proof Verifier to input them correctly.
- In particular, every claim that a statement is true should be marked as a proposed theorem, and similarly every proof should be marked as a proof and every definition should be marked as a definition. Such labeling is often omitted in the math literature. Annotators would have to recognize segments of the text as claims, proofs and definitions and annotate them as such.
- The annotations would be written in the current version of Strict Math English. Since SME is frequently updated, the instructions for the annotator would also have to be frequently updated.
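One way to picture the data structure described above is an annotation that points at a span of the input text by character offsets, so that the input text itself is never modified. The following Python sketch is only that, a picture: the class `Annotation`, its fields and the helper `annotate` are hypothetical names, and the SME rendering is written in plain text rather than TeX.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int  # offset of the first annotated character in the input text
    end: int    # offset just past the last annotated character
    kind: str   # for example "claim", "definition" or "proof"
    sme: str    # the annotator's rendering in Strict Math English

    def source_span(self, text: str) -> str:
        """Return the exact piece of the input text this annotation is linked to."""
        return text[self.start:self.end]


def annotate(text: str, phrase: str, kind: str, sme: str) -> Annotation:
    """Link an annotation to the first occurrence of phrase in text.

    Raises ValueError if the phrase does not occur in the text.
    """
    start = text.index(phrase)
    return Annotation(start, start + len(phrase), kind, sme)
```

Because the annotation stores offsets rather than a rewritten copy, the original wording stays available for pattern mining. For instance, `annotate(text, "x^2+1 is always greater than 0", "claim", "forall x (x^2+1 > 0)")` records both what the author wrote and how the annotator read it.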
Examples
- If a paper used the word “domain” without defining it, the annotator would clarify whether it meant an open connected set, a type of ring, a type of poset, or the domain of a function. See Example 1.
- Annotators will note instances in which the same text will use a symbol with two different meanings. See Example 2.
- In a phrase, a single occurrence of a symbol can require an annotation that assigns more than one attribute to the symbol. See Example 3.
The Annotation Editor
- The annotators should be provided with an Annotation Editor designed specifically for annotation.
- The editor should link each annotation to the exact phrase it annotates, in a way that is easy for a person reading the annotated document to follow and that also gives the Text-SME Converter the information it needs.
The Annotators
- Great demands will be made of an annotator.
- They must understand the detailed meaning of the text they annotate. This means they must be quite familiar with the field of math the text is concerned with.
- They must learn SME. I know for a fact that many mathematicians are not good at learning foreign languages. It will help that SME will be a subset of the full language of math.
- All this means that annotators must be chosen carefully and paid well. This means that not very many papers will get annotated by paid annotators, so that there will have to be some committee that chooses the papers to be annotated. This will be a genuine bottleneck.
- One thing that will help in the long run is that the SME should evolve to include more features of the general language of math, so many mathematicians will actually write their papers in SME and submit them directly to the Depository. (“Long run” may mean more than ten years).
The Text-to-SME Converter
- This converter takes a math text in ordinary Math English that has been annotated and converts it into SME.
- The format for feeding it to the Automatic Theorem Prover may very well have to be different from the format to be read by a human. Both formats should be saved.
The Data Base
- The Data Base would contain all math papers that have been run through the Proof Verifier, along with the results found by the Proof Verifier. A paper should be included whether or not every claim in the paper was verified.
- Funding agencies (and private individuals) might choose particularly important papers and pay more money for annotation for those than for other papers.
- Mathematicians in a particular field could be hired to annotate particular articles in their field, using a standard annotation language that would develop through time.
- The annotated papers would be made freely available to the public.
- It will no doubt prove useful for the Data Base to contain many other items. Possibilities:
- A searchable list of all theorems that have been verified.
- A glossary: a list of math words that have been defined in the papers in the Depository. This will include synonyms and words with multiple meanings.
The Data Base Miner
Watch for patterns
The DBM would watch for patterns in annotation as new annotated papers were submitted. It should probably look only at annotated papers whose proofs had been verified. The patterns might include:
- Correlation between annotations that associate particular meanings to particular words or symbols with the branch of math the paper belongs to. See Example 1.
- Noting that a particular format of combining symbols usually results in the same kind of annotation. See Example 4.
- Providing data in such a way that lexicographers studying math English could make use of them. My Handbook began with my doing lexicographical research on math English, but I found it so slow that when I started abstractmath.org I resolved not to do such research any more. Nevertheless, it needs to be done and the Database should make the process much easier.
Statistical translation
Since the annotated papers will be stored in the Data Base, the Data Base Miner could use the annotations in somewhat the same way some language translators work (in part): to translate a phrase, it will find occurrences of the phrase in the source language that have been translated into the target language and use the most common translation. In this case the source language is the paper (in English) and the target language is in annotated math English readable by the Proof Verifier. Once the Database includes most of the papers ever published (twenty years from now?), statistical translation might actually become useful.
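The "use the most common translation" idea can be sketched directly. In the following Python toy the function names `build_table` and `translate` are hypothetical, the phrases are plain-text stand-ins, and real statistical translation systems do a great deal more than count whole phrases.

```python
from collections import Counter, defaultdict


def build_table(pairs):
    """Count how often each source phrase was annotated with each target phrase."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def translate(table, phrase):
    """Return the most common annotation of phrase, or None if it was never seen."""
    if phrase not in table:
        return None
    return table[phrase].most_common(1)[0][0]
```

If the stored annotations rendered "f^2(x)" as "f(f(x))" twice and as "f(x)^2" once, `translate` would propose "f(f(x))". A phrase that no annotator has ever seen gets no proposal at all, which is why the approach only becomes useful once the Database is large.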
Examples
Example 1: Meaning varies with branch of math
- “Field” means one thing in an algebra paper and another in a mathematical physics paper.
- “Domain” means
- An open connected set in topology.
- A type of ring in algebra.
- A type of poset in theoretical computing science.
- The domain of a function, everywhere in math, which makes it seem that this is going to be very hard to distinguish without human help!
Example 2: Meaning varies even in the same article
- The notation “$(a,b)$” can mean an ordered pair, an open interval, or the GCD. What’s worse, there are many instances where the symbol is used without definition. Citation 139 in the Handbook provides a single sentence in which the first two meanings both occur:
$\dots$ Richard Darst and Gerald Taylor investigated the differentiability of functions $f^p$ (which for our purposes we will restrict to $(0,1)$) defined for each $p\geq1$ by
\[F(x):=
\begin{cases}
0 & \text{if }x\text{ is irrational}\\
\displaystyle{\frac{1}{n^p}} & \text{if }x = \displaystyle{\frac{m}{n}}\text{ with }(m,n)=1
\end{cases}\]
The sad thing is that any mathematician will know immediately what each occurrence means. This may be a case where the correct annotation will never be automatically detectable.
Example 3: One mention of a symbol may require several meanings
In the sentence, “This infinite series converges to $\zeta(2)=\frac{\pi^2}{6}\approx 1.65$,” the annotator would provide two pieces of information about “$\frac{\pi^2}{6}$”, namely that it is both the right constituent of the equation “$\zeta(2)=\frac{\pi^2}{6}$” and the left constituent of the approximation statement “$\frac{\pi^2}{6}\approx 1.65$” — and that these two statements were the constituents of an asserted conjunction. (See my post Pivoted symbols.)
Example 4: Function to a power
Some expressions not in the SME will almost always be annotated in the same way. This makes them discoverable by the Data Base Miner.
- “$\sin^{-1}x$” always means $\arcsin x$.
- For positive $n$, “$\sin^n x$” always means $(\sin x)^n$. It never means the $n$-fold application of $\sin$ to $x$.
- In contrast, for an arbitrary function symbol, $f^n(x)$ will often be annotated as $n$-fold application of $f$ and also often as $f(x)^n$. (And maybe those last two possibilities are correlated by branch of math.)
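The two readings in the list above really do give different numbers, which is why an annotator has to say which one is meant. A minimal Python check follows; the helper names `power_of_value` and `iterate` are mine.

```python
import math


def power_of_value(f, n, x):
    """The reading f(x)^n: apply f once, then raise the result to the power n."""
    return f(x) ** n


def iterate(f, n, x):
    """The reading 'n-fold application': apply f to x, n times in a row."""
    for _ in range(n):
        x = f(x)
    return x
```

For example, `power_of_value(math.sin, 2, 1.0)` computes $(\sin 1)^2$, which is what “$\sin^2 1$” means, while `iterate(math.sin, 2, 1.0)` computes $\sin(\sin 1)$, and the two values are not equal.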
References
I believe that work in formal verification has tended to overlook the work on math language difficulties in math ed, so I have included some articles from that specialty.
- Formally Verified Mathematics, by Jeremy Avigad and John Harrison. Communications of the ACM, 2014. Vol. 57 No. 4, Pages 66-75. 10.1145/2591012. This article is behind a pay barrier, but if you have library privileges in a university you may be able to download it for free.
- Varieties of mathematical prose (1998), by Atish Bagchi and Charles Wells. The annotations I referred to in this post could include annotations of types of prose, including the ones listed in this article and also Ganesalingam’s distinction between formal and informal math prose.
- On the communication of mathematical reasoning (1998), by Atish Bagchi and Charles Wells. Contains examples of hard-to-parse usage and references to the math ed literature.
- Proofs in mathematics, by Alexander Bogomolny. Many examples of proofs that could be used as tests for the Proof Verifier.
- Contrapositive grammar, 2014, a post in the blog by David Butler.
- APOS: A constructivist theory of learning in undergraduate mathematics education research, Ed Dubinsky and Michael A. McDonald. This is an outline with references to many papers giving more detail.
- Proof issues with existential quantification, by Susanna S. Epp.
- The Language of Quantification in Mathematics Instruction, 1999, by Susanna S. Epp.
- Variables in mathematics education, 2011, by Susanna S. Epp.
- Math lingo vs. plain English, 1997, excerpt from article by Reuben Hersh, American Mathematical Monthly, volume 104, Number 1, January 1997, pp 48-51.
- Automated proof verification, 2012. Discussion in Math Stack Exchange. Has references not included here.
- Formalizing math, 2015. MathOverflow discussion.
- The Language of mathematics, 2013. By Mohan Ganesalingam. Description of mathematical language with an outline of a methodology for automatic translation into a proof-verification language. This methodology has been partially implemented. Available only as a book.
- MathResS, a Mathematical Research System, by Arnold Neumaier. This document describes a project that overlaps my proposal, and it lists many projects and programs that might be useful for the project.
- When the Rules of Discourse Change, but Nobody Tells You: Making Sense of Mathematics Learning From a Commognitive Standpoint, 2007. By Anna Sfard.
- Symbolizing mathematical reality into being: How mathematical discourse and mathematical objects create each other, (2000). By Anna Sfard. In P. Cobb, K. E. Yackel, & K. McClain (Eds), Symbolizing and communicating: perspectives on Mathematical Discourse, Tools, and Instructional Design (pp. 37-98). Available only as a book.
- How to read and do proofs: An introduction to mathematical thought processes, 2013, by Danny Solow. This book contains many simple proofs written out that could be used as tests for Strict Math English. It also discusses outlines of formats of proofs (contrapositive, contradiction, and so on). It might be useful to have the Proof Verifier recognize the various formats of proofs explicitly.
- Communicating mathematics: useful ideas from computer science (1995), by Charles Wells.
- The languages of math, written during 2005-2010 with many revisions since, by Charles Wells. Long article in abstractmath.org intended for students new to formal math.
- Handbook of Mathematical Discourse, 2003, by Charles Wells. Book about the languages of math together with citations of mathematical usage from the literature.
The following are posts from my blog Gyre&Gimble. They are in reverse chronological order.
- Pivoted symbols.
- Problems caused for students by the two languages of math.
- Variations in meaning in math.
- Representations of mathematical objects.
- Semantics of algebra I
- Algebra is a difficult foreign language.
- Bugs in English and in Math.
- Abuse of notation
- Mathematical usage
- Technical words in English.
- The languages of mathematics.
Problems caused for students by the two languages of math
The two languages of math
Mathematics is communicated using two languages: Mathematical English and the symbolic language of math (more about them in two languages).
This post is a collection of examples of the sorts of trouble that the two languages cause beginning abstract math students. I have gathered many of them here since they are scattered throughout the literature. I would welcome suggestions for other references to problems caused by the languages of math.
In many of the examples, I give links to the literature and leave you to fish out the details there. Almost all of the links are to documents on the internet.
There is an extensive list of references.
Conjectures
Scattered through this post are conjectures. Like most of my writing about difficulties students have with math language, these conjectures are based on personal observation over 37 years of teaching mostly computer engineering and math majors. The only hard research of any sort I have done in math ed consists of the 426 citations of mathematical writing included in the Handbook of Mathematical Discourse.
Disclaimer
This post is an attempt to gather together the ways in which math language causes trouble for students. It is even more preliminary and rough than most of my other posts.
- The arrangement of the topics is unsatisfactory. Indeed, the topics are so interrelated that it is probably impossible to give a satisfactory linear order to them. That is where writing on line helps: Lots of forward and backward references.
- Other people and I have written extensively about some of the topics, and they have lots of links. Other topics are stubs and need to be filled out. I have probably missed important points about and references to many of them.
- Please note that many of the most important difficulties that students have with understanding mathematical ideas are not caused by the languages of math and are not represented here.
I expect to revise this article periodically as I find more references and examples and understand some of the topics better. Suggestions would be very welcome.
Intricate symbolic expressions
I have occasionally had students tell me that they have great difficulty understanding a complicated symbolic expression. They can’t just look at it and learn something about what it means.
Example
Consider the symbolic expression \[\displaystyle\left(\frac{x^3-10}{3 e^{-x}+1}\right)^6\]
Now, I could read this expression aloud as if it were text, or more precisely describe it so that someone else could write it down. But if I am in math mode and see this expression I don’t “read” it, even to myself.
I am one of those people who much of the time think in pictures or abstractions without words. (See references here.)
In this case I would look at the expression as a structured picture. I could determine a number of things about it, and when I was explaining it I would point at the board, not try to pronounce it or part of it:
- The denominator is always positive so the expression is defined for all reals.
- The exponent is even so the value of the expression is always nonnegative. I would say, “This (pointing at the exponent) is an even power so the expression is never negative.”
- It is zero in exactly one place, namely $x=\sqrt[3]{10}$.
- Its derivative is also $0$ at $\sqrt[3]{10}$. You can see this without calculating the formula for the derivative (ugh).
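The last claim in the list can be checked with the chain rule alone, without expanding anything. Here $g(x)$ is a name introduced only for this illustration; it stands for the fraction inside the parentheses:
\[\frac{d}{dx}\,g(x)^6=6\,g(x)^5\,g'(x),\qquad g(x)=\frac{x^3-10}{3e^{-x}+1}.\]
The denominator of $g$ is never zero, so $g$ is differentiable everywhere, and $g(\sqrt[3]{10})=0$. The factor $g(x)^5$ therefore vanishes at $\sqrt[3]{10}$, and so does the derivative, whatever $g'$ happens to be there.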
There is much more about this example in Zooming and Chunking.
Algebra in high school
Many high school students are stymied by algebra, never do well at it, and hate math as a result. I have known many such people over the years. A revealing remark that I have heard many times is that “algebra is totally meaningless to me”. This is sometimes accompanied by a remark that geometry is “obvious” or something similar. This may be because they think they have to “read” an algebraic expression instead of studying it as they would a graph or a diagram.
Conjecture
Many beginning abstract math students have difficulty understanding a symbolic expression like the one above. Could this be caused by resistance to treating the expression as a structure to be studied?
Context-sensitive pronunciation
A symbolic assertion (“formula” to logicians) can be embedded in a math English sentence in different ways, requiring the symbolic assertion to be pronounced in different ways. The assertion itself is not modified in any way in these different situations.
I used the phrase “symbolic assertion” in abstractmath.org because students are confused by the logicians’ use of “formula”.
In everyday English, “$\text{H}_2\text{O}$” is the “formula” for water, but it is a term, not an assertion.
Example
“For every real number $x\gt0$ there is a real number $y$ such that $x\gt y\gt0$.”
- In the sentence above, the assertion “$x\gt0$” must be pronounced “$x$ that is greater than $0$” or something similar.
- The standalone assertion “$x\gt0$” is pronounced “$x$ is greater than $0$.”
- The sentence “Let $x\gt0$” must be pronounced “Let $x$ be greater than $0$”.
The consequence is that the symbolic assertion, in this case “$x\gt0$”, does not reveal the role it plays in the math English sentence that it is embedded in.
Many of the examples occurring later in the post are also examples of context-sensitive pronunciation.
Conjectures
Many students are subconsciously bothered by the way the same symbolic expression is pronounced differently in different math English sentences.
This probably impedes some students’ progress. Teachers should point this phenomenon out with examples.
Students should be discouraged from pronouncing mathematical expressions.
For one thing, this could get you into trouble. Consider pronouncing “$\sqrt{3+5}+6$”. In any case, when you are reading any text you don’t pronounce the words, you just take in their meaning. Why not take in the meaning of algebraic expressions in the same way?
Parenthetic assertions
A parenthetic assertion is a symbolic assertion embedded in a sentence in math English in such a way that it is a subordinate clause.
Example
In the math English sentence
“For every real number $x\gt0$ there is a real number $y$ such that $x\gt y\gt0$”
mentioned above, the symbolic assertion “$x\gt0$” plays the role of a subordinate clause.
It is not merely that the pronunciation is different compared to that of the independent statement “$x\gt0$”. The math English sentence is hard to parse. The obvious (to an experienced mathematician) meaning is that the beginning of the sentence can be read this way: “For every real number $x$, which is bigger than $0$…”.
But a new student might try to read it as “For every real number $x$ is greater than $0$ …” by literally substituting the standalone meaning of “$x\gt0$” where it occurs in the sentence. This makes the text what linguists call a garden path sentence. The student has to stop and start over to try to make sense of it, and the symbolic expression lacks the natural language hints that help understand how it should be read.
Note that the other two symbolic expressions in the sentence are not parenthetic assertions. The phrase “real number” needs to be followed by a term, and it is, and the phrase “such that” must be followed by a clause, and it is.
More examples
- “Consider the circle $S^1\subseteq\mathbb{C}=\mathbb{R}^2$.” This has subordinate clauses to depth 2.
- “The infinite series $\displaystyle\sum_{k=1}^\infty\frac{1}{k^2}$ converges to $\displaystyle\zeta(2)=\frac{\pi^2}{6}\approx1.645$”
- “We define a null set in $I:=[a,b]$ to be a set that can be covered by a countable collection of intervals with arbitrarily small total length.” This shows a parenthetical definition.
- “Let $F:A\to B$ be a function.”
A type declaration is a function? In any case, it would be better to write this sentence simply as “Let $F:A\to B$”.
David Butler’s post Contrapositive grammar has other good examples.
Math texts are in general badly written. Students need to be taught how to read badly written math as well as how to write math clearly. Those that succeed (in my observation) in being able to read math texts often solve the problem by glancing at what is written and then reconstructing what the author is supposedly saying.
Conjectures
Some students are baffled, or at least bothered consciously or unconsciously, by parenthetic assertions, because the clues that would exist in a purely English statement are missing.
Nevertheless, many if not most math students read parenthetic assertions correctly the first time and never even notice how peculiar they are.
What makes the difference between them and the students who are stymied by parenthetic assertions?
There is another conjecture concerning parenthetic assertions below.
Context-sensitive meaning
“If” in definitions
Example
The word “if” in definitions does not mean the same thing that it means in other math statements.
- In the definition “An integer is even if it is divisible by $2$,” “if” means “if and only if”. In particular, the definition implies that an integer is not even if it is not divisible by $2$.
- In a theorem, for example “If a function is differentiable, then it is continuous”, the word “if” has the usual one-way meaning. In particular, in this case, a continuous function might not be differentiable.
Context-sensitive meaning occurs in ordinary English as well. Think of a strike in baseball.
Conjectures
The nearly universal custom of using “if” to mean “if and only if” in definitions makes it harder for students to understand implication.
This custom is not the major problem in understanding the role of definitions. See my article Definitions.
Underlying sets
Example
In a course in group theory, a lecturer may say at one point, “Let $F:G\to H$ be a homomorphism”, and at another point, “Let $g\in G$”.
In the first sentence, $G$ refers to the group, and in the second sentence it refers to the underlying set of the group.
This usage is almost universal. I think the difficulty it causes is subtle. When you refer to $\mathbb{R}$, for example, you (usually) are referring to the set of real numbers together with all its canonical structure. The way students think of it, a real number comes with its many relations and connections with the other real numbers, ordering, field properties, topology, and so on.
But in a group theory class, you may define the Klein $4$-group to be $\mathbb{Z}_2\times\mathbb{Z}_2$. Later you may say “the symmetry group of a rectangle that is not a square is the Klein $4$-group.” Almost invariably some student will balk at this.
Referring to a group by naming its underlying set is also an example of synecdoche.
Conjecture
Students expect every important set in math to have a canonical structure. When they get into a course that is a bit more abstract, suddenly the same set can have different structures, and math objects with different underlying sets can have the same structure. This catastrophic shift in a way of thinking should be described explicitly with examples.
Way back when, it got mighty upsetting when the earth started going around the sun instead of vice versa. Remind your students that these upheavals happen in the math world too.
Overloaded notation
Identity elements
A particular text may refer to the identity element of any group as $e$.
This is as far as I know not a problem for students. I think I know why: There is a generic identity element. The identity element in any group is an instantiation of that generic identity element. The generic identity element exists in the sketch for groups; every group is a functor defined on that sketch. (Or if you insist, the generic identity element exists in the first order theory for groups.) I suspect mathematicians subconsciously think of identity elements in this way.
Matrix multiplication
Matrix multiplication is not commutative. A student may forget this and write $A^2B^2=(AB)^2$. This also happens in group theory courses.
This problem occurs because the symbolic language uses the same symbol for many different operations, in this case the juxtaposition notation for multiplication. This phenomenon is called overloaded notation and is discussed in abstractmath.org here.
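A small computation shows how the overlearned reflex fails. The sketch below multiplies $2\times2$ matrices by hand; the matrices `A` and `B` and the helper `matmul` are illustrative choices, not anything taken from a particular text.

```python
# Minimal 2x2 matrix product, written out by hand so that nothing
# beyond the core language is assumed.
def matmul(X, Y):
    """Return the product of two 2x2 matrices given as nested lists."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]

# A and B are hypothetical examples chosen so that AB != BA.
A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]

lhs = matmul(matmul(A, A), matmul(B, B))   # A^2 B^2
rhs = matmul(matmul(A, B), matmul(A, B))   # (AB)^2

print(lhs)
print(rhs)
print(lhs == rhs)
```

The two results differ, even though the juxtaposition notation looks exactly like the notation for multiplying numbers, where the identity does hold.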
Conjecture
Noncommutative binary operations written using juxtaposition cause students trouble because going to noncommutative operations requires abandoning some overlearned reflexes in doing algebra.
Identity elements seem to behave the same in any binary operation, so there are no reflexes to unlearn. There are generic binary operations of various types as well. That’s why mathematicians are comfortable overloading juxtaposition. But to get to be a mathematician you have to unlearn some reflexes.
Negation
Sometimes you need to reword a math statement that contains symbolic expressions. This particularly causes trouble in connection with negation.
Ordinary English
The English language is notorious among language learners for making it complicated to negate a sentence. The negation of “I saw that movie” is “I did not see that movie”. (You have to put “do not” (using the appropriate form of “do”) before the verb and then modify the verb appropriately.) You can’t just say “I not saw that movie” (as in Spanish) or “I saw not that movie” (as in German).
Conjecture
The method in English used to negate a sentence may cause problems with math students whose native language is not English. (But does it cause math problems with those students?)
Negating symbolic expressions
Examples
- The negation of “$n$ is even and a prime” is “$n$ is either odd or it is not a prime”. The negation should not be written “$n$ is not even and a prime” because that sentence is ambiguous. In the heat of doing a proof students may sometimes think the negation is “$n$ is odd and $n$ is not a prime,” essentially forgetting about De Morgan. (He must roll over in his grave a lot.)
- The negation of “$x\gt0$” is “$x\leq0$”. It is not “$x\lt0$”. This is a very common mistake.
These examples are difficulties caused by not understanding the math. They are not directly caused by difficulties with the languages of math.
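The first example can be checked mechanically with a truth table. In this sketch `p` stands for “$n$ is even” and `q` for “$n$ is a prime”; the variable names are made up for the illustration.

```python
from itertools import product

rows = list(product([True, False], repeat=2))

for p, q in rows:
    negation = not (p and q)
    de_morgan = (not p) or (not q)       # "n is odd or n is not a prime"
    common_slip = (not p) and (not q)    # "n is odd and n is not a prime"
    print(p, q, negation, de_morgan, common_slip)

# Does each candidate agree with the true negation in every row?
de_morgan_ok = all((not (p and q)) == ((not p) or (not q)) for p, q in rows)
slip_ok = all((not (p and q)) == ((not p) and (not q)) for p, q in rows)
print(de_morgan_ok, slip_ok)
```

The De Morgan form agrees with the negation in all four rows. The common slip disagrees whenever exactly one of the two statements is true.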
Negating expressions containing parenthetic assertions
Suppose you want to prove:
“If $f:\mathbb{R}\to\mathbb{R}$ is differentiable, then $f$ is continuous”.
A good way to do this is by using the contrapositive. A mechanical way of writing the contrapositive is:
“If $f$ is not continuous, then $f:\mathbb{R}\to\mathbb{R}$ is not differentiable.”
That is not good. The sentence needs to be massaged:
“If $f:\mathbb{R}\to\mathbb{R}$ is not continuous, then $f$ is not differentiable.”
Even better would be to write the original sentence as:
“Suppose $f:\mathbb{R}\to\mathbb{R}$. Then if $f$ is differentiable, then $f$ is continuous.”
This is discussed in detail in David Butler’s post Contrapositive grammar.
Conjecture
Students need to be taught to understand parenthetic assertions that occur in the symbolic language and to learn to extract a parenthetic assertion and write it as a standalone assertion ahead of the statement it occurs in.
Scope
The scope of a word or variable consists of the part of the text for which its current definition is in effect.
Examples
- “Suppose $n$ is divisible by $4$.” The scope is probably the current paragraph or perhaps the current proof. This means that the properties of $n$ are constrained in that section of the text.
- “In this book, all rings are unitary.” This will hold for the whole book.
There are many more examples in the abstractmath.org article Scope.
If you are a grasshopper (you like to dive into the middle of a book or paper to find out what it says), the scope of a variable can be hard to determine. It is particularly difficult for commonly used words or symbols that have been defined differently from the usual usage. You may not suspect that this has happened since it might be defined once early in the text. Some books on writing mathematics have urged writers to keep global definitions to a minimum. This is good advice.
Finding the scope is considerably easier when the text is online and you can search for the definition.
Conjecture
Knowing the scope of a word or variable can be difficult. It is particularly hard when the word or variable has a large scope (chapter or whole book).
Variables
Variables are often introduced in math writing and then used in the subsequent discussion. In a complicated discussion, several variables may be referred to that have different statuses, some of them introduced several pages before. Many particular ways in which variables can cause trouble for students are discussed below. This post is restricted to trouble in connection with the languages of math. The concept of variable is difficult in itself, not just because of the way the math languages represent them, but that is not covered here.
Much of this part of the post is based on work of Susanna Epp, including three papers listed in the references. Her papers also include many references to other work in the math ed literature that has to do with understanding variables.
See also Variables in abstractmath.org and Variables in Wikipedia.
Types
Students blunder by forgetting the type of the variable they are dealing with. The example given previously of problems with matrix multiplication is occasioned by forgetting the type of a variable.
Conjecture
Students sometimes have problems because they forget the data type of the variables they are dealing with. This is primarily caused by overloaded notation.
Dependent and independent
If you define $y=x^2+1$, then $x$ is an independent variable and $y$ is a dependent variable. But dependence and independence of variables are more general than that example suggests.
In an epsilon-delta proof of the limit of a function (example below), $\varepsilon$ is independent and $\delta$ is dependent on $\varepsilon$, although not functionally dependent.
Conjecture
Distinguishing dependent and independent variables causes problems, particularly when the dependence is not clearly functional.
I recently ran across a discussion of this on the internet but failed to record where I saw it. Help!
Bound and free
This causes trouble with integration, among other things. It is discussed in abstractmath.org in Variables and Substitution. I expect to add some references to the math ed literature soon.
Instantiation
Some of these variables may be given by existential instantiation, in which case they are dependent on variables that define them. Others may be given by universal instantiation, in which case the variable is generic; it is independent of other variables, and you can’t impose arbitrary restrictions on it.
Existential instantiation
A theorem that an object exists under certain conditions allows you to name it and use it by that name in further arguments.
Example
Suppose $m$ and $n$ are integers. Then by definition, $m$ divides $n$ if there is an integer $q$ such that $n=qm$. Then you can use “$q$” in further discussion, but $q$ depends on $m$ and $n$. You must not use it with any other meaning unless you start a new paragraph and redefine it.
So the following (start of a) “proof” blunders by ignoring this restriction:
Theorem: If an integer $m$ divides both integers $n$ and $p$, then $m$ divides $n+p$.
“Proof”: “Let $n = qm$ and $p = qm$…”
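For comparison, a repaired start gives each divisibility hypothesis its own witness. The letters $q$ and $r$ below are arbitrary names; the point is only that they are different:
\[n=qm,\qquad p=rm,\qquad\text{so}\qquad n+p=(q+r)m.\]
Since $q+r$ is an integer, $m$ divides $n+p$. Using the same letter twice would silently add the unwarranted assumption that $n=p$.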
Universal instantiation
It is a theorem that for any integer $n$, there is no integer strictly between $n$ and $n+1$. So if you are given an arbitrary integer $k$, there is no integer strictly between $k$ and $k+1$. There is no integer between $42$ and $43$.
By itself, universal instantiation does not seem to cause problems, provided you pay attention to the types of your variables. (“There is no integer between $\pi$ and $\pi+1$” is false.)
However, when you introduce variables using both universal and existential quantification, students can get confused.
Example
Consider the definition of limit:
Definition: $\lim_{x\to a} f(x)=L$ if and only if for every $\epsilon\gt0$ there is a $\delta\gt0$ for which if $0\lt|x-a|\lt\delta$ then $|f(x)-L|\lt\epsilon$.
A proof for a particular instance of this definition is given in detail in Rabbits out of a Hat. In this proof, you may not put constraints on $\epsilon$ except the given one that it is positive. On the other hand, you have to come up with a definition of $\delta$ and prove that it works. The $\delta$ depends on what $f$, $a$ and $L$ are, but there are always infinitely many values of $\delta$ which fit the constraints, and you have to come up with only one. So in general, two people doing this proof will not get the same answer.
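The remark that two people will not get the same $\delta$ can be seen in a small numerical sketch. It uses a made-up instance, $f(x)=2x+1$ with $a=1$ and $L=3$, which is not the example from Rabbits out of a Hat. The helper `delta_works` only samples finitely many points, so it can expose a bad $\delta$ but cannot prove that a good one works.

```python
from fractions import Fraction

# Hypothetical instance chosen for illustration: f(x) = 2x + 1, a = 1, L = 3.
def f(x):
    return 2 * x + 1

a, L = Fraction(1), Fraction(3)

def delta_works(delta, eps, samples=50):
    """Spot-check |f(x) - L| < eps at sample points with 0 < |x - a| < delta.

    Exact rational arithmetic is used, so there is no rounding error.
    """
    for k in range(1, samples):
        offset = delta * Fraction(k, samples)   # 0 < offset < delta
        for x in (a - offset, a + offset):
            if not abs(f(x) - L) < eps:
                return False
    return True

eps = Fraction(1, 100)
print(delta_works(eps / 2, eps))   # one person's delta
print(delta_works(eps / 4, eps))   # another person's delta
print(delta_works(eps, eps))       # a delta that is too large
```

Both $\epsilon/2$ and $\epsilon/4$ pass the check, and so would any smaller positive value. Taking $\delta=\epsilon$ fails, which shows that $\delta$ cannot be chosen without regard to $f$.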
Reference
Susanna Epp’s paper Proof issues with existential quantification discusses the problems that students have with both existential and universal quantification with excellent examples. In particular, that paper gives examples of problems students have that are not hinted at here.
References
A nearly final version of The Handbook of Mathematical Discourse is available on the web with links, including all the citations. This version contains some broken links. I am unable to recompile it because TeX has evolved enough since 2003 that the source no longer compiles. The paperback version (without the citations) can be bought as a book here. (There are usually cheaper used versions on Amazon.)
Abstractmath.org is a website for beginning students in abstract mathematics. It includes most of the material in the Handbook, but not the citations. The Introduction gives you a clue as to what it is about.
Two languages
My take on the two languages of math is discussed in these articles:
- Mathematical English
- The Introduction to The Handbook of Mathematical Discourse.
- The symbolic language of math.
The Language of Mathematics, by Mohan Ganesalingam, covers these two languages in more detail than any other book I know of. He says right away on page 18 that mathematical language consists of “textual sentences with symbolic material embedded like ‘islands’ in the text.” So for him, math language is one language.
I have envisioned two separate languages for math in abstractmath.org and in the Handbook, because in fact you can in principle translate any mathematical text into either English or logical notation (first order logic or type theory), although the result in either case would be impossible to understand for any sizeable text.
Topics in abstractmath.org
Context-sensitive interpretation.
Topics in the Handbook of mathematical discourse.
These topics have a strong overlap with the topics with the same name in abstractmath.org. They are included here because the Handbook contains links to citations of the usage.
Posts in Gyre&Gimble
Syntactic and semantic thinkers
Technical meanings clash with everyday meanings
Three kinds of mathematical thinkers
Variations in meaning in math.
Other references
Contrapositive grammar, blog post by David Butler.
Proof issues with existential quantification, by Susanna Epp.
The role of logic in teaching proof, by Susanna Epp (2003).
The language of quantification in mathematics instruction, by Susanna Epp (1999).
The Language of Mathematics: A Linguistic and Philosophical Investigation, by Mohan Ganesalingam, 2013. (Not available from the internet.)
On the communication of mathematical reasoning, by Atish Bagchi and Charles Wells (1998a), PRIMUS, volume 8, pages 15–27.
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.
Math majors attacked by cognitive dissonance
In some situations you may have conflicting information from different sources about a subject. The resulting confusion in your thinking is called cognitive dissonance.
It may happen that a person suffering cognitive dissonance suppresses one of the ways of understanding in order to resolve the conflict. For example, at a certain stage in learning English, you (small child or non-native-English speaker) may learn a rule that the past tense is made from the present form by adding “-ed”. So you say “bringed” instead of “brought” even though you may have heard people use “brought” many times. You have suppressed the evidence in favor of the rule.
Some of the ways cognitive dissonance can affect learning math are discussed here.
Metaphorical contamination
We think about math objects using metaphors, as we do with most concepts that are not totally concrete. The metaphors are imperfect, suggesting facts about the objects that may not follow from the definition. This is discussed at length in the section on images and metaphors here.
The real line
Mathematicians think of the real numbers as constituting a line infinitely long in both directions, with each number as a point on the line. But this does not mean that you can think of the line as a row of points. Between any two points there are uncountably many other points. See density of the reals.
Infinite math objects
One of the most intransigent examples of metaphorical contamination occurs when students think about countably infinite sets. Their metaphor is that a sequence such as the set of natural numbers $\{0,1,2,3,4,\ldots\}$ “goes on forever but never ends”. The metaphor mathematicians have in mind is quite different: The natural numbers constitute the set that contains every natural number right now.
Example
An excruciating example of this is the true statement
“$.999\ldots=1.0$.” The notion that it can’t be true comes from thinking of “$0.999\ldots$” as consisting of the list of numbers \[0.9,0.99,0.999,0.9999,0.99999,\ldots\] which the student may say “gets closer and closer to $1.0$ but never gets there”.
Now consider the way a mathematician thinks: The numbers are all already there, and they make a set.
The proof that $.999\ldots=1.0$ has several steps. In the list below, I have inserted some remarks that indicate areas of abstract math that beginning students have trouble with.
- The elements of an infinite set are all in it at once. This is the way mathematicians think about infinite sets.
- By definition, an infinite decimal expansion represents the unique real number that is a limit point of its set of truncations.
- It follows from $\epsilon-\delta$ machinations that the limit of the sequence $0.9,0.99,0.999,0.9999,0.99999,\ldots$ is $1.0$
- That means “$0.999\ldots$” represents $1.0$. (Enclosing a mathematical expression in quotes turns it into a string of characters.)
- The statement “$A$” represents $B$ is equivalent to the statement $A=B$. (Have you ever heard a teacher point this out?)
- It follows that $0.999\ldots=1.0$.
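The $\epsilon$ step in the list can be made concrete. Write $s_n$ (a name introduced here for illustration) for the truncation with $n$ nines:
\[s_n=0.\underbrace{99\ldots9}_{n}=1-10^{-n},\qquad\text{so}\qquad|1-s_n|=10^{-n}.\]
Given any $\epsilon\gt0$, choose $N$ with $10^{-N}\lt\epsilon$. Then $|1-s_n|\lt\epsilon$ for every $n\geq N$, which is exactly what it means for the limit of the truncations to be $1.0$.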
The problem that occurs with the word “definition” in this case is that a definition appears to be a dictatorial act. The student needs to know why you made this definition. This is not a stupid request. The act can be justified by the way the definition gets along with the algebraic and topological characteristics of the real numbers.
Each one of these steps should be made explicit. Even the Wikipedia article, which is regarded as a well written document, doesn’t make all of the points explicit.
Semantic contamination
Many math objects have names that are ordinary English words.
(See names.) So the person learning about them is faced with two inputs:
- The definition of the word as a math object.
- The meaning and connotations of the word in English.
It is easy and natural to suppress the information given by the definition (or part of it) and rely only on the English meaning. But math does not work that way:
If another source of understanding contradicts the definition
THE DEFINITION WINS.
“Cardinality”
The connotations of a name may fit the concept in some ways and not others. Infinite cardinal numbers are a notorious example of this: there are some ways in which they are like numbers and others in which they are not.
For a finite set, the cardinality of the set is the number of elements in the set. Long ago, mathematicians started talking about the cardinality of an infinite set. They worked out a lot of facts about that, for example:
- The cardinality of the set of natural numbers is the same as the cardinality of the set of rational numbers.
- The cardinality of the set of points on the real line is the same as the cardinality of the set of points in the real plane.
The teacher may even say that there are just as many points on the real line as in the real plane. And know-it-all math majors will say that to their friends.
Many students will find that totally bizarre. Essentially, what has happened is that the math dictators have taken the word “cardinality” to mean what it usually means for finite sets and extended it to infinite sets by using a perfectly consistent (and useful) definition of “cardinality” which has very different properties from the finite case.
That causes a perfect storm of cognitive dissonance.
Math majors must learn to get used to situations like this; they occur in all branches of math. But it is bad behavior to use the phrase “the same number of elements” with non-mathematicians. Indeed, I don’t think you should use the word cardinality in that setting either: you should refer to a “one-to-one correspondence” instead and admit up front that the existence of such a correspondence is quite amazing.
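One way to make such a one-to-one correspondence tangible is to list rational numbers one after another, so that the first natural number is paired with the first item listed, the second with the second, and so on. The sketch below uses a known recurrence for the Calkin-Wilf sequence, which is an outside technique and not something from this post; it visits every positive rational exactly once. The function name is made up for the illustration, and only the positive rationals are covered.

```python
from fractions import Fraction
from itertools import islice
from math import floor

def positive_rationals():
    """Yield the Calkin-Wilf sequence 1, 1/2, 2, 1/3, 3/2, ... forever."""
    q = Fraction(1)
    while True:
        yield q
        # Recurrence for the next term of the sequence.
        q = 1 / (2 * floor(q) - q + 1)

first = list(islice(positive_rationals(), 7))
print(first)

# No fraction is repeated among the first 1000 terms.
print(len(set(islice(positive_rationals(), 1000))))
```

The listing never terminates, of course. What matters for the correspondence is that each positive rational turns up at some definite position in the list.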
“Series”
Let’s look at the word “series” in more detail. In ordinary English, a series is a bunch of things, one after the other.
- The World Series is a series of up to seven games, coming one after another in time.
- A series of books is not just a bunch of books, but a bunch of books in order.
- In the case of the Harry Potter series the books are meant to be read in order.
- A publisher might publish a series of books on science, named Physics, Chemistry, Astronomy, Biology, and so on, that are not meant to be read in order, but the publisher will still list them in order. (What else could they do? See Representing and thinking about sets.)
Infinite series in math
In mathematics an infinite series is an object expressed like this:
\[\sum_{k=1}^{\infty}a_k\]
where the $a_k$ are numbers. It has partial sums
\[\sum_{k=1}^{n}a_k\]
For example, if $a_k$ is defined to be $1/k^2$ for positive integers $k$, then
\[\sum_{k=1}^{6}a_k=1+\frac{1}{4}+\frac{1}{9}+\frac{1}{16}+\frac{1}{25}+\frac{1}{36}=\frac{5369}{3600}\approx 1.49\]
This infinite series converges to $\zeta (2)$, which is about $1.645$. (This is not obvious. See the Zeta-function article in Wikipedia.) So this “infinite series” is really an infinite sum. It does not fit the image given by the English word “series”. The English meaning contaminates the mathematical meaning. But the definition wins.
The mathematical word that corresponds to the usual meaning of “series” is “sequence”. For example, $a_k:=1/k^2$ is the infinite sequence $1,\frac{1}{4},\frac{1}{9},\frac{1}{16},\ldots$. It is not an infinite series.
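The difference between the sequence and the series shows up clearly when both are computed. This sketch uses exact fractions; the function names `a` and `partial_sum` are chosen here to mirror the notation above.

```python
from fractions import Fraction
from math import pi

def a(k):
    """k-th term of the sequence a_k = 1/k^2."""
    return Fraction(1, k * k)

def partial_sum(n):
    """Sum of a_1 through a_n, computed exactly."""
    return sum(a(k) for k in range(1, n + 1))

sequence = [a(k) for k in range(1, 5)]   # the terms themselves, in order
s6 = partial_sum(6)                      # the partial sum worked out above

print(sequence)
print(s6, float(s6))
print(float(partial_sum(1000)), pi ** 2 / 6)
```

The sequence is a list of numbers that shrink toward $0$. The partial sums are single numbers that creep up, rather slowly, toward $\zeta(2)$.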
“Only if”
“Only if” is also discussed from a more technical point of view in the article on conditional assertions.
In math English, sentences of the form “$P$ only if $Q$” mean exactly the same thing as “If $P$ then $Q$”. The phrase “only if” is rarely used this way in ordinary English discourse.
Sentences of the form “$P$ only if $Q$” about ordinary everyday things generally do not mean the same thing as “If $P$ then $Q$”. That is because in such situations there are considerations of time and causation that do not come up with mathematical objects. Consider “If it is raining, I will carry an umbrella” (seeing the rain will cause me to carry the umbrella) and “It is raining only if I carry an umbrella” (which sounds like my carrying an umbrella will cause it to rain). When “$P$ only if $Q$” is about math objects, there is no question of time and causation because math objects are inert and unchanging.
Students sometimes flatly refuse to believe me when I tell them about the mathematical meaning of “only if”. This is a classic example of semantic contamination. Two sources of information appear to contradict each other, in this case (1) the professor and (2) a lifetime of intimate experience with the English language. The information from one of these sources must be rejected or suppressed. It is hardly surprising that many students prefer to suppress the professor’s apparently unnatural and usually unmotivated claims.
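A truth table is one way to motivate the professor’s claim. The sketch reads “$P$ only if $Q$” as “if $Q$ fails, then $P$ fails”, which is an interpretation supplied here and not a quotation from the text. It then checks that this has the same truth value as “If $P$ then $Q$” in all four cases. The helper `implies` is a made-up name.

```python
from itertools import product

def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

rows = list(product([True, False], repeat=2))

# "P only if Q", read as "not Q implies not P", against "if P then Q".
same = all(implies(p, q) == implies(not q, not p) for p, q in rows)

# For contrast: the converse "if Q then P" is a different statement.
converse_same = all(implies(p, q) == implies(q, p) for p, q in rows)

print(same, converse_same)
```

The first check succeeds in every row and the second does not. This matches the mathematical usage, and it explains why reading “only if” as the converse leads to trouble.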
These words also cause severe cognitive dissonance
- “If” causes notorious difficulties for beginners and even later. They are discussed in abmath here and here.
- A, an and the implicitly signal the universal quantifier in certain math usages. They cause a good bit of trouble in the early days of some students.
The following cause more minor cognitive dissonance.
References for semantic contamination
Besides the examples given above, you can find many others in these two works:
- Pimm, D. (1987), Speaking Mathematically: Communications in Mathematics Classrooms. Routledge & Kegan Paul.
- Hersh, R. (1997), “Math lingo vs. plain English: Double entendre”. American Mathematical Monthly, vol. 104, pages 48-51.
More alphabets
This post is the third and last in a series of posts containing revisions of the abstractmath.org article Alphabets. The first two were:
Addition to the listings for the Greek alphabet
Sigma: $\Sigma,\,\sigma$ or ς: sĭg'mɘ. The upper case $\Sigma $ is used for indexed sums. The lower case $\sigma$ (don't call it "oh") is used for the standard deviation and also for the sum-of-divisors function. The ς form for the lower case has not as far as I know been used in math writing, but I understand that someone is writing a paper that will use it.
Hebrew alphabet
Aleph, א is the only Hebrew letter that is widely used in math. $א_0$ is the cardinality of the set of integers. A set with cardinality $א_0$ is countably infinite. More generally, $א_0$ is the first of the aleph numbers $א_0$, $א_1$, $א_2$, and so on.
Cardinality theorists also write about the beth (ב) numbers, and the gimel (ג) function. I am not aware of other uses of the Hebrew alphabet.
If you are thinking of using other Hebrew letters, watch out: If you type two Hebrew letters in a row in HTML they show up on the screen in reverse order. (I didn't know HTML was so clever.)
Cyrillic alphabet
The Cyrillic alphabet is used to write Russian and many other languages in that area of the world. Wikipedia says that the letter Ш, pronounced "sha", is the only Cyrillic letter used in math. I have not investigated further.
The letter is used in several different fields, to denote the Tate-Shafarevich group, the Dirac comb and the shuffle product.
It seems to me that there is a whole world of possibilities for brash young mathematicians to name mathematical objects with other Cyrillic letters. Examples:
- Ж. Use it for an ornate construction, like the Hopf fibration or a wreath product.
- Щ. This would be mean because it is hard to pronounce.
- Ъ. Guaranteed to drive people crazy, since it is silent. (It does have a name, though: "Yehr".)
- Э. Its pronunciation indicates you are unimpressed (think Fonz).
- ю. Pronounced "you". "ю may provide a counterexample". "I do?"
Type styles
Boldface and italics
A typeface is a particular design of letters. The typeface you are reading is Arial. This is Times New Roman. This is Goudy. (Goudy may not render correctly on your screen if you don't have it installed.)
Typefaces typically come in several styles, such as bold (or boldface) and italic.
Examples
Arial Normal | Arial italic | Arial bold |
Times Normal | Times italic | Times bold |
Goudy Normal | Goudy italic | Goudy bold |
Boldface and italics are used with special meanings (conventions) in mathematics. Not every author follows these conventions.
Styles (bold, italic, etc.) of a particular typeface are supposedly called fonts. In fact, these days “font” almost always means the same thing as “typeface”, so I use “style” instead of “font”.
Vectors
A letter denoting a vector is put in boldface by many authors.
Examples
- “Let $\mathbf{v}$ be a vector in 3-space.” Its coordinates typically would be denoted by $v_1$, $v_2$ and $v_3$.
- You could also define it this way: “Let $\mathbf{v}=(v_1,v_2,v_3)$ be a vector in 3-space.” (See parenthetic assertion.)
It is hard to do boldface on a chalkboard, so lecturers may use $\vec{v}$ instead of $\mathbf{v}$. This is also seen in print.
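To see the two conventions side by side, here is the same definition written both ways, followed by a sum. The second vector $\mathbf{w}$ is my own addition for the sake of the illustration:
\[\mathbf{v}=(v_1,v_2,v_3),\qquad \vec{v}=(v_1,v_2,v_3),\qquad \mathbf{v}+\mathbf{w}=(v_1+w_1,\,v_2+w_2,\,v_3+w_3).\]
In both conventions the coordinates are not put in boldface and carry no arrow, because they are numbers, not vectors.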
Definitions
The definiendum (word or phrase being defined) may be put in boldface or italics. Sometimes the boldface or italics is the only clue you have that the term is being defined. See Definitions.
Example
“A group is Abelian if its multiplication is commutative,” with “Abelian” set in boldface, or the same sentence with “Abelian” set in italics.
Emphasis
Italics are used for emphasis, just as in general English prose. Rarely (in my experience) boldface may be used for emphasis.
In the symbolic language
It is standard practice in printed math to put single-letter variables in italics. Multiletter identifiers are usually upright.
Example
"$f(x)=ax^2+\sin x$". Note that mathematicians would typically refer to $a$ as a “constant” or “parameter”, but in the sense we use the word “variable” here, it is a variable, and so is $f$.
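The reason for the upright convention shows up if you type a multiletter identifier as if it were a string of single-letter variables. This is a sketch of what TeX does with the two inputs “sin x” and “\sin x” in math mode:
\[sin x\qquad\text{versus}\qquad \sin x\]
The first is set as four italic letters and reads like the product of $s$, $i$, $n$ and $x$. The second is the sine function applied to $x$.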
Example
On the other hand, “e” is the proper name of a specific number, and so is “i”. Neither is a variable. Nevertheless in print they are usually given in italics, as in $e^{ix}=\cos x+i\sin x$. Some authors would write this as $\text{e}^{\text{i}x}=\cos x+\text{i}\,\sin x$. This practice is recommended by some stylebooks for scientific writing, but I don't think it is very common in math.
Blackboard bold
Blackboard bold letters are capital Roman letters written with double vertical strokes. They look like this:
\[\mathbb{A}\,\mathbb{B}\,\mathbb{C}\,\mathbb{D}\,\mathbb{E}\,\mathbb{F}\,\mathbb{G}\,\mathbb{H}\,\mathbb{I}\,\mathbb{J}\,\mathbb{K}\,\mathbb{L}\,\mathbb{M}\,\mathbb{N}\,\mathbb{O}\,\mathbb{P}\,\mathbb{Q}\,\mathbb{R}\,\mathbb{S}\,\mathbb{T}\,\mathbb{U}\,\mathbb{V}\,\mathbb{W}\,\mathbb{X}\,\mathbb{Y}\,\mathbb{Z}\]
In lectures using chalkboards, they are used to imitate boldface.
In print, the most common use is to represent certain sets of numbers:
- $\mathbb{C}$ Complex numbers
- $\mathbb{H}$ Quaternions
- $\mathbb{I}$ Integers, mostly at high school levels. Also the unit interval.
- $\mathbb{N}$ Natural numbers, either including or excluding $0$.
- $\mathbb{R}$ Real numbers
- $\mathbb{Z}$ Integers. $\mathbb{Z}$ is much more common than $\mathbb{I}$ in research math.
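A typical appearance in print, using only the letters listed above, is the chain of inclusions below. It is meant as a sketch: each set is identified with its usual copy inside the next one, and the blackboard bold letters assume that the AMS fonts are available.
\[\mathbb{N}\subset\mathbb{Z}\subset\mathbb{R}\subset\mathbb{C}\subset\mathbb{H}\]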
Remarks
- Mathematica uses some lower case blackboard bold letters.
- Many mathematical writers disapprove of using blackboard bold in print. I say the more different letter shapes that are available the better. Also a letter in blackboard bold is easier to distinguish from ordinary upright letters than a letter in boldface is, particularly on computer screens.