Mathematical Information II

Introduction

This is the second post about Mathematical Information inspired by talks at the AMS meeting in Seattle in January 2016. The first post was Mathematical Information I. That post covered, among other things, types of explanations.

In this post as in the previous one, footnotes link to talks at Seattle that inspired me to write about a topic. The speakers may not agree with what I say.

The internet

Math sources on the internet

Publishing math on the internet

  • Publishing on the internet is instantaneous, in the sense that once it is written (which of course may take a long time), it can be made available on the internet immediately.
  • Publishing online is also cheap. It requires only a modest computer, an editor and LaTeX or MathJax, all of which are either free, one-time purchases, or available from your university. (These days all these items are required for publishing a math book on paper or submitting an article to a paper journal as well as for publishing on the internet.)
  • Publishing online has the advantage that taking up more space does not cost more. I believe this is widely underappreciated. You can add comments explaining how you think about some type of math object, or about false starts that you had to abandon, and so on. If you want to refer to a diagram that occurs in another place in the paper, you can simply include a copy in the current place. (It took me much too long to realize that I could do things like that in abstractmath.org.)

Online journals

Many new online journals have appeared in the last few years. Some of them are deliberately intended as a way to avoid putting papers behind a paywall. But aside from that, online journals speed up publication and reduce costs (not necessarily to zero if the journal is refereed).

A special type of online journal is the overlay journal [G]. A paper published there is posted on ArXiv; the journal merely links to it. This provides a way of refereeing articles that appear on ArXiv. It seems to me that such journals could include articles that already appear on ArXiv if the referees deem them suitable.

Types of mathematical communication

I wrote about some types of math communication in Mathematical Information I.

The paper Varieties of Mathematical Prose, by Atish Bagchi and me, describes other forms of communicating math not described here.

What mathematicians would like to know

Has this statement been proved? [G]

  • The internet has already made it easier to answer this query: Post it on MathOverflow or Math Stack Exchange.
  • It should be a long-term goal of the math community to construct a database of what is known. This would be a difficult, long-term project. I discussed it in my article The Mathematical Depository: A Proposal, which concentrated on how the depository should work as a system. Constructing it would require machine reading and understanding of mathematical prose, which is difficult and not something I know much about (the article gives some references).
  • An approach that would be completely different from the depository might be through a database of proved theorems that anyone could contribute to, like a wiki, but with editing to maintain consistency, avoid repetition, etc.

Known information about a conjecture

This information could include partial results [G]. An example would be Faltings’s Theorem, which implies a partial result for Fermat’s Last Theorem: for each fixed $n\geq4$, the equation $x^n+y^n=z^n$ has only finitely many solutions in pairwise coprime integers $x, y, z$. That theorem became widely known, but many partial results never even get published.

Strategies for proofs

Strategies that are useful in a particular field.

The website Tricki is developing a list of such strategies.

It appears that Tricki should be referred to as “The Tricki”, like The Hague and The Bronx.

Note that there are strategies that essentially work just once, to prove some important theorem. An example is Craig’s Trick, which proves that a theory with a recursively enumerable set of axioms also has a recursive set of axioms. But of course, who can say that it will never be useful for some other theorem? I can’t think of how, though.

Strategies that don’t work, and why [G]

The article How to discover for yourself the solution of the cubic, by Timothy Gowers, leads you down the garden path of trying to “complete the cubic” by copying the way you solve a quadratic, and then showing conclusively that that can’t possibly work.
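A rough sketch of the obstruction, in notation that is assumed here for illustration (the coefficients $a, b, c, d, e, p, q$ are not taken from Gowers’s article): completing the square always works, because

$$x^2+bx+c=\left(x+\frac{b}{2}\right)^2+\left(c-\frac{b^2}{4}\right).$$

The analogous attempt for a cubic would write $x^3+ax^2+bx+c$ as $(x+d)^3+e$. Expanding,

$$(x+d)^3+e=x^3+3dx^2+3d^2x+(d^3+e),$$

so matching coefficients forces $d=a/3$ and then $b=3d^2=a^2/3$, a condition that most cubics do not satisfy. The best the substitution $y=x+a/3$ can do is remove the quadratic term:

$$x^3+ax^2+bx+c=y^3+py+q,\qquad p=b-\frac{a^2}{3},\quad q=c-\frac{ab}{3}+\frac{2a^3}{27}.$$

The linear term $py$ survives, which is why a genuinely new idea is needed for the cubic.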

Instructors should point out situations like that in class when they are relevant. A database of Methods That Work Here But Not There would be helpful, too. And, most important of all, if you run into a method that doesn’t work when you are trying to prove a theorem, when you do prove it, mention the failed method in your paper! (Remember: space is now free.)

Examples and Counterexamples

I discovered these examples in twenty minutes on the internet.

Discussions

“Mathematical discussion is very useful and virtually unpublishable.” [G] But in the internet age discussions can take place online, and they do, in discussion lists for particular branches of math. That is not the same thing as discussing in person, but it is still useful.

Polymath [G]

Polymath sessions are organized attempts to use a kind of crowdsourcing to study (and hopefully prove) a conjecture. The Polymath blog and the Polymath wiki provide information about ongoing efforts.

Videos

  • Videos that teach math are used all over the world now, after the spectacular success of Khan Academy.
  • Some math meetings produce videos of invited talks and make them available on YouTube. It would be wonderful if a systematic effort could be made to increase the number of such videos. I suppose part of the problem is that someone has to operate the equipment. It is not impossible that filming an academic lecture could be automated, but I don’t know if anyone is doing this. It ought to be possible. After all, some computer games follow the motions of the player(s).
  • There are some documentaries explaining research-level math to the general public, but I don’t know much about them. Documentaries about other sciences seem much more common.

References

The talks in Seattle

  • List of all the talks.
  • W. Timothy Gowers, How should mathe­matical knowledge be organized? Talk at the AMS Special Session on Mathe­matical Information in the Digital Age of Science, 6 January 2016.
  • Mathematical discussions, links to pages by Timothy Gowers. “Often [these pages] contain ideas that I have come across in one way or another and wish I had been told as an undergraduate.”
  • Colloquium notes

Creative Commons License

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.


abstractmath.org beta

Around two years ago I began a systematic revision of abstractmath.org. This involved rewriting some of the articles completely, fixing many errors and bad links, and deleting some articles. It also involved changing over from using Word and MathType to writing directly in HTML and using MathJax. The changeover was very time-consuming.

Before I started the revision, abstractmath.org was in alpha mode, and now it is in beta. That means it still has flaws, and I will be repairing them probably till I can’t work any more, but it is essentially in a form that approximates my original intention for the website.

I do not intend to bring it out of beta into “final form”. I have written and published three books, two of them with Michael Barr, and I found that the detailed work necessary to put a book into a final form, where it stays frozen, was difficult and took me away from things I wanted to do. I had to do it that way then (the olden days before the internet), but now I think websites that are constantly updated and have live links are far more useful to people who want to learn about some piece of math.

My last book, the Handbook of Mathematical Discourse, was in fact published after the internet was well under way, but I was still thinking in Olden Days Paper Mode and never clearly realized that there was a better way to do things.

In any case, the entire website (as well as Gyre&Gimble) is published under a Creative Commons license, so if someone wants to include part or all of it in another website, or in a book, and revise it while they do it, they can do so as long as they publish under the terms of the license and link to abstractmath.org.

Previous posts about the evolution of abstractmath.org

Books by Michael Barr and Charles Wells

Toposes, triples and theories

Category theory for computing science


Mathematical Information I

Introduction

The January, 2016 meeting of the American Mathematical Society in Seattle included a special session on Mathe­matical Information in the Digital Age of Science. Here is a link to the list of talks in that session (you have to scroll down a ways to get to the list).

Several talks at that session were about communi­cating math, to other mathe­maticians and to the general public. Well, that’s what I have been about for the last 20 years. Mostly.

Overview

These posts discuss the ways we communi­cate math and (mostly in later posts) the revolution in math communication that the internet has caused. Parts of this discussion were inspired by the special session talks. When they are relevant, I include footnotes referring to the talks. Be warned that what I say about these ideas may not be the same as what the speakers had to say, but I feel I ought to give them credit for getting me to think about those concepts.

Some caveats

  • The distinctions between different kinds of math communi­cation are inevitably fuzzy.
  • Not all kinds of communication are mentioned.
  • Several types of communication normally occur in the same document.

Articles published in journals

Until recently, math journals were always published on paper. Now many journals exist only on the internet. What follows is a survey of the types of articles published in journals.

Refereed papers containing new results

These communications typically contain proofs of (usually new) theorems. Such papers are the main way that academic mathematicians get credit for their research [G] for the purpose of getting tenure (at least in the USA), although some other types of credit are noted below.

Proofs published in refereed journals in the past were generally restricted to formal proofs, without very many comments intended to aid the reader’s understanding. This restriction was often enforced by the journal. In the olden days this would have been prompted by the expense of publishing on paper. I am not sure how much this restriction has relaxed in electronic journals.

I have been writing articles for abstractmath.org and Gyre&Gimble for many years, and it has taken me a very long time to get over unnecessarily restricting the space I use in what I write. If I introduce a diagram in an article and then want to refer to it later, I don’t have to link to it — I can copy it into the current location. If it makes sense for an informative paragraph to occur in two different articles, I can put it into both articles. And so on. Nowadays, that sort of thing doesn’t cost anything.

Survey articles and invited addresses

You may also get credit for an invited address to a prestigious organi­zation, or for a survey of your field, in for example the Bulletin of the AMS. Invited addresses and surveys may contain considerably more explanatory asides. This was quite noticeable in the invited talks at the AMS Seattle meeting.

Books

There is a whole spectrum of math books. The following list mentions some Fraunhofer lines on the spectrum, but the gamut really is as continuous as a large finite list of books could be. This list needs more examples. (This is a blog post, so it has the status of an alpha release.)

Research books that are concise and without much explanation.

The Bourbaki books that I have dipped into (mostly the algebra book and mostly in the 1970’s) are definitely concise and seem to strictly avoid explanation, diagrams, pictures, etc. I have heard people say they are unreadable, but I have not found them so.

Books that contain helpful explanations that will make sense to people in the field but would probably be formidable to someone in a substantially different area.

Toposes, triples and theories, by Michael Barr and Charles Wells. I am placing our book here in the spectrum because several non-category-theorists (some of them computer scientists) have remarked that it is “formidable” or other words like that.

Intended to introduce professional mathematicians to a particular field.

Categories for the working mathematician, by Saunders Mac Lane. I learned from this (the 1971 edition) in my early days as a category theorist, six years after getting my Ph.D. In fact, I think that this book belongs to the grad student level instead of here, but I have not heard any comments one way or another.

Intended to introduce math graduate students to a particular field.

There are lots of examples of good books in this area. Years ago (but well after I got my Ph.D.), I found Serge Lang’s Algebra quite useful and studied parts of it in detail.

But for grad students? It is still used for grad students, but perhaps Nathan Jacobson’s Basic Algebra would be a better choice for a first course in algebra for first-year grad students.

The post My early life as a mathematician discusses algebra texts in the olden days, among other things.

Intended to explain a part of math to a general audience.

Love and math: the heart of hidden reality, by Edward Frenkel, 2014. This is a wonderful book. After reading it, I felt that at last I had some clue as to what was going on with the Langlands Program. He assumes that the reader knows very little about math and gives hand-waving pictorial explanations for some of the ideas. Many of the concepts in the book were already familiar to me (not at an expert level). I doubt that someone who had had no college math courses that included some abstract math would get much out of it.

Symmetry: A Journey into the Patterns of Nature, by Marcus du Sautoy, 2009. He also produced a video on symmetry.

My post Explaining “higher” math to beginners describes du Sautoy’s use of terminology (among others).

Secrets of creation: the mystery of the prime numbers (Volume 1) by Matthew Watkins (author) and Matt Tweed (illustrator), 2015. This is the first book of a trilogy that explains the connection between the Riemann $\zeta$ function and the primes. He uses pictures and verbal descriptions, and very little terminology or symbolic notation. This is the best attempt I know of at explaining deep math that might really work for non-mathematicians.

My post The mystery of the prime numbers: a review describes the first book.

Piper Harron’s Thesis

The Equidistribution of Lattice Shapes of Rings of Integers of Cubic, Quartic, and Quintic Number Fields: an Artist’s Rendering, Ph.D. thesis by Piper Harron.

This is a remarkable departure from the usual dry, condensed, no-useful-asides Ph.D. thesis in math. Each chapter has three main parts: Layscape (explanations for nonspecialists, though in my opinion not for nonmathematicians), Mathscape (most like what goes into the usual math paper but with much more explanation) and Weedscape (irrelevant stuff which she found helpful and perhaps the reader will too). The names of these three sections vary from chapter to chapter. This seems like a great idea, and the parts I have read are well done.

These blog posts have useful comments about her thesis:

Types of explanations

Any explanation of math in any of the categories above will be of several different types. Some of them are considered here, and more will appear in Mathematical Information II.

The paper Varieties of Mathematical Prose, by Atish Bagchi and me, provides a more fine-grained description of certain types of math communication that includes some types of explanations and also other types of communication.

Images and metaphors

In abstractmath.org

I have written about images and metaphors in abstractmath.org:

Abstractmath.org is aimed at helping students who are beginning their study of abstract math, and so the examples are mostly simple and not at a high level of abstraction. In the general literature, the images and metaphors that are written about may be much more sophisticated.

The User’s Guide [W]

Luke Wolcott edits a new journal called Enchiridion: Mathematics User’s Guides (this link allows you to download the articles in the first issue). Each article in this journal is written by a mathematician who has published a research paper in a refereed journal. The author’s article in Enchiridion provides information intended to help the reader to understand the research paper. Enchiridion and its rationale are described in more detail in the paper The User’s Guide Project: Giving Experiential Context to Research Papers.

The guidelines for writing a User’s Guide suggest writing them in four parts, and one of the parts is to introduce useful images and metaphors that helped the author. You can see how the authors’ user’s guides carry this out in the first issue of Enchiridion.

Piper Harron’s thesis

Piper Harron’s explanation of integrals in her thesis describes integrals and measures using creative metaphors that I think may raise some mathematicians’ consciousness and others’ hackles, but I doubt it would be informative to a non-mathematician. I love “funky-summing” (p. 116ff): it communicates how integration is related to really adding up a finite bunch of numbers in a liberal-artsy way, in other words via the connotations of the word “funky”, in contrast to rigorous math, which depends on every word having an accumulation-of-properties definition.

The point about “funky-summing” (in my opinion, not necessarily Harron’s) is that when you take the limit of all the Riemann sums as all meshes go to zero, you get a number which

  • Is really and truly not a sum of numbers in any way
  • Smells like a sum of numbers
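The contrast in these two bullets can be made concrete with a small sketch in Python. The function name riemann_sum, the use of left endpoints and the test function $x^2$ on $[0,1]$ are illustrative assumptions, not taken from Harron’s thesis. Every value the code computes is an honest finite sum; the integral, $1/3$, is only the number those sums approach.

```python
def riemann_sum(f, a, b, n):
    """Left-endpoint Riemann sum of f on [a, b] using n equal subintervals."""
    width = (b - a) / n
    return sum(f(a + k * width) for k in range(n)) * width


def square(x):
    return x * x


# Each entry is a genuine finite sum of numbers.
approximations = [riemann_sum(square, 0.0, 1.0, n) for n in (10, 100, 1000, 10000)]

# The integral of x^2 over [0, 1] is 1/3; it is the limit of the sums,
# not one of them, and the distance to it shrinks as the mesh shrinks.
errors = [abs(s - 1 / 3) for s in approximations]
```

No single mesh produces $1/3$ here; the errors simply shrink as the number of subintervals grows, which is the sense in which the limit “smells like” a sum without being one.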

Connotations communicate metaphors. Metaphors are a major cause of grief for students beginning abstract math, but they are necessary for understanding math. Working around this paradox is probably the most important problem for math teachers.

Informal summaries of a proofW

The User’s Guide requires a “colloquial summary” of a paper as one of the four parts of the guide for that paper.

  • Wolcott’s colloquial summary of his paper keeps the level aimed at non-mathematicians, starting with a hand-waving explanation of what a ring is. He uses many metaphors in the process of explaining what his paper does.
  • The colloquial summary of another User’s Guide, by Cary Malkiewich, stays strictly at the general-public level. He uses a few metaphors. I liked his explanation of how mathematicians work first with examples, then finding patterns among the examples.
  • The colloquial summary of David White’s paper stays at the general-public level but uses some neat metaphors. He also has a perceptive paragraph discussing the role of category theory in math.

The summaries I just mentioned are interesting to read. But I wonder if informal summaries aimed at math majors or early grad students might be more useful.

Insights

The first of the four parts of the explanatory papers in Enchiridion is supposed to present the key insights and organizing principles that were useful in coming up with the proofs. Some of them do a good job with this. They are mostly very special to the work in question, but some are more general.

This suggests that when teaching a course in some math subject you make a point of explaining the basic techniques that have turned out very useful in the subject.

For example, a fundamental insight in group theory is:

Study the linear representations of a group.

That is an excellent example of a fundamental insight that applies everywhere in math:

Find a functor that maps the math objects you are studying to objects in a different branch of math.

The organizing principles listed in David White’s article include (naturally more specialized) insights like that.

Proof stories

“Proof stories” tell in sequence (more or less) how the author came up with a proof. This means describing the false starts, insights and how they came about. Piper Harron’s thesis does that all through her work.

Some authors do more than that: their proof stories intertwine the mathematical events of their progress with an account of life events, which sometimes make a mathematical difference and sometimes just produce a pause to let the proof stew in their brain. Luke Wolcott wrote a User’s Guide for one of his own papers, and his proof story for that paper involves personal experiences. (I recommend his User’s Guide as a model to learn from.)

Reports of personal experiences in doing math seem to add to my grasp of the math, but I am not sure I understand why.

References

The talks in Seattle

  • List of all the talks.
  • W. Timothy Gowers, How should mathe­matical knowledge be organized? Talk at the AMS Special Session on Mathe­matical Information in the Digital Age of Science, 6 January 2016.
  • Colloquium notes. Gowers gave a series of invited addresses for which these are the notes. They have many instances of describing what sorts of problems obstruct a desirable step in the proof and what can be done about it.

  • Luke Wolcott, The User’s Guide. Talk at the AMS Special Session on Mathe­matical Information in the Digital Age of Science, 6 January 2016.


My early life as a mathematician

Revised 22 January 2016.

In 1965, I received my Ph.D. at Duke University based on a dissertation about polynomials over finite fields. My advisor was Leonard Carlitz.

In Carlitz’s algebra course, the textbook was Van der Waerden’s Algebra. It is way too old-fashioned to be used nowadays, but it did indeed present post-Noether type abstract algebra. Carlitz also had me read large chunks of Heinrich Weber’s Lehrbuch der Algebra, written in German in 1895 (so totally not post-Noether) and published using Fraktur. A few years ago one of my sons asked me to retype in Roman type (but still in German) the words to some of the songs printed in Fraktur in a German-American shape-note book, which I did. This was for German teachers in the Concordia Language Villages to use with their students. I sometimes wonder if I am the last person on earth able to read Fraktur fluently.

I learned mathematical logic from Joe Shoenfield from his dittoed notes that later became an excellent textbook. I rediscovered Craig’s Trick while working on a problem he gave. That considerably strengthened my sense of self-worth.

I accepted a job at Western Reserve University, now Case Western Reserve University, where I stayed until I retired in 1999. In the few years after 1965, I wrote several papers about finite fields. They are all summarized in the book Finite Fields, by Rudolf Lidl and Harald Niederreiter.

I was almost immediately attracted to category theory and to computing science, both of which Carlitz hated. I did not let that stop me. (Now is the time to say, Follow The Beat of your Own Drum or some such cliché.)

Early on, Paul Dedecker was at CWRU briefly, and from him I learned about sheaves, cribles and the like. This inspired me to take part in an algebraic geometry summer school at Bowdoin College, where I learned from lectures by David Mumford and by reading his Red Book when it was still red.

Because one of the papers in finite fields showed that certain types of permutation polynomials formed wreath products of groups, I also pursued group theory, in particular by taking part in the finite group theory summer school at Bowdoin in 1970.

During that time I pored over Beck’s thesis on cohomology, which with the group theory I had learned resulted in my paper Automorphisms of group extensions. That paper has the most citations of all my research papers.

In the early days, I had several graduate students. All of them worked in group theory. One of them, Shair Ahmad, went on to produce several Ph.D. students, all in differential equations and dynamical systems.

One thing I can brag about is that I never ever told him I hated differential equations or dynamical systems. In fact, I didn’t hate either one. There were people in the department in both fields and they made me jealous the way they could model real life phenomena with those tools. One relevant point about that is that I was a liberal arts math major from Oberlin before going to Duke and had had very few courses in any kind of science. This made me very different from most people in the department, who had B.S. undergrad degrees.

In those days, John Isbell and Peter Hilton were in the math department at CWRU for a while, which boosted my knowledge and interest in category theory. Hilton arranged for me to spend a year at the E.T.H. in Zürich, where I met Michael Barr. I eventually wrote two books on category theory with him. But that is getting away from Early Days, so I will stop here.


Names of mathematical objects

This is a revision of the abstractmath.org article on names.

The name of a mathematical object is a word or phrase in math English used to identify an object. A name plays the same role that symbolic terms play in the symbolic language.

Sources of names


Suggestive English words

A suggestive name is a common English word or phrase, chosen to suggest its meaning. This means it is a type of metaphor.

Examples


In none of these examples is the metaphorical meaning exactly suitable to be the mathematical definition.

  • “Curve”, “point”, “line”, “slope”, “circle” and many other English words are used in elementary math with precise meanings that more or less fit their everyday meanings.
  • Connected subspace (of a topological space). When you draw a picture of a connected set it looks “connected”.
  • “Set” suggests a collection of things and provides a reasonable metaphor for its mathe­matical meaning. Both the abstractmath article on sets and the Wikipedia article on sets give you insight on why this metaphor cannot be entirely accurate.
Random English words

Most English words used in math are not suggestive. They are either chosen at random or were intended to suggest something but misfired in some way.

Groups

A group is a collection of math objects with a binary operation defined on it subject to certain constraints. The binary operation is much more important than the underlying set! To many non-mathematicians, a “group” sounds like essentially what a mathematician calls a “set”.

The concept of group was one of the earliest mathematical concepts described as a set-with-structure. I believe that a group was originally referred to as a “group of transformations”. Maybe that phrase got shortened to “group” without anyone realizing what a disastrous metaphor it caused.
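To see why the operation matters more than the underlying set, here is a minimal brute-force sketch in Python. The function is_group and the two sample operations are hypothetical names chosen for illustration; the check only works for small finite sets.

```python
from itertools import product


def is_group(elements, op):
    """Check the group axioms by brute force on a small finite set."""
    elements = list(elements)
    # Closure: the operation must land back in the set.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # Associativity: (a*b)*c must equal a*(b*c) for every triple.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # Identity: some e with e*a == a == a*e for every a.
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not identities:
        return False
    identity = identities[0]
    # Inverses: every a needs some b with a*b == identity == b*a.
    return all(any(op(a, b) == identity and op(b, a) == identity
                   for b in elements)
               for a in elements)


def add_mod_6(a, b):
    return (a + b) % 6


def mul_mod_6(a, b):
    return (a * b) % 6


same_set = range(6)
with_addition = is_group(same_set, add_mod_6)
with_multiplication = is_group(same_set, mul_mod_6)
```

The same six numbers form a group under addition modulo 6 but not under multiplication modulo 6 (0 has no inverse), so the set alone does not tell you whether you have a group.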

Fields

A field in the algebraic sense is a structure which is not in any way suggested by the word “field”. The German word for field in this sense is “Körper”, which means “body”. That is about as bad as “group”, and I suspect it was motivated in much the same way. The name “Körper” may be due to Dedekind. I don’t know who to blame for “field”.

A field in the sense of an assignment of a scalar or a vector to every point in a space is a completely separate notion from that of a field in the algebraic sense. The concept was invented in the nineteenth century by physicists, but any math student is likely to see fields in this sense in several different courses.

Perhaps the second meaning of field was suggested by contour plowing.

The word “field” is also discussed in the Glossary.

Person’s name

A concept may be named after a person.

Examples

  • L’Hôpital’s Rule
  • Hausdorff space
  • Turing machine
  • Riemann surface
  • Riemannian manifold
  • Pythagorean Theorem

I have no idea why “Riemann” gets an ending when it is a manifold but not when it is a surface.

Made-up name

Some names are made up in a random way, not based on any other language. Googol is an example.

Named after notation

Symbols

A mathematical object may be named by the typographical symbol(s) used to denote it. This is used both formally and in on-the-fly references.

Some objects have standard names that are single letters (Greek or Roman), such as $e$, $i$ and $\pi$. There is much more about this in Alphabets.

Be warned that any letter can be given another definition. $\pi$ is also used to name a projection, $i$ is commonly used as an index, and $e$ means energy in physics.

Expressions

  • The multiplication in a Lie Algebra is called the “Lie bracket”. It is written “$[v,w]$”.
  • In quantum mechanics, a vector $\vec{w}$ may be notated “$|w\rangle$” and called a “ket”. Another vector $\vec{v}$ induces a linear operator on vectors that is denoted by “$\langle v|$”, which is called a “bra”. The action of $\langle v|$ on $|w\rangle$ is the inner product $\langle v|w\rangle$, which suggested the “bra” and “ket” terminology (from “bracket”). You can blame Paul Dirac for this stuff.
  • In 1985, Michael Barr and I published a book in category theory called Toposes, Triples and Theories. Immediately after that everyone in category theory started saying “monad” for what had been called “triple”. (The notation for a triple, er, monad, is of the form “$(T,\eta,\mu)$”.)
Synecdoche

A synecdoche is a name of part of something that is used as a name for the whole thing.

Examples

The Tocharians appear to have called a cart by their word for wheel several thousand years ago. See the blog post by Don Ringe.

Names from other languages

In English, many technical names are borrowed from other languages. It may be difficult to determine what the meaning in the old language has to do with the mathematical meaning.

Examples

  • Matrix. This is the Latin word for “uterus”. I suppose the analogy is with “container”.
  • Parabola. “Parabola” is a word borrowed from Greek in late Latin, meaning something like “comparison”. The parabola $y=x^2$ “compares” a number with its square: it curves upward because the area of a square grows faster than the length of its side. “Parable” is from the same word.
  • Algebra. This comes from an Arabic word meaning the art of setting joints, or more generally “restore”. It came through Spanish where it once meant “surgical procedure” but that meaning is now obsolete.

Much of this information comes from The On-Line Etymological Dictionary. (Read its article about “sine”.) See also my articles on secant and tangent.

I enjoy finding out about etymologies, but I concede that knowing an etymology doesn’t help you very much in understanding the math.

Names made up from other languages’ roots

A name may be a new word made out of (usually) Greek or Latin roots.

Examples

  • Homomorphism. “Homo” in Greek is a root meaning “same” and “morphism” comes from a root referring to shape.
  • Quasiconformal. “Quasi” is a Latin word meaning something like “as if”. It is a prefix mathematicians use a bunch. It usually implies a weakening of the constraints that define the word it is attached to. A map is conformal if it preserves angles in a certain sense, and if it is quasiconformal then it need not preserve angles, but it does take circles into ellipses in a certain restricted sense (which conformal maps also do). So it replaces a constraint by a weaker constraint.


    Mathematical names cause problems for students

    The name may suggest the wrong meaning

    This is discussed in detail in the article cognitive dissonance.

    The name may not suggest any meaning

    English is unusual among major languages in the number of technical words borrowed from other languages instead of being made up from native roots.  We have some, listed under suggestive names.  But how can you tell from looking at them what “parabola” or “homomorphism” mean?   This applies to concepts named after people, too: The fact that “Hausdorff” is German for a village near an estate doesn’t tell me what a Hausdorff space is.

    The English word “carnivore” (from Latin roots) can be translated as “Fleischfresser” in German; to a German speaker, that word means literally “meat eater”.  So a question such as “What does a carnivore eat?” translates into something like, “What does a meat-eater eat?”

    Chinese is another language that forms words in that way: see the discussion of “diagonal” in Julia Lan Dai’s blog.  (I stole the carnivore example from her blog, too.)

    The result is that many technical words in English do not suggest their meaning at all to a reader not familiar with the subject.  Of course, in the case of “carnivore” if you know Latin, French or Spanish you are likely to guess the meaning, but it is nevertheless true that English has a kind of elitist stratum of technical words that provide little or no clue to their meaning and Chinese and German do not, at least not so much. This is a problem in all technical fields, not just in math.

    Pronunciation

    There are two main reasons math students have difficulties in pronouncing technical words in math.

    Most students have little knowledge of other languages

    Forty years ago nearly all Ph.D. students had to show mastery in reading math in two foreign languages; this included pronunciation, although that was not emphasized. Today the language requirements in the USA are much weaker, and younger educated Americans are generally weak in foreign languages. As a result, graduate students pronounce foreign names in a variety of ways, some of which attract ridicule from older mathematicians.

    Example: the graduate student at a blackboard who came to the last step of a long proof and announced, “Viola!”, much to the hilarity of his listeners.

    Pronunciation of words from other languages has become unpredictable

    In English-speaking countries until the early twentieth century, the practice was to pronounce a name from another language as if it were English, following the rules of English pronunciation.

    We still pronounce many common math words this way: “Euclid” is pronounced “you-clid” and “parabola” with the second syllable rhyming with “dab”.

    But other words (mostly derived from people’s names) are pronounced using the pronunciation of the language they came from, or what the speaker thinks is the foreign pronunciation. This particularly involves pronouncing “a” as “ah”, “e” like “ay”, and “i” like “ee”.

    Examples
    • Euler (oiler)
    • Fourier (foo-ree-ay)
    • Lagrange (second a pronounced “ah”)
    • Lie (lee)
    • Riemann (ree-monn)

    The older practice of pronunciation is explained by history: In 1100 AD, the rules of pronunciation of English, German and French, in particular, were remarkably similar. Over the centuries, the sound systems changed, and Englishmen, for example, changed their pronunciation of “Lagrange” so that the second syllable rhymes with “range”, whereas the French changed it so that the second vowel is nasalized (and the “n” is not otherwise pronounced) and rhymes with the “a” in “father”.

    German spelling

    The German letters “ä”, “ö” and “ü” may also be spelled “ae”, “oe” and “ue” respectively. It is far better to spell “Möbius” as “Moebius” than to spell it “Mobius”.

    The German letter “ß” may be spelled “ss” and often is by the Swiss. Thus Karl Weierstrass spelled his last name “Weierstraß”. Students sometimes confuse the letter “ß” with “f” or “r”. In English language documents it is probably better to use “ss” than “ß”.
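    The respelling rules above are mechanical enough to write down as a small Python sketch. The helper name `ascii_german` and the table `GERMAN_DIGRAPHS` are made up for this illustration; the capital-letter entries are my own extension of the rule stated above.

```python
# Digraph spellings for the German letters discussed above.
# The capital-letter rows are an assumption added for completeness.
GERMAN_DIGRAPHS = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
})


def ascii_german(name: str) -> str:
    """Respell a German name without umlauts or ß (hypothetical helper)."""
    return name.translate(GERMAN_DIGRAPHS)


print(ascii_german("Möbius"))      # Moebius
print(ascii_german("Weierstraß"))  # Weierstrass
```

    Note that the sketch never produces “Mobius”: it always writes the extra “e” rather than dropping the umlaut.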

    Transliterations from Cyrillic

     The name of the Russian mathematician most commonly spelled “Chebyshev” in English is also spelled Chebyshov, Chebishev, Chebysheff, Tschebischeff, Tschebyshev, Tschebyscheff and Tschebyschef. (Also Tschebyschew in papers written in German.) The only spelling in the list above that could be said to have some official sanction is “Chebyshev”, which is used by the Library of Congress.

    The correct spelling of his name is “Чебышев” since he was Russian and the Russian language uses the Cyrillic alphabet.

    In spite of the fact that most of the transliterations show the last vowel to be an “e”, the name in Russian is pronounced approximately “chebby-SHOFF”, accent on the last syllable.  Now, that is a ridiculous situation, and it is the transliterators who are ridiculous, not Russian spelling, which in spite of that peculiarity about the Cyrillic letter “e” is much more nearly phonetic than English spelling.

    Some other Russian names have variant spellings (Tychonov, Vinogradov) but Chebyshev probably wins the prize for the most.

Plurals

Many authors form the plural of certain technical words using endings from the language from which the words originated. Students may get these wrong, and may sometimes meet with ridicule for doing so.

Plurals ending in a vowel

Here are some of the common mathematical terms with vowel plurals.

singular plural
automaton automata
polyhedron polyhedra
focus foci
locus loci
radius radii
formula formulae
parabola parabolae
  • Linguists have noted that such plurals seem to be processed differently from s-plurals.  In particular, when used as adjectives, most nouns appear in the singular, but vowel-plural nouns appear in the plural: Compare “automata theory” with “group theory”.  No one says groups theory.  I used to say “automaton theory” but people looked at me funny.
  • The plurals that end in a (of Greek and Latin neuter nouns) are often not recognized as plurals and are therefore used as singulars.  That is how “data” became singular.  This does not seem to happen with my students with the -i plurals and the -ae plurals.
  • In the written literature, the -ae plural appears to be dying, but the -a and -i plurals are hanging on. The commonest -ae plural is “formulae”; other feminine Latin nouns such as “parabola” are usually used with the English plural. In the 1990-1995 issues of six American mathematics journals, I found 829 occurrences of “formulas” and 260 occurrences of “formulae”, in contrast with 17 occurrences of “parabolas” and no occurrences of “parabolae”. (There were only three occurrences of “parabolae” after 1918.)  In contrast, there were 107 occurrences of “polyhedra” and only 14 of “polyhedrons”.
Plurals in s with modified roots

singular plural
matrix matrices
simplex simplices
vertex vertices

    Students recognize these as plurals but produce new singulars for the words as back formations. For example, one hears “matricee” and “verticee” as the singular for “matrix” and “vertex”. I have also heard “vertec”.

    Remarks

    It is not unfair to say that some scholars insist on using foreign plurals as a form of one-upmanship. Students and young professors need to be aware of these plurals in their own self interest.

    It appears to me that ridicule and put-down for using standard English plurals instead of foreign plurals, and for mispronouncing foreign names, is much less common than it was thirty years ago. However, I am assured by students that it still happens.

Creative Commons License

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.


Recent revisions to abstractmath.org

For the last six months or so I have been systematically going through the abstractmath.org files, editing them for consistency, updating them, and in some cases making major revisions.

In the past I have usually posted revised articles here on Gyre&Gimble, but WordPress makes it difficult to simply paste the HTML into the WP editor, because the editor modifies the HTML and does things such as recognizing line breaks and extra spaces which an HTML interpreter is supposed to ignore.

Here are two lists of articles that I have revised, with links.

Major revisions

Other revised articles

Other recent changes


The intent of mathematical assertions

An assertion in mathematical writing can be a claim, a definition or a constraint.  It may be difficult to determine the intent of the author.  That is discussed briefly here.

Assertions in math texts can play many different roles.

English sentences can state facts, ask questions, give commands, and other things.  The intent of an English sentence is often obvious, but sometimes it can be unexpectedly different from what is apparent in the sentence.  For example, the statement “Could you turn the TV down?” is apparently a question expecting a yes or no answer, but in fact it may be a request. (See the Wikipedia article on speech acts.) Such things are normally understood by people who know each other, but people for whom English is a foreign language or who have a different culture have difficulties with them.

There are some problems of this sort in math English and the symbolic language, too.  An assertion can have the intent of being a claim, a definition, or a constraint.

Most of the time the intent of an assertion in math is obvious. But there are conventions and special formats that newcomers to abstract math may not recognize, so they misunderstand the point of the assertion. This section takes a brief look at some of the problems.

Terminology

The way I am using the words “assertion”, “claim”, and “constraint” is not standard usage in math, logic or linguistics.


Claims

In most circumstances, you would expect that if a lecturer or author makes a math assertion, they are claiming that it is a true statement, and you would be right.

Examples
  (a) “The $240$th digit of $\pi$ after the decimal point is $4$.”
  (b) “If a function is differentiable, it must be continuous.”
  (c) “$7\gt3$”

Remarks

  • You don’t have to know whether these statements are true or not to recognize them as claims. An incorrect claim is still a claim.
  • The assertion in (a) is a statement, in this case a false one.  If it claimed the googolth digit was $4$ you would never be able to tell whether it is true or not, but it still would be an assertion intended as a claim.
  • The assertion in (b) uses the standard math convention that an indefinite noun phrase (such as “a widget”) in the subject of a sentence is universally quantified (see also the article about “a” in the Glossary). For example, “An integer divisible by $4$ must be even” claims that any integer divisible by $4$ is even. This statement is a claim, and it is true.
  • (c) is a (true) claim in the symbolic language. (Note that “$3 + 4$” is not an assertion at all, much less a claim.)


Definitions

Definitions are discussed primarily in the chapter on definitions. A definition is not the same thing as a claim.

Example

The definition

“An integer is even if it is divisible by $2$”

makes the claim

“An integer is even if and only if it is divisible by $2$”

true.

(If you are surprised that the definition uses “if” but the claim uses “if and only if”, see the Glossary article on “if”.)

Unmarked definitions

Math texts sometimes define something without saying that it is a definition. Because of that, students may sometimes think a claim is a definition.

Example

Suppose that the concept of “even integer” was new to you and the book said, “A number is even if it is divisible by $4$.” Perhaps you thought that this was a definition. Later the book refers to $6$ as even and you pull your hair out wondering why. The statement is a correct claim but an incorrect definition. A good writer would write something like “Recall that a number is even if it is divisible by $2$, so that in particular it is even if it is divisible by $4$.”

On the other hand, you may think a definition is only a claim.

Example

A lecturer may say “By definition, an integer is even if it is divisible by $2$”, and you write down: “An integer is even if it is divisible by $2$”. Later, you get all panicky wondering How did she know that?? (This has happened to me.)

The confusion in the preceding example can also occur if a book says, “An integer is even if it is divisible by $2$” and you don’t know about the convention that when an author puts a word or phrase in boldface or italics it may mean that they are defining it.

A good writer always labels definitions.


Constraints

Here are two assertions that contain variables.

  • “$n$ is even.”
  • “$x\gt1$”.

Such an assertion is a constraint (or a condition) if the intent is that the assertion will hold in that part of the text (the scope of the constraint). The part of the text in which it holds is usually the immediate vicinity unless the author explicitly says it will hold in a larger part of the text such as “this chapter” or “in the rest of the book”.

Examples
  • Sometimes the wording makes it clear that the phrase is a constraint. So a statement such as “Suppose $3x^2-2x-5\geq0$” is a constraint on the possible values of $x$.
  • The statement “Suppose $n$ is even” is an explicit requirement that $n$ be even and an implicit requirement that $n$ be an integer.
  • A condition for which you are told to find the solution(s) is a constraint. For example: “Solve the equation $3x^2-2x-5=0$”. This equation is a constraint on the variable $x$. “Solving” the equation means saying explicitly which numbers make the equation true.
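To make the last example concrete, here is the constraint solved explicitly. The factorization is my own working, added as an illustration; it also settles the inequality from the first example.

```latex
\begin{align*}
3x^2-2x-5 &= (3x-5)(x+1),\\
3x^2-2x-5=0 &\iff x=\tfrac{5}{3}\ \text{ or }\ x=-1,\\
3x^2-2x-5\geq 0 &\iff x\leq -1\ \text{ or }\ x\geq\tfrac{5}{3}.
\end{align*}
```

So the equation constrains $x$ to one of two numbers, while the inequality constrains $x$ to lie outside the open interval between them.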

Postconditions

The constraint may appear in parentheses after an assertion, as a postcondition on that assertion.

Example

“$x^2\gt x\qquad(\text{all }x\gt1)$”

which means that if the constraint “$x\gt1$” holds, then “$x^2\gt x$” is true. In other words, for all $x\gt1$, the statement $x^2\gt x$ is true. In this statement, “$x^2\gt x$” is not a constraint, but a claim which is true when the constraint is true.
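A short derivation (mine, not part of the original example) shows why the claim holds under the constraint and why the constraint cannot simply be dropped:

```latex
\begin{align*}
x^2-x &= x(x-1) > 0 \quad\text{when } x\gt 1,
  \text{ since both factors are positive;}\\
x=\tfrac12 &:\quad x^2=\tfrac14 \lt \tfrac12 = x,
  \text{ so the claim fails without the constraint.}
\end{align*}
```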


Context

This is a revised draft of the abstractmath.org article on context in math texts. Note: WordPress changed double primes into quotes. Tsk.

Context

Written and especially spoken language depends heavily on the context – the physical surroundings, the preceding conversation, and social and cultural assumptions.  Mathematical statements are produced in such contexts, too, but here I will discuss a special thing that happens in math conversation and writing that does not seem to happen much in other sorts of discourse:

The meanings of expressions
in both the symbolic language and math English
change from phrase to phrase
as the speaker or writer changes the constraints on them.

Example

In a math text, before the occurrence of a phrase such as “Let $n=3$”, $n$ may be known only as an integer variable.  After the phrase, it means specifically $3$.  So this phrase changes the meaning of $n$ by constraining $n$ to be $3$.  We say the context of occurrences of “$n$” before the phrase requires only that $n$ be an integer, but after the occurrence the context requires $n=3$.

Definition

In this article, the context at a particular location in mathematical discourse is the sum total of what the reader or listener can know about the symbols and names used in the discourse when they have read everything up to that location.

Remarks

  • Each clause can change the meaning of or constraints on one or more symbols or names. The conventions in effect during the discourse can also put constraints on the symbols and names.
  • Chierchia and McConnell-Ginet give a mathematical definition of context in the sense described here.
  • The references to “before” and “after” the phrase “Let $n=3$” refer to the physical location in text and to actual time in spoken math. There is more about this phenomenon in the Handbook of Mathematical Discourse, page 252, items (f) and (g).
  • Contextual changes of this sort take place using the pretense that you are reading the text in order, which many students and professionals do not do (they are “grasshoppers”).
  • I am not aware of much context-changing in everyday speech. One place it does occur is in playing games. For example, during some card games the word “trumps” changes meaning from time to time.
  • In symbolic logic, the context at a given place may be denoted by “$\Gamma$”.
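One rough way to picture the definition above is as a record that is updated as each phrase is read. The Python sketch below is only an illustration: the class `Context` and its methods are hypothetical names, and constraints are stored as plain strings rather than parsed formulas.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """What a reader knows about each symbol at one location in a text."""
    constraints: dict[str, list[str]] = field(default_factory=dict)

    def constrain(self, symbol: str, constraint: str) -> None:
        """Record a new constraint, as a phrase such as 'Let ...' does."""
        self.constraints.setdefault(symbol, []).append(constraint)

    def forget(self, symbol: str) -> None:
        """Drop a symbol when the scope that introduced it ends."""
        self.constraints.pop(symbol, None)

    def known(self, symbol: str) -> list[str]:
        """Return a copy of everything currently known about a symbol."""
        return list(self.constraints.get(symbol, []))


ctx = Context()
ctx.constrain("n", "n is an integer")  # the context before "Let n = 3"
before = ctx.known("n")
ctx.constrain("n", "n = 3")            # reading the phrase "Let n = 3"
after = ctx.known("n")

print(before)  # ['n is an integer']
print(after)   # ['n is an integer', 'n = 3']
```

Calling `forget` models the end of a scope, after which the symbol is free to be reintroduced with new constraints.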

Detailed example of a math text

Here is a typical example of a theorem and its proof.  It is printed twice, the second time with comments about the changes of context.  This is the same proof that is already analyzed practically to death in the chapter on presentation of proofs.

First time through

Definition: Divides

Let $m$ and $n$ be integers with $m\ne 0$. The statement “$m$ divides $n$” means that there is an integer $q$ for which $n=qm$.

Theorem

Let $m$, $n$ and $p$ be integers, with $m$ and $n$ nonzero, and suppose $m$ divides $n$ and $n$ divides $p$.  Then $m$ divides $p$.

Proof

By definition of divides, there are integers $q$ and $q'$ for which $n=qm$ and $p=q'n$. We must prove that there is an integer $q''$ for which $p=q''m$. But $p=q'n=q'qm$, so let $q''=q'q$.  Then $p=q''m$.

Second time, with analysis

Definition: Divides

Begins a definition. The word “divides” is the word being defined. The scope of the definition is the following paragraph.

Let $m$ and $n$ be integers

$m$ and $n$ are new symbols in this discourse, constrained to be integers.

with $m\ne 0$

Another constraint on $m$.

The statement “$m$ divides $n$” means that

This phrase means that what follows is the definition of “$m$ divides $n$”.

there is an integer $q$

“There is” signals that we are beginning an existence statement and that $q$ is the bound variable within the existence statement.

for which $n=qm$

Now we know that “$m$ divides $n$” and “there is an integer $q$ for which $n=qm$” are equivalent statements.  Notes: (1) The first statement would only have implied the second statement if this had not been in the context of a definition. (2) After the conclusion of the definition, $m$, $n$ and $q$ are undefined variables.

Theorem

This announces that the next paragraph is a statement that has been proved. In fact, in real time the statement was proved long before this discourse was written, but in terms of reading the text in order, it has not yet been proved.

Let $m$, $n$ and $p$ be integers,

“Let” tells us that the following statement is the hypothesis of an implication, so we can assume that $m$, $n$ and $p$ are all integers.  This changes the status of $m$ and $n$, which were variables used in the preceding paragraph, but whose constraints disappeared at the end of the paragraph.  We are starting over with $m$ and $n$.

with $m$ and $n$ nonzero,

This clause is also part of the hypothesis. We can assume $m$ and $n$ are constrained to be nonzero.

and suppose $m$ divides $n$ and $n$ divides $p$.

This is the last clause in the hypothesis. We can assume that $m$ divides $n$ and $n$ divides $p$.

Then $m$ divides $p$.

This is a claim that $m$ divides $p$. It has a different status from the assumptions that $m$ divides $n$ and $n$ divides $p$. If we are going to follow the proof we have to treat $m$ and $n$ as if they divide $n$ and $p$ respectively. However, we can’t treat $m$ as if it divides $p$. All we know is that the author is claiming that $m$ divides $p$, given the facts in the hypothesis.

Proof

An announcement that a proof is about to begin, meaning a chain of math reasoning. The fact that it is a proof of the Theorem just stated is not explicitly stated.

By definition of divides, there are integers $q$ and $q'$ for which $n=qm$ and $p=q'n$.

The proof uses the direct method (rather than contradiction or induction or some other method) and begins by rewriting the hypothesis using the definition of “divides”. The proof does not announce the use of these techniques; it just starts in doing it. So $q$ and $q'$ are new symbols that satisfy the equations $n=qm$ and $p=q'n$. The phrase “by definition of divides” justifies the introduction of $q$ and $q'$. $m$, $n$ and $p$ have already been introduced in the statement of the Theorem.

We must prove that there is an integer $q''$ for which $p=q''m$.

Introduces a new variable $q''$ which has not been given a value. We must define it so that $p=q''m$; this requirement is justified (without saying so) by the definition of “divides”.

But $p=q'n=q'qm$,

This is a claim about $p$, $q$, $q'$, $m$ and $n$.  It is justified by certain preceding sentences but this justification is not made explicit. Note that “$p=q'n=q'qm$” pivots on $q'n$, in other words makes two claims about it.

so let $q''=q'q$.

We have already introduced $q''$; now we give it the value $q''=q'q$.

Then $p=q''m$

This is an assertion about $p$, $q''$ and $m$, justified (but not explicitly; note the hidden use of associativity) by the previous claim that $p=q'n=q'qm$.

 

The proof is now complete, although no statement asserts that it is.
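For comparison, here is a sketch of the same theorem in the Lean 4 proof assistant, where the context is explicit: every symbol and hypothesis is named before it is used. This is my own illustration and not part of the analysis above. The theorem name `divides_trans` is made up, and the nonzero hypotheses on $m$ and $n$ are left out because the argument never uses them.

```lean
-- "m divides n" is unfolded to "there is an integer q with n = q * m".
theorem divides_trans (m n p : Int)
    (hmn : ∃ q : Int, n = q * m)      -- hypothesis: m divides n
    (hnp : ∃ q' : Int, p = q' * n) :  -- hypothesis: n divides p
    ∃ q'' : Int, p = q'' * m :=       -- claim: m divides p
  match hmn, hnp with
  | ⟨q, hq⟩, ⟨q', hq'⟩ =>
    -- "so let q'' = q' * q"; the rewrite with Int.mul_assoc is the
    -- use of associativity that the informal proof leaves hidden.
    ⟨q' * q, by rw [hq', hq, Int.mul_assoc]⟩
```

Each step that the commentary marks as “not made explicit” has to be spelled out here, which is exactly the gap between reading a proof and checking one.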

Remark

If you have some skill in reading proofs, all the stuff in the commentary on each phrase above happens in your brain without, for the most part, your being conscious of it.

Acknowledgment

Thanks to Chris Smith for correcting errors.

References for “context”

Chierchia, G. and S. McConnell-Ginet (1990), Meaning and Grammar. The MIT Press.

de Bruijn, N. G. (1994), “The mathematical vernacular, a language for mathematics with typed sets”. In Selected Papers on Automath, Nederpelt, R. P., J. H. Geuvers, and R. C. de Vrijer, editors, volume 133 of Studies in Logic and the Foundations of Mathematics, pages 865–935. Elsevier.

Steenrod, N. E., P. R. Halmos, M. M. Schiffer, and J. A. Dieudonné (1975), How to Write Mathematics. American Mathematical Society.


The Mathematics Depository: A Proposal

Introduction

This post is about taking texts written in mathematical English and the symbolic language and encoding them in a formal language that could be tested by an automated proof verifier. This is a very difficult undertaking, but we could get closer and closer to a working system by a worldwide effort continuing over, probably, decades. The system would have to contain many components working together to create incremental improvements in the process.

This post, which is a first draft, outlines some suggestions as to how this could work. I do not discuss the encoding required, which is not my area of expertise. Yes, I understand that coding is the hard part!

Much work has been done by computing scientists in developing proof checking and proof-finding programs. Work has also been done, primarily by math education workers but also by some philosophers and computing scientists, in uncovering the many areas where ordinary math language is ambiguous and deviates from ordinary English usage. These characteristics confuse students and also make it hard to design a program that can interpret the language. I have been working in that area mostly from the math ed point of view for the last twenty years.

The Reference section lists many references to the problem of parsing mathematical English, some from the point of view of automatic translation of math language into code, but most from the point of view of helping students learn to understand it.

The Mathematics Depository

I imagine a system for converting documents written in math language into machine-readable language and testing their claims. An organization, call it the Mathematics Depository, would be developed that is supported by many countries, organizations and individual supporters. It should consist of several components listed below, no doubt with other components as we become aware of needing them. The organization would be tasked with supporting and improving these components over time.

The main parts of the system

Each component is linked to a more detailed description that is given later in this post.

  • A Proof Verifier (PV), that inputs a proof and determines if it is correct.
  • A specification of a supported subset of Mathematical English and the symbolic language, that I will call Strict Math English (SME).
  • A Text-SME Converter, a program that would input a text written in ordinary math English that has been annotated by a knowledgeable person and convert it into SME.
  • An SME-PV Converter that will convert text written in SME into code that can be directly read by the Proof Verifier.
  • One or more Automatic Theorem Provers, that to begin with can take fairly simple conjectures written in SME and sometimes succeed in proving them.
  • An Annotation System containing an Annotation Editor that would allow a person to use SME to annotate an article written in ordinary math English so that it could be read by the Text-SME Converter.
  • A Data Base that would include the texts that have been collected in this endeavor, along with the annotations and the results of the proof checking.
  • A Data Base Miner that would watch for patterns in the annotations as new papers were submitted. The operators might also program it to watch for patterns in other aspects of the operation.

These facilities would be organized so that the systems work together, with the result that the individual components I named improve over time, both automatically and via human intervention.

Flow of Work

  1. A math text is submitted.
  2. If it is already in Strict Math English (SME), it is input to the Proof Verifier (PV).
  3. Otherwise, the math text is input into the Annotation System.
  4. The resulting annotated text is input into the Text-SME Converter.
  5. The output of the Text-SME Converter is input into the Proof Verifier.
  6. The PV incorporates each definition in the text into the context of the math text. This is a specific meaning of the word “context”, including a list of the status of variables (bound, unbound, type, and so on), meanings of technical words, and other facts created in the text. “Context” is described informally in my article Context in abstractmath.org. That article gives references to the formal literature.
  7. In my experience mathematicians spend only a little time reading arguments step by step as described in the Context article. They usually look at a theorem and try to figure it out themselves, “cheating” occasionally by glancing at parts of the proof.

  8. Each mathematical assertion in the text is marked as a claim.
  9. The checking process records those claims occurring in the proof that are not proved in the text, along with any references given to other texts.
  10. If a reference to a result in another text is made, the PV looks for the result in the Database. If it does not find it, the PV incorporates the result and its location in the Database as an externally proven but untested claim.
  11. If no reference or proof for a claim is given, the PV checks the Database to see if it has already been proved.
  12. Any claim in the current text not shown as proven in the Database is submitted to the Automatic Theorem Prover (ATP). The output of the ATP is put in the database (proved, counterexample found, or unable to determine truth).
  13. If a segment of text is presented as a proof, it is input into the PV to be verified.
  14. The PV reports the result for each claimed proof, which can consist of several possibilities:
    • A counterexample for a proof is found, so the claim that the proof was supposed to establish is false.
    • The proof contains gaps, so the claim is unsettled.
    • The proof is reported as correct.
  15. At the end of the process, all the information gathered is put into the Database:
    • The original text showing all the annotations.
    • The text in SME.
    • All claims, with their status (proven true, proven false, truth unknown, reference if one was given).
    • Every proof, with its status and the entire context at each step of the proof.
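To make steps 10 through 12 a little more concrete, here is a rough Python sketch of how a single claim might be settled. Everything in it is hypothetical: the names `Status`, `Database`, `settle_claim` and `toy_atp` are invented for illustration, claims are plain strings, and the “theorem prover” is a stub that knows one fact.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional


class Status(Enum):
    PROVEN_TRUE = "proven true"
    PROVEN_FALSE = "proven false"
    UNKNOWN = "truth unknown"
    EXTERNAL_UNTESTED = "externally proven but untested"


@dataclass
class Database:
    """Toy stand-in for the Depository's Database: claim text -> status."""
    claims: dict[str, Status] = field(default_factory=dict)


def settle_claim(claim: str, reference: Optional[str], db: Database,
                 atp: Callable[[str], Status]) -> Status:
    """Consult the Database, then fall back to the theorem prover."""
    known = db.claims.get(claim)
    if known in (Status.PROVEN_TRUE, Status.PROVEN_FALSE):
        return known                                 # already settled
    if reference is not None:
        db.claims[claim] = Status.EXTERNAL_UNTESTED  # step 10
    result = atp(claim)                              # step 12
    if result is not Status.UNKNOWN or reference is None:
        db.claims[claim] = result                    # record the ATP's verdict
    return db.claims[claim]


def toy_atp(claim: str) -> Status:
    """Hypothetical prover that can settle exactly one claim."""
    return Status.PROVEN_TRUE if claim == "7 > 3" else Status.UNKNOWN


db = Database()
print(settle_claim("7 > 3", None, db, toy_atp).value)
# proven true
print(settle_claim("Lemma 2 of [X]", "[X]", db, toy_atp).value)
# externally proven but untested
```

The one design choice worth noting is that a cited result keeps its “externally proven but untested” status when the prover cannot decide it, so the Database never forgets that a reference was given.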

Details

The proof verifier

  • Proof checking programs have been developed over the last thirty or so years. The MD should write or adapt one or more Proof Verifiers and improve them incrementally as a result of experience in running the system. In this post I have assumed the use of just one Proof Verifier.
  • The Proof Verifier should be designed to read the output of the SME-PV converter.
  • The PV must read a whole math text in SME, identify and record each claim and check each proof (among other things). This is different from current proof verifiers, which take exactly one proof as input.
  • The PV must create the context of each proof and change it step by step as it reads each syntactic fragment of the math text.
  • Typically the context for a claimed proof is built up in the whole math text, not just in the part called “Proof”.
  • The PV should automatically query the Data Base for unproved steps in a proof in the input text to see if they have already been verified somewhere else. These results should be quoted in a proof verifier output.
  • The PV should also automatically submit steps in the proof that haven’t been verified to the Automatic Theorem Provers and wait for the step to be verified or not.
  • The Proof Verifier should output details of the result of the checking whether it succeeded in verifying the whole input text or not. In particular, it should list steps in proofs it failed to verify, including steps in proofs for which the input text cited the proof in some other paper, in the MD system or not.
  • The Proof Verifier should be available online for anyone to submit, in SME, a mathematical text claiming to prove a theorem. Submission might require a small charge.

Strict Math English

  • One of the most important aspects of the system would be the simultaneous incremental updating of the SME and the SME-PV Converter.
  • The idea is that SME would get more and more inclusive of the phrases and clauses it allows.

Example: Universal Assertions

At the start SME might allow these statements to be recognized as the same universal assertion:

  • “$\forall x(x^2+1\gt0)$”
  • “For all [every, any] $x$, $x^2+1\gt0$.” (universality asserted using an English word.)
  • “For all [every, any] $x$, $x^2+1$ is positive.”

As time goes on, a person or the Data Base Miner might detect that many annotators also recognized these statements as saying the same thing:

  • “$x^2+1\gt0\quad(\text{all } x)$” (as a displayed statement)
  • “$x^2+1$ is positive for every $x$.” Universality asserted using an adjective in a postposited phrase.
  • “$x^2+1$ is always positive.” Universality hidden in a postposited adverb that seems to be referring to time!
  • There are more examples in my article Universally True Assertions. See also Susanna Epp’s article on quantification for other problems in this area.

These other variations would then be added to Strict Math English. (This is only an example of how the system would evolve. I have no doubt that in fact all the terminology mentioned above would be included at the outset, since they are all documented in the math ed literature.)
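As a toy illustration of what “recognized as the same universal assertion” could mean in practice, the Python sketch below maps a few of the surface forms above to a single (variable, body) pair. The pattern list, the function name `normalize_universal` and the plain-text notation are all assumptions of mine; a real SME specification would need a proper grammar rather than regular expressions.

```python
import re

# Each pattern is one surface form that might be accepted as the
# universal assertion "for all var, body". The list is illustrative only.
UNIVERSAL_PATTERNS = [
    re.compile(r"^For (?:all|every|any) (?P<var>\w+), (?P<body>.+?)\.?$"),
    re.compile(r"^(?P<body>.+?) for (?:all|every|any) (?P<var>\w+)\.?$"),
]


def normalize_universal(sentence: str):
    """Return (variable, body) for a recognized form, else None."""
    for pattern in UNIVERSAL_PATTERNS:
        match = pattern.match(sentence.strip())
        if match:
            # Rewrite the stereotyped phrase "is positive" as "> 0".
            body = re.sub(r"\s+is positive$", " > 0", match.group("body"))
            return match.group("var"), body
    return None


print(normalize_universal("For all x, x^2+1 > 0."))
# ('x', 'x^2+1 > 0')
print(normalize_universal("x^2+1 is positive for every x."))
# ('x', 'x^2+1 > 0')
print(normalize_universal("x^2+1 is always positive."))
# None
```

The last sentence is not recognized, which is the situation the Data Base Miner is meant to detect: once enough annotators translate “always” the same way, a new pattern would be added to the list.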

Even at the start, SME will include phrases and clauses in the English language as well as symbolic expressions. It is notorious that automatically parsing general English sentences is difficult and that the ubiquity of metaphors makes it essentially impossible to reliably construct the meaning of a sentence. That is why SME must start with a very narrow subset of math English. But even in early days, it should include some stereotyped metaphors, such as using “always” in universal assertions.

The SME-PV Converter

  • The SME-PV Converter would read documents written in SME and convert them into code readable by the proof checking program, as well as by the automatic theorem provers.
  • Such a program is essentially the subject of Ganesalingam’s book.
  • Converting SME so that the Proof Verifier can handle it involves lots of subtleties. For example, if the text says, “For any $x$, $x^2+1\gt0$”, the translation has to recognize not only that this is a universally quantified statement with $x$ as the bound variable, but that $x$ must be a real number, since complex numbers don’t do greater-than.
  • Frequent revisions of the SME-PV Converter will be necessary since its input language, the SME, will be constantly expanded.
  • It may be that the output language of the SME-PV Converter (which the Proof Verifier and Automatic Theorem Provers read) will require only infrequent revisions.
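
To make the point about the implicit real-number type concrete, here is how the sentence “For any $x$, $x^2+1\gt0$” might look after conversion. This post does not name a target language; the sketch assumes Lean 4 with the Mathlib library purely for illustration. The type annotation `ℝ` is exactly the piece of information the English sentence leaves out and the converter has to supply.

```lean
import Mathlib

-- "For any x, x^2 + 1 > 0": the converter must supply the type ℝ,
-- which the English sentence leaves implicit.
example : ∀ x : ℝ, x ^ 2 + 1 > 0 := by
  intro x
  positivity
```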

The Automatic Theorem Provers

  • The system could support several ATP’s, each one adapted to read the output of the SME-PV Converter.
  • The Automatic Theorem Provers should provide output in such a way that the Proof Verifier can include in its report the positive or negative results of the Theorem Prover in detail.

The Annotation System

  • The Annotation system would facilitate construction of a data structure that connects each annotation to the specific piece of text it rewrites. The linking should be facilitated by the Annotation Editor.
  • For example, an annotation that is meant to explain that the statement (in the input text) “$x^2+1$ is always greater than $0$” is to be translated as “$\forall x(x^2+1\gt0)$” (which is presumably allowed by SME) should cause the first statement to be linked to the second statement. The first statement, the one in the input text, should not be changed. This will enable the Data Base Miner to find patterns of similar text being annotated in similar ways.
  • The annotations should clarify words, symbolic expressions and sentences in the input text to allow the Proof Verifier to input them correctly.
  • In particular, every claim that a statement is true should be marked as a proposed theorem, and similarly every proof should be marked as a proof and every definition should be marked as a definition. Such labeling is often omitted in the math literature. Annotators would have to recognize segments of the text as claims, proofs and definitions and annotate them as such.
  • The annotations would be written in the current version of Strict Math English. Since SME is frequently updated, the instructions for the annotator would also have to be frequently updated.
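
One way to picture the data structure that links an annotation to its text is sketched below in Python. Everything here is hypothetical: the class names `Annotation` and `AnnotatedDocument`, the character-offset representation, and the `kind` labels are assumptions chosen for the example. The one property taken from the description above is that the input text itself is never modified.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    """Links a span of the unmodified input text to its SME rendering."""
    start: int   # offset of the first character of the annotated span
    end: int     # offset one past the last character of the span
    kind: str    # hypothetical labels: "claim", "proof", "definition", "rewrite"
    sme: str     # the annotator's rendering in Strict Math English


@dataclass
class AnnotatedDocument:
    text: str                                      # original text, never edited
    annotations: list = field(default_factory=list)

    def annotate(self, phrase: str, kind: str, sme: str) -> Annotation:
        """Attach an annotation to the first occurrence of `phrase`."""
        start = self.text.index(phrase)            # raises ValueError if absent
        note = Annotation(start, start + len(phrase), kind, sme)
        self.annotations.append(note)
        return note

    def source_of(self, note: Annotation) -> str:
        """Recover the exact input text an annotation refers to."""
        return self.text[note.start:note.end]
```

Because each annotation stores offsets rather than a rewritten copy, a miner can later compare the original wording with its SME rendering across many papers.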

Examples

  • If a paper used the word “domain” without defining it, the annotator would clarify whether it meant an open connected set, a type of ring, a type of poset, or the domain of a function. See Example 1.
  • Annotators will note instances in which the same text will use a symbol with two different meanings. See Example 2.
  • In a phrase, a single occurrence of a symbol can require an annotation that assigns more than one attribute to the symbol. See Example 3.

The Annotation Editor

  • The annotators should be provided with an Annotation Editor designed specifically for annotation.
  • The editor should link each annotation to the exact phrase it annotates, in a way that is easy for a person reading the annotated document to follow and that also gives the Text-to-SME Converter the information it needs.

The Annotators

  • Great demands will be made of an annotator.
  • They must understand the detailed meaning of the text they annotate. This means they must be quite familiar with the field of math the text is concerned with.
  • They must learn SME. I know for a fact that many mathematicians are not good at learning foreign languages. It will help that SME will be a subset of the full language of math.
  • All this means that annotators must be chosen carefully and paid well. As a result, not very many papers will get annotated by paid annotators, so some committee will have to choose the papers to be annotated. This will be a genuine bottleneck.
  • One thing that will help in the long run is that SME should evolve to include more features of the general language of math, so that many mathematicians will actually write their papers in SME and submit them directly to the Depository. (“Long run” may mean more than ten years.)

The Text-to-SME Converter

  • This converter takes a math text in ordinary Math English that has been annotated and converts it into SME.
  • The format for feeding it to the Automatic Theorem Prover may very well have to be different from the format to be read by a human. Both formats should be saved.

The Data Base

  • The Data Base would contain all math papers that have been run through the Proof Verifier, along with the results found by the Proof Verifier. A paper should be included whether or not every claim in the paper was verified.
  • Funding agencies (and private individuals) might choose particularly important papers and pay more money for annotation for those than for other papers.
  • Mathematicians in a particular field could be hired to annotate particular articles in their field, using a standard annotation language that would develop through time.
  • The annotated papers would be made freely available to the public.
  • It will no doubt prove useful for the Data Base to contain many other items. Possibilities:
    • A searchable list of all theorems that have been verified.
    • A glossary: a list of math words that have been defined in the papers in the Depository. This will include synonyms and words with multiple meanings.

The Data Base Miner

Watch for patterns

The DBM would watch for patterns in annotation as new annotated papers were submitted. It should probably look only at annotated papers whose proofs had been verified. The patterns might include:

  • Correlations between the branch of math a paper belongs to and the annotations that assign particular meanings to particular words or symbols. See Example 1.
  • Noting that a particular format of combining symbols usually results in the same kind of annotation. See Example 4.
  • Providing data in such a way that lexicographers studying math English could make use of them. My Handbook began with my doing lexicographical research on math English, but I found it so slow that when I started abstractmath.org I resolved not to do such research any more. Nevertheless, it needs to be done and the Data Base should make the process much easier.

Statistical translation

Since the annotated papers will be stored in the Data Base, the Data Base Miner could use the annotations in somewhat the same way some language translators work (in part): to translate a phrase, it will find occurrences of the phrase in the source language that have been translated into the target language and use the most common translation. In this case the source language is the paper (in English) and the target language is annotated math English readable by the Proof Verifier. Once the Data Base includes most of the papers ever published (twenty years from now?), statistical translation might actually become useful.
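
The "use the most common translation" idea can be sketched in a few lines of Python. The class `TranslationMemory` and its method names are invented for this illustration, and real statistical translation systems do far more (context, alignment, smoothing); this only shows the frequency lookup described above.

```python
from collections import Counter, defaultdict


class TranslationMemory:
    """Remembers how each source phrase has been annotated so far."""

    def __init__(self):
        # Maps each phrase to a Counter of the translations recorded for it.
        self._seen = defaultdict(Counter)

    def record(self, phrase: str, translation: str) -> None:
        """Note one more occurrence of `phrase` annotated as `translation`."""
        self._seen[phrase][translation] += 1

    def translate(self, phrase: str):
        """Return the most common recorded translation, or None if unseen."""
        counts = self._seen.get(phrase)
        if not counts:
            return None
        return counts.most_common(1)[0][0]
```

A phrase that has never been annotated returns `None`, which is the case where a human annotator is still needed.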

Examples

Example 1: Meaning varies with branch of math

  • “Field” means one thing in an algebra paper and another in a mathematical physics paper.
  • “Domain” means
    • An open connected set in topology.
    • A type of ring in algebra.
    • A type of poset in theoretical computing science.
    • The domain of a function (everywhere in math), which makes it seem that this is going to be very hard to distinguish without human help!
  • “Log” usually implies base $2$ in the computing world, base $10$ in engineering (but I am not sure how prevalent this meaning is there), and base $e$ in pure math. With exceptions!

Example 2: Meaning varies even in the same article

    • The notation “$(a,b)$” can mean an ordered pair, an open interval, or the GCD. What’s worse, there are many instances where the symbol is used without definition. Citation 139 in the Handbook provides a single sentence in which the last two meanings both occur:

      $\dots$ Richard Darst and Gerald Taylor investigated the differentiability of functions $f^p$ (which for our purposes we will restrict to $(0,1)$) defined for each $p\geq1$ by\[F(x):=
      \begin{cases}
      0 &
      \text{if }x\text{ is irrational}\\
      \displaystyle{\frac{1}{n^p}} &
      \text{if }x = \displaystyle{\frac{m}{n}}\text{ with }(m,n)=1\\ \end{cases}\]

      The sad thing is that any mathematician will know immediately what each occurrence means. This may be a case where the correct annotation will never be automatically detectable.

    Example 3: One mention of a symbol may require several meanings

    In the sentence, “This infinite series converges to $\zeta(2)=\frac{\pi^2}{6}\approx 1.65$,” the annotator would provide two pieces of information about “$\frac{\pi^2}{6}$”, namely that it is both the right constituent of the equation “$\zeta(2)=\frac{\pi^2}{6}$” and the left constituent of the approximation statement “$\frac{\pi^2}{6}\approx 1.65$” — and that these two statements were the constituents of an asserted conjunction. (See my post Pivoted symbols.)

    Example 4: Function to a power

    Some expressions not in the SME will almost always be annotated in the same way. This makes them discoverable by the Data Base Miner.

    • “$\sin^{-1}x$” always means $\arcsin x$.
    • For positive $n$, “$\sin^n x$” always means $(\sin x)^n$. It never means the $n$-fold application of $\sin$ to $x$.
    • In contrast, for an arbitrary function symbol, $f^n(x)$ will often be annotated as $n$-fold application of $f$ and also often as $f(x)^n$. (And maybe those last two possibilities are correlated by branch of math.)
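
A minimal sketch of how the Data Base Miner might separate the "always annotated the same way" cases from the ambiguous ones is given below. The function name `dominant_annotation` and the 95% cut-off are assumptions for the example, not values proposed in this post.

```python
from collections import Counter


def dominant_annotation(annotations, threshold=0.95):
    """Return (annotation, share) if one annotation reaches `threshold`, else None.

    `annotations` lists how one surface pattern was annotated in each
    paper; `threshold` is a hypothetical cut-off chosen for illustration.
    """
    if not annotations:
        return None
    label, count = Counter(annotations).most_common(1)[0]
    share = count / len(annotations)
    return (label, share) if share >= threshold else None
```

A pattern like “$\sin^n x$” that is annotated identically every time would be reported as a candidate for inclusion in SME, whereas a pattern like “$f^n(x)$” whose annotations are split would return `None`.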

    References

    I believe that work in formal verification has tended to overlook the work on math language difficulties in math ed, so I have included some articles from that specialty.

    The following are posts from my blog Gyre&Gimble. They are in reverse chronological order.

    Creative Commons License

    This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.



    Pivoted symbols

    This post describes one problem that someone new to abstract math who is reading mathematical papers might find difficult to understand. Any attempt at machine translation of mathematical prose into computer-readable code would also have to take this phenomenon into account. The math ed literature describes many other problems like this for readers, some of which cause much more trouble than this example.

    I referred to the phenomenon described below as a parenthetic assertion in these publications:

    A symbol is pivoted if it is embedded just once in an expression that, when spelled out explicitly, requires it to appear twice because it plays two different roles in two different phrases or clauses. The examples below convince me that “pivoted symbol” is a better name than “parenthetic assertion”.

    Example

    The expression “$1\lt x\lt 2$” can be spelled out in these ways:

    1. “$1$ is less than $x$, which is less than $2$”.
    2. “$1$ is less than $x$ and $x$ is less than $2$”.

    In both cases, $x$ appears just once in the expression, but any reasonable rendering requires it to appear in two clauses. (The word “which” in the first statement refers to $x$ according to the rules for anaphora in English, so “which” stands in for $x$ there.)
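
For machine translation, the second spelling-out amounts to copying each interior term of a chain into two adjacent conjuncts. Here is a small Python sketch of that step; the function name `expand_chain` and the list-of-tokens input are assumptions made for the example.

```python
def expand_chain(terms, relations):
    """Split a chained relation into the conjuncts it abbreviates.

    terms     -- e.g. ["1", "x", "2"]
    relations -- e.g. ["<", "<"], one fewer than terms
    Every interior term is a pivot: it is written once in the chain
    but copied into two adjacent conjuncts.
    """
    if len(relations) != len(terms) - 1:
        raise ValueError("need exactly one relation between adjacent terms")
    return [
        (terms[i], relations[i], terms[i + 1])
        for i in range(len(relations))
    ]
```

Calling `expand_chain(["1", "x", "2"], ["<", "<"])` gives the two conjuncts `("1", "<", "x")` and `("x", "<", "2")`, with the pivot `x` appearing in both. The relations need not all be the same, so a chain mixing `=` and `≈` is handled identically.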

    Example

    “For any $x\gt0$ there is a $y\gt0$ such that $x\gt y$.”

    • This is mathematical shorthand for: “For any $x$ that is greater than $0$ there is a $y$ that is greater than $0$ such that $x$ is greater than $y$.”
    • The phrase “For any $x$ that is greater than $0$” conveys two pieces of information:
      • $x$ is bound by a universal quantifier.
      • $x$ is a variable in a phrase constraining it.
    • The constraint on $x$ to be bigger than $0$ fills the slot of a subordinate clause playing the role of an adjective, with $x$ as head. The word “that” is anaphoric.
    • The situation for $y$ in this expression is similar.
    • The elimination of “that” (which is one occurrence of $x$) in the expressions “$x$ greater than $0$” and “$x\gt 0$” fits a common pattern in English of omitting “that” or “that is”. Consider: “I saw a house bigger than the White House.” So calling the statement a “parenthetic assertion” doesn’t seem to fit the situation very well.
    • That’s why I have titled this post “Pivoted symbols”. The key to this name is that a translation into a formal language is going to have to encode two facts about $x$: It is bound by a quantifier and it is constrained to be greater than $0$. It has to be copied to accomplish this.
    • There are similar examples in Contrapositive grammar, by David Butler.
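
    Spelled out in first-order notation (one standard rendering, offered here only as an illustration), the sentence becomes

    \[\forall x\,\bigl(x\gt0 \rightarrow \exists y\,(y\gt0 \land x\gt y)\bigr)\]

    Here $x$ is written once after the quantifier and again in the constraint $x\gt0$, and likewise for $y$. The constraint on the universally quantified variable turns into the hypothesis of an implication, while the constraint on the existentially quantified variable turns into a conjunct.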

    Example

    “This infinite series converges to $\zeta(2)=\frac{\pi^2}{6}\approx 1.65$.”

    This example can be read in two ways that are different in English grammar but have the same logical content:

    • “This infinite series converges to $\zeta(2)$, which is $\frac{\pi^2}{6}$, which is approximately $1.65$.”
    • “$\ldots$ converges to $\zeta(2)$, and $\zeta(2)=\frac{\pi^2}{6}$, and $\frac{\pi^2}{6}$ is approximately $1.65$.”

    Example

    “Let us return for a moment to the circle $S^1\subseteq \mathbb{C}=\mathbb{R}^2$.” (Citation 426 in the Handbook of mathematical discourse.)

    Example

    \[B(t):=\frac{1+t^2}{2+t^2}\in\mathbb{Q}(t)\]
    is a sum of $2n$th powers of elements in $\mathbb{Q}(t)$ for all $n$. (Citation 332 in the Handbook of mathematical discourse.)

    Science-fiction fans used to do something similar with words instead of clauses, writing “yed” for “ye editor” and “scientifiction” for “scientific fiction”.

