Category Archives: understanding math

More about the definition of function

Maya Incaand commented on my post Definition of "function":

Why did you decide against "two inequivalent descriptions in common use"?  Is it no longer true?

This question concerns [1], which is a draft article.  I have not promoted it to the standard article in abstractmath because I am not satisfied with some things in it. 

More specifically, there really are two inequivalent descriptions in common use.  This is stated by the article, buried in the text, but if you read the beginning, you get the impression that there is only one specification.  I waffled, in other words, and I expect to rewrite the beginning to make things clearer.

Below are the two main definitions you see in university courses taken by math majors and grad students.  A functional relation has the property that no two distinct ordered pairs have the same first element.

Strict definition: A function consists of a functional relation with specified codomain (the domain is then defined to be the set of first elements of pairs in the relation).  Thus if $A$ and $B$ are sets with $A\subseteq B$ and $A\neq B$, then the identity function $1_A:A\to A$ and the inclusion function $i:A\to B$  are two different functions.

Relational definition: A function is a functional relation.  Then the identity and inclusion functions are the same function.  This means that a function and its graph are the same thing (discussed in the draft article).
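The difference between the two definitions can be made concrete with a small Python sketch.  It is only an illustration: the names `is_functional` and `StrictFunction` are invented here, and the sets are tiny finite examples.  A relation is modeled as a set of ordered pairs, and the strict version carries a codomain along with it.

```python
from dataclasses import dataclass


def is_functional(pairs):
    """True if no two distinct ordered pairs share a first element."""
    firsts = [a for (a, _) in pairs]
    return len(firsts) == len(set(firsts))


@dataclass(frozen=True)
class StrictFunction:
    """Strict definition: a functional relation plus a specified codomain."""
    graph: frozenset
    codomain: frozenset

    def __post_init__(self):
        if not is_functional(self.graph):
            raise ValueError("graph is not a functional relation")
        if not all(b in self.codomain for (_, b) in self.graph):
            raise ValueError("some value lies outside the codomain")

    @property
    def domain(self):
        # The domain is determined by the relation: the set of first elements.
        return frozenset(a for (a, _) in self.graph)


A = frozenset({1, 2})
B = frozenset({1, 2, 3})
graph = frozenset((a, a) for a in A)   # the pairs (1, 1) and (2, 2)

identity = StrictFunction(graph, A)    # 1_A : A -> A
inclusion = StrictFunction(graph, B)   # i : A -> B

print(identity.graph == inclusion.graph)  # True: same function, relationally
print(identity == inclusion)              # False: different functions, strictly
```

Under the relational definition only `graph` matters, so the identity and the inclusion cannot be told apart; under the strict definition the extra `codomain` field separates them.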

These definitions are subject to variations:

Variations in the strict definition: Some authors use "range" for "codomain" in the definition, and some don't make it clear that two functions with the same functional relation but different codomains are different functions.

Variations in the relational definition: Most such definitions state explicitly that the domain and range are determined by the relation (the set of first coordinates and the set of second coordinates). 

Formalism

There are many other variations in the formalism used in the definition.  For example, the strict definition can be formalized (as in Wikipedia) as an ordered triple $(A, B, f)$ where $A$ and $B$ are sets and $f$ is a functional relation with the property that every element of $A$ is the first element of an ordered pair in the relation.  

You could of course talk about an ordered triple $(A,f,B)$ blah blah.  Such definitions introduce arbitrary constructions that have properties irrelevant to the concept of function.  Would you ever say that the second element of the function $f(x)=x+1$ on the reals is the set of real numbers?  (Of course, if you used the formalism $(A,f,B)$ you would have to say the second element of the function is its graph! )

It is that kind of thing that led me to use a specification instead of a definition.  If you pay attention to such irrelevant formalism there seem to be many definitions of function.  In fact, at the university level there are only two, the strict definition and the relational definition.  The usage varies by discipline and age.  Younger mathematicians are more likely to use the strict definition.  Topologists use the strict definition more often than analysts (I think).

Usage

There is also variation in usage.

  • Most authors don't tell you which definition they use, and it often doesn't matter anyway. 
  • If an author defines a function using a formula, there is commonly an implicit assumption that the domain includes everything for which the formula is well-defined.  (The "everything" may be modified by referring to it as an integer, real, or complex function.)

Definitions of function on the web

Below are some definitions of function that appear on the web.  I have excluded most definitions aimed at calculus students or below; they often assume you are talking about numbers and formulas.  I have not surveyed textbooks and research papers.  That would have to be done for a proper scholarly article about mathematical usage of "function". But most younger people get their knowledge from the web anyway.

  1. Abstractmath draft article: Functions: Specification and Definition.  (Note:  Right now you can't get to this from the Table of Contents; you have to click the preceding link.) 
  2. Gyre&Gimble post: Definition of "function"
  3. Intmath discussion of function: function as functional relation between numbers, with induced domain and range.
  4. Mathworld definition of function: functional-relation definition.  Defines $F:A\to B$ in a way that requires $B$ to be the image.
  5. Planet Math definition of function: strict definition.
  6. Prime Encyclopedia of Mathematics: functional-relation definition.
  7. Springer Encyclopedia of Math definition of function: strict definition, except that it is not clear whether different codomains mean different functions.
  8. Wikipedia definition of function: discusses both definitions.
  9. Wisconsin Department of Public Instruction definition of function: function as functional relation.

Offloading chunking

In my previous post I wrote about the idea of offloading abstraction, the sort of things we do with geometric figures, diagrams (that post emphasized manipulable diagrams), drawing the tree of an algebraic expression, and so on.  This post describes a way to offload chunking.  

Chunking

I am talking about chunking in the sense of encapsulation, as some math ed. people use it.  I wrote about it briefly in [1], and [2] describes the general idea.  I don't have a good math ed reference for it, but I will include references if readers supply them.  

Chunking for some educators means breaking a complicated problem down into pieces and concentrating on them one by one.  That is not really the same thing as what I am writing about.  Chunking as I mean it enables you to think more coherently and efficiently about a complicated mathematical structure by objectifying some of the data in the structure.  

Project 

This project is an example of how chunking could be made visible in interactive diagrams, so that the reader grasps the idea of chunking.  I guess I am chunking chunking.  

Here is a short version of an example of chunking worked out in ridiculous detail in reference [1]. 

Let \[f(x)=.0002{{\left( \frac{{{x}^{3}}-10}{3{{e}^{-x}}+1} \right)}^{6}}\]  How do I know it is never negative?  Well, because it has the form (a positive number)(times)(something)$^6$.    Now (something)$^6$ is ((something)$^3)^2$ and a square is always nonnegative, so the function is (positive)(times)(nonnegative), so it has to be nonnegative.  

I recognized a salient fact about .0002, namely that it was positive: I grayed out (in my mind) its exact value, which is irrelevant.  I also noticed a salient fact about \[{{\left( \frac{{{x}^{3}}-10}{3{{e}^{-x}}+1} \right)}^{6}}\] namely that it was (a big mess that I grayed out)(to the 6th power).  And proceeded from there.  (And my chunking was inefficient; for example, it is more to the point that .0002 is nonnegative).
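The chunked argument can be mirrored in a few lines of Python.  This is a sketch, not a proof: the function names `f` and `f_chunked` and the sample grid are choices made for the illustration.  The second function computes the same value in the chunked form (nonnegative constant)(times)(a square), and the final check samples the function on a grid.

```python
import math


def f(x):
    """The function from the example, written out directly."""
    inner = (x**3 - 10) / (3 * math.exp(-x) + 1)
    return 0.0002 * inner**6


def f_chunked(x):
    """Same value, organized as (nonnegative constant) * (something cubed) squared."""
    something = (x**3 - 10) / (3 * math.exp(-x) + 1)  # the grayed-out mess
    cube = something**3
    return 0.0002 * cube**2                           # a square is never negative


# Sample points from -20 to 20 in steps of 0.1 (an arbitrary grid).
samples = [k / 10 for k in range(-200, 201)]
print(all(f(x) >= 0 for x in samples))  # True
```

The numerical check only looks at finitely many points; it is the chunked form in `f_chunked` that explains why the result holds for every real $x$.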

I believe you could make a movie of chunking like this using Mathematica CDF.  You would start with the formula, and then as the voiceover said "what's really important is that .0002 is nonnegative" the number would turn into a gray cloud with a thought balloon aimed at it saying "nonnegative".  The other part would turn into a gray cloud to the sixth, then the six would break into 3 times 2 as the voice comments on what is happening.  

It would take a considerable amount of work to carry this out.  Lots of decisions would need to be made.  

One problem is that Mathematica doesn't provide a way to do voiceovers directly (as far as I know).  Perhaps you could make a screen movie using screenshot software in real time while you talked and (offscreen) pushed buttons that made the various changes happen.

You could also do it with print instead of voiceover, as I did in the example in this post. In this case you need to arrange to have the printed part and the diagram simultaneously visible.  

I may someday try my hand at this.  But I would encourage others to attack this project if it interests them.  This whole blog is covered by the Creative Commons Attribution-ShareAlike 3.0 License, which means you may use, adapt and distribute the work freely provided you follow the requirements of the license.

I have other projects in mind that I will post separately.

References

  1. Abstractmath article on chunking.
  2. Wikipedia on chunking

Offloading abstraction

The interactive examples in this post require installing Wolfram CDF Player, which is free and works on most desktop computers using Firefox, Safari and Internet Explorer, but not Chrome. The source code is the Mathematica Notebook Tangent Line.nb, which is available for free use under a Creative Commons Attribution-ShareAlike 2.5 License. The notebook can be read by CDF Player if you cannot make the embedded versions in this post work.


The diagram above shows you the tangent line to the curve $y=x^3-x$ at a specific point.  The slider allows you to move the point around, and the tangent line moves with it. You can click on one of the plus signs for options about things you can do with the slider.  (Note: This is not new.  Many other people have produced diagrams like this one.)

I have some comments to make about this very simple diagram. I hope they raise your consciousness about what is going on when you use a manipulable demonstration.

Farming out your abstraction load

A diagram showing a tangent line drawn on the board or in a paper book requires you to visualize how the tangent line would look at other points.  This imposes a burden of visualization on you.  Even if you are a new student you won't find that terribly hard (am I wrong?) but you might miss some things at first:

  • There are places where the tangent line is horizontal.
  • There are places where some of the tangent lines cross the curve at another point. Many calculus students believe in the myth that the tangent line crosses the curve at only one point.  (It is not really a myth, it is a lie.  Any decent myth contains illuminating stories and metaphors.)
  • You may not envision (until you have some experience anyway) how when you move the tangent line around it sort of rocks like a seesaw.

You see these things immediately when you manipulate the slider.
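For readers without the CDF Player, the first two observations can also be checked numerically.  The following Python sketch uses helper names (`curve`, `slope`, `tangent`) invented for this illustration.  It relies on the fact that for $y=x^3-x$ the difference between the curve and its tangent line at $x=a$ factors as $(x-a)^2(x+2a)$, so the tangent line meets the curve again at $x=-2a$ whenever $a\neq 0$.

```python
import math


def curve(x):
    return x**3 - x


def slope(x):
    # Derivative of x^3 - x.
    return 3 * x**2 - 1


def tangent(a):
    """Return the tangent line to the curve at x = a, as a function of x."""
    return lambda x: slope(a) * (x - a) + curve(a)


# The tangent line is horizontal where the slope is zero: x = 1/sqrt(3) and x = -1/sqrt(3).
h = 1 / math.sqrt(3)
print(abs(slope(h)) < 1e-12, abs(slope(-h)) < 1e-12)  # True True

# The tangent line at a = 1 meets the curve again at x = -2.
a = 1.0
line = tangent(a)
print(math.isclose(line(-2 * a), curve(-2 * a)))      # True
```

Of course this gives you the facts without the seesaw; the point of the slider is that you see them without doing any algebra.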

Manipulating the slider reduces the load of abstract thinking in your learning process.     You have less to keep in your memory; some of the abstract thinking is offloaded onto the diagram.  This could be described as contracting out (from your head to the picture) part of the visualization process.  (Visualizing something in your head is a form of abstraction.)

Of course, reading and writing does that, too.  And even a static graph of a function lowers your visualization load.  What interactive diagrams give the student is a new tool for offloading abstraction.

You can also think of it as providing external chunking.  (I'll have to think about that more…)

Simple manipulative diagrams vs. complicated ones

The diagram above is very simple with no bells and whistles.  People have come up with much more complicated diagrams to illustrate a mathematical point.  Such diagrams:

  • May give you buttons that give you a choice of several curves that show the tangent line.
  • May give a numerical table that shows things like the slope or intercept of the current tangent line.
  • May also show the graph of the derivative, enabling you to see that it is in fact giving the value of the slope.

Such complicated diagrams are better suited for the student to play with at home, or to play with in class with a partner (much better than doing it by yourself).  When the teacher first explains a concept, the diagrams ought to be simple.

Examples

  • The Definition of derivative demo (from the Wolfram Demonstration Project) is an example that provides a table that shows the current values of some parameters that depend on the position of the slider.
  • The Wolfram demo Graphs of Taylor Polynomials is a good example of a demo to take home and experiment extensively with.  It gives buttons to choose different functions, a slider to choose the expansion point, another one to choose the number of Taylor polynomials, and other things.
  • On the other hand, the Wolfram demo Tangent to a Curve is very simple and differs from the one above in one respect: It shows only a finite piece of the tangent line.  That actually has a very different philosophical basis: it is representing for you the stalk of the tangent space at that point (the infinitesimal vector that contains the essence of the tangent line).
  • Brian Hayes wrote an article in American Scientist containing a moving graph (it moves only  on the website, not in the paper version!) that shows the changes of the population of the world by bars representing age groups.  This makes it much easier to visualize what happens over time.  Each age group moves up the graph — and shrinks until it disappears around age 100 — step by step.  If you have only the printed version, you have to imagine that happening.  The printed version requires more abstract visualization than the moving version.
  • Evaluating an algebraic expression requires seeing the abstract structure of the expression, which can be shown as a tree.  I would expect that if the students could automatically generate the tree (as you can in Mathematica)  they would retain the picture when working with an expression.  In my post computable algebraic expressions in tree form I show how you could turn the tree into an evaluation aid.  See also my post Syntax trees.
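The tree of an expression can also be generated outside Mathematica.  As a rough sketch, Python's standard `ast` module parses an expression into a tree, and a short recursive function (here called `evaluate`, a name made up for this example, as are the sample expression and variable values) evaluates it by walking the tree from the leaves up.

```python
import ast
import operator

# Map the tree's operator nodes to ordinary arithmetic.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def evaluate(node, env):
    """Evaluate an expression tree, looking up variable values in env."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body, env)
    if isinstance(node, ast.BinOp):
        left = evaluate(node.left, env)
        right = evaluate(node.right, env)
        return OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -evaluate(node.operand, env)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return env[node.id]
    raise ValueError("unsupported expression")


tree = ast.parse("(x**3 - 10) / (3*y + 1)", mode="eval")
print(ast.dump(tree.body))                # the tree, printed as nested nodes
print(evaluate(tree, {"x": 2, "y": 1}))   # -0.5
```

The printed dump shows the division at the root with the numerator and denominator as its two subtrees, which is the abstract structure you have to see in order to evaluate the expression by hand.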

This blog has a category "Mathematica" which contains all the graphs (many of them interactive) that are designed as an aid to offloading abstraction.


Abstract objects

Some thoughts toward revising my article on mathematical objects.  

Mathematical objects are a kind of abstract object.  There are lots of abstract objects that are not mathematical objects.  For example, if you keep a calendar or schedule for appointments, that calendar is an abstract object.  (This example comes from [2]). 

It may be represented as a physical object or you may keep it entirely in your head.  I am not going to talk about the latter possibility, because I don't know what to say.

  1. If it is a paper calendar, that physical object represents the information that is contained in your calendar.  
  2. Same for a calendar on a computer, but that is stored as magnetic bits on a disk or in flash memory. A computer program (part of the operating system) is required to present it on the screen in such a way that you can read it.  Each time you open it, you get a new physical representation of the calendar.

Your brain contains a module (see [5], [7]) that interprets the representation in (1) or (2) and which has connections with other modules in your brain for dates, times, locations and whether the appointment is for a committee, a medical exam, or whatever.  

The calendar-interpreter module in your brain is necessary for the physical object to be a calendar.  The physical object is not in itself your calendar.  The calendar in this sense does not exist in the physical world.  It is abstract.  Since we think of it as a thing, it is an abstract object.

The abstract object "my calendar" affects the physical world (it causes you to go to the dentist next Tuesday).  The relation of the abstract object to the physical world is mediated by whatever physical object you call your calendar along with the modules in the brain that relate to it.  The modules in the brain are actions by physical objects, so this point of view does not involve Cartesian style dualism.

Note:  A module is a meme.  Are all memes modules?  This needs to be investigated.  Whatever they are, they exist as physical objects in people's brains.

Mathematical objects

A rigorous proof of a theorem about a mathematical object tends to refer to the object as if it were absolutely static and did not affect anything in the physical world.  I talked about this in [10], where I called it the dry bones representation of a mathematical object.  Mathematical objects don't have to be thought of this way, but (I suggest) what makes them mathematical objects is that they can be thought of in dry bones mode.  

If you use calculus to figure out how much fuel to use in a rocket to make it go a mile high, then actually use that amount in the rocket and send it off, your calculations have affected your physical actions, so you were thinking of the calculations as an abstract object.  But if you sit down to check your calculations, you concentrate on the steps one by one with the rules of algebra and calculus in mind.  You are looking at them as inert objects, like you would look at a bone of a dinosaur to see what species it belongs to. From that point of view your calculations form a mathematical object, because you are using the dry-bones approach.

Caveat

All this blather is about how you should think about mathematical objects.  It can be read as philosophy, but I have no intention of defending it as philosophy.  People learning abstract math at college level have a lot of trouble thinking about mathematical objects as objects, and my intention is to start clarifying some aspects of how you think about them in different circumstances.  (The operative word is "start" — there is a lot more to be said.)

About the exposition of this post (a commercial)

You will notice that I gave examples of abstract objects but did not define the word "abstract object".  I did the same with mathematical objects.  In both cases, I put the word "abstract object" or "mathematical object" in boldface at a suitable place in the exposition.

That is not the way it is done in math, where you usually make the definition of a word in a formal way, marking it as Definition, putting the word in bold or italics, and listing the attributes it must have.  I want to point out two things:

  • For the most part, that behavior is peculiar to mathematics.
  • This post is not a presentation of mathematical ideas.  

This gives me an opportunity for a commercial:  Read what we have written about definitions in References [1], [3] and [4].

References

  1. Atish Bagchi and Charles Wells, Varieties of Mathematical Prose, 1998.
  2. Reuben Hersh, What is mathematics, really? Oxford University Press, 1997
  3. Charles Wells, Handbook of Mathematical Discourse.
  4. Charles Wells, Mathematical objects in abstractmath.org
  5. Math and modules of the mind (previous post)
  6. Mathematical Concepts (previous post)
  7. Thinking about abstract math (previous post)
  8. Terrence W. Deacon, Incomplete Nature.  W. W. Norton, 2012. [I have read only a little of this book so far, but I think he is talking about abstract objects in the sense I have described above.]
  9. Gideon Rosen, Abstract Objects.  Stanford Encyclopedia of Philosophy.  http://plato.stanford.edu/entries/abstract-objects/
  10. Representations II: Dry Bones (previous post)


Whole numbers

Sue Van Hattum wrote in response to a recent post:

I’d like to know what you think of my ‘abuse of terminology’. I teach at a community college, and I sometimes use incorrect terms (and tell the students I’m doing so), because they feel more aligned with common sense.

To me, and to most students, the phrase “whole numbers” sounds like it refers to anything that doesn’t need fractions to represent it, and should include negative numbers. (It then, of course, would mean the same thing that the word integers does.) So I try to avoid the phrase, mostly. But I sometimes say we’ll use it with the common sense meaning, not the official math meaning.

Her comments brought up a couple of things I want to blather about.

Official meaning

There is no such thing as an "official math meaning".  Mathematical notation has no governing authority and research mathematicians are too ornery to go along with one anyway.  There is a good reason for that attitude:  Mathematical research constantly causes us to rethink the relationship among different mathematical ideas, which can make us want to use names that show our new view of the ideas.  An excellent example of that is the evolution of the concept of "function" over the past 150 years, traced in the Wikipedia article.

What some "authorities" say about "whole number":

  • MathWorld  says that "whole number" is used to mean any of these:  Any positive integer, any nonnegative integer or any integer.
  • Wikipedia also allows all three meanings.
  • Webster's New World dictionary (for which I have been a consultant, but they didn't ask me about whole numbers!) gives "any integer" as a second meaning.
  • American Heritage Dictionary gives "any integer" as the only meaning.
  • Someone stole my copy of Merriam Webster.

Common Sense Meaning

Mathematicians think and talk about any particular kind of math object using images and metaphors.  Sometimes (not very often) the name they give to a math object embodies a metaphor.  Examples:

  • A complex number is usually notated using two real parameters, so it looks more complicated than a real number.
  • "Rings" were originally called that because the first examples were integers (mod n) for some positive integer, and you can think of them as going around a clock showing n hours.

Unfortunately, much of the time the name of a kind of object contains a suggestive metaphor that is bad,  meaning that it suggests an erroneous picture or idea of what the object is like.

  • A "group" ought to be a bunch of things.  In other words, the word ought to mean "set".
  • The word "line" suggests that it ought to be a row of points.  That suggests that each point on a line ought to have one next to it.  But that's not true on the "real line"!

Sue's idea that the "common sense" meaning of "whole number" is "integer" refers, I think, to the built-in metaphor of the phrase "whole number" (unbroken number).

I urge math teachers to do these things:

  • Explain to your students that the same math word or phrase can mean different things in different books.
  • Convince your students to avoid being fooled by the common-sense (metaphorical) meaning of a mathematical phrase.

 


Mathematical usage

Comments about mathematical usage, extending those in my post on abuse of notation.

Geoffrey Pullum, in his post Dogma vs. Evidence: Singular They, makes some good points about usage that I want to write about in connection with mathematical usage.  There are two different attitudes toward language usage abroad in the English-speaking world. (See Note [1])

  • What matters is what people actually write and say.   Usage in this sense may often be reported with reference to particular dialects or registers, but in any case it is based on evidence, for example citations of quotations or a linguistic corpus.  (Note [2].)  This approach is scientific.
  • What matters is what a particular writer (of usage or style books) believes about  standards for speaking or writing English.  Pullum calls this "faith-based grammar".  (People who think in this way often use the word "grammar" for usage.)  This approach is unscientific.

People who write about mathematical usage fluctuate between these two camps.

My writings in the Handbook of Mathematical Discourse and abstractmath.org are mostly evidence based, with some comments here and there deprecating certain usages because they are confusing to students.  I think that is about the right approach.  Students need to know what is actual mathematical usage, even usage that many mathematicians deprecate.

Most math usage that is deprecated (by me and others) is deprecated for a reason.  This reason should be explained, and that is enough to stop it being faith-based.  To make it really scientific you ought to cite evidence that students have been confused by the usage.  Math education people have done some work of this sort.  Most of it is at the K-12 level, but some have worked with college students, observing the way they solve problems or how they understand some concepts, and this work often cites examples.

Examples of usage to be deprecated

 

Powers of functions

$f^n(x)$ can mean either iterated composition or multiplication of the values.  For example, $f^2(x)$ can mean $f(x)f(x)$ or $f(f(x))$.  This is exacerbated by the fact that in undergrad calculus texts, $\sin^{-1}x$ refers to the arcsine, and $\sin^2 x$ refers to $\sin x\sin x$.  This causes innumerable students trouble.  It is a Big Deal.
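A short Python sketch shows how far apart the two readings are.  The helper names `compose_power` and `value_power` are invented for the illustration, and the input 0.5 is arbitrary.

```python
import math


def compose_power(f, n):
    """f^n read as iterated composition: apply f to x, n times."""
    def g(x):
        for _ in range(n):
            x = f(x)
        return x
    return g


def value_power(f, n):
    """f^n read as a power of the value: f(x) multiplied by itself n times."""
    return lambda x: f(x) ** n


x = 0.5
print(value_power(math.sin, 2)(x))    # sin(x) * sin(x), the calculus-text meaning of sin^2 x
print(compose_power(math.sin, 2)(x))  # sin(sin(x)), the iterated-composition meaning

# With exponent -1 the calculus texts switch readings: sin^{-1} x is the inverse function.
print(math.asin(x))     # arcsine of x
print(1 / math.sin(x))  # reciprocal of sin x, a different number
```

The two readings give different numbers for the same input, so a student who guesses the wrong one gets a wrong answer, not just an unfamiliar notation.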

In

Set "in" another set.  This is discussed in the Handbook.  My impression is that for students the problem is that they confuse "element of" with "subset of", and the fact that "in" is used for both meanings is not the primary culprit.  That's because most sets in practice don't have both sets and non-sets as elements.  So the problem is a Big Deal when students first meet with the concept of set, but the notational confusion with "in" is only a Small Deal.

Two

The convention that "two" things are allowed to be the same is not a Big Deal.  But I have personally witnessed students (in upper level undergrad courses) who were confused by it.

Parentheses

The many uses of parentheses, discussed in abstractmath.  (The Handbook article on parentheses gives citations, including one in which the notation "$(a,b)$" means open interval once and GCD once in the same sentence!)  I think the only part that is a Big Deal, or maybe Medium Deal, is the fact that the value of a function $f$ at an input $x$ can be written either as "$f\,x$" or as "$f(x)$".  In fact, we do without the parentheses when the name of the function is a convention, as in $\sin x$ or $\log x$, and with the parentheses when it is a variable symbol, as in "$f(x)$".  (But a substantial minority of mathematicians use $f\,x$ in the latter case.  Not to mention $xf$.)  This causes some beginning calculus students to think "$\sin x$" means "sin" times $x$.

More

The examples given above are only a sampling of troubles caused by mathematical notation.   Many others are mentioned in the Handbook and in Abstractmath, but they are scattered.  I welcome suggestions for other examples, particularly at the college and graduate level. Abstractmath will probably have a separate article listing the examples someday…

Notes

[1] The situation Pullum describes for English is probably different in languages such as Spanish, German and French, which have Academies that dictate usage for the language.  On the other hand, from what I know about them most speakers of those languages ignore their dictates.

[2] Actually, they may use more than one corpus, but I didn't want to write "corpuses" or "corpora" because either way I would get sharp comments from faith-based usage people.

References on mathematical usage

Bagchi, A. and C. Wells (1997), Communicating Logical Reasoning.

Bagchi, A. and C. Wells (1998)  Varieties of Mathematical Prose.

Bullock, J. O. (1994), ‘Literacy in the language of mathematics’. American Mathematical Monthly, volume 101, pages 735-743.

de Bruijn, N. G. (1994), ‘The mathematical vernacular, a language for mathematics with typed sets’. In Selected Papers on Automath, Nederpelt, R. P., J. H. Geuvers, and R. C. de Vrijer, editors, volume 133 of Studies in Logic and the Foundations of Mathematics, pages 865-935.

Epp, S. S. (1999), ‘The language of quantification in mathematics instruction’. In Developing Mathematical Reasoning in Grades K-12. Stiff, L. V., editor (1999),  NCTM Publications.  Pages 188-197.

Gillman, L. (1987), Writing Mathematics Well. Mathematical Association of America

Higham, N. J. (1993), Handbook of Writing for the Mathematical Sciences. Society for Industrial and Applied Mathematics.

Knuth, D. E., T. Larrabee, and P. M. Roberts (1989), Mathematical Writing, volume 14 of MAA Notes. Mathematical Association of America.

Krantz, S. G. (1997), A Primer of Mathematical Writing. American Mathematical Society.

O'Halloran, K. L.  (2005), Mathematical Discourse: Language, Symbolism And Visual Images.  Continuum International Publishing Group.

Pimm, D. (1987), Speaking Mathematically: Communications in Mathematics Classrooms.  Routledge & Kegan Paul.

Schweiger, F. (1994b), ‘Mathematics is a language’. In Selected Lectures from the 7th International Congress on Mathematical Education, Robitaille, D. F., D. H. Wheeler, and C. Kieran, editors. Sainte-Foy: Presses de l’Université Laval.

Steenrod, N. E., P. R. Halmos, M. M. Schiffer, and J. A. Dieudonné (1975), How to Write Mathematics. American Mathematical Society.

Wells, C. (1995), Communicating Mathematics: Useful Ideas from Computer Science.

Wells, C. (2003), Handbook of Mathematical Discourse

Wells, C. (ongoing), Abstractmath.org.


Two

The post Are these questions unambiguous? in the blog Explaining Mathematics concerns the funny way mathematicians use the number “two” (Note [3]).  This is discussed in Abstractmath.org, based on usage quotations (see Note [1]) in the Handbook of Mathematical Discourse. They are citations  54, 119, 220, 229, 260, 322, 323 and 338.  The list is in the online version of the Handbook (see Note [2]) which takes forever to load.  (There is a separate file for users of the paperback book but it is currently trashed.)

The usage quirk concerning “two” is exemplified by statements such as these:

  1. The sum of any two even integers is even.
  2. Courant gives Leibniz’ rule for finding the Nth derivative of the product of two functions.  (This is from Citation 323.)
  3. Are there two positive integers m and n, both greater than 1, satisfying mn=9? (This is from Explaining Mathematics.)

Statements 1 and 2 are of course true.  They are still true if the “two” things are the same.  Mathematicians generally assume that such a statement includes the case where the two things are the same.  If the case that they are the same is excluded, the statement becomes an unnecessarily weak assertion.

Statement 3, in my opinion, is badly written.  If the two positive integers have to be distinct, the answer is “no”.   I think any competent mathematical writer would write something like, “There are not two distinct integers m and n both greater than 1 for which mn = 9”.

It is fair to say that when mathematicians refer to “two integers” in statements like these, they are allowed to be the same.  If they can’t be the same for the sentence to remain true, they will (or at least should) insert a word such as “distinct”.
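The two readings of “two” can be checked by brute force.  This Python sketch looks at Statement 1 over a small range of even integers and at Statement 3 over the candidates from 2 to 9; the ranges and variable names are chosen only for the illustration.

```python
# Statement 1: the sum of any two even integers is even.
# The double loop includes the case a == b, and the statement still holds.
evens = range(-10, 11, 2)
print(all((a + b) % 2 == 0 for a in evens for b in evens))  # True

# Statement 3: positive integers m and n, both greater than 1, with mn = 9.
# Any such m and n are at most 9, so searching 2..9 is enough.
pairs = [(m, n) for m in range(2, 10) for n in range(2, 10) if m * n == 9]
print(pairs)     # [(3, 3)]

# If the two integers are required to be distinct, nothing is left.
distinct = [(m, n) for (m, n) in pairs if m != n]
print(distinct)  # []
```

So the answer to Statement 3 is “yes” under the usual mathematical reading (m = n = 3) and “no” if “two” is taken to mean “two distinct”.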

Of course, in some sentences the two integers can’t be the same because of some condition imposed in the context.  That doesn’t happen in the citations I have listed.  Maybe someone can contribute an example.

Notes

[1] In the Handbook, usage quotations are called “citations”.  It appears to me that the commonest name for citations among lexicographers is “usage quotations”, so I will start calling them that.

[2] I created the online version of the Handbook hastily in 2006.  It needs work, since it has TeX mistakes (which may irritate you but should not interfere with readability) and omits the quotations, illustrations, and some backlinks, including backlinks for the citations.  Some Day When I Get A Round Tuit…

[3] This funny property of “two” was discussed many years ago by Steenrod or Knuth or someone, and is mentioned in a paper by Susanna Epp, but I don’t currently have access to any of the references.

 


Abuse of notation

I have recently read the Wikipedia article on Abuse of Notation (this link is to the version of 29 December 2011, since I will eventually edit it).  The Handbook of Mathematical Discourse and abstractmath.org mention this idea briefly.  It is time to expand the abstractmath article and to redo parts of the Wikipedia article, which  contains some confusions.

This is a preliminary draft, part of which I’ll incorporate into abstractmath after you readers make insightful comments :).

The phrase “Abuse of Notation” is used in articles and books written by research mathematicians.  It is part of Mathematical English.  This post is about

  • What “abuse of notation” means in mathematical writing and conversation.
  • What it could be used to mean.
  • Mathematical usage in general.  I will discuss this point in the context of the particular phrase “abuse of notation”, not a bad way to talk about a subject.

Mathematical Usage

Sources

If I’m going to write about the usage of Mathematical English, I should ideally verify what I claim about the usage by finding citations for a claim: documented quotations that illustrate the usage.  This is the standard way to produce any dictionary.

There is no complete authoritative source for usage of words and phrases in Mathematical English (ME), or for that matter for usage in the Symbolic Language (SL).

  • The Oxford Concise Dictionary of Mathematics [2] covers technical terms and symbols used in school math and in much of undergraduate math, but not so much of research math.  It does not mention being based on citations and it hardly talks about usage at all, even for notorious student-confusing notations such as “$\sin^k x$”. But it appears quite accurate with good explanations of the math it covers.
  • I wrote Handbook of Mathematical Discourse to stimulate investigations into mathematical usage.  It describes a good many usages in Mathematical English and the Symbolic Language, documented with citations of quotations, but is quite incomplete (as I said in its Introduction).  The Handbook has 428 citations for various usages.  (They are at the end of the on-line PDF version. They are not in the printed book, but are on the web with links to pages in the printed book.)
  • MathWorld has an extensive list of mathematical words, phrases and symbols, and accurate definitions or descriptions of them, even for a great many advanced research topics. It also frequently mentions usage (see formula and inverse sine), but does not give citations.
  • Wikipedia has the most complete set of definitions of mathematical objects that I know of.  The entries sometimes mention usage. I have not detected any entry that gives citations for usage.  Not that that should stop anyone from adding them.

Teaching mathematical usage

In explaining mathematical usage to students, particularly college-level or higher math students, you have choices:

  1. Tell them what you think the usage of a word, phrase, or symbol is, without researching citations.
  2. Tell them what you think the usage ought to be.
  3. Tell them what you think the usage is, supported by citations.

(1) has the problem that you can be wrong.  In fact when I worked on the Handbook I was amazed  at how wrong I could be in what the usage was, in spite of the fact that I had been thinking about usage in ME and SL since I first started teaching (and kept a folder of what I had noticed about various usages).  However,  professional mathematicians generally have a reasonably accurate idea about usage for most things, particularly in their field and in undergraduate courses.

(2) is dangerous.  Far too many mathematicians (though still a minority) introduce usage in articles and lectures that is not common or that they invented themselves. As a result their students will be confused in trying to read other sources and may argue with other teachers about what is “correct”.  It is a gross violation of teaching ethics to tell the students that (for example) “$x > 0$” allows $x = 0$ and not mention to them that nearly all written mathematics does not allow that.  (Did you know that a small percentage of mathematicians and educators do use that meaning, including in some secondary institutions in some countries?  It is partly Bourbaki’s fault.)

(3) You often can’t tell them what the usage is, supported by citations, because, as mentioned above, documented mathematical usage is sparse.

I think people should usually choose (1) instead of (2).  If they do want to introduce a new usage or notation because it is “more logical” or because “my thesis advisor used it” or something, they should reconsider.  Most such attempts have failed, and thousands of students have been confused by the attempts.

Abuse of notation

“Abuse of notation” is a phrase used in mathematical writing to describe terminology and notation that does not have transparent meaning. (Transparent meaning is described in some detail under “compositional” in the Handbook.)

Abuse of notation was originally defined in French, where the word “abus” does not carry the same strongly negative connotation that it does in English.

Suppression of parameters

One widely noticed practice called “abuse of notation” is the use of the name of the underlying set of a mathematical structure to refer to the structure itself. For example, a group is a structure $(G,*)$ where $G$ is a set and $*$ is a binary operation with certain properties. The most common way to refer to this structure is simply to call it $G$. Since any set of cardinality greater than 1 has more than one group structure on it, the name $G$ does not include all the information needed to determine the group. This type of usage is illustrated in citation 82 below.  It is an example of suppression of parameters.
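As a small illustration of why the name of the set alone underdetermines the group, here is a sketch in Python that enumerates every binary operation on a two-element set and keeps those satisfying the group axioms. The function names `is_group` and `group_structures` are my own choices for this example, not part of any library, and the brute-force search is only practical for very small sets.

```python
from itertools import product

def is_group(elements, op):
    """Check the group axioms for an operation given as a dict {(a, b): a*b}."""
    # Associativity: (a*b)*c == a*(b*c) for all triples.
    for a, b, c in product(elements, repeat=3):
        if op[op[a, b], c] != op[a, op[b, c]]:
            return False
    # Identity: some e with e*a == a*e == a for all a.
    identities = [e for e in elements
                  if all(op[e, a] == a and op[a, e] == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # Inverses: every a has some b with a*b == b*a == e.
    return all(any(op[a, b] == e and op[b, a] == e for b in elements)
               for a in elements)

def group_structures(elements):
    """Return every binary operation on `elements` that makes it a group."""
    pairs = list(product(elements, repeat=2))
    found = []
    for values in product(elements, repeat=len(pairs)):
        op = dict(zip(pairs, values))
        if is_group(elements, op):
            found.append(op)
    return found

structures = group_structures(["a", "b"])
print(len(structures))  # prints 2
```

The two operations found differ in which element serves as the identity, so the pair (set, operation) carries information that the set by itself does not.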

Writing “$\log x$” without mentioning the base of the logarithm is also an example of suppression of parameters.  I think most mathematicians would regard this as a convention rather than as an abuse of notation.  But I have no citations for this (although they would probably be easy to find).  I doubt that it is possible to find a rational distinction between “abuse of notation” and “convention”; it is all a matter of what people are used to saying.

Synecdoche

The naming of a structure by using the name of its underlying set is also an example of synecdoche, the naming of a whole by a part (for example, “wheels” to mean a car).

Another type of synecdoche that has been called abuse of notation is referring to an equivalence class by naming one of its elements.  I do not have a good quotation-citation that shows this use.  Sometimes people write $2 + 4 = 1$ when they are working in the Galois field with 5 elements.  But that can be interpreted in more than one way.  If GF[5] consists of equivalence classes of integers (mod 5) then they are indeed using 2 (for example) to stand for the equivalence class of 2.  But they could instead define GF[5] in the obvious way with underlying set $\{0,1,2,3,4\}$.  In any case, making distinctions of that sort is pedantic, since the two structures are related by a natural isomorphism (see the next section).
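The two readings of $2 + 4 = 1$ can be put side by side in a short Python sketch. The class `ResidueClass` and the function `add_mod5` are hypothetical names introduced only for this illustration.

```python
class ResidueClass:
    """The equivalence class of an integer modulo n, named by any representative."""

    def __init__(self, representative, n):
        self.n = n
        self.rep = representative % n  # store a canonical representative

    def __add__(self, other):
        return ResidueClass(self.rep + other.rep, self.n)

    def __eq__(self, other):
        return self.n == other.n and self.rep == other.rep

    def __hash__(self):
        return hash((self.n, self.rep))

def add_mod5(a, b):
    """Addition defined directly on the underlying set {0, 1, 2, 3, 4}."""
    return (a + b) % 5

# Reading 1: "2" and "4" name equivalence classes of integers.
class_sum = ResidueClass(2, 5) + ResidueClass(4, 5)
# Reading 2: 2 and 4 are themselves the elements of the field.
plain_sum = add_mod5(2, 4)

print(class_sum == ResidueClass(1, 5), plain_sum == 1)  # True True
```

In the first reading many names denote the same element (`ResidueClass(7, 5)` equals `ResidueClass(2, 5)`), while in the second each element has exactly one name; sending each class to its canonical representative matches the two structures up.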

Identifying objects via isomorphism

This is quite commonly called “abuse of notation” and is exemplified in citations 209, 395 and AB3.

Overloaded notation

John Harrison, in [1], uses “abuse of notation” to describe the use of a function symbol to apply to both an element of its domain and a subset of the domain.  This is an example of overloaded notation.  I have not found any citation for this usage other than Harrison’s and I don’t remember anyone else using it.  Another example of overloaded notation is the use of the same symbol “$\times$” for multiplication of numbers, matrices and 3-vectors.  I have never heard that called abuse of notation.  But I have no authority to say anything about this usage because I haven’t made the requisite thorough search of the literature.
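The overloading of a function symbol can be mimicked in Python. In this sketch the sample function `f`, the helper `image`, and the wrapper class `Overloaded` are all assumptions made for the example; the point is only that one symbol ends up doing two jobs.

```python
def f(x):
    """A sample function on integers (an arbitrary choice for illustration)."""
    return x * x

def image(func, subset):
    """The image of a subset under func, the thing written f(A) when overloading."""
    return {func(x) for x in subset}

class Overloaded:
    """Wrap a function so one symbol accepts an element or a set of elements."""

    def __init__(self, func):
        self.func = func

    def __call__(self, arg):
        if isinstance(arg, (set, frozenset)):
            return image(self.func, arg)
        return self.func(arg)

F = Overloaded(f)
print(F(3))           # applied to an element: 9
print(F({-2, 2, 3}))  # applied to a subset: the set containing 4 and 9
```

The dispatch on the type of the argument works here because no integer is also a set of integers. When the domain itself contains sets as elements, the overloaded symbol becomes genuinely ambiguous, which is presumably why some writers feel the need to flag it.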

Powers of functions

The Wikipedia article on abuse of notation (29 Dec 2011 version) mentions the fact that $f^2(x)$ can mean either $f(x)f(x)$ or $f(f(x))$.  I have never heard this called abuse of notation and I don’t think it should be called that.  The notation “$f^2(x)$” can in ordinary usage mean one of two things and the author or teacher should say which one they mean.  Many math phrases or symbolic expressions can mean more than one thing and the author generally should say which.  I don’t see the point of calling this phenomenon abuse of notation.
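A two-line computation shows that the two readings really do differ. This is a minimal Python sketch; the names `pointwise_square` and `second_iterate` and the sample function are mine.

```python
def pointwise_square(f):
    """Return the function x -> f(x) * f(x)."""
    return lambda x: f(x) * f(x)

def second_iterate(f):
    """Return the function x -> f(f(x))."""
    return lambda x: f(f(x))

def f(x):
    return x + 1

print(pointwise_square(f)(2))  # f(2) * f(2) = 9
print(second_iterate(f)(2))    # f(f(2)) = 4
```

Giving the two operations different names, as the code is forced to do, is exactly the disambiguation that an author using “$f^2(x)$” owes the reader.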

Radial concept

The Wikipedia article mentions phrases such as “partial function”.  The article does provide a citation in which Bourbaki calls a sentence such as “Let $f:A\to B$ be a partial function” abuse of notation.  Bourbaki is wrong in a deep sense (as the article implies).  There are several points to make about this:

  • Some authors, particularly in logic, define a function to be what most of us call a partial function.  Some authors  require a ring to have a unit and others don’t.  So what?
  • The phrase “partial function” has a standard meaning in math:  Roughly “it is a function except it is defined on only part of its domain”.  Precisely, $f:A\to B$ is a partial function if it is a function $f:A'\to B$ for some subset $A'$ of $A$.
  • A partial function is not in general a function.  A stepmother is not a mother.  A left identity may not be an identity, but the phrase “left identity” is defined precisely.   An incomplete proof is not a proof, but you know what the phrase means! (Compare “expectant mother”).   This is the way we normally talk and think.  See the article “radial concept” in the Handbook.
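The precise definition of partial function given in the list above can be checked mechanically for finite sets. The following Python sketch represents a relation as a set of ordered pairs; the function names are hypothetical and chosen for this example.

```python
def is_partial_function(pairs, A, B):
    """True if `pairs` is a function A' -> B for some subset A' of A."""
    firsts = [x for x, _ in pairs]
    # No two distinct pairs may share a first element.
    single_valued = len(firsts) == len(set(firsts))
    in_range = all(x in A and y in B for x, y in pairs)
    return single_valued and in_range

def is_function(pairs, A, B):
    """True if `pairs` is a partial function defined on all of A."""
    return is_partial_function(pairs, A, B) and {x for x, _ in pairs} == set(A)

A = {1, 2, 3}
B = {"u", "v"}
f = {(1, "u"), (2, "v")}  # defined on A' = {1, 2} only

print(is_partial_function(f, A, B), is_function(f, A, B))  # True False
```

Every function from $A$ to $B$ passes both tests, while the example passes only the first, which is the sense in which a partial function is not in general a function.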

Other uses

AB4 involves a redefinition of “$\in$” in a special case.  Authors redefine symbols all the time.  This kind of redefinition on the fly probably should be avoided, but since they did it I am glad they mentioned it.

I have not talked about some of the uses mentioned in the Wikipedia article because I don’t yet understand them well enough.  AB1 and AB2 refer to a common use with pullback that I am not sure I understand (in terms of how the author is thinking of it).  I also don’t understand AB5.  Suggestions from readers would be appreciated.

Kill it!

Well, it’s more polite to say, we don’t need the phrase “abuse of notation” and it should be deprecated.

  • The use of the word “abuse” makes it sound like a bad thing, and most instances of abuse of notation are nothing of the sort.  They make mathematical writing much more readable.
  • Nearly everywhere it is used it could just as well be called a convention.  (This requires verification by studying math texts.)

Citations

The first three citations are in the Handbook list; the numbers refer to that list’s numbering. The others I searched out for the purpose of this post.

82. Busenberg, S., D. C. Fisher, and M. Martelli (1989), Minimal periods of discrete and smooth orbits. American Mathematical Monthly, volume 96, pages 5–17. [p. 8. Lines 2–4.]

Therefore, a normed linear space is really a pair $(\mathbf{E},\|\cdot\|)$ where $\mathbf{E}$ is a linear vector space and $\|\cdot\|:\mathbf{E}\to(0,\infty)$ is a norm. In speaking of normed spaces, we will frequently abuse this notation and write $\mathbf{E}$ instead of the pair $(\mathbf{E},\|\cdot\|)$.

209. Hunter, T. J. (1996), On the homology spectral sequence for topological Hochschild homology. Transactions of the American Mathematical Society, volume 348, pages 3941–3953. [p. 3934. Lines 8–6 from bottom.]

We will often abuse notation by omitting mention of the natural isomorphisms making $\wedge$ associative and unital.

395. Teitelbaum, J. T. (1991), ‘The Poisson kernel for Drinfeld modular curves’. Journal of the American Mathematical Society, volume 4, pages 491–511. [p. 494. Lines 1–4.]

… may find a homeomorphism $x:E\to \mathbb{P}^1_k$ such that $\displaystyle x(\gamma u) = \frac{ax(u)+b}{cx(u)+d}$. We will tend to abuse notation and identify $E$ with $\mathbb{P}^1_k$ by means of the function $x$.

AB1. Fujita, T. On the structure of polarized manifolds with total deficiency one.  I. J. Math. Soc. Japan, Vol. 32, No. 4, 1980.

Here we show examples of symbols used in this paper …

$L_{T}$: The pull back of $L$ to a space $T$ by a given morphism $T\rightarrow S$. However, when there is no danger of confusion, we OFTEN write $L$ instead of $L_T$ by abuse of notation.

AB2. Sternberg, S. Minimal coupling and the symplectic mechanics of a classical particle in the presence of a Yang-Mills field. Physics, Vol. 74, No. 12, pp. 5253-5254, December 1977.

On the other hand, let us, by abuse of notation, continue to write $\Omega$ for the pullback of $\Omega$ from $F$ to $P \times F$ by projection onto the second factor. Thus, we can write $\xi_Q\rfloor\Omega = \xi_F\rfloor\Omega$ and …

AB3. Dobson, D, and Vogel, C. Convergence of an iterative method for total variation denoising. SIAM J. Numer. Anal., Vol. 34, pp. 1779, October, 1997.

Consider the approximation

(3.7) $u\approx U\stackrel{\text{def}}{=}\sum_{j=1}^N U_j\phi_j$ …

In an abuse of notation, $U$ will represent both the coefficient vector $\{U_j\}_{j=1}^N$ and the corresponding linear combination (3.7).

AB4. Lewis, R, and Torczon, V. Pattern search algorithms for bound constrained minimization.  NASA Contractor Report 198306; ICASE Report No. 96-20.

By abuse of notation, if $A$ is a matrix, $y\in A$ means that the vector $y$ is a column of $A$.

AB5. Allemandi, G, Borowiec, A. and Francaviglia, M. Accelerated Cosmological Models in Ricci squared Gravity. arXiv:hep-th/0407090v2, 2008.

This allows to reinterpret both $f(S)$ and $f'(S)$ as functions of $\tau$ in the expressions:
\begin{equation*}\begin{cases}  f(S) = f(F(\tau)) = f(\tau )\\  f'(S) = f'(F(\tau )) = f'(\tau )\end{cases}\end{equation*}
following the abuse of notation $f(F(t)) = f(t)$ and $f'(F(t)) = f'(t)$.

References

[1] Harrison, J. Criticism and reconstruction, in Formalized Mathematics (1996).

[2] Clapham, C. and J. Nicholson.  Oxford Concise Dictionary of Mathematics, Fourth Edition (2009).  Oxford University Press.

 


More about defining “category”


In a recent post, I wrote about defining “category” in a way that (I hope) makes it accessible to undergraduate math majors at an early stage.  I have several more things to say about this.

Early intro to categories

The idea is to define a category as a directed graph equipped with an additional structure of composition of paths subject to some axioms.  Giving several small finite examples of categories drawn in that way gives you an understanding of “category” that has several desirable properties:

  • You get the idea of what a category is in one lecture.
  • With the right choice of examples you get several fine points cleared up:
    • The composition is added structure.
    • A loop doesn’t have to be an identity.
    • Associativity is a genuine requirement —  it is not automatic.
  • You get immediate access to what is by far the most common notation used to work with a category — objects (nodes) and arrows.
  • You don’t have to cope with the difficult chunking required when the first examples given are sets-with-structure and structure-preserving functions.  It’s quite hard to focus on a couple of dots on the paper each representing a group or a topological space and arrows each representing a whole function (not the value of the function!).
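The fine points in the list above can be checked by machine on very small examples. The following is a sketch in Python, under the assumption that a finite category is stored as three dictionaries (arrows with their source and target, a chosen identity for each node, and a composition table); the names `is_category` and `one_object` are mine, not from any library.

```python
from itertools import product

def is_category(arrows, identity, compose):
    """Check the category axioms for a finite directed graph with composition.

    arrows:   dict, arrow name -> (source node, target node)
    identity: dict, node -> name of the arrow chosen as its identity
    compose:  dict, (g, f) -> name of the composite "g after f"
    """
    nodes = {n for ends in arrows.values() for n in ends}
    if set(identity) != nodes:
        return False
    # Each chosen identity must be a loop on its node.
    for node, i in identity.items():
        if arrows.get(i) != (node, node):
            return False
    # Composites exist exactly for head-to-tail pairs and have the right ends.
    for f, g in product(arrows, repeat=2):
        composable = arrows[f][1] == arrows[g][0]
        if composable != ((g, f) in compose):
            return False
        if composable and arrows.get(compose[g, f]) != (arrows[f][0], arrows[g][1]):
            return False
    # Identity laws.
    for f, (src, tgt) in arrows.items():
        if compose[f, identity[src]] != f or compose[identity[tgt], f] != f:
            return False
    # Associativity on every composable triple.
    for f, g, h in product(arrows, repeat=3):
        if arrows[f][1] == arrows[g][0] and arrows[g][1] == arrows[h][0]:
            if compose[h, compose[g, f]] != compose[compose[h, g], f]:
                return False
    return True

def one_object(names, table):
    """Loops on a single node "X", with "1" as identity and `table` for the rest."""
    arrows = {name: ("X", "X") for name in names}
    compose = dict(table)
    for f in names:
        compose["1", f] = f
        compose[f, "1"] = f
    return arrows, {"X": "1"}, compose

# Same graph (two loops on one node), two different compositions.
idempotent = one_object(["1", "e"], {("e", "e"): "e"})
involution = one_object(["1", "e"], {("e", "e"): "1"})
# A composition table that obeys the identity laws but is not associative.
broken = one_object(["1", "a", "b"], {("a", "a"): "b", ("a", "b"): "a",
                                      ("b", "a"): "b", ("b", "b"): "b"})

print(is_category(*idempotent), is_category(*involution), is_category(*broken))
# True True False
```

The first two examples share a graph but differ in composition, so the composition is added structure, and in both the loop `e` is not the identity. In the third, composing `a` with itself three times gives `a` or `b` depending on the bracketing, so associativity fails even though the identity laws hold.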

Introduce more examples

Then the teacher can go on with the examples that motivated categories in the first place: the big deal categories such as sets, groups and topological spaces.   But they can be introduced using special cases so they don’t require much background.

  • Draw some finite sets and functions between them.  (As an exercise, get the students to find some finite sets and functions that make the picture a category with $f=kh$ as the composite and $f\neq g$.)
  • If the students have had calculus,  introduce them to the category whose objects are real finite nonempty intervals with continuous or differentiable mappings between them.  (Later you can prove that this category is a groupoid!)
  • Find all the groups on a two element set and figure out which maps preserve group multiplication.  (You don’t have to use the word “group”: you can simply show both of them, work out which maps preserve multiplication, and discover isomorphism!)  This introduces the idea of the arrows being structure-preserving maps. You can get more complicated and use semigroups as well.  If the students know Mathematica you could even do magmas.  Well, maybe not.
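The last exercise in the list is small enough to do by brute force. Here is a sketch in Python, assuming the two-element set is {0, 1} and writing the two group multiplications as tables; the names `xor`, `xnor`, `preserves` and `homs` are choices made for this example.

```python
from itertools import product

S = (0, 1)
# The two group structures on S, written as multiplication tables.
xor = {(a, b): a ^ b for a, b in product(S, repeat=2)}         # identity is 0
xnor = {(a, b): 1 - (a ^ b) for a, b in product(S, repeat=2)}  # identity is 1

def preserves(m, src_op, tgt_op):
    """True if the map m satisfies m(a*b) = m(a)*m(b) for all a, b in S."""
    return all(m[src_op[a, b]] == tgt_op[m[a], m[b]]
               for a, b in product(S, repeat=2))

# All four maps from S to S, each stored as a dict.
all_maps = [dict(zip(S, values)) for values in product(S, repeat=2)]

def homs(src_op, tgt_op):
    """The maps S -> S that preserve multiplication from src_op to tgt_op."""
    return [m for m in all_maps if preserves(m, src_op, tgt_op)]

print(homs(xor, xor))   # the constant map to 0 and the identity map
print(homs(xor, xnor))  # the constant map to 1 and the swap 0 <-> 1
```

The swap is a bijection that preserves multiplication from one structure to the other, so the two different groups on the same set turn out to be isomorphic, which is the discovery the exercise is aiming at.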

All this sounds like a project you could do with high school students.

Large and small

If all this were just a high school (or intro-to-math-for-math-majors) project you wouldn’t have to talk about large vs. small.  However, I have some ideas about approaching this topic.

In the first place, you can define category, or any other mathematical object that might involve a proper class, using the syntactic approach I described in Just-in-time foundations.  You don’t say “A category consists of a set of objects and a set of arrows such that …”.  Instead you say something like “A category $\mathcal{C}$ has objects $A,\,B,\,C\ldots$ such that…”.

This can be understood as meaning “For any $A$, the statement ‘$A$ is an object of $\mathcal{C}$’ is either true or false”, and so on.

This approach is used in the Wikibook on category theory.  (Note: this is a permanent link to the November 28 version of the section defining categories, which is mostly my work.  As always with Wikimedia things it may be entirely different when you read this.)

If I were dictator of the math world (not the same thing as dictator of MathWorld) I would want definitions written in this syntactic style.  The trouble is that mathematicians are now so used to mathematical objects having to be sets-with-structure that wording the definition as I did above may leave them feeling unmoored.  Yet the technique avoids having to mention large vs. small until a problem comes up. (In category theory it sometimes comes up when you want to quantify over all objects.)

The ideas outlined in this subsection could be a project for math majors.  You would have to introduce Russell’s Paradox.  But for an early-on intro to categories you could just use the syntactic wording and avoid large vs. small altogether.
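The syntactic wording has a loose analogue in programming, where a structure can be presented by predicates and operations without ever building a collection of all its objects. The Python sketch below is only an analogy under that assumption; the class `FiniteSets` and its method names are hypothetical, and an arrow is taken to be a triple (domain, codomain, mapping).

```python
class FiniteSets:
    """A category presented by predicates and operations, not by a set of objects."""

    def is_object(self, A):
        # "A is an object" is simply true or false; nothing is enumerated.
        return isinstance(A, frozenset)

    def is_arrow(self, f):
        # An arrow is a triple (domain, codomain, mapping) with mapping a dict.
        if not (isinstance(f, tuple) and len(f) == 3):
            return False
        dom, cod, mapping = f
        return (self.is_object(dom) and self.is_object(cod)
                and isinstance(mapping, dict)
                and set(mapping) == set(dom)
                and all(v in cod for v in mapping.values()))

    def identity(self, A):
        return (A, A, {x: x for x in A})

    def compose(self, g, f):
        """Return "g after f"; defined only when the codomain of f is the domain of g."""
        if f[1] != g[0]:
            raise ValueError("arrows are not composable")
        return (f[0], g[1], {x: g[2][f[2][x]] for x in f[0]})

C = FiniteSets()
A = frozenset({1, 2})
B = frozenset({"a"})
f = (A, B, {1: "a", 2: "a"})

print(C.is_object(A), C.is_arrow(f))         # True True
print(C.compose(f, C.identity(A)) == f)      # True
```

Nothing in the class asks how many objects there are, so the question of large versus small never arises until someone tries to quantify over all of them.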

 

http://en.wikibooks.org/w/index.php?title=Category_Theory/Categories&stableid=2221684
