Category Archives: math

Bugs in English and in math

Everyone knows that computer programs have bugs.  In fact, languages have bugs, too, although we don't usually call them that.  

Bugs in English 

  

Right

Q: "Should I turn left at the next corner?" A: "Right".  Probably most Americans who drive now know this bug.  The answer could mean "yes" or "turn right".  So we have to stop and think how to answer this question.  That makes it a bug.  

Too, two

Comment: "We will take Route 30".  Answer: "We will take Route 30 too".  This bug is probably responsible for the survival of the word "also".  

Note that unlike the case of "right", this is a bug only of spoken English.

Subject and predicate

In Comma rule found dysfunctional, I wrote about the problem that in formal English writing there is no way to indicate where the subject ends and the predicate begins.  This causes a problem reading complicated sentences with many clauses such as academic writing often uses.  Of course, one way around this is to write short, simple sentences!  (That sounds like the subject of a future blog…) 

Bugs in the symbolic language of math

  

Fractions

In both Excel and Mathematica, "1/2*3" means 3/2. Now, I would think "1/2a" means "1/(2a)", but younger mathematicians are taught PEMDAS (see Purplemath), which says that division and multiplication have the same precedence and operations are evaluated from left to right.  

 If in Mathematica you define a function f[a_] := 1/2a, f[3] evaluates to 3/2, so Mathematica (and most other computer languages) agree with PEMDAS. (Note: When you write 1/2a in a Mathematica notebook, it automatically puts a space between the 2 and the a, and space in Mathematica means times, so it does warn you.)

Nevertheless, my ancient education would lead me to write (1/2)a for that meaning.  This means I must learn to write 1/(2a) for the other meaning instead of 1/2a.  
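The left-to-right reading can be checked in any language that gives division and multiplication the same precedence. Here is a minimal sketch in Python (chosen only for illustration; the post itself talks about Excel and Mathematica), using exact fractions so that rounding does not cloud the comparison. The variable names are made up for this example.

```python
from fractions import Fraction

a = 3

# Division and multiplication share a precedence level and group left to right,
# so 1/2*a is read as (1/2)*a.
left_to_right = Fraction(1) / 2 * a

# The reading 1/(2a) has to be requested with explicit parentheses.
denominator_reading = Fraction(1) / (2 * a)

print(left_to_right)        # 3/2
print(denominator_reading)  # 1/6
```

Python does not allow implicit multiplication such as 2a, so the question of how juxtaposition binds never arises there; only the explicit form 1/2*a can be tested.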

Questions:

  • Did the language really change or was I always "doing it wrong"?  I would like to hear from other ancient mathematicians.  (But I don't know very many who would read blogs or Purplemath.)
  • Should such a phenomenon be called a bug? 

Repeated exponentiation

In Excel, "2^2^3" means $(2^2)^3$, in other words, 64.  In Mathematica, it means $2^{(2^3)}=2^8=256$.  My impression is that most mathematicians expect it to mean $2^{(2^3)}$.  
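As a point of comparison, Python groups repeated exponentiation from the right, the same way the post says Mathematica does. This short sketch shows both groupings side by side; the variable names are only illustrative.

```python
# The ** operator groups from the right, so this is 2 ** (2 ** 3).
tower = 2 ** 2 ** 3

# The left-grouped reading, which the post attributes to Excel, needs parentheses.
left_grouped = (2 ** 2) ** 3

print(tower)         # 256
print(left_grouped)  # 64
```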

References: This post in Walking Randomly, my post Mathematical Usage, and Wikipedia's article.  

Exponentiation on functions is ambiguous

If $f:\mathbb{R}\to\mathbb{R}$ is a function, $f^2(x)$ can mean either $f(f(x))$ or $f(x)f(x)$, and both usages are common.  You should tell your students about this because no one is ever going to make one of the usages go away.
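The two readings really are different functions. A small Python sketch, assuming the sample function $f(x)=x+1$ (any function would do; the helper names are invented for this example):

```python
def f(x):
    return x + 1

def iterate_twice(g):
    """Return the composite x -> g(g(x))."""
    return lambda x: g(g(x))

def pointwise_square(g):
    """Return the pointwise product x -> g(x) * g(x)."""
    return lambda x: g(x) * g(x)

print(iterate_twice(f)(3))     # 5
print(pointwise_square(f)(3))  # 16
```

Giving the two operations different names, as a programming language forces you to do, is exactly the disambiguation the notation $f^2$ lacks.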

A far worse catastrophe is the fact that in calculus books, $\sin^2x=(\sin\,x)(\sin\,x)$ but $\sin^{-1}x=\text{arcsin}\,x$.  I betcha (lived in Minnesota four years now) we could succeed with a campaign to convince calc book publishers to always write $(\sin\,x)^2$ and $\arcsin\,x$.  
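To see how far apart the readings are, here is a sketch in Python with an arbitrarily chosen sample value $x=0.5$. If the exponent in $\sin^{-1}x$ followed the same pattern as the exponent in $\sin^2x$, it would denote the reciprocal of the sine, which is not the arcsine.

```python
import math

x = 0.5  # arbitrary sample value in the domain of arcsin

sin_squared = math.sin(x) ** 2  # what calculus books mean by sin^2 x
arcsine = math.asin(x)          # what calculus books mean by sin^{-1} x
reciprocal = 1 / math.sin(x)    # what the sin^2 x pattern would suggest instead

# The arcsine undoes the sine; the reciprocal does not.
print(math.isclose(math.sin(arcsine), x))  # True
print(math.isclose(arcsine, reciprocal))   # False
```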

Bugs in the Mathematical Dialect of English

The mathematical dialect of English is what I call Mathematical English in the abstractmath website.  It is a different language from the symbolic language, which is not a dialect of English.

I have written about the problems with Mathematical English in a ridiculous number of places.  (See references in The Handbook of Mathematical Discourse).  It is normal for a dialect of a language to use words and grammatical structures that in the original language mean different things.  (See Dialects below).

Words with different meanings

  • A set is a group in standard English, but not in math English.  
  • The number 2+3i is a real number in standard English, but not in math English.  
  • And so on.

Use of adjectives and prefixes

  • A "noncommutative ring" has commutative addition.
  • A "semigroup" has a fully defined binary operation.

If, then

The bug that grabs math newbies by the throat and won't let go is the meaning of "If P, then Q".  

  • "If a number is divisible by 4, then it is even" in math dialect means a number not divisible by 4 might be even anyway.
  • "If you eat your broccoli you will get your dessert" in standard American Parental English does not mean you might get your dessert if you don't eat your broccoli.

And then there is the phenomenon of Vacuous Implication, which leaves students gasping and writhing.
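The mathematical reading of "If P, then Q" can be tabulated directly. The sketch below, in Python, treats it as material implication (the function name `implies` is made up here) and then checks the divisibility example over a finite range of integers chosen only for illustration.

```python
def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

# The full truth table. The two rows with p == False are the vacuous cases.
for p in (True, False):
    for q in (True, False):
        print(p, q, implies(p, q))

# "If n is divisible by 4, then n is even" holds for every n checked,
# including n = 6, which is even without being divisible by 4.
claim_holds = all(implies(n % 4 == 0, n % 2 == 0) for n in range(-100, 101))
print(claim_holds)  # True
```

The row where P is false and Q is true is the one that surprises people raised on Parental English: the implication counts as true even though the hypothesis failed.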

About "dialects"

Most Americans are not familiar with dialects in the sense I am using the word here, since the only really different dialects we have are Gullah and Hawaiian Pidgin, both of which are very hard to understand; although for example Appalachian English and African-American urban vernacular [1] are dialects of a milder sort.  I grew up in Savannah and heard diluted Gullah sometimes on the street (didn't understand much).  I am also rather familiar with Züritüütsch since we lived in Zürich for a year.   

What the rest of the world call dialects have many distinctive properties:

  • They have nonstandard pronunciation to the point where they are difficult to understand. 
  • They have differences in grammar.  (Both Gullah and especially Hawaiian Creole have differences in grammar from Standard English.) 
  • They have differences in vocabulary, enough sometimes to cause misunderstanding.

I grew up speaking an Atlanta dialect, which really did have differences in all those parameters.  But what people today call a Southern accent is really just an accent (minor variations in pronunciation), not a dialect.  

Hawaiian Creole, and possibly Gullah, but not the other dialects I mentioned, are singled out by linguists as creoles because they have been modified by heavy influence from another language.  Züritüütsch is not a creole, but it is quite difficult for native German-speakers to understand.  The Swiss situation particularly emphasizes the distinction between "dialect" and "accent".  The typical native of Zürich speaks Züritüütsch and also speaks standard German with a Swiss accent.  

Reference

[1] What Language Is (And What It Isn't and What It Could Be) by John H. McWhorter. Gotham, 2011.

 

 


An Introduction to Forms

In 2009, I wrote a sequence of posts on this blog explaining the concept of form that I introduced in [1].  I have now updated and combined them into an article [2].  The posts no longer exist on the blog. The article contains links to other papers on forms.

[1] A generalization of the concept of sketch, Theoretical Computer Science 70, 1990.

[2] An Introduction to forms.


Abstract objects

Some thoughts toward revising my article on mathematical objects.  

Mathematical objects are a kind of abstract object.  There are lots of abstract objects that are not mathematical objects.  For example, if you keep a calendar or schedule for appointments, that calendar is an abstract object.  (This example comes from [2]). 

It may be represented as a physical object or you may keep it entirely in your head.  I am not going to talk about the latter possibility, because I don't know what to say.

  1. If it is a paper calendar, that physical object represents the information that is contained in your calendar.  
  2. Same for a calendar on a computer, but that is stored as magnetic bits on a disk or in flash memory. A computer program (part of the operating system) is required to present it on the screen in such a way that you can read it.  Each time you open it, you get a new physical representation of the calendar.

Your brain contains a module (see [5], [7]) that interprets the representation in (1) or (2) and which has connections with other modules in your brain for dates, times, locations and whether the appointment is for a committee, a medical exam, or whatever.  

The calendar-interpreter module in your brain is necessary for the physical object to be a calendar.  The physical object is not in itself your calendar.  The calendar in this sense does not exist in the physical world.  It is abstract.  Since we think of it as a thing, it is an abstract object.

The abstract object "my calendar" affects the physical world (it causes you to go to the dentist next Tuesday).  The relation of the abstract object to the physical world is mediated by whatever physical object you call your calendar along with the modules in the brain that relate to it.  The modules in the brain are actions by physical objects, so this point of view does not involve Cartesian style dualism.

Note:  A module is a meme.  Are all memes modules?  This needs to be investigated.  Whatever they are, they exist as physical objects in people's brains.

Mathematical objects

A rigorous proof of a theorem about a mathematical object tends to refer to the object as if it were absolutely static and did not affect anything in the physical world.  I talked about this in [10], where I called it the dry bones representation of a mathematical object.  Mathematical objects don't have to be thought of this way, but (I suggest) what makes them mathematical objects is that they can be thought of in dry bones mode.  

If you use calculus to figure out how much fuel to use in a rocket to make it go a mile high, then actually use that amount in the rocket and send it off, your calculations have affected your physical actions, so you were thinking of the calculations as an abstract object.  But if you sit down to check your calculations, you concentrate on the steps one by one with the rules of algebra and calculus in mind.  You are looking at them as inert objects, like you would look at a bone of a dinosaur to see what species it belongs to. From that point of view your calculations form a mathematical object, because you are using the dry-bones approach.

Caveat

All this blather is about how you should think about mathematical objects.  It can be read as philosophy, but I have no intention of defending it as philosophy.  People learning abstract math at college level have a lot of trouble thinking about mathematical objects as objects, and my intention is to start clarifying some aspects of how you think about them in different circumstances.  (The operative word is "start" — there is a lot more to be said.)

About the exposition of this post (a commercial)

You will notice that I gave examples of abstract objects but did not define the word "abstract object".  I did the same with mathematical objects.  In both cases, I put the word "abstract object" or "mathematical object" in boldface at a suitable place in the exposition.

That is not the way it is done in math, where you usually make the definition of a word in a formal way, marking it as Definition, putting the word in bold or italics, and listing the attributes it must have.  I want to point out two things:

  • For the most part, that behavior is peculiar to mathematics.
  • This post is not a presentation of mathematical ideas.  

This gives me an opportunity for a commercial:  Read what we have written about definitions in References [1], [3] and [4].

References

  1. Atish Bagchi and Charles Wells, Varieties of Mathematical Prose, 1998.
  2. Reuben Hersh, What is mathematics, really? Oxford University Press, 1997
  3. Charles Wells, Handbook of Mathematical Discourse.
  4. Charles Wells, Mathematical objects in abstractmath.org
  5. Math and modules of the mind (previous post)
  6. Mathematical Concepts (previous post)
  7. Thinking about abstract math (previous post)
  8. Terrence W. Deacon, Incomplete Nature.  W. W. Norton, 2012. [I have read only a little of this book so far, but I think he is talking about abstract objects in the sense I have described above.]
  9. Gideon Rosen, Abstract Objects.  Stanford Encyclopedia of Philosophy.  http://plato.stanford.edu/entries/abstract-objects/
  10. Representations II: Dry Bones (previous post)


Whole numbers

Sue Van Hattum wrote in response to a recent post:

I’d like to know what you think of my ‘abuse of terminology’. I teach at a community college, and I sometimes use incorrect terms (and tell the students I’m doing so), because they feel more aligned with common sense.

To me, and to most students, the phrase “whole numbers” sounds like it refers to anything that doesn’t need fractions to represent it, and should include negative numbers. (It then, of course, would mean the same thing that the word integers does.) So I try to avoid the phrase, mostly. But I sometimes say we’ll use it with the common sense meaning, not the official math meaning.

Her comments brought up a couple of things I want to blather about.

Official meaning

There is no such thing as an "official math meaning".  Mathematical notation has no governing authority and research mathematicians are too ornery to go along with one anyway.  There is a good reason for that attitude:  Mathematical research constantly causes us to rethink the relationship among different mathematical ideas, which can make us want to use names that show our new view of the ideas.  An excellent example of that is the evolution of the concept of "function" over the past 150 years, traced in the Wikipedia article.

What some "authorities" say about "whole number":

  • MathWorld  says that "whole number" is used to mean any of these:  Any positive integer, any nonnegative integer or any integer.
  • Wikipedia also allows all three meanings.
  • Webster's New World dictionary (for which I have been a consultant, but they didn't ask me about whole numbers!) gives "any integer" as a second meaning.
  • American Heritage Dictionary gives "any integer" as the only meaning.
  • Someone stole my copy of Merriam Webster.

Common Sense Meaning

Mathematicians think about and talk about any particular kind of math object using images and metaphors.  Sometimes (not very often) the name they give to a math object embodies a metaphor.  Examples:

  • A complex number is usually notated using two real parameters, so it looks more complicated than a real number.
  • "Rings" were originally called that because the first examples were integers (mod n) for some positive integer, and you can think of them as going around a clock showing n hours.

Unfortunately, much of the time the name of a kind of object contains a suggestive metaphor that is bad,  meaning that it suggests an erroneous picture or idea of what the object is like.

  • A "group" ought to be a bunch of things.  In other words, the word ought to mean "set".
  • The word "line" suggests that it ought to be a row of points.  That suggests that each point on a line ought to have one next to it.  But that's not true on the "real line"!

Sue's idea that the "common sense" meaning of "whole number" is "integer" refers, I think, to the built-in metaphor of the phrase "whole number" (unbroken number).

I urge math teachers to do these things:

  • Explain to your students that the same math word or phrase can mean different things in different books.
  • Convince your  students to avoid being fooled by the common-sense (metaphorical meaning) of a mathematical phrase.

 


Two

The post Are these questions unambiguous? in the blog Explaining Mathematics concerns the funny way mathematicians use the number “two” (Note [3]).  This is discussed in Abstractmath.org, based on usage quotations (see Note [1]) in the Handbook of Mathematical Discourse. They are citations  54, 119, 220, 229, 260, 322, 323 and 338.  The list is in the online version of the Handbook (see Note [2]) which takes forever to load.  (There is a separate file for users of the paperback book but it is currently trashed.)

The usage quirk concerning “two” is exemplified by statements such as these:

  1. The sum of any two even integers is even.
  2. Courant gives Leibniz’ rule for finding the Nth derivative of the product of two functions.  (This is from Citation 323.)
  3. Are there two positive integers m and n, both greater than 1, satisfying mn=9? (This is from Explaining Mathematics.)

Statements 1 and 2 are of course true.  They are still true if the “two” things are the same.  Mathematicians generally assume that such a statement includes the case where the two things are the same.  If the case that they are the same is excluded, the statement becomes an unnecessarily weak assertion.

Statement 3, in my opinion, is badly written.  If the two positive integers have to be distinct, the answer is “no”.   I think any competent mathematical writer would write something like, “There are not two distinct integers m and n both greater than 1 for which mn = 9”.

It is fair to say that when mathematicians refer to “two integers” in statements like these, they are allowed to be the same.  If they can’t be the same for the sentence to remain true, they will (or at least should) insert a word such as “distinct”.
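Both points can be checked by brute force. The sketch below, in Python, tests Statement 1 on a small sample of even integers (the range is an arbitrary choice) with the two integers allowed to coincide, and then lists every pair that answers Statement 3.

```python
evens = range(-10, 11, 2)

# Statement 1, with the two integers allowed to be the same.
sum_is_even = all((a + b) % 2 == 0 for a in evens for b in evens)

# Statement 3: pairs (m, n), both greater than 1, with m * n == 9.
pairs = [(m, n) for m in range(2, 10) for n in range(2, 10) if m * n == 9]
distinct_pairs = [(m, n) for (m, n) in pairs if m != n]

print(sum_is_even)     # True
print(pairs)           # [(3, 3)]
print(distinct_pairs)  # []
```

So the answer to Statement 3 is "yes" under the usual mathematical reading of "two" and "no" if the integers must be distinct, which is why the word "distinct" has to be written out when it is intended.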

Of course, in some sentences the two integers can’t be the same because of some condition imposed in the context.  That doesn’t happen in the citations I have listed.  Maybe someone can contribute an example.

Notes

[1] In the Handbook, usage quotations are called “citations”.  It appears to me that the commonest name for citations among lexicographers is “usage quotations”, so I will start calling them that.

[2] I created the online version of the Handbook hastily in 2006.  It needs work, since it has TeX mistakes (which may irritate you but should not interfere with readability) and omits the quotations, illustrations, and some backlinks, including backlinks for the citations.  Some Day When I Get A Round Tuit…

[3] This funny property of “two” was discussed many years ago by Steenrod or Knuth or someone, and is mentioned in a paper by Susanna Epp, but I don’t currently have access to any of the references.

 


Abuse of notation

I have recently read the Wikipedia article on Abuse of Notation (this link is to the version of 29 December 2011, since I will eventually edit it).  The Handbook of Mathematical Discourse and abstractmath.org mention this idea briefly.  It is time to expand the abstractmath article and to redo parts of the Wikipedia article, which  contains some confusions.

This is a preliminary draft, part of which I’ll incorporate into abstractmath after you readers make insightful comments :).

The phrase “Abuse of Notation” is used in articles and books written by research mathematicians.  It is part of Mathematical English.  This post is about

  • What “abuse of notation” means in mathematical writing and conversation.
  • What it could be used to mean.
  • Mathematical usage in general.  I will discuss this point in the context of the particular phrase “abuse of notation”, not a bad way to talk about a subject.

Mathematical Usage

Sources

If I’m going to write about the usage of Mathematical English, I should ideally verify what I claim about the usage by finding citations for a claim: documented quotations that illustrate the usage.  This is the standard way to produce any dictionary.

There is no complete authoritative source for usage of words and phrases in Mathematical English (ME), or for that matter for usage in the Symbolic Language (SL).

  • The Oxford Concise Dictionary of Mathematics [2] covers technical terms and symbols used in school math and in much of undergraduate math, but not so much of research math.  It does not mention being based on citations and it hardly talks about usage at all, even for notorious student-confusing notations such as “$\sin^k x$”. But it appears quite accurate, with good explanations of the math it covers.
  • I wrote Handbook of Mathematical Discourse to stimulate investigations into mathematical usage.  It describes a good many usages in Mathematical English and the Symbolic Language, documented with citations of quotations, but is quite incomplete (as I said in its Introduction).  The Handbook has 428 citations for various usages.  (They are at the end of the on-line PDF version. They are not in the printed book, but are on the web with links to pages in the printed book.)
  • MathWorld has an extensive list of mathematical words, phrases and symbols, and accurate definitions or descriptions of them, even for a great many advanced research topics. It also frequently mentions usage (see formula and inverse sine), but does not give citations.
  • Wikipedia has the most complete set of definitions of mathematical objects that I know of.  The entries sometimes mention usage. I have not detected any entry that gives citations for usage.  Not that that should stop anyone from adding them.

Teaching mathematical usage

In explaining mathematical usage to students, particularly college-level or higher math students, you have choices:

  1. Tell them what you think the usage of a word, phrase, or symbol is, without researching citations.
  2. Tell them what you think the usage ought to be.
  3. Tell them what you think the usage is, supported by citations.

(1) has the problem that you can be wrong.  In fact when I worked on the Handbook I was amazed  at how wrong I could be in what the usage was, in spite of the fact that I had been thinking about usage in ME and SL since I first started teaching (and kept a folder of what I had noticed about various usages).  However,  professional mathematicians generally have a reasonably accurate idea about usage for most things, particularly in their field and in undergraduate courses.

(2) is dangerous.  Far too many mathematicians (but nevertheless a minority) introduce usage in articles and lecturing that is not common or that they invented themselves. As a result their students will be confused in trying to read other sources and may argue with other teachers about what is “correct”.  It is a gross violation of teaching ethics to tell the students that (for example) “x > 0” allows x = 0 and not mention to them that nearly all written mathematics does not allow that.  (Did you know that a small percentage of mathematicians and educators do use that meaning, including in some secondary institutions in some countries?  It is partly Bourbaki’s fault.)

(3) You often can’t tell them what the usage is, supported by citations, because, as mentioned above, documented mathematical usage is sparse.

I think people should usually choose (1) instead of (2).  If they do want to introduce a new usage or notation because it is “more logical” or because “my thesis advisor used it” or something, they should reconsider.  Most such attempts have failed, and thousands of students have been confused by the attempts.

Abuse of notation

“Abuse of notation” is a phrase used in mathematical writing to describe terminology and notation that does not have transparent meaning. (Transparent meaning is described in some detail under “compositional” in the Handbook.)

Abuse of notation was originally defined in French, where the word “abus” does not carry the same strongly negative connotation that it does in English.

Suppression of parameters

One widely noticed practice called “abuse of notation”  is the use of the name of the underlying set of a mathematical structure to refer to the structure itself. For example, a group is a structure $(G,\text{*})$ where $G$ is a set and * is a binary operation with certain properties. The most common way to refer to this structure is simply to call it $G$. Since any set of cardinality greater than 1 has more than one group structure on it, this does not include all the information needed to determine the group. This type of usage is illustrated by citation 82 below.  It is an example of suppression of parameters.
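The claim about multiple group structures can be verified by exhaustive search in the smallest case. This Python sketch enumerates all 16 binary operations on the two-element set {0, 1} and keeps those satisfying the group axioms; the function `is_group` and the table representation are choices made for this example.

```python
from itertools import product

S = (0, 1)

def is_group(op):
    """op maps each pair (a, b) of elements of S to an element of S."""
    associative = all(
        op[op[a, b], c] == op[a, op[b, c]] for a, b, c in product(S, repeat=3)
    )
    identities = [e for e in S if all(op[e, a] == a == op[a, e] for a in S)]
    if not (associative and identities):
        return False
    e = identities[0]
    # Every element needs a two-sided inverse.
    return all(any(op[a, b] == e == op[b, a] for b in S) for a in S)

pairs = list(product(S, repeat=2))
tables = [dict(zip(pairs, values)) for values in product(S, repeat=len(pairs))]
group_tables = [t for t in tables if is_group(t)]

print(len(tables))        # 16
print(len(group_tables))  # 2
```

The two surviving tables have different identity elements (0 in one, 1 in the other), so naming the set alone does not say which group is meant.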

Writing “$\log x$” without mentioning the base of the logarithm is also an example of suppression of parameters.  I think most mathematicians would regard this as a convention rather than as an abuse of notation.  But I have no citations for this (although they would probably be easy to find).  I doubt that it is possible to find a rational distinction between “abuse of notation” and “convention”; it is all a matter of what people are used to saying.

Synecdoche

The naming of a structure by using the name of its underlying set is also an example of synecdoche, the naming of a whole by a part (for example, “wheels” to mean a car).

Another type of synecdoche that has been called abuse of notation is referring to an equivalence class by naming one of its elements.  I do not have a good quotation-citation that shows this use.  Sometimes people write 2 + 4 = 1 when they are working in the Galois field with 5 elements.  But that can be interpreted in more than one way.  If GF[5] consists of equivalence classes of integers (mod 5) then they are indeed using 2 (for example) to stand for the equivalence class of 2.  But they could instead define GF[5] in the obvious way with underlying set {0,1,2,3,4}.  In any case, making distinctions of that sort is pedantic, since the two structures are related by a natural isomorphism (see the next section!).
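The two readings of 2 + 4 = 1 can be put side by side. In this Python sketch the helper `residue_class` is invented for illustration, and since a full equivalence class is infinite it lists only the members that fall in a finite window.

```python
P = 5

# Reading 1: the elements are the integers 0..4, with addition reduced mod 5.
print((2 + 4) % P)  # 1

# Reading 2: the elements are equivalence classes of integers mod 5,
# and "2" is shorthand for the class containing 2.
def residue_class(n, window=range(-10, 11)):
    """The members of the class of n (mod P) that fall in a finite window."""
    return frozenset(k for k in window if (k - n) % P == 0)

print(sorted(residue_class(1)))                  # [-9, -4, 1, 6]
print(residue_class(2 + 4) == residue_class(1))  # True
```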

Identifying objects via isomorphism

This is quite commonly called “abuse of notation” and is exemplified in citations 209, 395 and AB3.

Overloaded notation

John Harrison, in [1], uses “abuse of notation” to describe the use of a function symbol to apply to both an element of its domain and a subset of the domain.  This is an example of overloaded notation.  I have not found another citation for this usage other than Harrison and I don’t remember anyone using it.  Another example of overloaded notation is the use of the same symbol “$\times$” for multiplication of numbers, matrices and 3-vectors.  I have never heard that called abuse of notation.  But I have no authority to say anything about this usage because I haven’t made the requisite thorough search of the literature.
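In a programming language the two uses of the function symbol usually get separate names, which makes the overloading visible. A minimal Python sketch, assuming the squaring function as the example and a made-up helper called `image`:

```python
def f(x):
    return x * x

def image(g, subset):
    """The set {g(a) : a in subset}, which mathematical writing often denotes g(subset)."""
    return {g(a) for a in subset}

print(f(-2))                             # 4
print(sorted(image(f, {-2, -1, 1, 2})))  # [1, 4]
```

Writing $f(A)$ for `image(f, A)` reuses the symbol $f$ for a different function, one whose inputs are subsets rather than elements.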

Powers of functions

The Wikipedia article on abuse of notation (29 Dec 2011 version) mentions the fact that $f^2(x)$ can mean either $f(x)f(x)$ or $f(f(x))$.   I have never heard this called abuse of notation and I don’t think it should be called that.  The notation “$f^2(x)$” can in ordinary usage mean one of two things and the author or teacher should say which one they mean.  Many math phrases or symbolic expressions  can mean more than one thing and the author generally should say which.  I don’t see the point of calling this phenomenon abuse of notation.

Radial concept

The Wikipedia article mentions phrases such as “partial function”.  That article does provide a citation to Bourbaki for calling a sentence such as “Let $f:A\to B$ be a partial function” abuse of notation.  Bourbaki is wrong in a deep sense (as the article implies).  There are several points to make about this:

  • Some authors, particularly in logic, define a function to be what most of us call a partial function.  Some authors  require a ring to have a unit and others don’t.  So what?
  • The phrase “partial function” has a standard meaning in math:  Roughly “it is a function except it is defined on only part of its domain”.  Precisely, $f:A\to B$ is a partial function if it is a function $f:A'\to B$ for some subset $A'$ of $A$.
  • A partial function is not in general a function.  A stepmother is not a mother.  A left identity may not be an identity, but the phrase “left identity” is defined precisely.   An incomplete proof is not a proof, but you know what the phrase means! (Compare “expectant mother”).   This is the way we normally talk and think.  See the article “radial concept” in the Handbook.
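The precise definition of partial function above translates directly into a data structure. In this Python sketch a dict plays the role of the function $f:A'\to B$, with its keys forming the subset $A'$; the set `A` and the reciprocal example are assumptions chosen for illustration.

```python
A = {-2, -1, 0, 1, 2}

# A partial function A -> B given as a dict whose keys form a subset A' of A.
# The reciprocal is undefined at 0, so 0 is left out of the keys.
reciprocal = {a: 1 / a for a in A if a != 0}

domain_of_definition = set(reciprocal)
is_total = domain_of_definition == A

print(sorted(A - domain_of_definition))  # [0]
print(is_total)                          # False
```

The dict is a perfectly good function on its own key set, and it fails to be a function on all of `A`, which is the sense in which a partial function is not in general a function.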

Other uses

AB4 involves a redefinition of  “$\in$” in a special case.  Authors redefine symbols all the time.  This kind of redefinition on the fly probably should be avoided, but since they did it I am glad they mentioned it.

I have not talked about some of the uses mentioned in the Wikipedia article because I don’t yet understand them well enough.  AB1 and AB2 refer to a common use with pullback that I am not sure I understand (in terms of how the author is thinking of it).  I also don’t understand AB5.  Suggestions from readers would be appreciated.

Kill it!

Well, it’s more polite to say, we don’t need the phrase “abuse of notation” and it should be deprecated.

  • The use of the word “abuse” makes it sound like a bad thing, and most instances of abuse of notation are nothing of the sort.  They make mathematical writing much more readable.
  • Nearly everywhere it is used it could just as well be called a convention.  (This requires verification by studying math texts.)

Citations

The first three citations are in the Handbook list; the numbers refer to that list’s numbering. The others I searched out for the purpose of this post.

82. Busenberg, S., D. C. Fisher, and M. Martelli (1989), Minimal periods of discrete and smooth orbits. American Mathematical Monthly, volume 96, pages 5–17. [p. 8. Lines 2–4.]

Therefore, a normed linear space is really a pair $(\mathbf{E},\|\cdot\|)$ where $\mathbf{E}$ is a linear vector space and $\|\cdot\|:\mathbf{E}\to(0,\infty)$ is a norm. In speaking of normed spaces, we will frequently abuse this notation and write $\mathbf{E}$ instead of the pair $(\mathbf{E},\|\cdot\|)$.

209. Hunter, T. J. (1996), On the homology spectral sequence for topological Hochschild homology. Transactions of the American Mathematical Society, volume 348, pages 3941–3953. [p. 3934. Lines 8–6 from bottom.]

We will often abuse notation by omitting mention of the natural isomorphisms making $\wedge$ associative and unital.

395. Teitelbaum, J. T. (1991), ‘The Poisson kernel for Drinfeld modular curves’. Journal of the American Mathematical Society, volume 4, pages 491–511. [p. 494. Lines 1–4.]

… may find a homeomorphism $x:E\to \mathbb{P}^1_k$ such that $\displaystyle x(\gamma u) = \frac{ax(u)+b}{cx(u)+d}$. We will tend to abuse notation and identify $E$ with $\mathbb{P}^1_k$ by means of the function $x$.

AB1. Fujita, T. On the structure of polarized manifolds with total deficiency one.  I. J. Math. Soc. Japan, Vol. 32, No. 4, 1980.

Here we show examples of symbols used in this paper …

$L_{T}$: The pull back of $L$ to a space $T$ by a given morphism $T\rightarrow S$. However, when there is no danger of confusion, we OFTEN write $L$ instead of $L_T$ by abuse of notation.

AB2. Sternberg, S. Minimal coupling and the symplectic mechanics of a classical particle in the presence of a Yang-Mills field. Physics, Vol. 74, No. 12, pp. 5253-5254, December 1977.

On the other hand, let us, by abuse of notation, continue to write $\Omega$ for the pullback of $\Omega$ from $F$ to $P \times F$ by projection onto the second factor. Thus, we can write $\xi_Q\rfloor\Omega = \xi_F\rfloor\Omega$ and …

AB3. Dobson, D, and Vogel, C. Convergence of an iterative method for total variation denoising. SIAM J. Numer. Anal., Vol. 34, pp. 1779, October, 1997.

Consider the approximation

(3.7) $u\approx U\stackrel{\text{def}}{=}\sum_{j=1}^N U_j\phi_j$ …

In an abuse of notation, $U$ will represent both the coefficient vector $\{U_j\}_{j=1}^N$ and the corresponding linear combination (3.7).

AB4. Lewis, R, and Torczon, V. Pattern search algorithms for bound constrained minimization.  NASA Contractor Report 198306; ICASE Report No. 96-20.

By abuse of notation, if $A$ is a matrix, $y\in A$ means that the vector $y$ is a column of $A$.

AB5. Allemandi, G, Borowiecz, A. and Francaviglia, M. Accelerated Cosmological Models in Ricci squared Gravity. ArXiv:hep-th/0407090v2, 2008.

This allows to reinterpret both $f(S)$ and $f'(S)$ as functions of $\tau$ in the expressions:
\begin{equation*}\begin{cases}  f(S) = f(F(\tau)) = f(\tau )\\  f'(S) = f'(F(\tau )) = f'(\tau )\end{cases}\end{equation*}
following the abuse of notation $f(F(t )) = f(t )$ and $f'(F(t )) = f'(t )$.

References

[1] Harrison, J. Criticism and reconstruction, in Formalized Mathematics (1996).

[2] Clapham, C. and J. Nicholson.  Oxford Concise Dictionary of Mathematics, Fourth Edition (2009).  Oxford University Press.

 


More about defining “category”


In a recent post, I wrote about defining “category” in a way that (I hope) makes it accessible to undergraduate math majors at an early stage.  I have several more things to say about this.

Early intro to categories

The idea is to define a category as a directed graph equipped with an additional structure of composition of paths subject to some axioms.  Giving several small finite examples of categories drawn in that way provides an understanding of “category” that has several desirable properties:

  • You get the idea of what a category is in one lecture.
  • With the right choice of examples you get several fine points cleared up:
    • The composition is added structure.
    • A loop doesn’t have to be an identity.
    • Associativity is a genuine requirement —  it is not automatic.
  • You get immediate access to what is by far the most common notation used to work with a category — objects (nodes) and arrows.
  • You don’t have to cope with the difficult chunking required when the first examples given are sets-with-structure and structure-preserving functions.  It’s quite hard to focus on a couple of dots on the paper each representing a group or a topological space and arrows each representing a whole function (not the value of the function!).

Introduce more examples

Then the teacher can go on with the examples that motivated categories in the first place: the big deal categories such as sets, groups and topological spaces.   But they can be introduced using special cases so they don’t require much background.

  • Draw some finite sets and functions between them.  (As an exercise, get the students to find some finite sets and functions that make the picture a category with $f=kh$ as the composite and $f\neq g$.)
  • If the students have had calculus,  introduce them to the category whose objects are real finite nonempty intervals with continuous or differentiable mappings between them.  (Later you can prove that this category is a groupoid!)
  • Find all the groups on a two-element set and figure out which maps preserve group multiplication.  (You don’t have to use the word “group”: you can simply show both of them, work out which maps preserve multiplication, and discover isomorphism!)  This introduces the idea of the arrows being structure-preserving maps. You can get more complicated and use semigroups as well.  If the students know Mathematica you could even do magmas.  Well, maybe not.
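The two-element exercise is small enough to check by brute force. The following is only a sketch, written in Python rather than Mathematica; the names `ELEMENTS`, `all_operations`, `is_group` and `multiplication_preserving_maps` are made up for this illustration. It tries all 16 binary operations on the set {0, 1}, keeps the ones that are groups, and lists the maps that preserve multiplication between them.

```python
from itertools import product

ELEMENTS = (0, 1)  # hypothetical names for the two elements

def all_operations():
    """Yield every binary operation on ELEMENTS as a dict {(a, b): a*b}."""
    pairs = list(product(ELEMENTS, repeat=2))
    for values in product(ELEMENTS, repeat=len(pairs)):
        yield dict(zip(pairs, values))

def is_group(op):
    """Check associativity, a two-sided identity, and inverses by brute force."""
    associative = all(
        op[op[a, b], c] == op[a, op[b, c]]
        for a, b, c in product(ELEMENTS, repeat=3)
    )
    identities = [
        e for e in ELEMENTS
        if all(op[e, a] == a and op[a, e] == a for a in ELEMENTS)
    ]
    if not (associative and identities):
        return False
    e = identities[0]
    return all(
        any(op[a, b] == e and op[b, a] == e for b in ELEMENTS)
        for a in ELEMENTS
    )

def multiplication_preserving_maps(op1, op2):
    """All maps phi on ELEMENTS with phi(a*b) = phi(a)*phi(b)."""
    maps = []
    for images in product(ELEMENTS, repeat=len(ELEMENTS)):
        phi = dict(zip(ELEMENTS, images))
        if all(phi[op1[a, b]] == op2[phi[a], phi[b]]
               for a, b in product(ELEMENTS, repeat=2)):
            maps.append(phi)
    return maps

groups = [op for op in all_operations() if is_group(op)]
print(f"{len(groups)} group structures on {set(ELEMENTS)}")
for (i, g1), (j, g2) in product(enumerate(groups), repeat=2):
    maps = multiplication_preserving_maps(g1, g2)
    isos = [phi for phi in maps if len(set(phi.values())) == len(ELEMENTS)]
    print(f"group {i} -> group {j}: {len(maps)} maps, {len(isos)} bijective")
```

The bijective maps between the two different tables are what the students would discover as isomorphisms. Swapping `is_group` for a weaker test (associativity only) would give the semigroup version of the exercise.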

All this sounds like a project you could do with high school students.

Large and small

If all this were just a high school (or intro-to-math-for-math-majors) project you wouldn’t have to talk about large vs. small.  However, I have some ideas about approaching this topic.

In the first place, you can define category, or any other mathematical object that might involve a proper class, using the syntactic approach I described in Just-in-time foundations.  You don’t say “A category consists of a set of objects and a set of arrows such that …”.  Instead you say something like “A category $\mathcal{C}$ has objects $A,\,B,\,C\ldots$ such that…”.

This can be understood as meaning “For any $A$, the statement ‘$A$ is an object of $\mathcal{C}$’ is either true or false”, and so on.

This approach is used in the Wikibook on category theory.  (Note: this is a permanent link to the November 28 version of the section defining categories, which is mostly my work.  As always with Wikimedia things it may be entirely different when you read this.)
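A loose programming analogy for the syntactic wording, offered only as a sketch: the category is described by tests that answer true or false, and no collection of all objects is ever built. Everything here is an assumption made for illustration, including the choice of finite sets as objects and the names `is_object`, `is_arrow`, `identity` and `compose`.

```python
def is_object(x):
    """'x is an object' is either true or false; nothing is enumerated."""
    return isinstance(x, frozenset)

def is_arrow(f):
    """An arrow is a triple (domain, mapping, codomain) with a total mapping."""
    if not (isinstance(f, tuple) and len(f) == 3):
        return False
    dom, mapping, cod = f
    return (
        is_object(dom) and is_object(cod) and isinstance(mapping, dict)
        and set(mapping) == set(dom)
        and all(value in cod for value in mapping.values())
    )

def identity(obj):
    """The identity arrow on an object."""
    return (obj, {x: x for x in obj}, obj)

def compose(g, f):
    """Return g∘f; defined only when the codomain of f is the domain of g."""
    if not (is_arrow(f) and is_arrow(g)) or f[2] != g[0]:
        raise ValueError("not a composable pair")
    return (f[0], {x: g[1][f[1][x]] for x in f[0]}, g[2])

A = frozenset({1, 2})
B = frozenset({"x"})
f = (A, {1: "x", 2: "x"}, B)
print(is_arrow(f), compose(identity(B), f) == f, compose(f, identity(A)) == f)
```

The analogy is imperfect, since Python values do not form a proper class in the set-theoretic sense, but it shows how a definition can be used without ever asking how big the collection of objects is.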

If I were dictator of the math world (not the same thing as dictator of MathWorld) I would want definitions written in this syntactic style.  The trouble is that mathematicians are now so used to mathematical objects having to be sets-with-structure that wording the definition as I did above may leave them feeling unmoored.  Yet the technique avoids having to mention large vs. small until a problem comes up. (In category theory it sometimes comes up when you want to quantify over all objects.)

The ideas outlined in this subsection could be a project for math majors.  You would have to introduce Russell’s Paradox.  But for an early-on intro to categories you could just use the syntactic wording and avoid large vs. small altogether.

 

http://en.wikibooks.org/w/index.php?title=Category_Theory/Categories&stableid=2221684


Defining “category”

The concept of category is typically taught later in undergrad math than the concept of group is.  It is supposedly a more advanced concept.  Indeed, the typical examples of categories used in applications are more advanced than some of those in group theory (for example, symmetries of geometric shapes and operations on numbers).

Here are some thoughts on how categories could be taught as early as groups, if not earlier.

Nodes and arrows

Small finite categories can be pictured as a graph using nodes and arrows, together with a specification of the identity arrows and a definition of the composition.  (I am using the word “graph” the way category people use it:  a directed graph with possible multiple edges and loops.)

An example is the category pictured below with three objects and seven arrows. The composition is forced except for $kh$, which I hereby define to be $f$.

This way of picturing a category is  easy to grasp. The composite $kh$ visibly has to be either $f$ or $g$.  There is only one choice for the composite of any other composable pair.  Still, the choice of composite is not deducible directly by looking at the graph.
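The claim that either choice of $kh$ works can be checked mechanically. The sketch below assumes a particular shape for the pictured graph, which is not reproduced in the text: objects A, B, C, arrows $h: A\to B$, $k: B\to C$, $f, g: A\to C$, and three identity arrows. That shape fits the description (three objects, seven arrows, only $kh$ free), but it is an assumption, and the function names are invented for this example.

```python
from itertools import product

# Assumed shape of the pictured graph: arrow name -> (domain, codomain).
ARROWS = {
    "iA": ("A", "A"), "iB": ("B", "B"), "iC": ("C", "C"),
    "h": ("A", "B"), "k": ("B", "C"),
    "f": ("A", "C"), "g": ("A", "C"),
}
IDENTITY = {"A": "iA", "B": "iB", "C": "iC"}

def composable(second, first):
    """second∘first is defined when cod(first) == dom(second)."""
    return ARROWS[first][1] == ARROWS[second][0]

def composition_table(kh):
    """Fill in the forced composites, then the one free choice for k∘h."""
    table = {}
    for second, first in product(ARROWS, repeat=2):
        if not composable(second, first):
            continue
        if first == IDENTITY[ARROWS[first][0]]:       # x∘id = x
            table[second, first] = second
        elif second == IDENTITY[ARROWS[second][1]]:   # id∘x = x
            table[second, first] = first
    table["k", "h"] = kh
    return table

def is_category(table):
    """Check typing of composites, unit laws, and associativity."""
    for second, first in product(ARROWS, repeat=2):
        if composable(second, first):
            comp = table.get((second, first))
            expected = (ARROWS[first][0], ARROWS[second][1])
            if comp is None or ARROWS[comp] != expected:
                return False
    for x, (dom, cod) in ARROWS.items():
        if table[x, IDENTITY[dom]] != x or table[IDENTITY[cod], x] != x:
            return False
    for c, b, a in product(ARROWS, repeat=3):
        if composable(b, a) and composable(c, b):
            if table[c, table[b, a]] != table[table[c, b], a]:
                return False
    return True

for choice in ("f", "g", "h"):
    print(f"kh = {choice}: category? {is_category(composition_table(choice))}")
```

Under these assumptions, both $kh=f$ and $kh=g$ pass, while a choice with the wrong domain or codomain (such as $h$) is rejected at the typing step.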

A first class in category theory using graphs as examples could start with this example, or the example in Note 1 below.  This example is nontrivial (never start any subject with trivial examples!) and easy to grasp, in this case using the extraordinary preprocessing your brain does with the input from your eyes.  The definition of category is complicated enough that you should probably present the graph and then give the definition while pointing to what each clause says about the graph.

Most abstract structures have several different ways of representing them. In contrast, when you discuss categorial concepts the standard object-and-arrow notation is the overwhelming favorite.  It reveals domains and codomains and composable pairs, in fact almost everything except which of several possible arrows the composite actually is.  If for example you try to define category using sets and functions as your running example, the student has to do a lot of on-the-go chunking — thinking of a set as a single object, of a set function (which may involve lots of complicated data) as a single chunk with a domain and a codomain, and so on.  But an example shown as a graph comes already chunked and in a picture that is guaranteed to be the most common kind of display they will see in discussions of categories.

After you do these examples, you can introduce trivial and simple graph examples in which the composition is entirely induced; for example these three:

(In case you are wondering, one of them is the empty category.)  I expect that you should also introduce another graph non-example in which associativity fails.
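One possible non-example, chosen here purely for illustration and not taken from the original pictures: a single node with three loops e, a, b, where e acts as the identity and the other composites are assigned by hand. A brute-force search (the helper `associativity_failures` is a hypothetical name) finds the triples where associativity breaks.

```python
from itertools import product

# One node with three loops; "e" is the designated identity loop.
LOOPS = ("e", "a", "b")

# COMPOSE[x, y] is the composite "x after y". Every pair of loops is composable.
COMPOSE = {("a", "a"): "a", ("a", "b"): "b", ("b", "a"): "a", ("b", "b"): "a"}
for x in LOOPS:
    COMPOSE[("e", x)] = x
    COMPOSE[(x, "e")] = x

def associativity_failures(loops, compose):
    """Return every triple (x, y, z) with (x∘y)∘z != x∘(y∘z)."""
    return [
        (x, y, z)
        for x, y, z in product(loops, repeat=3)
        if compose[compose[x, y], z] != compose[x, compose[y, z]]
    ]

failures = associativity_failures(LOOPS, COMPOSE)
for x, y, z in failures:
    left = COMPOSE[COMPOSE[x, y], z]
    right = COMPOSE[x, COMPOSE[y, z]]
    print(f"({x}{y}){z} = {left}  but  {x}({y}{z}) = {right}")
```

The graph and the identity loop look perfectly respectable; only the composition table gives the failure away, which is the point of showing such a non-example.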

Multiplication tables

The multiplication table for a group is easy to understand, too, in the sense that it gives you a simple method of calculating the product of any two elements.  But it doesn’t provide a visual way to see the product as a category-as-graph does.  Of course, the graph representation works only for finite categories, just as the multiplication table works only for finite groups.

You can give a multiplication table for a small finite category, too, like the one below for the category above.  (“iA” means the identity arrow on A and composition, as usual in category theory, is right to left.) This is certainly more abstract than the graph picture, but it does hit you in the face with the fact that the multiplication is partial.
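A table of this kind can be generated rather than typed. The sketch below again assumes the shape of the graph (objects A, B, C; arrows $h: A\to B$, $k: B\to C$, $f, g: A\to C$; identities iA, iB, iC), since the picture is not reproduced in the text, and it takes $kh=f$ as defined above. Rows are the second arrow applied, columns the first, and a dot marks a pair that is not composable.

```python
# Assumed graph (the picture is not reproduced here): name -> (domain, codomain).
ARROWS = {
    "iA": ("A", "A"), "iB": ("B", "B"), "iC": ("C", "C"),
    "h": ("A", "B"), "k": ("B", "C"),
    "f": ("A", "C"), "g": ("A", "C"),
}
IDENTITIES = {"iA", "iB", "iC"}

def compose(second, first):
    """Return second∘first (right to left), or None if not composable."""
    if ARROWS[first][1] != ARROWS[second][0]:
        return None
    if first in IDENTITIES:
        return second
    if second in IDENTITIES:
        return first
    return "f"  # the only remaining composable pair is (k, h), defined to be f

def table_lines():
    """Build the table as text: one header line, then one line per arrow."""
    names = list(ARROWS)
    lines = ["    " + " ".join(f"{n:>2}" for n in names)]
    for second in names:
        cells = [compose(second, first) or "." for first in names]
        lines.append(f"{second:>2}  " + " ".join(f"{c:>2}" for c in cells))
    return lines

print("\n".join(table_lines()))
```

Most of the entries come out as dots, which is the partiality of the multiplication made visible.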

Notes

1. My suggested example of a category given as a graph shows clearly that you can define two different categorial structures on the graph.  One problem is that the two different structures are isomorphic categories.  In fact, if you engage the students in a discussion about these examples someone may notice that!  So you should probably also use the graph below, where you can define several different category structures that are not all isomorphic. 

2. Multiplication tables and categories-as-graphs-with-composition are extensional presentations.  This means they are presented with all their parts laid out in front of you.  Most groups and categories are given by definitions as accumulations of properties (see concept in the Handbook of Mathematical Discourse).  These definitions tend to make some requirements such as associativity obvious.

Students are sometimes bothered by extensional definitions.  “What are h and k (in the category above)?  What are a, b and c?” (in a group given as a set of letters and a multiplication table).


Definition of “function”

I have made a major revision of the abstractmath.org article Functions: Specification and Definition.   The links from the revised article lead into the main abstractmath website, but links from other articles on the website still go back to the old version. So if you click on a link in the revised version, make it come up in a new window.

I expect to link the revision in after I make a few small changes, and I will take into account any comments from you all.

Remarks

1.  You will notice that the new version is in PDF instead of HTML.  A couple of other articles on the website are already in PDF, but I don’t expect to continue replacing HTML by PDF.   It is too much work.  Besides, you can’t shrink it to fit tablets.

2. It would also have been a lot of work to adapt the revision so that I could display it directly on WordPress.  In some cases I have written revisions first in WP and then posted them on the abmath website.  That is not so difficult and I expect to do it again.


Freezing a family of functions

The interactive examples in this post require installing Wolfram CDF player, which is free and works on most desktop computers using Firefox, Safari and Internet Explorer, but not Chrome. The source code is the Mathematica Notebook algebra1.nb, which is available for free use under a Creative Commons Attribution-ShareAlike 2.5 License. The notebook can be read by CDF Player if you cannot make the embedded versions in this post work.

Some background

  • Generally, I have advocated using all sorts of images and metaphors to enable people to think about particular mathematical objects more easily.
  • In previous posts I have illustrated many ways (some old, some new, many recently using Mathematica CDF files) that you can provide such images and metaphors, to help university math majors get over the abstraction cliff.
  • When you have to prove something you find yourself throwing out the images and metaphors (usually a bit at a time rather than all at once) to get down to the rigorous view of math [1], [2], [3], to the point where you think of all the mathematical objects you are dealing with as unchanging and inert (not reacting to anything else).  In other words, dead.
  • The simple example of a family of functions in this post is intended to give people a way of thinking about getting into the rigorous view of the family.  So this post uses image-and-metaphor technology to illustrate a way of thinking about one of the basic proof techniques in math (representing the object in rigor mortis so you can dissect it).  I suppose this is meta-math-ed.  But I don’t want to think about that too much…
  • This example also illustrates the difference between parameters and variables. The bottom line is that the difference is entirely in how we think about them. I will write more about that later.

 A family of functions

This graph shows individual members of the family of functions \( y=a\sin\,x\) for various values of a. Let’s look at some of the ways you can think about this.

  • Each choice of a “shows the function for that value of the parameter a”.  But really, it shows the graph of the function, and in fact only the part between x=-4 and x=4.
  • You can also think of it as showing the function changing shape as a changes over time (as you slide the controller back and forth).

Well, you can graph something changing over time by introducing another axis for time.  When you graph vertical motion of a particle over time you use a two-dimensional picture, one axis representing time and the other the height of the particle. Our representation of the function \( y=a\sin\,x\) is a two-dimensional object (using its graph) so we represent the function in 3-space, as in this picture, where the slider not only shows the current (graph of the) function for parameter value a but also locates it over a on the z axis.

The picture below shows the surface given by \( y=a\sin\,x\) as a function of both variables a and x. Note that this graph is static: it does not change over time (no slide bar!). This is the family of functions represented as a rigorous (dead!) mathematical object.

If you click the “Show Curves” button, you will see a selection of the curves in the middle diagram above drawn as functions of x for certain values of a. Each blue curve is thus a sine wave of amplitude a. Pushing that button illustrates the process going on in your mind when you concentrate on one aspect of the surface, namely its cross-sections in the x direction.
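The same two views can be written down in code. This is only a sketch in Python, not the Mathematica code behind the diagrams, and the names `surface` and `member` are invented for it. The frozen family is one function of two variables, and freezing a value of a picks out one cross-section, which is an ordinary function of x.

```python
import math

def surface(a, x):
    """The whole family at once: a single function of the two variables a and x."""
    return a * math.sin(x)

def member(a):
    """Freeze the parameter a; what is left is an ordinary function of x."""
    return lambda x: surface(a, x)

# Cross-sections of the surface for a few parameter values, sampled on [-4, 4].
xs = [-4 + 0.5 * i for i in range(17)]
curves = {a: [member(a)(x) for x in xs] for a in (-2.0, 0.5, 1.0, 2.0)}

for a, ys in curves.items():
    biggest = max(abs(y) for y in ys)
    print(f"a = {a:4}: max |y| on the grid = {biggest:.3f}")
```

Nothing in the code distinguishes a from x except the order in which they are supplied, which matches the remark that the difference between parameters and variables lies in how we think about them.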

Reference [4] gives the code for the diagrams in this post, as well as a couple of others that may add more insight to the idea. Reference [5] gives similar constructions for a different family of functions.

References

  1. Rigorous view in abstractmath.org 
  2. Representations II: Dry Bones (post)
  3. Representations III: Rigor and Rigor Mortis (post)
  4. FamiliesFrozen.nb.
  5. AnotherFamiliesFrozen.nb (Mathematica file showing another family of functions)