All posts by SixWingedSeraph

Charles Wells. Professor Emeritus at Case Western Reserve University. Now living in Minneapolis, Minnesota, USA. CWRU provides support for this blog with software and library privileges.

Two

The post Are these questions unambiguous? in the blog Explaining Mathematics concerns the funny way mathematicians use the number “two” (Note [3]).  This is discussed in Abstractmath.org, based on usage quotations (see Note [1]) in the Handbook of Mathematical Discourse. They are citations  54, 119, 220, 229, 260, 322, 323 and 338.  The list is in the online version of the Handbook (see Note [2]) which takes forever to load.  (There is a separate file for users of the paperback book but it is currently trashed.)

The usage quirk concerning “two” is exemplified by statements such as these:

  1. The sum of any two even integers is even.
  2. Courant gives Leibniz’ rule for finding the Nth derivative of the product of two functions.  (This is from Citation 323.)
  3. Are there two positive integers m and n, both greater than 1, satisfying mn=9? (This is from Explaining Mathematics.)

Statements 1 and 2 are of course true.  They are still true if the “two” things are the same.  Mathematicians generally assume that such a statement includes the case where the two things are the same.  If the case that they are the same is excluded, the statement becomes an unnecessarily weak assertion.

Statement 3, in my opinion, is badly written.  If the two positive integers have to be distinct, the answer is “no”; if they are allowed to be the same, the answer is “yes” (m = n = 3).  I think any competent mathematical writer would write something like, “There are not two distinct integers m and n both greater than 1 for which mn = 9”.
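The way the answer to Statement 3 depends on the reading of “two” can be checked by brute force.  What follows is only a minimal sketch in Python; the function name `factor_pairs` and its `distinct` flag are made up for this illustration.

```python
def factor_pairs(target, distinct):
    """Return all pairs (m, n) with 1 < m <= n and m * n == target.

    If distinct is True, pairs with m == n are excluded.
    """
    pairs = []
    for m in range(2, target + 1):
        for n in range(m, target + 1):
            if m * n == target and not (distinct and m == n):
                pairs.append((m, n))
    return pairs


# Reading "two" the way mathematicians usually do (m and n may coincide):
print(factor_pairs(9, distinct=False))  # [(3, 3)]

# Reading "two" as "two distinct":
print(factor_pairs(9, distinct=True))   # []
```

The only witness for mn = 9 is the pair with m = n = 3, so the answer flips from “yes” to “no” exactly when “distinct” is imposed.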

It is fair to say that when mathematicians refer to “two integers” in statements like these, they are allowed to be the same.  If they can’t be the same for the sentence to remain true, they will (or at least should) insert a word such as “distinct”.

Of course, in some sentences the two integers can’t be the same because of some condition imposed in the context.  That doesn’t happen in the citations I have listed.  Maybe someone can contribute an example.

Notes

[1] In the Handbook, usage quotations are called “citations”.  It appears to me that the commonest name for citations among lexicographers is “usage quotations”, so I will start calling them that.

[2] I created the online version of the Handbook hastily in 2006.  It needs work, since it has TeX mistakes (which may irritate you but should not interfere with readability) and omits the quotations, illustrations, and some backlinks, including backlinks for the citations.  Some Day When I Get A Round Tuit…

[3] This funny property of “two” was discussed many years ago by Steenrod or Knuth or someone, and is mentioned in a paper by Susanna Epp, but I don’t currently have access to any of the references.

 


Abuse of notation

I have recently read the Wikipedia article on Abuse of Notation (this link is to the version of 29 December 2011, since I will eventually edit it).  The Handbook of Mathematical Discourse and abstractmath.org mention this idea briefly.  It is time to expand the abstractmath article and to redo parts of the Wikipedia article, which  contains some confusions.

This is a preliminary draft, part of which I’ll incorporate into abstractmath after you readers make insightful comments :).

The phrase “Abuse of Notation” is used in articles and books written by research mathematicians.  It is part of Mathematical English.  This post is about

  • What “abuse of notation” means in mathematical writing and conversation.
  • What it could be used to mean.
  • Mathematical usage in general.  I will discuss this point in the context of the particular phrase “abuse of notation”, not a bad way to talk about a subject.

Mathematical Usage

Sources

If I’m going to write about the usage of Mathematical English, I should ideally verify what I claim about the usage by finding citations for a claim: documented quotations that illustrate the usage.  This is the standard way to produce any dictionary.

There is no complete authoritative source for usage of words and phrases in Mathematical English (ME), or for that matter for usage in the Symbolic Language (SL).

  • The Oxford Concise Dictionary of Mathematics [2] covers technical terms and symbols used in school math and in much of undergraduate math, but not so much of research math.  It does not mention being based on citations and it hardly talks about usage at all, even for notorious student-confusing notations such as “\sin^k x“. But it appears quite accurate with good explanations of the math it covers.
  • I wrote Handbook of Mathematical Discourse to stimulate investigations into mathematical usage.  It describes a good many usages in Mathematical English and the Symbolic Language, documented with citations of quotations, but is quite incomplete (as I said in its Introduction).  The Handbook has 428 citations for various usages.  (They are at the end of the on-line PDF version. They are not in the printed book, but are on the web with links to pages in the printed book.)
  • MathWorld has an extensive list of mathematical words, phrases and symbols, and accurate definitions or descriptions of them, even for a great many advanced research topics. It also frequently mentions usage (see formula and inverse sine), but does not give citations.
  • Wikipedia has the most complete set of definitions of mathematical objects that I know of.  The entries sometimes mention usage. I have not detected any entry that gives citations for usage.  Not that that should stop anyone from adding them.

Teaching mathematical usage

In explaining mathematical usage to students, particularly college-level or higher math students, you have choices:

  1. Tell them what you think the usage of a word, phrase, or symbol is, without researching citations.
  2. Tell them what you think the usage ought to be.
  3. Tell them what you think the usage is, supported by citations.

(1) has the problem that you can be wrong.  In fact when I worked on the Handbook I was amazed  at how wrong I could be in what the usage was, in spite of the fact that I had been thinking about usage in ME and SL since I first started teaching (and kept a folder of what I had noticed about various usages).  However,  professional mathematicians generally have a reasonably accurate idea about usage for most things, particularly in their field and in undergraduate courses.

(2) is dangerous.  Far too many mathematicians (but nevertheless a minority) introduce usage in articles and lectures that is not common or that they invented themselves. As a result their students will be confused in trying to read other sources and may argue with other teachers about what is “correct”.  It is a gross violation of teaching ethics to tell the students that (for example) “x > 0” allows x = 0 and not mention to them that nearly all written mathematics does not allow that.  (Did you know that a small percentage of mathematicians and educators do use that meaning, including in some secondary institutions in some countries?  It is partly Bourbaki’s fault.)

(3) You often can’t tell them what the usage is, supported by citations, because, as mentioned above, documented mathematical usage is sparse.

I think people should usually choose (1) instead of (2).  If they do want to introduce a new usage or notation because it is “more logical” or because “my thesis advisor used it” or something, they should reconsider.  Most such attempts have failed, and thousands of students have been confused by the attempts.

Abuse of notation

“Abuse of notation” is a phrase used in mathematical writing to describe terminology and notation that does not have transparent meaning. (Transparent meaning is described in some detail under “compositional” in the Handbook.)

Abuse of notation was originally defined in French, where the word “abus” does not carry the same strongly negative connotation that it does in English.

Suppression of parameters

One widely noticed practice called “abuse of notation”  is the use of the name of the underlying set of a mathematical structure to refer to a structure. For example, a group is a structure (G,\text{*}) where G is a set and * is a binary operation with certain properties. The most common way to refer to this structure is simply to call it G. Since any set of cardinality greater than 1 has more than one group structure on it, this does not include all the information needed to determine the group. This type of usage is cited in 82 below.  It is an example of suppression of parameters.
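For small finite sets, the claim that the underlying set does not determine the group can be checked by exhaustive search.  The sketch below is an illustration only: the function name `group_structures` is hypothetical, the set is taken to be {0, ..., n-1}, and the search is feasible only for very small n.

```python
from itertools import product


def group_structures(n):
    """Return every group operation on the set {0, ..., n-1}.

    Each operation is a dict mapping a pair (a, b) to the product a*b.
    """
    elems = range(n)
    pairs = list(product(elems, repeat=2))
    found = []
    for values in product(elems, repeat=len(pairs)):
        op = dict(zip(pairs, values))
        # Associativity: (a*b)*c == a*(b*c) for all a, b, c.
        if any(op[op[a, b], c] != op[a, op[b, c]]
               for a, b, c in product(elems, repeat=3)):
            continue
        # Look for a two-sided identity element e.
        ids = [e for e in elems
               if all(op[e, a] == a and op[a, e] == a for a in elems)]
        if not ids:
            continue
        e = ids[0]
        # Every element must have a two-sided inverse.
        if all(any(op[a, b] == e and op[b, a] == e for b in elems)
               for a in elems):
            found.append(op)
    return found


print(len(group_structures(2)))  # 2
print(len(group_structures(3)))  # 3
```

The structures found on a given set are all isomorphic in these two cases, but they are different operations, which is why writing just G for the group suppresses a parameter.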

Writing “\log x” without mentioning the base of the logarithm is also an example of suppression of parameters.  I think most mathematicians would regard this as a convention rather than as an abuse of notation.  But I have no citations for this (although they would probably be easy to find).  I doubt that it is possible to find a rational distinction between “abuse of notation” and “convention”; it is all a matter of what people are used to saying.

Synecdoche

The naming of a structure by using the name of its underlying set is also an example of synecdoche, the naming of a whole by a part (for example, “wheels” to mean a car).

Another type of synecdoche that has been called abuse of notation is referring to an equivalence class by naming one of its elements.  I do not have a good quotation-citation that shows this use.  Sometimes people write 2 + 4 = 1 when they are working in the Galois field with 5 elements.  But that can be interpreted in more than one way.  If GF[5] consists of equivalence classes of integers (mod 5) then they are indeed using 2 (for example) to stand for the equivalence class of 2.  But they could instead define GF[5] in the obvious way with underlying set {0,1,2,3,4}.  In any case, making distinctions of that sort is pedantic, since the two structures are related by a natural isomorphism (next paragraph!)
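The two readings of 2 + 4 = 1 can be put side by side in a short Python sketch.  Everything named here (`cls`, `add_classes`, `add_reps`) is hypothetical, only addition is modeled, and each equivalence class is represented by a finite window of its members, since the full class is infinite.

```python
def cls(k, window=range(-10, 11)):
    """A finite window onto the equivalence class of k modulo 5."""
    return frozenset(n for n in window if (n - k) % 5 == 0)


def add_classes(X, Y):
    """Add two classes by adding representatives; any choice gives the same class."""
    x = next(iter(X))
    y = next(iter(Y))
    return cls(x + y)


def add_reps(a, b):
    """Addition on the underlying set {0, 1, 2, 3, 4}."""
    return (a + b) % 5


# Reading 1: "2 + 4 = 1" is a statement about equivalence classes.
print(add_classes(cls(2), cls(4)) == cls(1))  # True

# Reading 2: "2 + 4 = 1" is a statement about the elements 2, 4 and 1 themselves.
print(add_reps(2, 4) == 1)  # True

# The map a -> cls(a) carries one addition to the other.
print(all(cls(add_reps(a, b)) == add_classes(cls(a), cls(b))
          for a in range(5) for b in range(5)))  # True
```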

Identifying objects via isomorphism

This is quite commonly called “abuse of notation” and is exemplified in citations 209, 395 and AB3.

Overloaded notation

John Harrison, in [1], uses “abuse of notation” to describe the use of a function symbol to apply to both an element of its domain and a subset of the domain.  This is an example of overloaded notation.  I have not found another citation for this usage other than Harrison and I don’t remember anyone using it.  Another example of overloaded notation is the use of the same symbol “\times” for multiplication of numbers, matrices and 3-vectors.  I have never heard that called abuse of notation.  But I have no authority to say anything about this usage because I haven’t made the requisite thorough search of the literature.

Powers of functions

The Wikipedia Article on abuse of notation (29 Dec 2011 version) mentions the fact that f^2(x) can mean either f(x)f(x) or f(f(x)).   I have never heard this called abuse of notation and I don’t think it should be called that.  The notation “f^2(x)” can in ordinary usage mean one of two things and the author or teacher should say which one they mean.  Many math phrases or symbolic expressions  can mean more than one thing and the author generally should say which.  I don’t see the point of calling this phenomenon abuse of notation.
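The two meanings are easy to tell apart once each is written as an operation on functions.  A small Python sketch, with helper names (`pointwise_square`, `iterate_twice`) invented for the purpose:

```python
import math


def pointwise_square(f):
    """The reading f^2(x) = f(x) f(x)."""
    return lambda x: f(x) * f(x)


def iterate_twice(f):
    """The reading f^2(x) = f(f(x))."""
    return lambda x: f(f(x))


x = 1.0
print(pointwise_square(math.sin)(x))  # (sin 1)^2
print(iterate_twice(math.sin)(x))     # sin(sin 1)
```

The two printed values differ, which is the whole reason the author or teacher has to say which reading is intended.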

Radial concept

The Wikipedia article mentions phrases such as “partial function”.  This article does provide a citation for Bourbaki for calling a sentence such as “Let f:A\to B be a partial function” abuse of notation.  Bourbaki is wrong in a deep sense (as the article implies).  There are several points to make about this:

  • Some authors, particularly in logic, define a function to be what most of us call a partial function.  Some authors  require a ring to have a unit and others don’t.  So what?
  • The phrase “partial function” has a standard meaning in math:  Roughly “it is a function except it is defined on only part of its domain”.  Precisely, f:A\to B is a partial function if it is a function f:A'\to B for some subset A' of A.
  • A partial function is not in general a function.  A stepmother is not a mother.  A left identity may not be an identity, but the phrase “left identity” is defined precisely.   An incomplete proof is not a proof, but you know what the phrase means! (Compare “expectant mother”).   This is the way we normally talk and think.  See the article “radial concept” in the Handbook.
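The precise definition of partial function given above can be modeled with a dictionary, whose keys play the role of the subset A'.  This is only a sketch for finite sets; the function names and the sample sets are made up.

```python
def is_partial_function(f, A, B):
    """f (a dict) is a partial function A -> B: a function A' -> B with A' a subset of A."""
    return set(f) <= set(A) and set(f.values()) <= set(B)


def is_function(f, A, B):
    """f is a function A -> B: a partial function defined on all of A."""
    return is_partial_function(f, A, B) and set(f) == set(A)


A = {1, 2, 3}
B = {"x", "y"}
f = {1: "x", 3: "y"}  # defined only on A' = {1, 3}

print(is_partial_function(f, A, B))  # True
print(is_function(f, A, B))          # False
```

Every function passes the first test, but not every partial function passes the second, just as a left identity need not be an identity.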

Other uses

AB4 involves a redefinition of  “\in” in a special case.  Authors redefine symbols all the time.  This kind of redefinition on the fly probably should be avoided, but since they did it I am glad they mentioned it.

I have not talked about some of the uses mentioned in the Wikipedia article because I don’t yet understand them well enough.  AB1 and AB2 refer to a common use with pullback that I am not sure I understand (in terms of how the author is thinking of it).  I also don’t understand AB5.  Suggestions from readers would be appreciated.

Kill it!

Well, it’s more polite to say, we don’t need the phrase “abuse of notation” and it should be deprecated.

  • The use of the word “abuse” makes it sound like a bad thing, and most instances of abuse of notation are nothing of the sort.  They make mathematical writing much more readable.
  • Nearly everywhere it is used it could just as well be called a convention.  (This requires verification by studying math texts.)

Citations

The first three citations are in the Handbook list; the numbers refer to that list’s numbering. The others I searched out for the purpose of this post.

82. Busenberg, S., D. C. Fisher, and M. Martelli (1989), Minimal periods of discrete and smooth orbits. American Mathematical Monthly, volume 96, pages 5–17. [p. 8. Lines 2–4.]

Therefore, a normed linear space is really a pair (\mathbf{E},\|\cdot\|) where \mathbf{E} is a linear vector space and \|\cdot\|:\mathbf{E}\to(0,\infty) is a norm. In speaking of normed spaces, we will frequently abuse this notation and write \mathbf{E} instead of the pair (\mathbf{E},\|\cdot\|).

209. Hunter, T. J. (1996), On the homology spectral sequence for topological Hochschild homology. Transactions of the American Mathematical Society, volume 348, pages 3941–3953. [p. 3934. Lines 8–6 from bottom.]

We will often abuse notation by omitting mention of the natural isomorphisms making \wedge associative and unital.

395. Teitelbaum, J. T. (1991), ‘The Poisson kernel for Drinfeld modular curves’. Journal of the American Mathematical Society, volume 4, pages 491–511. [p. 494. Lines 1–4.]

\ldots may find a homeomorphism x:E\to \mathbb{P}^1_k such that \displaystyle x(\gamma u) = \frac{ax(u)+b}{cx(u)+d}. We will tend to abuse notation and identify E with \mathbb{P}^1_k by means of the function x.

AB1. Fujita, T. On the structure of polarized manifolds with total deficiency one.  I. J. Math. Soc. Japan, Vol. 32, No. 4, 1980.

Here we show examples of symbols used in this paper \ldots

L_{T}: The pull back of L to a space T by a given morphism T\rightarrow S . However, when there is no danger of confusion, we OFTEN write L instead of L_T by abuse of notation.

AB2. Sternberg, S. Minimal coupling and the symplectic mechanics of a classical particle in the presence of a Yang-Mills field. Physics, Vol. 74, No. 12, pp. 5253-5254, December 1977.

On the other hand, let us, by abuse of notation, continue to write \Omega for the pullback of \Omega from F to P \times F by projection onto the second factor. Thus, we can write \xi_Q\rfloor\Omega = \xi_F\rfloor\Omega and \ldots

AB3. Dobson, D, and Vogel, C. Convergence of an iterative method for total variation denoising. SIAM J. Numer. Anal., Vol. 34, pp. 1779, October, 1997.

Consider the approximation

(3.7) u\approx U\stackrel{\text{def}}{=}\sum_{j=1}^N U_j\phi_j \ldots

In an abuse of notation, U will represent both the coefficient vector \{U_j\}_{j=1}^N and the corresponding linear combination (3.7).

AB4. Lewis, R, and Torczon, V. Pattern search algorithms for bound constrained minimization.  NASA Contractor Report 198306; ICASE Report No. 96-20.

By abuse of notation, if A is a matrix, y\in A means that the vector y is a column of A.

AB5. Allemandi, G, Borowiecz, A. and Francaviglia, M. Accelerated Cosmological Models in Ricci squared Gravity. ArXiv:hep-th/0407090v2, 2008.

This allows to reinterpret both f(S) and f'(S) as functions of \tau in the expressions:
\begin{equation*}\begin{cases}  f(S) = f(F(\tau)) = f(\tau )\\  f'(S) = f'(F(\tau )) = f'(\tau )\end{cases}\end{equation*}
following the abuse of notation f(F(t )) = f(t ) and f'(F(t )) = f'(t ).

References

[1] Harrison, J. Criticism and reconstruction, in Formalized Mathematics (1996).

[2] Clapham, C. and J. Nicholson.  Oxford Concise Dictionary of Mathematics, Fourth Edition (2009).  Oxford University Press.

 


More about defining “category”


In a recent post, I wrote about defining “category” in a way that (I hope) makes it accessible to undergraduate math majors at an early stage.  I have several more things to say about this.

Early intro to categories

The idea is to define a category as a directed graph equipped with an additional structure of composition of paths subject to some axioms.  Giving several small finite examples of categories drawn in that way provides an understanding of “category” that has several desirable properties:

  • You get the idea of what a category is in one lecture.
  • With the right choice of examples you get several fine points cleared up:
    • The composition is added structure.
    • A loop doesn’t have to be an identity.
    • Associativity is a genuine requirement —  it is not automatic.
  • You get immediate access to what is by far the most common notation used to work with a category — objects (nodes) and arrows.
  • You don’t have to cope with the difficult chunking required when the first examples given are sets-with-structure and structure-preserving functions.  It’s quite hard to focus on a couple of dots on the paper each representing a group or a topological space and arrows each representing a whole function (not the value of the function!).

Introduce more examples

Then the teacher can go on with the examples that motivated categories in the first place: the big deal categories such as sets, groups and topological spaces.   But they can be introduced using special cases so they don’t require much background.

  • Draw some finite sets and functions between them.  (As an exercise, get the students to find some finite sets and functions that make the picture a category with $f=kh$ as the composite and $f\neq g$.)
  • If the students have had calculus,  introduce them to the category whose objects are real finite nonempty intervals with continuous or differentiable mappings between them.  (Later you can prove that this category is a groupoid!)
  • Find all the groups on a two element set and figure out which maps preserve group multiplication.  (You don’t have to use the word “group”: you can simply show both of them, work out which maps preserve multiplication, and discover isomorphism!)  This introduces the idea of the arrows being structure-preserving maps. You can get more complicated and use semigroups as well.  If the students know Mathematica you could even do magmas.  Well, maybe not.
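The two-element exercise in the last item can be tried out by machine before it is tried on students.  The Python sketch below is one possible setup, not a prescribed one: the two-element set is taken to be {0, 1}, and the names `op_a`, `op_b` and `preserving_maps` are invented for this illustration.

```python
from itertools import product

ELEMS = (0, 1)


def op_a(x, y):
    """The multiplication on {0, 1} whose identity element is 0."""
    return (x + y) % 2


def op_b(x, y):
    """The multiplication on {0, 1} whose identity element is 1."""
    return 1 if x == y else 0


def preserving_maps(src_op, dst_op):
    """All maps phi on {0, 1} with phi(x*y) == phi(x)*phi(y)."""
    maps = []
    for images in product(ELEMS, repeat=len(ELEMS)):
        phi = dict(zip(ELEMS, images))
        if all(phi[src_op(x, y)] == dst_op(phi[x], phi[y])
               for x, y in product(ELEMS, repeat=2)):
            maps.append(phi)
    return maps


def is_bijection(phi):
    return sorted(phi.values()) == list(ELEMS)


for phi in preserving_maps(op_a, op_b):
    print(phi, "isomorphism" if is_bijection(phi) else "not a bijection")
```

Of the four maps from {0, 1} to itself, two preserve multiplication from the first structure to the second: the map swapping 0 and 1, which is an isomorphism, and the constant map onto the identity element 1.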

All this sounds like a project you could do with high school students.

Large and small

If all this were just a high school (or intro-to-math-for-math-majors) project you wouldn’t have to talk about large vs. small.  However, I have some ideas about approaching this topic.

In the first place, you can define category, or any other mathematical object that might involve a proper class, using the syntactic approach I described in Just-in-time foundations.  You don’t say “A category consists of a set of objects and a set of arrows such that …”.  Instead you say something like “A category $\mathcal{C}$ has objects $A,\,B,\,C\ldots$ such that…”.

This can be understood as meaning “For any $A$, the statement $A$ is an object of  $\mathcal{C}$ is either true or false”, and so on.

This approach is used in the Wikibook on category theory.  (Note: this is a permanent link to the November 28 version of the section defining categories, which is mostly my work.  As always with Wikimedia things it may be entirely different when you read this.)

If I were dictator of the math world (not the same thing as dictator of MathWorld) I would want definitions written in this syntactic style.  The trouble is that mathematicians are now so used to mathematical objects having to be sets-with-structure that wording the definition as I did above may leave them feeling unmoored.  Yet the technique avoids having to mention large vs. small until a problem comes up. (In category theory it sometimes comes up when you want to quantify over all objects.)

The ideas outlined in this subsection could be a project for math majors.  You would have to introduce Russell’s Paradox.  But for an early-on intro to categories you could just use the syntactic wording and avoid large vs. small altogether.

 

http://en.wikibooks.org/w/index.php?title=Category_Theory/Categories&stableid=2221684


Defining “category”

The concept of category is typically taught later in undergrad math than the concept of group is.  It is supposedly a more advanced concept.  Indeed, the typical examples of categories used in applications are more advanced than some of those in group theory (for example, symmetries of geometric shapes and operations on numbers).

Here are some thoughts on how categories could be taught as early as groups, if not earlier.

Nodes and arrows

Small finite categories can be pictured as a graph using nodes and arrows, together with a specification of the identity arrows and a definition of the composition.  (I am using the word “graph” the way category people use it:  a directed graph with possible multiple edges and loops.)

An example is the category pictured below with three objects and seven arrows. The composition is forced except for $kh$, which I hereby define to be $f$.

This way of picturing a category is  easy to grasp. The composite $kh$ visibly has to be either $f$ or $g$.  There is only one choice for the composite of any other composable pair.  Still, the choice of composite is not deducible directly by looking at the graph.
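The same example can be written down as data and checked mechanically.  Since the picture is not reproduced in this text, the layout used below is an assumption consistent with the description: objects A, B, C, identity arrows iA, iB, iC, arrows h from A to B and k from B to C, and two parallel arrows f and g from A to C.  The function names are hypothetical as well.

```python
from itertools import product

# Arrow name -> (domain, codomain).  The layout is an assumption (see above).
ARROWS = {
    "iA": ("A", "A"), "iB": ("B", "B"), "iC": ("C", "C"),
    "h": ("A", "B"), "k": ("B", "C"),
    "f": ("A", "C"), "g": ("A", "C"),
}
IDENTITY = {"A": "iA", "B": "iB", "C": "iC"}


def composition(kh):
    """Composition table; table[v, u] means "v after u" (right to left)."""
    table = {}
    for u, v in product(ARROWS, repeat=2):
        dom_u, cod_u = ARROWS[u]
        dom_v, cod_v = ARROWS[v]
        if cod_u != dom_v:
            continue  # not a composable pair: composition is partial
        if u == IDENTITY[dom_u]:
            table[v, u] = v
        elif v == IDENTITY[cod_v]:
            table[v, u] = u
        else:
            # The only composable pair of non-identity arrows.
            assert (v, u) == ("k", "h")
            table[v, u] = kh
    return table


def is_category(table):
    """Check that composites are correctly typed and that associativity holds."""
    for (v, u), w in table.items():
        if ARROWS[w] != (ARROWS[u][0], ARROWS[v][1]):
            return False
    for u, v, w in product(ARROWS, repeat=3):
        if (v, u) in table and (w, v) in table:
            if table[w, table[v, u]] != table[table[w, v], u]:
                return False
    return True


print(is_category(composition("f")))  # True
print(is_category(composition("g")))  # True
print(is_category(composition("h")))  # False: h does not go from A to C
print(len(composition("f")))          # 12 composable pairs out of 49
```

Both f and g give a category on the same graph, so the graph alone really does not determine the composition.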

A first class in category theory using graphs as examples could start with this example, or the example in Note 1 below.  This example is nontrivial (never start any subject with trivial examples!) and easy to grasp, in this case using the extraordinary preprocessing your brain does with the input from your eyes.  The definition of category is complicated enough that you should probably present the graph and then give the definition while pointing to what each clause says about the graph.

Most abstract structures have several different ways of representing them. In contrast, when you discuss categorial concepts the standard object-and-arrow notation is the overwhelming favorite.  It reveals domains and codomains and composable pairs, in fact almost everything except which of several possible arrows the composite actually is.  If for example you try to define category using sets and functions as your running example, the student has to do a lot of on-the-go chunking — thinking of a set as a single object, of a set function (which may involve lots of complicated data) as a single chunk with a domain and a codomain, and so on.  But an example shown as a graph comes already chunked and in a picture that is guaranteed to be the most common kind of display they will see in discussions of categories.

After you do these examples, you can introduce trivial and simple graph examples in which the composition is entirely induced; for example these three:

(In case you are wondering, one of them is the empty category.)  I expect that you should also introduce another graph non-example in which associativity fails.

Multiplication tables

The multiplication table for a group is easy to understand, too, in the sense that it gives you a simple method of calculating the product of any two elements.  But it doesn’t provide a visual way to see the product as a category-as-graph does.  Of course, the graph representation works only for finite categories, just as the multiplication table works only for finite groups.

You can give a multiplication table for a small finite category, too, like the one below for the category above.  (“iA” means the identity arrow on A and composition, as usual in category theory, is right to left.) This is certainly more abstract than the graph picture, but it does hit you in the face with the fact that the multiplication is partial.

Notes

1. My suggested example of a category given as a graph shows clearly that you can define two different categorial structures on the graph.  One problem is that the two different structures are isomorphic categories.  In fact, if you engage the students in a discussion about these examples someone may notice that!  So you should probably also use the graph below, where you can define several different category structures that are not all isomorphic.

2. Multiplication tables and categories-as-graphs-with-composition are extensional presentations.  This means they are presented with all their parts laid out in front of you.  Most groups and categories are given by definitions as accumulations of properties (see concept in the Handbook of Mathematical Discourse).  These definitions tend to make some requirements such as associativity obvious.

Students are sometimes bothered by extensional definitions.  “What are h and k (in the category above)?  What are a, b and c?” (in a group given as a set of letters and a multiplication table).


Definition of “function”

I have made a major revision of the abstractmath.org article Functions: Specification and Definition.   The links from the revised article lead into the main abstractmath website, but links from other articles on the website still go back to the old version. So if you click on a link in the revised version, make it come up in a new window.

I expect to link the revision in after I make a few small changes, and I will take into account any comments from you all.

Remarks

1.  You will notice that the new version is in PDF instead of HTML.  A couple of other articles on the website are already in PDF, but I don’t expect to continue replacing HTML by PDF.   It is too much work.  Besides, you can’t shrink it to fit tablets.

2. It would also have been a lot of work to adapt the revision so that I could display it directly on Word Press.  In some cases I have written revisions first in WP and then posted them on the abmath website.  That is not so difficult and I expect to do it again.


Freezing a family of functions

The interactive examples in this post require installing Wolfram CDF player, which is free and works on most desktop computers using Firefox, Safari and Internet Explorer, but not Chrome. The source code is the Mathematica Notebook algebra1.nb, which is available for free use under a Creative Commons Attribution-ShareAlike 2.5 License. The notebook can be read by CDF Player if you cannot make the embedded versions in this post work.

Some background

  • Generally, I have advocated using all sorts of images and metaphors to enable people to think about particular mathematical objects more easily.
  • In previous posts I have illustrated many ways (some old, some new, many recently using Mathematica CDF files) that you can provide such images and metaphors, to help university math majors get over the abstraction cliff.
  • When you have to prove something you find yourself throwing out the images and metaphors (usually a bit at a time rather than all at once) to get down to the rigorous view of math [1], [2], [3], to the point where you think of all the mathematical objects you are dealing with as unchanging and inert (not reacting to anything else).  In other words, dead.
  • The simple example of a family of functions in this post is intended to give people a way of thinking about getting into the rigorous view of the family.  So this post uses image-and-metaphor technology to illustrate a way of thinking about one of the basic proof techniques in math (representing the object in rigor mortis so you can dissect it).  I suppose this is meta-math-ed.  But I don’t want to think about that too much…
  • This example also illustrates the difference between parameters and variables. The bottom line is that the difference is entirely in how we think about them. I will write more about that later.

 A family of functions

This graph shows individual members of the family of functions \( y=a\sin\,x\) for various values of a. Let’s look at some of the ways you can think about this.

  • Each choice of a “shows the function for that value of the parameter a”.  But really, it shows the graph of the function, in fact only the part between x=-4 and x= 4.
  • You can also think of it as showing the function changing shape as a changes over time (as you slide the controller back and forth).

Well, you can graph something changing over time by introducing another axis for time.  When you graph vertical motion of a particle over time you use a two-dimensional picture, one axis representing time and the other the height of the particle. Our representation of the function y=a\sin\,x is a two-dimensional object (using its graph) so we represent the function in 3-space, as in this picture, where the slider not only shows the current (graph of the) function for parameter value a but also locates it over a on the z axis.

The picture below shows the surface given by y=a\sin\,x as a function of both variables a and x. Note that this graph is static: it does not change over time (no slide bar!). This is the family of functions represented as a rigorous (dead!) mathematical object.
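The two ways of thinking about the family correspond to two ways of writing it as a program.  This is a sketch in Python rather than the Mathematica code of the diagrams, and the names `family` and `surface` are made up for it.

```python
import math


def family(a):
    """Fix the parameter a and get one member of the family: a function of x alone."""
    def member(x):
        return a * math.sin(x)
    return member


def surface(a, x):
    """The frozen family: a single function of the two variables a and x."""
    return a * math.sin(x)


double_sine = family(2.0)          # a is treated as a parameter
print(double_sine(math.pi / 2))    # 2.0
print(surface(2.0, math.pi / 2))   # 2.0; a is treated as a variable
```

Both definitions compute the same numbers; the only difference is whether a is supplied first and then held fixed, or supplied together with x.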

If you click the “Show Curves” button, you will see a selection of the curves in the middle diagram above drawn as functions of x for certain values of a. Each blue curve is thus a sine wave of amplitude a. Pushing that button illustrates the process going on in your mind when you concentrate on one aspect of the surface, namely its cross-sections in the x direction.

Reference [4] gives the code for the diagrams in this post, as well as a couple of others that may add more insight to the idea. Reference [5] gives similar constructions for a different family of functions.

References

  1. Rigorous view in abstractmath.org 
  2. Representations II: Dry Bones (post)
  3. Representations III: Rigor and Rigor Mortis (post)
  4. FamiliesFrozen.nb.
  5. AnotherFamiliesFrozen.nb (Mathematica file showing another family of functions)

Curiosity

Science Daily recently reported on a new study [1] that shows that intellectual curiosity is a good predictor of academic performance.  A few days ago I published the post Liberal-Artsy people.  Now I know that what I was talking about are people with intellectual curiosity!  In the earlier post, I contrasted them with what I called “B.Sc.” types, who are narrowly focused and are not interested in asides in math class about the connections between a concept and other concepts, stories about the discoverer of the concept, the meaning of the name of the concept, and so on.

So better names would be “IC people” instead of Liberal Artsy people and “NF people” (Narrow Focus people) instead of B.Sc.  This is better terminology because it isn’t the type of undergraduate degree they have that matters but their attitude toward knowledge of the world.

There are things to say about these concepts with respect to research mathematicians.  I have known a good many over the years.  (My advice to young people who want to do math research is: Hang around people who know more than you do.)  My impression is that most of the very best mathematicians are IC people who are interested in all sorts of things, not just their branch of math.

Even so, some of the best mathematicians are narrowly focused.  This has always been the case.  Isaac Newton was evidently IC but Kurt Gödel was apparently NF.  (He had no interest in things outside math.  On the other hand, he did find a new model of general relativity, so he was willing to look at other parts of math besides logic.)

I have known some NF mathematicians.  When I wanted to tell them about something they might say, “I have enough trouble keeping up with my field”.  The ones that I knew were mediocre and rarely published much beyond writing up their dissertation.  I suspect that the famous NF mathematicians were simply brilliant enough to get away with being NF.

Perhaps NF students whose eyes glaze over when you

  • mention Evariste Galois’s tough and short life, or
  • talk about how group theory can be used to classify crystals, or
  • mention that “tangent” comes from the Latin word for “touching”

are doomed to the same mediocrity.  But undoubtedly some of those NF students will turn out to do great things, and some of the IC students will wind up dilettanting through life and never coming close to achieving their potential.

Don’t prejudge students.

[1] S. von Stumm, B. Hell, T. Chamorro-Premuzic. The Hungry Mind: Intellectual Curiosity Is the Third Pillar of Academic Performance. Perspectives on Psychological Science, 2011; 6 (6): 574 DOI: 10.1177/1745691611421204


Thinking about abstract math

 

The abstraction cliff

In universities in the USA, a math major typically starts with calculus, followed by courses such as linear algebra, discrete math, or a special intro course for math majors (which may be taken simultaneously with calculus), then goes on to abstract algebra, analysis, and other courses involving abstraction and proofs.

At this point, too many of them hit a wall; their grades drop and they change majors.  They had been getting good grades in high school and in calculus because they were strong in algebra and geometry, but the sudden increase in abstraction in the newer courses completely baffles them. I believe that one big difficulty is that they can't grasp how to think about abstract mathematical objects.  (See Reference [9] and note [a].)   They have fallen off the abstraction cliff.  We lose too many math majors this way. (Abstractmath.org is my major effort to address the problems math majors have during or after calculus.)

This post is a summary of how I see the way mathematicians and students think about math.  I will use it as a reference in later posts where I will write about how we can communicate these ways of thinking.

Concept Image

In 1981, Tall and Vinner  [5] introduced the notion of the concept image that a person has about a mathematical concept or object.   Their paper's abstract says

The concept image consists of all the cognitive structure in the individual's mind that is associated with a given concept. This may not be globally coherent and may have aspects which are quite different from the formal concept definition.

The concept image you may have of an abstract object generally contains many kinds of constituents:

  • visual images of the object
  • metaphors connecting the object to other concepts
  • descriptions of the object in mathematical English
  • descriptions and symbols of the object in the symbolic language of math
  • kinetic feelings concerning certain aspects of the object
  • how you calculate parameters of the object
  • how you prove particular statements about the object

This list is incomplete and the items overlap.  I will write in detail about these ideas later.

The name "concept image" is misleading (see Note [b]), so when I have written about concept images, I have also called them metaphors or mental representations, for example in [3] and [4].

Abstract mathematical concepts

This is my take on the notion of concept image, which may be different from that of most researchers in math ed. It owes a lot to the ideas of Reuben Hersh [7], [8].

  • An abstract mathematical concept is represented physically in your brain by what I have called "modules" [1] (physical constituents or activities of the brain [c]).
  • The representation generally consists of many modules.  They correspond to the list of constituents of a concept image given above.  There is no assumption that all the modules are "correct".
  • This representation exists in a semi-public network of mathematicians' and students' brains. This network exercises (incomplete) control over your personal representation of the abstract structure by means of conversation with other mathematicians and reading books and papers.  In this sense, an abstract concept is a social object.  (This is the only point of view in the philosophy of math that I know of that contains any scientific content.)

Notes

[a]  Before you object that abstraction isn't the only thing they have trouble with, note that a proof is an abstract mathematical object. The written proof is a representation of the abstract structure of the proof.  Of course, proofs are a special kind of abstract structure that causes special problems for students.

[b] Cognitive science people use "image" to include nonvisual representations, but not everyone does.  Indeed, cognitive scientists also use "metaphor" with a broader meaning than your high school English teacher did.  A metaphor involves the cognitive merging of parts of two concepts (specifically with other parts not merged). See [6].

[c] Note that I am carefully not saying what the modules actually are — neurons, networks of neurons, events in the brain, etc.   From the point of view of teaching and understanding math, it doesn't matter what they are, only that they exist and live in a society where they get modified by memes  (ideas, attitudes, styles physically transmitted from brain to brain by speech, writing, nonverbal communication, appearance, and in other ways).

References

  1. Math and modules of the mind (previous post)
  2. Mathematical Concepts (previous post)
  3. Mental, physical and mathematical representations (previous post)
  4. Images and Metaphors (abstractmath.org)
  5. David Tall and Shlomo Vinner, Concept Image and Concept Definition in Mathematics with particular reference to limits and continuity, Educational Studies in Mathematics, 12 (May, 1981), no. 2, 151–169.
  6. Conceptual metaphor (Wikipedia article).
  7. What is mathematics, really? by Reuben Hersh, Oxford University Press, 1999.  Read online at Questia.
  8. 18 Unconventional Essays on the Nature of Mathematics, by Reuben Hersh. Springer, 2005.
  9. Mathematical objects (abstractmath.org).

 

 


Liberal-artsy people

I graduated from Oberlin College with a B.A. as a math major and minors in philosophy and English literature, with only three semesters of science courses.  I was and am "liberal-artsy".   As professor of math at Case Western Reserve University,  I had lots of colleagues in both pure and applied math who started out with B.Sc. degrees. We did not always understand each other very well!

Caveat: "Liberal-artsy" and "Narrowly Focused B.Sc. type" (I need a better name) are characteristics that people may have in varying amounts, and many professors in science and math have both characteristics.   I do, myself, although I am more L.A. than B.Sc.  Furthermore, I know nothing about any sociological or cognitive-science research on these characteristics.  I am making it all up as I write.  (This is a blog post, not a tome.)

I recently posted on secants and  tangents.  These articles were deliberately aimed to tickle the interests of L.A.  students.

Liberal-artsy types want to know about connections between concepts.  In each post, I wrote on both common meanings of the words (secant line and function, tangent line and function) and the close connections between them.  Some trig teachers / trig texts tell students about these connections but too many don't.   On the other hand, many B.Sc. types are left cold by such discussions.  B.Sc. types are goal-oriented and want to know a) how do I use it? b) how do I calculate it?  They get impatient when you talk about anything else.  I say point out these connections anyway.

L.A. types want to know about the reason for the name of a concept.  The post on secants refers to the metaphor that "secant" means "cutting". This is based on the etymology of "secant", which is hidden from many students because it is based on Latin.  The post makes the connection that the "original" definition of "secant" was the length of a certain line segment generated by an angle in the unit circle. The post on tangents makes an analogous connection, and also points out that most tangent lines that students see touch the curve at only a single point, which is not a connotation of the English word "touch".

Many people think they have learned something when they know the etymology of a word.  In fact, the etymology of a word may have little or nothing to do with its current meaning, which may have developed over many centuries of metaphors that become dead, generate new metaphors that become dead, umpteen times, so that the original meaning is lost.  (The word "testimony" came from a Latin phrase meaning hold your testicles, which is really not related to its meaning in present-day English.)

So I am not convinced that etymologies of names can help much in most cases.  In particular, different mathematical definitions of the same concept can be practically disjoint in terms of the data they use, and there is no one "correct" definition, although there may be only one that motivates the name.  (There often isn't a definition that motivates the name.  Think "group".)  But I do know that when I mention the history of a name of a concept in class, some students are fascinated and ask me questions about it.

L.A. types are often fascinated by E. T. Bell-like stories about the mathematician who came up with a concept, and sometimes the stories illuminate the mathematical idea.  But L.A. types often are interested anyway.  It's funny when you talk about such a thing in class, because some students visibly tune out while others noticeably perk up and start paying attention.

So who should you cater to?  Answer:  Both kinds of students.  (Tell interesting stories, but quickly and in an offhand way.)

The posts on secants and tangents also experimented with using manipulable diagrams to illustrate the ideas.  I expect to write about that more in another post.

For more about the role of definitions, check out the abmath article and also Timothy Gowers' post on definitions (one of a series of excellent posts on working with abstract math).



Tangents

The interactive examples in this post require installing Wolfram CDF player, which is free and works on most desktop computers using Firefox, Safari and Internet Explorer, but not Chrome. The source code is the Mathematica Notebook Tangent Line.nb, which is available for free use under a Creative Commons Attribution-ShareAlike 2.5 License. The notebook can be read by CDF Player if you cannot make the embedded versions in this post work.

This is an experiment in exposition of the mathematical concepts of tangent.  It follows the same pattern as my previous post on secant, although that post has explanations of my motivation for this kind of presentation that are not repeated here.

Tangent line

A line is tangent to a curve (in the plane) at a given point if all the following conditions hold (Wikipedia has more detail):

  1. The line is a straight line through the point.
  2. The curve goes through that point.
  3. The curve is differentiable in a neighborhood of the point.
  4. The slope of the straight line is the same as the derivative of the curve at that point.

In this picture the curve is $ y=x^3-x$ and the tangent is shown in red. You can click on the + signs for additional controls and information.
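For readers without the CDF player, here is a minimal Python sketch of the four conditions applied to this curve. The function names `f`, `df` and `tangent_line` are my own for illustration and are not taken from the Mathematica notebook. The derivative $ 3x^2-1$ is computed by hand.

```python
def f(x):
    """The curve from the picture: y = x^3 - x."""
    return x**3 - x


def df(x):
    """Derivative of f, worked out by hand: 3x^2 - 1."""
    return 3 * x**2 - 1


def tangent_line(x0):
    """Return (slope, intercept) of the line tangent to f at x = x0.

    The line passes through (x0, f(x0)) and its slope equals the
    derivative of the curve at x0.
    """
    slope = df(x0)
    intercept = f(x0) - slope * x0
    return slope, intercept


# At x0 = 1 the curve passes through (1, 0) with slope 2,
# so the tangent line is y = 2x - 2.
print(tangent_line(1.0))
```

Changing the argument of `tangent_line` plays the role of moving the point of tangency along the curve.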

Etymology and metaphor

The word “tangent” comes from the Latin word for “touching”. (See Note below.) The early scholars who talked about “tangent” all read Latin and knew that the word meant touching, so the metaphor was alive to them.

The mathematical meaning of “tangent” requires that the tangent line have slope equal to the derivative of the curve at the point of contact. All of the red lines in the picture below touch the curve at the point (0, 1.5). None of them are tangent to the curve there because the curve has no derivative at the point:

The curve in this picture is defined by

The mathematical meaning restricts the metaphor. The red lines you can generate in the graph all touch the curve at one point, in fact at exactly one point (because I made the limits on the slider -1 and 1), but they are not tangent to the curve.

Tangents can hug!

On the other hand, “touching” in English usage includes maintaining contact on an interval (hugging!) as well as just one point, like this:

The blue curve in this graph is given by

The green curve is the derivative dy/dx. Notice that it has corners at the endpoints of the unit interval, so the blue curve has no second derivative there. (See my post Curvature.)

A tangent line in calculus usually touches the curve at the point of tangency and not nearby (although it can cross the curve somewhere else). But the red line above is nevertheless tangent to the curve at every point of the curve over the unit interval, according to the definition of tangent. It hugs the curve along the straight part.

The calculus-book behavior of tangent line touching at only one point comes about because functions in calculus books are always analytic, and two analytic curves cannot agree on an open set without being the same curve.

The blue curve above is not analytic; it is not even smooth, because it has no second derivative at $x=0$ and $x=1$. With bump functions you can get pictures like that with a smooth function, but I am too lazy to do it.
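The hugging behavior can be checked numerically. The sketch below does not use the blue curve from the graph. It uses a hypothetical stand-in `g` that I made up with the same qualitative features: it is flat on the unit interval and its derivative `dg` has corners at 0 and 1.

```python
def g(x):
    """Hypothetical stand-in curve (not the one in the graph):
    x^2 to the left of 0, flat on [0, 1], (x - 1)^2 to the right of 1."""
    if x < 0:
        return x**2
    if x <= 1:
        return 0.0
    return (x - 1) ** 2


def dg(x):
    """Derivative of g. It is continuous but has corners at 0 and 1,
    so g has no second derivative at those two points."""
    if x < 0:
        return 2 * x
    if x <= 1:
        return 0.0
    return 2 * (x - 1)


def on_tangent(x0, x, tol=1e-12):
    """True if the tangent line to g at x0 passes through (x, g(x))."""
    line_y = g(x0) + dg(x0) * (x - x0)
    return abs(g(x) - line_y) <= tol


# The tangent at x0 = 0.5 hugs the curve on the whole unit interval...
print(all(on_tangent(0.5, k / 10) for k in range(11)))
# ...but leaves it outside that interval.
print(on_tangent(0.5, 1.5))
```

The first check succeeds at every sample point of the unit interval, and the second fails because the curve has risen off the line at x = 1.5.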

Tangent on the unit circle

In trigonometry, the value of the tangent function at an angle $ \theta$ erected on the x-axis is the length of the segment of the tangent at (1,0) to the unit circle (in the sense defined above) measured from the x-axis to the tangent’s intersection with the secant line given by the angle. The tangent line segment is the red line in this picture:


This defines the tangent function for $ -\frac{\pi}{2} < \theta < \frac{\pi}{2}$.

The tangent function in calculus

That is not the way the tangent function is usually defined in calculus. There it is given by $ \tan\theta=\frac{\sin\theta}{\cos\theta}$, which is easily seen by similar triangles to agree with the geometric definition on $ -\frac{\pi}{2} < \theta < \frac{\pi}{2}$.
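The similar-triangles argument can be sketched in a few lines of Python. The function name `geometric_tangent` is mine, and I am assuming that "length" is read as a signed height, so that it is negative when the angle is negative.

```python
import math


def geometric_tangent(theta):
    """Signed height at which the secant line of angle theta meets
    the line x = 1, which is tangent to the unit circle at (1, 0)."""
    # Points on the secant line are (t*cos(theta), t*sin(theta)).
    # The line reaches x = 1 when t = 1/cos(theta).
    t = 1.0 / math.cos(theta)
    return t * math.sin(theta)


# Compare the geometric value with the library tangent function
# at a few angles strictly between -pi/2 and pi/2.
for theta in (-1.2, -0.3, 0.0, 0.7, 1.5):
    print(theta, math.isclose(geometric_tangent(theta), math.tan(theta)))
```

The scale factor `t` is the ratio between the two similar triangles: the small one with legs $ \cos\theta$ and $ \sin\theta$, and the large one with legs 1 and the tangent segment.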

We can now see the relationship between the geometric and the $ \frac{\sin\theta}{\cos\theta}$ definition of the tangent function using this graph:


The red segment and the green segment are always the same length.
It might make sense to extend the geometric definition to $ \frac{\pi}{2} < \theta < \frac{3\pi}{2}$ by constructing the tangent line to the unit circle at (-1,0), but then the definition would not agree with the $ \frac{\sin\theta}{\cos\theta}$ definition.
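Here is a rough sketch of why the two definitions would disagree, under my assumption that the extension measures the signed height at which the secant line meets the line x = -1. The function name `height_on_left_tangent` is hypothetical.

```python
import math


def height_on_left_tangent(theta):
    """Signed height at which the secant line of angle theta meets
    the line x = -1, which is tangent to the unit circle at (-1, 0)."""
    # Points on the secant line are (t*cos(theta), t*sin(theta)).
    # The line reaches x = -1 when t = -1/cos(theta), which is
    # positive for pi/2 < theta < 3*pi/2 because cos(theta) < 0 there.
    t = -1.0 / math.cos(theta)
    return t * math.sin(theta)


theta = 3 * math.pi / 4
# The geometric value is close to 1, while tan(theta) is close to -1.
print(height_on_left_tangent(theta), math.tan(theta))
```

Under this reading the extended geometric definition gives $ -\frac{\sin\theta}{\cos\theta}$ on the new interval, the negative of the calculus definition, and the unsigned length gives its absolute value.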

References
