Tag Archives: mathematics education

1.000… and 0.999…

 

Note: This post uses MathJax. If you see mathematical formulas with dollar signs around them, or badly formatted formulas, try refreshing the screen. Sometimes you have to do it two or three times.
 
Recently Julian Wilson sent me this letter:
It is well known that students often have trouble accepting that $0.999\ldots$ is the same number as $1.000\ldots$.  However, there is at least one context in which these could be regarded as in some sense distinct. In a discrete dynamical system where the next iterate is formed by multiplying the current value by 10 and dropping the leading digit, and where you make a note at each iteration of the first digit after the decimal point, 0.9999… generates a sequence of 9s, whereas 1.0000… generates a sequence of 0s. The imagery is of stretching a circle, wrapping it ten times around itself and recording in which sector (labeled 0 to 9) you end up.
 
From the dynamical systems perspective, being in state 9 (and remaining there after each iteration) is different from being in state 0.
The $0.9999\ldots =1.0000\ldots$ equation is associated with several conceptual difficulties that math students have, which I will describe here.

The decimal representation is not the number

Another way of describing the equation is to say that "$0.999\ldots$" and "$1.000\ldots$" are distinct decimal representations of the same number, namely $1$. Julian's proposal provides a different interpretation of the notation, in which "$0.999\ldots$" and "$1.000\ldots$" are strings of symbols generated by two different machines.  Of course, that is correct.  But both strings are correct decimal notations that correspond to the same number.
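The two machines can be sketched in a few lines of Python. This is only an illustration under my own assumptions: each decimal is modeled as an integer part plus an endless stream of digits after the point, and the names `nines`, `zeros`, `itinerary` and `partial_value` are hypothetical, not Julian's.

```python
from fractions import Fraction
from itertools import islice, repeat


def nines():
    """Endless digit stream after the point in "0.999..."."""
    return repeat(9)


def zeros():
    """Endless digit stream after the point in "1.000..."."""
    return repeat(0)


def itinerary(digits, steps):
    """Digits recorded by the multiply-by-10, drop-the-leading-digit machine.

    Each iteration moves one place along the digit stream, so the record
    of the first `steps` iterations is just a stretch of that stream.
    """
    return list(islice(digits, steps))


def partial_value(integer_part, digits, places):
    """Exact value of the decimal truncated after `places` digits."""
    tail = sum(Fraction(d, 10**k) for k, d in enumerate(islice(digits, places), start=1))
    return Fraction(integer_part) + tail
```

The strings differ (`itinerary(nines(), 5)` is all 9s, `itinerary(zeros(), 5)` is all 0s), while `partial_value(0, nines(), 6)` falls short of `partial_value(1, zeros(), 6)` by exactly one millionth, a gap that shrinks by a factor of ten with every extra place.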

Mathematical writing will sometimes use notation to mean the abstract mathematical object it refers to, and at other times the text is referring to the notation itself.  For example,

$x^2+1$ is always positive.

refers to the value of $x^2+1$, but

If you substitute $5$ for $x$ in $x^2+1$ you get $26$.

refers to the expression "$x^2+1$".  Careful authors would write,

If you substitute $5$ for $x$ in "$x^2+1$" you get $26$.

This ambiguity in using mathematical notation is an example of what philosophers call the "use-mention" distinction, but they apply the phrase to many other situations as well.  Mathematicians have an operational knowledge of this distinction but many of them are not consciously aware of it.
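Programming languages make the distinction visible, because a string holding an expression and the value of that expression are different kinds of objects. The sketch below is my own illustration; the helper names `substitute` and `value` are hypothetical, and the string replacement is deliberately naive (it would mangle a longer name that happens to contain the letter `x`).

```python
EXPRESSION = "x**2 + 1"  # the notation itself (mention)


def substitute(expression, name, number):
    """Operate on the notation: replace a variable name with a numeral."""
    return expression.replace(name, f"({number})")


def value(expression, **variables):
    """Evaluate the notation to get the number it denotes (use)."""
    # Builtins are emptied so that only arithmetic on the given variables works.
    return eval(expression, {"__builtins__": {}}, variables)


substituted = substitute(EXPRESSION, "x", 5)  # still a string
result = value(substituted)                   # now a number
```

Here `substituted` is the string `"(5)**2 + 1"` and `result` is the number `26`. The claim "is always positive" is about values, so it is checked with `value`, not with the string.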

Definitions

A decimal representation of a number by definition represents the number that a certain power series converges to. The two power series corresponding to 1.000… and to 0.999… both converge to 1:

\[1+\sum_{n=1}^{\infty}\frac{0}{10^n}=1\]

and

\[0+\sum_{n=1}^{\infty}\frac{9}{10^n}=1\]

They are different power series (mention) but converge (use) to the same number.
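One way to check the second sum is to look at its partial sums, which I will call $S_N$ (notation introduced only for this sketch). By the formula for a finite geometric sum,

\[S_N=\sum_{n=1}^{N}\frac{9}{10^n}=1-\frac{1}{10^N},\qquad\text{so}\qquad\lim_{N\to\infty}S_N=1.\]

The first sum needs no work, since every term after the leading $1$ is $0$.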

Most students new to abstract math are not aware of the importance of definition in math. As they learn more, they may still hold on to the idea that you have to discover or reason out what a math word or expression means.  In purple prose, THE DEFINITION IS A DICTATOR. 

This does not mean that you can understand the concept merely by reading the definition.  The definition usually does not mention most of the important things about the concept.

Completed Infinity

A common remark by newbies about $0.999\ldots$ is that it gets closer and closer to $1$ but does not get there. So it can't be equal to $1$.  This shows a lack of understanding of completed infinity.  The point is that the notation "$0.999\ldots$" refers to a string beginning with "$0.$" and followed by an infinite sequence of $9$'s.  Now "$s$ is an infinite sequence of $9$'s" means precisely that $s$ has an entry $s_n$ for every positive integer $n$, and that $s_n$ is $9$ for every positive integer $n$. 

  • The expression is gradually unrolling over time, and does not ever "get there". 
  • All the nines are already there.

Both the preceding sentences are metaphorical.  They are about how you should think about "$0.999\ldots$".  The first metaphor is bad, the second metaphor is good.  Neither statement is a formal mathematical statement.  Neither statement says anything about what the sequence really is.  They are not statements about reality at all; they are about how you should think about the sequence if you are going to understand what mathematicians say about it. 

Metaphors are crucial to understanding math.  Too many students use the wrong metaphors, but too often no one tells them about it.

We need a math ed text for teachers

I am thinking of precalculus through typical college math major courses.  The issues I have discussed in this post are occasionally written about in the math ed literature but I have had difficulty finding many articles (on the web and on JStor) about these specific ideas.  Anyway, articles are not what we need.  We need a modest paperback book specifically aimed at teachers, covering the kinds of cognitive difficulties math students have when faced with abstraction. 

What I have written in abstractmath.org and in the Handbook are examples of what I mean, but they don't cover all the problems and they suffer from lack of focus.  (Note that the material in abstractmath.org and in posts on this blog can be used freely under a Creative Commons license — click on "Permissions" in the blue banner at the top of this page). 

Among math ed researchers, I have learned a lot from papers by Anna Sfard and David Tall.


Freezing a family of functions

The interactive examples in this post require installing Wolfram CDF player, which is free and works on most desktop computers using Firefox, Safari and Internet Explorer, but not Chrome. The source code is the Mathematica Notebook algebra1.nb, which is available for free use under a Creative Commons Attribution-ShareAlike 2.5 License. The notebook can be read by CDF Player if you cannot make the embedded versions in this post work.

Some background

  • Generally, I have advocated using all sorts of images and metaphors to enable people to think about particular mathematical objects more easily.
  • In previous posts I have illustrated many ways (some old, some new, many recently using Mathematica CDF files) that you can provide such images and metaphors, to help university math majors get over the abstraction cliff.
  • When you have to prove something you find yourself throwing out the images and metaphors (usually a bit at a time rather than all at once) to get down to the rigorous view of math [1], [2], [3], to the point where you think of all the mathematical objects you are dealing with as unchanging and inert (not reacting to anything else).  In other words, dead.
  • The simple example of a family of functions in this post is intended to give people a way of thinking about getting into the rigorous view of the family.  So this post uses image-and-metaphor technology to illustrate a way of thinking about one of the basic proof techniques in math (representing the object in rigor mortis so you can dissect it).  I suppose this is meta-math-ed.  But I don’t want to think about that too much…
  • This example also illustrates the difference between parameters and variables. The bottom line is that the difference is entirely in how we think about them. I will write more about that later.

 A family of functions

This graph shows individual members of the family of functions \( y=a\sin\,x\) for various values of a. Let’s look at some of the ways you can think about this.

  • Each choice of a “shows the function for that value of the parameter a”.  But really, it shows the graph of the function, in fact only the part between x=-4 and x=4.
  • You can also think of it as showing the function changing shape as a changes over time (as you slide the controller back and forth).

Well, you can graph something changing over time by introducing another axis for time.  When you graph vertical motion of a particle over time you use a two-dimensional picture, one axis representing time and the other the height of the particle. Our representation of the function \(y=a\sin\,x\) is a two-dimensional object (using its graph) so we represent the function in 3-space, as in this picture, where the slider not only shows the current (graph of the) function for parameter value a but also locates it over a on the z axis.

The picture below shows the surface given by \(y=a\sin\,x\) as a function of both variables a and x. Note that this graph is static: it does not change over time (no slide bar!). This is the family of functions represented as a rigorous (dead!) mathematical object.

If you click the “Show Curves” button, you will see a selection of the curves in the middle diagram above drawn as functions of x for certain values of a. Each blue curve is thus a sine wave of amplitude a. Pushing that button illustrates the process going on in your mind when you concentrate on one aspect of the surface, namely its cross-sections in the x direction.
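The same two views can be written down in code. The sketch below is my own illustration in Python, not the Mathematica source of the diagrams, and the names `family` and `member` are hypothetical. The frozen family is one function of two variables; picking a value of a gives a cross-section, which is a function of x alone.

```python
import math
from functools import partial


def family(a, x):
    """The whole family at once: one function of the two variables a and x."""
    return a * math.sin(x)


def member(a):
    """Fix the parameter a to get one member of the family, a function of x alone."""
    return partial(family, a)


wave = member(2.0)            # the sine wave of amplitude 2
peak = wave(math.pi / 2)      # its value at x = pi/2
```

Nothing in `family` itself marks a as a parameter and x as a variable. The difference only appears in how the function is used, which is the point made in the background list above.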

Reference [4] gives the code for the diagrams in this post, as well as a couple of others that may add more insight to the idea. Reference [5] gives similar constructions for a different family of functions.

References

  1. Rigorous view in abstractmath.org 
  2. Representations II: Dry Bones (post)
  3. Representations III: Rigor and Rigor Mortis (post)
  4. FamiliesFrozen.nb.
  5. AnotherFamiliesFrozen.nb (Mathematica file showing another family of functions)

Representations 2

Introduction

In a recent post I began a discussion of the mental, physical and mathematical representations of a mathematical object. The discussion continues here. Mathematicians, linguists, cognitive scientists and math educators have investigated some aspects of this topic, but there are many subtle connections between the different ideas which need to be studied.

I don’t have any overall theoretical grasp of these relationships. What I will do here is grope for an overall theory by mentioning a whole bunch of fine points. Some of these have been discussed in the literature and some (as far as I know) have not been discussed.  Many of them (I hope)  can be described as “an obvious fact about representations but no one has pointed it out before”.  Such fine points could be valuable; I think some scholars who have written about mathematical discourse and math in the classroom are not aware of many of these facts.

I am hoping that by thrashing around like this here (for graphs of functions) and for other concepts (set, function, triangle, number …) some theoretical understanding may emerge of what it means to understand math, do math, and talk about math.

The graph of a function

Let’s look at the graph of the function {y=x^3-x}.

What you are looking at is a physical representation of the graph of the function. The graph creates in your brain a mental representation of the graph of the function. These are subtly related to each other and to the mathematical definition of the graph.

Fine points

  1. The mathematical definition [2] of the graph of this function is: The set of ordered pairs of numbers {(x,x^3-x)} for all real numbers {x}.
  2. In the physical representation, each point {(x,x^3-x)} is shown in a location determined by the conventional {x-y} coordinate system, which uses a straight-line representation of the real numbers with labels and ticks.
    • The physical representation makes use of the fact that the function is continuous. It shows the graph as a curving line rather than a bunch of points.
    • The physical representation you are looking at is not the physical representation I am looking at. They are on different computer screens or pieces of paper. We both expect that the representations are very similar, in some sense physically isomorphic.
    • “Location” on the physical representation is a physical idea. The mathematical location on the mathematical graph is essentially the concept of the physical location refined as the accuracy goes to infinity. (This last statement is a metaphor attached to a genuine mathematical construction, for example Cauchy sequences.)
  3. The mathematical definition of “graph” and the physical representation are related by a metaphor. (See Note 1.)
    • The physical curve in blue in the picture corresponds via the metaphor to the graph in the mathematical sense: in this way, each location on the physical curve corresponds to an ordered pair of the form {(x,x^3-x)}.
    • The correspondence between the locations and the pairs is imperfect. You can’t measure with infinite accuracy.
    • The set of ordered pairs {(x,x^3-x)} form a parametrized curve in the mathematical sense. This curve has zero thickness. The curve in the physical representation has positive thickness.
    • Not all the points in the mathematical graph actually occur on the physical curve: The physical curve doesn’t show the left and right infinite tails.
    • The physical curve is drawn to show some salient characteristics of the curve, such as its extrema and inflection points. This is expected by convention in mathematical writing. If the graph had left out a maximum, for example, the author would be constrained (by convention!) to say so.
    • An experienced mathematician or advanced student understands the fine points just listed. A newbie may not, and may draw false conclusions about the function from the graph. (Note 2.)
  4. If you are a mathematician or at least a math student, seeing the physical graph shown above produces a mental image (see Note 3) of the graph in your mind.
  5. The mathematical definition and the mental image are connected by a metaphor. This is not the same metaphor as the one that connects the physical representation and the mathematical definition.
    • The curve I visualize in my mental representation has an S shape and so does the physical representation. Or does it? Isn’t the S-ness of the shape a fact I construct mentally (without consciously intending to do so!)?
    • Does the curve in the mental rep have thickness? I am not sure this is a meaningful question. However, if you are a sufficiently sophisticated mathematician, your mental image is annotated with the fact that the curve has zero thickness. (See Note 4.)
    • The curve in your mental image of the curve may very well be blue (just because you just looked at my picture) but you must have an annotation to the effect that that is irrelevant! That is the essence of metaphor: Some things are identified with each other and others are emphatically not identified.
    • The coordinate axes do exist in the physical representation and they don’t exist in the mathematical definition of the graph. Of course they are implied by the definition, through the properties of the projection functions from a product. But what about your mental image of the graph? My own image does not show the axes, but I do “know” what the coordinates of some of the points are (for example, {(-1,0)}) and I “see” some points (the local maximum and the local minimum) whose coordinates I can figure out.
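Some of these fine points can be made concrete with a small computation. The sketch below is my own illustration in Python; the names `f`, `sampled_graph`, `X_MAX` and `X_MIN` are hypothetical. A plotted curve is built from a finite list of ordered pairs, whereas the mathematical graph is the set of all of them, and the coordinates of the "seen" extrema can be worked out from the derivative.

```python
import math


def f(x):
    """The function whose graph is under discussion."""
    return x**3 - x


def sampled_graph(lo, hi, count):
    """A finite list of ordered pairs (x, f(x)): what a plotting program actually draws."""
    step = (hi - lo) / (count - 1)
    return [(lo + i * step, f(lo + i * step)) for i in range(count)]


# The critical points solve f'(x) = 3x^2 - 1 = 0.
X_MAX = -1 / math.sqrt(3)  # location of the local maximum
X_MIN = 1 / math.sqrt(3)   # location of the local minimum
```

For example, `sampled_graph(-2, 2, 9)` contains only nine pairs, and none of them has the x-coordinate `X_MAX` or `X_MIN`, so even the salient points of the curve may be absent from a particular physical representation.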

Notes

1. This is metaphor in the sense lately used by cognitive scientists, for example in [3]. A metaphor can be described roughly as two mental images in which certain parts of one are identified with certain parts of another, in other words a pushout. The rhetorical use of the word “metaphor” requires it to be a figure of speech expressed in a certain way (the identification is direct rather than expressed by “is like” or some such thing).  In my use in this article a metaphor is something that occurs in your brain.  The form it takes in speech or writing is not relevant.

2. I have noticed, for example, that some students don’t really understand that the left and right tails go off to infinity horizontally as well as vertically.   In fact, the picture above could mislead someone into thinking the curve has vertical asymptotes: The right tail looks like it goes straight up.  How could it get to x equals a billion if it goes straight up?

3. The “mental image” is of course a physical structure in your brain.  So mental representations are physical representations.

4. I presume this “annotation” is some kind of physical connection between neurons or something.  It is clear that a “mental image” is some sort of physical construction or event in the brain, but from what little I know about cognitive science, the scientists themselves are still arguing about the form of the construction.  I would appreciate more information on this. (If the physical representation of mental images is indeed still controversial, this says nothing bad about cognitive science, which is very new.)

References

[1] Mental Representations in Math (previous post).

[2] Definitions (in abstractmath).

[3] Lakoff, G. and R. E. Núñez (2000), Where Mathematics Comes From. Basic Books.


Templates in mathematical practice

This post is a first pass at what will eventually be a section of abstractmath.org. It’s time to get back to abstractmath; I have been neglecting it for a couple of years.

What I say here is based mainly on my many years of teaching discrete mathematics at Case Western Reserve University in Cleveland and more recently at Metro State University in Saint Paul.

Beginning abstract math

College students typically get into abstract math at the beginning in such courses as linear algebra, discrete math and abstract algebra. Certain problems that come up in those early courses can be grouped together under the notion of (what I call) applying templates [Note 1]. These are not the problems people usually think about concerning beginners in abstract math, of which the following is an incomplete list:

The students’ problems discussed here concern understanding what a template is and how to apply it.

Templates can be formulas, rules of inference, or mini-programs. I’ll talk about three examples here.

The template for quadratic equations

The solution of a real quadratic equation of the form {ax^2+bx+c=0} is given by the formula

\[x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}\]

This is a template for finding the roots of the equation. It has subtleties.

For example, the numerator is symmetric in {a} and {c} but the denominator isn’t. So sometimes I try to trick my students (warning them ahead of time that that’s what I’m trying to do) by asking for a formula for the solution of the equation {a+bx+cx^2=0}. The answer is

\[x=\frac{-b\pm\sqrt{b^2-4ac}}{2c}\]

I start writing it on the board, asking them to tell me what comes next. When we get to the denominator, often someone says “{2a}”.

The template is telling you that the denominator is 2 times the coefficient of the square term. It is not telling you it is “{a}”. Using a template (in the sense I mean here) requires pattern matching, but in this particular example, the quadratic template has a shallow incorrect matching and a deeper correct matching. In detail, the shallow matching says “match the letters” and the deep matching says “match the position of the letters”.

Most of the time the quadratic being matched has particular numbers instead of the same letters that the template has, so the trap I just described seldom occurs. But this makes me want to try a variation of the trick: Find the solution of {3+5x+2x^2=0}. Would some students match the textual position (getting {a=3}) instead of the functional position (getting {a=2})? [Note 2] If they did they would get the solutions {(-1,-\frac{2}{3})} instead of {(-1,-\frac{3}{2})}.
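The two matchings can be compared numerically. The following Python sketch is my own illustration; the function `quadratic_roots` and its argument names are hypothetical, chosen so that each coefficient is named by its role and not by a letter, and it assumes the roots are real.

```python
import math


def quadratic_roots(square_coeff, linear_coeff, constant):
    """Roots of square_coeff*x^2 + linear_coeff*x + constant = 0 (real roots assumed)."""
    root = math.sqrt(linear_coeff**2 - 4 * square_coeff * constant)
    denominator = 2 * square_coeff  # 2 times the coefficient of the square term
    return ((-linear_coeff + root) / denominator, (-linear_coeff - root) / denominator)


# The equation 3 + 5x + 2x^2 = 0, matched two ways.
deep_match = quadratic_roots(square_coeff=2, linear_coeff=5, constant=3)     # by role
shallow_match = quadratic_roots(square_coeff=3, linear_coeff=5, constant=2)  # by text order
```

The deep matching gives the roots -1 and -3/2, which do satisfy the equation. The shallow matching gives -1 and -2/3, and -2/3 is not a root of the original equation.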

Substituting in algebraic expressions has other traps, too. What sorts of mistakes would students make solving {3x^2+b^2x-5=0}?

Most students on the verge of abstract math don’t make the mistakes with the quadratic formula that I have described. The thing about abstract math is that it uses more sophisticated templates:

  • subject to conditions
  • with variations
  • with extra levels of abstraction

The template for proof by induction

This template gives a method of proof of a statement of the form {\forall{n}\mathcal{P}(n)}, where {\mathcal{P}} is a predicate (presumably containing {n} as a variable) and {n} varies over positive integers. The template says:

Goal: Prove {\forall{n}\mathcal{P}(n)}.

Method:

  • Prove {\mathcal{P}(1)}
  • For an arbitrary integer {n\geq 1}, assume {\mathcal{P}(n)} and deduce {\mathcal{P}(n+1)}.

For example, to prove {\forall n (2^n+1\geq n^2)} using the template, you have to prove that {2^1+1\geq 1^2}, and that for any {n\geq 1}, if {2^n+1\geq n^2}, then {2^{n+1}+1\geq (n+1)^2}. You come up with the need to prove these statements by substituting into the template. This template has several problems that the quadratic formula does not have.
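Here is one way the substitution can play out for this example. It is a sketch of my own working, not part of the template. The base case is {2^1+1=3\geq 1=1^2}. For the induction step, assume {2^n+1\geq n^2}. Then

\[2^{n+1}+1=2(2^n+1)-1\geq 2n^2-1,\]

and {2n^2-1\geq (n+1)^2} is equivalent to {(n-1)^2\geq 3}, which holds exactly when {n\geq 3}. So this particular estimate carries out the step only from {n=3} on, and the cases {\mathcal{P}(2)} (that is, {5\geq 4}) and {\mathcal{P}(3)} (that is, {9\geq 9}) have to be checked directly. Even this small example ends up using a variation of the template in which the induction starts later than {1}.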

Variables of different types

The variable {n} is of type integer and the variable {\mathcal{P}} is of type predicate [Note 3]. Having to deal with several types of variables comes up already in multivariable calculus (vectors vs. numbers, cross product vs. numerical product, etc.) and they multiply like rabbits in beginning abstract math classes. Students sometimes write things like “Let {\mathcal{P}=n+1}”. Multiple types is a big problem that math ed people don’t seem to discuss much (correct me if I am wrong).

Free and bound

The variable {n} occurs as a bound variable in the Goal and a free variable in the Method. This happens in this case because the induction step in the Method originates as the requirement to prove {\forall  n(\mathcal{P}(n)\rightarrow\mathcal{P}(n+1))}, but as I have presented it (which seems to be customary) I have translated this into a requirement based on modus ponens. This causes students problems, if they notice it. (“You are assuming what you want to prove!”) Many of them apparently go ahead and produce competent proofs without noticing the dual role of {n}. I say more power to them. I think.

The template has variations

  • You can start the induction at other places.
  • You may have to have two starting points and a double induction hypothesis (for {n-1} and {n}). In fact, you will have to have two starting points, because it seems to be a Fundamental Law of Discrete Math Teaching that you have to talk about the Fibonacci function ad nauseam.
  • Then there is strong induction.

It’s like you can go to the store and buy one template for quadratic equations, but you have to buy a package of templates for induction, like highway engineers used to buy packages of plastic French curves to draw highway curves without discontinuous curvature.

The template for row reduction

I am running out of time and won’t go into as much detail on this one. Row reduction is an algorithm. If you write it up as a proper computer program there have to be all sorts of if-thens depending on what you are doing it for. For example, if you want solutions to the simultaneous equations

2x+4y+z = 1
x+2y = 0
x+2y+4z = 5

you must row reduce the matrix

2 4 1 1
1 2 0 0
1 2 4 5

(I haven’t yet figured out how to wrap this in parentheses) which gives you

1 2 0 0
0 0 1 0
0 0 0 1

This introduces another problem with templates: They come with conditions. In this case the condition is “a row of three 0s followed by a nonzero number means the equations have no solutions”. (There is another condition when there is a row of all 0’s.)
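To make the separation between the calculation and its condition concrete, here is a minimal sketch in Python. It is my own illustration of ordinary Gauss-Jordan elimination with exact fractions, not a program from any course, and the names `rref` and `is_inconsistent` are hypothetical.

```python
from fractions import Fraction


def rref(rows):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        lead = m[pivot_row][col]
        m[pivot_row] = [x / lead for x in m[pivot_row]]  # scale the pivot to 1
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == len(m):
            break
    return m


def is_inconsistent(reduced):
    """The condition from the text: a row of 0s followed by a nonzero last entry."""
    return any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in reduced)


augmented = [[2, 4, 1, 1], [1, 2, 0, 0], [1, 2, 4, 5]]
reduced = rref(augmented)
```

`rref` does the calculation and knows nothing about equations. The step of leaning back and reading the result is the separate call `is_inconsistent(reduced)`, which is true here because the last row is 0 0 0 1.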

It is very easy for the new student to get the calculation right but to never sit back and see what they have — which conditions apply or whatever.

When you do math you have to repeatedly lean in and focus on the details and then lean back and see the Big Picture. This is something that has to be learned.

What to do, what to do

I have recently experimented with being explicit about templates, in particular going through examples of the use of a template after explicitly stating the template. It is too early to say how successful this is. But I want to point out that even though it might not help to be explicit with students about templates, the analysis in this post of a phenomenon that occurs in beginning abstract math courses

  • may still be accurate (or not), and
  • may help teachers teach such things if they are aware of the phenomenon, even if the students are not.

Notes

  1. Many years ago, I heard someone use the word “template” in the way I am using it now, but I don’t recollect who it was. Applied mathematicians sometimes use it with a meaning similar to mine to refer to soft algorithms–recipes for computation that are not formal algorithms but close enough to be easily translated into a sufficiently high level computer language.
  2. In the formula {ax^2+bx+c}, the “{a}” has the first textual position but the functional position as the coefficient of the quadratic term. This name “functional position” has nothing to do with functions. Can someone suggest a different name that won’t confuse people?
  3. I am using “variable” the way logicians do. Mathematicians would not normally refer to “{\mathcal{P}}” as a variable.
  4. I didn’t say anything about how templates can involve extra layers of abstraction.  That will have to wait.

Three kinds of mathematical thinkers

This is a continuation of my post Syntactic and semantic thinkers, in which I mentioned Leone Burton’s book [1] but hadn’t read it yet.  Well, now it is due back at the library so I’d better post about it!

I recommend this book for anyone interested in knowing more about how mathematicians think about and learn math.  The book is based on in-depth interviews with seventy mathematicians.  (One in-depth interview is worth a thousand statistical studies.)   On page 53, she writes

At the outset of this study, I had two conjectures with respect to thinking style.  The first was that I would find the two different thinking styles, the visual and the analytic, well recorded in the literature… The second was that research mathematicians would move flexibly between the two.  Neither of these conjectures were confirmed.

What she discovered was three styles of mathematical thinking:

Style A: Visual (or thinking in pictures, often dynamic)

Style B: Analytic (or thinking symbolically, formalistically)

Style C: Conceptual (thinking in ideas, classifying)

Style B corresponds more or less with what was called “syntactic” in [3] (based on [2]).  Styles A and C are rather like the distinctions I made in [3] that I called “visual” and “conceptual”, although I really want Style A to communicate not only “visual” but “geometric”.

I recommend jumping through the book reading the quotes from the interviews.  You get a good picture of the three styles that way.

Visual vs. conceptual

I had thought about this distinction before and have had a hard time explaining what “conceptual” means, particularly since for me it has a visual component.  I mentioned this in [3].  I think about various structures and their relationship by imagining them as each in a different part of a visual field, with the connections as near as I can tell felt rather than seen.  I do not usually think in terms of the structures’ names (see [4]).  It is the position that helps me know what I am thinking about.

When it comes time to write up the work I am doing, I have to come up with names for things and find words to describe the relationships that I was feeling (see remark (5) below).  Sometimes I have also written things down and come up with names, and if this happens very much I invariably get a clash of notation that didn’t bother me when I was thinking about the concepts because the notations referred to things in different places.

I would be curious if others do math this way.  Especially people better than I am.  (Clue to a reasonable research career:  Hang around people smarter than you.)

Remarks

1) I have written a lot about images and metaphors [5], [6].  They show up in the way I think about things sometimes.  For example, when I am chasing a diagram I am thinking of each successive arrow as doing something.  But I don’t have any sense that I depend a lot on metaphors.  What I depend on is my experience with thinking about the concept!

2) Some of the questions on Math Overflow are of the “how do I think about…” type (or “what is the motivation for…”).  Some of the answers have been Absolutely Entrancing.

3) Some of the respondents in [1] mentioned intuition, most of them saying that they thought of it as an important part of doing math.  I don’t think the book mentioned any correlation between these feelings and the Styles A, B, C, but then I didn’t read the book carefully.  I never read any book carefully. (In my experience, Style B thinkers of the subtype Logic Rules diss intuition. But not analysts of the sort who estimate errors and so on.)

4) Concerning A, B, C:  I use Style C (conceptual) thinking mostly, but a good bit of Style B (analytic) as well.  I think geometrically when I do geometry problems, but my research has never tended in that direction.  Often the analytic part comes after most of the work has been done, when I have to turn the work into a genuine dry-bones proof.

5) As an example of how I have sometimes worked, I remember doing a paper about lifting group automorphisms (see [7]), in which I had a conceptual picture with a conceptual understanding of the calculations of doing one transformation after another which produced an exact sequence in cohomology.  When I wrote it up I thought it would be short.  But all the verifications made the paper much longer.  The paper was conceptually BigChunk BigChunk BigChunk BigChunk … but each BigChunk required a lot of Analytic work.  Even so, I missed a conceptual point (one of the groups involved was a stabilizer but I didn’t notice that.)

References

[1] Leone Burton, Mathematicians as Enquirers: Learning about Learning Mathematics.  Kluwer, 2004.

[2] Keith Weber, How syntactic reasoners can develop understanding, evaluate conjectures, and generate counterexamples in advanced mathematics. Proof copy available from Science Direct.

[3] Post on this blog: Syntactic and semantic thinkers.

[4] Post: Thinking without words.

[5] Post: Proofs without dry bones.

[6] Abstractmath.org article on Images and Metaphors.

[7] Post: Automorphisms of group extensions updated.
