Pattern recognition in understanding math

Abstract patterns

This post is a revision of the article on pattern recognition in abstractmath.org.

When you do math, you must recognize abstract patterns that occur in

  • Symbolic expressions
  • Geometric figures
  • Relations between different kinds of math structures
  • Your own mental representations of mathematical objects

This happens in high school algebra and in calculus, not just in the higher levels of abstract math.

Examples

Most of these examples are revisited in the section called Laws and Constraints.

At most

For real numbers $x$ and $y$, the phrase “$x$ is at most $y$” means by definition $x\le y$. To understand this definition requires recognizing the pattern “$x$ is at most $y$” no matter what expressions occur in place of $x$ and $y$, as long as they evaluate to real numbers.

Examples

  • “$\sin x$ is at most $1$” means that $\sin x\le 1$. This happens to be true for all real $x$.
  • “$3$ is at most $7$” means that $3\leq7$. You may think that “$3$ is at most $7$” is a silly thing to say, but it nevertheless means that $3\leq7$ and so is a correct statement.
  • “$x^2+(y-1)^2$ is at most $5$” means that $x^2+(y-1)^2\leq5$. This is true for some pairs $(x,y)$ and false for others, so it is a constraint. It defines the closed disk of radius $\sqrt{5}$ centered at $(0,1)$.
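
One way to see that the last example is a constraint rather than a law is to test a few pairs. The sketch below is only an illustration; the function names `at_most` and `in_disk` are mine and do not come from abstractmath.org.

```python
def at_most(x, y):
    """Return True when "x is at most y", that is, when x <= y."""
    return x <= y


def in_disk(x, y):
    """Test the constraint "x^2 + (y-1)^2 is at most 5" for one pair (x, y)."""
    return at_most(x**2 + (y - 1)**2, 5)


print(in_disk(0, 1))  # the center: 0 is at most 5
print(in_disk(2, 2))  # 4 + 1 = 5, so this pair is on the boundary
print(in_disk(3, 1))  # 9 + 0 = 9, which is not at most 5
```

The same `at_most` pattern is matched three times with different expressions in place of $x$ and $y$; only the third pair fails the constraint.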

The product rule for derivatives

The product rule for differentiable functions $f$ and $g$ tells you that the derivative of $f(x)g(x)$ is \[f'(x)\,g(x)+f(x)\,g'(x)\]

Example

You recognize that the expression ${{x}^{2}}\sin x$ fits the pattern $f(x)g(x)$ with $f(x)={{x}^{2}}$ and $g(x)=\sin x$. Therefore you know that the derivative of ${{x}^{2}}\,\sin x$ is \[2x\sin x+{{x}^{2}}\cos x\]
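
If you want reassurance that the pattern was matched correctly, you can compare the formula with a numerical estimate of the derivative. This is a rough sketch using a central difference; the helper names and the step size are my own choices, not part of the original article.

```python
import math


def numerical_derivative(h, x, step=1e-6):
    """Central-difference approximation to h'(x)."""
    return (h(x + step) - h(x - step)) / (2 * step)


def product(x):
    """The expression x^2 sin x, read as f(x) g(x)."""
    return x**2 * math.sin(x)


def product_rule_answer(x):
    """f'(x) g(x) + f(x) g'(x) with f(x) = x^2 and g(x) = sin x."""
    return 2 * x * math.sin(x) + x**2 * math.cos(x)


for x in (0.5, 1.0, 2.0):
    print(x, numerical_derivative(product, x), product_rule_answer(x))
```

The two columns should agree to several decimal places; a mismatch would suggest that $f$ and $g$ were not matched to the pattern correctly.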

The quadratic formula

The quadratic formula for the solutions of an equation of the form $a{{x}^{2}}+bx+c=0$ is usually given as \[r=\frac{-b\pm\sqrt{{{b}^{2}}-4ac}}{2a}\]

Example

If you are asked for the roots of $3{{x}^{2}}-2x-1=0$, you recognize that the polynomial on the left fits the pattern $a{{x}^{2}}+bx+c$ with

  • $a\leftarrow3$ (“$a$ replaced by $3$”)
  • $b\leftarrow-2$
  • and $c\leftarrow-1$.

Then substituting those values in the quadratic formula gives you the roots $-1/3$ and $1$.
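
The substitution can be written out as a small function whose parameters are named after the letters in the pattern. This is a minimal sketch, assuming $a\neq0$; the function name `quadratic_roots` is hypothetical, and `cmath.sqrt` is used so that a negative discriminant does not raise an error.

```python
import cmath


def quadratic_roots(a, b, c):
    """Roots of a*x^2 + b*x + c = 0 by the quadratic formula (requires a != 0)."""
    if a == 0:
        raise ValueError("not a quadratic equation: a must be nonzero")
    root = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + root) / (2 * a), (-b - root) / (2 * a))


# a <- 3, b <- -2, c <- -1
r1, r2 = quadratic_roots(3, -2, -1)
print(r1, r2)
```

The results come back as complex numbers with zero imaginary part, equal to $1$ and $-1/3$ up to rounding.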

Difficulties with the quadratic formula

A little problem

The quadratic formula is easy to use but it can still cause pattern recognition problems. Suppose you are asked to find the solutions of $3{{x}^{2}}-7=0$. Of course you can do this by simple algebra — but pretend that the first thing you thought of was using the quadratic formula.

  • Then you got upset because you have to apply it to $a{{x}^{2}}+bx+c$
  • and $3{{x}^{2}}-7$ has only two terms
  • but $a{{x}^{2}}+bx+c$ has three terms…
  • (Help!)
  • Do Not Be Anguished:
  • Write $3{{x}^{2}}-7$ as $3{{x}^{2}}+0\cdot x-7$, so $a=3$, $b=0$ and $c=-7$.
  • Then put those values into the quadratic formula and you get $x=\pm \sqrt{\frac{7}{3}}$.   
  • This is an example of the following useful principle:


    Write zero cleverly.

    I suspect that most people reading this would not have had the problem with $3{{x}^{2}}-7$ that I have just described. But before you get all insulted, remember:


    The thing about really easy examples is that they give you the point without getting you lost in some complicated stuff you don’t understand very well.

    A fiendisher problem

      Even college students may have trouble with the following problem (I know because I have tried it on them):

    What are the solutions of the equation $a+bx+c{{x}^{2}}=0$?

    The answer

    \[r=\frac{-b\pm\sqrt{{{b}^{2}}-4ac}}{2a}\]

    is wrong. The correct answer is

    \[r=\frac{-b\pm\sqrt{{{b}^{2}}-4ac}}{2c}\]


    When you remember a pattern with particular letters in it and an example has some of the same letters in it, make sure they match the pattern!
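
One way to make the matching explicit is to put primes on the letters of the remembered pattern so that they cannot be confused with the letters in the problem. The primed letters $a'$, $b'$, $c'$ are my own labels for this sketch, not notation from the article.

```latex
\begin{align*}
a+bx+cx^2=0 \quad&\Longleftrightarrow\quad cx^2+bx+a=0\\
\text{pattern } a'x^2+b'x+c'=0:\quad& a'\leftarrow c,\qquad b'\leftarrow b,\qquad c'\leftarrow a\\
r=\frac{-b'\pm\sqrt{{b'}^{2}-4a'c'}}{2a'} \quad&=\quad \frac{-b\pm\sqrt{b^{2}-4ca}}{2c}
\end{align*}
```

Under the square root the swap of $a$ and $c$ is invisible because $4ca=4ac$, but in the denominator it matters.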

    The substitution rule for integration

    The chain rule says that the derivative of a function of the form $f(g(x))$ is $f'(g(x))g'(x)$. From this you get the substitution rule for finding indefinite integrals:

                                      \[\int{f'(g(x))g'(x)\,dx}=f(g(x))+C\]

    Example

    To find $\int{2x\,\cos({{x}^{2}})\,dx}$, you recognize that you can take $f(x)=\sin x$ and $g(x)={{x}^{2}}$ in the formula, getting \[\int{2x\,\cos ({{x}^{2}})\,dx}=\sin ({{x}^{2}})+C\] Note that in the way I wrote the integral, the functions occur in the opposite order from the pattern. That kind of thing happens a lot.
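
A quick check, assuming nothing beyond the chain rule stated above: differentiating the answer should give back the integrand, with the two factors in the order the pattern uses.

```latex
\[
\frac{d}{dx}\bigl(\sin(x^2)+C\bigr)=\cos(x^2)\cdot\frac{d}{dx}\bigl(x^2\bigr)=\cos(x^2)\cdot 2x=2x\cos(x^2)
\]
```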

    Laws and constraints

    • The statement “$(x+1)^2=x^2+2x+1$” is a pattern that is true for all numbers $x$. For example, $3^2=2^2+2\times2+1$ and $(-2)^2=(-3)^2+2\times(-3)+1$, and so on. Such a pattern is a universal assertion, so it is a theorem. When the statement is an equation, as in this case, it is also called a law.
    • The statement “$\sin x\leq 1$” is also true for all $x$, and so is a theorem.
    • The statement “$x^2+(y-1)^2$ is at most $5$” is true for some real numbers and not others, so it is not a theorem, although it is a constraint.
    • The quadratic formula says that:
      The solutions of an equation of the form $a{{x}^{2}}+bx+c=0$ are given by \[r=\frac{-b\pm\sqrt{{{b}^{2}}-4ac}}{2a}\]

      This is true for all complex numbers $a$, $b$, $c$ with $a\neq 0$.
      The $x$ in the equation is not a free variable, but a “variable to be solved for” and does not appear in the quadratic formula. Theorems like the quadratic formula are usually called “formulas” rather than “laws”.

    • The product rule for derivatives

      The derivative of $f(x)g(x)$ is $f'(x)\,g(x)+f(x)\,g'(x)$

      is true for all differentiable functions $f$ and $g$. That means it is true for both of these choices of $f$ and $g$:

      • $f(x)=x$ and $g(x)=x\sin x$
      • $f(x)=x^2$ and $g(x)=\sin x$

      But both choices of $f$ and $g$ refer to the same function $x^2\sin x$, so if you apply the product rule in either case you should get the same answer. (Try it.)
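
If you want to compare with your own attempt, here is one way the two computations can go. Note that the choice $g(x)=x\sin x$ needs the product rule a second time to find $g'$.

```latex
\begin{align*}
f(x)=x^2,\ g(x)=\sin x:\quad
  & f'(x)\,g(x)+f(x)\,g'(x)=2x\sin x+x^2\cos x\\[4pt]
f(x)=x,\ g(x)=x\sin x:\quad
  & g'(x)=\sin x+x\cos x\\
  & f'(x)\,g(x)+f(x)\,g'(x)=x\sin x+x(\sin x+x\cos x)=2x\sin x+x^2\cos x
\end{align*}
```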

    Some bothersome types of pattern recognition

    Dependence on conventions

    Definition: A quadratic polynomial in $x$ is an expression of the form $a{{x}^{2}}+bx+c$.

    Examples

    • $-5{{x}^{2}}+32x-5$ is a quadratic polynomial: You have to recognize that it fits the pattern in the definition by writing it as $(-5){{x}^{2}}+32x+(-5)$
    • So is ${{x}^{2}}-1$: You have to recognize that it fits the definition by writing it as ${{x}^{2}}+0\cdot x+(-1)$ (I wrote zero cleverly).

    Some authors would just say, “A quadratic polynomial is an expression of the form $a{{x}^{2}}+bx+c$” leaving you to deduce from conventions on variables that it is a polynomial in $x$ instead of in $a$ (for example).

    Note also that I have deliberately not mentioned what sorts of numbers $a$, $b$, $c$ and $x$ are. The authors may assume that you know they are using real numbers.

    An expression as an instance of substitution

    One particular type of pattern recognition that comes up all the time in math is recognizing that a given expression is an instance of a substitution into a known expression.

    Example

    Students are sometimes baffled when a proof uses the fact that ${{2}^{n}}+{{2}^{n}}={{2}^{n+1}}$ for positive integers $n$. This requires the recognition of the patterns $x+x=2x$ and $2\cdot{{2}^{n}}={{2}^{n+1}}$.

    Similarly ${{3}^{n}}+{{3}^{n}}+{{3}^{n}}={{3}^{n+1}}$.
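
Written out with each pattern named at the step where it is used, the two facts look like this. The annotations are my own gloss on the argument.

```latex
\begin{align*}
2^n+2^n &= 2\cdot 2^n && \text{(the pattern $x+x=2x$ with $x\leftarrow 2^n$)}\\
        &= 2^{n+1}    && \text{(the pattern $2\cdot 2^n=2^{n+1}$)}\\[4pt]
3^n+3^n+3^n &= 3\cdot 3^n && \text{(the pattern $x+x+x=3x$ with $x\leftarrow 3^n$)}\\
        &= 3^{n+1}    && \text{(the pattern $3\cdot 3^n=3^{n+1}$)}
\end{align*}
```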

    Example

    The assertion

    \[{{x}^{2}}+{{y}^{2}}\ge 0\ \ \ \ \ \text{(1)}\]

    has as a special case

    \[(-x^2-y^2)^2+(y^2-x^2)^2\ge 0\ \ \ \ \ \text{(2)}\]

    which involves the substitutions $x\leftarrow -{{x}^{2}}-{{y}^{2}}$ and $y\leftarrow {{y}^{2}}-{{x}^{2}}$.

    Remarks
    • If you see (2) in a text and the author blithely says it is “never negative”, that is because it is of the form \[{{x}^{2}}+{{y}^{2}}\ge 0\] with certain expressions substituted for $x$ and $y$. (See substitution and The only axiom for algebra.)
    • The fact that there are minus signs in (2) and that $x$ and $y$ play different roles in (1) and in (2) are red herrings. See ratchet effect and variable clash.
    • Most people with some experience in algebra would see quickly that (2) is correct by using chunking. They would visualize (2) as

      \[(\text{something})^2+(\text{another something})^2\ge0\]
      This shows that in many cases


      chunking is a psychological inverse to substitution

    • Note that when you make these substitutions you have to insert appropriate parentheses (more here). After you make the substitution, the expression of course can be simplified a whole bunch, to

      \[2({{x}^{4}}+{{y}^{4}})\ge0\]

    • A common cause of error in doing this (a mistake I make sometimes) is to try to substitute and simplify at the same time. If the situation is complicated, it is best to

      substitute as literally as possible and then simplify
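
Following that advice on (2), the substitution is done first, parentheses and all, and only then is each square expanded. This is one possible order of steps, not the only one.

```latex
\begin{align*}
(-x^2-y^2)^2+(y^2-x^2)^2
  &= (x^4+2x^2y^2+y^4)+(y^4-2x^2y^2+x^4)\\
  &= 2x^4+2y^4\\
  &= 2(x^4+y^4)
\end{align*}
```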

    Integration by Parts

    The rule for integration by parts says that

    \[\int{f(x)\,g'(x)\,dx}=f(x)\,g(x)-\int{f'(x)\,g(x)\,dx}\]

    Suppose you need to find $\int{\log x\,dx}$. (In abstractmath.org, “log” means ${{\log }_{e}}$.) Then we can recognize this integral as having the pattern for the left side of the parts formula with $f(x)=\log x$ and $g'(x)=1$, so that $g(x)=x$ and $f'(x)=\frac{1}{x}$. Therefore

    \[\int{\log x\,dx}=x\log x-\int{\frac{1}{x}\cdot x\,dx}=x\log x-x+C\]
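
As a sanity check, differentiating the result should give back $\log x$. The first term needs the product rule.

```latex
\[
\frac{d}{dx}\bigl(x\log x-x+C\bigr)=\Bigl(\log x+x\cdot\frac{1}{x}\Bigr)-1=\log x
\]
```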

    How on earth did I think to recognize $\log x$ as $1\cdot \log x$??
    Well, to tell the truth, because some nerdy guy (perhaps I should say some other nerdy guy) clued me in when I was taking freshman calculus. Since then I have used this device lots of times without someone telling me, but not the first time.

    This is an example of another really useful principle:


    Write $1$ cleverly.

    Two different substitutions give the same expression

    Some proofs involve recognizing that a symbolic expression or figure fits a pattern in two different ways. This is illustrated by the next two examples. (See also the remark about the product rule above.) I have seen students flummoxed by Example ID, and Example ISO is a proof that is supposed to have flummoxed medieval geometry students.

    Example ID

    Definition: In a set with an associative binary operation and an identity element $e$, an element $y$ is the inverse of an element $x$ if

    \[xy=e\ \ \ \ \text{and}\ \ \ \ yx=e \ \ \ \ (1)\]

    In this situation, it is easy to see that $x$ has only one inverse: If $xy=e$ and $xz=e$ and $yx=e$ and $zx=e$, then \[y=ey=(zx)y=z(xy)=ze=z\]

    Theorem: ${{({{x}^{-1}})}^{-1}}=x$.

    Proof: I am given that ${{x}^{-1}}$ is the inverse of $x$. By definition, this means that

    \[x{{x}^{-1}}=e\ \ \ \text{and}\ \ \ {{x}^{-1}}x=e \ \ \ \ (2)\]

    To prove the theorem, I must show that $x$ is the inverse of ${{x}^{-1}}$. Because $x^{-1}$ has only one inverse, all we have to do is prove that

    \[{{x}^{-1}}x=e\ \ \ \text{and}\ \ \ x{{x}^{-1}}=e\ \ \ \ (3)\]  

    But (2) and (3) are equivalent! (“And” is commutative.)

    Example ISO

    This sort of double substitution occurs in geometry, too.

    Theorem: If a triangle has two equal angles, then it has two equal sides.

    Proof: In the figure, assume $\angle ABC=\angle ACB$. Then triangle $ABC$ is congruent to triangle $ACB$ since the sides $BC$ and $CB$ are equal (they are the same line segment!) and the adjoining angles are equal by hypothesis. The congruence matches side $AB$ with side $AC$, so those two sides are equal.

    The point is that although triangles $ABC$ and $ACB$ are the same triangle, and sides $BC$ and $CB$ are the same line segment, the proof involves recognizing them as geometric figures in two different ways.

    This proof (not Euclid’s original proof) is hundreds of years old and is called the pons asinorum (bridge of donkeys). It became famous as the first theorem in Euclid’s books that many medieval students could not understand. I conjecture that the name comes from the fact that the triangle as drawn here resembles an ancient arched bridge. These days, isosceles triangles are usually drawn taller than they are wide.

    Technical problems in carrying out pattern matching

    Parentheses

    In matching a pattern you may have to insert parentheses. For example, if you substitute $x+1$ for $a$, $2y$ for $b$ and $4$ for $c$ in the expression \[{{a}^{2}}+{{b}^{2}}={{c}^{2}}\] you get \[{{(x+1)}^{2}}+4{{y}^{2}}=16\]
    If you did the substitution literally without editing the expression so that it had the correct meaning, you would get \[x+{{1}^{2}}+2{{y}^{2}}={{4}^{2}}\] which is not the result of performing the substitution in the expression ${{a}^{2}}+{{b}^{2}}={{c}^{2}}$.   
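
The difference between the two kinds of substitution shows up clearly if you treat the expression as a string of characters. The following is a toy sketch, assuming single-letter variable names and a plain-text `^` for exponents; the function `substitute` is hypothetical and is not a real computer algebra routine.

```python
import re


def substitute(expression, replacements, parenthesize=True):
    """Replace each single-letter variable by its replacement text.

    With parenthesize=True every replacement is wrapped in parentheses,
    so the result keeps the structure of the original expression.
    """
    def swap(match):
        name = match.group(0)
        if name not in replacements:
            return name
        text = replacements[name]
        return f"({text})" if parenthesize else text

    # re.sub scans the original string once, so letters that appear
    # inside a replacement (such as x and y) are not substituted again.
    return re.sub(r"[a-z]", swap, expression)


pattern = "a^2 + b^2 = c^2"
values = {"a": "x+1", "b": "2y", "c": "4"}

print(substitute(pattern, values, parenthesize=False))  # x+1^2 + 2y^2 = 4^2
print(substitute(pattern, values))                      # (x+1)^2 + (2y)^2 = (4)^2
```

The second output is the literal but correct substitution; simplifying it to $(x+1)^2+4y^2=16$ is a separate step.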

    Order switching

    You can easily get confused if the patterns involve a switch in the order of the variables.

    Notation for integer division

    • For integers $m$ and $n$, the phrase “$m$ divides $n$” means there is an integer $q$ for which $n=qm$.
    • In number theory (which in spite of its name means the theory of positive integers) the vertical bar is used to denote integer division. So $3|6$ because $6=2\times 3$ ($q$ is $2$ in this case). But “$3|7$” is false because there is no integer $q$ for which $7=q\times 3$.
    • An equivalent definition of division says that $m|n$ if and only if $n/m$ is an integer. Note that $6/3=2$, an integer, but $7/3$ is not an integer.
    • Now look at those expressions:
    • “$m|n$” means that there is an integer $q$ for which $n=qm$. In these two expressions, $m$ and $n$ occur in opposite order.
    • “$m|n$” is true only if $n/m$ is an integer. Again, they are in opposite order. Another way of writing $n/m$ is $\frac{n}{m}$. When math people pronounce “$\frac{n}{m}$” they usually say, “$n$ over $m$” using the same order.
  • I taught this notation in courses for computer engineering and math majors for years. Some of the students stayed hopelessly confused through several lectures and lost points repeatedly on homework and exams by getting these symbols wrong.
  • The problem was not helped by the fact that “$|$” and “$/$” are similar but have very different syntax:

    Math notation gives you no clue which symbols are operators (used to form expressions) and which are verbs (used to form assertions).

  • A majority of the students didn’t have so much trouble with this kind of syntax. I have noticed that many people have no sense of syntax and other people have good intuitive understanding of syntax. I suspect the second type of people find learning foreign languages easy.
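
In a programming language the difference between the verb “$|$” and the operator “$/$” is easier to see, because one returns a truth value and the other returns a number. This is a minimal sketch; the function name `divides` is my own, and the case $m=0$ is handled by applying the definition literally.

```python
def divides(m, n):
    """Return True when m | n, i.e. there is an integer q with n == q * m."""
    if m == 0:
        # n == q * 0 is possible only when n is 0.
        return n == 0
    return n % m == 0


print(divides(3, 6))  # True, because 6 == 2 * 3
print(divides(3, 7))  # False, no integer q gives 7 == q * 3
print(divides(6, 3))  # False: the order of m and n matters
print(6 / 3)          # 2.0, a number rather than an assertion
```

Note that the test inside `divides(m, n)` is `n % m`, with the arguments in the opposite order from the function call, which is the same order switch described above.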

    This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.



    Computable algebraic expressions in tree form

    Invisible algebra

    An expression such as $4(x-2)=6$ has an invisible abstract structure. In this simple case it is

    [Figure: tree form of $4(x-2)=6$]

    using the style of presenting trees used in academic computing science. The parentheses are a clue to the structure; omitting them results in $4x-2=6$, which has the different structure

    [Figure: tree form of $4x-2=6$]

    By the time students take calculus they supposedly have learned to perceive and work with this invisible structure, but many of them still struggle with it.  They have a lot of trouble with more complex expressions, but even something like $\sin x + y$ gives some of them trouble.
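
The two structures can also be written down as nested data. This is a rough Python sketch rather than the Mathematica code from the notebook; the node labels Equals, Times and Plus follow the labels mentioned later in this post, while the tuple representation and the function `evaluate` are my own assumptions.

```python
import operator

# Each internal node is (label, left subtree, right subtree).
OPS = {"Plus": operator.add, "Times": operator.mul, "Equals": operator.eq}


def evaluate(tree, x):
    """Evaluate a tree bottom-up for a given value of x."""
    if tree == "x":
        return x
    if isinstance(tree, (int, float)):
        return tree
    label, left, right = tree
    return OPS[label](evaluate(left, x), evaluate(right, x))


with_parens = ("Equals", ("Times", 4, ("Plus", "x", -2)), 6)     # 4(x-2) = 6
without_parens = ("Equals", ("Plus", ("Times", 4, "x"), -2), 6)  # 4x-2 = 6

print(evaluate(with_parens, 3.5))     # True:  4 * 1.5 is 6
print(evaluate(without_parens, 3.5))  # False: 14 - 2 is 12
print(evaluate(without_parens, 2))    # True:  8 - 2 is 6
```

In the first tree Plus sits below Times, so the subtraction is calculated first; in the second tree the positions are exchanged.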

    Make the invisible visible

    The tree expression makes the invisible structure explicit. Some math educators such as Jason Dyer and Bret Victor have experimented with the idea of students working directly with a structured form of an algebraic expression, including making the structured form interactive.

    How could the tree structure be used to help struggling algebra students?

    1) If they are learning on the computer, the program could provide the tree structure at the push of a button. Lessons could be designed to present algebraic expressions that look similar but have different structure.

    2) You could point out things such as:

    a) “inside the parentheses pushes it lower in the tree”
    b) “lower in the tree means it is calculated earlier”

    3) More radically, you could teach algebra directly using the tree structure, with the intention of introducing the expression-as-a-string form later.  This is analogous to the use of the initial teaching alphabet for beginners at reading, and also the use of shape notes to teach sight reading of music for singing.  Both of these methods have been shown to help beginners, but the ITA didn’t catch on and although lots of people still sing from shape notes (See Note 1) they are not as far as I know used for teaching in school.

    4) You could produce an interactive form of the structure tree that the student could use to find the value or solve the equation.  But that needs a section to itself.

    Interactive trees

    When I discovered the TreeForm command in Mathematica (which I used to make the trees above), I was inspired to use it and the Manipulate command to make the tree interactive.


    This is a screenshot of what Mathematica shows you.  When this is running in Mathematica, moving the slider back and forth causes the dependent values in the tree to change as well, and when you slide to 3.5, the slot corresponding to $4(x-2)$ becomes 6 and the slot over “Equals” becomes “True”.

    As seen in this post, these are just screen shots that you can’t manipulate.  The Mathematica notebook Expressions.nb gives the code for this and lets you experiment with it.  If you don’t have Mathematica available to you, you can still manipulate the tree with the slider if you download the CDF form of the notebook and open it in Mathematica CDF Player, which is available free here.  The abstractmath website has other notebooks you may want to look at as well.

    Moving the slider back and forth constitutes finding the correct value of x by experiment.  This is a peculiar form of bottom-up evaluation.   With an expression whose root node is a value rather than an equation, wiggling the slider constitutes calculating various values with all the intermediate steps shown as you move it.  Bret Victor’s blog shows a similar system, though not showing the tree.

    Another way to use the tree is to arrange to show it with the calculated values blank.  (The constants and the labels showing the operation would remain.)   The student could start at the top blank space (over Times)  and put in the required value, which would obviously have to be 6 to make the space over Equals change to “True”.  Then the blank space over Plus would have to be 1.5 in order to make multiplying it by 4 be 6.  Then the bottom left blank space would have to be 3.5 to make it equal to 1.5 when -2 is added.  This is top down evaluation.
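
The top-down procedure just described can be sketched in a few lines. This is an illustration in Python, not the Mathematica notebook's code; it assumes a tuple representation of the tree, handles only Times and Plus, and assumes that at each node exactly one child contains x while the other is a plain number.

```python
def contains_x(tree):
    """Return True when the variable x occurs somewhere in the tree."""
    if tree == "x":
        return True
    if isinstance(tree, tuple):
        return any(contains_x(child) for child in tree[1:])
    return False


def solve_top_down(tree, required):
    """List the value each blank must hold, from the top of the tree down to x."""
    steps = []
    while tree != "x":
        label, left, right = tree
        unknown, known = (left, right) if contains_x(left) else (right, left)
        steps.append((label, required))
        if label == "Times":
            required = required / known   # undo multiplying by the constant
        elif label == "Plus":
            required = required - known   # undo adding the constant
        else:
            raise ValueError(f"cannot invert {label}")
        tree = unknown
    steps.append(("x", required))
    return steps


left_side = ("Times", 4, ("Plus", "x", -2))   # 4(x-2)
print(solve_top_down(left_side, 6))
```

For the left side of $4(x-2)=6$ the list reads 6 over Times, 1.5 over Plus and 3.5 for x, matching the values filled in by hand above.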

    You could have the student enter these numbers in the blank spaces on the computer or print out the tree with blank spaces and have them do it with a pencil.  Jason Dyer’s blog has examples.

    Implementation

    My example code in the notebook is a kludge.  If you defined a special VertexRenderingFunction for TreeForm in Mathematica, you could create a function that would turn any algebraic expression into a manipulatable tree with a slider like the one above (or one with blank spaces to be filled in) [Note 2]. I expect I will work on that some time soon but my main desire in this series of blog posts is to throw out ideas with some Mathematica code attached that others might want to develop further. You are free to reuse all the Mathematica code and all my blog posts under the Creative Commons Attribution – ShareAlike 3.0 License.  I would like to encourage this kind of open-source behavior.

    Notes

    1. Including me every Tuesday at 5:30 pm in Minneapolis (commercial).

    2. There is a problem with Equals.  In the hacked example above I set the increment the value jumps by when the slider is moved to 0.1, so that the correct value 3.5 occurs when you slide.  If you had an equation with an irrational root this would not work.  One thing that should work is to introduce a fuzzy form of Equals with the slide-increment smaller than the latitude allowed in the fuzzy Equals.
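
Here is a rough sketch of the fuzzy Equals idea in Python rather than Mathematica. The equation $x^2=2$, the slider range, the increment 0.001 and the latitude 0.01 are all illustrative choices of mine. Note that in this sketch the latitude is measured on the value of the expression, so how small the increment must be also depends on how fast the expression changes.

```python
def slider_positions(lo, hi, step):
    """All values the slider can take between lo and hi."""
    count = int(round((hi - lo) / step))
    return [lo + k * step for k in range(count + 1)]


def fuzzy_equals(left, right, latitude):
    """Treat two values as equal when they differ by at most latitude."""
    return abs(left - right) <= latitude


positions = slider_positions(0.0, 3.0, 0.001)

# Exact Equals never becomes True, because no slider position is the root of x^2 = 2.
exact_hits = [x for x in positions if x * x == 2]

# Fuzzy Equals becomes True for a few positions near the root.
fuzzy_hits = [x for x in positions if fuzzy_equals(x * x, 2, latitude=0.01)]

print(len(exact_hits))
print(fuzzy_hits[0], fuzzy_hits[-1])
```

With these settings the fuzzy version reports a short run of slider positions around $\sqrt{2}$, while the exact version reports none.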
