Wednesday, February 10, 2010

Algorithms: The first stage in learning or the final stage in understanding?

Consider the above as a prompt, a thought experiment.  Is it better to use algorithms as the first step in a learning process, or as the culmination?  This really hinges on that word better, as in, better for what purpose?  Better for learning what?  As we all know, the question is rarely asked, “Is our students learning?”  -- but I don’t know that we’re all on the same page about what we’re trying to teach!

First things first, let’s have a definition of terms for the lay reader.  An algorithm is a series of explicit steps to accomplish some task.  Here’s an example taken from my class’ algebra textbook of an algorithm to add two numbers:

Rules of Addition:

  • To add two numbers of the same sign,
    1. Add their absolute values
    2. Attach the common sign
  • To add two numbers with opposite signs
    1. Subtract the smaller absolute value from the larger one
    2. Attach the sign of the number with larger absolute value

I don’t want to be polemical, but reading that I get the mental image of that screeching sound of a sudden halt that you hear on TV.  It hits me right around the point where I get to “Subtract the smaller absolute value from the larger one”… what?
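For the programmers in the audience, here is a minimal sketch of what the textbook rule looks like when you spell it out literally in Python. The function name add_signed is made up for illustration, and treating zero as non-negative is my own assumption, since the rule as printed doesn’t say what sign zero has.

```python
def add_signed(a, b):
    """Add two numbers by following the textbook's sign rules step by step."""
    if (a >= 0) == (b >= 0):
        # Same sign: add the absolute values, then attach the common sign.
        # Assumption: zero counts as non-negative.
        magnitude = abs(a) + abs(b)
        sign = 1 if a >= 0 else -1
    else:
        # Opposite signs: subtract the smaller absolute value from the larger,
        # then attach the sign of the number with the larger absolute value.
        larger, smaller = (a, b) if abs(a) >= abs(b) else (b, a)
        magnitude = abs(larger) - abs(smaller)
        sign = 1 if larger >= 0 else -1
    return sign * magnitude


print(add_signed(7, -10))
print(add_signed(-3, -4))
```

That is a lot of branching to reproduce what `a + b` already does, which is probably fine for a computer and probably not how anyone actually thinks about adding.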

At the risk of sounding arrogant, I feel like if I read something in an introductory math book and I don’t get it the first time around, despite just about having my PhD in Applied Physics, something is amiss.  It’s not a foolproof test, but it does have a good success rate.  Still, I value the input of you, my dear (imaginary?) readers, to tell me if I’m off-base here.

Frankly, I think math is one of those cases where you have to be very careful about the balance between the general and the specific.  Normally, I’m all about teaching things in the context of a general framework.  My calculus class, for example – they are great at applying things like the product or quotient rule, provided you give them two functions named f and g and tell them to find the derivative of fg or f/g.  But, give the problem in the book a different spin so that it doesn’t match up verbatim with what they’ve seen before, and… disaster.  They’ve just memorized specific ways to do specific problems, without having a framework to put it all in.

Nevertheless, I want to draw a distinction between a conceptual framework and an algorithm.  Let’s take an example from algebra: finding the equation for a line.  As it happens, there are really two fundamental ways to do this, and students are required to learn both.  They are:

  1. Slope-intercept form, y = m*x+b
  2. Point-slope form, (y-y1) = m*(x-x1)

Now, an algorithm for slope-intercept form would say:

  • If you are given two points (x1,y1) and (x2,y2):
    1. Find the slope of the line (m) between the two points as m = rise / run =  (y2-y1) / (x2-x1)
    2. Choose one of the two points; we’ll assume you chose (x1,y1).
    3. Find the y-intercept (b) by solving the equation y1 = m*x1 + b for the value of b
    4. Write your final answer in the form y = m*x + b, where x and y are variables and not the specific values for either point
  • If you have a point (x1,y1) and a slope m:
    1. You already have the slope m, so just do steps 3 and 4 from above.
  • If you have a point (x1,y1) and another line in the form y = m*x + b that you are told is parallel to the line through the point (x1,y1)
    1. Parallel lines must have the same slope, so you know that the slope of your line is the same as the slope m of the parallel line, so that’s your m too.  Do steps 3 and 4 from above.
  • If you have a point (x1,y1) and another line in the form y = m*x + b that you are told is perpendicular to the line through the point (x1,y1)
    1. Perpendicular lines have negative reciprocal slopes, so you know that your slope is –1/m where m is the slope of the other line, i.e., if it was y = 4*x+5 then your slope would be –1/4.  Do steps 3 and 4 from above.
    2. Don’t get confused by using m’s in two places here.  The point is that both m’s are slopes of different lines, but  (your slope) = –1 / (their slope).

I don’t think that’s even a complete algorithm!  They could give the other line in point-slope instead of slope-intercept form, they could tell you the point by specifying the intersection of two other, completely different lines that your third line has to pass through… the point is that the problem can be arbitrarily complicated, and thus so can your algorithm.  We haven’t even touched on point-slope yet!
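To make the sprawl concrete, here is a rough Python sketch of just the four cases listed above. All of the function names are hypothetical, and I use exact fractions (Python’s standard fractions module) so that a slope like –1/4 stays –1/4 instead of turning into a rounded decimal.

```python
from fractions import Fraction


def solve_for_b(m, point):
    """Step 3: solve y1 = m*x1 + b for b."""
    x1, y1 = point
    return y1 - m * x1


def from_two_points(p1, p2):
    """Case 1: two points. Step 1 finds m, step 2 picks p1, step 3 finds b."""
    (x1, y1), (x2, y2) = p1, p2
    m = Fraction(y2 - y1) / Fraction(x2 - x1)  # slope = rise / run
    return m, solve_for_b(m, p1)


def from_point_and_slope(point, m):
    """Case 2: the slope is given, so only step 3 is left."""
    m = Fraction(m)
    return m, solve_for_b(m, point)


def from_point_and_parallel(point, other_m):
    """Case 3: a parallel line has the same slope."""
    return from_point_and_slope(point, other_m)


def from_point_and_perpendicular(point, other_m):
    """Case 4: (your slope) = -1 / (their slope)."""
    return from_point_and_slope(point, -1 / Fraction(other_m))


# Step 4: write the answer in the form y = m*x + b.
m, b = from_point_and_perpendicular((4, 1), 4)
print(f"y = {m}*x + {b}")
```

Even this short version has holes. Hand from_two_points two points with the same x-value (a vertical line) and the slope calculation divides by zero, so the function raises ZeroDivisionError, and none of the four cases says what to do about it.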

Trying to extend an addition algorithm including absolute values may be good computer programming practice, but it’s not good pedagogy.  Here’s my counterexample of a decent conceptual framework for finding the linear equation:

  • Start by looking at a line on a graph.  Ask yourself, how can we distinguish this line from any other line we might draw?  What makes it unique, one of a kind?  How could I make any line look like any other if I could stretch it and push it and pull it and move it around?
  • It turns out there are only two ways to change it, two things that are important and that make a line a line.  One is how steep it is, which we call a slope.  A hill is really steep if it goes up really far without going very far horizontally, so for example a handicap ramp would not be very steep, so it would have a small slope, while a staircase might have a large slope and an elevator would just have a super-enormous slope.  So, to decide how steep a line is, I need two pieces of information:  how much does it rise, and how much does it go horizontally, which we call how far the graph runs from left to right.  The slope is just the rise divided by the run, so slope = rise/run.
  • Remember that there’s another thing we can change – I could shift a line up and down or left and right.  For example, a handicap ramp has the same slope whether it’s on the first or second story, or in this room or the next, but those are all different ramps.  In math, we just talk about it moving up and down, because our lines go on forever so a shift up and down can look the same as a shift left or right (Something about pictures and kilo-words comes to mind here, alas).
  • So, we need two pieces of information to talk about lines: a slope, and how far up or down it should be shifted.  If we have a slope and a point the line has to go through, we can pin that line down and know exactly where to draw it.  But sometimes, we can be tricky about how we give our two pieces of information.  For example,
    • we could give two points instead of a slope and a point – then we could figure out from the second point how much the line would rise and how much it would run from the first point, so the second point would hold the key to finding the slope.
    • we could give one point, and then tell you the slope of a different line that we said was parallel to the first.  Parallel lines are like handicap ramps on another story or in another room – they have exactly the same slope, just a different shift up, down, left or right.  So you’d have a point and you’d know your slope was really the same as the other line’s slope!
    • kind of like above, we could give one point and then tell you the slope of another line that was perpendicular -- (discussion of the –1/m bit would require a picture).
  • Of course, the upshot is, you always need two bits of information: a point to pin down your line in one spot and a slope to decide how steep to draw it.  It just happens to turn out that there are a lot of ways to represent that second bit of information, the slope, with other facts like the location of a second point, or the slopes of parallel and perpendicular lines.
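If it helps to see the framework rather than read it, here is a small Python sketch of the same idea, with hypothetical function names throughout. The only thing that changes from problem to problem is how you get the slope; once you have a point and a slope, the line is pinned down and there is exactly one thing left to do with it.

```python
from fractions import Fraction


# Three different ways of being handed the same second piece of information.
def slope_from_second_point(point, other):
    """Slope = rise / run, measured from the first point to the second."""
    (x1, y1), (x2, y2) = point, other
    return Fraction(y2 - y1) / Fraction(x2 - x1)


def slope_from_parallel(other_slope):
    """Parallel lines have exactly the same slope."""
    return Fraction(other_slope)


def slope_from_perpendicular(other_slope):
    """Perpendicular lines have negative reciprocal slopes."""
    return -1 / Fraction(other_slope)


# One way of using a point and a slope, no matter where the slope came from.
def y_on_line(point, slope, x):
    """Start at the pinned point and rise by slope * run."""
    x1, y1 = point
    return y1 + slope * (x - x1)


pin = (1, 2)
slope = slope_from_second_point(pin, (3, 6))
print(y_on_line(pin, slope, 0))
```

Asking for the height at x = 0, as in the last line, is just asking where the line crosses the vertical axis.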

If you actually read all that, bless your heart.  Brevity, I ain’t got it.  I could go on to talk about how we could pick a special spot, the place where the line crosses the vertical (y) axis as our point that we’ll always use, but I think the dead horse, she is beaten. 

To return from our super-long and in-depth example, I feel like having a discussion about that framework is crucial – I don’t know that any of my kids really understood that the difference between slope-intercept and point-slope form of a line is just that point-slope is a generalization to a fixed point that doesn’t have to be on the y-axis.  They couldn’t conceive of it as a generalization, because they didn’t know what made us pick that specific point in the first place!  It was as if God came down and said, “Thou shalt use the y-intercept”, not like we discussed it and talked about why it made things simple.
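For the record, the generalization is a two-line calculation. Here is a sketch in the same notation as the two forms above, where the only choice being made is that the fixed point is the y-intercept (0, b):

```latex
\begin{align*}
y - y_1 &= m\,(x - x_1) && \text{point-slope form through } (x_1, y_1) \\
y - b   &= m\,(x - 0)   && \text{choose the point } (x_1, y_1) = (0, b) \\
y       &= m\,x + b     && \text{slope-intercept form}
\end{align*}
```

Picking the y-intercept makes the x1 term vanish, which is the whole reason that point gets special treatment.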

So, now let me see if I had a point in my longwinded-ness here…. ah yes, found it, the balance between generality and specificity in math teaching.  I distinguish between generality as an overarching conceptual framework, to give you a roadmap when you’re learning something new so you can figure out how it fits with other stuff you know, and generality as in an algorithm that allows you to handle any arbitrary set of inputs correctly.

I think the right order to teach those things is something like this: start with a discussion of the framework, so you get a preview of what we’re going to do and why.  Then do all the specific cases and drill the hell out of them, constantly referring back to where we are in our framework at every step.  Then, at the end, make your students write the algorithm, so you can expose any remaining flaws or gaps in their thinking with devious, pathological algorithm-breaking test cases.  Discuss.
