When I was a freshman in high school, I was a mediocre math student: I earned a D in second semester geometry and had to repeat the course. By the time I was a senior in high school, I was one of the strongest few math students in my class of ~600 students at an academic magnet high school. I went on to earn a PhD in math. Most people wouldn't have guessed that I could have improved so much, and the shift that occurred was very surreal to me. It’s all the more striking in that the bulk of the shift occurred in a single year. I thought I’d share what strategies facilitated the change.

I became motivated to learn more

I took a course in chemistry my sophomore year, and loved it so much that I thought that I would pursue a career in the physical sciences. I knew that understanding math is essential for a career in the physical sciences, and so I became determined to learn it well. I immersed myself in math: at the start of my junior year I started learning calculus on my own. I didn’t have the “official” prerequisites for calculus; for example, I didn’t know trigonometry. But I didn’t need to learn trigonometry to get started: I just skipped over the parts of calculus books involving trigonometric functions. Because I was behind a semester, I didn’t have the “official” prerequisite for analytic geometry during my junior year, but I gained permission to sit in on a course (not for official academic credit) while taking trigonometry at the same time. I also took a course in honors physics that used a lot of algebra and gave some hints of the relationship between physics and calculus.

I learned these subjects better simultaneously than I would have had I learned them sequentially. A lot of times students don’t spend enough time learning math per day to imprint the material in their long-term memories. They end up forgetting the techniques that they learn in short order, and have to relearn them repeatedly as a result. Learning them thoroughly the first time around would save them a lot of time later on. Because there was substantial overlap in the algebraic techniques utilized in the different subjects I was studying, my exposure to them per day was higher, so that when I learned them, they stuck in my long-term memory.

I learned from multiple expositions

This is related to the above point, but is worth highlighting on its own: I read textbooks on the subjects that I was studying aside from the assigned textbooks. Often a given textbook won’t explain all of the topics as well as possible, and when one has difficulty understanding a given textbook’s exposition of a topic, one can find a better one if one consults other references.

I learned basic techniques in the context of interesting problems

I distinctly remember hearing about how it was possible to find the graph of a rotated conic section from its defining equation. I found it amazing that it was possible to do this. Similarly, I found some of the applications of calculus to be amazing. This amazement motivated me to learn how to implement the various techniques needed, and they became more memorable when placed in the context of larger problems.

I found a friend who was also learning math in a serious way

It was really helpful to have someone who was both deeply involved and responsive, who I could consult when I got stuck, and with whom I could work through problems. This was helpful both from a motivational point of view (learning with someone else can be more fun than learning in isolation) and also from the point of view of having easier access to knowledge.


Good for you. It's quite satisfying to discover one's hidden aptitude for what is widely considered (in the US, at least) a difficult subject. I had a similar, if not as dramatic, experience with Physics. Unfortunately, your experience does not generalize to those who are not naturals at math (which you only discover by trying hard, of course). I have observed several people who were just as motivated to learn math more deeply, but gave up after realizing that math just doesn't make as much sense to them as it does to others, and settled for a B-grade knowledge. The same story applies to nearly every subject area: music, language, biology, art, programming, electronics, HVAC, you name it... Hence my standard advice to try as many different things as possible before picking one or two to spend the proverbial 10k hours on.


When something very similar happened to me (failing Algebra in 9th grade, aptitude suddenly surfacing in 11th), I also thought motivation was really important, but I also noticed my brain working differently. Algebra went from being semi-confused symbol manipulation to understanding what a variable was actually about.

In a psychology course I was taking at the same time, I learned about Piaget's "formal operational stage" and that's what I attributed it to. I think it happens when you're 17 or 18. (Compare also with this data point.) So I agree, it felt like it was a physical difference in development. What do you think of this as an explanatory hypothesis? (Any way to tell them apart?)

Good point. I think that there was in fact a physiological shift. But that doesn't account for my dramatically improved performance relative to my classmates.

Criticism of Piaget's theory

What is “developmentally appropriate practice”? For many teachers, I think the definition is that school activities should be matched to children’s abilities—they should be neither too difficult nor too easy, given the child’s current state of development. The idea is that children’s thinking goes through stages, and each stage is characterized by a particular way of understanding the world. So if teachers know and understand that sequence, they can plan their lessons in accordance with how their students think.

In this column I will argue that this notion of developmentally appropriate practice is not a good guide for instruction. In order for it to be applicable in the classroom, two assumptions would have to be true. One is that a child’s cognitive development occurs in discrete stages; that is, children’s thinking is relatively stable, but then undergoes a seismic shift, whereupon it stabilizes again until the next large-scale change. The second assumption that would have to be true is that the effects of the child’s current state of cognitive development are pervasive; that is, that the developmental state affects all tasks consistently.

Data from the last 20 years show that neither assumption is true. Development looks more continuous than stage-like, and the way children perform cognitive tasks is quite variable. A child will not only perform different tasks in different ways, he may do the same task in two different ways on successive days! [...]

The problem is not simply that Piaget didn’t get it quite right. The problem is that cognitive development does not seem amenable to a simple descriptive set of principles that teachers can use to guide their instruction. Far from proceeding in discrete stages with pervasive effects, cognitive development appears to be quite variable—depending on the child, the task, even the day (since children may solve a problem correctly one day and incorrectly the next). [...]

These experiments tell us that there is not a rapid shift whereby children acquire the ability to understand that other people have their own perspectives on the world. The age at which children show comprehension of this concept depends on the details of what they are asked to understand and how they are asked to show that they understand it. This pattern of task dependence holds for other hallmarks of Piagetian stages as well. The implication is that stages, if they exist, are not pervasive (i.e., they do not broadly affect children’s cognition). The particulars of the task matter. [...]

Until about 40 years ago, most thought of children’s minds as a set of machinery. As children developed, parts of the machine changed, or parts were discarded and replaced by new parts. The machinery didn’t work well during these transitions, but the changes happened quickly. Today, researchers more often think that there are several sets of machinery. Children have multiple cognitive processes and modes of thought that coexist, and any one might be recruited to solve a problem. Those sets of cognitive machinery undergo change as children develop, but in addition, the probability of using one set of machinery or the other also changes as children develop.

This conclusion doesn’t mean that there is no consistency across children in their thought, or in the way that it changes with development. But the consistency is only really evident at a broader scale of measurement. A geographic metaphor is helpful in understanding this distinction (Siegler, DeLoache, and Eisenberg 2003). If one begins a trip in Virginia and drives west, there are very real differences in terrain that can be usefully described. The East Coast is wet, green, and moderately hilly. The Midwest is less wet and flatter. The mountain states are mountainous and green, and the West is mostly flat and desert-like. There is no abrupt transition from one region to another and the characterization is only a rough one—if I tell you that I’m on the East Coast and you say, “Oh, it must be green, wet, and hilly where you are,” you may well be wrong. But the rough characterization is not meaningless. Similarly, all children take the same developmental “trip.” They may travel at different paces and take different paths. But at a broad level of description, there is similarity in the trip that each takes.

Obviously, the description of multiple sets of cognitive machinery rather than a single set complicates the job of the developmental psychologist who seeks to describe how children’s minds work and how they change as children grow. Worse, it negates the possibility that teachers can use developmental psychology in the way we first envisioned. There is a developmental sequence (if not stages) from birth through adolescence, but pinpointing where a particular child is in that sequence and tuning your instruction to that child’s cognitive capabilities is not realistic.

What I summarize from the above is that educators have decided that Piaget's theory is not helpful for deciding 'developmentally appropriate practice'. Perhaps because the transitions from one stage to another are fuzzy and overlapping, or because students of a particular age group are not necessarily in step. Furthermore, understanding of a concept is 'multi-dimensional' and there are many ways to approach it, and many ways for a child to think about it, rather than a unique pathway, so that a student might seem more or less advanced depending on how you ask the question.

I think the real nail in the coffin would be if a young child does not understand a particular concept (say, volume conservation) and it is found that you can teach them this concept before they are supposed to be developmentally ready. This is because, as I understand it, the crux of Piaget's theory is that certain concepts become possible only after a corresponding physical development.

I think the real nail in the coffin would be if a young child does not understand a particular concept (say, volume conservation) and it is found that you can teach them this concept before they are supposed to be developmentally ready.

The article doesn't discuss conservation of volume in detail, but it talks about an experiment that's said to be "conceptually similar". And while it's hard to say from the quote, it seems to imply that when children are given feedback on the similar problem, their performance improves (I've bolded that part):

The child is shown two rows of objects, say, pennies. Each row has the same number of pennies and they are aligned, one for one. The child will agree that the rows are the same. Then the experimenter changes one row by pushing the pennies farther apart. Now, the experimenter asks, which row has more? (Pennies might also be added to or subtracted from a line.) Younger children will say that the longer line has more pennies.

When Piaget (1952) developed this task he argued that children go through three stages on their way to successfully solving this problem. Initially they cannot process both the length of the rows and the density of coins in the rows, so they focus on just one of these, usually saying that the longer row has more. The next stage is brief, and is characterized by variable performance: children sometimes use row length and sometimes row density to make their judgment, sometimes they use both but cannot say why they did so, and sometimes they simply say that they are unsure. In the third stage, children have grasped the relevant concepts and consistently perform correctly.

Robert Siegler (1995) showed that children’s performance on this task doesn’t develop that way. Ninety-seven 4- to 6-year-olds who initially could not solve the problem were studied, with each child performing variants of the problem a total of 96 times over eight sessions. After each problem, children were asked to explain why they gave the answer they did, so there was ample opportunity to examine the consistency of the children’s performance and their reasoning. The experimenter found a good deal of inconsistency. Children used a variety of explanations (sophisticated and naïve) throughout, even though they became more accurate with experience (the experimenter provided accuracy feedback, which is a big help to learning). It was not the case that once the child “got it” he consistently used the correct strategy. If the child gave a good explanation for a problem, there was only a 43 percent chance of his advancing the same explanation when later confronted with the identical problem.

I agree that while not exactly 'volume conservation', this addresses the exact same skill.

If the child gave a good explanation for a problem, there was only a 43 percent chance of his advancing the same explanation when later confronted with the identical problem.

Would you interpret this as meaning the children had not acquired the concept, after all? It seems that if the child actually truly understands the concept that moving things around doesn't change their number, then they wouldn't be inconsistent. (Or is the study demonstrating what I found unintuitive, that children can grasp and then forget a concept?)

I interpreted it as indicating that there are multiple ways of thinking about the problem, some of which produce the right answer and some of which produce the wrong answer. There's an element of chance involved in which one the child happens to employ, and children who are farther along in their development are more likely but not certain to pick the correct one on any single trial.

"Acquiring a concept" is a somewhat ambiguous expression - suppose there's some subsystem or module in the child's brain which has learned to apply the right logic and hits upon the right answer each time, but that subsystem is only activated and applied to the task part of the time, and on other occasions other subsystems are applied instead. Maybe the brain has learned that this system/mode of thought is the right way to think about the issue in some situations, but it hasn't yet reliably learned to distinguish what those situations are.

Not sure how analogous this really is, but I'm reminded of the fact that IBM's Watson used a wide variety of algorithms for scoring possible answer candidates, and then used a metalearning algorithm for figuring out the algorithms whose outputs were the most predictive of the correct answer in different situations (i.e. doing model combination and adjustment). So it, too, had some algorithms which produced the right answer, but it didn't originally know which ones they were and when they should be applied.

That kind of an explanation would still be compatible with a sudden boost in math talent, if things suddenly clicked and the learner came to more reliably apply the correct ways of thinking. But I'm not entirely sure if it's necessarily a developmental thing, as opposed to just being a math-related skill that was acquired by practice. Jonah wrote:

Because there was substantial overlap in the algebraic techniques utilized in the different subjects I was studying, my exposure to them per day was higher, so that when I learned them, they stuck in my long-term memory.

And if there is a specific "recognize the situations that can be thought of in algebraic terms and where algebraic reasoning is appropriate" skill, for example, then simultaneously studying multiple different subjects employing the same algebraic techniques in different contexts sounds just like the kind of thing that would be good practice for it.

I appreciate your responses, thanks. My perspective on understanding a concept was a bit different -- once a concept is owned, I thought, you apply it everywhere and are confused and startled when it doesn't apply. But especially in considering this example I see your point about the difficulty in understanding the concept fully and consistently applying it.

Volume conservation is something we learn through experience that is true -- it's not logically required, and there are probably some interesting materials that violate it at any level of interpretation. But there is an associated abstract concept -- that number of things might be conserved as you move them around -- that we might measure comprehension of.

There are different levels at which this concept can be understood. It can be understood that it works for discrete objects: this number of things staying the same always works for things like blocks, but not for fluids, which flow together, so the child might initially carve reality in this way. Eventually volume conservation can be applied to something abstract like unit squares of volume, which liquids do satisfy.

Now that I see that the concept isn't logically required (it's a fact about everyday reality we learn through experience) and that there are a couple stages, I'm really skeptical that there is a physical module dedicated to this concept.

So I've updated. I don't believe there are physical/neurological developments associated with particular concepts. (Abstract reasoning ability may increase over time, and may require particular neurological advancements, but these developments would not be tied with understanding particular concepts.)

Seems kind of silly now. Though there was some precedent with some motor development concepts (e.g., movements while learning to walk) being neurologically pre-programmed.

This seems an appropriate place to observe that while watching my children develop from very immature neurological systems (little voluntary control, jerky, spasmodic movements that are cute but characteristic of very young babies) to older babies that could look around and start learning to move themselves, I was amazed by how much didn't seem to be pre-programmed and I wondered how well babies could adapt to different realities (e.g., weightlessness or different physics in simulated realities). Our plasticity in that regard, if my impression is correct, seems amazing. Evolution had no reason to select for that. Unless it is also associated with later plasticity for learning new motor skills, and new mental concepts.

I appreciate your responses, thanks.

I appreciate hearing that you appreciate them. :)

Boaler 1993 is another interesting discussion about the rules that people might use in order to decide what kind of skill or mental strategy might apply to a situation.

It argues that, because school math problems often require a student to ignore a lot of features that would be relevant if they were actually solving a similar problem in real life, they easily end up learning that "school math" is a weird and mysterious form of mathematics in which normal rules don't apply. As a result, while they might become capable of solving "school math" problems, this prevents them from actually applying the learnt knowledge in real life. They learn that school math problems require a mental strategy of school math, and that real-life math problems require an entirely different mental strategy.

Lave [1988] has suggested that the specific context within which a mathematical task is situated is capable of determining not only general performance but choice of mathematical procedure. Taylor [1989] illustrated this effect in a research study which compared students' responses to two questions on fractions: one asking the fraction of a cake that each child would get if it were shared equally between six, and one asking the fraction of a loaf if shared between five. One of the four students in Taylor's case study varied methods in response to the variation of the word, "cake" or "loaf". The cake was regarded by the student as a single entity which could be divided into sixths, whereas the loaf of bread was regarded as something that would always be divided into quite a lot of slices - the student therefore had to think of the bread as cut into a minimum of, say, ten slices with each person getting two-tenths of a loaf. [...]

One difficulty in creating perceptions of reality occurs when students are required to engage partly as though a task were real whilst simultaneously ignoring factors that would be pertinent in the "real life version" of the task. [...] Wiliam [1990] cites a well known investigation which asks students to imagine a city with streets forming a square grid where police can see anyone within 100m of them; each policeman being able to watch 400m of street (see Figure 1.)

Students are required to work out the minimum number of police needed for different-sized grids. This task requires students to enter into a fantasy world in which all policemen see in discrete units of 100m and "for many students, the idea that someone can see 100 metres but not 110 metres is plainly absurd" [Wiliam, 1990; p30]. Students do however become trained and skillful at engaging in the make-believe of school mathematics questions at exactly the "right" level. They believe what they are told within the confines of the task and do not question its distance from reality. This probably contributes to students' dichotomous view of situations as requiring either school mathematics or their own methods. Contexts such as the above, intended to give mathematics a real life dimension, merely perpetuate the mysterious image of school mathematics.

Evidence that students often fail to engage in the "real world" aspects of mathematics problems as intended is provided by the US Third National Assessment of Educational Progress. In a question which asked the number of buses needed to carry 1128 soldiers, each bus holding 36 soldiers, the most frequent response was 31 remainder 12 [Schoenfeld, 1987; p37]. Maier [1991] explains this sort of response by suggesting that such problems have little in common with those faced in life: "they are school problems, coated with a thin veneer of 'real world' associations."
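To make the arithmetic in the bus question concrete, here is a minimal Python sketch. The helper name `buses_needed` is made up for illustration and is not from the quoted paper. It shows where "31 remainder 12" comes from, and why the answer that fits the real situation is one bus more.

```python
def buses_needed(soldiers: int, capacity: int) -> int:
    """Smallest number of buses that seats everyone (hypothetical helper)."""
    full_buses, left_over = divmod(soldiers, capacity)
    # Any leftover soldiers still need a bus of their own, so round up.
    return full_buses + (1 if left_over else 0)


# The "school math" answer: quotient and remainder of the division.
full_buses, left_over = divmod(1128, 36)
print(full_buses, left_over)   # 31 12

# The "real life" answer: 31 full buses plus one more for the last 12 soldiers.
print(buses_needed(1128, 36))  # 32
```

The division itself is done correctly in both cases; the difference lies only in the last step, where the remainder has to be interpreted in context.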

I think 17 or 18 would be considered pretty outlyingly late for the onset of a formal operational stage... but it is supposed to be an ongoing stage of development from something like 11-13 or so onwards, so I guess there could still be some sort of qualitative change around that age.

Piaget's formal operational stage overly simplifies things. It doesn't go the same way for all people. The basic capability for formal operations sets in much earlier. But using it or recognizing the applicability of specific instances is something else. Some people never get algebra, but that doesn't mean they can't do formal operations. I think what is missing is the intuition behind the formal operations. Just doing the formal operations without intuitively understanding why kills motivation. That is the reason DragonBox works so well. You need to train both. I once drew some ASCII art about this: http://c2.com/cgi/wiki?FuzzyAndSymbolicLearning

Yeah, agreed. I think a lot of Piaget's work is considered pretty outdated anyway.

The basic capability for formal operations sets in much earlier.

I think it depends. The Wikipedia page says that the onset is between 11 and 20 years or so.

My aptitude in mathematics was a bit above average when I was 11 years old. Maybe I had already met the criterion for the formal operational stage, despite not doing well in math the first couple years of high school. But something significant happened when I was 17, and it seemed to be a qualitative change in the way I understood mathematics. I also seemed to develop the ability to excel in Algebra (with motivated effort) later than my peers. Perhaps it wasn't a specific stage identified by Piaget, but it felt physical/neurological.

I do think Piaget is considered outdated. He might have gotten some of the details wrong, or it's not the whole story. (For example, I'm skeptical that babies ever lack object permanence.) Nevertheless, Piaget is likely correct that certain concepts develop in stages that are timed with physical development.

It's interesting that you found Piaget's "formal operational stage" so applicable. I remember when I learned about it (also in a psychology course at around the same age) I found the claim that people only develop abstract thought at the age of 12 completely ridiculous. This is probably related to how my own development was very anomalous.

Were you precocious?

I find this post slightly disingenuous. My experience has been that mathematics is heavily g-loaded: it's just not feasible to progress beyond a certain point if you don't have the working memory or information processing capacity or whatever g factor actually is to do so. The main conclusion I draw from the fact that you eventually completed a Ph.D. is that you always had the g for math; given that, what's mysterious isn't how you eventually performed well but why you started out performing poorly.

Nope, not disingenuous :-). Yes, I had unusual mathematical potential, but many of those who do don't realize it, and even those who have average mathematical ability could learn much better.

Based on the Less Wrong survey results, my IQ isn't substantially higher than the average LWer's, but I know a lot more math than the average LWer. Whether or not this is significant is in part a value judgment, but my story is relevant to those who would like to improve their mathematical knowledge and ability.

Thanks for writing this. This puts some of my experience in perspective. When I was in 11th grade, I was doing very poorly in math: I was barely scraping through the exams and my math teacher told my parents that I would not be cut out for college in engineering or physical sciences. But come 12th grade, I was on top of the class without even breaking a sweat; even though the math got much harder and the math teacher was the same. Now, I'm doing a PhD in physics.

I think an important factor in my case was finding friends who were also genuinely curious about math, instead of just wanting to get through the exams.

But I still think there were a lot of hidden variables that governed this transition that I'm still unaware of. For example, friends cannot explain all of it as I had many of the same friends in 12th grade as well. "Increased motivation" is not really an explanation. Learning more deeply (from different sources and in interesting contexts) is significantly causally linked with more motivation.

I would break out a sub-header from the first one: It sounds like you tried to actively ignore official prerequisites and prescribed orderings when you felt it would benefit you. You were actively choosing to risk biting off more than you could chew (which makes plenty of sense: it's pretty safe in an academic context, but it's still more bold than most people).

As shminux says below, your story could have turned out poorly if you'd done the same and found yourself constantly mired in confusion. Your motivation would have worn out quickly. This is why those prescribed orderings, even if they have more baby-steps than some people require, are so popular in pedagogy.


Because there was substantial overlap in the algebraic techniques utilized in the different subjects I was studying, my exposure to them per day was higher, so that when I learned them, they stuck in my long-term memory.

Counterpoint: this paper seems to indicate that this sort of "overlearning" doesn't work:

As shown in Figure 1, overlearning provided noticeable gains at 1 week, but these gains were almost undetectable after 4 weeks.

One thing I've often wondered: it seems like the people who like math the best are often also the people who are really good at it. There don't seem to be many people who are bad at math who like it. I wonder if that's because the way math is taught in school, if you're not one of the top few kids in the class, part of your experience in the class is developing an identity as a person who isn't the best in the world at math and feeling intimidated by those who are really good. Perhaps the fact that you were just auditing classes, or the fact that you were self-studying, allowed you to escape this identity and thus grow to like math.

it seems like the people who like math the best are often also the people who are really good at it.

This doesn't seem specific to math to me. I think it's true of any activity where, if you're bad at it, it's really obvious to you that you're bad at it. Based on a quick mental tally, it seems like activities that people can like while being bad at them (e.g. singing) are activities that don't necessarily have this property.

The relatively quick transition from D to A could also result from changes in brain 'wiring'. Freshman year seems to coincide with puberty and your changed motivation and abilities may(!) stem (partly?) from changes in your brain. You seem to have made the best out of it.

The relatively quick transition from D to A could also result from changes in brain 'wiring'.

Every change in learning is a change in brain wiring. That term doesn't explain anything.

With wiring I didn't mean the 'normal' means of learning and brain plasticity (some of which doesn't involve any rewiring but 'just' changes of weights and creation of proteins; see e.g. memory consolidation). I meant large-scale brain reorganization like that described in the article "Brain changes at puberty 'help to develop intellectual machinery'".