Turbocharging

Summary:

Sometimes, reinforcement learning goes wrong: how can this be prevented?
Example: math education
- One student simply “learns to follow along”, while the other “learns to predict what comes next”.
- The second student may gain the ability to solve math problems on their own, while the first plausibly won’t.
Turbocharging, general notes:
- Idea: You get better at the things you practice, and it pays off to think about what, mechanistically, you want to learn.
- You won’t just learn “what you intend”:
  - If you intend to learn to disarm an attacker but hand the weapon back to your partner after every repetition, then handing the weapon back is part of what you learn.
- Example of math student revisited: are they…
  - Actively thinking about the symbols?
  - Calling up related material from memory?
  - Generating hypotheses (instead of falling prey to hindsight bias)?
  - Thinking about the underlying structure of the problem?
- ← These questions determine what’s actually practiced.
The Turbocharging Algorithm:
- Select a skill to be acquired/improved
- Select a practice method (to be evaluated or to be strengthened/developed)
- Evaluate the resemblance between method and skill:
  - Do the “practice triggers” resemble the real-world triggers, or at least plausibly generalize to them?
  - Do the “practice actions” resemble the real-world actions, or at least plausibly generalize to them?
- Possibly adjust the practice method in response to the previous answers
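The steps above can be sketched as a small checklist in code. This is only an illustrative sketch, not something from the article: the `PracticeMethod` structure, its field names, and the `evaluate` function are hypothetical, and the two yes/no judgments still have to be made by the learner.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PracticeMethod:
    """A candidate practice method for a target skill (hypothetical structure)."""
    skill: str
    description: str
    # The learner's own judgment: does the practice trigger resemble the
    # real-world trigger, or at least plausibly generalize to it?
    trigger_matches: bool
    # Same question for the practiced action.
    action_matches: bool


def evaluate(method: PracticeMethod) -> List[str]:
    """Return the mismatches that suggest adjusting the practice method."""
    issues = []
    if not method.trigger_matches:
        issues.append("practice trigger does not resemble the real-world trigger")
    if not method.action_matches:
        issues.append("practice action does not resemble the real-world action")
    return issues


# Example from the notes above: a disarming drill in which the weapon is
# handed back after each repetition. The True/False values are assumptions
# made for illustration.
drill = PracticeMethod(
    skill="disarming an attacker",
    description="partner drill, weapon handed back after each repetition",
    trigger_matches=True,
    action_matches=False,
)
issues = evaluate(drill)
for issue in issues:
    print(f"Adjust the method: {issue}")
```

An empty list means the method passes both resemblance checks; any entry is a prompt to adjust the method before practicing further.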
Further Notes
- Declarative and Procedural Knowledge require different types of learning
- Turbocharging is for procedural learning, which is more of what applied rationality is about
- The article lists many counterexamples to the claim that turbocharging is “the one and only” way to gain procedural knowledge.

This actually sounds quite different from "deliberate practice". This says that to get better at playing piano, you should play the piano. Deliberate practice says that just playing the piano isn't enough and maybe even mostly a waste of time (for this specific goal). It feels to me that deliberate practice would win out, so am I misunderstanding this? Is it just a framework for the concept and the implementation would look basically like deliberate practice?

DaveEtCircenses 3y 61

I think the Law of Equal but Opposite Advice is extremely relevant here, in that there are two common failure modes for practicing.

The first of these is "not practicing what you actually do", and turbocharging helps with that.

The second of these is "practicing what you actually do, but inefficiently", and deliberate practice helps with that.

Of course, trying too hard to avoid the first failure mode yields the second (e.g. playing a whole piano piece through repeatedly), and trying too hard to avoid the second failure mode yields the first (e.g. memorising Anki flashcards for a language, but being unable to speak it since you didn't practice talking).

Towards_Keeperhood 8mo 10

I don't quite like "Turbocharging" as a name because it suggests too little about the content. A better name might be, e.g., "the directness principle".

(IIRC Directness is also one of the ultralearning principles from Scott Young and I guess it describes the same thing, but I don't remember.)

[anonymous] 2y 10

"The student employing version one of the learning strategy will gain proficiency at watching information appear on a board, copying that information into a notebook, and coming up with post-hoc confirmations or justifications for particular problem-solving strategies that have already provided an answer."

Ouch, I wasn't prepared for direct attacks, but thank you very much for explaining this :). I now know why some of the later strategies of my more experienced self, such as "if I were at this step, how would I figure this out from scratch?" and "what will the teacher teach today, based on previous knowledge?", worked better, or at least felt more engaging from my POV (I love maths, and it was normal for me to try to find ways to engage more).

But this tells me I should apply Rationality A-Z techniques to learning more often, given that this is just anticipation controllers, fake causality, replacing the symbol with the referent, and positive bias.

