This is an exercise about Planmaking and Surprise-Anticipation. It takes about 2-3 hours. It's a small, simplified exercise, but I think it's a useful building block.

Humans often solve complex problems via iteration and empiricism. Usually, trying to figure everything out from first principles without experimenting is a bad idea. You can spend loads of time thinking, and then you go outside and interact with reality for 5 minutes and realize all that thinking was pointed in the wrong direction.

But some important problems have poor feedback loops, such that iteration/empiricism don't work very well. Experimentation might take a really long time, the results might be noisy, or you might just really need to get something right on the first try.

Often, when making a plan in a confusing domain, it's enough to just ask yourself "how do I expect this plan to turn out?" to get you to notice ways the plan is likely to fail. Then you can fix those things. This is often faster than doing the entire plan, and watching it fail, and then doing it all over again.

Side note: a particular worry I have is that a lot of people entering the AI Alignment space don't feel like it's tractable to tackle more theoretical research directions, and end up gravitating to interpretability or evals because those at least have a feedback loop. One thing I think this exercise is "for" is laying some building blocks for "how to think about a situation where your feedback loop is terrible, and eke as many bits out of it as you can to help you focus your strategy."

I don't know whether this will successfully transfer to the domains I care about, but that's one thing the exercise is aiming at.

This exercise uses the Baba is You videogame to teach a combination of rationality skills (which I suspect weave together into something greater than the sum of their parts):

  • Planmaking
  • Calibration
  • Inner Sim / Internal Surprise-o-meter
  • Patience

Those skills weave together into something similar to Murphyjitsu, but with a somewhat different flavor. The exercise is intended to build upon Exercise: Meta-strategy [TODO], and is a stepping stone building towards Generating 10x Plans.

I've tested this on ~6 people, including myself. So, this is still experimental, but I feel good enough about it to ship it publicly for now. Let me know if you try it. You can post your results in the comments (please spoiler-block them).

Format:

  1. You’ll be given a puzzle video game level, which you haven’t played before.
     
  2. Instead of fiddling around, playing with the game the way you might normally do… you will just look at the screen, and make a complete plan for solving a given level, before you begin to move your character around. 
     
  3. Write down that plan as a series of steps.
     
  4. Before you execute your plan, for each step in the plan, consider all the ways that you might be surprised when you execute that step.
     
  5. Loop through all of your “possible surprises”, and consider if any of them actually seem more likely than your mainline plan. Consider updating your plan. If there is a step that might go multiple ways, try making multiple guesses and plans.
     
  6. Are you confident in your plan? If so, execute it.
     
  7. Did the plan go the way you expected? Spend 10 minutes reflecting on what you learned, and what you could have done differently.

I recommend doing the exercise using this google doc worksheet.

(I think this exercise would work well as a meetup where one meetup-organizer has read the post thoroughly, has already done the exercise once, and can help other people who are confused or stuck)

Step 0: Read this blogpost

This is a fairly involved exercise. I've broken it down into steps so you only have to think about one thing at a time, but it's useful to first read through the whole thing so you can see how all the pieces fit together.

Step 1: Download the Game, Pick a Level

We're practicing planmaking in a game called Baba is You. Baba is You is a really great puzzle game, but we're adding some additional wrinkles.

My favorite version of this exercise is one where you've never played Baba is You before, and part of the task is figuring out the core gameplay mechanics without even interacting with the game. (If you haven't heard anything about the game before, I highly recommend not looking anything up first)

If you're a rationalist nerd on LessWrong, you've probably heard about or even played the game. That's fine. Unless you've literally beaten the entire game, this exercise works if you play a level you haven't played before. (I recommend picking a level that introduces at least one new mechanic you haven't seen before)

So, go download the game on Steam, or whatever device you prefer.

If you've never played the game before, I recommend starting with a particular level I made, via:

  1. Click "Play Levels" from the menu
  2. Click "Get new levels"
  3. Click "Use level code"
  4. Enter: "6KQL-TPUB"

Alternately you can play the normal game from the beginning. The first couple levels might be fairly easy. I recommend following all the exercise steps for the first 2 levels, but it's okay to do them a bit more quickly, and save your "try very hard to get it right the first time" for a later level.

If you're just starting the game, you have to do the first levels 0 and 1, but then I recommend skipping to levels 3, 4 and 7. (Levels 2, 5 and 6 each have elements that make them somewhat worse exercises here IMO)

Remember to pull up the google doc worksheet.

Step 2: Observation, Orientation and Livelogging

Once you've gotten to a level that seems like a good fit, stare at the level for a bit and soak in the details. It will look something like this:

(Image: a screenshot from Baba is You, showing simplistic graphics in the form of colorful block puzzles.)

I recommend writing down all the details that seem relevant.

More generally: I recommend livelogging. Jot down notes about your thought process as things occur to you. 

After soaking in the level, start thinking through how to solve it.

If you're new to the game, you might ask "How do I solve it? What are the rules? I don't even know what it means to beat a level of this game." Part of the point of this exercise is you do have information about that, even without having played the game. You've played other games before. If you haven't, you've probably interacted with the world and you can make an informed guess about what you need to achieve.

A level might introduce multiple new mechanics you haven't encountered before. For each new element, I suggest coming up with some guesses as to how that element will behave, or what happens when you try to interact with it.

Step 3: Write out your plan

Okay, now start writing down your best guess plan for solving the level.

This can include things like “I think if I do X, it’ll most likely cause Y to happen, but might cause Z to happen instead. If Y happens, I’ll do Plan A, if Z happens, I’ll do Plan B.”

(You don’t need to go crazy branching out on every possible option for this exercise, just pick the most likely branch, and maybe 1-2 backup plans)
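The branching described above can be turned into a rough overall number. This is a minimal sketch, not something the exercise prescribes: the branch names and every probability below are made up for illustration, and `overall_success` is a hypothetical helper. It weights each backup plan by how likely you think its branch is.

```python
# Hypothetical numbers for illustration only; none of these come from the post.
# Each branch maps to a pair:
#   (probability the uncertain step turns out this way,
#    probability the follow-up plan then works).
branches = {
    "Y happens -> Plan A": (0.6, 0.8),
    "Z happens -> Plan B": (0.3, 0.5),
    "something else -> no plan": (0.1, 0.0),
}


def overall_success(branches):
    """Sum P(outcome) * P(plan works | outcome) over all branches."""
    total_outcome_prob = sum(p for p, _ in branches.values())
    if abs(total_outcome_prob - 1.0) > 1e-9:
        raise ValueError("outcome probabilities should sum to 1")
    return sum(p * q for p, q in branches.values())


print(round(overall_success(branches), 2))  # prints 0.63
```

Keeping an explicit "something else" branch is a cheap way to record how much room you are leaving for an outcome you did not think of.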

Step 3A: If you get stuck, brainstorm strategies

If you feel like you have no idea what to do, stop and go meta. Try spending 10 minutes brainstorming strategies that might help.

Examples of strategies might include:

  • Try to break the problem down into subgoals
  • Notice that you're tired, and get a snack
  • List as many dumb ideas as you can

I recommend setting a literal 10 minute timer, and trying to come up with at least 10 strategies.

Step 4: Predict Surprises

For each step in your plan, write down how likely that step is to go the way you expect (e.g. 10% likely, 50% likely, 90% likely).

For each step, ask "Does this seem like an area I expect to get surprised? What other things might happen instead of my main prediction?"

You might notice that you actually think there's a second outcome that feels more likely than your original plan. If so, maybe update your plan + predictions.

When you are done, write down your overall probability that your plan will work.
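One optional sanity check on that overall number is to multiply your per-step probabilities and compare the result with your gut estimate. The sketch below assumes the steps are independent, which is an assumption of this example and often false (steps that hinge on the same guess about a rule tend to succeed or fail together), so treat the product as a rough cross-check rather than the answer. The step confidences are hypothetical.

```python
import math

# Hypothetical per-step confidences, as fractions (0.9 means "90% likely").
step_confidence = [0.9, 0.8, 0.9, 0.7]


def chained_estimate(probs):
    """Multiply per-step probabilities, assuming the steps are independent."""
    return math.prod(probs)


estimate = chained_estimate(step_confidence)
# Index of the step you are least sure about (0-based).
weakest = min(range(len(step_confidence)), key=step_confidence.__getitem__)

print(f"chained estimate: {estimate:.2f}")    # chained estimate: 0.45
print(f"least confident step: {weakest + 1}")  # least confident step: 4
```

If the product comes out far below your gut number, that gap is itself a place to go looking for surprises, and the least confident step is a natural place to start.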

Step 5: Execute the Plan

Once you have a plan written down, and you've thought about how likely you are to be surprised... execute the plan!

...

What happened? Did you get it right on the first try? If so, yay! If not, think more, and see if you can come up with a new plan.

Did you get any “surprise surprises” (as opposed to “surprises in a place you kinda expected to get surprised”)?

Step 5B: Try earnestly 3 times, then, idk screw around

If your plan didn't work, go back to the drawing board and try again. You've lost some imaginary points from Raemon, but, you can still try again to make a followup plan. If your assumptions were wrong, re-examine them and think about what else might work.

Each time you try/fail, I recommend setting another 10 minute timer for "meta-strategy brainstorming."

If you've tried this earnestly 3 times, after the 3rd time, I think it's fine to switch to just trying to solve the level however you want (i.e. moving your character around the screen, experimenting).

Step 6: Debrief / Meta Reflections

Whether you got the answer right or wrong, now you stop to ask "how did I do? Could I have done better?"

Set a 10 minute timer, and brainstorm potential takeaways. Some possible prompts:

  • What were some useful thoughts that you thought?
  • What were the key pivot points in your thinking?
  • How could you have gotten to the right concept faster?
  • Summarize the key concept of the solution.
    • Is there an abstract generalization of that concept?
    • How does that generalization apply to other problems?
  • What thinking approach brought me to the right answer?
    • Does that approach generalize?
  • What were some useful thoughts that I thought?
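If you wrote down per-step probabilities during Step 4, one optional way to make the calibration part of the debrief concrete is to score them against what actually happened. The sketch below uses the Brier score, which is a standard scoring rule and not something the exercise itself asks for. The logged predictions are hypothetical.

```python
# Hypothetical log of (stated probability, did the step go as expected?).
predictions = [
    (0.9, True),
    (0.8, True),
    (0.9, False),
    (0.7, True),
]


def brier_score(log):
    """Mean squared gap between stated probability and outcome (0 is best)."""
    return sum((p - (1.0 if hit else 0.0)) ** 2 for p, hit in log) / len(log)


print(f"Brier score: {brier_score(predictions):.4f}")
```

For reference, answering 50% on every step always scores 0.25, so a score above that suggests the stated confidences did worse than not committing at all. A single level gives very few data points, so the number is more useful tracked across several levels.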

Followup

I think it's worth doing this exercise a couple times until you've gotten the hang of it. You can do it on different puzzle games.

The next step after that is "try making some actual real life plans for goals that feel somewhat hard/confusing", and reflect on which parts (if any) seem to transfer. I'm working on some explicit exercises for this, but so far it seems to depend a lot on an individual's goals. So far, this process seems to take a few days rather than a few hours.

15 comments

If you've already played Baba Is You and are looking for other options: Humble Bundle has a puzzle bundle going for the next 5 days. It's $10 for 7 games, of which The Witness is the lowest rated at 85% positive, and the rest range from 93-99%

Baba Is You is an unusual puzzle game in a way that seems relevant here.

One way of classifying puzzle games might be on a continuum from logic-based to exploration-based (or, if you like, between logical uncertainty and environmental uncertainty).

At the first extreme you have stuff like Sudoku, or logic grids, or three gods named True, False, and Random, or blue eyes.  In these puzzles, you are given all necessary information up-front, and you should (if the puzzle is well-constructed) be able to verify the solution entirely on your own, without requiring an external authority to confirm it.

At the opposite extreme, there's 20 questions or mastermind or Guess Who?, where the entire point is that necessary information is being withheld and you need to interact with the puzzle to expose it.  Knowing all the information is the solution; there would be no point without the concealment.

Baba Is You is pretty close to the first extreme, but not all the way there.  It does ask you to learn the basic rules of the game by interacting with it, and it does gradually introduce new rules, but most of the difficulty comes from logical uncertainty.  Some puzzles do not introduce new rules at all, or only introduce new rules in the sense of exploring the edge cases of a previously-established rule.  It also makes the entire puzzle visible at once, so once you understand the rules it becomes a pure logic puzzle.

This exercise relies on the possibility of being empirically surprised, but also on being able to make fairly detailed plans in spite of that possibility.  This seems like it requires (or at least heavily benefits from) being at a pretty narrow area within the logic <=> exploration continuum, which Baba Is You happens to be exactly situated at.

Most puzzle video games lean more heavily on exploration than that.  You mentioned The Witness, which I would classify as primarily exploration-based:  each series of puzzles centers around a secret rule that you need to infer through experimentation, and most puzzles are easy once you have figured out the secret rule.  (The game Understand, mentioned by another commenter, has the same premise.)

Another puzzle game I recognize from the bundle you linked is Superliminal, which has the premise that you're inside a dream and solve puzzles using dream-logic.  I'd also consider that heavily exploration-based.

The Talos Principle is much closer to Baba Is You's point on this continuum, with a relatively small number of rules and an emphasis on applying them creatively, although in The Talos Principle you can't always see the entire puzzle before you begin solving it, and I'd say the puzzle components' appearances are less suggestive of their functions than the adjectives in Baba Is You, probably making it significantly harder to guess how they'll behave without doing some experimentation.

Patrick's Parabox is similar to Baba Is You in that they are both Sokoban games, though I didn't play too far in Patrick's Parabox because the puzzles felt more workaday and less mind-bendy and I just got bored.  (Though it's highly rated, so presumably most people didn't.)

Quick note that I have another exercise in the works about the beginning of Patrick's Parabox, but after having investigated more I think the rest of the game doesn't hold up for my purposes.

I like your breakdown of why Baba is You fits exactly here. 

I do think most puzzle games lend themselves to some kind of rationality exercise, but not necessarily this one.

Or Understand for 4 EUR, which has a highly upvoted LessWrong post recommending it.

Fwiw I tried out Understand and was underwhelmed. (Cool concept but it wasn’t actually better as an exercise than other good puzzle games)

What about Outer Wilds? It's not strictly a puzzle game, but I think it might go well with this exercise. Also, what games would you recommend for this to someone who has already played every available level in Baba Is You?

I think I’d end up constructing a new exercise for Outer Wilds but could see doing something with it. (I have started but not completed Outer Wilds)

I think this exercise works best for games where puzzles come in relatively discrete chunks where you can see most of the puzzle at once.

Unless you've literally beaten the entire game, this exercise works if you play a level you haven't played before.

I haven't beaten every level in the game, but I don't have access to any levels that I haven't played before, because the reason I stopped playing was that I had already tried and failed every remaining available level.

(Though I suppose I could cheat and look up the solution for where I got stuck...)

Summarize the key concept of the solution

This might not apply to the early levels you've focused on in your examples, but an observation I made while playing the more advanced levels of this game was that often there was not just one key concept.

In most puzzle games that I've played, I find I can quickly get a sort of feel for the general shape of the solution:  I start here, I have to end up there, therefore there must be a step in the middle that bridges those two.  This often narrows the possible search space quite a lot, because the missing link has to touch the parts I know about at both the beginning and the end.

Lots of puzzles in Baba Is You have two significant steps in the middle.  And this is a huge jump in difficulty, because it means there's an intermediate state between those two steps and I have no idea what that intermediate state looks like so I can't use it to infer the shape of those steps.  Each of the missing steps has only 1 constraint instead of 2.

I haven't beaten every level in the game, but I don't have access to any levels that I haven't played before, because the reason I stopped playing was that I had already tried and failed every remaining available level.

(Though I suppose I could cheat and look up the solution for where I got stuck...)

I'm not sure I understand. If you have levels left over that you haven't beaten because they were too hard, I think this is still a fine exercise (the fact that "it's hard" isn't a crux for me). I do think it's doable, and I think the constraints of the exercise probably help about as much as they hinder.

(You might not succeed at one-shotting it within three tries, but I think you're more likely to go on to beat the level afterwards, and still learn something from it)

On a literal level, I can't play "a level I haven't played before", which is what the instructions call for.

On a practical level, I've already spent multiple hours beating my head against this wall, and when I stopped I had no remotely promising ideas for how to make further progress.  (And most of that time was spent staring at the puzzle and thinking hard without interacting with it, so it was already kind of similar to this exercise.)

Admittedly, this was years ago, so maybe it's time to revisit the puzzle anyway.

I will note that a level editor for this game seems to exist, so in theory you could craft custom levels for this exercise.  Though insofar as the point is being potentially-surprised by the rules, maybe that doesn't help if you aren't inventing new rules as well.

One note: custom levels now exist and you can go browse them directly even if you've beaten the game.

I do agree that this exercise, as-worded, probably nudges towards a flavor of "explicit thinking", which I don't think is even necessarily the best strategy for Baba is You overall.

I don't think this exercise necessarily says "think explicitly" – the section on metacognitive brainstorming is meant to include fuzzy/experiential/"go-take-a-shower"/"meditate" style options.

To clarify slightly more: I think it's fine to do a hard level you haven't beaten before, even if you've played it.

Nod, these instructions were generated for people doing early levels but I agree about how later levels feel.

(Personally, I would be rather intimidated by such a long list of questions at Step 6. I would be thinking something like, question one: why do I think it wasn't just sheer dumb (lack of) luck? And question two, have I had fun?)

Nod, the prompts are meant to be suggestions and you can come up with your own prompts.

I am intending this exercise primarily for people who are interested in answering those sorts of questions though. (But, I also think the exercise is fun, and worth trying/evaluating on that basis if it feels interesting to you)