Instrumental Rationality 6: Attractor Theory

by lifelonglearner | 9 min read | 18th Oct 2017 | 5 comments



[Instrumental Rationality Sequence 6/7]

[Attractor Theory is a hybrid model that tries to reconcile the effects of internal and external factors of motivation. It makes the claim that an important additional consideration in decision-making is how the action affects your ability to take future actions.]

The Model:

Attractor Theory is a qualitative model that’s aimed at changing your intuitions about yourself and decision-making. That means it explains the how but not the why. As a brief summary, Attractor Theory basically states that you should consider any action you take as having meta-level effects on changing your local preferences for which actions feel desirable.

That is to say, taking actions changes which actions you’ll take, later down the road.

I’ll first introduce the three parts of the model, then I’ll go over the implications of the model.

1) First, there’s You. Imagine that you’re in a clear hamster ball:


As a human inside this ball, you can kinda roll around by exerting energy. But it’s hard to do so all of the time — you’d likely get tired. Still, if you really wanted to, you could push the ball and move.

We’ll explore this later, but you can basically think of the energy you have left for rolling as a proxy for willpower.

2) Second, there are these Utilons, which just represent stuff you want.


They represent productivity hours, lives saved, HPMOR fan-fictions written, or anything else you care about getting a lot of. As a human in a hamster ball, you are trying to roll around and collect as many Utilons as possible.

3) Third, there are all these Attractors that pull you in.


And, uh, technically, anything could be an Attractor. But that clearly isn’t useful. A more concrete framing is to think of Attractors as actions or situations. For example, reading a book, going on vacation, and doing some pushups are all examples of Attractors.

The bottom line is that something is classified as an Attractor if it changes how you currently feel.

(I know this is still pretty vague, but there are some more examples later, so you might want to just black-box it for now.)

Attractors are like valleys or magnets. The point is that there’s a potential difference, which causes them to pull you, in your little hamster ball, towards them.



That’s the gist of this model. It also has two major components:

1. Attractors Can Change:


Attractors affect one another.

Once you’re being pulled in by one, this actually modifies other Attractors. This usually manifests by changing how strongly other ones are pulling you in. Sometimes, though, this even means that some Attractors will disappear, and new ones may appear.

This basically means that taking actions can affect how you feel about other actions.

For example, the set of things that feel desirable to me after running a marathon (EX: drinking water) may differ greatly from the set of things after I read a book on governmental corruption (EX: starting a socialist revolution).

Upon reflection, this seems fairly obvious. Humans aren’t closed systems—our preferences are always changing with our internal and external states.

Transfer is always happening. Think about how we react when someone says something nasty or as the weather changes. Our emotions leak into our actions in the real world, and real world events affect our emotions.

My point here is that, from a perception-based point-of-view, it feels like our actions change the sorts of things we might want.

Every time we take an action, then, this will, in turn, prime how we view other actions, often in predictable ways. Though we might not know exactly how they’ll change, we can get good, rough ideas from past experience and our imaginations.

We’ll be capitalizing on this interaction later on when we start exploring further consequences of the Attractor Theory model.

2. Direct Path ≠ Optimal Path


As a human, your goal is to navigate this tangle of Utilons and Attractors from your hamster ball, trying to collect Utilons.

Now you could just try to take a direct path to all the nearest Utilons. However, that would also mean exerting a lot of energy to fight the pull of Attractors that pull you in Utilon-sparse directions.

Instead, given that you can’t avoid Attractors (they’re everywhere in the environment!), the best thing to do is to be strategic:

What I mean by that is you want to think about how to “launch” yourself around in the environment. Attractors might pull you in, but you still have some limited control when it comes to choosing which ones to dive into and which ones to pop out of.

You want to choose which Attractors you’re drawn to and selectively choose when to exert energy to move from one to another to maximize your overall trajectory (more on the energy exertion next section).

The Global Optimization view is also a lot more forgiving of taking breaks. Once you take the view that long-term maximization is the goal, you’re less likely to beat yourself up for taking rests.

This is because, in many cases, the break isn’t cutting time away from your “potential work time”, but it’s actually essential to maintaining your ability to even do work in the first place.

For example, taking short breaks is a key component of the Pomodoro Method that ensures you don’t burn out. Likewise, periodic walks or other activities that give your attention time to wander often allow your mind to do deeper thinking.

And of course sleeping is fairly necessary for optimal functionality.



Attractor Theory as a model contains several useful concepts: Meta-Effects, Auxiliary Actions, Starting / Stopping Costs, and Precommitment.


Meta-Effects:

When most people consider actions, I claim that they consider basically two things:

1. The cost of the action.
EX: “How many hours will it take to drive to Los Angeles?”

2. The effects of the action.
EX: “What are the benefits of going to Los Angeles?”

If you’re smart, you might also consider the tradeoffs and opportunity costs, by comparing the action to other choices.

With Attractor Theory, I think you also now consider a third very important property of the actions available to you:

3. The effects of the action on you.
EX: “How will going to Los Angeles change the set of actions that feel yummy to me?”

It’s obvious, for sure, but I think that most people either treat this as only an implicit consideration or just don’t really think about it at all.


Auxiliary Actions:

Attractor Theory really shines when you start seeing your actions in terms of, not just their direct effects, but also their effects on how you can take further actions. It changes your decision algorithm to be something like:

“Choose actions such that their meta-level effects on me by my taking them allow me to take more actions of this type in the future and maximize the number of Utilons I can earn in the long run.”

Phrasing it this way makes it clearer that most things in life are longer-term endeavors that involve trying to optimize globally, rather than locally.

(While it’s arguable that a naive view of maximization should by default take this into account from a consequentialist lens, I think making it explicitly clear, as the above formulation does, is a useful distinction.)

This allows us to better evaluate actions which, by themselves, might not be too useful, but do a good job of reorienting ourselves into a better state of mind.

I think it ends up creating the class of auxiliary actions, actions which are easy to do and also make it easier to take other actions. You can sort of think of them as stepping stones, which bridge the state between where you are and where you want to end up.

For example, spending a few minutes outside to get some air might not be directly useful, but it’ll likely help clear my mind, which has good benefits down the line, in how I’m able to do work in the immediate future.

Other potential auxiliary actions for you might include drinking water, stretching, doodling, meditating, or going for a short walk.
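To make the auxiliary-action idea a bit more concrete, here is a minimal toy sketch in Python. Everything in it is a made-up assumption for illustration (the `work` and `walk` actions, the energy scale, the payoff numbers); it is not part of the model itself, just one way to see why an action with no direct payoff can still win in the long run.

```python
from itertools import product

# Hypothetical toy world: a single "energy" number stands in for your state.
MAX_ENERGY = 3
ACTIONS = ("work", "walk")

def step(energy, action):
    """Return (utilons gained, new energy) for one action. Numbers are invented."""
    if action == "work":
        # Work pays off in proportion to current energy, and drains it.
        return energy, max(energy - 1, 0)
    if action == "walk":
        # Auxiliary action: no direct payoff, but it restores energy.
        return 0, min(energy + 2, MAX_ENERGY)
    raise ValueError(f"unknown action: {action}")

def run(plan, energy):
    """Total Utilons collected by following `plan` from a starting energy."""
    total = 0
    for action in plan:
        utilons, energy = step(energy, action)
        total += utilons
    return total

def myopic_plan(energy, horizon):
    """Always pick whatever pays the most right now (ties go to 'work')."""
    plan = []
    for _ in range(horizon):
        best = max(ACTIONS, key=lambda a: step(energy, a)[0])
        plan.append(best)
        _, energy = step(energy, best)
    return plan

def lookahead_plan(energy, horizon):
    """Brute-force every sequence and keep the best total, meta-effects included."""
    return max(product(ACTIONS, repeat=horizon), key=lambda p: run(p, energy))

if __name__ == "__main__":
    print(run(myopic_plan(2, 6), 2))     # myopic chooser
    print(run(lookahead_plan(2, 6), 2))  # chooser that accounts for meta-effects
```

With these invented numbers, starting at energy 2 with six time steps, the myopic chooser works itself down to zero energy and ends with 3 Utilons, while the lookahead chooser mixes in walks and ends with 10. The exact figures mean nothing; the point is only that the second chooser scores an action by what it does to future options as well as by its immediate payoff.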


Starting / Stopping Costs:

Attractor Theory also does a good job of modeling how actions seem much harder to start than to stop. Moving from one Attractor to a disparate one can be costly in terms of energy, as you need to move against the pull of the current Attractor.

Once you’re pulled in, though, it’s usually easier to keep going with the flow. So using this model ascribes costs to starting, and it places a lower cost on continuing actions. By “pulled in”, I mean making it feel effortless or desirable to continue with the action.

(I’m thinking of the feeling you get when you have a decent album playing music, and you feel sort of tempted to switch it to a better album, except that, given that this good song is already playing, you don’t really feel like switching. Or something like that.)

This is where willpower comes in. Remember that rolling takes energy, and you probably only have a finite amount of it. Thus, you want to pick and choose when you apply willpower.

The Attractor Theory model suggests that the best opportunities to try “extra hard” are the ones where you predict that things will be smooth sailing once you’re pulled into the Attractor.

For example, if getting started on reading a book is difficult, but you know that you’ll likely find yourself engrossed in the book conditional on your starting, then this is a good place to put in willpower.

Keeping this in mind allows you to strategically go "against the current" in the situations where it'll have the greatest benefit on the immediate Future You's ability to do continued work.
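As a rough illustration of the "spend willpower where it's smooth sailing afterwards" heuristic, here is a small Python sketch. The `Attractor` class, its fields, and all of the numbers are hypothetical stand-ins I'm assuming for the example; the model itself is qualitative and doesn't specify any of this.

```python
from dataclasses import dataclass

@dataclass
class Attractor:
    name: str
    start_cost: float        # willpower needed to get pulled in (made-up units)
    continue_cost: float     # willpower per hour once you're in; must be > 0
    utilons_per_hour: float  # payoff per hour spent inside

def utilons_within_budget(attractor, willpower, max_hours):
    """Utilons earned if the whole willpower budget goes to this one Attractor."""
    if willpower < attractor.start_cost:
        return 0.0  # can't even get started
    remaining = willpower - attractor.start_cost
    hours = min(max_hours, remaining / attractor.continue_cost)
    return hours * attractor.utilons_per_hour

def best_place_for_willpower(attractors, willpower, max_hours):
    """Pick the Attractor where the budget buys the most Utilons."""
    return max(attractors, key=lambda a: utilons_within_budget(a, willpower, max_hours))

# Hard to start but easy to continue, versus easy to start but draining throughout.
book = Attractor("read book", start_cost=5, continue_cost=0.5, utilons_per_hour=2)
email = Attractor("answer email", start_cost=1, continue_cost=3, utilons_per_hour=2)

if __name__ == "__main__":
    choice = best_place_for_willpower([book, email], willpower=10, max_hours=4)
    print(choice.name)
```

Under these assumed numbers the book wins even though it costs five times as much to start, because nearly all of the remaining budget turns into hours of engrossed reading rather than being burned on staying put.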



Attractor Theory views all actions and situations as self-reinforcing slippery slopes.

As such, it more realistically models the act of taking certain actions as leading you to other Attractors, so you’re not just looking at things in isolation.

This view allows you to better see certain “traps”, where an action will lead you deeper and deeper down an addiction/reward cycle, like a huge bag of chips or a webcomic.

These are situations where, after the initial buy-in, it becomes incredibly attractive to continue down the same path, as these actions reinforce themselves, making it easy to continue on and on…

In this model, we can reasonably predict, for example, that any video on YouTube will likely lead to more videos because the “sucked-in-craving-more-videos Future You” will have different preferences than “needing-some-sort-of-break Present You”.

Our model better reveals how things like YouTube are being deceptive by tricking your brain with the promise of a small action (“I’ll watch just one video…”).

The reality is that watching YouTube is a monstrously large Attractor.

(“It’s so f***ing big!”)

The vast variety of suggested videos coupled with the inertia associated with switching actions means that it’s never actually just that one video. Once you’re down the rabbit hole, you just keep on going.

Under Attractor Theory, you’d want to avoid situations which could lead you down dangerous spirals, even when the initial actions themselves may not be that distracting, because the model more accurately penalizes this type of snowballing.



Attractor Theory tries to explain how we’re not always directly in control. Our actions appear to affect how we take other actions. Still, we do have willpower, and it’s best to try and strategically use energy when considering which decisions to make.



5 comments:

This triggered Valentine's Lotus for me. Are the concepts similar on a deeper level?

I think they're pretty different. My read on Lotus Eating is basically "addictiveness is considered harmful and there's a bunch of things out there that can hijack your attention".

My thoughts on Attractor Theory are more about "note the ways your preferences will change in response to actions you take and act accordingly". While this certainly often includes how certain activities can be spirals, I think it goes more breadth-wise and prescribes a more general strategy for which actions to take.

Musings on Lotus Eating can get a lot deeper into the "whys" and "hows" of whether or not X counts as Lotus Eating, whether or not you could secretly benefit from X, etc. Attractor Theory just notes that getting spiraled in could be a consequence of certain actions, and you can take this into account.

Oh, that's right, thanks!

I think I misremembered/misunderstood Lotus and the concepts got jumbled together.

Thanks for introducing the terms Attractor Theory and Meta-Effects.

How does Attractor Theory compare to Virtue Ethics? On one hand, it seems a lot like a utilitarian adaptation of Virtue Ethics. On the other hand, the attractor metaphor seems to focus my attention more on short-term Meta-Effects, while the Virtue Ethics framing seems to focus my attention more on long-term Meta-Effects.

Note: I don't know much about Virtue Ethics other than the post here.

First off, I think it's clear that both of them operate on the assumption that things you do will have effects on you, and that this is a key instrumental consideration. And they both do seem to be basically instances of that general principle.

Virtue Ethics seems to work by shifting your self-image. Doing X to become the Sort Of Person Who Does X, such that in future situations, being consistent wins out, is what I think is important here.

Attractor Theory is sort of more about asking yourself, "If I don't currently 'want' to do X, can I put myself in a position to want to do X?"

Mainly, they seem to work on a different level of granularity.

When using Virtue Ethics, it feels like you're choosing among classes of strategies (EX: "Finishing this project that won't net me much good is still beneficial in the long run because I view myself more as the type of person who gets things done."), while Attractor Theory is more about the moment-to-moment shifts and being mindful of how your local preferences change (EX: "I don't want to do work right now. That'll probably change if I take a nap, so let's do that.").