In a recent post, Scott Garrabrant gave an application of geometric rationality to the problem of work-life balance. Here's the setup: part of you wants to try to make the world better (I'll be calling this your "altruistic part") and part of you wants to relax and play video games (your "relaxation part"). Geometric rationality suggests doing a Nash bargain between the altruistic part and the relaxation part, across possible worlds that you might have found yourself in. In worlds where you're in an unusually good position to make the world better, the bargain commits you to spend most of your time doing that (satisfying your altruistic part); in return, in worlds where you're not in a good position to make the world better, you spend most of your time playing video games (satisfying your relaxation part).
Scott built a toy mathematical model and tested it with a nice example. The example involved five possible worlds, all equally likely. The way the math worked out, if you ranked the five worlds from least conducive to your altruism to most conducive, the Nash bargain had you spend 0%, 50%, 67%, 75%, and 80% of your time on altruism, respectively, in those five worlds. A pretty nice, intuitively satisfying list of numbers.
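To see where those numbers come from, here is a quick numerical sketch (my own code, not Scott's) that solves the bargain for his example using the closed form derived later in this post: in each world you spend max(0, 1 − c/f) of your time on altruism, where c is found by bisection.

```python
# Reproducing the numbers in Scott's five-world example: worlds with
# impact potential f = 1..5, each with probability 1/5, and altruistic
# weight r = 2/3. The optimal bargain is p_i = max(0, 1 - c / f_i),
# where c solves the fixed point c = ((1-r)/r) * mean_i(max(0, f_i - c)).

def solve_c(f, r, iters=100):
    """Find c by bisection; the fixed-point residual g is decreasing in c."""
    g = lambda c: (1 - r) / r * sum(max(0.0, fi - c) for fi in f) / len(f) - c
    lo, hi = 0.0, max(f)  # g(lo) > 0 > g(hi)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

f = [1, 2, 3, 4, 5]
r = 2 / 3
c = solve_c(f, r)  # works out to c = 1 in this example
p = [max(0.0, 1 - c / fi) for fi in f]
print([round(pi, 2) for pi in p])  # [0.0, 0.5, 0.67, 0.75, 0.8]
```

The printed fractions match Scott's list: 0%, 50%, 67%, 75%, 80%.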
I recommend reading Scott's post before reading this one. While I really liked the post, his example was constructed so that the probability distribution over how much altruistic impact you could have was pretty close to uniform: the world most conducive to your altruism gave you only 1.7x more impact than the median world. But I think impact is much more unevenly distributed than that across possible worlds.
My issue with Scott's post wasn't his model (I quite liked the model!); it was only his example. So I decided to take the model and solve for the optimal bargain in more generality. I wanted to see what the results would be -- whether they'd still be intuitively compelling.
The rest of this post will be a bunch of math, but here's the (imprecise) TLDR: if your world is somewhat below average in terms of your impact potential, the bargain has you spend all your time relaxing; if your world is far above average, you spend nearly all your time improving the world; only in a fairly narrow band around the average do you split your time more evenly.
All right, time to get to the math!
The setup
First, let r be the fraction of you that wants to improve the world (Scott's example uses r=2/3).
Order all worlds based on how much impact potential you have, from least to most. For $x \in [0,1]$, let f(x) be the impact potential you have in the x-th quantile world (so f is an increasing function). (In Scott's example, f is 1 on (0, 0.2), 2 on (0.2, 0.4), 3 on (0.4, 0.6), 4 on (0.6, 0.8), and 5 on (0.8, 1).)
Let p(x) be the fraction of time you'll devote to altruism in the x-th quantile world. This is the function we will be optimizing. Specifically, we will be looking for the function $p: [0,1] \to [0,1]$ that maximizes
$$\left(\int_0^1 p(x)f(x)\,dx\right)^{r} \cdot \left(\prod_0^1 \big(1-p(x)\big)^{dx}\right)^{1-r}.$$
(This is just the generalization of Scott's equation to a continuous setting.)
...okay, whoa there, what's this thing with the dx in the exponent? You can read Scott's post on the geometric integral if you'd like. But if you (like me) don't have experience with geometric integrals, fear not: it'll go away once we take the log of that expression (which we can do because the function p that maximizes that expression will also be the p that maximizes its logarithm).
So, taking the log: we are looking for the function p that maximizes
$$V(p) := r \ln \int_0^1 p(x)f(x)\,dx + (1-r)\int_0^1 \ln\big(1-p(x)\big)\,dx.$$
(V stands for "value".)
The next section will be technical. Feel free to skip to the "Drawing conclusions" section below -- you won't lose too much.
Solving for the optimal function p
The right way to optimize this is with Lagrange multipliers. Unfortunately, I don't know Lagrange multipliers, so I do these sorts of optimization problems informally.
Let $\hat p$ be the optimal choice of p. Let's consider making a small change to the value of $\hat p$ near a particular value $x_0$: in particular, for very small $\epsilon$ and very small positive $\delta$, we will consider increasing $\hat p(x)$ by $\epsilon$ for $x \in [x_0 - \delta/2, x_0 + \delta/2]$. Writing $\mu := \int_0^1 \hat p(x)f(x)\,dx$, this change will approximately:

- increase $\mu$ by $\delta\epsilon f(x_0)$, and so increase the first term of V by $\delta\epsilon \cdot \frac{r f(x_0)}{\mu}$; and
- decrease the second term of V by $\delta\epsilon \cdot \frac{1-r}{1-\hat p(x_0)}$,

for a net change of $\delta\epsilon\left(\frac{r f(x_0)}{\mu} - \frac{1-r}{1-\hat p(x_0)}\right)$. In order for $\hat p$ to be optimal, this change cannot be positive (and if $\hat p(x_0) > 0$, the same argument with $\epsilon$ negative says it cannot be negative either). Thus, one of two things must be true for all x: either

- $\hat p(x) > 0$ and $\frac{r f(x)}{\mu} = \frac{1-r}{1-\hat p(x)}$, or
- $\hat p(x) = 0$ and $\frac{r f(x)}{\mu} \le 1-r$.

(Briefly, we can't have the analogous edge case with $\hat p(x) = 1$ because we have $1 - \hat p(x_0)$ in a denominator.)
Therefore, rearranging terms, we have
$$\hat p(x) = \max\left(0,\ 1 - \frac{1-r}{r}\mu \cdot \frac{1}{f(x)}\right).$$
We find it convenient to define $c := \frac{1-r}{r}\mu$, so that
$$\hat p(x) = \max\left(0,\ 1 - \frac{c}{f(x)}\right).$$
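As a sanity check on this closed form, we can verify numerically that it beats alternatives. Here's a sketch in Scott's discrete five-world example (the random seed and the 10,000 random competitors are my own illustrative choices):

```python
# Check that p_hat(x) = max(0, 1 - c/f(x)) maximizes
# V(p) = r*ln(mean(p*f)) + (1-r)*mean(ln(1-p))
# in Scott's example (f = 1..5 equally likely, r = 2/3).
import math
import random

f = [1, 2, 3, 4, 5]
r = 2 / 3

def V(p):
    return (r * math.log(sum(pi * fi for pi, fi in zip(p, f)) / len(f))
            + (1 - r) * sum(math.log(1 - pi) for pi in p) / len(f))

# With c = 1, each p_i * f_i equals max(0, f_i - 1); checking consistency
# with the definition c = ((1-r)/r) * mu recovers c = 1.
mu = sum(max(0.0, fi - 1.0) for fi in f) / len(f)  # mu = 2
c = (1 - r) / r * mu                               # c = 1, consistent
p_hat = [max(0.0, 1 - c / fi) for fi in f]

random.seed(0)
assert all(V(p_hat) >= V([random.uniform(0.0, 0.99) for _ in f])
           for _ in range(10_000))
```

Since V is concave in p, the stationarity conditions above pin down the unique maximizer, which is what the random search is confirming.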
(An aside: the quantity $\mu$ (which, recall, is defined as $\int_0^1 \hat p(x)f(x)\,dx$) has a nice interpretation: it's how much you improve the world, on average across worlds, if you follow the optimal bargain $\hat p$. The quantity c also has a nice interpretation: it's how much your relaxation part would need to be satisfied, if it were to be satisfied as much as your altruistic part is, in proportion to how much these two parts exist.)
Our solution for $\hat p$ shows that you should spend all your time playing video games if you find yourself in a world where $f(x) \le c$, and should spend at least some of your time improving the world if in fact $f(x) > c$.
...okay, but what is c? While we found $\hat p$ in terms of c, we've defined c in terms of $\hat p$. That means we have an equation to solve! In particular, we have
$$c = \frac{1-r}{r}\int_0^1 \max\left(0,\ 1 - \frac{c}{f(x)}\right) f(x)\,dx = \frac{1-r}{r}\int_0^1 \max(0,\ f(x) - c)\,dx = \frac{1-r}{r}\int_{f^{-1}(c)}^{1} (f(x) - c)\,dx.$$
By moving the c out of the integrand and rearranging, we can rewrite this as
$$c = \frac{1-r}{1 - (1-r)f^{-1}(c)} \int_{f^{-1}(c)}^{1} f(x)\,dx. \qquad \text{(Eq. 1)}$$
This isn't a solution, since there's a c on the right side of the equation. And we won't solve for c exactly, but we will prove a nice lemma that bounds c on both sides in terms of $\int_0^1 f(x)\,dx$ -- which, remember, measures your average impact potential across possible worlds.
Lemma: $(1-r)\int_0^1 f(x)\,dx \le c \le \frac{1-r}{r}\int_0^1 f(x)\,dx.$
Proof: the right-hand inequality is straightforward. We have
$$c = \frac{1-r}{r}\mu = \frac{1-r}{r}\int_0^1 \hat p(x)f(x)\,dx \le \frac{1-r}{r}\int_0^1 f(x)\,dx.$$
To prove the left-hand inequality, for convenience we will define $t := f^{-1}(c)$, $A := \int_0^t f(x)\,dx$, and $B := \int_t^1 f(x)\,dx$. Then (from Eq. 1) $c = \frac{1-r}{1-(1-r)t}B$. We also have that $A/t \le c$, i.e. that the average value of f on $[0,t]$ is at most c, since $f(t) = c$ and f is increasing. Therefore, we have
$$\frac{c}{\int_0^1 f(x)\,dx} = \frac{c}{A + B} \ge \frac{c}{ct + B} = \frac{1}{t + \frac{1-(1-r)t}{1-r}} = 1-r.$$
This completes the proof.
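The lemma is also easy to spot-check numerically. Below is a sketch (the particular heavy-tailed f and the grid size are my own choices, for illustration) that discretizes a quantile function with a large spread, solves the fixed-point equation for c by bisection, and confirms that c lands between the lemma's bounds:

```python
# Spot-check of the lemma (1-r)*∫f <= c <= ((1-r)/r)*∫f for a
# heavy-tailed impact distribution f(x) = sqrt(x / (1 - x)),
# discretized on a fine quantile grid. Here ∫₀¹ f(x) dx = π/2 ≈ 1.57,
# while f ranges from near 0 to several hundred on the grid.

n = 100_000
r = 2 / 3
xs = [(i + 0.5) / n for i in range(n)]
f = [(x / (1 - x)) ** 0.5 for x in xs]
mean_f = sum(f) / n  # numerical stand-in for ∫₀¹ f(x) dx

def g(c):
    """Residual of the fixed point c = ((1-r)/r) * ∫ max(0, f(x) - c) dx."""
    return (1 - r) / r * sum(max(0.0, fi - c) for fi in f) / n - c

lo, hi = 0.0, (1 - r) / r * mean_f  # g(lo) > 0 >= g(hi) by the lemma
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)
c = (lo + hi) / 2

assert (1 - r) * mean_f <= c <= (1 - r) / r * mean_f
```

For this f the bounds are roughly 0.52 and 0.79, and the fixed point falls strictly between them.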
Now, recall that $\hat p(x) = \max\left(0,\ 1 - \frac{c}{f(x)}\right)$. Let $C(x) := 1 - \frac{c}{f(x)}$. By the lemma, for any particular x-value $x_0$, we have
$$1 - \frac{1-r}{r} \cdot \frac{\int_0^1 f(x)\,dx}{f(x_0)} \le C(x_0) \le 1 - (1-r) \cdot \frac{\int_0^1 f(x)\,dx}{f(x_0)}.$$
This gives us some nice bounds to work with when interpreting ^p.
Drawing conclusions
Recall that r is the fraction of you that's altruistically minded. In this section, I recommend having a particular value of r in mind. Scott's choice of r=2/3 is a reasonable one.
The optimal function p is $\hat p(x) = \max(0, C(x))$. If you skipped the previous section, you don't need to worry about what C(x) is; all you need to know is that we've bounded it: for every x-value $x_0$, we have
$$1 - \frac{1-r}{r} \cdot \frac{\int_0^1 f(x)\,dx}{f(x_0)} \le C(x_0) \le 1 - (1-r) \cdot \frac{\int_0^1 f(x)\,dx}{f(x_0)}.$$
If $x_0$ is such that $C(x_0) \le 0$, the model tells us to always play video games. We are guaranteed that this is the case whenever
$$\frac{f(x_0)}{\int_0^1 f(x)\,dx} \le 1 - r.$$
The quantity on the left has a really natural interpretation: it is the factor by which you have more impact potential in world x0 than you do in the average world. So if you're in a somewhat below-average impact world, you can safely spend all your time playing video games (or so the model says).
On the other hand, what if this factor is instead pretty large -- at least $100 \cdot \frac{1-r}{r}$, let's say? Then our bound tells us that $\hat p(x_0) \ge 0.99$ -- that is, you should spend nearly all your time working to improve the world.
And if for some reason you think that the factor is pretty close to 1 -- let's say it's exactly 1 -- then the fraction of time you should spend improving the world is somewhere between $2 - \frac{1}{r}$ and $r$. (If r = 2/3, this gives a lower bound of 50% and an upper bound of 67%.) So you'd probably spend some intermediate fraction of your time improving the world.
Analysis and takeaways
First: how satisfying is this solution?
There's a sense in which I find it satisfying, which is that it accords with my intuition of what should have happened in the model. It seems like a reasonable way for the bargain to go down.
Then again, a sense in which I find it dissatisfying is that if you think (as I do) that the distribution of impact potential across possible worlds has a huge spread, then the bargain has almost all copies of you either spending all their time on video games or spending nearly all their time on improving the world. Which isn't really the takeaway that Scott was going for in his original post.
Is there a way to salvage this -- to restore the intuition that in fact you should have a reasonable work-life balance? I think the answer is yes! That's because you're probably very uncertain about what world you live in: you don't have a great idea about what the shape of f is, nor what quantile world you're in (i.e. what x-value your world corresponds to). You should probably assign a decent chance to being in an above-average world in terms of impact potential, and a decent chance to being in a below-average world in terms of impact potential.[1]
Faced with this uncertainty, what do you do? If I understand the spirit of geometric rationality correctly, an important lesson is that you can do a Nash bargain not just between you-in-possible-worlds, but between different possibilities with respect to your epistemic state. For example, if you think there's a 90% chance that the Riemann hypothesis is true, you can bargain between you-if-the-Riemann-hypothesis-is-true and you-if-the-Riemann-hypothesis-is-false, even if there's an objective fact of the matter to which one is the actual you.
You can try to build this uncertainty into the model, but I think this would be really hard.[2] On the other hand, here's a basic fact about Nash bargaining: if there are two options, A and B, and you assign credence x to (A has utility a and B has utility 0) and credence 1−x to (A has utility 0 and B has utility b), then the Nash bargain between these two epistemic states says to do A with probability x and B with probability 1−x.
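This fact is easy to verify numerically. A minimal sketch (the grid search and the specific utilities a = 3, b = 7, etc. are my illustrative choices):

```python
# Two-option Nash bargain across epistemic states: with credence x in
# "A is worth a, B is worth 0" and credence 1-x in "A is worth 0, B is
# worth b", we maximize (q*a)^x * ((1-q)*b)^(1-x) over q = P(do A).
# Taking logs, the objective is x*ln(q*a) + (1-x)*ln((1-q)*b),
# whose argmax is q = x, independent of a and b.
import math

def best_q(x, a, b, steps=100_000):
    """Grid-search the log Nash product over q in (0, 1)."""
    objective = lambda q: x * math.log(q * a) + (1 - x) * math.log((1 - q) * b)
    return max((i / steps for i in range(1, steps)), key=objective)

assert abs(best_q(0.9, a=3.0, b=7.0) - 0.9) < 1e-4
assert abs(best_q(0.25, a=1.0, b=100.0) - 0.25) < 1e-4
```

Note that the utilities a and b drop out of the answer: only the credence x matters.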
And if you squint, this is sort of like what we have going on. Suppose you think that you're either in a substantially above-average world for impact potential (with probability x), or a substantially below-average one (with probability 1−x). In the first epistemic state, "spend almost all your time altruistically" (option A) is much better than "spend all your time on video games" (option B); in the second, option B is much better than option A. And so -- if I'm not losing too much accuracy with my hand-wavy comparisons -- the Nash bargain between these two epistemic states would have you spend an x fraction of your time on altruism, and the rest on video games.
So suppose you think there's a 25% chance you live in a substantially below-average world in terms of impact potential, a 70% chance you live in a substantially above-average world, and a 5% chance of a close-to-average world -- well then, perhaps you should spend about 70% of your time working to improve the world!
...well, if you buy the basic model, and if you buy the hand-waving about Nash bargaining over epistemic states that I just did (which I'm not sure I buy)...
...but, putting all these caveats to the side, I find this conclusion pretty satisfying.
[1] That last bit -- that maybe you're in a below-average-impact world -- is perhaps contrarian, but I stand by it. You may well think that you are in the top one-billionth of possible worlds in terms of impact potential. But it could be that among the worlds that are even higher than this one in terms of your impact potential, 1% of them have $10^{10^{100}}$ times more potential beings than ours, and that this completely dominates the average.
[2] You'd need to have the domain of optimization be the set of your epistemic states instead of the set of world states. But your "epistemic state" includes way more than just a guess about what f looks like across worlds (or even a probability distribution over what f looks like across worlds). You need to include -- in each individual epistemic state -- your guesses about what your probability distribution over f looks like in all the other possible epistemic states. And you'll need to include your guesses about what those look like in all the other possible epistemic states. And so on. There are probably nice simplifying assumptions you could make to make this optimization problem tractable; I haven't thought much about this.