This post is about finding a way to resolve the paradox inherent in Pascal's Mugging. Note that I'm not talking about the bastardized version of Pascal's Mugging that's gotten popular of late, where it's used to refer to any argument involving low probabilities and huge stakes (e.g. low chance of thwarting unsafe AI vs. astronomical stakes). Neither am I talking specifically about the "mugging" illustration, where a "mugger" shows up to threaten you.
Rather, I'm talking about the general decision-theoretic problem, where it makes no difference how low of a probability you put on some deal paying off, because one can always choose a humongous enough payoff to make "make this deal" be the dominating option. This is a problem that needs to be solved in order to build e.g. an AI system that uses expected utility and will behave in a reasonable manner.
Intuition: how Pascal's Mugging breaks implicit assumptions in expected utility theory
Intuitively, the problem with Pascal's Mugging type arguments is that some probabilities are just too low to care about. And we need a way to look at just the probability component of the expected utility calculation and ignore the utility component, since the core of PM is that the utility can always be arbitrarily increased to overwhelm the low probability.
Let's look at the concept of expected utility a bit. If you have a 10% chance of getting a dollar each time when you make a deal, and this has an expected value of 0.1, then this is just a different way of saying that if you took the deal ten times, then you would on average have 1 dollar at the end of that deal.
More generally, it means that if you had the opportunity to make ten different deals that all had the same expected value, then after making all of those, you would on average end up with one dollar. This is the justification for why it makes sense to follow expected value even for unique non-repeating events: because even if that particular event wouldn't repeat, if your general strategy is to accept other bets with the same EV, then you will end up with the same outcome as if you'd taken the same repeating bet many times. And even though you only get the dollar after ten deals on average, if you repeat the trials sufficiently many times, your probability of having the average payout will approach one.
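The convergence claim in the last sentence can be checked numerically. What follows is a minimal sketch rather than part of the original argument: it assumes each deal is an independent trial that pays one dollar with probability 0.1, and the function names and the tolerance of 0.05 dollars per deal are illustrative choices of mine.

```python
import math


def binomial_log_pmf(k: int, n: int, p: float) -> float:
    """Log of P(exactly k payoffs in n independent deals).

    Computed in log space so that large n does not underflow.
    """
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log1p(-p)


def prob_average_near_expectation(n: int, p: float = 0.1, tol: float = 0.05) -> float:
    """Probability that the average payout per deal lands within tol of p."""
    total = 0.0
    for k in range(n + 1):
        # The small epsilon guards against float rounding at the boundary.
        if abs(k / n - p) <= tol + 1e-12:
            total += math.exp(binomial_log_pmf(k, n, p))
    return total


for n in (10, 100, 1000):
    print(n, prob_average_near_expectation(n))
```

Under these assumptions the printed probability rises toward one as the number of deals grows, which is the property the expected value argument leans on.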
Now consider a Pascal's Mugging scenario. Say someone offers to create 10^100 happy lives in exchange for something, and you assign a probability of 0.000000000000000000001 to their being capable of and willing to carry through on their promise. Naively, this has an overwhelmingly positive expected value.
But is it really a beneficial trade? Suppose that you could make one deal like this per second, and you expect to live for 60 more years, for about 1.9 billion trades in total. Then there would be a probability of 0.999999999998 that the deal would never once have paid off for you. That suggests that the EU calculation's implicit assumption - that you can repeat this often enough for the utility to converge to the expected value - would be violated.
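The arithmetic behind these figures can be reproduced in a few lines. This is a sketch under stated assumptions: a year is taken as 365.25 days, the deals are treated as independent, and the probability is read from the text as 10^-21. The variable names are mine.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60
n_trades = int(60 * SECONDS_PER_YEAR)  # one trade per second for 60 years
p_payoff = 1e-21                       # probability that a single deal pays off

# 1 - 1e-21 rounds to exactly 1.0 in double precision, so (1 - p) ** n would
# wrongly give 1.0. log1p and expm1 keep the tiny terms.
log_p_never = n_trades * math.log1p(-p_payoff)
p_never = math.exp(log_p_never)
p_at_least_once = -math.expm1(log_p_never)

# Expected number of payoffs over the whole lifetime.
expected_payoffs = n_trades * p_payoff

print(n_trades)              # about 1.9 billion
print(f"{p_never:.12f}")     # 0.999999999998
print(p_at_least_once)       # about 1.9e-12
print(expected_payoffs)      # also about 1.9e-12, far below one
```

When the per-deal probability is this small, the chance of at least one payoff and the expected number of payoffs are practically the same number.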
Our first attempt
This suggests an initial way of defining a "probability small enough to be ignored":
1. Define a "probability small enough to be ignored" (PSET, or by slight rearranging of letters, PEST) such that, over your lifetime, the expected number of times that the event happens will be less than one.
2. Ignore deals where the probability component of the EU calculation involves a PEST.
Looking at the first attempt in detail
To calculate PEST, we need to know how often we might be offered a deal with such a probability. E.g. a 10% chance for something might be a PEST if we only lived for a short enough time that we could make a deal with a 10% chance once. So, a more precise definition of a PEST might be that it's a probability such that
(number of deals that you can make in your life that have this probability) * (PEST) < 1
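As a rough sketch, the inequality can be written as a small predicate. The function names are hypothetical, and the `threshold` parameter (defaulting to the value 1 used in the inequality) is left adjustable purely as an illustrative assumption.

```python
def expected_occurrences(probability: float, n_deals: float) -> float:
    """Expected number of times a deal with this probability pays off."""
    return n_deals * probability


def is_pest(probability: float, n_deals: float, threshold: float = 1.0) -> bool:
    """True if the probability is small enough to be ignored.

    A probability counts as a PEST when the expected number of payoffs
    over all the deals you can make stays below the threshold.
    """
    return expected_occurrences(probability, n_deals) < threshold


print(is_pest(0.1, n_deals=1))                # True: a single 10% shot
print(is_pest(0.1, n_deals=100))              # False: about ten expected payoffs
print(is_pest(1e-21, n_deals=1_893_456_000))  # True: the mugging scenario
```

The same 10% probability flips from ignorable to non-ignorable purely because the number of available deals changes, which is the point of the definition.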
But defining "one" as the minimum number of times we should expect the event to happen for the probability to not be a PEST feels a little arbitrary. Intuitively, it feels like the threshold should depend on our degree of risk aversion: maybe if we're risk averse, we want to reduce the expected number of times something happens during our lives to (say) 0.001 before we're ready to ignore it. But part of our motivation was that we wanted a way to ignore the utility part of the calculation: bringing in our degree of risk aversion seems like it might introduce the utility again.
What if we redefined risk aversion/neutrality/preference (at least in this context) as how low one would be willing to let the "expected number of times this might happen" fall before considering a probability a PEST?
Let's use this idea to define an Expected Lifetime Utility:
ELU(S,L,R) = the ELU of a strategy S over a lifetime L is the expected utility you would get if you could make L deals in your life, and were only willing to accept deals with a minimum probability P of at least S, taking into account your risk aversion R and assuming that each deal will pay off approximately P*L times.
Suppose that we have a world where we can take three kinds of actions.
- Action A takes 1 unit of time and has an expected utility of 2 and probability 1/3 of paying off on any one occasion.
- Action B takes 3 units of time and has an expected utility of 10^(Graham's number) and probability 1/100000000000000 of paying off on any one occasion.
- Action C takes 5 units of time and has an expected utility of 20 and probability 1/100 of paying off on any one occasion.
Assuming that the world's lifetime is fixed at L = 1000 and R = 1:
ELU("always choose A"): we expect A to pay off on ((1000 / 1) * 1/3) ≈ 333 individual occasions, so with R = 1, we deem it acceptable to consider the utility of A. The ELU of this strategy becomes (1000 / 1) * 2 = 2000.
ELU("always choose B"): we expect B to pay off on ((1000 / 3) * 1/100000000000000) = 0.00000000000333 occasions, so with R = 1, we consider the expected utility of B to be 0. The ELU of this strategy thus becomes ((1000 / 3) * 0) = 0.
ELU("always choose C"): we expect C to pay off on ((1000 / 5) * 1/100) = 2 individual occasions, so with R = 1, we consider the expected utility of C to be ((1000 / 5) * 20) = 4000.
Thus, "always choose C" is the best strategy.
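The three calculations above can be sketched as a short script. This is one possible reading of the ELU definition, not a definitive implementation: the `Action` class and `elu_always` function are hypothetical names, fractional numbers of deals (such as 1000 / 3) are allowed as in the text, and `float("inf")` is only a stand-in for 10^(Graham's number), which no machine can represent.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Action:
    name: str
    duration: int            # units of time per deal
    expected_utility: float  # expected utility of one deal
    p_payoff: Fraction       # probability of paying off on one occasion


def elu_always(action: Action, lifetime: int, r: Fraction) -> float:
    """ELU of the strategy "always choose this action"."""
    n_deals = Fraction(lifetime, action.duration)
    expected_payoffs = n_deals * action.p_payoff
    if expected_payoffs < r:
        # The probability is a PEST, so the deal's utility is treated as 0.
        return 0.0
    return float(n_deals) * action.expected_utility


ACTIONS = [
    Action("A", 1, 2.0, Fraction(1, 3)),
    # Assumption: infinity stands in for 10^(Graham's number).
    Action("B", 3, float("inf"), Fraction(1, 10**14)),
    Action("C", 5, 20.0, Fraction(1, 100)),
]

L, R = 1000, Fraction(1)
results = {a.name: elu_always(a, L, R) for a in ACTIONS}
best = max(results, key=results.get)

print(results)  # {'A': 2000.0, 'B': 0.0, 'C': 4000.0}
print(best)     # C
```

Passing a much smaller R into this sketch (anything below roughly 3.3e-12) would let B through the filter, at which point its unbounded payoff dominates again. The whole result hinges on how R is chosen.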
Is R something totally arbitrary, or can we determine some more objective criteria for it?
Here's where I'm stuck. Thoughts are welcome. I do know that while setting R = 1 was a convenient example, it's most likely too high, because it would suggest things like not using seat belts.
An interesting thing about this approach is that the threshold for a PEST becomes dependent on one's expected lifetime. This is surprising at first, but actually makes some intuitive sense. If you're living in a dangerous environment where you might be killed anytime soon, you won't be very interested in speculative low-probability options; rather you want to focus on making sure you survive now. Whereas if you live in a modern-day Western society, you may be willing to invest some amount of effort in weird low-probability high-payoff deals, like cryonics.
On the other hand, whereas investing in that low-probability, high-utility option might not be good for you individually, it could still be a good evolutionary strategy for your genes. You yourself might be very likely to die, but someone else carrying the risk-taking genes might hit big and be very successful in spreading their genes. So it seems like our definition of L, lifetime length, should vary based on what we want: are we looking to implement this strategy just in ourselves, our whole species, or something else? Exactly what are we maximizing over?