Pascal's Muggle (short version)


29


Eliezer_Yudkowsky

Shortened version of:  Pascal's Muggle:  Infinitesimal Priors and Strong Evidence

One proposal which has been floated for dealing with Pascal's Mugger is to penalize hypotheses that let you affect a large number of people, in proportion to the number of people affected - what we could call perhaps a "leverage penalty" instead of a "complexity penalty".  This isn't just for Pascal's Mugger in particularly, it seems required to have expected utilities in general converge when the 'size' of scenarios can grow much faster than their algorithmic complexity.

Unfortunately this potentially leads us into a different problem, that of Pascal's Muggle.

Suppose a poorly-dressed street person asks you for five dollars in exchange for doing a googolplex's worth of good using his Matrix Lord powers - say, saving the lives of a googolplex other people inside computer simulations they're running.

"Well," you reply, "I think that it would be very improbable that I would be able to affect so many people through my own, personal actions - who am I to have such a great impact upon events?  Indeed, I think the probability is somewhere around one over googolplex, maybe a bit less.  So no, I won't pay five dollars - it is unthinkably improbable that I could do so much good!"

"I see," says the Mugger.

A wind begins to blow about the alley, whipping the Mugger's loose clothes about him as they shift from ill-fitting shirt and jeans into robes of infinite blackness, within whose depths tiny galaxies and stranger things seem to twinkle.  In the sky above, a gap edged by blue fire opens with a horrendous tearing sound - you can hear people on the nearby street yelling in sudden shock and terror, implying that they can see it too - and displays the image of the Mugger himself, wearing the same robes that now adorn his body, seated before a keyboard and a monitor.

"That's not actually me," the Mugger says, "just a conceptual representation, but I don't want to drive you insane.  Now give me those five dollars, and I'll save a googolplex lives, just as promised.  It's easy enough for me, given the computing power my home universe offers.  As for why I'm doing this, there's an ancient debate in philosophy among my people - something about how we ought to sum our expected utilities - and I mean to use the video of this event to make a point at the next decision theory conference I attend.   Now will you give me the five dollars, or not?"

"Mm... no," you reply.

"No?" says the Mugger.  "I understood earlier when you didn't want to give a random street person five dollars based on a wild story with no evidence behind it whatsoever.  But surely I've offered you evidence now."

"Unfortunately, you haven't offered me enough evidence," you explain.

"Seriously?" says the Mugger.  "I've opened up a fiery portal in the sky, and that's not enough to persuade you?  What do I have to do, then?  Rearrange the planets in your solar system, and wait for the observatories to confirm the fact?  I suppose I could also explain the true laws of physics in the higher universe in more detail, and let you play around a bit with the computer program that encodes all the universes containing the googolplex people I would save if you just gave me the damn five dollars -"

"Sorry," you say, shaking your head firmly, "there's just no way you can convince me that I'm in a position to affect a googolplex people, because the prior probability of that is one over googolplex.  If you wanted to convince me of some fact of merely 2-100 prior probability, a mere decillion to one - like that a coin would come up heads and tails in some particular pattern of a hundred coinflips - then you could just show me 100 bits of evidence, which is within easy reach of my brain's sensory bandwidth.  I mean, you could just flip the coin a hundred times, and my eyes, which send my brain a hundred megabits a second or so - though that gets processed down to one megabit or so by the time it goes through the lateral geniculate nucleus - would easily give me enough data to conclude that this decillion-to-one possibility was true.  But to conclude something whose prior probability is on the order of one over googolplex, I need on the order of a googol bits of evidence, and you can't present me with a sensory experience containing a googol bits.  Indeed, you can't ever present a mortal like me with evidence that has a likelihood ratio of a googolplex to one - evidence I'm a googolplex times more likely to encounter if the hypothesis is true, than if it's false - because the chance of all my neurons spontaneously rearranging themselves to fake the same evidence would always be higher than one over googolplex.  You know the old saying about how once you assign something probability one, or probability zero, you can't update that probability regardless of what evidence you see?  Well, odds of a googolplex to one, or one to a googolplex, work pretty much the same way."

"So no matter what evidence I show you," the Mugger says - as the blue fire goes on crackling in the torn sky above, and screams and desperate prayers continue from the street beyond - "you can't ever notice that you're in a position to help a googolplex people."

"Right!" you say.  "I can believe that you're a Matrix Lord.  I mean, I'm not a total Muggle, I'm psychologically capable of responding in some fashion to that giant hole in the sky.  But it's just completely forbidden for me to assign any significant probability whatsoever that you will actually save a googolplex people after I give you five dollars.  You're lying, and I am absolutely, absolutely, absolutely confident of that."

"So you weren't just invoking the leverage penalty as a plausible-sounding way of getting out of paying me the five dollars earlier," the Mugger says thoughtfully.  "I mean, I'd understand if that was just a rationalization of your discomfort at forking over five dollars for what seemed like a tiny probability, when I hadn't done my duty to present you with a corresponding amount of evidence before demanding payment.  But you... you're acting like an AI would if it was actually programmed with a leverage penalty on hypotheses!"

"Exactly," you say.  "I'm forbidden a priori to believe I can ever do that much good."

"Why?" the Mugger says curiously.  "I mean, all I have to do is press this button here and a googolplex lives will be saved."  The figure within the blazing portal above points to a green button on the console before it.

"Like I said," you explain again, "the prior probability is just too infinitesimal for the massive evidence you're showing me to overcome it -"

The Mugger shrugs, and vanishes in a puff of purple mist.

The portal in the sky above closes, taking with the console and the green button.

(The screams go on from the street outside.)

A few days later, you're sitting in your office at the physics institute where you work, when one of your colleagues bursts in through your door, seeming highly excited.  "I've got it!" she cries.  "I've figured out that whole dark energy thing!  Look, these simple equations retrodict it exactly, there's no way that could be a coincidence!"

At first you're also excited, but as you pore over the equations, your face configures itself into a frown.  "No..." you say slowly.  "These equations may look extremely simple so far as computational complexity goes - and they do exactly fit the petabytes of evidence our telescopes have gathered so far - but I'm afraid they're far too improbable to ever believe."

"What?" she says.  "Why?"

"Well," you say reasonably, "if these equations are actually true, then our descendants will be able to exploit dark energy to do computations, and according to my back-of-the-envelope calculations here, we'd be able to create around a googolplex people that way.  But that would mean that we, here on Earth, are in a position to affect a googolplex people - since, if we blow ourselves up via a nanotechnological war or (cough) make certain other errors, those googolplex people will never come into existence.  The prior probability of us being in a position to impact a googolplex people is on the order of one over googolplex, so your equations must be wrong."

"Hmm..." she says.  "I hadn't thought of that.  But what if these equations are right, and yet somehow, everything I do is exactly balanced, down to the googolth decimal point or so, with respect to how it impacts the chance of modern-day Earth participating in a chain of events that leads to creating an intergalactic civilization?"

"How would that work?" you say.  "There's only seven billion people on today's Earth - there's probably been only a hundred billion people who ever existed total, or will exist before we go through the intelligence explosion or whatever - so even before analyzing your exact position, it seems like your leverage on future affairs couldn't reasonably be less than one in a ten trillion part of the future or so."

"But then given this physical theory which seems obviously true, my acts might imply expected utility differentials on the order of 1010100-13," she explains, "and I'm not allowed to believe that no matter how much evidence you show me."


This problem may not be as bad as it looks; a leverage penalty may lead to more reasonable behavior than depicted above, after taking into account Bayesian updating:


Mugger:  "Give me five dollars, and I'll save 3↑↑↑3 lives using my Matrix Powers."

You:  "Nope."

Mugger:  "Why not?  It's a really large impact."

You:  "Yes, and I assign a probability on the order of 1 in 3↑↑↑3 that I would be in a unique position to affect 3↑↑↑3 people."

Mugger:  "Oh, is that really the probability that you assign?  Behold!"

(A gap opens in the sky, edged with blue fire.)

Mugger:  "Now what do you think, eh?"

You:  "Well... I can't actually say this has a likelihood ratio of 3↑↑↑3 to 1.  No stream of evidence that can enter a human brain over the course of a century is ever going to have a likelihood ratio larger than, say, 101026 to 1 at the absurdly most, assuming one megabit per second of sensory data, for a century, each bit of which has at least a 1-in-a-trillion error probability.  You'd probably start to be dominated by Boltzmann brains or other exotic minds well before then."

Mugger:  "So you're not convinced."

You:  "Indeed not.  The probability that you're telling the truth is so tiny that God couldn't find it with an electron microscope.  Here's the five dollars."

Mugger:  "Done!  You've saved 3↑↑↑3 lives!  Congratulations, you're never going to top that, your peak life accomplishment will now always lie in your past.  But why'd you give me the five dollars if you think I'm lying?"

You:  "Well, because the evidence you did present me with had a likelihood ratio of at least a billion to one - I would've assigned less than 10-9 prior probability of seeing this when I woke up this morning - so in accordance with Bayes's Theorem I promoted the probability from 1/3↑↑↑3 to at least 109/3↑↑↑3, which when multiplied by an impact of 3↑↑↑3, yields an expected value of at least a billion lives saved for giving you five dollars."


I confess that I find this line of reasoning a bit suspicious - it seems overly clever - but at least on the level of intuitive-virtues-of-rationality it doesn't seem completely stupid in the same way as Pascal's Muggle.  This muggee is at least behaviorally reacting to the evidence.  In fact, they're reacting in a way exactly proportional to the evidence - they would've assigned the same net importance to handing over the five dollars if the Mugger had offered 3↑↑↑4 lives, so long as the strength of the evidence seemed the same.

But I still feel a bit nervous about the idea that Pascal's Muggee, after the sky splits open, is handing over five dollars while claiming to assign probability on the order of 109/3↑↑↑3 that it's doing any good.  My own reaction would probably be more like this:


Mugger:  "Give me five dollars, and I'll save 3↑↑↑3 lives using my Matrix Powers."

Me:  "Nope."

Mugger:  "So then, you think the probability I'm telling the truth is on the order of 1/3↑↑↑3?"

Me:  "Yeah... that probably has to follow.  I don't see any way around that revealed belief, given that I'm not actually giving you the five dollars.  I've heard some people try to claim silly things like, the probability that you're telling the truth is counterbalanced by the probability that you'll kill 3↑↑↑3 people instead, or something else with a conveniently exactly equal and opposite utility.  But there's no way that things would balance out that neatly in practice, if there was no a priori mathematical requirement that they balance.  Even if the prior probability of your saving 3↑↑↑3 people and killing 3↑↑↑3 people, conditional on my giving you five dollars, exactly balanced down to the log(3↑↑↑3) decimal place, the likelihood ratio for your telling me that you would "save" 3↑↑↑3 people would not be exactly 1:1 for the two hypotheses down to the log(3↑↑↑3) decimal place.  So if I assigned probabilities much greater than 1/3↑↑↑3 to your doing something that affected 3↑↑↑3 people, my actions would be overwhelmingly dominated by even a tiny difference in likelihood ratio elevating the probability that you saved 3↑↑↑3 people over the probability that you did something equally and oppositely bad to them.  The only way this hypothesis can't dominate my actions - really, the only way my expected utility sums can converge at all - is if I assign probability on the order of 1/3↑↑↑3 or less.  I don't see any way of escaping that part."

Mugger:  "But can you, in your mortal uncertainty, truly assign a probability as low as 1 in 3↑↑↑3 to any proposition whatever?  Can you truly believe, with your error-prone neural brain, that you could make 3↑↑↑3 statements of any kind one after another, and be wrong, on average, about once?"

Me:  "Nope."

Mugger:  "So give me five dollars!"

Me:  "Nope."

Mugger:  "Why not?"

Me:  "Because even though I, in my mortal uncertainty, will eventually be wrong about all sorts of things if I make enough statements one after another, this fact can't be used to increase the probability of arbitrary statements beyond what my prior says they should be, because then my prior would sum to more than 1.  There must be some kind of required condition for taking a hypothesis seriously enough to worry that I might be overconfident about it -"

Mugger:  "Then behold!"

(A gap opens in the sky, edged with blue fire.)

Mugger:  "Now what do you think, eh?"

Me (staring up at the sky):  "...whoa."  (Pause.)  "You turned into a cat."

Mugger:  "What?"

Me:  "Private joke.  Okay, I think I'm going to have to rethink a lot of things.  But if you want to tell me about how I was wrong to assign a prior probability on the order of 1/3↑↑↑3 to your scenario, I will shut up and listen very carefully to what you have to say about it.  Oh, and here's the five dollars, can I pay an extra twenty and make some other requests?"

(The thought bubble pops, and we return to two people standing in an alley, the sky above perfectly normal.)

Mugger:  "Now, in this scenario we've just imagined, you were taking my case seriously, right?  But the evidence there couldn't have had a likelihood ratio of more than 101026 to 1, and probably much less.  So by the method of imaginary updates, you must assign probability at least 10-1026 to my scenario, which when multiplied by a benefit on the order of 3↑↑↑3, yields an unimaginable bonanza in exchange for just five dollars -"

Me:  "Nope."

Mugger:  "How can you possibly say that?  You're not being logically coherent!"

Me:  "I agree that I'm being incoherent in a sense, but I think that's acceptable in this case, since I don't have infinite computing power.  In the scenario you're asking me to imagine, you're presenting me with evidence which I currently think Can't Happen.  And if that actually does happen, the sensible way for me to react is by questioning my prior assumptions and reasoning which led me to believe I shouldn't see it happen.  One way that I handle my lack of logical omniscience - my finite, error-prone reasoning capabilities - is by being willing to assign infinitesimal probabilities to non-privileged hypotheses so that my prior over all possibilities can sum to 1.  But if I actually see strong evidence for something I previously thought was super-improbable, I don't just do a Bayesian update, I should also question whether I was right to assign such a tiny probability in the first place - whether the scenario was really as complex, or unnatural, as I thought.  In real life, you are not ever supposed to have a prior improbability of 10-100 for some fact distinguished enough to be written down, and yet encounter strong evidence, say 1010 to 1, that the thing has actually happened.  If something like that happens, you don't do a Bayesian update to a posterior of 10-90.  Instead you question both whether the evidence might be weaker than it seems, and whether your estimate of prior improbability might have been poorly calibrated, because rational agents who actually have well-calibrated priors should not encounter situations like that until they are ten billion days old.  Now, this may mean that I end up doing some non-Bayesian updates:  I say some hypothesis has a prior probability of a quadrillion to one, you show me evidence with a likelihood ratio of a billion to one, and I say 'Guess I was wrong about that quadrillion to one thing' rather than being a Muggle about it.  And then I shut up and listen to what you have to say about how to estimate probabilities, because on my worldview, I wasn't expecting to see you turn into a cat.  But for me to make a super-update like that - reflecting a posterior belief that I was logically incorrect about the prior probability - you have to really actually show me the evidence, you can't just ask me to imagine it.  This is something that only logically incoherent agents ever say, but that's all right because I'm not logically omniscient."


When I add up a complexity penalty, a leverage penalty, and the "You turned into a cat!" logical non-omniscience clause, I get the best candidate I have so far for the correct decision-theoretic way to handle these sorts of possibilities while still having expected utilities converge.

As mentioned in the longer version, this has very little in the way of relevance for optimal philanthropy, because we don't really need to consider these sorts of rules for handling small large numbers on the order of a universe containing 1080 atoms, and because most of the improbable leverage associated with x-risk charities is associated with discovering yourself to be an Ancient Earthling from before the intelligence explosion, which improbability (for universes the size of 1080 atoms) is easily overcome by the sensory experiences which tell you you're an Earthling.  For more on this see the original long-form post.  The main FAI issue at stake is what sort of prior to program into an AI.