Related to: Can Counterfactuals Be True?, Newcomb's Problem and Regret of Rationality.

Imagine that one day, Omega comes to you and says that it has just tossed a fair coin, and given that the coin came up tails, it decided to ask you to give it $100. Whatever you do in this situation, nothing else will happen differently in reality as a result. Naturally you don't want to give up your $100. But see, Omega tells you that if the coin came up heads instead of tails, it'd give you $10000, but only if you'd agree to give it $100 if the coin came up tails.

Omega can predict your decision in the case where it asks you to give it $100, even if that hasn't actually happened; it can compute the counterfactual truth. Omega is also known to be absolutely honest and trustworthy, with no word-twisting, so the facts are really as it says: it really tossed a coin and really would've given you $10000.

From your current position, it seems absurd to give up your $100. Nothing good happens if you do that, the coin has already landed tails up, you'll never see the counterfactual $10000. But look at this situation from your point of view before Omega tossed the coin. There, you have two possible branches ahead of you, of equal probability. On one branch, you are asked to part with $100, and on the other branch, you are conditionally given $10000. If you decide to keep $100, the expected gain from this decision is $0: there is no exchange of money, you don't give Omega anything on the first branch, and as a result Omega doesn't give you anything on the second branch. If you decide to give $100 on the first branch, then Omega gives you $10000 on the second branch, so the expected gain from this decision is

-$100 * 0.5 + $10000 * 0.5 = $4950

So, this straightforward calculation tells you that you ought to give up your $100. It looks like a good idea before the coin toss, but it starts to look like a bad idea after the coin came up tails. Had you known about the deal in advance, one possible course of action would be to set up a precommitment. You contract a third party, agreeing that you'll lose $1000 if you don't give $100 to Omega, in case it asks for that. In this case, you leave yourself no other choice.
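As a rough check on the arithmetic above, here is a minimal Python sketch. The function and constant names are invented for illustration; the only inputs are the figures stated in the post (a fair coin, $10000, $100, and the $1000 contract penalty).

```python
from fractions import Fraction

P_HEADS = Fraction(1, 2)  # fair coin, as stated in the post
REWARD = 10_000           # paid on heads, only to agents who would pay on tails
COST = 100                # requested on tails
PENALTY = 1_000           # lost to the third party if you refuse (the contract)


def expected_gain(pays_on_tails: bool) -> Fraction:
    """Expected gain of a fixed policy, evaluated before the coin toss."""
    heads = REWARD if pays_on_tails else 0
    tails = -COST if pays_on_tails else 0
    return P_HEADS * heads + (1 - P_HEADS) * tails


def payoff_after_tails(pays: bool, contract: bool) -> int:
    """Payoff once tails is known, with or without the third-party contract."""
    if pays:
        return -COST
    return -PENALTY if contract else 0


print(expected_gain(True))   # 4950
print(expected_gain(False))  # 0
```

With the contract in place, `payoff_after_tails(True, True)` is -100 and `payoff_after_tails(False, True)` is -1000, so paying becomes the better move even after tails is known. Without the contract, refusing looks better at that point, which is exactly the tension the post describes.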

But in this game, explicit precommitment is not an option: you didn't know about Omega's little game until the coin was already tossed and the outcome of the toss was given to you. The only thing that stands between Omega and your $100 is your ritual of cognition. And so I ask you all: is the decision to give up $100 when you have no real benefit from it, only counterfactual benefit, an example of winning?

P.S. Let's assume that the coin is deterministic, that in the overwhelming measure of the MWI worlds it gives the same outcome. You don't care about a fraction that sees a different result, in all reality the result is that Omega won't even consider giving you $10000, it only asks for your $100. Also, the deal is unique, you won't see Omega ever again.


The counterfactual anti-mugging: One day No-mega appears. No-mega is completely trustworthy etc. No-mega describes the counterfactual mugging to you, and predicts what you would have done in that situation not having met No-mega, if Omega had asked you for $100.

If you would have given Omega the $100, No-mega gives you nothing. If you would not have given Omega $100, No-mega gives you $10000. No-mega doesn't ask you any questions or offer you any choices. Do you get the money? Would an ideal rationalist get the money?

Okay, next scenario: you have a magic box with a number p inscribed on it. When you open it, either No-mega comes out (probability p) and performs a counterfactual anti-mugging, or Omega comes out (probability 1-p), flips a fair coin and proceeds to either ask for $100, give you $10000, or give you nothing, as in the counterfactual mugging.

Before you open the box, you have a chance to precommit. What do you do?

If you would have given Omega the $100, No-mega gives you nothing. If you would not have given Omega $100, No-mega gives you $10000. No-mega doesn't ask you any questions or offer you any choices. Do you get the money? Would an ideal rationalist get the money?

I would have no actionable suspicion that I should give Omega the $100 unless I knew about No-mega. So I get the $10000 only if No-mega asks the question "What would Eliezer do knowing about No-mega?" and not if No-mega asks the question "What would Eliezer do not knowing about No-mega?"

You forgot about MetaOmega, who gives you $10,000 if and only if No-mega wouldn't have given you anything, and O-mega, who kills your family unless you're an Alphabetic Decision Theorist. This comment doesn't seem specifically anti-UDT -- after all, Omega and No-mega are approximately equally likely to exist; a ratio of 1:1 if not an actual p of .5 -- but it still has the ring of Just Cheating. Admittedly, I don't have any formal way of telling the difference between decision problems that feel more or less legitimate, but I think part of the answer might be that the Counterfactual Mugging isn't really about how to act around superintelligences: It illustrates a more general need to condition our decisions based on counterfactuals, and as EY pointed out, UDT still wins the No-mega problem if you know about No-mega, so whether or not we should subscribe to some decision theory isn't all that dependent on which superintelligences we encounter.

I'm necroing pretty hard and might be assuming too much about what Caspian originally meant, so the above is more me working this out for myself than anything else. But if anyone can explain why the No-mega problem feels like cheating to me, that would be appreciated.

Yes, that there can just as easily be a superintelligence that rewards people predicted to act one way as one that rewards people predicted to act the other. Which precommitment is most rational depends on which type you expect to encounter.

I don't expect to encounter either, and on the other hand I can't rule out fallible human analogues of either. So for now I'm not precommitting either way.

You don't precommit to "give away the $100, to anyone who asks". You precommit to give away the $100 in exactly the situation I described. Or, generalizing such precommitments, you just compute your decisions on the spot, in a reflectively consistent fashion. If that's what you want to do with your future self, that is.

there can just as easily be a superintelligence that rewards people predicted to act one way as one that rewards people predicted to act the other.

Yeah, now. But after Omega really, really, appears in front of you, the chance of Omega existing is about 1. The chance of No-mega is still almost non-existent. In this problem, the existence of Omega is given. It's not something you are expecting to encounter now, just as we're not expecting to encounter eccentric Kavkan billionaires who will give you money for intoxicating yourself. Kavka's Toxin and the counterfactual mugging present a scenario that is given, and ask how you would act then.

But you aren't supposed to be updating... the essence of UDT, I believe, is that your policy should be set NOW, and NEVER UPDATED.

So... either:

  1. You consider the choice of policy based on the prior where you DIDN'T KNOW whether you'd face Nomega or Omega, and NEVER UPDATE IT (this seems obviously wrong to me: why are you using your old prior instead of your current posterior?). or
  2. You consider the choice of policy based on the prior where you KNOW that you are facing Omega AND that the coin is tails, in which case paying Omega only loses you money.

Thanks for pointing that out. The answer is, as expected, a function of p. So I now find explanations of why UDT gets mugged incomplete and misleading.

Here's my analysis:

The action set is {give, don't give}, which I'll identify with {1, 0}. Now, the possible deterministic policies are simply every mapping from {N,O} --> {1,0}, of which there are 4.

We can disregard the policies for which pi(N) = 1, since giving money to Nomega serves no purpose. So we're left with

pi_give: pi_give(N) = 0, pi_give(O) = 1

pi_don't: pi_don't(N) = 0, pi_don't(O) = 0

which give/don't, respectively, to Omega.

Now, we can easily compute expected value, as follows:

r (pi_give(N)) = 0

r (pi_give(O, heads)) = 10

r (pi_give(O, tails)) = -1

r (pi_don't(N)) = 10

r (pi_don't(O)) = 0

So now:

Eg := E_give(r) = 0 p + .5 (10-1) * (1-p)

Ed := E_don't(r) = 10 p + 0 (1-p)

Eg > Ed whenever 4.5 (1-p) > 10 p,

i.e. whenever 4.5 > 14.5 p

i.e. whenever 9/29 > p

So, whether you should precommit to being mugged depends on how likely you are to encounter N vs. O, which is intuitively obvious.
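The threshold derived above can be reproduced with exact fractions. This is only a sketch: the function names are hypothetical, and the default payoffs (10 and -1) are the stylized values used in this comment. Measuring the post's dollar amounts in units of $100 would instead give 100 and -1, which is an assumption about the intended scale and not something the comment states.

```python
from fractions import Fraction


def expected_values(p, reward=10, cost=1):
    """Return (E_give, E_dont) when No-mega appears with probability p."""
    p = Fraction(p)
    e_give = Fraction(1, 2) * (reward - cost) * (1 - p)  # Omega branch only
    e_dont = reward * p                                  # No-mega branch only
    return e_give, e_dont


def threshold(reward=10, cost=1):
    """Largest p (exclusive) for which committing to give still wins."""
    half_net = Fraction(reward - cost, 2)
    return half_net / (half_net + reward)


print(threshold())        # 9/29
print(threshold(100, 1))  # 99/299
```

Under either scaling the cutoff sits a little below one third, so the qualitative conclusion (precommit to paying only if No-mega is sufficiently unlikely) does not depend on which payoff units are used.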

Imagine that one day you come home to see your neighbors milling about your house and the Publishers Clearing House (PCH) van just pulling away. You know that PCH has been running a new schtick recently of selling $100 lottery tickets to win $10,000 instead of just giving money away. In fact, you've used that very contest as a teachable moment with your kids to explain how, once the first ticket of the 100 printed was sold, scratched, and determined not to be the winner, the average expected value of the remaining tickets was greater than their cost and they were therefore increasingly worth buying. Now, it's weeks later, most of the tickets have been sold, scratched, and found not to be winners, and they came to your house. In fact, there were only two tickets remaining. And you weren't home. Fortunately, your neighbor and best friend Bob asked if he could buy the ticket for you. Sensing a great human interest story (and lots of publicity), PCH said yes. Unfortunately, Bob picked the wrong ticket. After all your neighbors disperse and Bob and you are alone, Bob says that he'd really appreciate it if he could get his hundred dollars back. Is he mugging you? Or, do you give it to him?

Yes, I think you still owe him the $100.

But I like how you made it into a relatively realistic scenario.

Considering the ticket was worth $5,000 when he bought it, sure.
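The $5,000 figure follows from the lottery as described: 100 tickets at $100 each, one $10,000 prize, and every scratched ticket so far a loser. A small sketch, with names chosen only for illustration:

```python
from fractions import Fraction

PRIZE = 10_000  # single prize, from the scenario
PRICE = 100     # cost per ticket
TOTAL = 100     # tickets printed


def ticket_value(remaining: int) -> Fraction:
    """Expected value of one ticket when the winner is among `remaining` tickets."""
    if not 1 <= remaining <= TOTAL:
        raise ValueError("remaining must be between 1 and TOTAL")
    return Fraction(PRIZE, remaining)


print(ticket_value(100))  # 100, equal to the price
print(ticket_value(2))    # 5000, Bob's ticket at the time of purchase
```

The value equals the price only while all 100 tickets remain, and rises above it as soon as one losing ticket has been removed.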

Did you give the same answer to Omega? The cases are exactly analogous. (Or do you argue that they are not?)

The disanalogy here is that you have a long term social relationship with Bob that you don't have with Omega, and the $100 are an investment into that relationship.

Also, there is the possibility of future scenarios arising in which Bob could choose to take comparable actions, and we want to encourage him in doing so. I agree that the cases are not exactly analogous.

The outcomes don't seem to be tied together as they were in the original problem; is it true that, had he won, he would have given you the money only if, had he not won, you would have given him the $100 back? That isn't clear.

Philosopher Kenny Easwaran reported in 2007 that:

Josh von Korff, a physics grad student here at Berkeley, and versions of Newcomb’s problem. He shared my general intuition that one should choose only one box in the standard version of Newcomb’s problem, but that one should smoke in the smoking lesion example. However, he took this intuition seriously enough that he was able to come up with a decision-theoretic protocol that actually seems to make these recommendations. It ends up making some other really strange predictions, but it seems interesting to consider, and also ends up resembling something Kantian!

The basic idea is that right now, I should plan all my future decisions in such a way that they maximize my expected utility right now, and stick to those decisions. In some sense, this policy obviously has the highest expectation overall, because of how it’s designed.

Korff also reinvents counterfactual mugging:

Here’s another situation that Josh described that started to make things seem a little more weird. In Ancient Greece, while wandering on the road, every day one either encounters a beggar or a god. If one encounters a beggar, then one can choose to either give the beggar a penny or not. But if one encounters a god, then the god will give one a gold coin iff, had there been a beggar instead, one would have given a penny. On encountering a beggar, it now seems intuitive that (speaking only out of self-interest), one shouldn’t give the penny. But (assuming that gods and beggars are randomly encountered with some middling probability distribution) the decision protocol outlined above recommends giving the penny anyway.

In a sense, what’s happening here is that I’m giving the penny in the actual world, so that my closest counterpart that runs into a god will receive a gold coin. It seems very odd to behave like this, but from the point of view before I know whether or not I’ll encounter a god, this seems to be the best overall plan. But as Josh points out, if this was the only way people got food, then people would see that the generous were doing well, and generosity would spread quickly.

And he looks into generalizing to the algorithmic version:

If we now imagine a multi-agent situation, we can get even stronger (and perhaps stranger) results. If two agents are playing in a prisoner’s dilemma, and they have common knowledge that they are both following this decision protocol, then it looks like they should both cooperate. In general, if this decision protocol is somehow constitutive of rationality, then rational agents should always act according to a maxim that they can intend (consistently with their goals) to be followed by all rational agents. To get either of these conclusions, one has to condition one’s expectations on the proposition that other agents following this procedure will arrive at the same choices.

Korff is now an Asst. Prof. at Georgia State.

In Ancient Greece, while wandering on the road, every day one either encounters a beggar or a god.

If it's an iterated game, then the decision to pay is a lot less unintuitive.


My name is Omega. You may have heard of me.

Anyway, I have just tossed a fair coin, and given that the coin came up tails, I'm gonna have to ask each of you to give me $100. Whatever you do in this situation, nothing else will happen differently in reality as a result. Naturally you don't want to give up your $100. But see, if the coin came up heads instead of tails, I'd have given you each $10000, but only to those that would agree to give me $100 if the coin came up tails.

You forgot to add that we have sufficient reason to believe everything you say.

My two bits: Omega's request is unreasonable.

Precommitting is something that you can only do before the coin is flipped. That's what the "pre" means. Omega's game rewards a precommitment, but Omega is asking for a commitment.

Precommitting is a rational thing to do because before the coin toss, the result is unknown and unknowable, even by Omega (I assume that's what "fair coin" means). This is a completely different course of action than committing after the coin toss is known! The utility computation for precommitment is not and should not be the same as the one for commitment.

In the example, you have access to information that pre-you doesn't (the outcome of the flip). If rationalists are supposed to update on new information, then it is irrational for you to behave like pre-you.

We're assuming Omega is trustworthy? I'd give it the $100, of course.

Had the coin come up differently, Omega might have explained the secrets of friendly artificial general intelligence. However, he now asks that you murder 15 people.

Omega remains completely trustworthy, if a bit sick.

Ha, I'll re-raise: Had the coin come up differently, Omega would have filled ten Hubble volumes with CEV-output. However, he now asks that you blow up this Hubble volume.

(Not only do you blow up the universe (ending humanity for eternity), you're glad that Omega showed up to offer this transparently excellent deal. Morbid, ne?)

For some reason, raising the stakes in these hypotheticals to the point of actual pain has become reflex for me. I'm not sure if it's to help train my emotions to be able to make the right choices in horrible circumstances, or just my years in the Bardic Conspiracy looking for an outlet.

Raising the stakes in this way does not work, because of the issue described in Ethical Injunctions: it is less likely that Omega has presented you with this choice, than that you have gone insane.

So imagine yourself in the most inconvenient possible world where Omega is a known feature of the environment and has long been seen to follow through on promises of this type; it does not particularly occur to you or anyone that believing this fact makes you insane.

When I phrase it that way - imagine myself in a world full of other people confronted by similar Omega-induced dilemmas - I suddenly find that I feel substantially less uncomfortable; indicating that some of what I thought was pure ethical constraint is actually social ethical constraint. Still, it may function to the same self-protective effect as ethical constraint.

To add to the comments below, if you're going to take this route, you might as well have already decided that encountering Omega at all is less likely than that you have gone insane.

That may be true, but it's still a dodge. Conditional on not being insane, what's your answer?

Additionally, I don't see why Omega asking you to give it 100 dollars vs 15 human lives necessarily crosses the threshold of "more likely that I'm just a nutbar". I don't expect to talk to Omega anytime soon...

Can you please explain the reasoning behind this? Given all of the restrictions mentioned (no iterations, no possible benefit to this self) I can't see any reason to part with my hard earned cash. My "gut" says "Hell no!" but I'm curious to see if I'm missing something.

There are various intuition pumps to explain the answer.

The simplest is to imagine that a moment from now, Omega walks up to you and says "I'm sorry, I would have given you $10000, except I simulated what would happen if I asked you for $100 and you refused". In that case, you would certainly wish you had been the sort of person to give up the $100.

Which means that right now, with both scenarios equally probable, you should want to be the sort of person who will give up the $100, since if you are that sort of person, there's half a chance you'll get $10000.

If you want to be the sort of person who'll do X given Y, then when Y turns up, you'd better bloody well do X.

If you want to be the sort of person who'll do X given Y, then when Y turns up, you'd better bloody well do X.

Well said. That's a lot of the motivation behind my choice of decision theory in a nutshell.

Thanks, it's good to know I'm on the right track =)

I think this core insight is one of the clearest changes in my thought process since starting to read OB/LW -- I can't imagine myself leaping to "well, I'd hand him $100, of course" a couple years ago.

If you want to be the sort of person who'll do X given Y, then when Y turns up, you'd better bloody well do X.

I think this describes one of the core principles of virtue theory under any ethical system.

I wonder how much it depends upon accidents of human psychology, like our tendency to form habits, and how much of it is definitional (if you don't X when Y, then you're simply not the sort of person who Xes when Y)

That's not the situation in question. The scenario laid out by Vladimir_Nesov does not allow for an equal probability of getting $10000 and paying $100. Omega has already flipped the coin, and it's already been decided that I'm on the "losing" side. Join that with the fact that me giving $100 now does not increase the chance of me getting $10000 in the future because there is no repetition.

Perhaps there's something fundamental I'm missing here, but the linearity of events seems pretty clear. If Omega really did calculate that I would give him the $100 then either he miscalculated, or this situation cannot actually occur.

-- EDIT --

There is a third possibility after reading Cameron's reply... If Omega is correct and honest, then I am indeed going to give up the money.

But it's a bit of a trick question, isn't it? I'm going to give up the money because Omega says I'm going to give up the money and everything Omega says is gospel truth. However, if Omega hadn't said that I would give up the money, then I wouldn't have given up the money. Which makes this a bit of an impossible situation.

Assuming the existence of Omega, his intelligence, and his honesty, this scenario is an impossibility.

I feel like a man in an Escher painting, with all these recursive hypothetical mes, hypothetical kuriges, and hypothetical omegas.

I'm saying, go ahead and start by imagining a situation like the one in the problem, except it's all happening in the future -- you don't yet know how the coin will land.

You would want to decide in advance that if the coin came up against you, you would cough up $100.

The ability to precommit in this way gives you an advantage. It gives you half a chance at $10000 you would not otherwise have had.

So it's a shame that in the problem as stated, you don't get to precommit.

But the fact that you don't get advance knowledge shouldn't change anything. You can just decide for yourself, right now, to follow this simple rule:

If there is an action to which my past self would have precommitted, given perfect knowledge, and my current preferences, I will take that action.

By adopting this rule, in any problem in which the opportunity for precommitting would have given you an advantage, you wind up gaining that advantage anyway.
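One way to make the rule concrete is to compare an agent that picks its whole policy from the pre-toss viewpoint with one that only compares payoffs after seeing the coin. The sketch below is a toy model, not a formal decision theory. All names are hypothetical, and the payoffs are the ones from the original post.

```python
from itertools import product

OBSERVATIONS = ("tails",)    # only tails calls for a decision
ACTIONS = ("pay", "refuse")


def outcome(coin, policy):
    """Payoff in one branch; Omega's heads payout depends on the tails policy."""
    would_pay = policy["tails"] == "pay"
    if coin == "heads":
        return 10_000 if would_pay else 0
    return -100 if would_pay else 0


def prior_value(policy):
    """Expected payoff of a whole policy before the fair coin is tossed."""
    return 0.5 * outcome("heads", policy) + 0.5 * outcome("tails", policy)


def precommitted_action(observation):
    """Follow the policy the past self would have committed to."""
    policies = [dict(zip(OBSERVATIONS, acts))
                for acts in product(ACTIONS, repeat=len(OBSERVATIONS))]
    return max(policies, key=prior_value)[observation]


def updated_action(observation):
    """Compare payoffs only within the branch that was actually observed."""
    return max(ACTIONS, key=lambda a: outcome(observation, {"tails": a}))


print(precommitted_action("tails"))  # pay
print(updated_action("tails"))       # refuse
```

The two agents see the same facts and differ only in which viewpoint they optimize from, which is the disagreement running through this thread.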

I don't see that this situation is impossible, but I think it's because I've interpreted it differently from you.

First of all, I'll assume that everyone agrees that given a 50/50 bet to win $10000 versus losing $100, everyone would take the bet. That's a straightforward application of utilitarianism + probability theory = expected utility, right?

So Omega correctly predicts that you would have taken the bet if he had offered it to you (a real no brainer; I too can predict that you would have taken the bet had he offered it).

But he didn't offer it to you. He comes up now, telling you that he predicted that you would accept the bet, and then carried out the bet without asking you (since he already knew you would accept the bet), and it turns out you lost. Now he's asking you to give him $100. He's not predicting that you will give him that number, nor is he demanding or commanding you to give it. He's merely asking. So the question is, do you do it?

I don't think there's any inconsistency in this scenario regardless of whether you decide to give him the money or not, since Omega hasn't told you what his prediction would be (though if we accept that Omega is infallible, then his prediction is obviously exactly whatever you would actually do in that situation).

Omega hasn't told you his predictions in the given scenario.

The simplest is to imagine that a moment from now, Omega walks up to you and says "I'm sorry, I would have given you $10000, except I simulated what would happen if I asked you for $100 and you refused". In that case, you would certainly wish you had been the sort of person to give up the $100.

I liked this position -- insightful, so I'm definitely upvoting.

But I'm not altogether convinced it's a completely compelling argument. With the amounts reversed, Omega could have walked up to you and said "I would have given you $100 except if I asked you for $10000 you would have refused." You'd then certainly wish to have been the sort of person to counterfactually have given up the $10000, because in the real world it'd mean you'd get $100, even though you'd certainly REJECT that bet if you had a choice for it in advance.

Not necessarily; it depends on relative frequency. If Omega has a 10^-9 chance of asking me for $10000 and otherwise will simulate my response to judge whether to give me $100, and if I know that (perhaps Omega earlier warned me of this), I would want to be the type of person who gives the money.
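The dependence on relative frequency can be sketched numerically. Here `q` is the probability that Omega actually asks for the $10000; with probability 1 - q it simulates your response and pays $100 to those who would have paid. The function names and the break-even formula are illustrative additions, not part of the comment.

```python
from fractions import Fraction


def giver_ev(q, ask=10_000, gift=100):
    """Expected value of being the type who pays, given ask-probability q."""
    q = Fraction(q)
    return (1 - q) * gift - q * ask


def break_even(ask=10_000, gift=100):
    """Ask-probability at which paying and refusing have equal expectation."""
    return Fraction(gift, gift + ask)


print(giver_ev(Fraction(1, 2)))          # -4950
print(giver_ev(Fraction(1, 10**9)) > 0)  # True
print(break_even())                      # 1/101
```

A refuser gets 0 in both branches, so being the paying type is worthwhile exactly when `q` is below the break-even value. That condition holds for the 10^-9 case and fails for a fair coin.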

If you want to be the sort of person who's known to do X given Y, then when Y turns up, you'd better bloody well do X.

Is that an acceptable correction?

Well, with a being like Omega running around, the two become more or less identical.

If we're going to invent someone who can read thoughts perfectly, we may as well invent someone who can conceal thoughts perfectly.

Anyway, there aren't any beings like Omega running around to my knowledge. If you think that concealing motivations is harder than I think, and that the only way to make another human think you're a certain way is to be that way, say that.

And if Omega comes up to me and says "I was going to kill you if you gave me $100. But since I've worked out that you won't, I'll leave you alone." then I'll be damn glad I wouldn't agree.

This really does seem like pointless speculation.

Of course, I live in a world where there is no being like Omega that I know of. If I knew otherwise, and knew something of their properties, I might govern myself differently.

We're not talking Pascal's Wager here, you're not guessing at the behaviour of capricious omnipotent beings. Omega has told you his properties, and is assumed to be trustworthy.

You are stating that. But as far as I can tell Omega is telling me it's a capricious omnipotent being. If there is a distinction, I'm not seeing it. Let me break it down for you:

1) Capricious -> I am completely unable to predict its actions. Yes.
2) Omnipotent -> Can do the seemingly impossible. Yes.

So, what's the difference?

It's not capricious in the sense you give: you are capable of predicting some of its actions: because it's assumed Omega is perfectly trustworthy, you can predict with certainty what it will do if it tells you what it will do.

So, if it says it'll give you 10k$ in some condition (say, if you one-box its challenge), you can predict that it'll give you the money if that condition arises.

If it were capricious in the sense of complete inability of being predicted, it might amputate three of your toes and give you a flower garland.

Note that the problem supposes you do have certainty that Omega is trustworthy; I see no way of reaching that epistemological state, but then again I see no way Omega could be omnipotent, either.

On a somewhat unrelated note, why would Omega ask you for 100$ if it had simulated you wouldn't give it the money? Also, why would it do the same if it had simulated you would give it the money? What possible use would an omnipotent agent have for 100$?

Omega is assumed to be mildly bored and mildly anthropic. And his asking you for 100$ could always be PART of the simulation.

And his asking you for 100$ could always be PART of the simulation.

Yes, it’s quite reasonable that if it was curious about you it would simulate you and ask the simulation a question. But once it did that, since the simulation was perfect, why would it waste the time to ask the real you? After all, in the time it takes you to understand Omega’s question it could probably simulate you many times over.

So I’m starting to think that encountering Omega is actually pretty strong evidence for the fact that you’re simulated.
