Counterfactual mugging: alien abduction edition

by Emile | 1 min read | 28th Sep 2010 | 18 comments


Counterfactual Mugging, Counterfactuals
Personal Blog

Omega kidnaps you and an alien from FarFarAway Prime and gives you the choice: either the alien dies and you go home with your memory wiped, or you lose an arm and you both go home with your memories wiped. Nobody gets to remember this. Oh, and Omega flipped a coin to see who got to choose. What is your choice?

As usual, Omega is perfectly reliable, isn't hiding anything, and goes away afterwards. You also have no idea what the alien's values are, where it lives, what it would choose, or what the purpose is of that organ that pulsates green light.

(This is my (incorrect) interpretation of counterfactual mugging, which we were discussing on the #lesswrong channel; Boxo pointed out that it's a Prisoner's Dilemma where a random player is forced to cooperate, and isn't that similar to counterfactual mugging.)


How is this a counterfactual mugging? You're not making any counterfactual decision.

Agreed, it isn't counterfactual mugging. It would only be counterfactual mugging if Omega also told you the alien would decide in exactly the same way you would if it had won the coin-flip.

Here's another CFM-inspired game:

Omega flips a coin. Then he asks you to choose to cooperate or defect in a game of PD against the counterfactual version of you that got the other side of the coin. He knows what that version would have chosen. Do you C or D?


I know the answer is clear, but I think this is a good intuition pump for CFM. Like I said on IRC, CFM is all about not defecting against your counterfactual cousins just because they're not real.
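A rough sketch of why the policy-level answer is C, assuming the usual textbook PD payoffs (the numbers, the `PD` table, and `policy_value` are illustrative assumptions, not part of the game as stated). It scores each policy, meaning a move for each side of the coin, by averaging over the coin, which is the "decide in advance" view rather than a model of TDT or CDT:

```python
from itertools import product

# Hypothetical PD payoffs to the row player: (my move, other's move) -> utility.
# Any values with T > R > P > S and 2R > T + S would give the same conclusion.
PD = {("C", "C"): 3.0, ("C", "D"): 0.0, ("D", "C"): 5.0, ("D", "D"): 1.0}

def policy_value(policy):
    """Expected payoff of a policy mapping coin side ("H"/"T") to a move.

    The version of you that saw heads plays against the version that saw
    tails, and vice versa; both run the same policy.
    """
    heads_self = PD[(policy["H"], policy["T"])]
    tails_self = PD[(policy["T"], policy["H"])]
    return 0.5 * heads_self + 0.5 * tails_self

policies = [{"H": h, "T": t} for h, t in product("CD", repeat=2)]
best = max(policies, key=policy_value)

for p in policies:
    print(p, policy_value(p))
print("best policy:", best)
```

Under these assumed payoffs, defecting on one side of the coin only gains at the expense of the other version of you, so cooperating on both sides comes out ahead on average.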


The obvious TDT/UDT answer would be C and the CDT answer would probably be D. The counterfactual component doesn't (or shouldn't) change your strategy much in this case.

The obvious UDT answer is C, but I think TDT defects, for the exact same reason it doesn't pay the counterfactual mugger (can't improve any aspect of the 'real' situation it finds itself in).

Edit: Wrong, see reply.

TDT cooperates. The node representing the output of TDT affects the counterfactual TDT agent, which in turn affects Omega's "real" prediction of the counterfactual TDT.

By crafting an appropriate dependency graph, you can make a TDT agent agree to any UDT decision. Even in CM, if you model Omega in more detail as depending on your decision, you can get a TDT agent to comply, but this is not the point: TDT doesn't reach this answer naturally, without an explicit compensating dependence being introduced from outside, and neither does it in this case.

I would like to see the dependency graph that compels TDT to pay in a counterfactual mugging.

Not if the graph expresses what's real, but surely if it expresses what the agent cares about: basically, with the counterfactual world explicitly included.

Are you saying that it's easier to get TDT to comply with CM if it's ontologically fundamental randomness than if it's logical uncertainty? (But you think it can be made to comply then, too.)

In the least convenient possible world, the TDT agent doesn't care intrinsically about any counterfactual process, only about the result on the real world.

Saying you can get an agent with one DT to follow the output of another DT by changing its utility function is not interesting.

"Saying you can get an agent with one DT to follow the output of another DT by changing its utility function is not interesting."

If the mapping is natural enough, it establishes the relative expressive power of the decision theories, perhaps even allowing one to get the same not-a-priori-obvious conclusions from studying one theory as from the other. But granted, as I described in this post, the step forward made in UDT/ADT, as compared to TDT, is that the causal graph doesn't need to be given as part of the problem statement; dependencies get inferred from the utility/action definition.

"If the mapping is natural enough,"

Ok, so show me an actual example of a mapping that is "natural enough" and causes TDT to pay up in CM.

I argued with your argument, not your conclusion.

I am not following your abstract argument, and would like to see an example of how a "natural enough" mapping can establish "relative expressive power of the decision theories".

I think you're right.

If you accept updateless decision-making (caring about alternative possibilities, or equivalently deciding on your strategies in advance), this is equivalent to PD, with payoffs depending on your and the alien's strategies, computed as expected utility over the uncertainty of the coin.
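To make the equivalence concrete, here is a minimal sketch under loud assumptions: the utility numbers `LIFE` and `ARM` are invented, and the alien is assumed to face the mirror-image choice with the same stakes (the post does not say this). "C" means "sacrifice the arm if chosen" and "D" means "let the other die if chosen"; each strategy pair is scored by your expected utility over the coin:

```python
from itertools import product

# Hypothetical stakes, in arbitrary utility units (not from the post).
LIFE = 100.0     # cost to you of dying
ARM = 10.0       # cost to you of losing an arm
P_CHOOSER = 0.5  # fair coin: chance that you are the one who gets to choose

def expected_utility(mine, theirs):
    """Your expected utility given your strategy and the alien's strategy."""
    # If the coin picks you, you pay ARM only when you play "C".
    cost_if_i_choose = ARM if mine == "C" else 0.0
    # If the coin picks the alien, you die only when it plays "D".
    cost_if_they_choose = LIFE if theirs == "D" else 0.0
    return -(P_CHOOSER * cost_if_i_choose
             + (1 - P_CHOOSER) * cost_if_they_choose)

payoffs = {(m, t): expected_utility(m, t) for m, t in product("CD", repeat=2)}

def is_prisoners_dilemma(p):
    """Check the PD ordering: temptation > reward > punishment > sucker."""
    T, R, P, S = p[("D", "C")], p[("C", "C")], p[("D", "D")], p[("C", "D")]
    return T > R > P > S

for pair, value in payoffs.items():
    print(pair, value)
print("PD structure:", is_prisoners_dilemma(payoffs))
```

With these assumed numbers the four expected payoffs fall in the PD ordering; the ordering holds whenever a life is valued above an arm, though the mirror-image assumption about the alien is doing real work.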


Why are you even using Omega, if it's not actually using its superintelligence to simulate you (or anyone else)? It would be less misleading if you just said it was a random truthful jerk.