
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

The Counterfactual Prisoner's Dilemma is a symmetric version of Counterfactual Mugging where, regardless of whether the coin comes up heads or tails, you are asked to pay $100, and you are then paid $10,000 if Omega predicts that you would have paid if the coin had come up the other way. If you decide updatelessly, you will always receive $9,900, while if you decide updatefully, you will receive $0. So unlike Counterfactual Mugging, pre-committing to pay ensures a better outcome regardless of how the coin flip turns out, suggesting that focusing only on your particular probability branch is mistaken.
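To make the arithmetic concrete, here is a minimal Python sketch of the payoffs. It assumes Omega predicts perfectly and represents a policy as a hypothetical mapping from the observed coin outcome to whether you pay; the names `payoff`, `payoff_table`, `PAYMENT` and `REWARD` are illustrative, not part of the original scenario.

```python
from itertools import product

PAYMENT = 100      # amount you are asked to pay in either branch
REWARD = 10_000    # amount Omega pays if it predicts you pay in the other branch


def other(outcome: str) -> str:
    """Return the coin outcome that did not happen."""
    return "tails" if outcome == "heads" else "heads"


def payoff(policy: dict, outcome: str) -> int:
    """Net payoff for a policy given the actual outcome.

    Assumption: Omega's prediction is simply the policy's answer
    for the other outcome (a perfect predictor).
    """
    cost = PAYMENT if policy[outcome] else 0
    reward = REWARD if policy[other(outcome)] else 0
    return reward - cost


def payoff_table() -> dict:
    """Map (pay_on_heads, pay_on_tails) to (payoff_if_heads, payoff_if_tails)."""
    table = {}
    for pay_heads, pay_tails in product([True, False], repeat=2):
        policy = {"heads": pay_heads, "tails": pay_tails}
        table[(pay_heads, pay_tails)] = (
            payoff(policy, "heads"),
            payoff(policy, "tails"),
        )
    return table


for key, value in payoff_table().items():
    print(key, value)
```

Under these assumptions, the policy that pays in both branches nets $9,900 whichever way the coin lands, while the policy that never pays nets $0 in both.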

The Logical Counterfactual Mugging doesn't use a coin flip, but instead looks at the parity of something beyond your ability to calculate, like the 10,000th digit of pi. You are told it is even and then asked to pay $100, on the basis that if Omega predicts you would have paid, then he would have given you $10,000 if it had turned out to be odd.

You might naturally assume that you couldn't construct a logical version of the Counterfactual Prisoner's Dilemma. I certainly did at first. After all, you might say, the coin could have come up tails, but the 10,000th digit of pi couldn't have turned out to be odd; that would be a logical impossibility.

But could the coin actually have come up tails? If the universe is deterministic, then the way it came up was the only way it could ever have come up. So is there less difference between these two scenarios than it looks at first glance?

Let's see. For the standard counterfactual mugging, you can't find the contradiction because you lack information about the world, while for the logical version, you can't find the contradiction because you lack processing power. In the former, we could actually construct two consistent worlds - one where it is heads and one where it is tails - each compatible with the information you have about the scenario. In the latter, we can't.

Notice, however, that for Logical Counterfactual Mugging to be well defined, you need to define what Omega is doing when it makes its prediction. In Counterfactuals for Perfect Predictors, I explained that when dealing with perfect predictors, the counterfactual is often undefined. For example, in Parfit's Hitchhiker a perfect predictor would never give a lift to someone who never pays in town, so it isn't immediately clear that predicting what such a person would do in town involves predicting something coherent.

However, even though we can't ask what the hitchhiker would do in an incoherent situation, we can ask what they would do when they receive an input representing an incoherent situation (see Counterfactuals for Perfect Predictors for a more formal description). Indeed, Updateless Decision Theory uses this technique - programs are defined as input-output maps - although I don't know whether Wei Dai was motivated by this concern or not.

Similarly, the predictor in Logical Counterfactual Mugging must be predicting something that is well defined. So we can assume that it is producing a prediction based on an input, which may represent a logically inconsistent situation. Given this, we can construct a logical version of the Counterfactual Prisoner's Dilemma. Writing this explicitly:

First you are told the 10,000th digit of pi. Regardless of whether it is odd or even, you are asked for $100. You are then paid $10,000 if Omega predicts that you would produce output corresponding to paying when fed input corresponding to having been informed that this digit had the opposite parity to the one you observed.

There really isn't any difference between how we make the logical case coherent and how we make the standard case coherent. At this point, we can see that, just as in the original Counterfactual Prisoner's Dilemma, always paying scores you $9,900, while never paying scores you nothing. You are guaranteed to do better regardless of which parity the digit has (or, in Abram Demski's terms, we now have an all-upside updateless situation).
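The logical version can be sketched the same way, treating the agent as an input-output map. This is only an illustration under stated assumptions: Omega is modelled as simply running the agent's program on the input representing the opposite parity, and the names `Policy`, `omega_predicts_payment` and `net_payoff` are hypothetical. Note that the code never needs to know the actual digit of pi; it only needs the program's output on each possible input, one of which represents a logically impossible situation.

```python
from typing import Callable

# An agent is an input-output map: observed parity ("even" or "odd") -> pay?
Policy = Callable[[str], bool]

PAYMENT = 100
REWARD = 10_000


def opposite(parity: str) -> str:
    """Return the parity that was not observed."""
    return "odd" if parity == "even" else "even"


def omega_predicts_payment(policy: Policy, observed: str) -> bool:
    """Assumed model of Omega: run the policy on the input representing
    the opposite parity. The situation that input describes may be
    logically impossible, but the program's output on it is well defined."""
    return policy(opposite(observed))


def net_payoff(policy: Policy, observed: str) -> int:
    """Net payoff after being told the digit's parity is `observed`."""
    cost = PAYMENT if policy(observed) else 0
    reward = REWARD if omega_predicts_payment(policy, observed) else 0
    return reward - cost


def always_pay(observed: str) -> bool:
    return True


def never_pay(observed: str) -> bool:
    return False


for parity in ("even", "odd"):
    print(parity, net_payoff(always_pay, parity), net_payoff(never_pay, parity))
```

Whichever parity is fed in as the observation, `always_pay` nets $9,900 and `never_pay` nets $0, mirroring the coin-flip case.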