Counterfactuals as a matter of Social Convention

by Chris_Leong · 1 min read · 30th Nov 2019 · 4 comments



Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

In my last post, I wrote that the counterfactuals in Transparent-Box Newcomb's problem were largely a matter of social convention. One point I overlooked for a long time was that formalising a problem like Newcomb's is trickier than it seems. Depending on how it is written, some statements may seem to apply only to our actual world, some may seem to refer to counterfactual worlds as well, and some may seem ambiguous.

To clarify this, I'll consider phrases that one might hear in relation to this problem, plus some variations, and draw out their implications. I won't use modal logic, since it wouldn't add anything to this discussion except more jargon.

The idea that counterfactuals could have a social element should seem puzzling at first. After all, counterfactuals determine what counts as a good decision, and surely what counts as a good decision isn't just a matter of social convention? I think I know how to resolve this puzzle and I'll address it in a post soon, but for now I'll just provide a hint and link to a comment by Abram Demski about how probabilities sit somewhere between subjective and objective.

Example 1:

a) Omega is a perfect predictor

b) You find out from an infallible source that Omega will predict your choice correctly

The first suggests that Omega will predict you correctly no matter what you choose, so we might take it to apply to every counterfactual world, though it is technically possible that Omega is only a perfect predictor in this world. The second is much more ambiguous: you might take the prediction to be correct only in this world and not in the counterfactuals.

Example 2:

a) The first box always contains $1000

b) The first box contains $1000

The first again seems to be making a claim about counterfactual worlds, while the second is ambiguous: it isn't clear whether it applies to all worlds or not.

Example 3:

"The game works as follows: the first box contains $1000, while the second contains $0 or $1,000,000 depending on whether the predictor predicts you'll two-box or one-box"

Talking about the rules of the game seems to be a hint that this applies across all counterfactuals. After all, decision problems are normally about winning within a fixed game, as opposed to the rules changing according to your decision.
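To make the two readings from Example 1 concrete, here is a minimal sketch (my own construction, not from the post): we take the game rules of Example 3 as fixed across all worlds, then compare the set of admissible counterfactual worlds under reading (a), where the predictor is correct everywhere, against reading (b), where it is only guaranteed correct in the actual world. The names `payoff`, `ACTIONS`, and the world lists are all hypothetical.

```python
ACTIONS = ["one-box", "two-box"]

def payoff(action, prediction):
    # Rules from Example 3: the first box always holds $1000; the second
    # holds $1,000,000 iff one-boxing was predicted. A two-boxer takes both.
    box2 = 1_000_000 if prediction == "one-box" else 0
    return (1000 + box2) if action == "two-box" else box2

# Reading (a): "Omega is a perfect predictor" holds in every counterfactual
# world, so the only admissible worlds pair each action with a matching
# prediction.
universal_worlds = [(a, p) for a in ACTIONS for p in ACTIONS if p == a]

# Reading (b): the prediction is only guaranteed correct in the actual
# world, so counterfactual worlds may pair any action with any prediction.
actual_only_worlds = [(a, p) for a in ACTIONS for p in ACTIONS]

print([(a, payoff(a, p)) for a, p in universal_worlds])
# → [('one-box', 1000000), ('two-box', 1000)]
```

Under reading (a) one-boxing dominates ($1,000,000 vs $1000), while under reading (b) two-boxing dominates in every world, since holding the prediction fixed, taking the extra box always adds $1000. Which worlds we admit, and hence which decision looks good, depends entirely on which statements we treat as universal.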

Example 4:

a) The box in front of you contains $1 million

b) The box in front of you contains either $0 or $1 million. In this case, it contains $1 million

The first is ambiguous. The second seems to make a statement about all counterfactual worlds, then one about this world; if it were making a statement only about this world, the first sentence would have been unnecessary.

Lessons

This could be leveraged into a critique of the erasure approach. That approach constructs a non-trivial decision problem by erasing information, but this analysis suggests either that a) erasure may be unnecessary, because the problem already implicitly indicates which information holds universally, or that b) the issue isn't figuring out which assumption to erase, but that the problem is ambiguous about which parts should be taken universally.
