An Intuitive Introduction to Evidential Decision Theory

Heighn

In the last post, we discussed Causal Decision Theory (CDT). We learned how purely looking at the causal effects of your actions makes you lose at Newcomb's Problem. Evidential Decision Theory (EDT) is an alternative decision theory that tells us to take the action that, conditional upon it happening, gives the best outcome. This means EDT looks at the evidence each action provides for the different outcomes. It's best to explain what this means exactly by looking at a few known problems. But first, what is "evidence", anyway?

What is evidence, anyway?

"Evidence" may sound like a complicated term, but really, it's extremely simple. Imagine we have a big box with 1000 items in it. 600 of these items are balls; 400 are cubes. Of the balls, 200 are red, whereas 400 are green. Furthermore, 300 cubes are red, and 100 are green. I draw a random item from our big box. What is the probability that the item is a ball?

Well, there are 1000 items, of which 600 are balls. That gives us a probability of 600 / 1000 = 0.6 that a random item is a ball. We write P(B) = 0.6, where the B stands for an item being a ball.

Now, suppose I put the item back and again draw a random item from our box. I tell you that it is red. What is the probability that it is a ball?

In total, there are 500 red items (200 red balls + 300 red cubes). Of those red items, 200 are balls. 200 / 500 = 0.4, so that's a 0.4 probability that a red item is a ball. We write P(B | R) = 0.4: the probability that an item is a ball given that (or conditional on the fact that) it is red equals 0.4. P(C | R) - the probability of an item being a cube given that it is red - equals 300 red cubes / 500 red items = 0.6. Note that P(B | R) + P(C | R) = 0.4 + 0.6 = 1 (or 100%). P(B | R) and P(C | R) adding up to 1 makes sense - a red item must be either a ball or a cube.

P(B | R) = 0.4 tells us the evidence redness gives us about the item being a ball. That evidence is quite weak: learning that the item is red lowers the probability of a ball from P(B) = 0.6 to 0.4, and the evidence redness gives for the item being a cube is bigger (0.6). Therefore, we would usually simply say that redness is evidence for the item being a cube. This is simply because there are more red cubes than red balls.

We can easily turn things around. P(R | B) is the probability that an item is red given that it is a ball. There are 600 balls, 200 of which are red. Then P(R | B) = 200 / 600 = $\frac{1}{3}$. P(G) = 500 / 1000 = 0.5 - the probability a random item is green (400 green balls + 100 green cubes). P(G | B) = 400 / 600 = $\frac{2}{3}$ = 1 - P(R | B).
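To make the counting concrete, here is a small Python sketch that recomputes the probabilities above straight from the item counts. The names `counts` and `prob` are my own, chosen for illustration; the numbers are the ones from the box example.

```python
from fractions import Fraction

# Counts from the box example: (shape, color) -> number of items.
counts = {
    ("ball", "red"): 200,
    ("ball", "green"): 400,
    ("cube", "red"): 300,
    ("cube", "green"): 100,
}

def prob(event, given=lambda item: True):
    """P(event | given), computed by counting items in the box."""
    n_given = sum(n for item, n in counts.items() if given(item))
    n_both = sum(n for item, n in counts.items() if given(item) and event(item))
    return Fraction(n_both, n_given)

def is_ball(item):
    return item[0] == "ball"

def is_cube(item):
    return item[0] == "cube"

def is_red(item):
    return item[1] == "red"

def is_green(item):
    return item[1] == "green"

print("P(B)     =", prob(is_ball))                  # 3/5
print("P(B | R) =", prob(is_ball, given=is_red))    # 2/5
print("P(C | R) =", prob(is_cube, given=is_red))    # 3/5
print("P(R | B) =", prob(is_red, given=is_ball))    # 1/3
print("P(G)     =", prob(is_green))                 # 1/2
print("P(G | B) =", prob(is_green, given=is_ball))  # 2/3
```

Conditioning is nothing more than restricting the count to the items that satisfy the condition, and then asking what fraction of those also satisfy the event.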

This is the nature of evidence as Evidential Decision Theory uses it. What I gave here is a very basic and very incomplete introduction to Bayesian reasoning - if you want to learn more, I recommend An Intuitive Explanation of Bayes's Theorem.

Newcomb's Problem

Remember how in Newcomb's Problem, every two-boxer got $1,000 and every one-boxer got $1,000,000? That means two-boxing is very strong evidence of getting $1,000, just like redness is evidence of an item being a cube in our example above (only much stronger). Similarly, one-boxing is very strong evidence of getting $1,000,000. Therefore, if we assume Omega is always right in its prediction, conditional on you one-boxing, you'd earn $1,000,000; conditional on two-boxing, you'd only get $1,000. One-boxing is then clearly better, and EDT indeed one-boxes. Therefore, following EDT earns you $1,000,000 in Newcomb's Problem, and EDT'ers do clearly better than CDT'ers here. So how does EDT perform on Smoking Lesion?
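As a rough sketch of this calculation in Python: the function name `edt_expected_payoff` and the `accuracy` parameter are my own additions for illustration. The text above only considers an Omega that is always right, which corresponds to `accuracy=1.0`.

```python
def edt_expected_payoff(action, accuracy=1.0):
    """Expected payoff conditional on taking `action`.

    `accuracy` is the assumed probability that Omega predicted the action
    correctly (hypothetical parameter; 1.0 means Omega is always right).
    """
    # Omega fills box B only if it predicted one-boxing.
    p_box_b_full = accuracy if action == "one-box" else 1.0 - accuracy
    # Box A's $1,000 is only collected when taking both boxes.
    box_a = 1_000 if action == "two-box" else 0
    return p_box_b_full * 1_000_000 + box_a

actions = ["one-box", "two-box"]
for action in actions:
    print(action, edt_expected_payoff(action))

# EDT picks the action with the best conditional expected payoff.
best = max(actions, key=edt_expected_payoff)
print("EDT chooses:", best)
```

With a perfect predictor, the conditional payoffs are $1,000,000 for one-boxing and $1,000 for two-boxing, so EDT one-boxes.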

Smoking Lesion

In the world of Smoking Lesion, smoking is evidence of getting lung cancer: the problem states smoking is correlated with lung cancer. As we know, this correlation is due to a common cause: both lung cancer and a fondness for smoking are caused by a genetic lesion. Smoking doesn't cause lung cancer in this world, but EDT doesn't deal with causes: it deals with evidence, and P(Cancer | Smoke) is quite high. Smokers get lung cancer more often than non-smokers: smoking is evidence for getting lung cancer. As life without cancer is preferred, EDT'ers don't smoke, and simply lose $1,000 in utility. That is a mistake: whether or not they have the lesion, smoking doesn't change their probability of developing cancer, so they might as well smoke.
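The following Python sketch puts numbers on this. All probabilities and the cost of cancer are hypothetical values I made up for illustration; only the $1,000 value of smoking comes from the text. It compares the EDT calculation, which conditions on the action, with a calculation that keeps the probability of having the lesion fixed.

```python
from fractions import Fraction as F

# Hypothetical numbers, for illustration only.
P_LESION = F(1, 2)
P_SMOKE_GIVEN_LESION = {True: F(8, 10), False: F(2, 10)}   # keyed by has_lesion
P_CANCER_GIVEN_LESION = {True: F(9, 10), False: F(1, 10)}  # keyed by has_lesion
U_SMOKE = 1_000         # utility of smoking (from the text)
U_CANCER = -1_000_000   # assumed utility of getting lung cancer

def p_lesion_given_action(smoke):
    """P(lesion | action), via Bayes' theorem."""
    def p_action(lesion):
        p = P_SMOKE_GIVEN_LESION[lesion]
        return p if smoke else 1 - p
    joint_lesion = P_LESION * p_action(True)
    joint_no_lesion = (1 - P_LESION) * p_action(False)
    return joint_lesion / (joint_lesion + joint_no_lesion)

def p_cancer(p_lesion):
    """P(cancer), given some probability of having the lesion."""
    return (p_lesion * P_CANCER_GIVEN_LESION[True]
            + (1 - p_lesion) * P_CANCER_GIVEN_LESION[False])

def edt_utility(smoke):
    # EDT treats the action as evidence about the lesion.
    return (U_SMOKE if smoke else 0) + p_cancer(p_lesion_given_action(smoke)) * U_CANCER

def fixed_lesion_utility(smoke):
    # The action cannot change whether you have the lesion.
    return (U_SMOKE if smoke else 0) + p_cancer(P_LESION) * U_CANCER

for smoke in (True, False):
    print("smoke" if smoke else "don't smoke",
          "| EDT:", edt_utility(smoke),
          "| lesion held fixed:", fixed_lesion_utility(smoke))
```

Under these assumed numbers, EDT rates smoking far below not smoking, because P(Cancer | Smoke) is much higher than P(Cancer | Don't smoke). Once the lesion probability is held fixed, the only difference between the two actions is the $1,000 of utility from smoking.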

Final Remarks

Whereas EDT does better than CDT on Newcomb's Problem, CDT wins at Smoking Lesion. EDT's problem in Smoking Lesion is that correlation doesn't equal causation. When deciding what action to take, EDT imagines what the world would be like after each action: for each action, it constructs a different world, and keeps the correlations intact. So, in Newcomb's Problem, it keeps the correlations caused by Omega's predictions intact, which is correct. However, in Smoking Lesion, the correlation between smoking and getting lung cancer is kept, which is wrong: that correlation is merely "accidental". When considering smoking, one should imagine a world in which one enjoys smoking and has a certain probability of getting lung cancer; when considering not smoking, one should imagine a world in which one doesn't enjoy smoking but has that same probability of getting lung cancer.

CDT constructs its worlds by keeping only the causal effects of the actions intact. This works well in Smoking Lesion, but in Newcomb's Problem, this means cutting the connection between its action and Omega's prediction. This is wrong: CDT acts as if Omega's probability of filling box B is the same regardless of its decision, but that's not the case. The exact way CDT goes wrong will become clearer in the next post, where I finally introduce Functional Decision Theory.
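A minimal Python sketch of that CDT reasoning, assuming a hypothetical parameter `p_full` for whatever probability CDT assigns to box B being full (the function name is mine as well):

```python
def cdt_expected_payoff(action, p_full):
    """Expected payoff when P(box B is full) is held fixed across actions."""
    # Box A's $1,000 is only collected when taking both boxes.
    box_a = 1_000 if action == "two-box" else 0
    return p_full * 1_000_000 + box_a

# Whatever fixed probability CDT assumes, two-boxing looks $1,000 better.
for p_full in (0.0, 0.5, 1.0):
    gain = cdt_expected_payoff("two-box", p_full) - cdt_expected_payoff("one-box", p_full)
    print(f"P(full) = {p_full}: two-boxing gains {gain}")
```

Because `p_full` doesn't depend on the action here, two-boxing always comes out exactly $1,000 ahead, which is why CDT two-boxes.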
