An Intuitive Introduction to Causal Decision Theory


Newcomb's Problem

Several issues, all of which I have highlighted elsewhere and I'll highlight here too:

1. Omega cannot exist without being super-Turing, assuming agents are Turing-complete.
    - If you have an algorithm for producing an Omega, then I run said algorithm and run Omega on myself and do whatever Omega predicts I won't do^{[1]}^{[2]}.
    - This is a contradiction, hence either:
        - A: If Omega cannot exist, there's no sense spending time on this^{[3]}.
        - B: If Omega is super-Turing, then all bets are off.
        - C: If the agent is not Turing-complete, see below.
2. If agents *aren't* Turing-complete, then CDT does not apply I don't think.
    - (You may be able to come up with agents for specific problems, but you cannot come up with an agent that applies CDT to all encountered problems^{[4]}.)
3. "reasonably accurate predictor" hides some subtlety here:
    - Does this mean that "Omega-lite" predicts *all* agents correctly >50%
    - Or does this mean it can e.g. predict 80% of agents 80% of the time?
    - If the former, 1 still holds. "Omega-lite" cannot have a prediction accuracy of better than a coinflip for my "run Omega and do the opposite of whatever it says" counterexample without falling into options A-C.
    - If the latter, there's no guarantee that "Omega-lite" has any accuracy greater than a coinflip for any specific agent, and there's no paradox as a result.
    - (There are also a few other subtleties. Omega-lite cannot predict itself (and do computation on the result), or else you get the same contradiction with Omega-lite predicting itself and doing whatever it thinks it won't do. Ditto, you end up with similar issues if >1 Omega-lite can exist and predict other Omega-lites.)

...I should really write this up and turn it into a proper post at some point, as this isn't the first post I've seen that ignores these issues.

^{[1]} I'm using first-person here to make the distinction a little clearer, as 'itself' is ambiguous as to whether it refers to the agent or Omega. Alternatively: I submit an agent A that runs Omega on A and then agent A does whatever Omega said agent A wouldn't do.

^{[2]} Or, in the probabilistic case, I do X if and only if Omega predicts a <50% probability of me doing X.

^{[3]} ...mostly. E.g. halting oracles don't exist, but are sometimes useful as stepping stones for proofs.

^{[4]} Among other issues, parsing the question itself may be impossible...

Thanks for your comment!

> If you have an algorithm for producing an Omega, then I run said algorithm and run Omega on myself and do whatever Omega predicts I won't do

So then my decision procedure simulates Omega, who simulates my decision procedure. "My decision procedure" is the same decision procedure in both cases, so I think it's impossible for me to do what Omega predicts I won't do. Or is this where Omega becomes super-Turing? My knowledge on that is limited.

> So then my decision procedure simulates Omega, who simulates my decision procedure.

Yep. The problem is the unbounded recursion here, give or take. A reasons about B reasoning about A reasoning about B reasoning about A ad infinitum.

> I think it's impossible for me to do what Omega predicts I won't do.

Hence, there's a contradiction. Assuming the Oracle side is correct, it's fairly straightforward to show that the agent is also correct... and that this leads to a contradiction. Hence, the Oracle side cannot be correct^{[1]}.

It's easiest to see via analogy to the Halting problem^{[2]}.

~~Omega~~ Halting Problem Solver -> a decidable algorithm that predicts ~~what an arbitrary agent does~~ if an arbitrary Turing Machine halts

If you have a decidable algorithm for producing ~~an Omega~~ a Halting Problem Solver, then I run said algorithm and run said resulting ~~Omega~~ Halting Problem Solver on myself and do whatever said ~~Omega~~ Halting Problem Solver predicts I won't do (~~take both boxes~~ go into an infinite loop if and only if said ~~Omega~~ Halting Problem Solver predicts I won't)

This is a contradiction, therefore an algorithm for producing an Omega ~~a Halting Problem Solver~~ cannot exist.

Or, in the slightly more straightforward 'standard' proof form:

~~Omega~~ Halting Problem Solver -> a decidable algorithm that predicts ~~what an arbitrary agent does~~ if an arbitrary Turing Machine halts

If you have ~~an Omega~~ a Halting Problem Solver, then I run said ~~Omega~~ Halting Problem Solver on myself and do whatever said ~~Omega~~ Halting Problem Solver predicts I won't do (~~take both boxes~~ go into an infinite loop if and only if said ~~Omega~~ Halting Problem Solver predicts I won't)

This is a contradiction, therefore an Omega ~~a Halting Problem Solver~~ cannot exist.
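For what it's worth, the diagonal argument can be made concrete with a toy sketch. It rests on stated assumptions: agents and predictors are modelled as plain Python functions, every name here (`make_contrarian`, `simulating_predictor`, and so on) is hypothetical, and Python's `RecursionError` stands in for the infinite loop. It is an illustration of the argument, not a proof.

```python
from typing import Callable

Action = str  # "one-box" or "two-box"
Agent = Callable[[], Action]
Predictor = Callable[[Agent], Action]


def opposite(action: Action) -> Action:
    """Return the action the predictor did not name."""
    return "two-box" if action == "one-box" else "one-box"


def make_contrarian(predictor: Predictor) -> Agent:
    """Build an agent that runs the predictor on itself and then defies it."""
    def agent() -> Action:
        return opposite(predictor(agent))
    return agent


def always_one_box(agent: Agent) -> Action:
    """A predictor that answers without ever inspecting the agent."""
    return "one-box"


def simulating_predictor(agent: Agent) -> Action:
    """A predictor that 'predicts' by simply running the agent."""
    return agent()


def predictor_is_right(predictor: Predictor) -> bool:
    """Check a (deterministic, terminating) predictor against its contrarian."""
    agent = make_contrarian(predictor)
    return predictor(agent) == agent()


def simulation_terminates() -> bool:
    """Agent simulates predictor simulates agent ...: does it ever return?"""
    agent = make_contrarian(simulating_predictor)
    try:
        simulating_predictor(agent)
        return True
    except RecursionError:
        return False


if __name__ == "__main__":
    print(predictor_is_right(always_one_box))  # the contrarian two-boxes
    print(simulation_terminates())
```

Any predictor that does return an answer is wrong about its own contrarian by construction, and the one that tries to simulate the agent never returns at all, which matches the two failure modes described above.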

> Or is this where Omega becomes super-Turing?

This contradiction only applies if the agent can simulate Omega. (The premise requires that Omega can simulate the agent.)

One way of avoiding this contradiction is if the agent is *not* Turing-complete, and Omega can simulate the agent but not vice versa.

If the agent is Turing-complete, this implies that Omega must be Turing-complete in order for Omega to simulate the agent. By the Church-Turing thesis, if both are computable this in turn implies that Omega and the agent can simulate each other, and we arrive at this contradiction. Which means that if the agent is Turing-complete, Omega must not be computable. So the other possibility is that the agent is Turing-complete and Omega isn't computable (and is e.g. an Oracle Machine). This is where Omega becomes super-Turing.

(A third possibility is that Omega is allowed to be semidecidable... but in this case if I'm an agent that will result in Omega infinite-looping it shouldn't have been able to ask the question in the first place.)

^{[1]} In particular, I suspect the failure mode of many Omegas would be to go into an infinite loop.

^{[2]} This is a slightly different proof than the 'standard' proof of the Halting Problem^{[3]}, but this proof also works.

^{[3]} The standard proof starts with 'assume you have a TM that solves the Halting Problem', and directly shows that said TM is a contradiction. This proof starts with 'assume you have an algorithm that produces a TM that solves the Halting Problem', and shows that said algorithm is a contradiction.

Like any decision theory, Causal Decision Theory (CDT) aims to maximize expected utility; it does this by looking at the *causal effects* each available action in a problem has. For example, in Problem 1, taking box A has the causal effect of earning $100, whereas taking box B causes you to earn $500. $500 is more than $100, so CDT says to take box B (like any decision theory worth anything should). Similarly, CDT advises taking box A in Problem 2.

CDT's rule of looking at an action's causal effects makes sense: if you're deciding which action to take, you want to know how your actions change the environment. And as we will see later, CDT correctly solves the problem of the Smoking Lesion. But first, we have to ask ourselves: what is causality?

## What is causality?

A formal description of causality is beyond the scope of this post (and sequence), but intuitively speaking, causality is about stuff that makes stuff happen. If I throw a glass vase on concrete, it will break; my action of throwing the vase *caused* it to break.

You may have heard that correlation doesn't necessarily imply causality, which is true. For example, I'd bet hand size and foot size in humans strongly correlate: if we measured the hands and feet of a million people, those with larger hands would - on average - have larger feet as well, and vice versa. But hopefully we can agree hand size doesn't have a *causal effect* on foot size, or vice versa: your hands aren't large or small *because* your feet are large or small, even though we might be able to quite accurately predict your foot size *using* your hand size. Rather, hand size and foot size have *common causes* like genetics (determining how large a person can grow) and quality and quantity of food taken, etc.

Eliezer Yudkowsky describes causality in the following very neat way:

"I broke the vase" and "John kicks the ball" are both examples of this.

With the hope the reader now has an intuitive notion of causality, we can move on to see how CDT handles Smoking Lesion.

## Smoking Lesion

CDT says "yes". The agent either gets lung cancer or not; having the lesion certainly increases the risk, but smoking doesn't causally affect whether or not the agent has the lesion and has no direct causal effect on her probability of getting lung cancer either. CDT therefore reasons that whether you get the $1,000,000 in utility is beyond your control, but smoking simply gets you $1,000 more than not smoking. While smokers in this hypothetical world more often get lung cancer than non-smokers, this is because there are relatively more smokers in that part of the population that has the lesion, which is the cause of lung cancer. Smoking or not doesn't change whether the agent is in that part of the population; CDT therefore (correctly) says the agent should smoke. The Smoking Lesion situation is actually similar to the hands and feet example above: where e.g. genetics cause people to have larger hands *and* feet, the Smoking Lesion causes people to have cancer *and* enjoy smoking.

CDT makes intuitive sense, and seems to solve problems correctly so far. However, it does have a major flaw, which will become apparent in Newcomb's Problem.
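To make the Smoking Lesion reasoning a bit more tangible, here is a minimal sketch that contrasts what you learn from *observing* that someone smokes with what happens when an agent *chooses* to smoke. All probabilities are invented for illustration (they are not part of the problem statement), as are the function names; only the $1,000,000 and $1,000 utilities come from the text. The sketch assumes cancer depends on the lesion alone.

```python
from fractions import Fraction as F

# Hypothetical numbers, chosen only for illustration.
P_LESION = F(1, 2)
P_SMOKE = {True: F(4, 5), False: F(1, 5)}     # P(smokes | lesion present?)
P_CANCER = {True: F(9, 10), False: F(1, 10)}  # P(cancer | lesion present?)


def p_lesion(lesion: bool) -> F:
    return P_LESION if lesion else 1 - P_LESION


def p_smoke(smokes: bool, lesion: bool) -> F:
    return P_SMOKE[lesion] if smokes else 1 - P_SMOKE[lesion]


def p_cancer_observed(smokes: bool) -> F:
    """P(cancer | smoking status is observed): picks up the correlation."""
    joint = sum(p_lesion(l) * p_smoke(smokes, l) * P_CANCER[l] for l in (True, False))
    marginal = sum(p_lesion(l) * p_smoke(smokes, l) for l in (True, False))
    return joint / marginal


def p_cancer_chosen(smokes: bool) -> F:
    """P(cancer | the agent chooses): choosing does not change the lesion."""
    return sum(p_lesion(l) * P_CANCER[l] for l in (True, False))


def cdt_expected_utility(smokes: bool) -> F:
    """$1,000,000 for staying healthy, plus $1,000 for the pleasure of smoking."""
    return (1 - p_cancer_chosen(smokes)) * 1_000_000 + (1_000 if smokes else 0)


if __name__ == "__main__":
    print(p_cancer_observed(True), p_cancer_observed(False))    # 37/50 13/50
    print(p_cancer_chosen(True), p_cancer_chosen(False))        # 1/2 1/2
    print(cdt_expected_utility(True), cdt_expected_utility(False))  # 501000 500000
```

Under these made-up numbers, smokers get cancer far more often than non-smokers (74% versus 26%), yet choosing to smoke leaves the cancer risk at 50% either way, so smoking comes out exactly $1,000 ahead.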

## Newcomb's Problem

(Note that "iff" means "if and only if".)

How does CDT approach this problem? Well, let's look at the causal effects of taking both boxes ("two-boxing") and taking one box ("one-boxing").
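As a rough sketch of that comparison, the payoff table below holds the contents of box B fixed, since CDT treats them as causally independent of the action taken now, and compares the two actions in each case. The amounts ($1,000 in box A, $1,000,000 in box B when filled) are the ones used in this post; the function names are made up for illustration.

```python
from typing import Dict

BOX_A = 1_000                # always present
BOX_B_IF_FILLED = 1_000_000  # present only if Omega predicted one-boxing


def payoff(action: str, box_b_filled: bool) -> int:
    """Money received for an action, given the (already fixed) state of box B."""
    box_b = BOX_B_IF_FILLED if box_b_filled else 0
    return box_b + (BOX_A if action == "two-box" else 0)


def two_boxing_advantage() -> Dict[bool, int]:
    """How much more two-boxing pays than one-boxing, for each state of box B."""
    return {
        filled: payoff("two-box", filled) - payoff("one-box", filled)
        for filled in (False, True)
    }


if __name__ == "__main__":
    print(two_boxing_advantage())  # {False: 1000, True: 1000}
```

Whichever state box B is in, two-boxing pays $1,000 more, which is the dominance reasoning CDT relies on.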

First of all, note that Omega has already made its prediction. Your action now doesn't causally affect this, as you can't cause the past. Omega made its prediction and based upon it either filled box B or not. If box B isn't filled, one-boxing gives you nothing; two-boxing, however, would give you the contents of box A, earning you $1,000. If box B *is* filled, one-boxing gets you $1,000,000. That's pretty sweet, but two-boxing gets you $1,000,000 + $1,000 = $1,001,000. In both cases, two-boxing beats one-boxing by $1,000. CDT therefore two-boxes.

John, who is convinced by CDT-style reasoning, takes both boxes. Omega predicted he would, so John only gets $1,000. Had he one-boxed, Omega would have predicted *that*, giving him $1,000,000. If only he hadn't followed CDT's advice!

## Is Omega even possible?

At this point, you may be wondering whether Newcomb's Problem is relevant: is it even possible to make such accurate predictions of someone's decision? There are two important points to note here.

First, yes, such accurate predictions might actually be possible, especially if you're a robot: Omega could then have a copy - a model - of your decision-making software, which it feeds Newcomb's Problem to see whether the model will one-box or two-box. Based on *that*, Omega predicts whether *you* will one-box or two-box, and fixes the contents of box B accordingly. Now, you're not a robot, but future brain-scanning techniques might still make it possible to form an accurate model of your decision procedure.

The second point to make here is that predictions need not be this accurate in order to have a problem like Newcomb's. If Omega could predict your action with only 60% accuracy (meaning its prediction is wrong 40% of the time), e.g. by giving you some tests first and examining the answers, the problem doesn't fundamentally change. CDT would still two-box: *given* Omega's prediction (whatever its accuracy is), two-boxing still earns you $1,000 more than one-boxing. But, of course, Omega's prediction is *connected* to your decision: two-boxing gives you 0.6 probability of earning $1,000 (because Omega would have predicted you'd two-box with 0.6 accuracy) and 0.4 probability of getting $1,001,000 (the case where Omega is wrong in its prediction), whereas one-boxing would give you 0.6 probability of getting $1,000,000 and 0.4 probability of $0. This means two-boxing has an expected utility of 0.6 x $1,000 + 0.4 x $1,001,000 = $401,000, whereas the expected utility of one-boxing is 0.6 x $1,000,000 + 0.4 x $0 = $600,000. One-boxing still wins, and CDT still goes wrong.

In fact, people's microexpressions on their faces can give clues about what they will decide, making many real-life problems *Newcomblike*.

## Newcomb's Problem vs. Smoking Lesion
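The expected-utility arithmetic for an imperfect predictor (0.6 accuracy giving $401,000 for two-boxing versus $600,000 for one-boxing) generalizes to any accuracy. The sketch below is only an illustration: it assumes, as the 60% example does, that Omega is equally accurate for one-boxers and two-boxers, and the names `expected_utility` and `BREAK_EVEN` are made up here.

```python
from fractions import Fraction

BOX_A = 1_000
BOX_B = 1_000_000


def expected_utility(action: str, accuracy: Fraction) -> Fraction:
    """Expected payoff if Omega predicts the chosen action with probability `accuracy`."""
    if action == "one-box":
        # Box B is filled only when Omega correctly predicted one-boxing.
        return accuracy * BOX_B
    # Two-boxing always collects box A; box B is filled only when Omega was wrong.
    return BOX_A + (1 - accuracy) * BOX_B


# Accuracy at which both actions have the same expected utility, from
# accuracy * BOX_B == BOX_A + (1 - accuracy) * BOX_B.
BREAK_EVEN = Fraction(BOX_A + BOX_B, 2 * BOX_B)  # 1001/2000 = 0.5005

if __name__ == "__main__":
    for accuracy in (Fraction(1, 2), BREAK_EVEN, Fraction(3, 5), Fraction(1)):
        one = expected_utility("one-box", accuracy)
        two = expected_utility("two-box", accuracy)
        print(f"accuracy={float(accuracy):.4f}  one-box=${float(one):,.0f}  two-box=${float(two):,.0f}")
```

With these payoffs, one-boxing has the higher expected utility as soon as the accuracy exceeds 0.5005. At exactly 0.5, where the prediction carries no information about your decision, two-boxing is ahead by $1,000.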

You might be wondering about the exact difference between Newcomb's Problem and Smoking Lesion: why does the author suggest smoking in Smoking Lesion, while also saying one-boxing in Newcomb's Problem is the better choice? After all, two-boxers indeed often find an empty box in Newcomb's Problem - but isn't it also true that smokers often get lung cancer in Smoking Lesion?

Yes. But the latter has nothing to do with the *decision to smoke*, whereas the former has everything to do with the *decision to two-box*. Let's indeed assume Omega has a model of your decision procedure in order to make its prediction. Then whatever you decide, the model also decided (with perhaps a small error rate). This isn't different than two calculators both returning "4" on "2 + 2": if your calculator outputs "4" on "2 + 2", you know that, when Fiona input "2 + 2" on her calculator a day earlier, hers must have output "4" as well. It's the same in Newcomb's Problem: if you decide to two-box, so did Omega's model of your decision procedure; similarly, if you decide to one-box, so did the model. Two-boxing then systematically leads to earning only $1,000, while one-boxing gets you $1,000,000. Your decision procedure is instantiated in two places: in your head and in Omega's, and you can't act as if your decision has no impact on Omega's prediction.

In Smoking Lesion, smokers do often get lung cancer, but that's "just" a statistical relation. Your decision procedure has no effect on the presence of the lesion and whether or not you get lung cancer; this lesion does give people a *fondness of* smoking, but the decision to smoke is still theirs and has no effect on getting lung cancer.

Note that, if we assume Omega doesn't have a model of your decision procedure, two-boxing would be the better choice. For example, if, historically, people wearing brown shoes always one-boxed, Omega might base its prediction on *that* instead of on a model of your decision procedure. In that case, your decision doesn't have an effect on Omega's prediction, in which case two-boxing simply makes you $1,000 more than one-boxing.

## Conclusion

So it turns out CDT doesn't solve every problem correctly. In the next post, we will take a look at another decision theory: *Evidential Decision Theory*, and how it approaches Newcomb's Problem.