An Intuitive Introduction to Functional Decision Theory

Heighn

In the last two posts, we looked at both Causal (CDT) and Evidential Decision Theory (EDT). While both theories have their strengths, CDT fails on Newcomb's Problem and EDT performs poorly on Smoking Lesion. Both problems do have a correct answer if you want to get as much utility as possible: one-boxing results in earning $1,000,000, and smoking results in an extra $1,000.

Functional Decision Theory (FDT) is a decision theory that manages to answer both problems correctly. Where CDT looks at the causal effects of the available actions, FDT considers the effects of its decision. It asks:

Which output of this decision procedure causes the best outcome?

Here, the decision procedure is simply the reasoning or "thinking" done to determine which action to take. This might seem like only a superficial difference with CDT, but that really isn't the case. The crucial point is that the same decision procedure can be implemented by multiple physical systems. It's best to explain this using an example.

Psychological Twin Prisoner's Dilemma

The Psychological Twin Prisoner's Dilemma (PTPD) runs as follows:

An agent and her twin must both choose to either “cooperate” or “defect.” If both cooperate, they each receive $1,000,000. If both defect, they each receive $1,000. If one cooperates and the other defects, the defector gets $1,001,000 and the cooperator gets nothing. The agent and the twin know that they reason the same way, using the same considerations to come to their conclusions. However, their decisions are causally independent, made in separate rooms without communication. Should the agent cooperate with her twin?

Looking at this in the most selfish way possible, the best outcome for the agent is the one where she defects and her twin cooperates, in which case she gets $1,001,000. However, note that this situation is impossible: the agent and her twin reason the same way. If the agent wants the $1,001,000 outcome, so does her twin; in that case, they both defect and both get $1,000, while nobody gets the $1,001,000.

The reader might have noticed there are only two outcomes possible here: either both the agent and her twin cooperate, or they both defect. The agent prefers the first outcome: that's the one where she gets $1,000,000, whereas the defect-defect scenario only gets her $1,000. CDT'ers, however, defect: since the agent's decision doesn't causally affect that of the twin, they reason that given the decision of the twin, defecting always gives $1,000 more. This is quite straightforward: given defection by the twin, defecting gives the agent $1,000 whereas cooperating gives her nothing; given cooperation by the twin, the agent gets $1,001,000 by defecting and $1,000,000 by cooperating, making PTPD analogous to Newcomb's Problem. (Indeed, the problems are equivalent; the names of the actions are different, but the expected utilities are the same.)

CDT fails to incorporate the connection between the agent's decision making and her twin's, and reasons as if the two are independent. But that's not the case: just like two identical calculators both return 4 to 2 + 2, two agents who reason the same way on the same problem come to the same conclusion. Whatever the agent decides, her twin also decides, because, crucially, her twin has the same decision procedure as she does. The agent's decision procedure is "implemented" twice here: once in the agent, and once in her twin. FDT asks: "Which output of this decision procedure causes the best outcome?" Well, if the output of this decision procedure is "defect", then all "physical systems" (in this case, the agent and her twin) implementing this decision procedure defect, earning the agent $1,000. If the output of this decision procedure is "cooperate", then both the agent and her twin cooperate, getting the agent a sweet $1,000,000. FDT therefore recommends cooperating!
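To make the contrast concrete, here is a minimal Python sketch of the two lines of reasoning on PTPD. The payoffs come from the problem statement above; the names `PAYOFF`, `cdt_choice` and `fdt_choice` are made up for this illustration, and the functions are toy stand-ins for the two theories, not formal definitions of them.

```python
# Payoff to the agent, keyed by (agent's action, twin's action).
PAYOFF = {
    ("cooperate", "cooperate"): 1_000_000,
    ("cooperate", "defect"): 0,
    ("defect", "cooperate"): 1_001_000,
    ("defect", "defect"): 1_000,
}
ACTIONS = ("cooperate", "defect")


def cdt_choice():
    """Hold the twin's action fixed and pick the agent's best reply to it."""
    best_replies = {
        max(ACTIONS, key=lambda a: PAYOFF[(a, twin)]) for twin in ACTIONS
    }
    # In PTPD the best reply is the same whatever the twin does (dominance),
    # so the set contains exactly one action.
    (choice,) = best_replies
    return choice


def fdt_choice():
    """Pick the output of the shared procedure; the twin's action equals it."""
    return max(ACTIONS, key=lambda out: PAYOFF[(out, out)])


print(cdt_choice(), PAYOFF[(cdt_choice(), cdt_choice())])
print(fdt_choice(), PAYOFF[(fdt_choice(), fdt_choice())])
```

The only difference between the two functions is which payoff entries they compare: the CDT-style function compares within a column (twin's action held fixed), while the FDT-style function compares along the diagonal (twin's action equal to the agent's).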

Note that EDT also cooperates, as it recognizes that cooperating is evidence for the twin cooperating, whereas defecting is evidence for defection by the twin. This is, however, the wrong reason to cooperate: as we saw in the Smoking Lesion problem, mere correlation doesn't provide a good basis for decision making. As we will see later in this post, there are problems where FDT beats EDT and even problems where it beats both CDT and EDT.

Newcomb's Problem

We discussed the similarity between Newcomb's Problem and PTPD before, and it might not surprise you to read that FDT one-boxes on Newcomb's Problem. The reason has been briefly touched upon before, and is the same as the reason for cooperating on PTPD: the agent's decision procedure is implemented twice. The action of the agent ("you") in Newcomb's Problem has been predicted by Omega; in order for Omega to make an accurate prediction, it must have a model of your decision procedure which it uses to make its prediction. Omega "feeds" Newcomb's Problem to its model of your decision procedure. If you two-box, so did Omega's model; you'll find nothing in box B. If you one-box, so did the model and you'll find $1,000,000 in box B. This might be very counterintuitive to the reader, but it's no different than two identical calculators both returning 4 to 2 + 2. In that case, the "addition procedure" is implemented twice (once in each calculator); in Newcomb's Problem, it's your decision procedure that's doubly implemented. Knowing that Omega's model will have made the same decision she does, the FDT agent therefore one-boxes and earns $1,000,000. (That is, unless she, for some reason, concludes Omega does not have a model of her decision procedure, and bases its prediction on something as boring as her shoe color. In that case, the agent's decision procedure does not affect Omega's prediction, in which case two-boxing is better and FDT does that instead.)
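The "implemented twice" idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: Omega's model is taken to be a perfect copy of the agent's procedure, box A is taken to hold $1,000, and the names `play_newcomb`, `one_boxer` and `two_boxer` are invented for the example.

```python
def play_newcomb(decision_procedure):
    """Run Newcomb's Problem for an agent whose procedure Omega has modelled."""
    # First implementation: Omega runs its model of the agent (assumed perfect).
    prediction = decision_procedure()
    box_a = 1_000
    box_b = 1_000_000 if prediction == "one-box" else 0

    # Second implementation: later, the agent herself runs the same procedure.
    action = decision_procedure()
    return box_b if action == "one-box" else box_a + box_b


def one_boxer():
    return "one-box"


def two_boxer():
    return "two-box"


print(play_newcomb(one_boxer))
print(play_newcomb(two_boxer))
```

Because `decision_procedure` is called in both places, there is no way to pass in a function that returns "one-box" to Omega and "two-box" in the room, which is the calculator analogy in code form.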

Smoking Lesion

Like CDT - and unlike EDT - FDT smokes in Smoking Lesion. There is no model or otherwise "extra" implementation of the agent's decision procedure in this problem, and FDT (like CDT) correctly reasons that smoking doesn't affect the probability of getting cancer.

Subjunctive Dependence

While CDT is based upon causality, FDT builds on subjunctive dependence: two physical systems (e.g. two calculators) computing the same function are subjunctively dependent upon that function. In PTPD, the agent and her twin are both computing the same decision procedure; therefore, what the agent decides literally is what the twin decides; they are subjunctively dependent upon the agent's decision procedure. In Newcomb's Problem, Omega runs a model of the agent's decision procedure; again, what the agent decides literally is what Omega's model decided earlier, and the agent and Omega are subjunctively dependent upon the agent's decision procedure.

Parfit's Hitchhiker

An agent is dying in the desert. A driver comes along who offers to give the agent a ride into the city, but only if the agent will agree to visit an ATM once they arrive and give the driver $1,000. The driver will have no way to enforce this after they arrive, but she does have an extraordinary ability to detect lies with 99% accuracy. Being left to die causes the agent to lose the equivalent of $1,000,000. In the case where the agent gets to the city, should she proceed to visit the ATM and pay the driver?

The above problem is known as Parfit's Hitchhiker, and both CDT and EDT make the wrong decision here - albeit for different reasons. CDT reasons that there's no point in paying once the agent is already in the city; paying now only causes her to lose $1,000. Once an EDT agent learns she's in the city, paying isn't evidence for the driver taking her: she's already in the city! Paying now only correlates with losing $1,000. Both CDT and EDT therefore refuse to pay. You may think this is rational, as the agent is indeed already in the city - but note that any agent following CDT or EDT encountering this problem will never have been taken to the city to begin with. Neither a CDT nor an EDT agent can honestly claim they will pay once they are in the city: they know how they would reason once there. Both CDT and EDT agents are therefore left by the driver, to die in the desert.

FDT once again comes to the rescue: just like Omega has a model of the agent's decision procedure in Newcomb's Problem, it appears the driver has such a model in Parfit's Hitchhiker. The agent's decision in the city, then, is the decision the model makes earlier (with a 1% error rate). FDT then reasons that the expected utility of paying is 0.99 x $1,000,000 - $1,000 = $989,000, because of the 0.99 probability that the driver predicted you would pay. The expected utility of not paying is only 0.01 x $1,000,000 = $10,000, as the driver would have to wrongly predict you'd pay. $989,000 is a lot more than $10,000, so FDT agents pay - and don't get left to die in the desert.
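The two expected utilities can be checked with a short Python sketch. It follows the formula as written in the paragraph above (the $1,000 is subtracted unconditionally for the paying policy); the constant and function names are invented for this example, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

ACCURACY = Fraction(99, 100)   # the driver's lie-detection accuracy
VALUE_OF_RESCUE = 1_000_000    # being left to die costs the equivalent of this
PAYMENT = 1_000


def expected_utility(pays: bool) -> Fraction:
    """Expected utility of the policy 'pay' or 'don't pay' once in the city."""
    # The driver gives a ride iff she predicts the agent will pay.
    p_ride = ACCURACY if pays else 1 - ACCURACY
    cost = PAYMENT if pays else 0
    return p_ride * VALUE_OF_RESCUE - cost


print(expected_utility(True))    # paying policy
print(expected_utility(False))   # non-paying policy
```

A slightly more careful model would only charge the $1,000 in the worlds where the agent actually reaches the city; that changes the first number a little but not the comparison.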

Transparent Newcomb Problem

Events transpire as they do in Newcomb’s problem, except that this time both boxes are transparent—so the agent can see exactly what decision the predictor made before making her own decision. The predictor placed $1,000,000 in box B iff she predicted that the agent would leave behind box A (which contains $1,000) upon seeing that both boxes are full. In the case where the agent faces two full boxes, should she leave the $1,000 behind?

This problem is called the Transparent Newcomb Problem, and I suspect many people will be sure the rational move is to two-box here. After all, the agent knows there is $1,000,000 in box B. She sees it. In the original Newcomb's Problem, you don't see the contents of box B, and your decision influences the contents. Now the agent does see the contents - so why leave an extra $1,000 behind?

Because it was never about whether you actually see the contents of box B or not. It's about subjunctive dependence! Your decision influences the contents of box B, because Omega modelled your decision procedure. FDT recognizes this and one-boxes on the Transparent Newcomb Problem - and, crucially, virtually always ends up in the situation where it sees two full boxes. FDT one-boxes, so Omega's model of the agent one-boxes as well - causing Omega to put $1,000,000 in box B. Had FDT two-boxed, so would Omega's model, in which case Omega would leave box B empty.

Ask yourself this: "What kind of person do I want to be?" Do you want to be a two-boxer or a one-boxer? I mean, if I ask you now, before you even face the above dilemma, do you want to be someone who would two-box or someone who would one-box? Two-boxers never face the situation with two full boxes; one-boxers do, although (but because) they decide to leave the $1,000 behind. You can't be a "one-boxing person" in general and still two-box once you're actually facing the two full boxes - if that's the way you reason once you're in that situation, you're a two-boxer after all. You have to leave the $1,000 behind and be a one-boxing person; this is the only way to ensure that once you're in a situation like the Transparent Newcomb Problem, there's $1,000,000 in box B.
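The "what kind of person do I want to be?" question can be phrased as choosing a policy, that is, a rule for what to do given what you see. The Python sketch below is an illustration under stated assumptions: the predictor is treated as perfect, an agent who sees an empty box B is assumed to take both boxes (the problem statement doesn't specify this case), and all names are invented for the example.

```python
def play_transparent_newcomb(policy):
    """policy maps 'is box B full?' (bool) to 'one-box' or 'two-box'."""
    # The predictor asks her model: what would the agent do on seeing two full boxes?
    predicted = policy(True)
    b_full = predicted == "one-box"
    box_a = 1_000
    box_b = 1_000_000 if b_full else 0

    # The agent now sees the actual contents and applies the same policy.
    action = policy(b_full)
    return box_b if action == "one-box" else box_a + box_b


def one_boxing_person(b_full):
    # Leaves the $1,000 behind when both boxes are full.
    return "one-box" if b_full else "two-box"


def two_boxing_person(b_full):
    # Always grabs both boxes, whatever she sees.
    return "two-box"


print(play_transparent_newcomb(one_boxing_person))
print(play_transparent_newcomb(two_boxing_person))
```

Note that `two_boxing_person` is never called with `b_full=True` by the agent herself, only by the predictor's model: she never gets to see two full boxes.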

Counterfactual Mugging

Omega, a perfect predictor, flips a coin. If it comes up tails Omega asks you for $100. If it comes up heads, Omega pays you $10,000 if it predicts that you would have paid if it had come up tails.

So should you pay the $100 if the coin comes up tails?

FDT pays in this problem, known as Counterfactual Mugging, and the reason might sound crazy: your decision procedure in this problem is modelled by Omega in a counterfactual situation! The coin didn't come up heads, but had it come up heads, you would have made $10,000 if and only if you pay up the $100 now.

Again, ask yourself: "What kind of person do I want to be?" Rationally, you should want to be someone who'd pay in case she runs across Omega and the coin comes up tails. It's the only way to get $10,000 in the heads case.
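Evaluated before the coin is flipped, the two policies can be compared with a small Python sketch. It assumes a fair coin and a perfect predictor, as in the problem statement; the function name `policy_value` is made up for this illustration.

```python
P_HEADS = 0.5   # fair coin (assumption)
ASKED = 100     # what Omega asks for on tails
REWARD = 10_000 # what Omega pays on heads, if it predicts you'd pay on tails


def policy_value(pays_on_tails: bool) -> float:
    """Expected value, before the flip, of being an agent with this policy."""
    # Heads: Omega (a perfect predictor) rewards only agents who would pay on tails.
    heads_payoff = REWARD if pays_on_tails else 0
    # Tails: the paying agent hands over the $100.
    tails_payoff = -ASKED if pays_on_tails else 0
    return P_HEADS * heads_payoff + (1 - P_HEADS) * tails_payoff


print(policy_value(True))
print(policy_value(False))
```

Under these assumptions, the paying policy loses $100 in the tails world but is worth far more on average than the refusing policy, which is the sense in which you should "want to be" someone who pays.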

Concluding Remarks

This ends this sequence on Functional Decision Theory. The purpose of this sequence was to deliver an intuitive introduction; to fully explain FDT, however, we need to go deeper and dive into mathematics a bit more. My plan is to post a more "technical" sequence on FDT in the near future.
