Defending Functional Decision Theory


What, precisely, is meant here by:

The predictor has a failure rate of only 1 in a trillion trillion. Helpfully, she left a note, explaining that she predicted that you would take Right, and therefore she put the bomb in Left.

Is this a prediction that I would take Right, *given that the predictor said that I would take Right*?

Or is the note indicating that I would take Right *in the absence of a note*?

This is an important distinction, I believe.

However, it does seem the second one uses the function of the first one as "subfunction": it needs to know the "real" answer to "2 + 2" in order to output "-4". Therefore, the calculators are subjunctively dependent on that subfunction, even though their outputs are different. Even if the second calculator always outputs "[output of first calculator] + 1", the calculators are still subjunctively dependent on that same function.

Why not reverse the situation? Couldn't you just as well say that the calculator that outputs 4 is subjunctively dependent on the calculator that outputs -4, since it needs to know that the real answer to the second is -4 in order to drop the - and output 4?

Direction of causality, or even causality itself, is irrelevant to FDT. Subjunctive dependence is simply a statement that two variables are *not independent* across the possible worlds conditional on parameters of interest. It doesn't say that one causes the other or that they have a common cause.

In the calculator example, the variables are the outputs of the two calculators, and the parameters of interest are inputs common to both calculators. In this case the dependence is extremely strong: there is a 1:1 relation between the outputs for any given input in all possible worlds where both calculators are functioning correctly.

For the purposes of FDT, the relevant subjunctive dependence is that between the decision process outputs and the outcomes, and the variables of interest are the inputs to the decision process. In carefully constructed scenarios such as Newcomb's problem, the subjunctive dependence is total: Omega is a perfect predictor. When the dependence is weaker, the details matter more - but causality is still irrelevant.

In the case of weaker dependence you can get something *like* a direction of dependence, in that perhaps each value of variable A corresponds to a single value of variable B across possible worlds, but not vice versa. This still doesn't indicate causality.
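The calculator case can be made concrete with a toy sketch (the function names here are mine, purely illustrative):

```python
# Toy illustration of subjunctive dependence: two "calculators" that
# share a subfunction. Their outputs differ, yet across all possible
# inputs each output of one corresponds 1:1 to an output of the other.

def true_sum(expr: str) -> int:
    """The shared subfunction: actually evaluate the arithmetic."""
    a, b = expr.split("+")
    return int(a) + int(b)

def calculator_a(expr: str) -> int:
    return true_sum(expr)          # outputs 4 on "2 + 2"

def calculator_b(expr: str) -> int:
    return -true_sum(expr)         # outputs -4 on "2 + 2"

# Across the "possible worlds" (inputs), the two outputs are perfectly
# dependent: knowing one pins down the other, with no causal arrow
# between the calculators themselves.
worlds = ["2 + 2", "3 + 5", "10 + 7"]
pairs = {(calculator_a(w), calculator_b(w)) for w in worlds}
assert all(a == -b for a, b in pairs)
```

Neither calculator's output causes the other's; both are pinned down by the shared subfunction, which is the point about dependence without causality.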

What I have in mind is stuff like this:

FDT can require that P come augmented with information about the logical, mathematical, computational, causal, etc. structure of the world more broadly. Given a graph G that tells us how changing a logical variable affects all other variables, we can re-use Pearl’s do operator to give a decision procedure for FDT

FDT seems to rely heavily on this sort of assumption, but also seems to lack any sort of formalization of how the logical graphs work.

Interesting point. It seems to me that given MacAskill's original setup of the calculators, the second one *really does* calculate the first one's function and adds the minus sign. Like, if 2 + 2 were to equal 5 tomorrow, the first calculator would output 5 and the second one -5.

Idk. MacAskill's setup is kinda messy because it involves culture and physics and computation too; these layers introduce all sorts of complexity that makes it hard to analyze. Whereas you seem to say that causality is meaningful for logic and for mathematical functions too.

So let's stay within math. Suppose for instance we represent functions in the common way, with f being represented as its graph { (x, y) where y = f(x) }. Under what conditions does one such set cause another?

Procreation* gives both FDT and CDT agents (and indeed, all agents) the same dilemma. FDT agents procreate and live miserably; CDT agents don't procreate and almost certainly don't exist. FDT beats CDT in this dilemma.

This doesn't seem right: you already exist! In order to say that "FDT beats CDT" I think you have to argue that one should care about the number of branches you exist in—which is what you plausibly have uncertainty about, not about whether *this* very instance of you exists. (And this is arguably just about preferences, as Christiano writes about here. So it is unclear what it would even mean to say that "FDT beats CDT".) That is, this is about implementing a specific version of mixed-upside updatelessness or not—specifically, the multiverse version of MUU I describe here.

Thanks for your reaction!

This doesn't seem right: you already exist!

Sure, I already exist; together with the fact that I make the exact same decision my father made, that implies I procreate and therefore I'm not a CDT'er.

The point with these problems is, I believe, that your decision procedure is implemented at least once, but possibly twice, throughout time - depending on what your decision procedure outputs.

In Procreation*, if "my" decision procedure outputs "procreate", it first does so "in" my father, who then procreates, causing me to exist. I then also procreate.

But if "my" decision procedure outputs "don't procreate", it also first does so "in" my father, who then doesn't procreate, and then I don't exist.

The question "Should I procreate?" is a bit misleading, then, as I possibly don't exist. *Or* we indeed assume I *do* exist; but then it's not much of a decision problem anymore. If I exist, then my father procreated, and I *necessarily* procreate too.

(Warning: this is a bit of a sidenote. There are very likely other related problems that do not suffer this issue. I might suggest that the argument and chain of logic in this post would be stronger if you chose another variant.)

If you're the last person in the universe, knowing that you'll never see anyone else again ever, why does $100 have any value to you?

Response: just decrease the failure rate arbitrarily far until it does balance again.

Counter-response A: the failure rate cannot be reduced arbitrarily far, because P(I misinterpreted what the agent is saying) is positive & non-zero.

Counter-response B: if the value of the money to me is negative - which given I'm hauling around a piece of paper when I'm literally the last person in the universe it very well may be - there is no such failure rate.

Yeah, the $100 wouldn't have value, but we can assume for the problem at hand that Right-boxing comes with a cost that, expressed in dollars, equals 100 - just like I expressed the value of living in dollars, at $1,000,000.

As I have been studying Functional Decision Theory (FDT) a lot recently, I have come across quite a few counterarguments and general remarks that are worth rebutting and/or discussing in more detail. This post is an attempt to do just that. Most points have been discussed in other posts, but as my understanding of FDT has grown, I decided to write this new post. For readers unfamiliar with FDT, I recommend reading Functional Decision Theory: A New Theory of Instrumental Rationality.

## The Bomb Argument

Originally proposed by William MacAskill:

The argument against FDT, then, is that it recommends Left-boxing, which supposedly is wrong because it makes you slowly burn to death while you could have just paid $100 instead.

## Analysis and Rebuttal

On Bomb, FDT indeed recommends Left-boxing. As the predictor seems to have a model of your decision procedure which she uses to make her prediction, FDT reasons that whatever you decide now, the predictor's model of you also decided. If you Left-box, so did the model; if you Right-box, so did the model. If the model Left-boxed, then the predictor would have predicted you Left-box, and, crucially, *not* put a bomb in Left. If the model instead Right-boxed, there would be a bomb in Left. Reasoning this way, Left-boxing gives you a situation with no bomb (with probability a trillion trillion minus 1 out of a trillion trillion) where you don't pay any money, while Right-boxing gets you one where you pay $100. Left-boxing then clearly wins, assuming you don't value your life higher than $100 trillion trillion. Let's assume you value your life at $1,000,000.

## "But there *is* a bomb in Left! You burn to death!"

Well, the problem indeed specifies there is a bomb in Left, but this is as irrelevant as saying "But you're in town already!" in Parfit's Hitchhiker (note that this version of Parfit's Hitchhiker asks whether you should pay once you're already in town). There, you could say paying is irrational since you're in town already and paying just loses you money. But if you are a non-paying agent talking to the driver, he will know you are a non-paying agent (by design of the problem), and *never* take you to town to begin with. Similarly, if you are a Left-boxer, the predictor in Bomb will not put a bomb in Left and you can save yourself $100. Really: Left-boxing in Bomb is analogous to and just as rational as paying in Parfit's Hitchhiker.

## "The predictor isn't perfect. There can be a bomb in Left while you Left-box."

So we're focusing on that 1 in a trillion trillion case where the predictor is wrong? Great. FDT saves $100 in 999,999,999,999,999,999,999,999 out of a trillion trillion cases and burns to death in 1 of them. FDT wins, period.
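To make the arithmetic explicit - a minimal sketch using the numbers assumed above ($1,000,000 value of life, $100 fee, a 1-in-a-trillion-trillion failure rate, written as 1e-24):

```python
# Expected-utility comparison in Bomb, from the perspective of choosing
# a policy before the prediction is fixed. Numbers are the post's
# assumptions: life valued at $1,000,000, Right costs $100, and the
# predictor fails 1 time in a trillion trillion (1e-24).
FAIL = 1e-24          # predictor's failure rate
LIFE = 1_000_000      # dollar value of your life

eu_left = -FAIL * LIFE      # you burn only if the predictor erred
eu_right = -100             # you always pay the $100 fee

assert eu_left > eu_right   # Left-boxing wins by ~$100 in expectation

# Break-even failure rate: Left stops being better once
# failure_rate * LIFE exceeds $100, i.e. at 1e-4 for these numbers.
threshold = 100 / LIFE
```

This break-even point is also what the "decrease the failure rate until it balances" exchange in the comments is about: for these payoffs, Left-boxing is better whenever the failure rate is below 1 in 10,000.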

## "But the scenario focuses on that 1 in a trillion trillion case. It doesn't mention the other 999,999,999,999,999,999,999,999 cases."

No, it doesn't just focus on that 1 in a trillion trillion case. It mentions the predictor, who predicts your decision with great accuracy, and then asks what decision you should make. That decision influences the prediction via subjunctive dependence. You can't propose an extremely accurate predictor-of-your-decision and then expect me to reason as if that predictor's prediction and my decision are independent of each other. Yes, the prediction can be wrong, but it can be - and almost certainly is - right too. It's simply wrong to reason about a fixed prediction.
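One way to see this is to tally payoffs across all of the predictor's possible worlds rather than the single world the scenario highlights - a toy sketch, with a million worlds standing in for a trillion trillion:

```python
# Tally over possible worlds: in all but one of N worlds the predictor
# is right about your policy. N = 10**6 stands in for a trillion
# trillion; the structure, not the magnitude, is what matters.
N = 10**6
LIFE, FEE = 1_000_000, 100

def payoff(policy: str, predictor_correct: bool) -> int:
    predicted = policy if predictor_correct else (
        "Right" if policy == "Left" else "Left")
    bomb_in_left = (predicted == "Right")
    if policy == "Left":
        return -LIFE if bomb_in_left else 0
    return -FEE                      # Right always costs the fee

def total(policy: str) -> int:
    # Exactly one world in N has an incorrect prediction.
    return payoff(policy, False) + (N - 1) * payoff(policy, True)

assert total("Left") > total("Right")   # -1,000,000 vs -100,000,000
```

Fixing attention on the one mispredicted world is exactly what this tally refuses to do: summed over all worlds, the Left-boxing policy comes out far ahead.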

## "Look, if you had to choose *before* you know what's in the boxes, Left-boxing might make sense. But that's not the case!"

Yes, that's *exactly* the case, due to subjunctive dependence between you and the predictor. The predictor runs a model of your decision procedure. Whatever you decide, that model also "decided", before the predictor fixed the contents of the boxes.

Bomb gives us 1 in a trillion trillion cases where FDT agents die horribly, and almost a trillion trillion cases where they save $100. Bomb is an argument *for* FDT, not against it.

## The Procreation Argument

From On Functional Decision Theory:

## Analysis and Rebuttal

FDT agents indeed have a worse life than CDT agents in Procreation, but that has nothing to do with rationality and everything to do with the problem structure. An FDT agent would procreate, since that gives a high probability (let's say 0.99) that she exists miserably, which she prefers to not existing at all. If life without children is valued at $1,000,000 and life with children at $100,000, then her expected utility for procreating is $99,000; for not procreating, it is $10,000. FDT therefore procreates. If you're a CDT agent, it is assumed the father procreated; the expected utility for procreating, then, is $100,000; for not procreating, it is $1,000,000. CDT doesn't procreate, and makes $990,000 more than FDT.

But I hope the reader agrees that we're not really discussing one problem here; we're discussing two different problems, one for FDT and one for CDT. For each theory, there are very different probabilities on the table! Critiquing FDT with Procreation is like critiquing CDT because EDT gets more money in Newcomb's Problem than CDT does in Parfit's Hitchhiker. FDT agents choose the best option available to them in Procreation!
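These expected utilities can be checked directly (a sketch using exactly the numbers assumed in the paragraph above):

```python
# Procreation, with the post's assumed values: childless life worth
# $1,000,000, life with children $100,000, and a 0.99 probability of
# existing (miserably) if your decision procedure outputs "procreate".
P_EXIST = 0.99
WITH_KIDS, NO_KIDS = 100_000, 1_000_000

# FDT: the father ran the same procedure, so your choice "sets" his.
fdt_procreate = P_EXIST * WITH_KIDS          # $99,000
fdt_refrain = (1 - P_EXIST) * NO_KIDS        # $10,000

# CDT: conditions on already existing (the father procreated, period).
cdt_procreate = WITH_KIDS                    # $100,000
cdt_refrain = NO_KIDS                        # $1,000,000

assert fdt_procreate > fdt_refrain           # FDT procreates
assert cdt_refrain > cdt_procreate           # CDT doesn't
```

The two agents are evaluating different probability tables, which is the point: this is not one problem posed to two theories.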

Note that we can just as easily create a version of Procreation where CDT agents "have a much worse life" than FDT agents. Simply have the father be a CDT agent! In that case, FDT agents don't procreate and have a happy life - and, notably, CDT agents, not using the subjunctive dependence between them and their father, still don't procreate, and almost certainly cease to exist.

## A More Fair Version of Procreation

Any problem designed to compare two decision theories should at least give the same payoffs and probabilities for each decision theory. Therefore, here's a more fair version of Procreation:

Procreation*. I wonder whether to procreate. I know for sure that doing so would make my life miserable. But I also have reason to believe that my father faced the exact same choice, and that he followed my very decision procedure. If I were to not procreate, there's a significant probability that I wouldn't exist. I highly value existing (even miserably existing).

Procreation* gives both FDT and CDT agents (and indeed, all agents) the same dilemma. FDT agents procreate and live miserably; CDT agents don't procreate and almost certainly don't exist. FDT beats CDT in this dilemma.

## The Tweak the Utility Function Argument

Alright, this one is not targeted at FDT per se, but it's still important to discuss, as it might hinder further development of FDT. In On Functional Decision Theory, Wolfgang Schwarz argues that where CDT makes the less-than-optimal decision, the trick is not to develop a new decision theory, but to *tweak the utility function*. I want to emphasize just how much this does *not* fix the problem. If your game AI doesn't play chess very well, the right thing to do is to *improve your algorithm*, not to define the opening position of chess as a winning position for your AI.

For example, Schwarz argues that on the Psychological Twin Prisoner's Dilemma, the agent should care about her twin's prison years as well. If the agent cares about her and her twin's prison years equally, then, based on these prison years, the payoff matrix becomes something like this:

Now cooperating is easily the best choice for CDT. Schwarz notes that if he "were to build an agent with the goal that they do well for themselves, I'd give them this kind of utility function, rather than implement FDT." Of course you'd give them an altruistic utility function! However, CDT *still* doesn't solve the Psychological Twin Prisoner's Dilemma. It only fixes the version with the modified utilities, which is a completely different problem (e.g. it has a different Nash equilibrium). You may argue that a CDT agent with an altruistic utility function wouldn't ever come across the original version of the problem - but the very fact that it can't solve that relatively easy problem points at a serious flaw in its decision theory (CDT). It also suggests this isn't the only problem CDT doesn't solve correctly. This is indeed the case, and Schwarz goes on to make an ad hoc adjustment for CDT to solve Blackmail:

Here, Schwarz suggests Donald should have a "strong sense of pride" or a "vengeful streak" in order to avoid being blackmailed. (Note that an altruistic player wouldn't prefer not being blackmailed over paying Stormy.) The point is this: if your decision theory requires ad hoc fixes in the utility function, it's *not* a good decision theory.

Schwarz:

Well, and have a vengeful streak, or pride, apparently. Altruism doesn't solve it all, it seems.
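Schwarz's utility-function move can be made concrete - a sketch with illustrative prison-year payoffs (the exact numbers here are my assumption, not taken from Schwarz's post):

```python
# Psychological Twin Prisoner's Dilemma with illustrative payoffs in
# prison years (fewer is better). years[(my_move, twin_move)] gives
# (my_years, twin_years); "C" = cooperate (stay silent), "D" = defect.
years = {
    ("C", "C"): (1, 1),
    ("C", "D"): (10, 0),
    ("D", "C"): (0, 10),
    ("D", "D"): (6, 6),
}

def selfish_utility(mine, twins):
    return -mine                  # the original problem's utility

def altruistic_utility(mine, twins):
    return -(mine + twins)        # Schwarz's tweak: care equally

# With selfish utilities, defecting dominates - the original dilemma...
for twin in ("C", "D"):
    assert selfish_utility(*years[("D", twin)]) > \
           selfish_utility(*years[("C", twin)])

# ...but with altruistic utilities, cooperating dominates: the tweak
# has changed the game itself, not solved the original one.
for twin in ("C", "D"):
    assert altruistic_utility(*years[("C", twin)]) > \
           altruistic_utility(*years[("D", twin)])
```

The dominance flip happens entirely in the utility function; CDT's reasoning is unchanged, which is why the original dilemma remains unsolved.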

If your decision theory can't solve Newcomb's Problem, that's probably a sign there are more problems it can't solve. Indeed, Newcomblike problems are the norm.

## Argument Against Subjunctive Dependence

From A Critique of Functional Decision Theory:

## Analysis and Rebuttal

The first thing to say here is that FDT's subjunctive dependence is about *functions*, not *algorithms*: for example, counting sort and Quicksort are both sorting algorithms for the same function. However, the argument works the same if we replace "algorithm" with "function." But perhaps most importantly, the properties of a calculator (or anything, really) can't depend on how we interpret its output, *because* different people can interpret it differently. Therefore, the calculators in the example are implementing different functions: one of them maps "2 + 2" to "4", the other maps "2 + 2" to "-4". However, it does seem the second one uses the function of the first one as a "subfunction": it needs to know the "real" answer to "2 + 2" in order to output "-4". Therefore, the calculators *are* subjunctively dependent on that subfunction, even though their outputs are different. Even if the second calculator always outputs "[output of first calculator] + 1", the calculators are still subjunctively dependent on that same function.

In Newcomb's Problem, the idea seems to be that the predictor uses a model of your decision procedure that does use the same outputs as you, in which case the predictor is computing the same function as the agent. But, like with the calculators, even if the outputs are phrased differently, subjunctive dependence can still exist. It is of course up to the predictor how she interprets the outputs of the model, but there is a clearly "right" way to interpret them, *given* that there is (full) subjunctive dependence going on between the agent and the predictor.

## The Agent-y Argument

Also in A Critique of Functional Decision Theory, MacAskill makes an argument that hinges on how "agent-y" a process is:

## Analysis and Rebuttal

The crucial error here is that whether "there's an agent making predictions" is *not* the relevant factor for FDT. What matters is subjunctive dependence: two physical systems computing the same function. This definition doesn't care about any of these systems being agents. So:

No. The problem remains the same as far as FDT is concerned (although maybe some uncertainty is added with the agent). There is no subjunctive dependence in this case, and adding the agent like this doesn't help, as it isn't computing the same function as the main agent in the problem.

The rebuttal of MacAskill's second example, about S becoming gradually more "agent-y", is mostly the same: agent-ness doesn't matter. However:

Why? I mean, there's no sharp jump anyway (because there's no subjunctive dependence), but in general, a tiny change in physical makeup *can* make a difference. For example, in Newcomb's Problem, if the accuracy of the predictor drops below a threshold, two-boxing "suddenly" becomes the better choice. I can imagine a tiny change in physical makeup causing the predictor to predict just a little less accurately, dropping the accuracy from just above to just below the threshold.

## Final Notes

In conclusion, none of the above arguments successfully undermine FDT. So far, it seems FDT gets right everything CDT gets right, while also getting right everything EDT gets right - and all of that using a very plausible concept. Subjunctive dependence is a real thing: you *know* one calculator will output "5040" on "7!" if you just gave "7!" to another, identical calculator. FDT needs to be developed further, but it certainly withstands the criticisms.