This is crossposted from my blog. While I think the ideas here are solid, I think the presentation still needs some work, so I'd also appreciate comments on the presentation so I can turn this into a more polished essay, e.g., is the second section worth keeping, and what should be expanded or cut?
With the recent flurry of posts in the rationalist community about which decision theory (e.g., CDT, EDT, UDT, etc.) is best, it's time to revisit the theme of this blog: rejecting rationality realism. In this case that means pointing out that there isn't actually a well-defined fact of the matter about which decision theory is better. Of course, nothing stops us from arguing with each other about the best decision theory, but those disagreements are more like debates about what's the best programming language than disagreements about the chemical structure of benzene.
Any attempt to compare decision theories must first address the question: what does it mean for one decision theory to be better than another? Unlike many pseudo-problems, there is a seemingly meaningful answer to this question: one decision theory is better than another to the extent that the choices it recommends lead to better outcomes for the agent. Other than some ambiguity about which theory is better if neither dominates the other, it seems like this gives a straightforward criterion for superiority: we just look at actual outcomes and see which decision theory offers the best results for an agent. However, this only appears to give a well-defined criterion because in everyday life the subtle differences between the various ways to understand a choice, and how to conceptualize making a choice, don't matter.
In particular, I argue here that the kind of scenarios which distinguish between theories like EDT and CDT are hazardous places to deploy our raw intuitions, and that they support different theories depending on exactly how one precisifies the question being asked. As such, there isn't a well-defined fact of the matter as to which decision theory is right or best, since we only appear to have specified a clear conception of what it means for one theory to be better than another. This doesn't mean the various decision theories aren't interesting notions to play with, but it does mean there is no deep fact to get at about which one is right.
Intuitions and Motivation
So why would one even suspect there is a fact of the matter about which decision theory is best/true? I mean, we wouldn't suspect there is a fact of the matter about which formal model of computation (Turing machines, register machines, DFAs, quantum Turing machines, etc.) is the true model. They are all valid formal constructs, and we use whichever one seems most appropriate to a given question. So why would we suspect decision theory to be any different?
My best guess is that people have some intuition like the following:
I can mentally walk through the predictions of various decision theories (even in Newcomb style problems) and seeing who gets what makes it seem obvious that one theory is better than another. Nothing about what counts as a choice or how I idealize those choices seems to make a difference.
To convince you that's not enough, let me provide an extreme dramatization of how our intuitive conception of what choice would be better for you or me to make can come apart from the apparent formal payouts of a decision theory. Consider the following Newtonian, rather than Newcombian, problem. You fall off the top of the Empire State Building. What do you do as you fall past the fifth floor? What would one say about the virtues of Floating Decision Theory which (otherwise being equal to, say, CDT) tells us that in such a situation we should make the choice to float gently to the ground? Now obviously, one would prefer to float rather than fall, but posing the problem as a decision between these two choices doesn't render it a real choice. Obviously, there is something dubious about evaluating your decision theory based on its performance on the float/fall question.
The point of this example is to illustrate that when the supposed choices given by our idealization are sufficiently different from what we normally think of as a choice, we can't punt on precisifying what it means for a choice (or a decision theory) to be better than another and simply refer back to our informal intuitions about choices. After all, choosing to float doesn't make sense as a choice in our intuitive sense, so how can we assume that intuitive sense steps in and tells us what it means for one theory to be better than another in such a situation?
Yet this is precisely the kind of situation we encounter in the original Newcomb problem, as the very assumption of predictability which allows the demon (or, in Yudkowsky's formulation, Omega) to favor the one-boxers ensures the physical impossibility of choosing any number of boxes other than what you did choose. Of course, the same is (up to quantum mechanical randomness) true of any actual 'choice' by a real person, but under certain circumstances we find it useful to idealize it as a free choice. What's different about the Newcomb problem is that, understood naively, it simultaneously asks us to idealize selecting one or two boxes as a free choice while assuming it isn't actually free. In other words, here too we can't simply punt back to our intuitions about one choice being better than another and assume they give a clear answer.
Once we are willing to grant that it's not enough to just presume that our gut instincts about choices give rise to a well-defined notion of what it means for one decision theory to be better than another, and start trying to precisify what that means, it quickly becomes clear there is more than one way to do that.
Ultimately, there is something a bit weird about asking what decision a real physical agent should take in a given situation. After all, the agent will act just as its software dictates and/or the laws of physics require. Thus, as Yudkowsky recognizes, any comparison of decision theories is asking some kind of counterfactual. However, which counterfactual we ask makes a huge difference in what decision theory is preferable. For instance, all of the following are potential ways to precisify the question of what it means for XDT to be a better decision theory than YDT.
1. If there was a miracle that overrode the agent's programming/physical laws at the moment of a choice, then having the miracle select the choice prescribed by XDT yields better outcomes than having it select the choice prescribed by YDT.
2. In fact those actual agents who more often choose the outcome favored by XDT do better than those who choose the outcome favored by YDT.
3. Those actual agents which adopt/apply XDT do better than those who adopt/apply YDT.
4. Suppose a miracle overrode physical laws at the moment the agent's programming/internal makeup is specified. Then if the miracle results in outcomes more consistent with XDT than YDT, the agent does better.
5. As above except with applying XDT/YDT instead of just favoring outcomes which tend to agree with it.
6. Moving one level up, we could ask which agents perform better: those whose programming inclines them to adopt XDT when they consider it, or those whose programming inclines them to adopt YDT.
7. If what we are interested in is actually coding agents, i.e., writing AI software, we might ask whether programmers who code their agents to reason in a manner that prefers choice A produce agents that do better than programmers who code agents to reason in a manner that prefers choice B.
8. Pushing that one level up we could ask about whether programmers who are inclined to adopt/apply XDT/YDT as true produce agents which do better.
One could continue and list far more possibilities, but these eight are enough to illustrate the point that there are multiple different kinds of questions one might want a decision theory to answer. Importantly, depending on which one we choose, we get different answers as to which theory is preferable.
For instance, note that if we are asking question 1, CDT outperforms EDT. For the purposes of question 1, the right answer to the Newcomb problem is to be a two-boxer. After all, if we idealize the choice as a miracle that allows deviation from physical law, then the demon's prediction of whether we would be a two-boxer or one-boxer no longer has to be accurate, so two-boxing always outperforms one-boxing. It doesn't matter that your software says you will choose only one box if we are asking about outcomes where a miracle occurs and overrides that software.
On the other hand it's clearly true that EDT does better than CDT with respect to question 2. That's essentially the definition of EDT.
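To make the contrast between questions 1 and 2 concrete, here is a minimal sketch in Python. The dollar amounts are the conventional Newcomb payoffs rather than anything fixed by this post, and the function names and the `accuracy` parameter are my own illustrative assumptions. Question 1 is modeled by holding the demon's prediction fixed and varying the action; question 2 is modeled by averaging over actual agents whose action the demon predicts correctly some fraction of the time.

```python
# Conventional Newcomb payoffs (an assumption; the essay does not fix amounts).
OPAQUE_BOX = 1_000_000  # placed in the opaque box only if one-boxing was predicted
CLEAR_BOX = 1_000       # always sitting in the transparent box


def payoff(action: str, prediction: str) -> int:
    """Money received for an action ("one" or "two") given the demon's prediction."""
    opaque = OPAQUE_BOX if prediction == "one" else 0
    return opaque + (CLEAR_BOX if action == "two" else 0)


def miracle_gain(prediction: str) -> int:
    """Question 1: hold the prediction fixed and let a 'miracle' set the action.

    Returns how much more two-boxing pays than one-boxing.
    """
    return payoff("two", prediction) - payoff("one", prediction)


def population_average(action: str, accuracy: float) -> float:
    """Question 2: average payoff among actual agents who take `action`,
    when the demon predicts a fraction `accuracy` of them correctly."""
    wrong_prediction = "two" if action == "one" else "one"
    return (accuracy * payoff(action, action)
            + (1 - accuracy) * payoff(action, wrong_prediction))


if __name__ == "__main__":
    for prediction in ("one", "two"):
        print("miracle gain given prediction", prediction, miracle_gain(prediction))
    for action in ("one", "two"):
        print("average for", action, "boxers", population_average(action, accuracy=0.99))
```

Under these assumptions, two-boxing comes out ahead by exactly the contents of the transparent box whichever prediction is held fixed, while among actual agents facing a highly accurate demon the one-boxers average far more. The same payoff table supports both verdicts; only the counterfactual being asked has changed.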
To distinguish the remaining options we need to consider a range of different scenarios, such as demons who punish agents who actually apply/adopt XDT/YDT in reaching their conclusions. Or consider Newcombian demons who punish agents who adopt (or whose programmers adopted) one of XDT/YDT. But the important point is that the assumptions we make about what it means for one theory to be better than another, and about the kind of problems the agent will face, yield different answers for the 'right' way for the agent to behave.
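As a toy illustration of how such demons pull the criteria apart, the sketch below (again Python, with hypothetical names and made-up payoffs) compares a demon that only looks at what an agent does with a demon that looks at which theory the agent actually ran. Two agents that take the same action are indistinguishable to the first demon but not to the second, so a criterion stated in terms of outcomes chosen (question 2) and one stated in terms of the theory applied (question 3) can rank XDT and YDT differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    procedure: str  # label for the theory the agent actually runs, e.g. "XDT"
    action: str     # what the agent ends up doing, e.g. "A" or "B"


def action_demon(agent: Agent) -> int:
    """Rewards the action taken and ignores how it was reached (payoffs are made up)."""
    return 100 if agent.action == "A" else 0


def procedure_demon(agent: Agent, punished: str = "XDT") -> int:
    """Punishes any agent that ran the `punished` theory, whatever it then did."""
    return 0 if agent.procedure == punished else 100


# Two hypothetical agents that reach the same action by different routes.
xdt_agent = Agent(procedure="XDT", action="A")
ydt_agent = Agent(procedure="YDT", action="A")

if __name__ == "__main__":
    for agent in (xdt_agent, ydt_agent):
        print(agent, action_demon(agent), procedure_demon(agent))
```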
Ultimately, which criterion we should use to compare decision theories depends on what we want to achieve. Different idealizations/criteria will be appropriate depending on whether we are asking which rule we ourselves should adopt, how we should program agents to act, how we should program agents who program agents, etc. Any time we've precisified what it is we want out of our decision theory sufficiently well to make the question of which one is best well-defined, there won't be anything left to debate. E.g., it's analytic that CDT is the theory which yields the best outcomes if we take the nature of the agent (source code/physical makeup) to be fixed but idealize decisions as miracles that temporarily suspend the normal causal rules (allowing agents to choose things Newcombian demons predict they wouldn't).
Is Realism Impossible?
I don't take myself to have offered a proof that it's impossible to ever believe there are facts of the matter about the true decision theory. I have merely offered a strong prima facie case that there probably isn't such a fact. After all, it's always possible that, like the argument that justified true belief isn't knowledge, someone will pop up and show that there really was a precise criterion for preferability/truth of decision theories implicit in the usage of every competent English speaker. But even if such a hidden fact about our linguistic commitments was found, it wouldn't really tell us anything significant about the world. We would do better simply spelling out in detail what it is we seek to claim (e.g., the decisions which maximize outcomes under the assumption that choices can be idealized as being little miracles) and tossing aside the question of what the best decision theory is as probably meaningless and at best unimportant. If there is some aspect of reality that such an attitude seems to be ignoring, the burden is on those who believe this to offer evidence.