This is crossposted from my blog. While I think the ideas here are solid, I think the presentation still needs some work, so I'd also appreciate comments on the presentation so I can turn this into a more polished essay, e.g., is the second section worth keeping, and what should be expanded/cut?

With the recent flurry of posts in the rationalist community about which decision theory (e.g. CDT, EDT, UDT, etc.) is best, it's time to revisit the theme of this blog: rejecting rationality realism. In this case that means pointing out that there isn't actually a well-defined fact of the matter about which decision theory is better. Of course, nothing stops us from arguing with each other about the best decision theory but those disagreements are more like debates about what's the best programming language than disagreements about the chemical structure of Benzene.

Any attempt to compare decision theories must first address the question: what does it mean for one decision theory to be better than another? Unlike many pseudo-problems, there is a seemingly meaningful answer to this question: one decision theory is better than another to the extent that the choices it recommends lead to better outcomes for the agent. Other than some ambiguity about which theory is better if neither dominates the other, it seems like this gives a straightforward criterion for superiority: we just look at actual outcomes and see which decision theory offers the best results for an agent. However, this only appears to give a well-defined criterion because in everyday life the subtle differences between the various ways to understand a choice and how to conceptualize making a choice don't matter.

In particular, I argue here that the kind of scenarios which distinguish between theories like EDT and CDT are hazardous places to deploy our raw intuitions and support different theories depending on how exactly one precisifies the question being asked. As such, there isn't a well-defined fact of the matter as to which decision theory is right or the best as we only appear to have specified a clear conception of what it means for one theory to be better than another. This doesn't mean the various decision theories aren't interesting notions to play with but it does mean there is no deep fact to get at about which one is right.

Intuitions and Motivation

So why would one even suspect there is a fact of the matter about which decision theory is best/true? I mean we wouldn't suspect there is a fact of the matter about which formal model of computation (Turing machines, register machines, DFAs, Quantum Turing Machines etc..) is the true model. They are all valid formal constructs and we use whatever one seems most appropriate to a given question. So why would we suspect decision theory to be any different?

My best guess is that people have some intuition like the following:

I can mentally walk through the predictions of various decision theories (even in Newcomb style problems) and seeing who gets what makes it seem obvious that one theory is better than another. Nothing about what counts as a choice or how I idealize those choices seems to make a difference.

To convince you that's not enough, let me provide an extreme dramatization of how our intuitive conception of what choice would be better for you or me to make can come apart from the apparent formal payouts of a decision theory. Consider the following Newtonian, rather than Newcombian, problem. You fall off the top of the Empire State Building. What do you do as you fall past the fifth floor? What would one say about the virtues of Floating Decision Theory, which (otherwise being equal to, say, CDT) tells us that in such a situation we should make the choice to float gently to the ground? Now obviously, one would prefer to float rather than fall, but posing the problem as a decision between these two choices doesn't render it a real choice. Obviously, there is something dubious about evaluating your decision theory based on its performance on the float/fall question.

The point of this example is to illustrate that when the supposed choices given by our idealization are sufficiently different from what we normally think of as a choice, we can't punt on precisifying what it means for a choice (or a decision theory) to be better than another and simply refer back to our informal intuitions about choices. After all, choosing to float doesn't make sense as a choice in our intuitive sense, so how can we assume that intuitive sense steps in and tells us what it means for one theory to be better than another in such a situation?

Yet this is precisely the kind of situation we encounter in the original Newcomb problem, as the very assumption of predictability which allows the demon (or, in Yudkowsky's formulation, Omega) to favor the one-boxers ensures the physical impossibility of choosing any number of boxes other than what you did choose. Of course, the same is (up to quantum mechanical randomness) true of any actual 'choice' by a real person, but under certain circumstances we find it useful to idealize it as free choice. What's different about the Newcomb problem is that, understood naively, it simultaneously asks us to idealize selecting one or two boxes as a free choice while assuming it isn't actually. In other words, here too we can't simply punt back to our intuitions about one choice being better than another and assume it gives a clear answer.

Once we are willing to grant that it's not enough to just presume that our gut instincts about choices give rise to a well-defined notion of what it means for one decision theory to be better than another and start trying to precisify what that means it quickly becomes clear there is more than one way to do that.

Possible Precisifications

Ultimately, there is something a bit weird about asking what decision a real physical agent should take in a given situation. After all, the agent will act just as its software dictates and/or the laws of physics require. Thus, as Yudkowsky recognizes, any comparison of decision theories is asking some kind of counterfactual. However, which counterfactual we ask makes a huge difference in what decision theory is preferable. For instance, all of the following are potential ways to precisify the question of what it means for XDT to be a better decision theory than YDT.

1. If there was a miracle that overrode the agent's programming/physical laws at the moment of a choice then doing so in the manner prescribed by XDT yields better outcomes than doing so in a manner prescribed by YDT.

2. In fact those actual agents who more often choose the outcome favored by XDT do better than those who choose the outcome favored by YDT.

3. Those actual agents which adopt/apply XDT do better than those who adopt/apply YDT.

4. Suppose there is a miracle that overrode physical laws at the moment the agent's programming/internal makeup is specified then if the miracle results in outcomes more consistent with XDT than YDT the agent does better.

5. As above except with applying XDT/YDT instead of just favoring outcomes which tend to agree with it.

6. Moving one level up we could ask about which performs better, agents whose programming inclines them to adopt XDT or YDT when considered.

7. Finally, if what we are interested in is actually coding agents, i.e., writing AI software, we might ask whether programmers who code their agents to reason in a manner that prefers choice A produce agents that do better than programmers who code agents to reason in a manner that prefers choice B.

8. Pushing that one level up we could ask about whether programmers who are inclined to adopt/apply XDT/YDT as true produce agents which do better.

One could continue and list far more possibilities, but these eight are enough to illustrate the point that there are multiple different kinds of questions one might want a decision theory to answer. Importantly, depending on which one we choose, we get different answers as to which theory is preferable.

For instance, note that if we are asking question 1, CDT outperforms EDT. For the purposes of question 1, the right answer to the Newcomb problem is to be a two-boxer. After all, if we idealize the choice as a miracle that allows deviation from physical law, then the demon's prediction of whether we would be a two-boxer or one-boxer no longer must be accurate, so two-boxing always outperforms one-boxing. It doesn't matter that your software says you will choose only one box if we are asking about outcomes where a miracle occurs and overrides that software.

On the other hand it's clearly true that EDT does better than CDT with respect to question 2. That's essentially the definition of EDT.
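
To make the contrast between questions 1 and 2 concrete, here is a minimal sketch. It assumes the conventional Newcomb payoffs ($1,000,000 in the opaque box if one-boxing was predicted, $1,000 in the transparent box), which the text above doesn't actually state, and the function names are made up for illustration. Under the miracle reading the prediction is pinned to the agent's programming while the action varies freely; under the actual-agents reading the prediction tracks the action.

```python
# Conventional Newcomb payoffs (an assumption; the post gives no amounts).
OPAQUE_PRIZE = 1_000_000  # in the opaque box iff the demon predicted one-boxing
CLEAR_PRIZE = 1_000       # always sitting in the transparent box

ACTIONS = ("one-box", "two-box")


def payoff(action: str, prediction: str) -> int:
    """Money received for taking `action` when the demon predicted `prediction`."""
    opaque = OPAQUE_PRIZE if prediction == "one-box" else 0
    clear = CLEAR_PRIZE if action == "two-box" else 0
    return opaque + clear


def question_1(programming: str) -> dict:
    """Miracle reading: the demon predicted what the agent's programming
    dictates, and a miracle then sets the action independently of it."""
    return {action: payoff(action, prediction=programming) for action in ACTIONS}


def question_2(accuracy: float = 1.0) -> dict:
    """Actual-agents reading: the prediction matches the action actually
    taken with probability `accuracy` (hypothetical parameter)."""
    other = {"one-box": "two-box", "two-box": "one-box"}
    return {
        action: accuracy * payoff(action, action)
        + (1 - accuracy) * payoff(action, other[action])
        for action in ACTIONS
    }


if __name__ == "__main__":
    for programming in ACTIONS:
        print("question 1, programmed to", programming, "->", question_1(programming))
    print("question 2, perfect predictor ->", question_2())
```

Whatever the agent is programmed to do, the miracle reading ranks two-boxing exactly $1,000 higher; with a sufficiently accurate predictor, the actual-agents reading ranks one-boxing higher. The arithmetic is the same in both cases; only the counterfactual changes.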

To distinguish the remaining options we need to consider a range of different scenarios, such as demons who punish agents who actually apply/adopt XDT/YDT in reaching their conclusions. Or consider Newcombian demons who punish agents who adopt (or whose programmers adopted) one of XDT/YDT. But the important point is that the assumptions we make about what it means for one theory to be better than another, and about the kind of problems the agent will face, yield different answers for the 'right' way for the agent to behave.

Ultimately, which criterion we should use to compare decision theories depends on what we want to achieve. Different idealizations/criteria will be appropriate depending on whether we are asking which rule we ourselves should adopt, how we should program agents to act, how we should program agents who program agents, etc. Once we've precisified what it is we want out of our decision theory sufficiently well to make the question of which one is the best well-defined, there won't be anything left to debate about, e.g., it's analytic that CDT is the theory which yields the best outcomes if we take the nature of the agent (source code/physical makeup) to be fixed but idealize decisions as miracles that temporarily suspend the normal causal rules (allowing agents to choose things Newcombian demons predict they wouldn't).

Is Realism Impossible?

I don't take myself to have offered a proof that it's impossible to ever believe there are facts of the matter about the true decision theory. I have merely offered a strong prima facie case that there probably isn't such a fact. After all, it's always possible that, like the argument that justified true belief isn't knowledge, someone will pop up and show that there really was a precise criterion for preferability/truth of decision theories implicit in the usage of every competent English speaker. But even if such a hidden fact about our linguistic commitments was found, it wouldn't really tell us anything significant about the world. We would do better simply spelling out in detail what it is we seek to claim (e.g. the decisions which maximize outcome under the assumption that choices can be idealized as being little miracles) and tossing aside, as probably meaningless and at best unimportant, the question of what the best decision theory is. If there is some aspect of reality that such an attitude seems to be ignoring, the burden is on those who believe this to offer evidence.


Of course, nothing stops us from arguing with each other about the best decision theory but those disagreements are more like debates about what’s the best programming language than disagreements about the chemical structure of Benzene.

A better comparison is to the question, in the absence of modern chemistry, of what the best ontology is for thinking about physical materials and how to make useful physical materials.

Before the discovery of modern chemical theories (the elements, molecules, etc), people had folk concepts of what materials are. They had different ways of thinking about and working with them (e.g. the classical elements, metalworking, distillation, cooking, alchemy).

To them it might have seemed that there was no best ontology for thinking about physical materials; there are different models that explain different phenomena and are practical in different contexts, but no overarching theory to rule them all. And that's true to some degree. Even after the discovery of modern chemistry, we still haven't figured out the best way to think about and work with all possible materials (as evidenced by the fact that materials science is still a live field, and nanotechnology is not a thing yet). But, modern chemistry constituted the discovery of an extremely useful and precise abstraction for modelling physical materials which revolutionized the field. There is no comparison between the accuracy/usefulness of those ontologies that came before modern chemistry and those that came after.

Is there reason to expect the field of decision theory to be similar to the field of chemistry? Decision theory has seen a great deal of progress throughout history, with the discovery of probability theory, VNM utility theory, Nash equilibrium, and so on. These abstractions are pervasive throughout the field and are extremely useful in thinking through decision problems, indicating that something similar to the invention of modern chemistry has already happened in the field of decision theory. The question is, is there reason to expect that there are more of such abstractions that have not been discovered yet? I think, based on the level of current confusion and the amount of recent progress (e.g. UDT, proof-based UDT, reflective oracles, logical inductors, COEDT), the answer is "hell yes".

Chemistry didn't solve ontology, it passed the buck to physics. We don't know whether the correct ontology of physics is fields, particles, waves, pure information, etc.

I suspect that you are saying something similar to what I have talked about: there is no magical way to override an agent's programming or physical laws; what you do is observe what happens, which sometimes feels like making decisions. You can no more change what happens than use your "Floating Decision Theory".

Robot cars decide their actions. You can look at the code that makes those decisions, and if it isn't a mess of NNs you can see exactly how it makes them. They do not discover what their actions are and then "feel like they made a decision". The code has causal effect upon the world outside the car.

Why should that be any less true for people?

Yes, in the sense of "follow their programming", robot cars do decide their actions. They cannot suddenly decide to make an action different from what their code, coupled with external inputs, forces them to. There is no top-down causation. People are exactly the same, except that those inputs and programming change the programming on the fly. But that can also be true of robot cars.

A robot car's code does not force it to do anything. What the code does is what the robot does. As for top-down causation, if things that are made of parts exist, top-down causation exists. It is not separate from bottom-up causation; it is the same thing.

Obviously, there is something dubious about evaluating your decision theory based on its performance on the float/fall question.

Decision theory is not about what decisions you will encounter, but how to handle those you do encounter. If you are presented with the option of floating, that is the one to take. That you will not be presented with it is not a defect of the decision theory. We can still talk sensibly about what to do if you suddenly remembered you were wearing flying boots.

The question of what range of decisions a decision theory should handle is not a question for that decision theory, unless you're going into the further reaches of exotically self-reflexive decision theory. In which case, you would have to be doing mathematics (as some people are), instead of elaborating verbal formulations.

Technicalities: Under Possible Precisifications, 1 and 5 are not obviously different. I can interpret them differently, but I think you should clarify them. 2 is to 3 as 4 is to 1, so I suggest listing them in that order, and maybe adding an option that is to 3 as 5 is to 1.

Substance: I think you're passing over a bigger target for criticism, the notion of "outcomes". In general, agents can and do have preferences over decision processes themselves, as contrasted with the standard "outcomes" of most literature like winning or losing money or objects. For example, I can be "money pumped" in the following manner. Sell me a used luxury sedan on Monday for $10k. Trade me a Harley Davidson on Tuesday for the sedan plus my $5. Trade me a sports car on Wednesday for the Harley plus $5. Buy the sports car from me on Thursday for $9995. Oh no, I lost $15 on the total deal! Except: I got to drive, or even just admire, these different vehicles in the meantime.
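
As a rough sketch of the arithmetic in that example: the cash flows below are taken from the trades as described, while the per-vehicle enjoyment value is a made-up parameter added only to show how the verdict can flip once the process itself is valued.

```python
# Cash flows from the example, from the point of view of the person being "pumped".
cash_flows = [
    ("Monday: buy the sedan", -10_000),
    ("Tuesday: sedan plus $5 for the Harley", -5),
    ("Wednesday: Harley plus $5 for the sports car", -5),
    ("Thursday: sell the sports car", 9_995),
]

# Net money over the whole cycle.
net_cash = sum(amount for _, amount in cash_flows)


def net_value(enjoyment_per_vehicle: float) -> float:
    """Net result if each of the three vehicles is worth
    `enjoyment_per_vehicle` dollars of enjoyment (hypothetical parameter)."""
    return net_cash + 3 * enjoyment_per_vehicle


if __name__ == "__main__":
    print("net cash:", net_cash)
    print("net value at $10 of enjoyment per vehicle:", net_value(10))
```

Counting money alone, the cycle is a $15 loss; on this toy accounting, any enjoyment above $5 per vehicle makes it a net gain.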

If all processes and activities are fair game for rational preferences, then agents can have preferences over the riskiness of decisions, the complexity of the decision algorithm, and a host of other features that make it much more individually variable which approach is "best".

"I mean we wouldn't suspect there is a fact of the matter about which formal model of computation (Turing machines, register machines, DFAs, Quantum Turing Machines etc..) is the true model. They are all valid formal constructs and we use whatever one seems most appropriate to a given question. So why would we suspect decision theory to be any different?"

We also have an entire field of study devoted to characterizing exactly when one model is equivalent to another, the power of each, etc. I would suspect decision theory to be very similar, and would expect to end up with a similar poset zoo, in particular claiming that certain decision theories are just straight up better than others assuming you have the power to implement them.

I find it disappointing that this post has not received more attention. (If it has elsewhere, please let me know.)


Re: Jessica, I find the comparison to chemistry only accurate if we are doing decision theory to make useful agents, helpful predictions, or whatever -- just as chemistry was used to "make useful physical materials." 

But this does not seem to be the point of most discussion about decision theory. If debates were framed in terms of "what sort of decision theory should we build into the choice-making architecture of an agent if our goal is to ___ (maximize utility, etc.)," then the discussion would look very different.

We wouldn't, for instance, have people arguing about whether we should engage in multiverse-wide cooperation because of evidential decision theory. In my eyes, that discussion seems suspiciously about some independent fact of the matter about rationality, and does not seem to be treating decision theory as just a useful abstraction.


Re: Guy

Likewise, in my eyes this does not seem like the point of most discussion about decision theory. People do not seem to be just comparing different formal models, saying when they are equivalent, etc. Your last sentence seems to steer away from this view, as you say "certain decision theories are just straight up better than others assuming you have the power to implement them." But this talk of being "straight up better" is precisely the sort of realism that we are challenging. What does it mean for one decision theory to be better? Don't we need to specify some criteria of evaluation?