# 2

Finding a good decision theory is hard. Previous attempts, such as Timeless Decision Theory (TDT), seem to provide a stable, effective decision theory, but are mathematically complicated. Simpler theories, like Causal Decision Theory (CDT) or Evidential Decision Theory (EDT), are much more intuitive, but have deep flaws. They fail at certain problems, and thus violate the maxim that rational agents should win.

But it seems to me that there is a relatively simple fix one could make to them, in the style of TDT, to extend their power considerably. Here I will show an implementation of such an extension of CDT, one that wins on the problems that classic CDT fails on. It may well turn out that this is not as powerful as TDT, but it is a significant step in that direction, starting only from the naivest of decision theories. It could also turn out that this is nothing more than a reformulation of TDT, or a lesser version thereof. In that case, it still has some value as a simpler formulation that is easier to understand, because as it stands, TDT seems like a far cry from a trivial extension of the basic, intuitive decision theories, which is what this hopes to be.

We will start by remarking that when CDT (or EDT) tries to figure out the expected value of an action or outcome, the naive way in which it does so drops crucial information, which is what TDT manages to preserve. As such, I will try to construct a version of CDT that does not drop this information. For CDT, this information is the fact that Omega has simulated you and figured out what you are going to do. Why does a CDT agent automatically assume that it is the "real" one, so to speak? This trivial tweak seems powerful. I will, for the purpose of this post, call this tweaked version of CDT "Simulationist Causal Decision Theory", or SCDT for short.

Let's run this tweaked version through Newcomb's problem. Let Alice be an SCDT agent. Before the problem begins, as is standard in Newcomb's problem, Omega looks at Alice and calculates what choice Alice will make in the game. Without too much loss of generality, we can assume that Omega directly simulates Alice, and runs the simulation through a simulation of the game, in order to determine what choice Alice will make. In other formulations of Newcomb's problem, Omega figures out what Alice will do in some other way, say by doing a formal analysis of her source code, but that seems intuitively equivalent. This is a possible flaw, but if the different versions of Newcomb's problem are equivalent (as they seem to be) this point evaporates, and so we will put it aside for now, and continue.

We will call the simulated agent SimAlice. SimAlice does not know, of course, that she is being simulated, and is an exact copy of Alice in all respects. In particular, she also uses the same SCDT thought processes as Alice, and she has the same utility function as Alice.

So, Alice (or SimAlice, she doesn't know which one she is) is presented with the game. She reasons thusly:

There are two possible cases: Either I am Alice or I am SimAlice.

• If I am Alice: Choosing both boxes will always get me exactly \$1000 more than choosing just one. Regardless of whether or not there is \$1,000,000 in box 2, by choosing box 1 as well, I am getting an extra \$1000. (Note that this is exactly the same reasoning standard CDT uses!)
• If I am SimAlice: Then "I" don't actually get any money in this game, regardless of what I choose. But my goal is not SimAlice getting money, it is Alice getting money, by the simple fact that this is what Alice wants, and we assumed above that SimAlice uses the same utility function as Alice. And what I choose now will affect the way Omega sets up the boxes, and so affects the amount of money Alice will get. Specifically, if I one box, Omega will put an extra \$1,000,000 in box 2, and so Alice will get an extra \$1,000,000, no matter what she chooses. (Because in both the choices Alice could make (taking either box 2 or boxes 1&2), she takes box 2, and so will wind up with a bonus \$1,000,000 above what she would get if box 2 was empty, which is what would happen if SimAlice two boxed.)
So, as I don't know whether I am Alice or SimAlice, and as there is one of each, there is a 0.5 probability of me being either one, so by the law of total expectation,
E[money|I one box] = 0.5 * E[money|(I one box)&(I am Alice)] + 0.5 * E[money|(I one box)&(I am SimAlice)]
So my expected return from one boxing (above what I would get by two boxing) is 0.5 * -\$1000 + 0.5 * \$1,000,000 = \$499,500, which is positive, so I should one box.
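
As a minimal sketch of the calculation Alice just performed, the following Python computes the expected gain of one boxing over two boxing. The function name `scdt_one_box_advantage` and the parameter `p_sim` (the probability Alice assigns to being SimAlice) are hypothetical names introduced here for illustration; the post itself only uses the fixed value 0.5.

```python
def scdt_one_box_advantage(p_sim=0.5, small_prize=1_000, big_prize=1_000_000):
    """Expected gain (in dollars for the real Alice) of one boxing over two boxing.

    p_sim is an illustrative parameter: the probability the reasoner assigns
    to being the simulation. The post assumes 0.5 (one Alice, one SimAlice).
    """
    # If I am the real Alice, the boxes are already fixed, so one boxing
    # just forgoes the contents of box 1.
    gain_if_real = -small_prize
    # If I am SimAlice, my one boxing causes Omega to fill box 2,
    # which the real Alice takes whatever she chooses.
    gain_if_sim = big_prize
    # Law of total expectation over "which one am I?"
    return (1 - p_sim) * gain_if_real + p_sim * gain_if_sim


if __name__ == "__main__":
    print(scdt_one_box_advantage())  # 499500.0
```

With the figures used above, the advantage is \$499,500. Setting `p_sim=0` recovers the standard CDT verdict (one boxing simply loses \$1000), which is one way to see that CDT's reasoning corresponds to assuming you are certainly the original.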

As you can see, just by acknowledging the rules of the game, by admitting that Omega has the power to simulate her (as the rules of Newcomb's problem insist), she will one box. This is unlike a CDT agent, which would ignore Omega's power to simulate her (or otherwise figure out what she will do), and say "Hey, what's in the boxes is fixed, and my choice does not affect it". That is only valid reasoning if you know you are the "original" agent, and Alice herself uses that reasoning, but only in the case where she is assuming she is the "original". She takes care, unlike a CDT agent, to multiply the conditional expected value by the chance of the condition occurring.

This is not limited to Newcomb's problem. Let's take a look at Parfit's Hitchhiker, another scenario CDT has trouble with. There are again two identical agents making decisions: the "real" Alice, as soon as she gets home; and the "Alice-after-she-gets-home-as-simulated-by-the-driver-offering-her-a-ride", which I will again call SimAlice for short.

Conditional on an agent being Alice and not SimAlice, paying the driver loses that agent her \$100 and gains her nothing compared to refusing to pay. Conditional on an agent being SimAlice and not Alice, agreeing to pay the driver loses her nothing (as she, being a simulation, cannot give the driver real money), and gains her a trip out of the desert, and so her life. So, again, the law of total expectation gives us that the expected value of paying the driver (considering you don't know which you are), is 0.5 * -\$100 + 0.5 * (Value of Alice's life). This gives us that Alice should pay if and only if she values her life at more than \$100, which is, once again, the correct answer.
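
The same calculation can be sketched in Python for the hitchhiker case. The names `scdt_pay_advantage` and `should_pay`, and the parameter `p_sim`, are hypothetical and introduced only for illustration; the value of Alice's life is left as an input, since the post does not fix it.

```python
def scdt_pay_advantage(value_of_life, fare=100, p_sim=0.5):
    """Expected gain of paying over refusing, for an agent who does not know
    whether she is Alice or SimAlice. p_sim is an illustrative parameter;
    the post assumes 0.5."""
    # Real Alice, already home: paying only costs her the fare.
    gain_if_real = -fare
    # SimAlice: agreeing to pay costs nothing real and gets Alice the ride.
    gain_if_sim = value_of_life
    return (1 - p_sim) * gain_if_real + p_sim * gain_if_sim


def should_pay(value_of_life, fare=100, p_sim=0.5):
    """Pay exactly when the expected gain of paying is positive."""
    return scdt_pay_advantage(value_of_life, fare, p_sim) > 0


if __name__ == "__main__":
    print(should_pay(1_000_000))  # True
    print(should_pay(50))         # False
```

At equal weights, `should_pay` returns `True` exactly when `value_of_life` exceeds the fare, matching the condition stated above.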

So, to sum up, we found that SCDT can not only solve Newcomb's problem, which standard CDT cannot, but also solve Parfit's Hitchhiker, which neither CDT nor EDT can do. It does so at almost no cost in complexity compared to CDT, unlike, say, TDT, which is rather more complex. In fact, I think it entirely possible that SCDT is nothing more than a special case of something similar to TDT. But even if it is, it is a very nice, simple, and relatively easy to understand special case, and so may deserve a look for that alone.

There are still open problems for SCDT. If, rather than being simulated, you are analysed in a more direct way, should that change anything? What if, in Newcomb's problem, Omega runs many simulations of you in parallel? Should that change the weights you place on the expected values? This ties in deeply with the philosophical problem of how you assign measure to identical, independent agents. I cannot give a simple answer, and a simple answer to those questions is needed before SCDT is complete. But if we can figure out the answer to these questions, or otherwise bypass them, we have a trivial extension of CDT, the naivest decision theory, which correctly solves most or all of the problems that trip up CDT. That seems quite worthwhile.