
I couldn't find any concise explanation of what the decision theories are. Here's mine:

A Causal Decision Theorist wins, given what's happened so far.

An Evidential Decision Theorist wins, given what they know.

A Timeless Decision Theorist wins a priori.

To explain what I mean, here are two interesting problems. In each of them, two of the decision theories give one choice, and the third gives the other.

In Newcomb's problem, if you separate people into groups based on what happened before the experiment, i.e. whether or not Box A has money, CDT will be at least as successful in each group as any other strategy, and notably more successful than EDT and TDT. If you separate by what's known, there's only one group, since everybody has the same information. EDT is at least as successful as any other strategy, and notably more successful than CDT. If you don't separate at all, TDT will be at least as successful as any other strategy, and notably more successful than CDT.
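The groupings in Newcomb's problem can be checked with a few lines of arithmetic. This is only a sketch under assumed numbers: the usual \$1,000,000 in box A, \$1,000 in box B, and a predictor that is right 99% of the time, so that each group actually contains both kinds of player. The names `payoff` and `expected_payoff` are made up for the illustration.

```python
BOX_A = 1_000_000   # assumed contents of box A when it is full
BOX_B = 1_000       # assumed contents of box B (always present)
ACCURACY = 0.99     # assumed chance that the predictor guesses your choice correctly


def payoff(one_box: bool, box_a_full: bool) -> int:
    """Money received for a given choice once the contents of box A are fixed."""
    a = BOX_A if box_a_full else 0
    return a if one_box else a + BOX_B


def expected_payoff(one_box: bool, accuracy: float = ACCURACY) -> float:
    """Average money over both groups for a player who always makes this choice."""
    # Box A is full exactly when the predictor expected one-boxing.
    p_full = accuracy if one_box else 1 - accuracy
    return p_full * payoff(one_box, True) + (1 - p_full) * payoff(one_box, False)


# Grouped by what already happened (the CDT comparison).
for full in (True, False):
    print(f"box A full={full}: one-box {payoff(True, full)}, two-box {payoff(False, full)}")

# Not grouped at all (the a priori comparison).
print(f"no grouping: one-box {expected_payoff(True):.0f}, two-box {expected_payoff(False):.0f}")
```

Within each group two-boxing comes out \$1,000 ahead. With no grouping, one-boxing averages about \$990,000 against about \$11,000 for two-boxing.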

In Parfit's hitchhiker, when it comes time to pay the driver, if you split into groups based on what happened before the experiment, i.e. whether or not one has been picked up, CDT will be at least as successful in each group as any other strategy, and notably more successful than TDT. If you split based on what's given, which is again whether or not one has been picked up, EDT will be at least as successful in each group as any other strategy, and notably more successful than TDT. If you don't separate at all, TDT will be at least as successful as any other strategy, and notably more successful than CDT and EDT.
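The same kind of check works for Parfit's hitchhiker. Again this is a sketch with assumed numbers, not part of the problem statement: being left in the desert is worth -1,000,000, the fare is worth -100, and the driver reads your disposition correctly 99% of the time. The names `outcome` and `expected_outcome` are hypothetical.

```python
STRANDED = -1_000_000  # assumed utility of being left in the desert
FARE = -100            # assumed utility of paying the driver
ACCURACY = 0.99        # assumed chance the driver reads your disposition correctly


def outcome(pays: bool, picked_up: bool) -> int:
    """Utility once it is settled whether the driver picked you up."""
    if not picked_up:
        return STRANDED
    return FARE if pays else 0


def expected_outcome(pays: bool, accuracy: float = ACCURACY) -> float:
    """Average utility over both groups for someone with this disposition."""
    # The driver picks you up exactly when he expects to be paid.
    p_picked_up = accuracy if pays else 1 - accuracy
    return p_picked_up * outcome(pays, True) + (1 - p_picked_up) * outcome(pays, False)


# Grouped by whether you were picked up (the CDT and EDT comparison).
print("picked up:", {"pay": outcome(True, True), "refuse": outcome(False, True)})
print("stranded: ", {"pay": outcome(True, False), "refuse": outcome(False, False)})

# Not grouped at all (the a priori comparison).
print("no grouping:", {"pay": expected_outcome(True), "refuse": expected_outcome(False)})
```

Among those already picked up, refusing is better by the price of the fare. Among those left behind, it makes no difference. With no grouping, the paying disposition averages about -10,099 against -990,000 for the refusing one.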

There's one thing I'm not sure about. How does Updateless Decision Theory compare?


Upvoted for noticing the connection between Simpson's paradox and CDT vs TDT. It's new to me, at least. To recap: CDT beats TDT for each single past history, but TDT wins in aggregate over all past histories.

It's not Simpson's paradox. CDT just figures you can't change the past, and you might as well take advantage of that fact when making decisions.

TDT doesn't think you can change the past either. TDT behaves as if it decides the past, which it does.

Well, they'd say if you can't change it you can't decide it.

Anyway, my point is that it's not that they figure that since it's better when box A is full, and better when box A is empty, it must be better under all conditions. If it were that, they'd think that EDT and TDT give the same results, and would start using TDT when this error is pointed out.

The reason they do it is that they assume that the problem when box A is full and the problem when it's empty are essentially two different problems. You don't know which you're in, but they have the same solution anyway. The reason EDTists and TDTists do better is that they're arbitrarily given more money, and CDTists would do better if Omega decided to give extra money if you were a CDTist.

A good guide is linked here (it's quite long though).

The upshot is that causal decision theory is optimal on all problems where the only thing that matters is what you do. However, TDT is optimal on the extended class of problems where what you plan to do also matters. It is not optimal on the even-more-extended class of problems where how you made your decision matters.

Evidential decision theory is harder to describe, because it is flawed - it falls to Simpson's paradox.

A good guide is linked here (it's quite long though)

Mostly I just wanted to make a short version.

The upshot is that causal decision theory is optimal...

They're all optimal if you look at a situation in the right way. It's a question of what you count as given when you do the optimizing.

Evidential decision theory is harder to describe, because it is flawed - it falls to Simpson's paradox.

That explains why they don't come out the same. The flaws are based on what is and isn't given. For example, according to a proponent of CDT, your decision on Newcomb's problem can't possibly change whether box A has money, so controlling whether it has money shouldn't affect your decision. Similarly, in Parfit's hitchhiker, a proponent of EDT would say that they already know whether they got picked up, and they're not going to base their decision on counterfactuals.

Optimal by the fairly obvious criterion of "gets agents who use it maximal rewards." If you cared about which decision theory you used because of some extra factor, the problem would become one where the rewards were not solely action-determined or decision-determined, when that extra factor is cast in terms of reward.

If you prefer, I'm sure you could recast it using the word "wins."

A CDTist, when presented with Newcomb's paradox, would say that, given his situation, \$1,000 is the best he could hope for. Sure, he could do better if box A had a million dollars in it, but he could also do better if box B had a million dollars. It doesn't, so he can't. He can't make box A have a million dollars any more than he can make box B have a million dollars. He's not a time-traveler. If you put a TDTist in this scenario, he'd get nothing. If you put anyone in a different scenario

An EDTist, when presented with a non-ideal Parfit's Hitchhiker, when asked for the money, would say that he knows the guy picked him up. It's one thing to change the past when it's unknown. That's really the only way it's different from the future. But there is no way refusing at this point could possibly leave him stranded in the desert.

"Gets agents who use it maximal rewards" is a fairly obvious criterion to say, but there are some obvious unspoken assumptions. The difference between CDT, EDT, and TDT is what assumptions exist.

Well obviously. But agents using different decision theories do not disagree about rewards within the class of problems we're talking about. So you can compare different decision theories using the idea of "reflective consistency," where an agent is reflectively consistent only if it doesn't think it could gain by changing itself. A CDT agent watching a TDT agent win a million dollars from Omega believes that if only it could change, it could win a million dollars too, so it is not reflectively consistent within the class of decision-determined problems.

Suppose Omega gave money exclusively to TDT agents. If this were the case, it would be good to precommit to being a TDT agent. This isn't because TDT is better; it's because the contest was unfair.

Maybe this is analogous to Newcomb's problem. Maybe not. The point is, it's not obvious whether or not it is.

I suppose what I'm trying to say is that it's not that CDT usually gives the optimal solution but has a few flaws. It's that CDT, EDT, and TDT agents have a different idea of what the "optimal solution" refers to. It's not that optimal is the one the *DT strategy would pick. It's that the strategy itself is to find the optimal solution, for some value of optimal.

Maybe this is analogous to Newcomb's problem. Maybe not.

It's different. If the reward isn't determined by the decision the agent makes, but instead by how the agent made that decision, it isn't a "decision-determined problem" anymore. That's why I've been using that phrase. TDT is only generally good for decision-determined problems. Newcomb's problem is a decision-determined problem, which is important because it doesn't set out to expressly reward some type of agent; it's fair.
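To make the three classes concrete, here is a rough sketch of what the reward is allowed to depend on in each. The function names, the `"newcomb"` situation string and the dollar amounts are all assumptions for the illustration, not anything from the guide.

```python
from typing import Callable

Action = str                       # e.g. "one-box" or "two-box"
Policy = Callable[[str], Action]   # maps a situation description to an action


def action_determined(action: Action) -> int:
    """Reward depends only on what you actually do; nobody predicts you."""
    return 1_000 if action == "two-box" else 0


def decision_determined(policy: Policy) -> int:
    """Newcomb-style: reward depends on what your policy outputs, wherever it is run."""
    predicted = policy("newcomb")   # the predictor evaluates your policy
    actual = policy("newcomb")      # you evaluate the same policy
    box_a = 1_000_000 if predicted == "one-box" else 0
    return box_a if actual == "one-box" else box_a + 1_000


def algorithm_determined(policy: Policy, label: str) -> int:
    """'Unfair' problem: reward depends on which theory the agent is labelled as using.

    The policy's output is deliberately ignored here.
    """
    return 1_000_000 if label == "TDT" else 0


print(decision_determined(lambda situation: "one-box"))
print(decision_determined(lambda situation: "two-box"))
print(algorithm_determined(lambda situation: "two-box", "TDT"))
```

In the second function two agents with the same outputs always get the same money, whatever reasoning produced those outputs. In the third that is no longer true, which is the sense in which the contest is unfair.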

It's that the strategy itself is to find the optimal solution, for some value of optimal.

But all these "local optimal solutions" can be measured on the same scale, e.g. dollars. And so if the decision theory is just an intermediary - if what we really want is a dollar-maximizing agent, or a game-winning agent, we can compare different decision theories along a common yardstick. The best decision theories will be the ones that dominate all the others within a certain class of problems - they do as well or better than all other decision theories on every single problem of that class. This quickly becomes impossible for larger classes of problems, but can be made possible again by Occamian constraints like symmetry.
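As a sketch of what "dominates" means here: given a table of average winnings per problem, one theory dominates another if it is never behind on any row. The numbers below are purely illustrative, assuming the usual Newcomb amounts, a stranded-in-the-desert utility of -1,000,000, a fare of -100 and a 99%-accurate predictor. The function `dominates` and the table `scores` are hypothetical names.

```python
from typing import Mapping


def dominates(a: str, b: str, scores: Mapping[str, Mapping[str, float]]) -> bool:
    """True if theory `a` does at least as well as `b` on every problem in `scores`."""
    return all(per_theory[a] >= per_theory[b] for per_theory in scores.values())


# Illustrative average winnings, measured a priori (no grouping by history).
scores = {
    "newcomb":    {"CDT": 11_000,   "EDT": 990_000,  "TDT": 990_000},
    "hitchhiker": {"CDT": -990_000, "EDT": -990_000, "TDT": -10_099},
}

print(dominates("TDT", "CDT", scores))
print(dominates("CDT", "TDT", scores))
```

The check only says something about the problems listed in the table. Add an unfair problem that pays CDT agents for being CDT agents and no theory dominates any more.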

For games with only one player (which includes the Absent-Minded Driver, Newcomb's problem, Counterfactual Mugging and Parfit's hitchhiker) I think we can characterize it like this:

CDT gives the correct decisions for Sam Beckett assuming he's just 'leaped' into you (and for some reason shares your utility function).

EDT gives the correct "decisions" for a spectator who shares your utility function but is unable to actually make any decisions. (It tells the spectator which decision to hope for.)

TDT gives the correct decisions for a God who shares your utility function and when you pray to it, can control the flow of mathematical logic as it deduces your actual decision from your predetermined prior mental state (of TDT-prayer). (Meanwhile, 'you' are just carried along by the force of mathematical truth, which must be consistent everywhere and at all times, including at 'other instances of yourself' if there are any.)

TDT is the correct decision theory for a God who shares your utility function and when you pray to it, can control the flow of mathematical logic as it deduces your actual decision from your predetermined prior mental state (of TDT-prayer).

Assuming the God must answer all possible prayers, and can't deduce, for example, that Parfit must have picked you up or you wouldn't have made that prayer.

TDT-God can only intervene mathematically, not physically.

The power of TDT-God is to fix the answer to the mathematical question "If you were Parfit's hitchhiker, and you prayed to the TDT-God, would you pay?" That answer must remain the same at all times, and it determines both whether you're picked up and whether you pay. (Assuming you're the kind of person who makes decisions by praying to the TDT-God.)

The part about praying makes it sound like the TDT-God is supposed to answer the question after getting the prayer. I guess you mean more like you pray to get his already-determined answer.

I wanted to bring out the fact that TDT is about "logical uncertainty".

To choose according to TDT is to "choose as though controlling the logical output of the abstract computation you implement". What is the abstract computation you implement? Whatever it is, it's not something you can change (unless "you" are really Sam Beckett, in which case you override whatever was there previously) and the output follows by mathematical necessity from the input.

Therefore, you are choosing as though you are controlling the truth values of mathematical statements of the form "Given that this is my initial state, I ultimately do that". (Therefore, TDT gives the correct decisions for an imaginary being ("TDT-God" if you will) who can determine which mathematical statements are true.)

No. When you are acting as a human in the physical universe, you are not "making correct decisions for an imaginary being who can determine what will happen according to the laws of physics". Obeying the laws of physics is not a normative consideration, you are not optimizing your actions with a goal of not breaking out of Nature. According to the laws of physics, you always act according to the laws of physics, however you actually decide (according to the laws of physics).

Where do you think my previous comment begins to 'go wrong'?

The last paragraph feels off, although you are not saying anything explicitly/obviously wrong, so your intended meaning could be correct. Do you see what I was trying to say in my comment?

Obeying the laws of physics is not a normative consideration, you are not optimizing your actions with a goal of not breaking out of Nature. According to the laws of physics, you always act according to the laws of physics, however you actually decide (according to the laws of physics).

Yes, that all sounds right. That's why I was puzzled - I can't see where the tension is between that and what I'm saying.

To clarify: When I refer to "Given that this is my initial state, I ultimately do that" as a mathematical statement, the statement I'm referring to is really: "Given that this is my initial state, and given that these are the laws of physics, I will ultimately do that." I never had in mind any idea of "trying not to break out of Nature".

The idea of TDT-God arose as the answer to the question "for whom is it trivially true that TDT makes the correct decisions". (Just as it's absolutely, utterly trivial from the definitions that CDT is correct for Sam Beckett and EDT is correct for the spectator.)

(Just as it's absolutely, utterly trivial from the definitions that CDT is correct for Sam Beckett and EDT is correct for the spectator.)

CDT is not correct in game-theoretic situations where other agents can know things about you, with the effect of its incorrectness gradual. See Ch. 7 of TDT paper:

Modeling agents as influenced to some greater or lesser degree by "the sort of decision you make, being the person that you are", realistically describes present-day human existence.

The error could be tiny, but it could even be present where no other agents are around. On a bet with finely tuned probabilities and utilities, you'll choose wrongly if you use CDT.

It's not at all clear in what sense one can be correct about "hoping for a particular outcome". The problem statement to which EDT is supposed to be an answer seems to be nonsense.

Let me explain about Sam Beckett (which admittedly I should have done at the outset): In each episode of Quantum Leap, Sam's consciousness teleports ("leaps") into the brain of some random person, and Sam then has to Do Something Important (e.g. Stop Something Bad From Happening). No-one else expects or notices the leap.

CDT is not correct in game-theoretic situations where other agents can know things about you, with the effect of its incorrectness gradual.

Assuming for argument's sake that Sam's "leap" was not foreseen by "Omega-like" beings, or anyone else, other agents can only be influenced by the kind of person who was there prior to the leap, not the kind of person Sam is.

(Note: I'm also assuming for argument's sake that after making that one decision, Sam "leaps out" leaving the original decision-maker in his place.)

I'm not sure whether I've got this exactly right yet, but whatever the defects of the wording, surely there's some problem statement for which CDT is trivially correct.

It's not at all clear in what sense one can be correct about "hoping for a particular outcome".

It's simply a question of: which of these possibilities, if you observed them, would maximize your expected final utility. There is a 'fact of the matter' about this, regardless of whether the spectator's judgements make any difference.

The problem statement to which EDT is supposed to be an answer seems to be nonsense.

The trouble is that they're all different kinds of nonsense. Sam doesn't and couldn't exist. Spectators don't exist and make no difference if they do, and TDT-God doesn't and couldn't exist. (I don't mean to denigrate TDT - I think it gives better answers than CDT or EDT. In some weird sense, TDT-God is a better approximation to "the decision-maker themselves" than Sam Beckett.)

It's not at all clear in what sense one can be correct about "hoping for a particular outcome".

It's simply a question of: which of these possibilities, if you observed them, would maximize your expected final utility. There is a 'fact of the matter' about this, regardless of whether the spectator's judgements make any difference.

If you take an updateless look at the situation, observations never influence utility. You can consider relevant "observational events" that contain the worlds that you expect to be able to influence in the future given what you've observed, and then different observational events would have different utility. But possible reasons to actually compare these numbers are nonobvious (e.g. you might be able to control your observations, but then it's not necessarily a good way to parse the consequences of that control decision), so you could just as well treat them as unrelated. Also, these are heuristic rules for simplifying the updateless calculation (by moving a constant term outside the maximization operator), and given that you are stipulated to be unable to influence anything, I wonder if these should just all be empty, giving you trivially zero utility, both for all observations and a priori.


Yeah, but that presupposes that God is just another being 'within time' who answers prayers by 'changing things that already exist'. This is theologically unsophisticated. God already knew what questions you were going to ask, or might have asked in each of the possible branches of the game. :-)

(Disclaimer: I am not a theist. Metadisclaimer: But I do like TDT, though!)

There's a reason for it being called Timeless Decision Theory. Thinking about it in timeful terms is going to get confusing.

Updateless (not Universal) Decision Theory is almost the same thing as TDT. Somewhat simplified: In TDT you act as though you were controlling the abstract computation that governs your action and take all other instances of that computation into account. In UDT you pretend to be the abstract computation.

I think you are mischaracterizing EDT; you should look at the Smoking Lesion problem to see how.

Your take on Causal Decision Theory (calling it "Classical" is perhaps not completely wrong, but better stick to Causal) is a bit better, but holding the past constant doesn't eliminate all instances of CDT underperforming (from a TDT/UDT perspective).

I was going to do the Smoking Lesion problem, but EDT doesn't seem to be well-defined under that. You know that you're using EDT, which affects things weird. If this means that you'll definitely not smoke, definitely not smoking would be optimal, since you'd be tied with every other strategy in which it's known it would result in never smoking, but the same goes for definitely smoking.

In TDT you act as though you were controlling the abstract computation that governs your action and take all other instances of that computation into account. In UDT you pretend to be the abstract computation.

So, what do they do different?

I was going to do the Smoking Lesion problem, but EDT doesn't seem to be well-defined under that. You know that you're using EDT, which affects things weird. If this means that you'll definitely not smoke, definitely not smoking would be optimal, since you'd be tied with every other strategy in which it's known it would result in never smoking, but the same goes for definitely smoking.

Huh? If you mean that knowledge of yourself being an EDT agent screens off your decision being evidence for subsequent cancer, what distinguishes this case from cases where that knowledge doesn't screen off? Remember, you are not allowed to look at causal arrows, because that would make you a CDT actor.

Your decision is evidence. It's just that if you knew before-hand that you were going to smoke, there's nobody else that knew they'd smoke that does any better.

Thinking about it more, it's just the idea of being certain about your future decisions that breaks it. I guess that means an ideal EDTist is a contradiction. If you assume that there's some chance of doing something besides what EDT suggests, then it only gives the "never smoke" answer. If it gave "always smoke" then you'd be 99% sure or whatever, in which case the 1% who ended up not smoking would be better off.

Edit: No, that still doesn't work. I don't think you can count the 1% as using a different strategy. I'm going to have to think about this more.

So, what do they do different?

UDT is conceptually a lot simpler, but probably computationally more expensive.

As for differences in outcome, UDT implies perfect altruism between different instances of yourself (or at least every instance valuing each particular instance the same way all others do), TDT not necessarily (though it's suggestive). There may be other differences but that was the first I could think of.

I think you mean Updateless, not Universal.

Fixed.