Hello,

I've been reading through various Newcomblike Problems in order to get a better understanding of the differences between the various Decision Theories. From what I can tell, each Decision Theory is evaluated on whether it is able to "win" in each of these thought experiments. Thus, there is an overarching assumption that each thought experiment has an objectively "right" and "wrong" answer, and the challenge of Decision Theory is to generate algorithms that guarantee the agent will choose the "right" answer.

However, I am having some trouble seeing how some of these problems have an objectively "winning" state. In Newcomb's Problem, one can obviously say that one-boxing "wins" because you get far more money than by two-boxing, and these are the only two options available. Of course, even here there is some room for ambiguity, as Robert Nozick observed:

To almost everyone it is perfectly clear and obvious what should be done. The difficulty is that these people seem to divide almost evenly on the problem, with large numbers thinking that the opposing half is just being silly.

But other Newcomblike Problems leave me with some questions. Take, for example, the Smoking Lesion Problem. I am told that smoking is the winning decision here (as long as we suspend our disbelief about the fact that smoking is bad in the real world). But I'm not sure why that makes such a big difference. Yes, the problem states that we would prefer to smoke if we could, but our preferences can come from many different sources, such as our understanding of the environment, not just a spontaneous inner desire. So when EDT says that you shouldn't smoke because it increases the probability of having a cancerous lesion, one could say that this information has shaped your preference. To use a different analogy, I may desire ice cream because it tastes good, but I may still prefer not to eat it because I understand how it affects my health and weight. In other words, I'm not sure I could objectively say that a preference influenced by EDT is "not winning". Unlike Newcomb's Problem, there isn't a quantifiable value such as money to say that one choice is objectively better than the other.
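To make my confusion concrete, here is a minimal sketch of how I understand the disagreement, if one is willing to assign utilities at all (which is exactly the step I am unsure about). Every number below is a hypothetical I made up for illustration; none of them comes from the problem statement. The function names are my own as well.

```python
# All values are illustrative assumptions, not part of the Smoking Lesion Problem.
U_SMOKE = 1.0                    # assumed enjoyment from smoking
U_CANCER = -100.0                # assumed badness of getting cancer
P_CANCER_GIVEN_LESION = 0.8      # the lesion, not smoking, causes cancer
P_CANCER_GIVEN_NO_LESION = 0.05
P_LESION = 0.2                   # assumed prior probability of having the lesion
P_LESION_GIVEN_SMOKE = 0.6       # smoking is evidence of the lesion (correlation only)
P_LESION_GIVEN_NO_SMOKE = 0.1


def expected_utility(smoke: bool, p_lesion: float) -> float:
    """Expected utility of an action given a probability of having the lesion."""
    p_cancer = (p_lesion * P_CANCER_GIVEN_LESION
                + (1 - p_lesion) * P_CANCER_GIVEN_NO_LESION)
    return (U_SMOKE if smoke else 0.0) + p_cancer * U_CANCER


def edt_value(smoke: bool) -> float:
    """EDT-style: condition the lesion probability on the action taken."""
    p_lesion = P_LESION_GIVEN_SMOKE if smoke else P_LESION_GIVEN_NO_SMOKE
    return expected_utility(smoke, p_lesion)


def cdt_value(smoke: bool) -> float:
    """CDT-style: the action cannot change the lesion, so use the prior."""
    return expected_utility(smoke, P_LESION)


if __name__ == "__main__":
    for name, value in (("EDT", edt_value), ("CDT", cdt_value)):
        choice = "smoke" if value(True) > value(False) else "don't smoke"
        print(f"{name}: smoke={value(True):.2f}, abstain={value(False):.2f} -> {choice}")
```

With these made-up numbers the EDT-style calculation favours abstaining and the CDT-style calculation favours smoking, but the verdict about which one "wins" still seems to depend on which calculation I already accept.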

A second question comes from the Prisoner's Dilemma. There is a common notion that C,C is the "winning" outcome, and Decision Theory strives to generate algorithms that will guarantee a C,C result. But for each individual prisoner, C,C isn't the best payoff they can get. So hypothetically, there could be an undiscovered Decision Theory in which Prisoner A is tricked into cooperating, only for B to betray him, resulting in B achieving the optimal outcome of C,D. Wouldn't such a result objectively see B "winning", because he got a higher payoff than C,C?
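Here is a small sketch of what I mean, using a hypothetical payoff matrix. The specific numbers are my own assumption; I believe only their ordering matters. Each outcome is written as (A's move, B's move), to match the C,D notation above.

```python
# Hypothetical payoffs (higher is better). Only the ordering T > R > P > S is assumed.
T, R, P, S = 5, 3, 1, 0

# PAYOFF[(a_move, b_move)] = (payoff to A, payoff to B)
PAYOFF = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}


def best_outcome_for(player: str) -> tuple:
    """Outcome that maximizes one prisoner's own payoff."""
    idx = 0 if player == "A" else 1
    return max(PAYOFF, key=lambda moves: PAYOFF[moves][idx])


def best_joint_outcome() -> tuple:
    """Outcome that maximizes the sum of both payoffs."""
    return max(PAYOFF, key=lambda moves: sum(PAYOFF[moves]))


if __name__ == "__main__":
    print(best_outcome_for("B"))   # ('C', 'D')
    print(best_joint_outcome())    # ('C', 'C')
```

So under this matrix, "winning" for B alone and "winning" jointly pick out different outcomes, which is the source of my question.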

The third and most baffling example is from Counterfactual Mugging. Just like in Newcomb's Problem, we have a quantifiable value of money to track which decision is best. However, in this case I understand the "winning" result is to pay the mugger, despite the fact that you become $100 poorer and gain nothing. I understand the general concept, that an updateless system factors in the counterfactual timeline where you could have gotten $10,000. What I don't understand is the seeming inconsistency: Newcomb's Problem defines "winning" as gaining the most money, which clearly doesn't apply here. How can we objectively say that refusing to pay the mugger is "not winning"?
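As far as I can tell, the two notions of "winning" correspond to two different calculations, sketched below. The $100 and $10,000 figures are the ones mentioned above; the 50/50 chance between the two branches is my assumption (a fair coin, as I understand the usual statement), and the function names are my own.

```python
PAYMENT = 100            # what the mugger asks for
REWARD = 10_000          # what you would have received in the other branch
P_REWARD_BRANCH = 0.5    # assumption: fair coin between the two branches


def policy_value(pays_when_asked: bool) -> float:
    """Expected value of a policy, evaluated before knowing which branch you are in."""
    # Reward branch: you are paid only if you are the kind of agent who would pay.
    reward_branch = REWARD if pays_when_asked else 0
    # Asked branch: paying simply costs you the payment.
    asked_branch = -PAYMENT if pays_when_asked else 0
    return P_REWARD_BRANCH * reward_branch + (1 - P_REWARD_BRANCH) * asked_branch


def value_after_being_asked(pays: bool) -> float:
    """Value of the action once you already know you are in the asked branch."""
    return -PAYMENT if pays else 0


if __name__ == "__main__":
    print("policy view:", policy_value(True), "vs", policy_value(False))
    print("updated view:", value_after_being_asked(True), "vs", value_after_being_asked(False))
```

Under the first calculation paying looks better, and under the second refusing looks better, so the verdict depends on which calculation counts as the measure of "winning".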

If our definition of "winning" is being shaped by the branch of Decision Theory that we adopt, then I worry about falling into a kind of circular logic, because I thought that the point of Logical Decision Theory is to generate policies that guarantee a winning state.

Could you please provide a simple explanation of your UDT?