Nate Soares on the Ultimate Newcomb's Problem


What's special about this compared to transparent Newcomb with noise? (for which CDT and EDT also fail)

Hi Vanessa, what is a transparent Newcomb with noise? Any reference or link? Many thanks in advance.

Hi! Transparent Newcomb means you consider Newcomb's problem but the box is transparent, so the agent knows whether it's empty or full. We then need to specify which counterfactual Omega predicts (agent seeing empty box, agent seeing full box, or some combination of both), but for our present purpose it doesn't matter. EDT is undefined because e.g. if you see a full box in the relevant counterfactual then you cannot condition on two-boxing. This can be circumvented by adding a little noise, i.e. a small probability of Omega mispredicting.

I figured that the right answer is (and that FDT would also reason):

If I choose to take the big box only, I only get $1M.

If I don't take the big box only, then that number is composite so I get $2M.

One way to not take the big box only is to take both boxes thus netting $2M+$1,000.

Separately, there's the option of factoring/primality testing the number. (I may be unsure of its primality, but for less than $1,000 I should be able to get more sure.) (If there's enough time to decide, I could take the small box, use the money in it to get more info about that number, and then go back and decide if I'm going to take the other box.)

Edited to add:

If the two numbers weren't the same, then you could (as a quick primality/compositeness check):

- divide the larger by the smaller
- find the greatest common factor
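(A quick sketch of how far the gcd idea alone can get you — `quick_check` is a made-up helper name, and note that a shared factor only proves *both* numbers composite when the gcd is a proper factor of both; if the gcd equals the smaller number, only the larger is certainly composite:)

```python
from math import gcd

def quick_check(a, b):
    """Cheap compositeness evidence from two distinct numbers (a sketch).

    Returns a dict mapping each number to True (certainly composite)
    or None (no information from this test alone).
    """
    assert a != b
    lo, hi = sorted((a, b))
    verdict = {lo: None, hi: None}
    g = gcd(lo, hi)
    if 1 < g < lo:
        # A shared proper factor smaller than both: both are composite.
        verdict[lo] = verdict[hi] = True
    elif g == lo:
        # lo divides hi, so hi is composite; lo itself could still be prime.
        verdict[hi] = True
    return verdict
```

So e.g. `quick_check(6, 9)` settles both numbers, but `quick_check(3, 9)` only settles the larger one, and coprime pairs like `(5, 7)` tell you nothing.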

The difference between your reasoning and the reasoning of FDT is that your reasoning acts like the equality of the number in the big box and the number chosen by Omicron is robust, whereas the setup of the problem indicates that while the number in the big box is sensitive to your action, the number chosen by Omicron is not. As such, FDT says you shouldn't imagine them covarying; when you imagine changing your action you should imagine the number in the big box changing while the number chosen by Omicron stays fixed. And indeed, as illustrated in the expected utility calculation in the OP, FDT's reasoning is "correct" in the sense of winning more utility (in all cases, and in expectation).

The consequences of not having enough time to think.

winning more utility

more money.

EDIT: It's not clear what effect the time restriction has. 'Not enough time to factor this number' could still be a lot of time, or it could be very little.

This scenario seems impossible, as in contradictory / not self-consistent. I cannot say exactly why it breaks, but at least the two statements here seem to be not consistent:

today they [Omicron] happen to have selected the number X

and

[Omega puts] a prime number in that box iff they predicted you will take only the big box

Both of these statements have implications for X and cannot both be always true. The number cannot both be random and be chosen by Omega/you, can it?

From another angle, the statement

FDT will always see a prime number

demonstrates that something fishy is going on. The "random" number X that Omicron has chosen -- which is in the box, and which my FDT sees -- is "always prime". Then it is not a random number?

Edit: See my reply below; the contradiction is that Omega cannot predict EDT's behaviour when Omicron chose a prime number. Omega's decision depends on EDT's decision, and EDT's decision depends on Omega's decision (via the "do the numbers coincide" link). On days where Omicron chooses a prime number, this cyclic dependence leads to a contradiction / Omega cannot predict correctly.

Yes that was my reasoning too. The situation presumably goes:

- Omicron chooses a random number X, either prime or composite
- Omega simulates you, makes its prediction, and decides whether X's primality is consistent with its prediction
- If it is, then:
- Omega puts X into the box
- Omega teleports you into the room with the boxes and has you make your choice

- If it's not, then...? I think the correct solution depends on what Omega does in this case.
  - Maybe it just quietly waits until tomorrow and tries again? In which case no one is ever shown a case where the box does not contain Omicron's number. If this is how Omega is acting, then I think you *can* act as though your choice affects Omicron's number, even though that number is technically random on this particular day.
  - Maybe it just picks its own number, and shows you the problem anyway. I believe this was the assumption in the post.
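(The "quietly wait and try again" reading can be sketched as rejection sampling — this mechanism is an assumption, not part of the original problem statement. The point is that conditional on being shown the problem at all, Omicron's genuinely random number always looks prime to a one-boxing agent:)

```python
import random

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, n))

def run_day(agent_needs_prime=True):
    """One day under the assumed 'quietly wait and try again' mechanism."""
    x = random.randrange(2, 100)          # Omicron's genuinely random number
    if is_prime(x) == agent_needs_prime:  # consistent with Omega's prediction
        return x                          # Omega shows the problem; the box holds X
    return None                           # mismatch: Omega quietly skips today

# On every day the agent is actually shown the boxes, the "random" number is prime.
shown = [x for x in (run_day() for _ in range(1000)) if x is not None]
assert all(is_prime(x) for x in shown)
```

The selection effect, not any causal influence, is what makes X look nonrandom from inside the room.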

I think I found the problem: Omega is unable to predict your action in this scenario, i.e. the assumption "Omega is good at predicting your behaviour" is wrong / impossible / inconsistent.

Consider a day where Omicron (randomly) chose a prime number (Omega knows this). Now an EDT is on their way to the room with the boxes, and Omega has to put a prime or non-prime (composite) number into the box, predicting EDT's action.

If Omega makes X prime (i.e. coincides) then EDT two-boxes and therefore Omega has failed in predicting.

If Omega makes X non-prime (i.e. numbers don't coincide) then EDT one-boxes and therefore Omega has failed in predicting.

Edit: To clarify, EDT's policy is two-box if Omega and Omicron's numbers coincide, one-box if they don't.

If the agent is EDT and Omicron chooses a prime number, then Omega has to choose a different prime number. Fortunately, for every prime number there exists a distinct prime number.

EDT's policy is not "two-box if both numbers are prime or both numbers are composite", it's "two-box if both numbers are equal". EDT can't (by hypothesis) figure out in the allotted time whether the number in the box (or the number that Omicron chose) is prime. (It can readily verify the equality of the two numbers, though, and this equality is what causes it -- erroneously, in my view -- to believe it has control over whether it gets paid by Omicron.)
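(This escape hatch can be checked mechanically. A minimal sketch, taking EDT's "two-box iff the numbers coincide" policy and Omega's "prime iff predicted one-boxing" rule as given from the thread:)

```python
def is_prime(n):
    return n > 1 and all(n % d for d in range(2, n))

def edt_action(omega_num, omicron_num):
    # EDT's policy (as stated in the thread): two-box iff the numbers coincide.
    return "two-box" if omega_num == omicron_num else "one-box"

def omega_consistent(omega_num, omicron_num):
    # Omega's rule: its number is prime iff it predicts one-boxing.
    predicts_one_box = is_prime(omega_num)
    return (edt_action(omega_num, omicron_num) == "one-box") == predicts_one_box

# Suppose Omicron happened to draw the prime 7:
assert not omega_consistent(7, 7)   # matching Omicron's prime: EDT two-boxes, prediction fails
assert not omega_consistent(9, 7)   # any composite: EDT one-boxes, prediction fails
assert omega_consistent(11, 7)      # a *different* prime: EDT one-boxes, prediction holds
```

Only the third option leaves Omega's prediction correct, which is exactly the "different prime" move.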

Why is the evaluation using a condition that isn't part of the problem? Isn't it trivial to construct other evaluation assumptions that yield different payouts?

Strictly speaking, there is no single payout from this problem. It's underspecified, and is actually an infinite family of problems.

Why is the evaluation using a condition that isn't part of the problem?

For clarity. The fact that the ordinal ranking of decision theories remains the same regardless of how you fill in the unspecified variables is left (explicitly) as an exercise.

This doesn't seem true, at least in the sense of a strict ranking. In the EDT case: if Omega's policy is to place a prime in Box 1 whenever Omicron chooses a composite number (instead of matching Omicron when possible), then it predicts the EDT agent will choose only Box 1, and this is a stable equilibrium. But since it also always places a different prime whenever Omicron chooses a prime, EDT never sees matching numbers and so always one-boxes; therefore its expected earnings are no less than FDT's.

The variables with no specified value in the template given aren't the problem. The fact that the template *has the form that it does* is the problem. That form is unjustified.

The *only* information we have about Omega's choices is that choosing the same number as Omicron is *sometimes possible*. Assuming that its probability is the same - or even nonzero - for all decision theories is unjustified, because Omega knows what decision theory the agent is using and can vary their choice of number.

For example, it is compatible with the problem description that Omega *never* chooses the same number as Omicron if the agent is using CDT. Evaluating how well CDT performs in this scenario is then *logically impossible*, because CDT agents never enter this scenario.

Like many extensions, variations, and misquotings of well-known decision problems, this one opens up far too many degrees of freedom.

I agree that the problem is not fully specified, and that this is a common feature of many decision problems in the literature. On my view, the ability to notice which details are missing and whether they matter is an important skill in analyzing informally-stated decision problems. Hypothesizing that the alleged circumstances are impossible, and noticing that the counterfactual behavior of various agents is uncertain, are important parts of operating FDT at least on the sorts of decision problems that appear in the literature.

At a glance, it looks to me like the omitted information is irrelevant to all three decision algorithms under consideration, and doesn't change the ordinal ranking of payouts (except to collapse the rankings in some edge cases). That said, I completely agree that the correct answer to various (other, afaict) decision problems in the literature is to cry foul and point to a specific piece of decision-relevant underspecification.

The omitted information seems very relevant. An EDT agent decides to do the action maximizing

$$\sum_{o \,\in\, \text{outcomes}} P(o \mid \text{action}) \; U(o, \text{action}).$$

With omitted information, the agent *can't compute* the P() expressions and so their decision is undetermined. It should already be obvious from the problem setup that something is wrong here: equality of Omega and Omicron's numbers is part of the *outcomes*, and so arguing for an EDT agent to condition on that is suspicious to say the least.
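(For concreteness, the rule above can be written as a tiny generic chooser. The probability and utility tables below are purely hypothetical stand-ins, since — as argued here — the problem leaves the real conditional probabilities unspecified:)

```python
def edt_choice(actions, outcomes, P, U):
    """EDT's rule: pick the action a maximizing sum over outcomes o of P(o | a) * U(o, a)."""
    return max(actions, key=lambda a: sum(P(o, a) * U(o, a) for o in outcomes))

# Made-up tables, purely for illustration -- not the problem's actual numbers:
actions = ["one-box", "two-box"]
outcomes = ["paid", "unpaid"]
P = lambda o, a: {("paid", "one-box"): 0.9, ("unpaid", "one-box"): 0.1,
                  ("paid", "two-box"): 0.1, ("unpaid", "two-box"): 0.9}[(o, a)]
U = lambda o, a: (1_000_000 if o == "paid" else 0) + (1_000 if a == "two-box" else 0)
```

With these made-up tables the chooser returns `"one-box"`; plug in a different P and the answer flips, which is exactly the underdetermination being pointed at.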

The claim is not that the EDT agent doesn't know the mechanism that fills in the gap (namely, Omega's strategy for deciding whether to make the numbers coincide). The claim is that it doesn't matter what mechanism fills the gap, because for any particular mechanism EDT's answer would be the same. Thus, we can figure out what EDT does across the entire class of fully-formal decision problems consistent with this informal problem description without worrying about the gaps.

Nate Soares's distilled exposition of the Ultimate Newcomb's Problem, plus a quick analysis of how different decision theories perform, copied over from a recent email exchange (with Nate's revisions).