Breaking Newcomb's Problem with Non-Halting states

[-]Shmi3y72

if I predict that the Predictor has only put money in one box, I will only take box B, since this will result in the accurate prediction that I will take one box, which will mean that both boxes contain money and allow me to take both boxes.

I think you gravely misunderstand the original Newcomb's setup, as well as various instantiations of it. The original setup focuses on what you will do, not how you arrive at the decision what to do. What you are suggesting is being an exception to the problem: someone so powerful, the predictor fails at predicting your actions. This is the essence of CDT two-boxing and various other ways to fight the hypothetical, that hypothetical being that you are predictable. If you posit that you can foil the predictor, the whole setup melts away.

[-]Slimepriestess3y76

Of course we care about the outcomes. This isn't necessarily about having perfect predictive power or outplaying the predictor, it's about winning Newcomb's problem. 3-Condition Marion, when presented with Newcomb's problem, runs the first two conditionals which is essentially a check to see how adversarial she can get away with being. If she predicted that she would be able to outgame the predictor at some point, she would take two boxes. However the Predictor is essentially perfect at its job, so the most she predicts being able to do is cause a non-halting recursion in her own decision tree, so that's no good. That cuts off the option to try and get 1,001,000 out of the Predictor and Marion settles for her second best outcome, which is two conclude she should just take Box B. The Predictor correctly predicts that Marion will employ this algorithm and only take box B, and thus fills Box B. Marion can't then decide to two-box, she's already reasoned out that there's no way for her to game the predictor.

3-Conditional Marion is interesting in part because something like adversarial play emerges from her decision algorithm simply from the fact she's trying to model the other agent and conditionally respond to her predictions of them. The other agent of course wants to satisfy its own values and block Marion from adversarially going too far, so she wants to calculate exactly how extortionary she can be before the other party defects. She can't get more than 1,000,000 out of the Predictor without losing 1,000,000 in her attempt for being too greedy and failing to cooperate. The same thing happens in Parfit's Hitchhiker.

[-]Multicore3y52

The part about two Predictors playing against each other reminded me of Robust Cooperation in the Prisoner's Dilemma, where two agents with the algorithm "If I find a proof that the other player cooperates with me, cooperate, otherwise defect" are able to mutually prove cooperation and cooperate.

If we use that framework, Marion plays "If I find a proof that the Predictor fills both boxes, two-box, else one-box" and the Predictor plays "If I find a proof that Marion one-boxes, fill both, else only fill box A". I don't understand the math very well, but I think in this case neither agent finds a proof, and the Predictor fills only box A while Marion takes only box B - the worst possible outcome for Marion.

Marion's third conditional might correspond to Marion only searching for proofs in PA, while the Predictor searches for proofs in PA+1, in which case Marion will not find a proof, the Predictor will, and then the Predictor fills both boxes and Marion takes only box B. But in this case clearly Marion has abandoned the ability to predict the Predictor and has given the Predictor epistemic vantage over her.

[-]Vladimir_Nesov3y40

"If I find a proof that the other player cooperates with me, cooperate, otherwise defect"

This would cooperate with CooperateBot (algorithm that unconditionally says "Cooperate").

[-]Multicore3y40

Yes. The one I described is the one the paper calls FairBot. It also defines PrudentBot, which looks for a proof that the other player cooperates with PrudentBot and a proof that it defects against DefectBot. PrudentBot defects against CooperateBot.

[-]Slimepriestess3y30

Yeah after the first two conditionals return as non-halting, Marion effectively abandons trying to further predict the predictor. After iterating the non-halting stack, Marion will conclude that she's better served by giving into the partial blackmail and taking the million dollars then she is by trying to game the last $1000 out of the predictor, based on the fact that her ideal state is gated behind an infinitely recursed function.

[-]Anon User3y30

I will execute this algorithm unless doing so produces non-halting states, in which case I will only take Box B since this creates a schelling point at my next most preferred worldstate.

At least two issues there:

halting problem is famously undecidable - in general, there is no way to know whether you are in an infinite loop vs a very long finite loop (but you can approximate with a timeout)
you presuppose some way to find the best fallback decision - how does that work? Almost like you are presupposing you already know how to slove the problem?

[-]Gunnar_Zarncke3y30

My understanding is that you don't solve the question "will it halt?" but will it halt in the time I'm given in the problem at hand.

[-]Anon User3y10

That's exactly what I meant by "you can approximate with a timeout".

LESSWRONG
is fundraising!
LW

LESSWRONG
is fundraising!
LW

16

Breaking Newcomb's Problem with Non-Halting states

16

16