I'm not sure this changes the underlying disagreement. The "desire to be predicted to one-box" is universal. Two-boxers deny the causal link from the prediction to the action; one-boxers deny their own free will and accept that the prediction must match the behavior.
Talk of "desire" or "intent" or "disposition" is only an attempt to elucidate that the real question is whether the prediction can actually be correct.
In other words, in the results table below, the whole debate is over whether X is reachable in the in-game universe (which may or may not map onto our universe in terms of causality):
                    Your action
Omega predicts      One-box      Two-box
One-box             $1M          X
Two-box             X            $1000
Content warning: silly.
Here's Adam. Adam is an agent who believes in one-boxing. He knows arguments for one-boxing, and says them often. Unfortunately, Adam had a stroke last year, and has spatial neglect. His conscious reason does not correctly perceive the controls on the predictor's machine. When he consciously intends to press the "one-box" button, he reliably instead presses the "two-box" button; when this happens, he thinks he pressed the "one-box" button. Adam is surprised when he is given two boxes (one empty) because his beliefs are not fully hooked up to his perceptions and actions. But so long as the control panel is set up this way, it is correct to predict that Adam will two-box; so that is what the predictor will predict.
Here's Bob. Bob is an adorable baby. He doesn't read buttons or think about decision theory. But if you put him in a high-chair in front of a control panel, he will flail his hands at the buttons. Bob is left-handed, and flails his left hand further and more vigorously than his right. As a result, he reliably hits the "one-box" button first. The predictor determines that Bob has a disposition to one-box, and predicts accordingly.
Here's the predictor's buddy. He points out to the predictor that it's weird that left-handed babies are way richer than right-handed babies. "What's up with that? Do right-handed babies not like winning?"
The predictor shrugs and says, "I don't care about 'winning' either. I just predict button pushes. It's like content moderation, but with less trauma. I still think they're gonna replace me with AI, though."
The predictor's buddy's friend hears about this and gets really annoyed because his whole job is improving accessibility for financial user interfaces but he just got fired because the Niemöller counter reached "people with disabilities" a while ago. And he comes around and gives the predictor what-for about the biases encoded into the control panel.
"Yo," says the predictor, "I didn't build the dang buttons. I just work here; and they'll fire me and replace me with another model if my accuracy drops. Look, I proposed replacing the buttons with a voice interface but they told me there's this thing called a 'tube ox' and the voice model couldn't tell the difference and started ordering oxygen tubes for everyone as if they were tungsten cubes or something. Adam's a two-boxer because if I call him a one-boxer and he two-boxes, my ass is the one that gets fired."
The predictor's buddy's friend's pal drops in. "Wait a minute, I'm totally confused, is this like faith vs. works or something? Is the first guy called Adam because..."
"NO!" said everyone else.
Love this! Great examples to illustrate that your identity as a one-boxer is rooted in your behavior instead of your mind. And it's pretty cool to think this mirrors theological debates that have gone on for so long.
This post assumes knowledge of Newcomb's problem. Background is provided below for reference and can be skipped if you already know the problem.
Background on Newcomb's Problem
Newcomb's problem is a game where you face two choices: take only an opaque box (which contains either zero dollars or one million dollars), or additionally take a transparent box that visibly contains a thousand dollars.
At first glance, it is obvious that you should always pick both boxes in order to get the extra thousand dollars.
The catch is that before you even knew you were about to play this game, someone made a prediction about which decision you would make, and the contents of the opaque box are based entirely on that prediction. If they predicted you would take just the opaque box, then it holds a million dollars. Otherwise, it holds zero dollars.
Additionally, this predictor is known to be very good at predicting people's decisions in this game, and you have witnessed that they have never made a mistake, even after hundreds of fair trials (there was no cheating or foul play, and you know this).
Now that you have been given all this information about how the game works, you are informed that the prediction of your decision was made before you even knew this game existed. Accordingly, the contents of the opaque box have been set and cannot be changed. You now face the decision to take just the opaque box, or to take both the opaque box and the transparent box.
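To make the rules concrete, here is a minimal sketch of the payoff rule in Python (the function name and the "one-box"/"two-box" labels are my own hypothetical choices, not part of the problem statement):

```python
def payoff(prediction: str, action: str) -> int:
    """Return your total winnings in dollars, given the predictor's
    prediction and your actual action ("one-box" or "two-box")."""
    # The opaque box was filled based solely on the prediction.
    opaque = 1_000_000 if prediction == "one-box" else 0
    # The transparent box always holds $1,000; you get it only if you two-box.
    transparent = 1_000 if action == "two-box" else 0
    return opaque + transparent

# The four possible outcomes:
for prediction in ("one-box", "two-box"):
    for action in ("one-box", "two-box"):
        print(prediction, action, payoff(prediction, action))
```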
After hearing this problem, most people immediately gravitate to one decision or the other and feel completely confident in it. The issue is that the problem seems to divide people roughly evenly on what the best thing to do is.
On one hand, it is a fact that one-boxers have historically all become millionaires after one-boxing, whereas two-boxers have historically had nothing to show for their rationality other than a sad thousand dollars. And this isn't just historical coincidence, either. Given that the predictor is highly accurate, the expected value of one-boxing is much higher than the expected value of two-boxing.
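To spell out that arithmetic: suppose the predictor is correct with probability $p$ (a parameter I'm introducing for illustration; the problem statement only says the predictor has never been wrong). Then

$$\mathrm{EV}(\text{one-box}) = p \times \$1{,}000{,}000$$
$$\mathrm{EV}(\text{two-box}) = p \times \$1{,}000 + (1-p) \times \$1{,}001{,}000$$

At $p = 0.99$, that is $990,000 versus $11,000, and one-boxing comes out ahead for any $p$ above roughly 0.5005.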
And yet, on the other hand, it is also a fact that, by the time you are presented with this choice, no one can change the contents of the opaque box anymore; therefore, for any fixed prediction, the decision to two-box yields an extra thousand dollars relative to the decision to one-box.
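Spelling this side out as well: let $B$ be the already-fixed contents of the opaque box, either $0 or $1,000,000. Then

$$\underbrace{B + \$1{,}000}_{\text{two-box}} \;>\; \underbrace{B}_{\text{one-box}}$$

holds for either value of $B$, so two-boxing dominates one-boxing once the prediction is fixed.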
Newcomb's problem is a thought experiment that pits expected value calculations against the dominance principle. One-boxers tend to be one-boxers because one-boxers get rich as a result of being one-boxers. Two-boxers tend to be two-boxers because reasoning through the problem shows that the decision to two-box yields an extra thousand dollars. The debate has not been settled, and the discourse around it rages on.
I was reading this post, which sums up the discourse on Newcomb's problem quite nicely. The post argues that one-boxers care about what the optimal agent type is, whereas two-boxers care about what the optimal decision is. So perhaps one-boxers and two-boxers are simply talking past each other.
After all, everyone agrees that, if the prediction is based on your disposition, then it is optimal to have had the disposition of a one-boxer at the time the prediction was made. The difference is that the two-boxer further insists that you have no control over your disposition at the time the prediction was made, and that once the prediction has been made, two-boxing is the optimal decision because it gets you an extra thousand dollars relative to one-boxing. It seems, then, that one-boxers are simply failing to see the two-boxer's point because they hyper-fixate on the optimal disposition rather than the optimal decision.
I believe this conclusion is exactly backwards. It is the two-boxer who places undue emphasis on disposition, whereas one-boxers care only about results.
To see this, imagine a world where everyone suffers from occasional, involuntary muscle spasms.
In this world, we present these spasm-prone participants with a version of Newcomb's problem in which you make your decision by pressing one of two buttons, corresponding to one-boxing and two-boxing.
We then set up a predictor who can account for these muscle spasms while still producing accurate predictions.
In this world, some of the people who one-box actually intended to two-box before an involuntary muscle spasm made them accidentally press the one-box button. Call such agents involuntary one-boxers.
Despite having the dispositions of two-boxers, the involuntary one-boxers are still one-boxers. It doesn't matter that they intended to two-box to scoop up the "extra" thousand dollars; they in fact one-boxed, and so they are one-boxers.
The predictor's goal is to predict which button gets pressed: nothing more, nothing less. It is entirely possible that the predictor does not care at all about what your disposition is. There is a world in which muscle spasms are so frequent that the predictor simply needs to be good at predicting the direction of these involuntary spasms, without caring one bit about the agents' dispositions.
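As a toy illustration (everything here, from the spasm rates to the function names, is my own hypothetical construction), here is a predictor that scores well by modeling button presses alone, ignoring dispositions entirely:

```python
import random

def actual_press(disposition: str, spasm_prob: float) -> str:
    """The button actually pressed: with probability spasm_prob,
    a spasm flips the intended press to the other button."""
    other = "two-box" if disposition == "one-box" else "one-box"
    return other if random.random() < spasm_prob else disposition

def predict_press(disposition: str, spasm_prob: float) -> str:
    """Predict the *press*, not the disposition: if an agent's spasms
    flip their press more often than not, predict the flip."""
    other = "two-box" if disposition == "one-box" else "one-box"
    return other if spasm_prob > 0.5 else disposition

random.seed(0)
trials = 100_000
press_correct = disposition_correct = 0
for _ in range(trials):
    disposition = random.choice(["one-box", "two-box"])
    spasm_prob = random.choice([0.05, 0.95])  # spasms are strongly directional
    prediction = predict_press(disposition, spasm_prob)
    press = actual_press(disposition, spasm_prob)
    press_correct += (prediction == press)
    disposition_correct += (disposition == press)

print(f"predicting presses:      {press_correct / trials:.1%} accurate")
print(f"predicting dispositions: {disposition_correct / trials:.1%} accurate")
```

In this toy world the press-predictor runs at about 95% accuracy, while simply reading off dispositions does no better than a coin flip. That is the point: accuracy is measured against button presses, not minds.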
We can extend this thought to the actual world, where involuntary muscle spasms are infrequent. Whether you are a one-boxer or a two-boxer depends solely on whether you actually one-box or two-box. The only thing we know in Newcomb's problem is that the predictor accurately predicts decisions.
Given this, if you have the disposition of a two-boxer and somehow one-box anyway, you will reap the rewards of a one-boxer.
You don't have to understand why being a one-boxer works in order to acknowledge that it works. And being a one-boxer is simply about actually one-boxing. In other words, you are not a one-boxer merely for thinking like a one-boxer. You can talk the talk, but what matters is whether you walk the walk.
If we follow the two-boxers' conventional line of thinking, it is clear that their argument hyper-fixates on the importance of disposition. Their argument goes:
You cannot influence what your disposition was in the past, and the prediction is based on your disposition, so you cannot influence the contents of the opaque box. Therefore, you might as well focus on what you can control going forward, and the only relevant decision before you now is whether or not to grab an extra thousand dollars.
It may well be true that you cannot influence what your disposition was in the past. However, the muscle spasm thought experiment clearly shows that the prediction itself can be completely unrelated to your disposition.
You may well have had the disposition of a two-boxer. And yet, if you one-box anyway, you actually are a one-boxer. It doesn't matter what factors led you to the point of one-boxing; what matters is that you in fact one-boxed.
From this perspective, the one-boxer can agree with the fundamental tenet of focusing only on what you can control while ignoring everything you cannot.
You may not be able to control who you were in the past, but you can control which button you press. And that is all that matters.
If you press that one-box button, you will have revealed that you were a one-boxer all along.