The Perspective-based Explanation to the Reflective Inconsistency Paradox

If I play this game many times, say 100, and update each time I get a green ball, I will lose on average, and after 100 games I will be at a net loss. So in this game it is better not to update on one's personal position, and EY used this example to demonstrate the power of his Updateless Decision Theory.

Another example: imagine that for each real me, 10 Boltzmann Brains (BBs) appear in the universe. Should I go to the gym? If I update toward being a BB, I should not, since the gym is useless for BBs: they will disappear soon. However, I can adopt a rule that I ignore BBs and go to the gym, and in that case the real me will get the benefits of the gym.

Numerically, it is trivial to say that the better thing to do (for each bet, for the benefit of all participants) is not to update. The question, of course, is how we justify this. After all, it is pretty uncontroversial that the probability of the urn with mostly green balls is 0.9 once I receive a randomly assigned ball that turns out to be green. You can enlist a new type of decision theory such as UDT, or a new type of probability theory that allows two probabilities to both be valid depending on the betting scheme (as Ape in the Coat did). What I am suggesting is to stick with traditional CDT and probability theory, while recognizing the difference between the coordination strategy and the personal strategy, because they come from different perspectives.
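As a sanity check on the 0.9 figure, here is a minimal sketch of the Bayes update. It assumes the setup referred to elsewhere in this thread (20 balls, 18 green in the mostly-green urn, 2 green in the mostly-red urn, both urns equally likely beforehand); the variable names are illustrative only.

```python
from fractions import Fraction

# Assumed setup: 20 balls per urn, urns equally likely a priori.
prior_mostly_green = Fraction(1, 2)
p_green_given_mostly_green = Fraction(18, 20)
p_green_given_mostly_red = Fraction(2, 20)

# Total probability that my randomly assigned ball is green.
p_green = (prior_mostly_green * p_green_given_mostly_green
           + (1 - prior_mostly_green) * p_green_given_mostly_red)

# Bayes' rule: P(mostly green | my ball is green).
posterior_mostly_green = (prior_mostly_green * p_green_given_mostly_green) / p_green

print(posterior_mostly_green)  # 9/10
```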

For the BB example you posted, my long-held position is that there is no way to reason about the probability of "I am a BB", even with the added assumption that for each real me there are 10 BBs in the universe. However, if you really are a BB, then your decision doesn't matter to your personal interest, as you will disappear momentarily. So you can make your personal decision entirely on the assumption that you are not a BB. Alternatively, and I would say not very realistically, you can assume that the real you and the BB yous care about each other and want to come up with a coordination strategy that benefits the entire group, and that each then faithfully follows that strategy without thinking specifically about their own personal strategy. In this example both approaches recommend the same decision: going to the gym.

I absolutely didn't create a new type of probability theory.

People just happen to have some bizarre misconceptions about probability theory, like "you are always supposed to use the power set of the sample space as your event space" or "you can't use more than one probability space to describe a problem". And I point out that nothing in formal probability theory actually justifies such claims. See my recent post and discussion with Throwaway2367 for another example.

Suppose there is another betting rule in the same setting:

Every person in the experiment is proposed to guess whether the coin has landed Heads or Tails. If they guessed correctly, they personally get 10 dollars, otherwise they personally lose 10 dollars.

Now you may notice that if you see green, the correct behaviour is to take this personal bet and refuse the collective bet, thus simultaneously updating and not updating. This may appear paradoxical, unless you understand that we are talking about different probabilities.

Wait. Presumably the pre-game discussion resolved "never bet", right? When you say "However, if a participant received a green ball, he shall update the probability of mostly-green-ball urn from 0.5 to 0.9.", that's just wrong! Your answer to question 2B is true in some sense, but very misleading in setups where the probabilities are formed based on different numbers of independent observers/predictors. In Sleeping Beauty, the possibility of multiple wakings confuses things; in this example, the difference between 2 and 18 green-ball-holders does the damage.

It's the symmetry assumption that shows the problem. If you knew you were in spot 1, then you'd be correct that a green ball is evidence to mostly-green (but you'd have to give up the symmetry argument). Thus, the question is equivalent to "what evidence do I have that I'm in position 1", which makes the sleeping beauty similarity even more clear.

Whether to use SSA or SIA in anthropic probability remains highly dependent on the question setup.

The Sleeping Beauty problem and this paradox are highly similar, I would say they are caused by the same thing—switching of perspectives. However, there is one important distinction.

For the current paradox, there is an actual sampling process for the balls. Therefore there is no need to assume a reference class for "I". Take who I am (which person's perspective I am experiencing the world from) as a given; the ball-assigning process treats "I" and the other participants as equals. So there is no need to interpret "I" as a random sample from all 20 participants. You can perform the probability calculations as in a regular probability problem. This means there is no need to make an SIA-like assumption. The question does not depend on how you construe "the evidence....that I'm in position 1".

I think it is pretty uncontroversial if we take all the betting and money away from the question, we can all agree that the probability becomes 0.9 if I receive a green ball. So if I understand correctly, by disagreeing with this probability, you are in the same position as Ape in the Coat: the correct probability depends on the betting scheme. Which is consistent with your latter statement that "whether to use SSA or SIA...dependent on the question setup."

My position has always been never to use any anthropic assumptions, whether SSA, SIA, FNC or anything else: they all lead to paradoxes. Instead, take the perspective, or in your words "I am in position 1", as primitively given, and reason within this perspective. In the current paradox, that means either reasoning from the perspective of a participant and using the probability of 0.9 to make decisions in your own interest, or, alternatively, thinking in terms of a coordination strategy by reasoning from an impartial perspective with the probability remaining at 0.5, but never mixing the two.

I think we're more in agreement than at odds, here. The edict to avoid mixing or switching perspectives seems pretty strong. I'm not sure I have a good mechanism for picking WHICH perspective to apply to which problems, though. The setup of this (and of Sleeping Beauty) is such that using the probability of 0.9 is NOT actually in your own interest.

This is because of the cost of all the times you'd draw red and have to pay for the idiot version of you who drew green - the universe doesn't care about your choice of perspective; in that sense it's just incorrect to use that probability.

The only out I know is to calculate the outcomes of both perspectives, including the "relevant counterfactuals", which is what I struggle to define. Or to just accept predictably-bad outcomes in setups like these (which is what actually happens in a lot of real-world equilibria).

The probability of 0.9 is the correct one to use to derive "my" strategies maximizing "my" personal interest. For example, if all other participants decide to say yes to the bet, what is your best strategy? Based on the probability of 0.9 you should also say yes, but based on the probability of 0.5 you would say no. However, the former will yield you more money. This would be obvious if the experiment were repeated a large number of times.
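To make the comparison concrete, here is a minimal sketch of my expected personal payoff once I hold a green ball and everyone else has said yes. It assumes the group payoffs of +$12 and -$52 mentioned elsewhere in this thread, an equal 1/20 share for each participant, and that a single "no" means the bet is not taken; the function name is illustrative.

```python
from fractions import Fraction

WIN, LOSS, N = 12, -52, 20  # group payoffs and participant count (assumed from the thread)

def my_expected_payoff(p_mostly_green, say_yes):
    """Expected personal payoff, assuming every other green-ball holder said yes."""
    if not say_yes:
        return Fraction(0)  # assumption: one "no" means the bet is not taken
    group_payoff = p_mostly_green * WIN + (1 - p_mostly_green) * LOSS
    return group_payoff / N  # assumption: payoff shared equally among all 20

yes_at_09 = my_expected_payoff(Fraction(9, 10), True)  # what credence 0.9 expects from "yes"
yes_at_05 = my_expected_payoff(Fraction(1, 2), True)   # what credence 0.5 expects from "yes"
print(yes_at_09, yes_at_05)  # 7/25 -1
```

Under a credence of 0.9, "yes" is worth +$0.28 per game against $0 for "no"; under 0.5 it is worth -$1, so that credence recommends "no". The sketch only shows what each number recommends for this single-decider question; which number is appropriate for which question is what the thread is debating.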

You astutely pinpointed why saying yes is not beneficial: when you draw red, you are paying for the decision of the idiot version of you who drew green. This analysis is based on the assumption that your personal decision prescribes the actions of all participants in similar situations. (This is the assumption that Radford Neal first argued against, and I agree with him.) But then such a decision is no longer a personal decision; it is a decision for all and is evaluated by the overall payoff. That is a coordination strategy, which is based on an objective perspective and should use the probability of 0.5.

The problem is set up in a way that makes people confound the two. Suppose the payoff were not divided among all 20 participants but only among the people holding red balls. The resulting coordination strategy would still be the same (the motivation for coordinating can be that the same group of 20 participants will keep playing for a large number of games). But the distinction between a personal strategy maximizing personal payoff and a coordination strategy maximizing the overall payoff would be obvious: the personal strategy after drawing a green ball is to do whatever you want, because the decision does not affect you (which is well known when coming up with the pre-game coordination plan), but the coordination strategy would remain the same: say no to the bet. In such a setup, people would be less likely to mix the two strategies and pose the result as an inconsistency paradox.

With "However, if a participant received a green ball, he shall update the probability of mostly-green-ball urn from 0.5 to 0.9." dadadarren is just restating the reasoning from the Outlawing Anthropics post:

Let the dilemma be, "I will ask all people who wake up in green rooms if they are willing to take the bet 'Create 1 paperclip if the logical coinflip came up heads, destroy 3 paperclips if the logical coinflip came up tails'. (Should they disagree on their answers, I will destroy 5 paperclips.)" Then a paperclip maximizer, before the experiment, wants the paperclip maximizers who wake up in green rooms to refuse the bet. But a conscious paperclip maximizer who updates on anthropic evidence, who wakes up in a green room, will want to take the bet, with expected utility ((90% * +1 paperclip) + (10% * -3 paperclips)) = +0.6 paperclips.
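The two evaluations in the quoted dilemma can be reproduced with a short sketch. The payoffs (+1 and -3 paperclips) and the 90% figure are taken from the quote; treating the logical coinflip as 50/50 before the experiment is an assumption made here for illustration.

```python
from fractions import Fraction

HEADS_PAYOFF, TAILS_PAYOFF = 1, -3  # paperclips, as in the quoted bet

def expected_paperclips(p_heads):
    """Expected paperclips from taking the bet at a given credence in heads."""
    return p_heads * HEADS_PAYOFF + (1 - p_heads) * TAILS_PAYOFF

before_experiment = expected_paperclips(Fraction(1, 2))   # assumed 50/50 prior
after_seeing_green = expected_paperclips(Fraction(9, 10)) # credence after updating
print(before_experiment, after_seeing_green)  # -1 3/5
```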

I think you partly agree with dadadarren.

Whether to use SSA or SIA in anthropic probability remains highly dependent on the question setup.

Yes, in a way it is the question setup. But which part? I think dadadarren's answer is the use of terms like "I" and "now" in an ambiguous way.

First, this resolution implies probabilities are dependent on the context of betting schemes. It implies a reverse in reasoning: Instead of using the correct probability to generate correct decisions for bets, we ought to check what bets are offered and then work backward to get the correct probability.

A reverse in reasoning is an assumption that probabilities are not meaningful without betting schemes, that decision theory predates probability and that betting is what justifies probabilities. I condemn this view in my post.

As far as I understand, I'm doing basically the same thing as you do. I notice that there are two different questions and describe two different mathematical models: one that talks about probability based on the fact that a specific person sees green, and the other based on the fact that any person sees green. Both of these mathematical models exist before we specify any bets. Both of them are valid for describing different aspects of the setting.

And then, after the bets are specified, we check which model is relevant in our particular case. If the bet is about probability of any person to see green we resolve the decision theory problem using this model. If it's about probability of just one person in particular to see green - we use that model. If there are two bets at the same time, we use both at the same time. Easy as that.

But more importantly, I disagree with interpreting the first-person perspective (indexicals such as "I" or "now") as objectively defined agents. As I have discussed in previous posts, perspective is primitive and there is no transcoding. People consistently try to use assumptions (e.g. SIA, SSA) to "explain away" the first-person perspective in anthropic problems because otherwise they would not be able to answer questions like self-locating probability. But this is accompanied by perspective switches that lead to paradoxes like the question presented here. For the current paradox, interpreting "I" as a decider who always sees green is akin to the logic of the Self-Sampling Assumption, which I have long argued against.

We seem to agree on basically every anthropic problem, up to a point where we are both double halfers in SB, who notice that there is no such thing as probability of *this day is Monday* and yet we have this weird crux that I can't exactly conceptualize. Could you help me, please?

Both SSA and SIA are silly in the general case, because people treat them as candidates for universal laws, as if either SSA or SIA has to *always* be true. And of course, if you then follow these theories off the cliff, you will inevitably arrive at paradoxes.

That's absolutely not what I'm arguing for. I'm saying that to solve these problems we need to investigate the setting and act accordingly. If the problem is designed in such a way that I always see green, then I'm supposed to reason as if I always see green. If it's designed in such a way that I may not see green, then I'm supposed to reason accordingly: update my probability estimate when I see green, according to the law of conservation of expected evidence. And inevitably I'll sometimes agree with SSA and sometimes with SIA. So what?

You are appealing to this "axiomatic I-ness" to justify difference in perspectives between people. But here I'm recreating the same result without the use of this extra entity. You can reduce I-ness to possible events that a person can observe in the setting and still have different probability estimates for a person created in a fission problem and a visitor who comes to one of the rooms at random. What else are you using this extra entity for?

We both argue that the two probabilities, 0.5 and 0.9, are valid. The difference is how we justify both. I have held that "the probability of mostly-green-balls" is a different concept when considered from different perspectives: from a participant's first-person perspective, the probability is 0.9; from an objective outsider's perspective, even after I drew a green ball, it is 0.5. The difference comes from the fact that the inherent self-identification "I" is meaningful only to the first person. This is the same reason behind my argument for perspective disagreement in previous posts.

I purport the two probabilities should be used for questions regarding respective perspectives: for my decisions maximizing my payoffs, use 0.9; for coordination strategy prescribing action of all participants with the goal of maximizing overall payoffs, use 0.5. In fact, the paradox started with the coordination strategy from an objective viewpoint when talking about the pre-game plan, but it later switched to the personal strategy using 0.9.

I understand you do not endorse this perspective-based reasoning. So what is the logical foundation of this duality of probabilities then? If you say they are based on two mathematical models that are both valid, then after you drew a green ball, if someone asks about your probability of the mostly-green urn, what is your answer? 0.5 AND 0.9? It depends?

Furthermore, using whatever probability best matches the betting scheme is, to me, a convenient way of avoiding undesirable answers without committing to a hard methodology. It is akin to endorsing SSA or SIA situationally to get the least paradoxical answer for each individual question. But I also understand that, from your viewpoint, you are following a solid methodology.

If my understanding is correct, you hold that there is only one goal for the current question: maximizing overall payoff and maximizing my personal payoff are the same goal. Furthermore, there is only one strategy: my personal strategy and the coordination strategy are the same strategy. But because of the betting setup, the correct probability to use is 0.5, not 0.9. If so, after drawing the green ball and being told all other participants have said yes to the bet, what is the proper answer to maximize your own gain? Which probability would you use then?

I purport the two probabilities should be used for questions regarding respective perspectives: for my decisions maximizing my payoffs, use 0.9; for coordination strategy prescribing action of all participants with the goal of maximizing overall payoffs, use 0.5. In fact, the paradox started with the coordination strategy from an objective viewpoint when talking about the pre-game plan, but it later switched to the personal strategy using 0.9.

I think I mostly agree with that.

I understand you do not endorse this perspective-based reasoning.

I agree that there can be valid differences in people's perspectives. But I reduce them to differences in the possible events that people can or can't observe. This allows us to reduce all the mysterious anthropic stuff to simple probability theory and makes the reasoning clearer, I believe.

If you say they are based on two mathematical models that are both valid, then after you drew a green ball, if someone asks about your probability of the mostly-green urn, what is your answer? 0.5 AND 0.9? It depends?

As I've written in the post, my personal probability is 0.9. More specifically, it's the probability that the coin is Heads, conditional on *me* seeing green.

But the probability that the coin is Heads, conditional on *any person* seeing green, is 0.5.

This is because while *I* may or may not see green, *someone* from the group always will. *Me* in particular and *any person* have different possible events that we can observe. Thus we have different probabilities for these events. If we had the same possible events, for example because I'm the only person in the experiment, then the probabilities would be the same.

And then you just check which probability is relevant to which betting scheme. In this case it's the probability for *any person*, not for *me*.
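The two conditional probabilities can be checked by brute-force enumeration. This is a minimal sketch under stated assumptions: Heads corresponds to the mostly-green urn (18 green out of 20), Tails to the mostly-red urn (2 green), and my ball is equally likely to be any of the 20; the function and variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

N = 20
GREEN_COUNT = {"Heads": 18, "Tails": 2}  # assumption: Heads means the mostly-green urn

# Sample space: (coin, index of the ball I receive); all 40 outcomes equally likely.
# Balls 0..k-1 are green, where k is the green count for that coin result.
outcomes = list(product(GREEN_COUNT, range(N)))
weight = Fraction(1, len(outcomes))

def i_see_green(coin, my_ball):
    return my_ball < GREEN_COUNT[coin]

def someone_sees_green(coin, my_ball):
    # All 20 balls are handed out, so this holds whenever the urn has any green ball.
    return GREEN_COUNT[coin] > 0

def p_heads_given(event):
    """P(Heads | event), computed by summing over the enumerated outcomes."""
    p_event = sum(weight for o in outcomes if event(*o))
    p_both = sum(weight for o in outcomes if event(*o) and o[0] == "Heads")
    return p_both / p_event

print(p_heads_given(i_see_green))         # 9/10
print(p_heads_given(someone_sees_green))  # 1/2
```

The second event is certain under either coin result, so conditioning on it leaves the prior of 1/2 untouched.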

Furthermore, using whatever probability best matches the betting scheme is, to me, a convenient way of avoiding undesirable answers without committing to a hard methodology. It is akin to endorsing SSA or SIA situationally to get the least paradoxical answer for each individual question.

Of course it would look like that from inside the SSA vs SIA framework. But that's because the framework is stupid.

Imagine there is a passionate disagreement about what color the sky is. Some people claim that it's blue, while other people claim that it's black. There is a significant amount of evidence supporting both sides. For example, a group of blue-sky supporters went outside during the day and recorded that the sky is blue. Then a group of black-sky supporters did the same during the night and recorded that the sky is black. Both groups argue that if the other group had made their experiment from the other side of the planet, the result would have been different. With time, two theories are developed: the Constant Day Assumption and the Constant Night Assumption. Followers of CDA claim that one should reason about the color of the sky as if it's day, while followers of CNA claim that one should reason about the color of the sky as if it's night. Different experiments point in different directions, and both sides claim that, while they do have to bite some bullets, at least it's not as bad as with the other side.

Now suppose someone comes forward and claims that sometimes the sky is blue and sometimes it's black. That the right behaviour is not to always assume that it's either day or night, but to check which is currently true. That when it's day one should reason as if it's day and thus the sky is blue, while when it's night one should reason as if it's night and thus the sky is black. Surprisingly, this new approach fixes all the problems with both CDA and CNA.

But isn't it such an unprincipled position? Just a refusal to commit to one of the theories and switching between them?

No, of course not! It's just how we solve every other question: we actually look at what's going on before making assumptions! It's the core thing about epistemic rationality: if you want to be able to systematically make maps of cities, you have to actually explore them. That's the most solid methodology.

If my understanding is correct, you hold that there is only one goal for the current question: maximizing overall payoff and maximizing my personal payoff are the same goal.

I'm not sure I understand this question. Probabilities are not just about payoffs, we can talk about them even without utility functions over outcomes. But if your probabilistic model is correct then whatever betting scheme is specified you should be able to apply it to get the best payoff. So getting the best payoff isn't the goal in itself but it can be used as a validation for your mathematical model, though one should be careful and not base everything on just betting as there are still weird edge cases, such as Sleeping Beauty which my future posts will explore.

If so, after drawing the green ball and being told all other participants have said yes to the bet, what is the proper answer to maximize your own gain? Which probability would you use then?

If I'm the only decider then probability for *any person* to see green becomes the same as probability of *me* in particular to see green. And so I should say yes to the "collective" bet.

I guess my main problem with your approach is that I don't see a clear rationale for which probability to use, or for when to interpret the evidence as "I see green" and when as "anyone sees green", given that both statements are based on the fact that I drew a green ball.

For example, my argument is that after seeing the green ball, my probability is 0.9, and I shall make all my decisions based on that. Why not update the pre-game plan based on that probability? Because the pre-game plan is not my decision. It is an agreement reached by all participants, a coordination. That coordination is reached by everyone reasoning objectively, which does not accommodate any first-person self-identification like "I". In short, when reasoning from my personal perspective, use "I see green"; when reasoning from an objective perspective, use "someone sees green". All my solutions (PBR) for anthropic and related questions are based on the exact same supposition of the axiomatic status of the first-person perspective. It gives the same explanation each time, and one can predict what this theory says about a problem. Some results are greatly disliked by many, like the nonexistence of self-locating probability and perspective disagreement, but those are clearly conclusions of PBR, and I am advocating it.

You are arguing that the two interpretations, "I see green" and "anyone sees green", are both valid, and that which one to use depends on the specific question. But to me it is unclear what exact logic dictates this assignment. You argue that because the bet is structured so as not to depend on which exact person gets green, "my decision" should be based on "anyone sees green". That seems to me a way of simply selecting whichever interpretation does not yield a problematic result: a practice of fitting the theory to the results.

To the example I brought up in the last reply (what would you do if you drew a green ball and were told that all participants said yes), you used the probability of 0.9, the rationale being that you are the only decider in this case. It puzzles me, because in exactly what sense am I "the only decider"? Didn't other people also decide to say "yes"? Didn't their "yes" contribute to whether the bet would be taken in the same way as your "yes"? If you are saying I am the only decider because whatever I say would determine whether the bet is taken, how is that different from deriving others' responses from the assumption that "everyone in my position would have the same decision as I do"? Yet you used the probability of 0.5 ("someone sees green") in that situation. If you mean that you are the only decider in a causal, counterfactual sense, then you are still in the same position as all other green-ball holders. What justifies the change in which interpretation, and hence which probability (0.5 or 0.9), to use?

There is also our discussion about perspective disagreement in the other post, where you and cousin-it were having a discussion. I, by PBR, concluded there should be a perspective disagreement. You held that there won't be a probability disagreement, because the correct way for Alice to interpret the meeting is "Bob has met Alice in the experiment overall" rather than "Bob has met Alice today". I am not sure of your rationale for picking one interpretation over the other. It seems the correct interpretation is always the one that does not give the problematic outcome. And that, to me, is a practice of avoiding the paradoxes, not a theory that resolves them.

I'm not sure what you're saying here.

Certainly an objective outside observer who is somehow allowed to ask the question, "Has somebody received a green ball?" and receives the answer "yes" has learned nothing, since that was guaranteed to be the case from the beginning. And if this outside observer were somehow allowed to override the participants' decisions, and wished to act in their interest, this outside observer would enforce that they do not take the bet.

But the problem setup does not include such an outside objective observer with power to override the participants' decisions. The actual decisions are all made by individual participants. So where do the differing perspectives come from?

Perhaps of relevance (or perhaps not): If an objective outside observer is allowed to ask the question, "Has somebody with blonde hair, six-foot-two-inches tall, with a mole on the left cheek, barefoot, wearing a red shirt and blue jeans, with a ring on their left hand, and a bruise on their right thumb received a green ball?", which description they know fits exactly one participant, and receives the answer "yes", the correct action for this outside observer, if they wish to act in the interests of the participants, is to enforce that the bet is taken.

I am trying to point out the difference between the following two:

(a) A strategy that prescribes all participants' actions, with the goal of maximizing the overall combined payoff, in the current post I called it the coordination strategy. In contrast to:

(b) A strategy that applies to a single participant's action (mine), with the goal of maximizing my personal payoff; in the current post I called it the personal strategy.

I argue that they are not the same thing: the former should be derived from an impartial observer's perspective, while the latter is based on my first-person perspective. The probabilities are different because self-specification (indexicals such as "I") is not objectively meaningful, giving 0.5 and 0.9 respectively. Consequently the corresponding strategies are not the same. The paradox equates the two: for the pre-game plan it used (a), while for the during-the-game decision it used (b) but attempted to conflate it with (a) by using an acausal analysis to let my decision prescribe everyone's actions, while also capitalizing on the ostensibly convincing intuition that "the best strategy for me is also the best strategy for the whole group since my payoff is 1/20 of the overall payoff."

Admittedly there is no actual external observer forcing the participants to make the move; however, by committing to coordination the participants are effectively committed to that move. This would be quite obvious if we modified the question a bit: instead of dividing the payoff equally among the 20 participants, say the overall payoff is divided only among the red-ball holders. (We can incentivize coordination by letting the same group of participants play the game repeatedly for a large number of games.) What would the pre-game plan be? It would be the same as in the original setup: everyone says no to the bet. (In fact, if played repeatedly, this setup would pay the same as the original setup for any participant.) After drawing a green ball, however, it would be pretty obvious that my decision does not affect my payoff at all, so saying yes or no doesn't matter. But if I am committed to coordination, I ought to keep saying no. In this setup it is also quite obvious that the pre-game strategy is not derived by letting green-ball holders maximize their personal payoff. So the distinction between (a) and (b) is more intuitive.
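The claim that the modified setup pays each participant the same in the long run can be checked with a short sketch. It assumes the bet is always taken, group payoffs of +$12 (mostly green) and -$52 (mostly red) as quoted elsewhere in this thread, equally likely urns, and that each participant is equally likely to hold any ball; all names are illustrative.

```python
from fractions import Fraction

N, WIN, LOSS = 20, 12, -52               # assumed from figures quoted in the thread
HALF = Fraction(1, 2)                    # each urn equally likely
GREEN = {"mostly_green": 18, "mostly_red": 2}
PAYOFF = {"mostly_green": WIN, "mostly_red": LOSS}

def original_setup():
    """My expected payoff per game when the payoff is split among all 20."""
    return sum(HALF * Fraction(PAYOFF[urn], N) for urn in GREEN)

def red_only_setup():
    """My expected payoff per game when the payoff is split among red-ball holders only."""
    total = Fraction(0)
    for urn, greens in GREEN.items():
        reds = N - greens
        p_i_am_red = Fraction(reds, N)
        total += HALF * p_i_am_red * Fraction(PAYOFF[urn], reds)
    return total

print(original_setup(), red_only_setup())  # -1 -1
```

Both come out to -$1 per game if everyone always takes the bet, against $0 if everyone declines, so the pre-game plan is the same in both setups.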

If we recognize the difference between the two, then the fact that (b) does not exactly coincide with (a) is not really a disappointment or a problem requiring explanation. Strategies that are optimal for each individual without coordination don't have to be optimal in terms of overall payoff (as a coordination strategy would be).

Also I can see that the question of "Has somebody with blonde hair, six-foot-two-inches tall, with a mole on the left cheek, barefoot, wearing a red shirt and blue jeans, with a ring on their left hand, and a bruise on their right thumb received a green ball?" comes from your long held position of FNC. I am obliged to be forthcoming and say that I don't agree with it. But of course, I am not naive enough to believe either of us would change our minds in this regard.

Cool puzzle. (I've written like 4 versions of this comment, each time changing the explanation and conclusions and each time realizing I am still confused.)

Now, I think the problem is that we don't pay much attention to: **What should one do when one has drawn a red ball?** *(Yeah, I strategically use the word "one" instead of "I" to sneak in the assumption that everyone should do the same thing.)*

I know, it sounds like an odd question, because, the way the puzzle is talked about, I have no agency when I got a red ball, and I can only wait in despair as the owners of green balls make their moves.

And if you imagine a big 2-dimensional array where each of 100 columns is an iteration of a game, and each of 20 rows is a player, and look at an individual row (a player) then, we'd expect, say 50 columns to be "mostly green", of them roughly 45 have the player "has drawn green" cell, and 50 columns to be "mostly red", with 5 of them having "has drawn green". If you focus just on those 45+5 columns, and note that 45:5 is 0.9:0.1, then yeah, indeed the chance that the column is "mostly green" given "I have drawn green" is 0.9.

AND coincidentally, if you only focus on those 45+5 columns, it looks like to optimize the collective total score limited to those 45+5 columns, the winning move is to take the bet, because then you'll get 0.9*12-0.1*52 dollars.

But what about the other 50 columns??

What about the rounds in which that player has drawn "red"?

Turns out they are mostly negative. So negative, that it overwhelms the gains of the 45+5 columns.
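The column bookkeeping above can be tallied directly. This is a minimal sketch using the expected counts from the 100-game table (45, 5, 5, 45) and the group payoffs of +$12 and -$52, assuming the bet is taken in every game; the names are illustrative.

```python
WIN, LOSS = 12, -52  # group payoff per game: mostly-green urn vs mostly-red urn

# Expected column counts for one player over 100 games:
# (urn type, that player's ball colour) -> number of columns
columns = {
    ("mostly_green", "green"): 45,
    ("mostly_green", "red"): 5,
    ("mostly_red", "green"): 5,
    ("mostly_red", "red"): 45,
}

def group_total(ball_filter):
    """Sum the group payoff over columns whose ball colour passes the filter."""
    total = 0
    for (urn, ball), count in columns.items():
        if ball_filter(ball):
            total += count * (WIN if urn == "mostly_green" else LOSS)
    return total

active = group_total(lambda ball: ball == "green")   # the 45+5 columns
inactive = group_total(lambda ball: ball == "red")   # the other 50 columns
everything = group_total(lambda ball: True)          # all 100 columns
print(active, inactive, everything)  # 280 -2280 -2000
```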

So, the problem is that when thinking about the move in the game, we should not think about

1. "What is the chance one is in mostly green column if one has a green ball?" (to which the answer is 90%)

but rather:

2. "What move should one take to maximize overall payout when one has a green ball?" (to which the answer is: pass)

and that second question is very different from:

3. "What move should one take to maximize payout limited just to the columns in which they drew a green ball when seeing a green ball?" (to which the answer is: take the bet!)

Question 3, even though it sounds very verbose (and thus weird), is actually the one that was naturally substituted mentally (by me, and I think by most people who see the paradox) when thinking about the puzzle, and this is what leads to the paradox.

The (iterated) game has 45+5+50 columns, not just 45+5, and your strategy affects all of them, not just the 45+5 where you are active.

How can that be? Well, I am not good at arguing this part, but to me it feels natural that if rational people are facing the same optimization problem, they should end up with the same strategy, so whatever I end up doing I should expect that others will end up doing too, and I should take that into account when thinking about what to do.

It still feels feel a bit strange to me mathematically, that a solution which seems to be optimal for 20 various different subsets (each having 45+5 columns) of 100 columns individually, is somehow not optimal for the whole 100 columns.

The intuition for why it is possible is that a column which has 18 green fields in it, will be included in 18 sums, and a column which has just 2 green fields in it will be counted in just 2 of them, so this optimization process, focuses too much on the "mostly green" columns, and neglects those "mostly red".
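The weighting effect can be made concrete with a small tally. This is a sketch under the same idealization (50 columns of each urn type, everyone with a green ball takes the bet); the variable names are made up for illustration.

```python
# Compare counting each column once with counting it once per player
# who drew green in it (i.e. summing the 20 per-player "green column" totals).
N_GREEN = {"mostly_green": 18, "mostly_red": 2}     # green balls per column
PAYOFF = {"mostly_green": 12, "mostly_red": -52}    # collective payoff if the bet is taken
N_COLUMNS = {"mostly_green": 50, "mostly_red": 50}  # idealized 100 columns

# Each column counted exactly once.
plain_sum = sum(N_COLUMNS[u] * PAYOFF[u] for u in PAYOFF)

# Each column counted once for every player holding a green ball in it.
per_player_sum = sum(N_COLUMNS[u] * N_GREEN[u] * PAYOFF[u] for u in PAYOFF)

print(plain_sum, per_player_sum)
```

The plain sum is negative while the per-player sum is positive, because the mostly-green columns get weight 18 and the mostly-red columns only weight 2.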

Is it inconsistent to at the same time think:

"The urn is mostly green with ppb 90%" and

"People who think urn is mostly green with ppb 90% should still refuse the bet which pays $12 vs $-52"?

It certainly *sounds* inconsistent, but what about this pair of statements in which I've only changed the first one:

"The urn is mostly green with ppb 10%" and

"People who think urn is mostly green with ppb 90% should still refuse the bet which pays $12 vs $-52"?

Hm, now it doesn't sound so crazy, at least to me.

And this is something a person who has drawn a red ball could think.

So, I think the mental monologue of someone who drew a green ball should be:

"Yes, I think that the urn is mostly green with ppb 90%, by which I mean that if I had to pay -lg(p) Bayes points when it turns out to be mostly green, and -lg(1-p) if it isn't, then I'd choose p=0.9. Like, really, if there is a parallel game with such rules, I should play p=0.9 in it. But still, in this original puzzle game, I should pass, because whatever I do now is whatever people will tend to do in cases like this, and I strongly believe that "People who think urn is mostly green with ppb 90% should still refuse the bet which pays $12 vs $-52", because I can see how this strategy optimizes the payoff in all 100 columns, as opposed to just those 5+45 I am active in. The game in the puzzle doesn't ask me what I think the urn contained, nor for a move which optimizes the payoff limited to the rounds in which I am active. The game asks me: what should be the output of this decision process so that the sum over all 100 columns is the largest? To which the answer is: pass".
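The claim about the parallel "Bayes points" game can be checked with a small grid search. This is a minimal sketch, assuming the penalty is -lg of the probability assigned to the actual outcome and that the player's belief in "mostly green" is 0.9; `expected_penalty` is a made-up helper name.

```python
import math

def expected_penalty(p, belief=0.9):
    """Expected Bayes-point penalty for reporting p when the urn is
    mostly green with probability `belief`."""
    return belief * -math.log2(p) + (1 - belief) * -math.log2(1 - p)

# Candidate reports 0.01, 0.02, ..., 0.99
grid = [i / 100 for i in range(1, 100)]
best_p = min(grid, key=expected_penalty)
print(best_p)
```

On this grid the smallest expected penalty is at the report equal to the belief, which is the sense in which p=0.9 is the right answer to the parallel game while "pass" remains the right move in the original one.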

Eliezer Yudkowsky's post Outlawing Anthropics: An Updateless Dilemma brought up a paradox involving reflective inconsistency. It was originally constructed in anthropic terms but can also be formulated in a non-anthropic context. Recently, Radford Neal and Ape in the coat discussed it in detail with different insights. Here I present how my approach to anthropic paradoxes, perspective-based reasoning, would explain the problem.

The paradox in the non-anthropic context is as follows:

The paradox is presented as follows: the combined payoff if the mostly-green-ball urn is chosen is $12, compared with negative $52 if the mostly-red-ball urn is chosen. As the two urns are equiprobable, the optimal strategy is clearly not to take the bet. However, a participant who receives a green ball should update the probability of the mostly-green-ball urn from 0.5 to 0.9. Then the expected payoff of taking the bet would be 0.9×12+0.1×(-52), which is positive. So taking the bet would be the optimal choice. Furthermore, "**Since everyone with the green ball is in the exact situation as I am, we will reach the same decision**", meaning the participants will all change their original strategy and enter the bet to lose money, which is a departure from the pre-game plan.

## Two Different Questions

Let's not scrutinize the above logic for the moment. Instead, consider the following run-of-the-mill probability questions:

Clearly the probability is 0.5 and no sane person would take the bet.

There is no new information, since it is already known that there will be some participants with green balls. So there is no change to either the probability or the decision.

Again, the probability is clearly 0.5 and there is no way I am taking the bet. (The exact pecuniary payouts are not important, as long as they are in roughly the same ratio.)

Basic Bayesian update would change my probability to 0.9 and entering the bet is the optimal decision.
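The update to 0.9 can be reproduced with exact fractions. This is a minimal sketch, assuming the urn compositions from the paradox (18 green and 2 red, or 2 green and 18 red) and hypothetical personal stakes of +$1 and -$3 that mirror the paradox's per-person amounts; the variable names are made up here.

```python
from fractions import Fraction

prior_green_urn = Fraction(1, 2)            # the two urns are equiprobable
p_green_given_green_urn = Fraction(18, 20)  # chance my ball is green, mostly-green urn
p_green_given_red_urn = Fraction(2, 20)     # chance my ball is green, mostly-red urn

# Bayes' rule: P(mostly-green urn | I drew a green ball)
evidence = (p_green_given_green_urn * prior_green_urn
            + p_green_given_red_urn * (1 - prior_green_urn))
posterior = p_green_given_green_urn * prior_green_urn / evidence

# Hypothetical personal stakes (an assumption, not taken from the post)
WIN, LOSS = 1, -3
ev_before = prior_green_urn * WIN + (1 - prior_green_urn) * LOSS
ev_after = posterior * WIN + (1 - posterior) * LOSS
print(posterior, ev_before, ev_after)
```

With these stakes the personal bet has a negative expectation before seeing the ball and a positive one after drawing green, matching the answers given above.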

## The Erroneous Grafting

The above two problems are as simple as they get for probability questions, and the decisions are nothing that vanilla causal decision theory can't handle. The supposed reflective inconsistency in the paradox, however, is the result of mixing the two. It takes Question 1A and connects it to 2B, then points at the supposed contradiction, while the contradiction is precisely caused by this sleight of hand. Specifically, the paradox switches a probability from an objective perspective with an ostensibly similar, yet categorically different, probability from the participant's first-person perspective.

The question is set up to encourage the confounding of the two. From the outset, the payoff is aggregated for all participants. Even though the paradox takes the long way to present it as paying $1 or taking away $3 for each person, "The total wins and losses are divided equally among the 20 people at the end". So there is no need to consider any participant's individual circumstances. The overall payoff is the only thing that needs to be optimized for the coordination strategy. Participants' personal interest is conveniently set to coincide with it.

Then there is the problem of numerical difference: for Question 1, there is only one decision; for Question 2, multiple participants make their individual decisions. To negate this problem, there has to be a rule transcribing the multiple participants' decisions to the single coordination decision. So the paradox dictates "If anyone with a green ball decides not to take the bet, the bet is off." This numerical difference is also why some versions feel compelled to add an additional premise such as "the game punishes all players grievously if the decisions from the green ball holders are different from one another." It gives the participants a major incentive to avoid different answers.

One last step to enable the grafting is not part of the question setup, but a step in its analysis. That's the assumption "**Since everyone with the green ball is in the exact situation as I am, we will reach the same decision**". I want to be clear here. I am not suggesting this statement is necessarily, or even likely, wrong. I think the factual correctness of the statement is not pertinent to the paradox's resolution. However, it does have the effect of blurring the distinction between the two different concepts. Let's just leave it at that for the time being.

Both the setup points and the assumption hint toward the same idea, which makes the incoming sleight of hand inconspicuous: that the personal decision and the coordination decision are identical, and that their respective strategies, as well as the probabilities, are the same thing.

## The Perspective Probability Difference

The question says the participants gather before the experiment to discuss a coordinating strategy. It is easy to see that, because all participants are in symmetrical positions, the coordination strategy should aim to maximize the overall payoff. As Question 1A shows, the best action would be not taking the bet. So at least one person with a green ball must say no to it. Because there is no communication afterwards, and the balls are randomly assigned, the strategy would dictate everybody saying no if presented with the choice. So, committing to coordinate, I shall say no to the bet.

I would venture to guess that's how most people arrived at the pre-game plan. Notably, it is derived not from any particular participant's viewpoint but from an objective perspective (like that of an outsider). The participants all follow this one optimal strategy. Abiding by the coordination means you do not decide for yourself but carry out the common strategy, as if an objective outsider makes the decision and then has the participants carry out the moves. In contrast, imagine there is no coordination: my personal pre-game plan needs to be derived by maximizing my personal payoff (it may involve finding Nash equilibrium points, and saying "no" would still be one of the solutions). My guess is most of us did not derive the pre-game plan this way, since the question asked for coordination.

The important thing here is to realize that those are two different decision processes, albeit likely recommending the same move in this instance.^{[1]} The coordination strategy is based on the objective perspective and prescribes the move for all, whereas the personal decision is based on my first-person perspective: how others act is the input rather than the output of the decision process. The paradox uses the former decision in the first half of its reasoning.

What should be the coordinating decision after I get the green ball? Recall that the coordination is derived from the objective viewpoint, and self-identity is only meaningful for the first-person perspective (you can't use "I" to specify a particular participant while reasoning objectively). Therefore the information available is that some unspecified participant received a green ball, which is nothing that was not known before. So, like Question 1B, the probability remains at 1/2 and there is no change to the coordination strategy of everyone saying no to the bet.
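Written out, the objective update looks like the following. This is a sketch under the assumption that all 20 balls of the chosen urn are handed out, one per participant, so that some participant holds a green ball whichever urn was chosen. The symbols $G$ (the mostly-green urn was chosen) and $E$ (some unspecified participant holds a green ball) are introduced here only for illustration.

$$
P(G \mid E) = \frac{P(E \mid G)\,P(G)}{P(E \mid G)\,P(G) + P(E \mid \neg G)\,P(\neg G)} = \frac{1 \times 0.5}{1 \times 0.5 + 1 \times 0.5} = 0.5
$$

Because $E$ is certain under both urns, it carries no information and the probability stays at 1/2.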

There is no denying that from the participant's first-person perspective the probability changed to 0.9, like in Question 2B. But as I have discussed in perspective disagreement: two semantically identical questions from different perspectives are not the same question, and they can have different answers. The difference here has the same cause as in my earlier example: self specification is only meaningful for the first-person perspective. If anything, the current case is less dramatic than my previous problem, where two parties communicating and fully sharing all information would nonetheless give different answers and both be correct.

Switching perspectives is the culprit of the inconsistency. Rather than analyzing the coordination strategy from the objective viewpoint, the paradox derived the probability of 0.9 from a participant's perspective. Based on this first-person probability, with the assumption that everyone else would act the same way as I do, the paradox described a collective behaviour for all participants and posited it as the new coordination strategy. In reality, a coordination strategy should prescribe rather than describe.

## Other Takes on the Paradox

In his post, Radford Neal has aptly demonstrated that the inconsistency disappears if we do not assume everyone in the same situation would reason and act the exact same way. I agree. This evaluation is correct. However, in my humble opinion, that is not because such an assumption is unrealistic or possibly factually inaccurate. The real reason, which Radford Neal also pinpointed, is that people tend to use that assumption and treat my personal decision and other participants' decisions as one. This acausal analysis would consider my personal strategy as prescribing every participant's action, confounding it with the coordination strategy. Without this acausal analysis, people won't make the mistake of using first-person probability in determining the coordination strategy.

I also want to call attention to the fact that after drawing a green ball, a personal strategy different from the pre-game plan is not a logical inconsistency. For instance, say you received a green ball. By assuming other participants would follow the pre-game plan and all say no to the bet, you derive that your best move can be either saying yes or no. It makes no difference to your interest. This "yes or no" strategy is different from the "always say no" pre-game plan. Prof Neal said it is a little disappointing that the new plan does not exactly recommend the pre-game plan. But it doesn't have to. The coordination strategy is still the same: "always say no". The difference is because you are only thinking for yourself now, no longer committed to coordinate.^{[2]}

Ape in the coat's solution is committed to the acausal analysis. He resolved the issue of inconsistency by proposing the validity of two different probabilities of which urn is chosen. In that aspect, it's similar to the current post. However, Ape in the coat does not suppose the difference in probability comes from different perspectives and the subsequent difference between coordination and personal strategies. In fact, he specifically mentioned that self specification has nothing to do with probability or math in general. In contrast to the current post, he proposed both probabilities are valid because the way of interpreting the fact "I received a green ball" depends on the betting scheme.

For instance, if a betting scheme's payoff is based on individuals (like Question 2), then the correct way to interpret "I received a green ball" would be "a random person received a green ball" generating the probability of 0.9. However, for a betting scheme specified in the paradox, where the participants with green balls make the decision, the correct way to interpret "I received a green ball" must be "A person who's always getting green balls received a green ball" which is no new info so the probability remains at 0.5.

I disagree with this approach for several reasons. First, this resolution implies probabilities are dependent on the context of betting schemes. It implies a reverse in reasoning: Instead of using the correct probability to generate correct decisions for bets, we ought to check what bets are offered and then work backward to get the correct probability.

But more importantly, I disagree with interpreting the first-person perspective (indexicals such as "I" or "now") as objectively defined agents. As I have discussed in previous posts, perspective is primitive and there is no transcoding. People consistently try to use assumptions (e.g. SIA, SSA) to "explain away" the first-person perspective in anthropic problems because otherwise they would not be able to answer questions like self-locating probability. But this is accompanied by perspective switches that lead to paradoxes like the one presented here. For the current paradox, interpreting "I" as a decider who always sees green is akin to the logic of the Self-Sampling Assumption, which I have long argued against.

^{^}Because Perspective Based Reasoning considers decisions from different perspectives as distinct from one another, it has no problem recognizing different optimal strategies: a possible, though not very satisfying, solution to problems such as the Newcomb paradox.

^{^}We can even have a case where the correct personal strategy is completely opposite to the pre-game plan. For instance, after drawing a green ball, say you assumed others would all say yes to the bet. Then, using the first-person probability of 0.9, you concluded the best act for yourself is to accept the bet. As it turns out, others did say yes, maybe even based on the exact reasoning that you had. So everyone's assumption about others' choices is factually correct, and everyone's decision is the best decision for themselves. But, one may ask, isn't everyone worse off? Isn't this a reflective inconsistency? No. The optimal coordination strategy hasn't changed: if participants were committed to it, they would keep saying no. The change in decision is because they are no longer coordinating as they were while making the pre-game plan. An everyone-for-themselves situation could be worse for everyone than cooperation, even when they all made the best decision for themselves. And in this case it's actualized with a rather unfortunate, nevertheless correct, assumption about others' actions.
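The personal best-response reasoning in this note can be tabulated. This is a minimal sketch under several assumptions: the total of +$12 or -$52 is split equally among the 20 participants, the payoff is $0 when the bet is off, the bet is on only if every green-ball holder says yes, and the other green-ball holders all act alike. The helper `my_expected_share` is a made-up name.

```python
from fractions import Fraction

P_GREEN_URN = Fraction(9, 10)  # first-person probability after drawing green
# Equal share of the collective payoff for each of the 20 participants
SHARE = {"mostly_green": Fraction(12, 20), "mostly_red": Fraction(-52, 20)}

def my_expected_share(i_say_yes, others_say_yes):
    """Expected personal share for one green-ball holder. There is always
    at least one other green-ball holder (each urn has 2 or 18 green balls),
    so the bet is on only if both I and the others say yes."""
    bet_is_on = i_say_yes and others_say_yes
    if not bet_is_on:
        return Fraction(0)
    return (P_GREEN_URN * SHARE["mostly_green"]
            + (1 - P_GREEN_URN) * SHARE["mostly_red"])

for others in (False, True):
    print(others, my_expected_share(True, others), my_expected_share(False, others))
```

If the others say no, my own answer changes nothing, which is the "yes or no" personal strategy. If the others say yes, saying yes has a positive expectation from the first-person view, even though the coordination strategy of always saying no is unchanged.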