Let us start with a (non-quantum) logical coinflip - say, look at the heretofore-unknown-to-us-personally 256th binary digit of pi, where the choice of binary digit is itself intended not to be random.

If the result of this logical coinflip is 1 (aka "heads"), we'll create 18 of you in green rooms and 2 of you in red rooms, and if the result is "tails" (0), we'll create 2 of you in green rooms and 18 of you in red rooms.

After going to sleep at the start of the experiment, you wake up in a green room.

With what degree of credence do you believe - what is your posterior probability - that the logical coin came up "heads"?

There are exactly two tenable answers that I can see, "50%" and "90%".
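The 90% answer is an ordinary Bayesian update on "I woke up in a green room", treating yourself as a uniformly sampled copy; the 50% answer corresponds to declining to update at all. A minimal sketch of the update, assuming that sampling rule (the function name is illustrative, not part of the original problem):

```python
from fractions import Fraction

def posterior_heads_given_green(prior_heads=Fraction(1, 2),
                                green_if_heads=18, green_if_tails=2, total=20):
    """P(heads | I am in a green room), assuming I am a uniformly random copy."""
    like_heads = Fraction(green_if_heads, total)  # P(green | heads)
    like_tails = Fraction(green_if_tails, total)  # P(green | tails)
    evidence = prior_heads * like_heads + (1 - prior_heads) * like_tails
    return prior_heads * like_heads / evidence

print(posterior_heads_given_green())  # 9/10
```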

Suppose you reply 90%.

And suppose you also happen to be "altruistic" enough to care about what happens to all the copies of yourself. (If your current system cares about yourself and your future, but doesn't care about very similar xerox-siblings, then you will tend to self-modify to have *future* copies of yourself care about each other, as this maximizes your *expectation* of pleasant experience over *future* selves.)

Then I attempt to force a reflective inconsistency in your decision system, as follows:

I inform you that, after I look at the unknown binary digit of pi, I will ask all the copies of you in green rooms whether to pay $1 to every version of you in a green room and steal $3 from every version of you in a red room. If they all reply "Yes", I will do so.

(It will be understood, of course, that $1 represents 1 utilon, with actual monetary amounts rescaled as necessary to make this happen. Very little rescaling should be necessary.)

(Timeless decision agents reply as if controlling all similar decision processes, including all copies of themselves. Classical causal decision agents, to reply "Yes" as a group, will need to somehow work out that other copies of themselves reply "Yes", and then reply "Yes" themselves. We can try to help out the causal decision agents on their coordination problem by supplying rules such as "If conflicting answers are delivered, everyone loses $50". If causal decision agents can win on the problem "If everyone says 'Yes' you all get $10, if everyone says 'No' you all lose $5, if there are conflicting answers you all lose $50" then they can presumably handle this. If not, then ultimately, I decline to be responsible for the stupidity of causal decision agents.)

Suppose that you wake up in a green room. You reason, "With 90% probability, there are 18 of me in green rooms and 2 of me in red rooms; with 10% probability, there are 2 of me in green rooms and 18 of me in red rooms. Since I'm altruistic enough to at least care about my xerox-siblings, I calculate the expected utility of replying 'Yes' as (90% * ((18 * +$1) + (2 * -$3))) + (10% * ((18 * -$3) + (2 * +$1))) = +$5.60." You reply yes.

However, before the experiment, you calculate the general utility of the conditional strategy "Reply 'Yes' to the question if you wake up in a green room" as (50% * ((18 * +$1) + (2 * -$3))) + (50% * ((18 * -$3) + (2 * +$1))) = -$20. You want your future selves to reply 'No' under these conditions.
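The two calculations use the same payoff table and differ only in the probability assigned to heads. A small sketch to check the arithmetic (the helper names are illustrative):

```python
def payoff_if_all_say_yes(greens, reds):
    """Total utilons when every green-roomer says 'Yes': +$1 per green copy, -$3 per red copy."""
    return greens * 1 + reds * -3

def expected_utility_of_yes(p_heads):
    heads = payoff_if_all_say_yes(greens=18, reds=2)   # +12
    tails = payoff_if_all_say_yes(greens=2, reds=18)   # -52
    return p_heads * heads + (1 - p_heads) * tails

print(round(expected_utility_of_yes(0.9), 2))  # 5.6, after updating in a green room
print(round(expected_utility_of_yes(0.5), 2))  # -20.0, before the experiment
```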

This is a dynamic inconsistency - different answers at different times - which argues that decision systems which update on anthropic evidence will self-modify not to update probabilities on anthropic evidence.

I originally thought, on first formulating this problem, that it had to do with *double-counting* the *utilons* gained by your variable numbers of green friends, and the *probability* of being one of your green friends.

However, the problem also works if we care about paperclips. No selfishness, no altruism, just paperclips.

Let the dilemma be, "I will ask all people who wake up in green rooms if they are willing to take the bet 'Create 1 paperclip if the logical coinflip came up heads, destroy 3 paperclips if the logical coinflip came up tails'. (Should they disagree on their answers, I will destroy 5 paperclips.)" Then a paperclip maximizer, before the experiment, wants the paperclip maximizers who wake up in green rooms to refuse the bet. But a conscious paperclip maximizer who updates on anthropic evidence, who wakes up in a green room, will want to take the bet, with expected utility ((90% * +1 paperclip) + (10% * -3 paperclips)) = +0.6 paperclips.
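In the paperclip version no utilities are summed over copies, yet the sign of the bet still flips with the probability used. A quick check, assuming unanimous answers so that the 5-paperclip penalty never triggers:

```python
from fractions import Fraction

def paperclip_bet_value(p_heads):
    """Expected paperclips from accepting: +1 on heads, -3 on tails (unanimous answers assumed)."""
    return p_heads * 1 + (1 - p_heads) * -3

print(paperclip_bet_value(Fraction(1, 2)))   # -1
print(paperclip_bet_value(Fraction(9, 10)))  # 3/5
```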

This argues that, in general, decision systems - whether they start out selfish, or start out caring about paperclips - will not want their future versions to update on anthropic "evidence".

Well, that's not too disturbing, is it? I mean, the whole anthropic thing seemed very confused to begin with - full of notions about "consciousness" and "reality" and "identity" and "reference classes" and other poorly defined terms. Just throw out anthropic reasoning, and you won't have to bother.

When I explained this problem to Marcello, he said, "Well, we don't want to build conscious AIs, so of course we don't want them to use anthropic reasoning", which is a fascinating sort of reply. And I responded, "But when you have a problem this confusing, and you find yourself wanting to build an AI that just doesn't use anthropic reasoning to begin with, maybe that implies that the correct resolution involves *us* not using anthropic reasoning either."

So we can just throw out anthropic reasoning, and relax, and conclude that we are Boltzmann brains. QED.

In general, I find the sort of argument given here - that a certain type of decision system is not reflectively consistent - to be pretty damned compelling. But I also find the Boltzmann conclusion to be, ahem, more than ordinarily unpalatable.

In personal conversation, Nick Bostrom suggested that a division-of-responsibility principle might cancel out the anthropic update - i.e., the paperclip maximizer would have to reason, "If the logical coin came up heads then I am 1/18th responsible for adding +1 paperclip, if the logical coin came up tails then I am 1/2 responsible for destroying 3 paperclips." I confess that my initial reaction to this suggestion was "Ewwww", but I'm not exactly comfortable concluding I'm a Boltzmann brain, either.
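One possible reading of the division-of-responsibility suggestion, sketched numerically: keep the 90% posterior, but scale each outcome by one over the number of green-room deciders in that world. This is my interpretation of the one-sentence description above, not a formalization Bostrom gave.

```python
from fractions import Fraction

p_heads_given_green = Fraction(9, 10)

# Responsibility share = 1 / (number of green-room deciders in that world).
heads_term = p_heads_given_green * Fraction(1, 18) * 1          # +1 paperclip
tails_term = (1 - p_heads_given_green) * Fraction(1, 2) * -3    # -3 paperclips

print(heads_term, tails_term, heads_term + tails_term)  # 1/20 -3/20 -1/10
```

Under this reading both worlds end up with the same weight of 1/20, so the sign agrees with the pre-experiment calculation and the bet is refused.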

EDIT: On further reflection, I also wouldn't want to build an AI that concluded it was a Boltzmann brain! Is there a form of inference which rejects this conclusion without relying on any reasoning about subjectivity?

EDIT2: Psy-Kosh has converted this into a non-anthropic problem!

Actually... how is this an anthropic situation *AT ALL*?

I mean, wouldn't it be equivalent to, say, gathering 20 rational people (who understand PD, etc etc etc, and can certainly manage to agree to coordinate with each other) who are allowed to meet with each other in advance and discuss the situation...

I show up and tell them that I have two buckets of marbles, some of which are green, some of which are red.

One bucket has 18 green and 2 red, and the other bucket has 18 red and 2 green.

I will flip (already have flipped) a logical coin. Depending on the outcome, I will use either one bucket or the other.

After having an opportunity to discuss strategy, they will be allowed to reach into the bucket without looking, pull out a marble, look at it, then, if it's green, choose whether to pay and steal, etc etc etc. (In case it's not obvious, the payout rules are equivalent to the OP's.)

As near as I can determine, this situation is entirely equivalent to the OP and is in no way an anthropic one. If the OP actually is an argument against anthropic updates in the presence of logical uncertainty... then it's actually an argument against the general case of Bayesian updating in the presence of logical uncertainty, even when there's no anthropic stuff going on at all!
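The marble version is easy to simulate. The sketch below assumes the logical coin can be modeled as a fair random coin across repeated trials (which is exactly the assumption a logical coin puts under strain), and that all green-holders say "Yes". It tallies how often a green-holder is drawing from the mostly-green bucket, and the average group payoff.

```python
import random

def simulate(trials=20_000, seed=0):
    rng = random.Random(seed)
    green_holders = 0           # people who drew green, summed over trials
    green_and_mostly_green = 0  # ...of whom were drawing from the 18-green bucket
    total_payoff = 0            # group payoff if every green-holder says "Yes"
    for _ in range(trials):
        mostly_green = rng.random() < 0.5
        bucket = ["G"] * (18 if mostly_green else 2) + ["R"] * (2 if mostly_green else 18)
        rng.shuffle(bucket)     # 20 people each draw one marble, without replacement
        greens = bucket.count("G")
        green_holders += greens
        if mostly_green:
            green_and_mostly_green += greens
        total_payoff += greens * 1 + (20 - greens) * -3
    return green_and_mostly_green / green_holders, total_payoff / trials

freq, avg_payoff = simulate()
print(f"P(mostly-green bucket | I drew green) ~ {freq:.3f}")
print(f"average group payoff of always saying Yes ~ {avg_payoff:.2f}")
```

The first number should come out near 0.9 and the second near -20: the per-person update is well calibrated, and the group strategy still loses.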

EDIT: oh, in case it's not obvious, marbles are *not* replaced after being drawn from the bucket.

An AI that runs UDT wouldn't conclude that it was a Boltzmann or non-Boltzmann brain. For such an AI, the statement has no meaning, since it's always *both*. The closest equivalent would be "Most of the value I can create by making the right decision is concentrated in the vicinity of non-Boltzmann brains."

BTW, does my indexical uncertainty and the Axiom of Independence post make any more sense now?

Why is anthropic reasoning related to consciousness at all? Couldn't any kind of Bayesian reasoning system update on the observation of its own existence (assuming such updates are a good idea in the first place)?

Curses on this problem; I spent the whole day worrying about it, and am now so much of a wreck that the following may or may not make sense. For better or worse, I came to a similar conclusion to Psy-Kosh's: that this could work in less anthropic problems. Here's the equivalent I was using:

Imagine Omega has a coin biased so that it comes up the same way nine out of ten times. You know this, but you don't know which way it's biased. Omega allows you to flip the coin once, and asks for your probability that it's biased in favor of heads. The coin comes up head...

"I've made sacrifices! You don't know what it cost me to climb into that machine every night, not knowing if I'd be the man in the box or in the prestige!"

sorry- couldn't help myself.

Again: how can you talk about *concluding* that you are a Boltzmann brain? To conclude means to update, and here you refuse updating.

More thinking out loud:

It really is in your best interest to accept the offer after you're in a green room. It really is in your best interest to accept the offer conditional on being in a green room before you're assigned. Maybe part of the problem arises because you think your decision will influence the decision of others, i.e., because you're acting like a timeless decision agent. Replace "me" with "anyone with my platonic computation", and "I should accept the offer conditional on being in a green room" with "anyone wit...

I've been watching for a while, but have never commented, so this may be horribly flawed, opaque or otherwise unhelpful.

I think the problem is entirely caused by the use of the wrong sets of belief, and that anything holding to Eliezer's 1-line summary of TDT or alternatively UDT should get this right.

Suppose that you're a rational agent. Since you are instantiated in multiple identical circumstances (green rooms) and asked identical questions, your answers should be identical. Hence if you wake up in a green room and you're asked to steal from the red roo...

I think I'm with Bostrom.

The problem seems to come about because the good effects of 18 people being correct are more than wiped out by the bad effects of 2 people being wrong.

I'm sure this imbalance in the power of the agents has something to do with it.

I read this and told myself that it only takes five minutes to have an insight. Five minutes later, here's what I'm thinking:

Anthropic reasoning is confusing because it treats consciousness as a primitive. By doing so, we're committing LW's ultimate no-no: assuming an ontologically fundamental mental state. We need to find a way to reformulate anthropic reasoning in terms of Solomonoff induction. If we can successfully do so, the paradox will dissolve.

You can't reject the conclusion that you are a Boltzmann brain - but if you are, it doesn't matter what you do, so the idea doesn't seem to have much impact on decision theory.

There are lots of ordinary examples in game theory of time inconsistent choices. Once you know how to resolve them, then if you can't use those approaches to resolve this I might be convinced that anthropic updating is at fault. But until then I think you are making a huge leap to blame anthropic updating for the time inconsistent choices.

I waited to comment on this, to see what others would say. Right now Psy-Kosh seems to be right about anthropics; Wei Dai seems to be right about UDT; timtyler seems to be right about Boltzmann brains; byrnema seems to be mostly right about pointers; but I don't understand why nobody latched on to the "reflective consistency" part. Surely the kind of consistency under observer-splitting that you describe is too strong a requirement in general: if two copies of you play a game, the correct behavior for both of them would be to try to win, regardle...

I think I'll have to sit and reread this a couple times, but my *INITIAL* thought is "Isn't the apparent inconsistency here qualitatively similar to the situation with a counterfactual mugging?"

Huh. Reading this again, together with byrnema's pointer discussion and Psy-Kosh's non-anthropic reformulation...

It seems like the problem is that whether each person gets to make a decision depends on the evidence they think they have, in such a way to make that evidence meaningless. To construct an extreme example: The Antecedent Mugger gathers a billion people in a room together, and says:

"I challenge you to a game of wits! In this jar is a variable amount of coins, between $0 and $10,000. I will allow each of you to weigh the jar using this set of...

isn't this a problem with the frequency you are presented with the opportunity to take the wager? [no, see edit]

the equation: (50% * ((18 * +$1) + (2 * -$3))) + (50% * ((18 * -$3) + (2 * +$1))) = -$20 neglects to take into account that you will be offered this wager nine times more often in conditions where you win than when you lose.

for example, the wager: "i will flip a fair coin and pay you $1 when it is heads and pay you -$2 when it is tails" is -EV in nature. however if a conditional is added where you will be asked if you want to take the bet 9...

If the many worlds interpretation of quantum mechanics is true, isn't anthropic reasoning involved in making predictions about the future of quantum systems? There exists some world in which, from the moment this comment is posted onward, all attempts to detect quantum indeterminacy fail, all two-slit experiments yield two distinct lines instead of a wave pattern etc. Without anthropic reasoning we have no reason to find this result at all surprising. So either we need to reject anthropic reasoning or we need to reject the predictive value of quantum mechan...

The reason we shouldn't update on the "room color" evidence has nothing to do with the fact that it constitutes anthropic evidence. The reason we shouldn't update is that we're *told*, albeit indirectly, that we shouldn't update (because if we do then some of our copies will update differently and we will be penalized for our disagreement).

In the real world, there is no incentive for all the copies of ourselves in all universes to agree, so it's all right to update on anthropic evidence.

Oops... my usual mistake of equivocating different things and evolving the problem until it barely resembles the original. I will update my "solution" later if it still works for the original.

... Sigh. Won't work. My previous "solution" recovered the correct answer of -20 because I bent the rules enough to have each of my green-room-deciders make a global rather than anthropic calculation.

This assumes that the question is asked only once, but then, to which of the 20 copies will it be asked?

If all 20 copies get asked the same question (or equivalently if a single copy chosen at random is) then the utility is (50% * 18/20 * ((18 * +$1) + (2 * -$3))) + (50% * 2/20 * ((18 * -$3) + (2 * +$1))) = 2.8$ = 50% * 5.6$.

Consider th...

I keep having trouble thinking of probabilities when I'm to be copied and >=1 of "me" will see red and >=1 of "me" will see green. My thought is that it is 100% likely that "I" will see red and know there are others, once-mes, who see green, and 100% likely vice-versa. Waking up to see red (green) is exactly the expected result.

I do not know what to make of this opinion of mine. It's as if my definition of self - or choice of body - is in superposition. Am I committing an error here? Suggestions for further reading would be appreciated.

I remain convinced that the probability is 90%.

The confusion is over whether you want to maximize the expectation of the number of utilons there will be if you wake up in a green room or the expectation of the number of utilons you will observe if you wake up in a green room.

The notion of "I am a Boltzmann brain" goes away when you conclude that conscious experience is a Tegmark-4 thing, and that equivalent conscious experiences are mathematically equal and therefore there is no difference and you are at the same time a human being and a Boltzmann brain, at least until they diverge.

Thus, anthropic reasoning is right out.

Whoohoo! I just figured out the correct way to handle this problem, that renders the global and egocentric/internal reflections consistent.

We will see if my solution makes sense in the morning, but the upshot is that there was/is nothing wrong with the green roomer's *posterior*, as many people have been correctly defending. The green roomer who computed an EV of $5.60 modeled the money pay-off scheme wrong.

In the incorrect calculation that yields $5.6 EV, the green roomer models himself as winning (getting the favorable +$12) when he is right and losing (...

I would perhaps prefer to use diff...

Can someone come up with a situation of the same general form as this one where anthropic reasoning results in optimal actions and nonanthropic reasoning results in suboptimal actions?

I think you're missing a term in your second calculation. And why are anthropism and copies of you necessary for this puzzle? I suspect the answer will indicate something I'm completely missing about this series.

Take this for straight-up probability:

I have two jars of marbles, one with 18 green and 2 red, the other with 18 red and two green. Pick one jar at random, then look at one marble from that jar at random.

If you pick green, what's the chance that your jar is mostly green? I say 90%, by fairly straightforward application of Bayes' rule.

I offer...

In this comment:

http://lesswrong.com/lw/17d/forcing_anthropics_boltzmann_brains/138u

I put forward my view that the best solution is to just maximize total utility, which correctly handles the forcing anthropics case, and expressed curiosity as to whether it would handle the outlawing anthropics case.

It now seems my solution does correctly handle the outlawing anthropics case, which would seem to be a data point in its favor.

Assume that each agent *has his own game* (that is, one game for each agent). That is, there are overall 18 (or 2) games, depending on the result of the coin flip.

Then the first calculation would be correct in every respect, and it makes sense to say yes from a global point of view. (And also, with any other reward matrix, the dynamic update would be consistent with the a priori decision all the time.)
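A rough check of this claim, under the assumption (mine, since the comment does not spell it out) that each green-room agent's separate game pays the post's full group payoff of +12 on heads and -52 on tails:

```python
from fractions import Fraction

half = Fraction(1, 2)
heads_payoff, tails_payoff = 12, -52    # group payoff of one game, as in the post
games_if_heads, games_if_tails = 18, 2  # one game per green-room agent

prior_total = half * games_if_heads * heads_payoff + half * games_if_tails * tails_payoff
expected_games = half * games_if_heads + half * games_if_tails

print(prior_total)                   # 56
print(prior_total / expected_games)  # 28/5
```

With one game per green agent, the pre-experiment total is positive, and dividing by the expected number of games recovers the updated per-game value of $5.60, so the two viewpoints agree.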

This shows that the error made by the agent was to implicitly assume that he *has his own game*.

How about give all of your potential clones a vote, even though you can't communicate?

So, in one case, 18 of you would say "Yes, take the bet!" and 2 would say "No, let me keep my money." In the other case, 18 would say no and two would say yes. In either case, of course, you're one of the ones who would vote yes. OK, that leaves us tied. So why not let everyone's vote be proportional to what they stand to gain/lose? That leaves us with 20 * -3 vs. 20 * 1. Don't take the bet.

(Yes, I realize half the people that just voted above don't exist. We just don't know which half...)
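The stake-weighted vote can be tallied directly. A small sketch, with each potential clone's vote weighted by the dollar amount it stands to gain or lose (the table layout is illustrative):

```python
# Stake-weighted vote over all 40 potential clones (20 per coin outcome).
voters = [
    # (count, stake per voter, vote)
    (18, +1, "yes"),  # heads: green-roomers gain $1
    (2,  -3, "no"),   # heads: red-roomers lose $3
    (2,  +1, "yes"),  # tails: green-roomers gain $1
    (18, -3, "no"),   # tails: red-roomers lose $3
]

weight = {"yes": 0, "no": 0}
for count, stake, vote in voters:
    weight[vote] += count * abs(stake)

print(weight)  # {'yes': 20, 'no': 60}
```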

As has been pointed out, this is not an anthropic problem; however, there still is a paradox. I may be stating the obvious, but the root of the problem is that you're doing something fishy when you say that the other people will think the same way and that your decision will be theirs.

The proper way to make a decision is to have a probability distribution on the code of the other agents (which will include their prior on your code). From this I believe (but can't prove) that you will take the correct course of action.

Newcomb-like problems fall in the same category; the trick is that there is always a belief about someone's decision making hidden in the problem.

[EDIT:] Warning: This post was based on a misunderstanding of the OP. Thanks orthonormal for pointing out the mistake! I leave this post here so that the replies stay in context.

I think that the decision matrix of the agent waking up in a green room is not complete: it should contain the outcome of losing $50 if the answers are not consistent.

Therefore, it would compute that even if the probability that the coin came up 1 is 90%, it still does not make sense to answer "yes" since two other copies would answer "no" and therefore the ...

EDIT: at first I thought this was equivalent, but then I tried the numbers and realized it's not.

If the color is green, do you take the bet?

EDIT: After playing with the numbers, I think reaso...

Perhaps we should look at Drescher's Cartesian Camcorder as a way of reducing consciousness, and thereby eliminate this paradox.

Or, to turn it around, this paradox is a litmus test for theories of consciousness.

The more I think about this, the more I suspect that the problem lies in the distinction between quantum and logical coin-flips.

Suppose this experiment is carried out with a quantum coin-flip. Then, under many-worlds, both outcomes are realized in different branches. There are 40 future selves--2 red and 18 green in one world, 18 red and 2 green in the other world--and your duty is clear:

Don't take the bet.

So why Eliezer's insistence on using a logical coin-flip? Because, I suspect,...

Is there any version of this post that doesn't involve technologies that we don't have? If not, then might the resolution to this paradox be that the copying technology assumed to exist can't exist, because if it did it would give rise to a logical inconsistency?

Edit: presumably there's an answer already discussed that I'm not aware of, probably common to all games where Omega creates N copies of you. (Since so many of them have been discussed here.) Can someone please point me to it?

I'm having difficulties ignoring the inherent value of having N copies of you created. The scenario assumes that the copies go on existing after the game, and that they each have the same amount of utilons as the original (instead of a division of some kind).

For suppose the copies are short lived: Omega destroys them after the game....