Anthropic Decision Theory III: Solving Selfless and Total Utilitarian Sleeping Beauty

Stuart_Armstrong

A near-final version of my Anthropic Decision Theory paper is available on the arXiv. Since anthropics problems have been discussed quite a bit on this list, I'll be presenting its arguments and results in this, subsequent, and previous posts 1 2 3 4 5 6.

Consistency

In order to transform the Sleeping Beauty problem into a decision problem, assume that every time she is awoken, she is offered a coupon that pays out £1 if the coin fell tails. She must then decide at what cost she is willing to buy that coupon.

The very first axiom is that of temporal consistency. If your preferences are going to predictably change, then someone will be able to exploit this, by selling you something now that they will buy back for more later, or vice versa. This axiom is implicit in the independence axiom in the von Neumann-Morgenstern axioms of expected utility, where non-independent decisions show inconsistency after partially resolving one of the lotteries. For our purposes, we will define it as:

Temporal Consistency: If an agent at two different times has the same knowledge and preferences, then the past version will never give up anything of value in order to change the decision of the future version.

This is appropriate for the standard Sleeping Beauty problem, but not for the incubator variant, where the different copies are not future or past versions of each other. To deal with that, we extend the axiom to:

Consistency: If two copies of an agent have the same knowledge and preferences, then the one version will never give up anything of value in order to change the decision of the other version.

Note that while 'same preferences' is something we could expect to see for the same agent at different times, it is not necessarily the case for copies, who are generally assumed to be selfish towards each other. Indeed, this whole issue of selflessness, altruism and selfishness will be of great import for the agent's behaviour, as we shall now see.

Selfless Sleeping Beauty

Assume that Sleeping Beauty has an entirely selfless utility function. To simplify matters, we will further assume her utility is linear in cash (though cash for her is simply a tool to accomplish her selfless goals). Before Sleeping Beauty is put to sleep the first time, she will reason as follows:

"In the tails world, future copies of myself will be offered the same deal twice. Any profit they make will be dedicated to my selfless goal, so from my perspective, profits (and losses) will be doubled in the tails world. If my future copies will buy the coupon for £x, there would be an expected £0.5(2(-x + 1) + 1(-x + 0)) = £(1 - (3/2)x) going towards my goal. Hence I would want my copies to buy whenever x < 2/3."

Then by the temporal consistency axiom, this is indeed what her future copies will do. Note that Sleeping Beauty is here showing a behaviour similar to the SIA probabilities -- she is betting on 2:1 odds that she is in the world with two copies.
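The arithmetic in the quoted reasoning can be checked with a short sketch. The function name expected_profit is only illustrative (it does not come from the paper), and the code assumes a fair coin and utility linear in cash, as in the text.

```python
from fractions import Fraction

def expected_profit(x):
    """Expected cash going to the selfless goal if every copy buys at price x.

    Assumes a fair coin: tails gives two awakenings (coupon pays 1 each time),
    heads gives one awakening (coupon pays 0).
    """
    half = Fraction(1, 2)
    tails_total = 2 * (1 - x)  # two purchases, each gaining 1 - x
    heads_total = 1 * (0 - x)  # one purchase, losing x
    return half * tails_total + half * heads_total

# Price at which buying and not buying are equally good: 1 - (3/2)x = 0.
break_even = Fraction(2, 3)

profit_at_half = expected_profit(Fraction(1, 2))           # below break-even
profit_at_three_quarters = expected_profit(Fraction(3, 4))  # above break-even
```

Under these assumptions the expected profit is positive for prices below 2/3 and negative above it, which matches the 2:1 betting odds mentioned above.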

Selfless Incubator Sleeping Beauty

In the incubator variant, there is no initial Sleeping Beauty to make decisions for her future copies. Thus consistency is not enough to resolve the decision problem, even for a selfless Sleeping Beauty. To do so, we will need to make use of our second axiom:

Outside agent: If there exists a collection of identical agents (which may be the same agent at different times) with the same knowledge and preferences, then another copy of them with the same information would never give up anything of value to make them change their decisions.

With this axiom, the situation reduces to the above selfless Sleeping Beauty, by simply adding in the initial Sleeping Beauty again as 'another copy'. Some might feel that that axiom is too strong, that invariance under the creation or destruction of extra copies is something that cannot be simply assumed in anthropic reasoning. An equivalent axiom could be:

Total agent: If there exists a collection of identical agents (which may be the same agent at different times) with the same knowledge and preferences, then they will make their decisions as if they were a single agent simultaneously controlling all their (correlated) decisions.

This axiom is equivalent to the other, with the total agent taking the role of the outside agent. Since all the agents are identical, going through exactly the same reasoning process to reach the same decision, the total agent formulation may be more intuitive. They are, from a certain perspective, literally the same agent. This is the decision that the agents would reach if they could all coordinate with each other, if there were a way of doing this without them figuring out how many of them there were.

Altruistic total utilitarian Sleeping Beauty

An altruistic total utilitarian will have the same preferences over the possible outcomes in a Sleeping Beauty situation: the outcomes in the tails world are doubled, as any gain/loss happens twice, and the altruist adds up the effect of each gain/loss. Hence the altruistic total utilitarian Sleeping Beauty will make the same decisions as the selfless one.

Copy-altruistic total utilitarian Sleeping Beauty

The above argument does not require that Sleeping Beauty be entirely altruistic, only that she be altruistic towards all her copies. Thus she may have selfish personal preferences ("I prefer to have this chocolate bar, rather than letting Snow White get it"), as long as these are not towards her copies ("I'm indifferent as to whether I or Sleeping Beauty II gets the chocolate bar"). She will then make the same decision in this problem as if she were entirely altruistic.

This post looked at situations implying SIA-like behaviour; tomorrow's post will look at cases where SSA-like behaviour is the right way to go.

These could do with forward/backward links. The Article Navigator doesn't seem to be able to get me to number 4 in this series, and the page for 'sleeping_beauty' tag appears empty.

Then by the temporal consistency axiom, this is indeed what her future copies will do.

They have information she doesn't.

Suppose there are one trillion person-days if she wakes up once, and one trillion and one if she wakes up twice. Specifically, her now, her on heads, and her on tails have three separate pieces of information.

The probability of being Sleeping Beauty on the day before the experiment is one in one trillion if the coin lands on heads, and one in one trillion and one if it lands on tails. This gives an odds ratio of 1.000000000001:1.

The probability of being Sleeping Beauty during the experiment is one in one trillion if the coin lands on heads, and two in one trillion and one if it lands on tails (since there are then two days it could be). This gives an odds ratio of 1.000000000001:2.

Since her future self has different information, it makes perfect sense for her to make a different choice.
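A quick sketch of the two odds ratios above, using exact fractions. The constant N and the helper odds_heads_to_tails are names introduced here for illustration; the person-day counts are the ones assumed in this comment, not figures from the paper.

```python
from fractions import Fraction

# Assumed in the comment: one trillion person-days on heads,
# one trillion and one person-days on tails.
N = 10**12

def odds_heads_to_tails(days_if_heads, days_if_tails):
    """Ratio of P(being Sleeping Beauty | heads) to P(being Sleeping Beauty | tails).

    days_if_heads / days_if_tails count the person-days on which
    she could be Sleeping Beauty in each world.
    """
    p_heads = Fraction(days_if_heads, N)
    p_tails = Fraction(days_if_tails, N + 1)
    return p_heads / p_tails

# The day before the experiment: one candidate day in either world.
odds_before = odds_heads_to_tails(1, 1)

# During the experiment: one candidate day on heads, two on tails.
odds_during = odds_heads_to_tails(1, 2)
```

Here odds_before works out to (N + 1)/N, i.e. 1.000000000001:1, and odds_during to (N + 1)/(2N), i.e. 1.000000000001:2, so the two versions of her are conditioning on different evidence.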

There is some disagreement on whether or not probability works that way. (This is technically not an understatement. Some people agree with me.) Suppose it doesn't.

Assuming Sleeping Beauty experiences exactly the same thing both days, she will get no additional relevant information.

If the experiences aren't exactly identical, she's roughly twice as likely to have a given experience if she wakes up twice. For example, if she rolls a die each time she wakes up, there's a one in six chance of rolling a six at least once if she wakes up once, but an 11 in 36 chance if she wakes up twice.
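The die example can be checked directly. This is a minimal sketch assuming a fair six-sided die rolled independently at each awakening; the function name p_at_least_one_six is made up for the illustration.

```python
from fractions import Fraction

def p_at_least_one_six(awakenings):
    """Chance of seeing at least one six across the given number of awakenings.

    Assumes one independent roll of a fair six-sided die per awakening.
    """
    p_no_six_per_roll = Fraction(5, 6)
    return 1 - p_no_six_per_roll ** awakenings

p_once = p_at_least_one_six(1)   # she wakes up once
p_twice = p_at_least_one_six(2)  # she wakes up twice
likelihood_ratio = p_twice / p_once
```

The ratio comes out at 11/6, a little under 2, because the case of rolling a six on both days is only counted once.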

Then by the temporal consistency axiom, this is indeed what her future copies will do.

They have information she doesn't.

Then let's improve the axiom to get rid of that potential issue. Change it to something like:

"If an agent at two different times has the same preferences, then the past version will never give up anything of value in order to change the conditional decision of its future version. Here, conditional decision means the mapping from information to decision."