Beauty quips, "I'd shut up and multiply!"

When it comes to probability, you should trust probability laws over your intuition.  Many people got the Monty Hall problem wrong because their intuition was bad.  You can get the solution to that problem using probability laws that you learned in Stats 101 -- it's not a hard problem.  Similarly, there has been a lot of debate about the Sleeping Beauty problem.  Again, though, that's because people are starting with their intuition instead of letting probability laws lead them to understanding.

The Sleeping Beauty Problem

On Sunday she is given a drug that sends her to sleep. A fair coin is then tossed just once in the course of the experiment to determine which experimental procedure is undertaken. If the coin comes up heads, Beauty is awakened and interviewed on Monday, and then the experiment ends. If the coin comes up tails, she is awakened and interviewed on Monday, given a second dose of the sleeping drug, and awakened and interviewed again on Tuesday. The experiment then ends on Tuesday, without flipping the coin again. The sleeping drug induces a mild amnesia, so that she cannot remember any previous awakenings during the course of the experiment (if any). During the experiment, she has no access to anything that would give a clue as to the day of the week. However, she knows all the details of the experiment.

Each interview consists of one question, "What is your credence now for the proposition that our coin landed heads?"

Two popular solutions have been proposed: 1/3 and 1/2.

The 1/3 solution

From Wikipedia:

Suppose this experiment were repeated 1,000 times. We would expect to get 500 heads and 500 tails. So Beauty would be awoken 500 times after heads on Monday, 500 times after tails on Monday, and 500 times after tails on Tuesday. In other words, only in a third of the cases would heads precede her awakening. So the right answer for her to give is 1/3.

Yes, it's true that only in a third of cases would heads precede her awakening.
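As a rough check on the counting in the quoted argument, here is a minimal simulation sketch. The function name `simulate_awakenings`, the seed, and the number of runs are illustrative choices of mine, not part of the original argument. It only tallies awakenings; it says nothing by itself about what Beauty's credence should be.

```python
import random

def simulate_awakenings(n_experiments, seed=0):
    """Run the experiment n_experiments times and tally every awakening."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    counts = {
        ("heads", "Monday"): 0,
        ("tails", "Monday"): 0,
        ("tails", "Tuesday"): 0,
    }
    for _ in range(n_experiments):
        if rng.random() < 0.5:
            # Heads: a single awakening, on Monday.
            counts[("heads", "Monday")] += 1
        else:
            # Tails: two awakenings, Monday and Tuesday.
            counts[("tails", "Monday")] += 1
            counts[("tails", "Tuesday")] += 1
    return counts

counts = simulate_awakenings(100_000)
total_awakenings = sum(counts.values())
heads_share = counts[("heads", "Monday")] / total_awakenings
print(counts)
print(round(heads_share, 3))  # share of awakenings that follow heads
```

The share of awakenings preceded by heads comes out close to 1/3, which is the frequency the quoted argument describes.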

Radford Neal (a statistician!) argues that 1/3 is the correct solution.

This [the 1/3] view can be reinforced by supposing that on each awakening Beauty is offered a bet in which she wins 2 dollars if the coin lands Tails and loses 3 dollars if it lands Heads. (We suppose that Beauty knows such a bet will always be offered.) Beauty would not accept this bet if she assigns probability 1/2 to Heads. If she assigns a probability of 1/3 to Heads, however, her expected gain is 2 × (2/3) − 3 × (1/3) = 1/3, so she will accept, and if the experiment is repeated many times, she will come out ahead.

Neal is correct (about the gambling problem).
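The arithmetic in Neal's bet can be reproduced with exact fractions. This is a small sketch; the helper name `expected_gain` and its parameter names are hypothetical, and the per-experiment figure assumes, as the quote says, that the bet is offered at every awakening.

```python
from fractions import Fraction

def expected_gain(credence_heads, win_if_tails=2, loss_if_heads=3):
    """Expected gain of one bet, evaluated with a given credence in heads."""
    return win_if_tails * (1 - credence_heads) - loss_if_heads * credence_heads

gain_third = expected_gain(Fraction(1, 3))  # 2 * (2/3) - 3 * (1/3)
gain_half = expected_gain(Fraction(1, 2))   # 2 * (1/2) - 3 * (1/2)

# Per experiment with a fair coin: heads settles one losing bet (-3),
# tails settles two winning bets (+2 each).
per_experiment = Fraction(1, 2) * (-3) + Fraction(1, 2) * (2 * 2)

print(gain_third, gain_half, per_experiment)  # 1/3 -1/2 1/2
```

Accepting every bet does come out ahead per experiment, which is the sense in which Neal is right about the gambling problem.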

These two arguments for the 1/3 solution appeal to intuition and make no obvious mathematical errors.   So why are they wrong?

Let's first start with probability laws and show why the 1/2 solution is correct. Just like with the Monty Hall problem, once you understand the solution, the wrong answer will no longer appeal to your intuition.

The 1/2 solution

P(Beauty woken up at least once | heads) = P(Beauty woken up at least once | tails) = 1.  Because of the amnesia, all Beauty knows when she is woken up is that she has woken up at least once.  That event had the same probability of occurring under either coin outcome, and the coin is fair, so the prior for each outcome is 1/2.  Thus, P(heads | Beauty woken up at least once) = 1/2.  You can use Bayes' rule to see this if it's unclear.
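For anyone who wants the Bayes' rule step spelled out, here is a minimal sketch. The function name `posterior_heads` is hypothetical; the inputs are the fair-coin prior of 1/2 and the two likelihoods of 1 stated above.

```python
from fractions import Fraction

def posterior_heads(prior_heads, likelihood_heads, likelihood_tails):
    """Bayes' rule: P(heads | evidence) from a prior and two likelihoods."""
    numerator = prior_heads * likelihood_heads
    denominator = numerator + (1 - prior_heads) * likelihood_tails
    return numerator / denominator

# Evidence: "Beauty was woken up at least once", which has probability 1
# under heads and probability 1 under tails.
p_heads_given_woken = posterior_heads(Fraction(1, 2), Fraction(1), Fraction(1))
print(p_heads_given_woken)  # 1/2
```

Because the two likelihoods are equal, the posterior equals the prior.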

Here's another way to look at it:

If it landed heads then Beauty is woken up on Monday with probability 1.

If it landed tails then Beauty is woken up on Monday and Tuesday.  From her perspective, these days are indistinguishable.  She doesn't know if she was woken up the day before, and she doesn't know if she'll be woken up the next day.  Thus, we can view Monday and Tuesday as exchangeable here.

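A probability tree can help with the intuition (this is a probability tree corresponding to an arbitrary wake up day).  The first set of branches is the coin: heads with probability 1/2, tails with probability 1/2.  The second set of branches is the day: after heads, Monday with probability 1; after tails, Monday with probability 1/2 and Tuesday with probability 1/2.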

If Beauty was told the coin came up heads, then she'd know it was Monday.  If she was told the coin came up tails, then she'd think there is a 50% chance it's Monday and a 50% chance it's Tuesday.  Of course, when Beauty is woken up she is not told the result of the flip, but she can calculate the probability of each.

When she is woken up, she's somewhere on the second set of branches.  We have the following joint probabilities: P(heads, Monday)=1/2; P(heads, not Monday)=0; P(tails, Monday)=1/4; P(tails, Tuesday)=1/4; P(tails, not Monday or Tuesday)=0.  Thus, P(heads)=1/2.
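The joint probabilities listed above can be checked mechanically. This is only a bookkeeping sketch: the dictionary `joint` copies the nonzero values from the text, and the helper `marginal` is a name I made up for illustration.

```python
from fractions import Fraction

# Nonzero joint probabilities for an arbitrary wake-up day, as given in the text.
joint = {
    ("heads", "Monday"): Fraction(1, 2),
    ("tails", "Monday"): Fraction(1, 4),
    ("tails", "Tuesday"): Fraction(1, 4),
}

def marginal(coin):
    """Sum the joint probabilities over days for one coin outcome."""
    return sum(p for (c, _day), p in joint.items() if c == coin)

total = sum(joint.values())
p_heads = marginal("heads")
p_tails = marginal("tails")
print(total, p_heads, p_tails)  # 1 1/2 1/2
```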

Where the 1/3 arguments fail

The 1/3 argument says with heads there is 1 interview, with tails there are 2 interviews, and therefore the probability of heads is 1/3.  However, the argument would only hold if all 3 interview days were equally likely.  That's not the case here: on a wake-up day, heads & Monday is more likely than tails & Monday, for example.

Neal's argument fails because he changed the problem. "on each awakening Beauty is offered a bet in which she wins 2 dollars if the coin lands Tails and loses 3 dollars if it lands Heads."  In this scenario, she would make the bet twice if tails came up and once if heads came up.  That has nothing to do with probability about the event at a particular awakening.  The fact that she should take the bet doesn't imply that heads is less likely.  Beauty just knows that she'll win the bet twice if tails landed.  We double count for tails.

Imagine I said "if you guess heads and you're wrong nothing will happen, but if you guess tails and you're wrong I'll punch you in the stomach."  In that case, you will probably guess heads.  That doesn't mean your credence for heads is 1 -- it just means I added a greater penalty to the other option.

Consider changing the problem to something more extreme.  Here, we start with heads having probability 0.99 and tails having probability 0.01.  If heads comes up we wake Beauty up once.  If tails, we wake her up 100 times.  Thirder logic would go like this:  if we repeated the experiment 1000 times, we'd expect her to be woken up 990 times after heads on Monday, 10 times after tails on Monday (day 1), 10 times after tails on Tuesday (day 2), ..., 10 times after tails on day 100.  In other words, in ~50% of the cases heads would precede her awakening. So the right answer for her to give is 1/2.
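The awakening count in this extreme variant is easy to reproduce. A minimal sketch follows; the function name `heads_share_of_awakenings` and its parameters are hypothetical, and it computes expected counts exactly rather than simulating.

```python
from fractions import Fraction

def heads_share_of_awakenings(p_heads, wakes_if_heads, wakes_if_tails, n_experiments):
    """Expected fraction of all awakenings that follow heads (thirder-style count)."""
    heads_awakenings = p_heads * n_experiments * wakes_if_heads
    tails_awakenings = (1 - p_heads) * n_experiments * wakes_if_tails
    return heads_awakenings / (heads_awakenings + tails_awakenings)

# Extreme variant: P(heads) = 0.99, one awakening on heads, 100 on tails.
extreme_share = heads_share_of_awakenings(Fraction(99, 100), 1, 100, 1000)

# Original problem: fair coin, one awakening on heads, two on tails.
original_share = heads_share_of_awakenings(Fraction(1, 2), 1, 2, 1000)

print(extreme_share, float(extreme_share))  # 99/199, roughly 0.497
print(original_share)                       # 1/3
```

So 990 of the 1,990 expected awakenings follow heads, which is the "~50%" figure above.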

Of course, this would be absurd reasoning.  Beauty knows heads has a 99% chance initially.  But when she wakes up (which she was guaranteed to do regardless of whether heads or tails came up), she suddenly thinks they're equally likely?  What if we made it even more extreme and woke her up even more times on tails?

Implausible consequence of 1/2 solution?

Nick Bostrom presents the Extreme Sleeping Beauty problem:

This is like the original problem, except that here, if the coin falls tails, Beauty will be awakened on a million subsequent days. As before, she will be given an amnesia drug each time she is put to sleep that makes her forget any previous awakenings. When she awakes on Monday, what should be her credence in HEADS?

He argues:

The adherent of the 1/2 view will maintain that Beauty, upon awakening, should retain her credence of 1/2 in HEADS, but also that, upon being informed that it is Monday, she should become extremely confident in HEADS:
P+(HEADS) = 1,000,001/1,000,002

This consequence is itself quite implausible. It is, after all, rather gutsy to have credence 0.999999 in the proposition that an unobserved fair coin will fall heads.

It's correct that, upon awakening on Monday (and not knowing it's Monday), she should retain her credence of 1/2 in heads.

However, if she is informed it's Monday, it's unclear what she should conclude.  Why was she informed it was Monday?  Consider two alternatives.

Disclosure process 1:  regardless of the result of the coin toss, she will be informed it's Monday on Monday with probability 1.

Under disclosure process 1, her credence of heads on Monday is still 1/2.

Disclosure process 2: if heads she'll be woken up and informed that it's Monday.  If tails, she'll be woken up on Monday and one million subsequent days, and only be told the specific day on one randomly selected day.

Under disclosure process 2, if she's informed it's Monday, her credence of heads is 1,000,001/1,000,002.  However, this is not implausible at all.  It's correct.  This statement is misleading: "It is, after all, rather gutsy to have credence 0.999999 in the proposition that an unobserved fair coin will fall heads."  Beauty isn't predicting what will happen on the flip of a coin; she's inferring what did happen after receiving strong evidence that it's heads.
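The two disclosure processes can be compared with the same Bayes' rule calculation. This is a sketch under my reading of the two processes: under process 1 the message "it's Monday" is taken to have probability 1 under either coin outcome, and under process 2 it has probability 1 under heads and 1/1,000,001 under tails (one randomly chosen day out of Monday plus one million subsequent days). The function name `posterior_heads` is hypothetical.

```python
from fractions import Fraction

def posterior_heads(prior_heads, likelihood_heads, likelihood_tails):
    """Bayes' rule: P(heads | told it's Monday)."""
    numerator = prior_heads * likelihood_heads
    denominator = numerator + (1 - prior_heads) * likelihood_tails
    return numerator / denominator

prior = Fraction(1, 2)
days_if_tails = 1_000_001  # Monday plus one million subsequent days

# Process 1 (assumed likelihoods): always told on Monday, whatever the coin did.
process_1 = posterior_heads(prior, Fraction(1), Fraction(1))

# Process 2 (assumed likelihoods): under tails, told the day on one random day only.
process_2 = posterior_heads(prior, Fraction(1), Fraction(1, days_if_tails))

print(process_1)  # 1/2
print(process_2)  # 1000001/1000002
```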

ETA (5/9/2010 5:38AM)

If we want to replicate the situation 1000 times, we shouldn't end up with 1500 observations.  The correct way to replicate the awakening decision is to use the probability tree I included above. You'd end up with expected cell counts of 500, 250, 250, instead of 500, 500, 500.

Suppose at each awakening, we offer Beauty the following wager:  she'd lose $1.50 if heads but win $1 if tails.  She is asked for a decision on that wager at every awakening, but we only accept her last decision. Thus, if tails we'll accept her Tuesday decision (but won't tell her it's Tuesday). If her credence of heads is 1/3 at each awakening, then she should take the bet. If her credence of heads is 1/2 at each awakening, she shouldn't take the bet.  If we repeat the experiment many times, she'd be expected to lose money if she accepts the bet every time.
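Here is a short sketch of the numbers behind this wager. The constant and function names are hypothetical; the payoffs ($1 if tails, $1.50 lost if heads) and the rule that only the last decision is accepted come from the paragraph above.

```python
from fractions import Fraction

WIN_IF_TAILS = Fraction(1)       # dollars won if the coin landed tails
LOSS_IF_HEADS = Fraction(3, 2)   # dollars lost if the coin landed heads

def value_at_awakening(credence_heads):
    """How the bet looks to Beauty at one awakening, given her credence in heads."""
    return WIN_IF_TAILS * (1 - credence_heads) - LOSS_IF_HEADS * credence_heads

def value_per_experiment(p_heads=Fraction(1, 2)):
    """Only the last decision is accepted, so the bet settles once per experiment."""
    return p_heads * (-LOSS_IF_HEADS) + (1 - p_heads) * WIN_IF_TAILS

value_third = value_at_awakening(Fraction(1, 3))  # looks favorable
value_half = value_at_awakening(Fraction(1, 2))   # looks unfavorable
long_run = value_per_experiment()                 # average result of always accepting

print(value_third, value_half, long_run)  # 1/6 -1/4 -1/4
```

With credence 1/3 the bet looks worth 1/6 of a dollar, yet always accepting loses a quarter per experiment on average.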

The problem with the logic that leads to the 1/3 solution is it counts twice under tails, but the question was about her credence at an awakening (interview).

ETA (5/10/2010 10:18PM ET)


From Wikipedia again:

Suppose this experiment were repeated 1,000 times. We would expect to get 500 heads and 500 tails. So Beauty would be awoken 500 times after heads on Monday, 500 times after tails on Monday, and 500 times after tails on Tuesday. In other words, only in a third of the cases would heads precede her awakening. So the right answer for her to give is 1/3.

Another way to look at it:  the denominator is not a sum of mutually exclusive events.  Typically we use counts to estimate probabilities as follows:  the numerator is the number of times the event of interest occurred, and the denominator is the number of times that event could have occurred. 

For example, suppose Y can take values 1, 2 or 3 and follows a multinomial distribution with probabilities p1, p2 and p3=1-p1-p2, respectively.   If we generate n values of Y, we could estimate p1 by taking the ratio of #{Y=1}/(#{Y=1}+#{Y=2}+#{Y=3}). As n goes to infinity, the ratio will converge to p1.   Notice the events in the denominator are mutually exclusive and exhaustive.  The denominator is determined by n.
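The multinomial example can be simulated directly. In this sketch the probabilities (0.2, 0.3, 0.5), the seed, and the function name `estimate_p1` are all illustrative assumptions; the point is only that the denominator always equals n.

```python
import random
from collections import Counter

def estimate_p1(n, probs=(0.2, 0.3, 0.5), seed=0):
    """Draw n values of Y and estimate p1 as #{Y=1} / (#{Y=1} + #{Y=2} + #{Y=3})."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    draws = rng.choices([1, 2, 3], weights=probs, k=n)
    counts = Counter(draws)
    # The three events are mutually exclusive and exhaustive, so this sums to n.
    denominator = counts[1] + counts[2] + counts[3]
    return counts[1] / denominator, denominator

estimate, denominator = estimate_p1(100_000)
print(denominator)         # always equal to n
print(round(estimate, 3))  # close to the assumed p1 of 0.2
```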

The thirder solution to the Sleeping Beauty problem has as its denominator sums of events that are not mutually exclusive.  The denominator is not determined by n.  For example, if we repeat it 1000 times, and we get 400 heads, our denominator would be 400+600+600=1600 (even though it was not possible to get 1600 heads!).  If we instead got 550 heads, our denominator would be 550+450+450=1450.  Our denominator is outcome dependent, where here the outcome is the occurrence of heads.  What does this ratio converge to as n goes to infinity?  I surely don't know.  But I do know it's not the posterior probability of heads.
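The two denominators quoted above follow from counting one awakening per heads and two per tails. A tiny sketch, with the hypothetical helper name `thirder_denominator`:

```python
def thirder_denominator(n_heads, n_experiments):
    """Total awakenings: one per heads run, two per tails run."""
    n_tails = n_experiments - n_heads
    return n_heads + 2 * n_tails

for n_heads in (400, 550):
    denominator = thirder_denominator(n_heads, 1000)
    print(n_heads, denominator, round(n_heads / denominator, 3))
# 400 heads -> denominator 1600; 550 heads -> denominator 1450
```

The denominator moves with the number of heads, unlike the fixed n in the multinomial example.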

336 comments

This is one of those cases where we need to disentangle the dispute over definitions (1), forget about the notion of subjective anticipation (2), list the well-defined questions and ask which we mean.

If by the probability we mean the fraction of waking moments, the answer is 1/3.

If by the probability we mean the fraction of branches, the answer is 1/2.

  1. http://lesswrong.com/lw/np/disputing_definitions/

  2. http://lesswrong.com/lw/208/the_iless_eye/

It's hard to make a sensible notion of probability out of "fraction of waking moments". Two subsequent states of a given dynamical system make for poor distinct elements of a sample space: when we've observed that the first moment of a given dynamical trajectory is not the second, what are we going to do when we encounter the second one? It's already ruled "impossible"! Thus, Monday and Tuesday under the same circumstances shouldn't be modeled as two different elements of a sample space.

As Wei Dai and Roko have observed, that depends on why you're asking in the first place. Probability estimates should pay rent in correct decisions. If you're making a bet that will pay off once at the end of the experiment, you should count the fraction of branches. If you're making a bet that will pay off once per wake-up call, you should count the fraction of wake-up calls.

That's the wrong way to look at it. A certain bet may be the "correct" action to perform, or even a certain ritual of cognition may pay its rent, but it won't be about the concept of probability. Circumstances may make it preferable to do or say anything, but that won't influence the meaning of fixed concepts. You can't argue that 2+2 is in fact 5 on the grounds that saying that saves puppies. You may say that 2+2 is 5, or think that "probability of Tuesday" is 1/3 or 1/4 in order to win, but that won't make it so, it will merely make you win.

Subjective probability is not a well-defined concept in the general case. Fractions are well-defined, but only after you've decided where you are getting the numerator and denominator from.

That fractions are well-defined doesn't make them probabilities.

Not all of the waking moments have the same probability of occurring. If you estimate the probability of heads by the proportion of waking moments that were preceded by heads, you'd be throwing out information. Again, on a random waking moment, Monday preceded by heads is more likely than Monday preceded by tails.

On a random waking moment, Monday preceded by heads is equally likely as Monday preceded by tails.

I think you're thinking of a similar problem that we discussed last year, which involves a forgetful driver who is driving past 1 to n intersections, and needs to turn left at at least one of them. That problem is different, because it's asking about the probability of turning left at least once over the course of his drive.

The absent-minded driver is essentially the same problem, but it's easier to analyze because explicit payoff specification prompts you to estimate expected value of possible strategies. In estimating those strategies, we use the same probability model that would say "1/2" in the Beauty problem.

Add a payoff and the answer becomes clear, and it also becomes clear that the answer depends entirely on how the payoff works.

Without a payoff, this is a semantics problem revolving around the ill-defined concept of expectation and will continue to circle it endlessly.

The problem posed is, p(heads | Sleeping Beauty is awake). There is no payoff involved. Introducing a payoff only confuses matters. For instance, Roko wrote:

But if we specify that the money will be put into an account (and she will only be paid one winning) that she can spend after the experiment is over, which is next week, then she will find that 1/2 is the "right" answer.

This is true; but that would be the answer to "What is the probability that the coin was heads, given that Sleeping Beauty was woken up at least once after being put to sleep?" That isn't the problem posed. If that were the problem posed, we could eliminate her forgetfulness from the problem statement.

If you agree that the forgetfulness is necessary to the story, then 1/2 is the wrong answer, and 1/3 is the right answer. If you don't agree it's necessary, then its presence suggests that the speaker intended a different semantics than you're using to interpret it.

ADDED: This is depressing. Here we have a collection of people who have studied probability problems and anthropic reasoning and all the relevant issues for years. And we have a question that is, on the scale of questions in the project of preparing for AGI, a small, simple one. It isn't a tricky semantic or philosophical issue; it actually has an answer. And the LW community is doing worse than random at it.

In fact, this isn't the first time. My brief survey of recent posts indicates that the LessWrong community's track record when tackling controversial problems that actually have an answer is random at best.

There is no payoff involved. Introducing a payoff only confuses matters.

I define subjective probability in terms of what wagers I would be willing to make. I think a good rule of thumb is that if you can't figure out how to turn the problem into a wager you don't know what you're asking. And, in fact, when we introduce payoffs to this problem it becomes extremely clear why we get two answers. The debate then becomes a definition debate over what wager we mean by the sentence "what credence should the patient assign..."

As I just explained, the fact that the original author of the story wrote amnesia into it tells you which definition the author of the story was using.

And that's a good argument you've got there, but I don't think that is totally obvious on the first read of the problem. It's a weird feature of a probability problem for the relevant wager to be offered once under some circumstances and twice under others. So people get confused. It is a little tricky. But, far from confusing things, that entire issue can be avoided if we specify exactly how the payoff works when we state the problem! So I don't know why you're freaking out about Less Wrong's ability to answer these problems when it seems pretty clear that people interpret the question differently, not that they can't think through the issues.

(Not my downvote, btw)

Re: "Introducing a payoff only confuses matters."

Personally, I think it clarifies things - though at the expense of introducing complication. People disagree over which bet the problem represents. Describing those bets highlights this area of difference.

I see what you mean. But some comments have said, "I can set up a payoff scheme that gives this answer; therefore, this is an equally-valid answer." The correct response is to state the payoff scheme that gives your answer, and then admit your answer is not addressing the problem if you can't find justification for that payoff scheme in the problem statement.

Indeed - that would be bad - and confusing.

It is both bad and confusing that people are defending the idea that this problem is not clearly-stated enough to answer.

I suspect this happens because people don't like criticising the views of others. They would rather just say 'you are both right' - since then no egos get bruised, and a costly fight is avoided. So, nonsense goes uncriticised, and the innocent come to believe it - because nobody has the guts to knock it down.

"ADDED: This is depressing. Here we have a collection of people who have studied probability problems and anthropic reasoning and all the relevant issues for years. And we have a question that is, on the scale of questions in the project of preparing for AGI, a small, simple one. It isn't a tricky semantic or philosophical issue; it actually has an answer. And the LW community is doing worse than random at it."

That's why I posted this to begin with. It is interesting that we can't come to an agreement on the solution to this problem, even though it involves very straightforward probability. Heck, I got heavily downvoted after making statements that were correct. People are getting thrown off by doing the wrong kind of frequency counting.

--

However, I should note that the event 'sleeping beauty is awake' is equivalent to 'sleeping beauty has been woken up at least once' because of the amnesia. The forgetfulness aspect of the problem is why the solution is 1/2.