Consider the following statements:

1. The result of this coin flip is heads.

2. There is life on Mars.

3. The millionth digit of pi is odd.

What is the probability of each statement?

A frequentist might say, "P1 = 0.5. P2 is either epsilon or 1-epsilon, we don't know which. P3 is either 0 or 1, we don't know which."

A Bayesian might reply, "P1 = P2 = P3 = 0.5. By the way, there's no such thing as a probability of exactly 0 or 1."

Which is right? As with many such long-unresolved debates, the problem is that two different concepts are being labeled with the word 'probability'. Let's separate them and replace P with:

F = the fraction of possible worlds in which a statement is true. F can be exactly 0 or 1.

B = the Bayesian probability that a statement is true. B cannot be exactly 0 or 1.

Clearly there must be a relationship between the two concepts, or the confusion wouldn't have arisen in the first place, and there is: apart from both obeying  various laws of probability, in the case where we know F but don't know which world we are in, B = F. That's what's going on in case 1. In the other cases, we know F != 0.5, but our ignorance of its actual value makes it reasonable to assign B = 0.5.

When does the difference matter?

Suppose I offer to bet my $200 the millionth digit of pi is odd, versus your $100 that it's even. With B3 = 0.5, that looks like a good bet from your viewpoint. But you also know F3 = either 0 or 1. You can also infer that I wouldn't have offered that bet unless I knew F3 = 1, from which inference you are likely to update your B3 to more than 2/3, and decline.

On a larger scale, suppose we search Mars thoroughly enough to be confident there is no life there. Now we know F2 = epsilon. Our Bayesian estimate of the probability of life on Europa will also decline toward 0.

Once we understand F and B are different functions, there is no contradiction.

New to LessWrong?

New Comment
44 comments, sorted by Click to highlight new comments since: Today at 1:24 AM
[-]Rune14y100

Why does the Bayesian say that the probability of there being life on Mars is 0.5?

We don't. I'm not sure what's up with that, unless it was a deliberately bad example.

[-]Rune14y00

I can't imagine anyone assigning the event probability 0.5 just because it's a Yes/No question. Does the probability drop to 1/3 if I added 1 more option to the question?

The person who assigns probability 1/k to all outcomes of any question with k options is NOT a Bayesian. That's someone who has misunderstood Bayes rule and should re-read all of Eliezer's posts.

"So roughly speaking, what are the chances the world is going to be destroyed? One in a million, one in a billion?"

"Well, the best we can say is about a 1 in 2 chance."

http://www.thedailyshow.com/watch/thu-april-30-2009/large-hadron-collider (video, region blocked)

I seem to remember seeing the idea that "all possibilities equally likely" is sort of a "default prior". In the case of life on Mars: Imagine getting all your information about life and Mars in little dribs and drabs, each one of which lets you update your probability of life on Mars. The place you start from (before you know stuff like what DNA is and whether Mars has an atmosphere) is 0.5.

F = the fraction of possible worlds in which a statement is true. F can be exactly 0 or 1.

If "possible world" means "any imaginable world history which is consistent with our knowledge", F and B will probably collapse into one concept.

If it means "any world history which is consistent with logic, some specified set of physical laws and initial conditions given in one moment", and if we expect the laws to be deterministic, then F1 would also be 0 or 1 and not 0.5.

I would like to see your definition of "possible world".

The frequentist's probability you refer to is called "physical probability". See for example http://en.wikipedia.org/wiki/Probability_interpretations

Yes, that's exactly what I'm talking about. Thanks for the link.

[-]Cyan14y00

A frequentist might say, "P1 = 0.5. P2 is either epsilon or 1-epsilon, we don't know which. P3 is either 0 or 1, we don't know which."

I'm not sure why the frequentist would put an epsilon in P2. Surely there is a fact of the matter about statement 2 just as there is for statement 3.

I was assuming per Tegmark that we live in (at least one variety of) big world, and "I" denotes a set of entities indistinguishable with current information, but who live in different parts of the multiverse. But more prosaically, you could note that there is a nonzero albeit small probability that atoms on a lifeless Mars will arrange themselves into a life form between one visit and the next.

[-]Cyan14y10

I grant that a big world provides an ensemble such that epsilon could make sense. I think that the prosaic explanation fails, though -- a fully specified version of statement two refers either to an instant in time or an interval, and either way, in a small world there will be a fact of the matter.

Hmm, frequentist probability is most usually described in terms of, er, frequency; what fraction of the time we will get a given result when we run the test. But if you take it as referring to an instant of time (and you assume small world and no fuzziness) in that case I agree the epsilon would disappear.

[-]Cyan14y20

fraction of the time

It's a minor point, but wackily enough, the above quote is a subtle equivocation on the word "time". I can flip N exchangeable coins simultaneously and count the number of times I see "heads", and this is perfectly sensible in the frequentist interpretation. Physical clock time is something else again.

Sure, B and F don't directly contradict each other, but which one should we use when reasoning under uncertainty?

edit: better statement of what I was getting at

[-]tut14y10

I answered a different question than what this sits below, but I think that the answer is still both. B is probably the one that fits in the formulas, but you should also remember that the cases where B/=F are the cases where such formulas are least likely to serve you well.

Yep. To expand: correctly used, they should not contradict each other. If they give different answers, then at least one of them is being used in a way in which it is not applicable.

You're right, so far as it goes, but I don't think it gets you very far. My point is that this dodges what the debate is about. Proponents of F say that it should be used for doing inductive inference while proponents of B say they're wrong, B is what should be used. If you're not answering that question, you're not settling the debate.

Now if you're not trying to settle the debate, then we have no argument.

Well, the only debates I'm claiming to definitively settle are the philosophical ones about "what does probability really mean?", "are 0 and 1 really probabilities?" and suchlike, over which I've seen enough electrons spilled that I considered it well worth trying to put them to rest.

But in a typical inductive scenario, it seems to me that since we can't work directly with F, whereas we can work directly with B, Bayesian reasoning is the appropriate tool to use. Do you have any counterexamples in mind where the two approaches give different answers and the difference can't be resolved by noting that they aren't answering the same question?

Do you have any counterexamples in mind where the two approaches give different answers and the difference can't be resolved by noting that they aren't answering the same question?

Well, no, but but proponents of F may disagree.