Bayes' rule: Odds form

Edited by So8res, Eliezer Yudkowsky, et al. last updated 10th Jul 2016

One of the more convenient forms of Bayes' rule uses relative odds. Bayes' rule says that, when you observe a piece of evidence e, your posterior odds O(H∣e) for your hypothesis H given e are just your prior odds O(H) on H times the likelihood function Le(H).

For example, suppose we're trying to solve a mysterious murder, and we start out thinking the odds of Professor Plum vs. Miss Scarlet committing the murder are 1 : 2, that is, Scarlet is twice as likely as Plum to have committed the murder. We then observe that the victim was bludgeoned with a lead pipe. If we think that Plum, if he commits a murder, is around 60% likely to use a lead pipe, and that Scarlet, if she commits a murder, would be around 6% likely to use a lead pipe, this implies relative likelihoods of 10 : 1 for Plum vs. Scarlet using the pipe. The odds for Plum vs. Scarlet, after observing the victim to have been murdered by a pipe, are (1:2)×(10:1)=(10:2)=(5:1). We now think Plum is around five times as likely as Scarlet to have committed the murder.
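The arithmetic above can be checked with a short sketch, using exact fractions so no precision is lost:

```python
from fractions import Fraction

# Prior odds for Plum vs. Scarlet: 1 : 2.
prior_odds = [Fraction(1), Fraction(2)]
# P(lead pipe | suspect committed the murder): 60% for Plum, 6% for Scarlet.
likelihoods = [Fraction(60, 100), Fraction(6, 100)]

# Multiplying elementwise gives the posterior odds.
posterior = [o * l for o, l in zip(prior_odds, likelihoods)]
# Divide through by the smallest entry to put the odds in simplest form.
smallest = min(posterior)
posterior = [p / smallest for p in posterior]
print(posterior)  # [Fraction(5, 1), Fraction(1, 1)] — i.e. 5 : 1 for Plum
```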

Odds functions

Let H denote a vector of hypotheses. An odds function O is a function that maps H to a set of odds. For example, if H=(H1,H2,H3), then O(H) might be (6:2:1), which says that H1 is 3x as likely as H2 and 6x as likely as H3. An odds function captures our relative probabilities between the hypotheses in H; for example, (6 : 2 : 1) odds are the same as (18 : 6 : 3) odds. We don't need to know the absolute probabilities of the Hi in order to know the relative odds. All we require is that the relative odds are proportional to the absolute probabilities: O(H)∝P(H).
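Since odds are only defined up to a positive scaling factor, integer odds can always be put in a canonical lowest-terms form by dividing through by their greatest common divisor. A small sketch (the helper name `reduce_odds` is ours, not from the text):

```python
from functools import reduce
from math import gcd

def reduce_odds(odds):
    """Reduce integer odds to lowest terms, e.g. (18, 6, 3) -> (6, 2, 1)."""
    g = reduce(gcd, odds)
    return tuple(o // g for o in odds)

# (18 : 6 : 3) and (6 : 2 : 1) are the same relative odds.
print(reduce_odds((18, 6, 3)))  # (6, 2, 1)
```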

In the example with the death of Mr. Boddy, suppose H1 denotes the proposition "Reverend Green murdered Mr. Boddy", H2 denotes "Mrs. White did it", and H3 denotes "Colonel Mustard did it". Let H be the vector (H1,H2,H3). If these propositions respectively have probabilities of 80%, 8%, and 4% (the remaining 8% being reserved for other hypotheses), then O(H)=(80:8:4)=(20:2:1) represents our relative credences about the murder suspects — that Reverend Green is 10 times as likely to be the murderer as Mrs. White, who is twice as likely to be the murderer as Colonel Mustard.

Likelihood functions

Suppose we discover that the victim was murdered by wrench. Suppose we think that Reverend Green, Mrs. White, and Colonel Mustard, if they murdered someone, would respectively be 60%, 90%, and 30% likely to use a wrench. Letting ew denote the observation "The victim was murdered by wrench," we would have P(ew∣H)=(0.6,0.9,0.3). This gives us a likelihood function defined as Lew(H)=P(ew∣H).
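As a sketch, the likelihood function for the wrench observation is just the vector of conditional probabilities P(ew∣Hi), here again as exact fractions:

```python
from fractions import Fraction

# P(e_w | H_i) for Green, White, and Mustard respectively.
likelihoods = [Fraction(6, 10), Fraction(9, 10), Fraction(3, 10)]

# Like odds, likelihoods only matter up to scale: dividing through by the
# smallest value shows the same function in relative form.
smallest = min(likelihoods)
print([l / smallest for l in likelihoods])  # [Fraction(2, 1), Fraction(3, 1), Fraction(1, 1)]
```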

Bayes' rule, odds form

Let O(H∣e) denote the odds of the hypotheses H after observing evidence e. The odds form of Bayes' rule then states:

O(H)×Le(H)=O(H∣e)

This says that we can multiply the relative prior credence O(H) by the likelihood Le(H) to arrive at the relative posterior credence O(H∣e). Because odds are invariant under multiplication by a positive constant, scaling the likelihood function up or down by a constant makes no difference: it would only multiply the final odds by that same constant. Thus, only the relative likelihoods are necessary to perform the calculation; the absolute likelihoods are unnecessary. Therefore, when performing the calculation, we can simplify Le(H)=(0.6,0.9,0.3) to the relative likelihoods (2:3:1).

In our example, this makes the calculation quite easy. The prior odds for Green vs White vs Mustard were (20:2:1). The relative likelihoods were (0.6:0.9:0.3) = (2:3:1). Thus, the relative posterior odds after observing ew = Mr. Boddy was killed by wrench are (20:2:1)×(2:3:1)=(40:6:1). Given the evidence, Reverend Green is 40 times as likely as Colonel Mustard to be the killer, and 20/3 times as likely as Mrs. White.
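The whole update is a single elementwise product:

```python
# Odds-form Bayes update for the wrench evidence.
prior_odds = (20, 2, 1)           # Green : White : Mustard
relative_likelihoods = (2, 3, 1)  # proportional to P(wrench | suspect) = (0.6, 0.9, 0.3)

posterior_odds = tuple(o * l for o, l in zip(prior_odds, relative_likelihoods))
print(posterior_odds)  # (40, 6, 1)
```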

Bayes' rule states that this relative proportioning of odds among these three suspects will be correct, regardless of how our remaining 8% probability mass is assigned to all other suspects and possibilities, or indeed, how much probability mass we assigned to other suspects to begin with. For a proof, see Proof of Bayes' rule.
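This invariance can be checked numerically (a spot check under our own parameterization, not a proof): vary how much probability mass the catch-all "other suspects" hypothesis holds, and the posterior odds among the three named suspects come out the same.

```python
def posterior_ratios(other_mass):
    # Prior probabilities for Green, White, Mustard in proportion 80 : 8 : 4,
    # rescaled so that, together with the catch-all mass, everything sums to 1.
    base = [0.80, 0.08, 0.04]
    scale = (1 - other_mass) / sum(base)
    priors = [p * scale for p in base]
    likelihoods = [0.6, 0.9, 0.3]  # P(wrench | suspect)
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Posterior odds relative to Mustard.
    return tuple(j / joint[-1] for j in joint)

print(posterior_ratios(0.08))  # ≈ (40.0, 6.0, 1.0)
print(posterior_ratios(0.50))  # same ratios, despite a very different catch-all mass
```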

Visualization

Waterfall diagrams, spotlight diagrams, and frequency diagrams may be helpful for explaining or visualizing the odds form of Bayes' rule.
