Recently someone posed an oddly-constructed exercise on Bayes' Theorem, where instead of the usual given information P(A|B),P(A|¬B),P(B) they gave P(A|B),P(A|¬B),P(A). I won't link to the source, because they intended to pose a problem in the usual form and I don't want to draw attention to their mistake. But the problem itself is an interesting one, and it does have a solution (in most cases) which turns out to be quite neat.

## Problem Statement

Given P(A|B)=x,P(A|¬B)=y,P(A)=a, find P(B|A),P(B|¬A).

## Why is this interesting?

(Putting this section here to separate the problem statement from my solution, since I don't know of a way to spoiler in Markdown.)

Suppose you want to evaluate the performance of a heuristic, B, for determining the underlying A, which is accurately determined later. You know the base rate of A since it is (say) recorded in public statistics, and you know how common B-errors in either direction are, because the same statistics report P(A|B) and P(A|¬B) for their obvious use in decision-making.

But you are interested not just in the general case, but in some subcategory C. In the absence of further information your default prior hypothesis is that P(B|A∧C)=P(B|A) and similarly P(B|¬A∧C)=P(B|¬A) (i.e., the heuristic's reliability is independent of C), but you know that within C the base rate of A, P(A|C), is some value different from P(A).

Now under that hypothesis, knowing P(B|A),P(B|¬A) would allow you to determine your prior for P(A|B∧C) and P(A|¬B∧C) by an ordinary application of Bayes' Theorem. If you then have a small sample drawn from C, you could for instance denote those priors as p and q and take as your prior distributions that P(A|B∧C)∼β(p,1−p) and P(A|¬B∧C)∼β(q,1−q), which you then update from your sample. (I don't know whether this places the 'right' amount of credence in P(B|A∧C)=P(B|A) etc.; one could just as easily take β(2p,2(1−p)) to treat the ¬C results as more 'relevant'.)
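A minimal sketch of that updating scheme, in Python (the function name, prior value, and sample counts are mine, purely illustrative): Beta(p, 1−p) has mean p and effective sample size 1, so even a small sample from C moves the estimate substantially.

```python
# Hedged sketch of the Beta-prior update described above; the helper name
# and the numbers are illustrative assumptions, not from the post.
def updated_mean(p, successes, trials, weight=1.0):
    """Posterior mean of Beta(weight*p, weight*(1-p)) after observing
    `successes` A's in `trials` draws from C.

    weight=1.0 is the Beta(p, 1-p) prior from the text; weight=2.0
    reproduces the Beta(2p, 2(1-p)) variant that treats the non-C
    statistics as more 'relevant' (twice the prior weight)."""
    alpha = weight * p + successes
    beta = weight * (1 - p) + (trials - successes)
    return alpha / (alpha + beta)

# Prior p = 0.8 for P(A|B∧C); suppose A holds in 2 of 10 sampled C-cases:
print(updated_mean(0.8, 2, 10))   # ≈ 0.2545, pulled toward the sample rate 0.2
```

With `weight=2.0` the same data gives a posterior mean of 0.3, showing how the heavier prior resists the sample slightly more.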

But we don't have P(B|A), we just have P(A|B). Hence the problem-as-stated.

## Solution

We first expand P(A|B) and P(¬A|B) using Bayes' Theorem:

$$P(A|B)=\frac{P(A\wedge B)}{P(B)},\qquad P(\neg A|B)=\frac{P(\neg A\wedge B)}{P(B)}$$

Now we can re-arrange these into two expressions for P(B), and set them equal:

$$\frac{P(A\wedge B)}{P(A|B)}=P(B)=\frac{P(\neg A\wedge B)}{P(\neg A|B)}$$

Cross-multiplying and substituting in our given x=P(A|B),

$$(1-x)P(A\wedge B)=xP(\neg A\wedge B)$$

$$xP(\neg A\wedge B)+(x-1)P(A\wedge B)=0$$

We can do the same thing with y=P(A|¬B) to obtain:

$$yP(\neg A\wedge\neg B)+(y-1)P(A\wedge\neg B)=0$$

We then use the exhaustivity of {B,¬B} to expand P(A) and P(¬A):

$$P(A\wedge B)+P(A\wedge\neg B)=P(A)=a$$

$$P(\neg A\wedge B)+P(\neg A\wedge\neg B)=P(\neg A)=1-a$$
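As a forward check of these four equations, here is a short Python sketch (the numeric values of P(B), x, and y are mine, purely illustrative): build a joint distribution from chosen values and confirm each equation holds.

```python
# Forward consistency check: construct the four joint probabilities from
# assumed values of P(B), x = P(A|B), y = P(A|not-B), then verify the
# four simultaneous equations derived above.  Numbers are illustrative.
pb, x, y = 0.3, 0.9, 0.2
p_ab   = x * pb              # P(A ∧ B)
p_nab  = (1 - x) * pb        # P(¬A ∧ B)
p_anb  = y * (1 - pb)        # P(A ∧ ¬B)
p_nanb = (1 - y) * (1 - pb)  # P(¬A ∧ ¬B)
a = p_ab + p_anb             # P(A), by exhaustivity of {B, ¬B}

assert abs(x * p_nab + (x - 1) * p_ab) < 1e-12    # x·P(¬A∧B) + (x−1)·P(A∧B) = 0
assert abs(y * p_nanb + (y - 1) * p_anb) < 1e-12  # y·P(¬A∧¬B) + (y−1)·P(A∧¬B) = 0
assert abs((p_ab + p_anb) - a) < 1e-12            # P(A) row
assert abs((p_nab + p_nanb) - (1 - a)) < 1e-12    # P(¬A) row
```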

At this point we have four simultaneous equations in the four unknowns P(A∧B),P(A∧¬B),P(¬A∧B),P(¬A∧¬B), which we can solve by matrix inversion, Gaussian elimination, or some similar method. Our matrix is

$$M=\begin{pmatrix}x-1 & 0 & x & 0\\ 1 & 1 & 0 & 0\\ 0 & y-1 & 0 & y\\ 0 & 0 & 1 & 1\end{pmatrix}$$

and our linear system is

$$M\begin{pmatrix}P(A\wedge B)\\ P(A\wedge\neg B)\\ P(\neg A\wedge B)\\ P(\neg A\wedge\neg B)\end{pmatrix}=\begin{pmatrix}0\\ a\\ 0\\ 1-a\end{pmatrix}$$

The determinant is det(M)=y−x, so our condition for a solution is that x≠y. This makes sense: if x=y then the heuristic B tells us nothing about A, so there is no way to link our givens (all probabilities of A) to the probabilities involving B that we seek. When x≠y, solving the system gives P(B)=(a−y)/(x−y), from which Bayes' Theorem yields

$$P(B|A)=\frac{x(a-y)}{a(x-y)},\qquad P(B|\neg A)=\frac{(1-x)(a-y)}{(1-a)(x-y)}$$
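The algebra can be sanity-checked with a short Python sketch (the numeric values are mine, purely illustrative). Rather than inverting M, it uses the closed form P(B)=(a−y)/(x−y), which follows from expanding P(A)=x·P(B)+y·(1−P(B)):

```python
# Recover P(B|A) and P(B|not-A) from x = P(A|B), y = P(A|not-B), a = P(A).
# Uses the closed form P(B) = (a - y) / (x - y) rather than solving the
# 4x4 system directly; the two are equivalent when x != y.
def solve(x, y, a):
    """Return (P(B|A), P(B|not-A)) given x = P(A|B), y = P(A|not-B), a = P(A)."""
    if x == y:
        raise ValueError("no solution when x == y: B is uninformative about A")
    p_b = (a - y) / (x - y)           # from a = x*P(B) + y*(1 - P(B))
    p_ab = x * p_b                    # P(A ∧ B)
    p_nab = (1 - x) * p_b             # P(¬A ∧ B)
    return p_ab / a, p_nab / (1 - a)  # Bayes: P(B|A), P(B|¬A)

# Illustrative values, then sanity-check against the defining relations:
x, y, a = 0.9, 0.2, 0.5
b_given_a, b_given_na = solve(x, y, a)
p_b = a * b_given_a + (1 - a) * b_given_na   # law of total probability
assert abs(p_b - (a - y) / (x - y)) < 1e-12  # recovers P(B)
assert abs(a * b_given_a / p_b - x) < 1e-12  # recovers P(A|B) = x
```

Running the check with x=0.9, y=0.2, a=0.5 gives P(B)=3/7, confirming that the recovered conditionals reproduce the givens.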

[Epistemic status: QED.]
