Bayes, Backwards

ec429

[Epistemic status: QED.]

Recently someone posed an oddly-constructed exercise on Bayes' Theorem, where instead of the usual given information $P (A | B), P (A | \neg B), P (B)$ they gave $P (A | B), P (A | \neg B), P (A)$ . I won't link to the source, because they intended to pose a problem in the usual form and I don't want to draw attention to their mistake. But the problem itself is an interesting one, and it does have a solution (in most cases) which turns out to be quite neat.

Problem Statement

Given $P (A | B) = x, P (A | \neg B) = y, P (A) = a$ , find $P (B | A), P (B | \neg A)$ .

Why is this interesting?

(Putting this section here to separate the problem statement from my solution, since I don't know of a way to add spoiler tags in Markdown.)

Suppose you want to evaluate the performance of a heuristic, $B$ , for determining the underlying $A$ , which is accurately determined later. You know the base rate of $A$ since it is (say) recorded in public statistics, and you know how common $B$ -errors in either direction are, because the same statistics report $P (A | B)$ and $P (A | \neg B)$ for their obvious use in decisionmaking.

But you are interested not just in the general case, but in some subcategory $C$ . In the absence of further information your default prior hypothesis is that $P (B | A \land C) = P (B | A)$ and similarly $P (B | \neg A \land C) = P (B | \neg A)$ (i.e., the heuristic's reliability is independent of $C$ ), but you know that within $C$ the base rate of $A$ , $P (A | C)$ , is some value different from $P (A)$ .

Now under that hypothesis, knowing $P (B | A), P (B | \neg A)$ would allow you to determine your prior for $P (A | B \land C)$ and $P (A | \neg B \land C)$ by an ordinary application of Bayes' Theorem. If you then have a small sample drawn from $C$ , you could for instance denote those priors as $p$ and $q$ and take as your prior distributions that $P (A | B \land C) \sim β (p, 1 - p)$ and $P (A | \neg B \land C) \sim β (q, 1 - q)$ , which you then update from your sample. (I don't know whether this places the 'right' amount of credence in $P (B | A \land C) = P (B | A)$ etc.; one could just as easily take $β (2 p, 2 (1 - p))$ to treat the $\neg C$ results as more 'relevant'.)
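As a rough sketch of that procedure: the function names and all the numbers below are made up for illustration, and $P (B | A), P (B | \neg A)$ are simply taken as inputs. The first function is the ordinary Bayes' Theorem step within $C$; the second is the standard conjugate update of a Beta prior on binomial data.

```python
def priors_within_c(p_b_given_a, p_b_given_not_a, base_rate_c):
    """Return (p, q) = (P(A | B, C), P(A | not-B, C)) via Bayes' Theorem,
    assuming the heuristic's reliability is independent of C."""
    # P(B | C) by the law of total probability.
    p_b = p_b_given_a * base_rate_c + p_b_given_not_a * (1 - base_rate_c)
    p = p_b_given_a * base_rate_c / p_b
    q = (1 - p_b_given_a) * base_rate_c / (1 - p_b)
    return p, q


def beta_update(alpha, beta, hits, misses):
    """Conjugate update of a Beta(alpha, beta) prior after observing
    `hits` cases with A and `misses` cases without A."""
    return alpha + hits, beta + misses


# Made-up numbers: P(B|A) = 0.8, P(B|not-A) = 0.1, P(A|C) = 0.3.
p, q = priors_within_c(0.8, 0.1, 0.3)

# Prior Beta(p, 1 - p) on P(A | B, C), updated on a hypothetical sample
# of 10 B-cases from C, 7 of which turned out to have A.
alpha, beta = beta_update(p, 1 - p, hits=7, misses=3)
posterior_mean = alpha / (alpha + beta)
```

Because the prior's two parameters sum to 1, it carries the weight of a single pseudo-observation, so even a small sample dominates the posterior mean.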

But we don't have $P (B | A)$ , we just have $P (A | B)$ . Hence the problem-as-stated.

Solution

We first expand $P (A | B)$ and $P (\neg A | B)$ using the definition of conditional probability:

$$P (A | B) = \frac{P (A \land B)}{P (B)}$$

$$P (\neg A | B) = \frac{P (\neg A \land B)}{P (B)}$$

Now we can re-arrange these into two expressions for $P (B)$ , and set them equal:

$$\frac{P (A \land B)}{P (A | B)} = P (B) = \frac{P (\neg A \land B)}{P (\neg A | B)}$$

Cross-multiplying and substituting in our given $x = P (A | B)$ ,

$$(1 - x) P (A \land B) = x P (\neg A \land B)$$

$$x P (\neg A \land B) + (x - 1) P (A \land B) = 0$$

We can do the same thing with $y = P (A | \neg B)$ to obtain:

$$y P (\neg A \land \neg B) + (y - 1) P (A \land \neg B) = 0$$

We then use the exhaustivity of $\{B, \neg B\}$ to expand $P (A)$ and $P (\neg A)$ :

$$P (A \land B) + P (A \land \neg B) = P (A) = a$$

$$P (\neg A \land B) + P (\neg A \land \neg B) = P (\neg A) = 1 - a$$

At this point we have four simultaneous equations in the four unknowns $P (A \land B), P (A \land \neg B), P (\neg A \land B), P (\neg A \land \neg B)$ , which we can solve by matrix inversion, Gaussian elimination, or some similar method. Our matrix is

$$M = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ x - 1 & 0 & x & 0 \\ 0 & y - 1 & 0 & y \end{bmatrix}$$

and our linear system is

$$v = \begin{pmatrix} P (A \land B) \\ P (A \land \neg B) \\ P (\neg A \land B) \\ P (\neg A \land \neg B) \end{pmatrix}$$

$$M v = \begin{pmatrix} a \\ 1 - a \\ 0 \\ 0 \end{pmatrix}$$

The determinant is $\det (M) = x - y$ , so our condition for a solution is that $x \neq y$ . This makes sense: if $x = y$ then the heuristic $B$ tells us nothing about $A$ , so there is no way to link our givens (all probabilities of $A$ ) to joint probabilities involving $B$ . So let us assume that $x \neq y$ . In that case the solution is

$$v = \frac{1}{y - x} \begin{pmatrix} x (y - a) \\ y (a - x) \\ (1 - x) (y - a) \\ (1 - y) (a - x) \end{pmatrix}$$

as can be verified by multiplying out $M v$ .
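For anyone who would rather not multiply it out by hand, here is a small sketch of that check in exact rational arithmetic. The helper names and the particular values of $x, y, a$ are arbitrary choices of mine, not part of the problem.

```python
from fractions import Fraction as F


def solve_joint(x, y, a):
    """Closed-form joint probabilities, in the same order as the vector v."""
    d = y - x
    return [x * (y - a) / d,
            y * (a - x) / d,
            (1 - x) * (y - a) / d,
            (1 - y) * (a - x) / d]


def apply_m(x, y, v):
    """Multiply the matrix M by a 4-vector v."""
    m = [[1, 1, 0, 0],
         [0, 0, 1, 1],
         [x - 1, 0, x, 0],
         [0, y - 1, 0, y]]
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


# Arbitrary illustrative givens: P(A|B) = 9/10, P(A|not-B) = 1/5, P(A) = 2/5.
x, y, a = F(9, 10), F(1, 5), F(2, 5)
v = solve_joint(x, y, a)
rhs = apply_m(x, y, v)  # should equal (a, 1 - a, 0, 0)
```

With these values `v` comes out as $(9/35, 1/7, 1/35, 4/7)$ , which sums to 1, and `rhs` is $(2/5, 3/5, 0, 0)$ as required.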

From here the rest is trivial:

$$P (B | A) = \frac{x (y - a)}{x (y - a) + y (a - x)} = \frac{x (y - a)}{a (y - x)}$$

$$P (B | \neg A) = \frac{(1 - x) (y - a)}{(1 - x) (y - a) + (1 - y) (a - x)} = \frac{(1 - x) (y - a)}{(1 - a) (y - x)}$$

And we're done.
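The two closed forms translate directly into code. This is a minimal sketch with a hypothetical function name; it assumes $x \neq y$ and $0 < a < 1$ , and uses the same arbitrary example values $x = 9/10, y = 1/5, a = 2/5$ .

```python
from fractions import Fraction as F


def invert_heuristic(x, y, a):
    """Return (P(B | A), P(B | not-A)) from x = P(A | B), y = P(A | not-B)
    and a = P(A). Requires x != y and 0 < a < 1."""
    if x == y:
        raise ValueError("x == y: B carries no information about A")
    if not 0 < a < 1:
        raise ValueError("a must lie strictly between 0 and 1")
    p_b_given_a = x * (y - a) / (a * (y - x))
    p_b_given_not_a = (1 - x) * (y - a) / ((1 - a) * (y - x))
    return p_b_given_a, p_b_given_not_a


# Arbitrary illustrative givens, as exact fractions.
result = invert_heuristic(F(9, 10), F(1, 5), F(2, 5))
```

Here `result` is $(9/14, 1/21)$ : in this made-up example the heuristic fires on about 64% of the $A$ cases and about 5% of the $\neg A$ cases.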

Analysis

To see what this result means, let's write it out with our givens substituted back in:

$$P (B | A) = \frac{P (A | B)}{P (A)} \cdot \frac{P (A | \neg B) - P (A)}{P (A | \neg B) - P (A | B)}$$

which holds as long as $P (A | \neg B) \neq P (A | B)$ .

We could also have determined this by summing the first and third rows of $v$ , $\frac{1}{y - x} (x + 1 - x) (y - a) = \frac{y - a}{y - x}$ , to find that

$$P (B) = \frac{P (A | \neg B) - P (A)}{P (A | \neg B) - P (A | B)}$$

under the same condition, and then substituting this into Bayes' Theorem.
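A short sketch of this second route, again with hypothetical names and the same arbitrary example values. The consistency check is an observation of mine rather than part of the derivation: since $a = x P (B) + y (1 - P (B))$ by the law of total probability, the recovered $P (B)$ is a valid probability only when $a$ lies between $x$ and $y$ .

```python
from fractions import Fraction as F


def p_b(x, y, a):
    """P(B) recovered from x = P(A | B), y = P(A | not-B), a = P(A)."""
    return (y - a) / (y - x)


def p_b_given_a(x, y, a):
    """Ordinary Bayes' Theorem, using the recovered P(B) as the prior."""
    return x * p_b(x, y, a) / a


def givens_are_consistent(x, y, a):
    """True when a lies between x and y, so that 0 <= P(B) <= 1."""
    return min(x, y) <= a <= max(x, y)


x, y, a = F(9, 10), F(1, 5), F(2, 5)   # arbitrary illustrative givens
prior_b = p_b(x, y, a)                 # 2/7
posterior_b = p_b_given_a(x, y, a)     # 9/14
```

If the givens fail the check, for instance $a$ larger than both $x$ and $y$ , the formula still returns a number, but it falls outside $[0, 1]$ and no joint distribution matches the inputs.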

Now let's consider $b = P (A | B) - P (A), \bar{b} = P (A | \neg B) - P (A)$ as, in some sense, measures of the "excess $A$ " under $B$ and $\neg B$ respectively. This now tells us that

$$P (B) = \frac{\bar{b}}{\bar{b} - b}$$

which I think is a good place to stop.