[Epistemic status: QED.]

Recently someone posed an oddly-constructed exercise on Bayes' Theorem, where instead of the usual given information they gave . I won't link to the source, because they intended to pose a problem in the usual form and I don't want to draw attention to their mistake. But the problem itself is an interesting one, and it does have a solution (in most cases) which turns out to be quite neat.

Problem Statement

Given , find .

Why is this interesting?

(I'm putting this section here to separate the problem statement from my solution, since I don't know of a way to add spoiler tags in Markdown.)

Suppose you want to evaluate the performance of a heuristic, , for determining the underlying , which is accurately determined later. You know the base rate of since it is (say) recorded in public statistics, and you know how common -errors in either direction are, because the same statistics report and for their obvious use in decisionmaking.

But you are interested not just in the general case, but in some subcategory . In the absence of further information your default prior hypothesis is that and similarly (i.e., the heuristic's reliability is independent of ), but you know that within the base rate of , , is some value different from .

Now under that hypothesis, knowing would allow you to determine your prior for and by an ordinary application of Bayes' Theorem. If you then have a small sample drawn from , you could for instance denote those priors as and and take as your prior distributions that and , which you then update from your sample. (I don't know whether this places the 'right' amount of credence in etc.; one could just as easily take to treat the results as more 'relevant'.)

But we don't have , we just have . Hence the problem-as-stated.

Solution

We first expand and using Bayes' Theorem:

Now we can re-arrange these into two expressions for , and set them equal:

Cross-multiplying and substituting in our given ,

We can do the same thing with to obtain:

We then use the exhaustivity of to expand and :

At this point we have four simultaneous equations in the four unknowns , which we can solve by matrix inversion, Gaussian elimination, or some similar method. Our matrix is

and our linear system is

The determinant is , so our condition for a solution is that . This makes sense: if then the heuristic tells us nothing about , so there is no way to link our givens (all probabilities of ) to joint probabilities involving
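As a numerical cross-check, here is a minimal sketch under one assumed reading of the problem: the givens are P(A), P(A|B) and P(A|¬B), and the targets are P(B|A) and P(B|¬A). The letters `A` and `B`, the function name `invert_conditionals`, and the example numbers are all illustrative assumptions. It takes a shortcut through the law of total probability rather than solving the four-equation linear system, but it fails in the same place: when P(A|B) = P(A|¬B).

```python
def invert_conditionals(p_a, p_a_given_b, p_a_given_not_b):
    """Recover (P(B), P(B|A), P(B|not A)) from P(A), P(A|B), P(A|not B).

    Assumed reading of the problem; all names here are hypothetical.
    """
    if not 0.0 < p_a < 1.0:
        raise ValueError("P(A) must lie strictly between 0 and 1")

    denom = p_a_given_b - p_a_given_not_b
    if denom == 0:
        # B tells us nothing about A, so P(B) cannot be pinned down.
        raise ValueError("P(A|B) equals P(A|not B): no unique solution")

    # Law of total probability:
    #   P(A) = P(A|B) * P(B) + P(A|not B) * (1 - P(B))
    # rearranged for P(B).
    p_b = (p_a - p_a_given_not_b) / denom
    if not 0.0 <= p_b <= 1.0:
        raise ValueError("inconsistent inputs: implied P(B) is outside [0, 1]")

    # Ordinary Bayes' Theorem, now that P(B) is known.
    p_b_given_a = p_a_given_b * p_b / p_a
    p_b_given_not_a = (1.0 - p_a_given_b) * p_b / (1.0 - p_a)
    return p_b, p_b_given_a, p_b_given_not_a


if __name__ == "__main__":
    # Example numbers built from P(B) = 0.3, P(A|B) = 0.8, P(A|not B) = 0.1,
    # which give P(A) = 0.8 * 0.3 + 0.1 * 0.7 = 0.31.
    print(invert_conditionals(0.31, 0.8, 0.1))
```

The range check on the implied P(B) is there because not every triple of inputs is consistent: P(A) has to lie between P(A|B) and P(A|¬B) for any valid joint distribution to exist.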