Bayes' Law is About Multiple Hypothesis Testing

Now we've got it: we see the need to enumerate every hypothesis we can in order to test even one hypothesis properly.

A cached handle I have for this is "the negation of a hypothesis is not a hypothesis"; said another way, "the negation of a model is not a model." Insofar as a hypothesis / model is a thing that makes predictions, "not (a thing that makes predictions)" isn't a thing that makes predictions. E.g. "person X just didn't understand the concept" is not a hypothesis about what's going on when person X gets a problem wrong on a test.

Dacyn (8y)

Typos: "MHH" -> "MMH", "lense" -> "lens"

You seem to be a little confused about hypothesis testing: the null hypothesis is the one that actually makes predictions, whereas the "alternative hypothesis" is the one that just states "the null hypothesis is false" (and so doesn't make predictions). Of course, the null hypothesis is often a strawman, while the actual most plausible hypotheses go unrepresented in the calculation (but are supposedly made more plausible by the failure of the null hypothesis).
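To make the asymmetry concrete, here is a minimal sketch in Python, assuming a coin-flip setup that is not part of the original discussion. The null hypothesis "the coin is fair" assigns a probability to every outcome, so a p-value can be computed from it. The bare alternative "the coin is not fair" assigns no probability to anything until a specific bias is chosen. The function names and the bias of 0.9 are illustrative assumptions.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k heads in n flips of a coin with heads-probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(k, n, p_null=0.5):
    """Exact two-sided p-value under the null.

    Sums the null probabilities of all outcomes that are no more likely
    than the observed one. Only the null hypothesis is needed for this.
    """
    observed = binomial_pmf(k, n, p_null)
    # Small relative tolerance so ties are not lost to rounding.
    return sum(
        binomial_pmf(i, n, p_null)
        for i in range(n + 1)
        if binomial_pmf(i, n, p_null) <= observed * (1 + 1e-9)
    )


# The null makes predictions, so the test can be run from it alone.
p_value = two_sided_p_value(9, 10)

# "Not fair" yields a likelihood only once a concrete bias is assumed
# (0.9 here is an arbitrary illustrative choice).
likelihood_under_specific_alternative = binomial_pmf(9, 10, 0.9)
```

Nothing in `two_sided_p_value` refers to an alternative at all, which is the point being made: the calculation is driven entirely by the hypothesis that makes predictions.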

I think the chart's ambiguity between $P(e \mid h)$ and $P(h \mid e)$ isn't too serious, since I think $P(e \mid h)$ lines up more with the intuitive concept of "compatibility" than $P(h \mid e)$ does.

Finally, to push back slightly on your main argument, sometimes the most important hypotheses are the ones that you can't state explicitly right now. In which case maybe you need some sort of "default hypothesis" to represent this possibility. Though such calculations are certainly something to be more skeptical of.

abramdemski (8y)

Ah... yeah, I forgot that the non-null hypothesis being tested isn't explicitly represented.

Finally, to push back slightly on your main argument, sometimes the most important hypotheses are the ones that you can't state explicitly right now. In which case maybe you need some sort of "default hypothesis" to represent this possibility. Though such calculations are certainly something to be more skeptical of.

I think I've seen a paper put forward that kind of approach (I don't remember enough to find it right now), but yeah, it is hard to see how a "default hypothesis" can be representative enough of all the neglected hypotheses.

Taking a logical-induction approach to the problem, we could say: it is possible to have a principled estimate of the probability which does not add up to the average probability assigned by all the hypotheses we can explicitly write down, because we can learn adjustment heuristics through experience (such as "probabilities estimated from the explicit hypotheses I can think of to write down tend to be overconfident by about x%").

avturchin (8y)

This is starting to look like AIXI, where you create all possible hypotheses and weigh them against one another. In AIXI, the simplest hypotheses in the Kolmogorov-complexity sense are regarded as the best alternatives.

I tried to implement something similar when I created my roadmaps, where I listed all the possible ways something could happen (mostly different x-risks). Typically, I limited myself to around 100 ideas, as I have an intuition that the first 100 hypotheses are enough. However, this intuition is not yet supported, and I would be interested in finding ways to estimate how many hypotheses should be listed before the correct one is in the list. If the number is very large, in the thousands, then listing hypotheses is not productive.

abramdemski (8y)

I would be interested in finding ways to estimate how many hypotheses should be listed before the correct one is in the list. If the number is very large, in the thousands, then listing hypotheses is not productive.

I think the better question is how many hypotheses should be listed before the value of information is too low to be worth continuing.

If you think (exactly) one of the hypotheses is correct, then the prior probability that you have already included the correct one in your list is exactly the sum of the prior probabilities of all hypotheses so far. The posterior probability that your list contains the correct hypothesis cannot be computed, though, since it requires knowledge of the prior probability of the observed evidence (which requires summing over all the hypotheses, including the ones you didn't list yet).

(If more than one hypothesis can be correct due to several hypotheses being equivalent, the probability is higher.)

If there is a chance reality is not any hypothesis you would ever list, then you could multiply the above calculation by the probability reality is one of the hypotheses you would ever list.
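The calculation above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a method from the discussion: the function names, the example numbers, and the parameter `p_listable` (the probability that reality is a hypothesis you would ever list) are all hypothetical. The exact posterior is still not computable, but if every unlisted hypothesis assigns the evidence a probability between 0 and 1, the posterior coverage can at least be bounded.

```python
def prior_coverage(priors, p_listable=1.0):
    """Prior probability that the correct hypothesis is already listed.

    `priors` are prior probabilities of the listed hypotheses, assuming
    exactly one hypothesis is correct. `p_listable` is an assumed
    probability that reality is a hypothesis you would ever list.
    """
    return p_listable * sum(priors)


def posterior_coverage_bounds(priors, likelihoods):
    """Bounds on the posterior probability that the list contains the truth.

    The exact value needs P(evidence), which depends on unlisted
    hypotheses. Assuming each unlisted hypothesis gives the evidence a
    probability in [0, 1], the unlisted contribution to P(evidence)
    lies between 0 and the unlisted prior mass.
    """
    listed_joint = sum(p * l for p, l in zip(priors, likelihoods))
    unlisted_prior = 1.0 - sum(priors)
    if listed_joint == 0.0:
        # The listed hypotheses all rule out the evidence.
        return 0.0, 0.0
    # Worst case: every unlisted hypothesis predicted the evidence with certainty.
    lower = listed_joint / (listed_joint + unlisted_prior)
    # Best case: every unlisted hypothesis ruled the evidence out.
    upper = 1.0
    return lower, upper


# Illustrative numbers only.
priors = [0.4, 0.2, 0.1]
likelihoods = [0.5, 0.1, 0.9]
coverage = prior_coverage(priors)
lower, upper = posterior_coverage_bounds(priors, likelihoods)
```

With these made-up numbers the prior coverage is 0.7, while the posterior coverage could be anywhere from roughly 0.51 up to 1, which shows how much depends on the hypotheses not yet listed.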

All this seems rather artificial, since it assumes the probabilities in the prior are meaningful, but it seems to me that if we're asking what the probability that we've already listed the correct hypothesis is, we don't want to trust the prior. But, what else can you do?

However, getting the correct hypothesis in your list is much less important than getting hypotheses which are good enough to help you make accurate decisions later. That's why I said value of information seems like the more relevant measurement. It seems like this can't be estimated without knowing anything about the hypotheses you haven't listed yet, though.
