Error margins

by [anonymous], 21st Mar 2015

Why don't probabilities come with error margins, or other means of describing uncertainty in their assessments?

If I estimate a prior probability P(new glacial period starting within the next 100 years) as, say, 0.1, shouldn't I also communicate how certain I feel about that judgement?
A scientist might make the same estimate but be more sure about its accuracy than I am.

In our everyday judgements we often use such package deals:
A: Where's Jamie?
B: I think he went to the club house, but you know Jamie - he could be anywhere.

High P, high uncertainty.

A: Where's Susie? Do you think she ran astray after that hefty argument?
B: No, I'm certain she would *never* do that. She must have gone to a friend's place.

High P, low uncertainty.

Probabilities do not necessarily need confidence intervals. A probability is already an assessment of uncertainty.

The post "Qualitatively Confused" seems relevant. If you assign probability 0.8 that Jamie is at the club, it doesn't make sense to attach an error margin to this number. An error margin would mean something like "I think there is probability 0.8 that Jamie is at the club, but the real probability might be 0.2-0.9." But this is an error. Probability is map, not territory: there is no "real probability" that Jamie is at the club. He's either there or he's not, but you don't know which, and your degree of uncertainty is quantified as a probability.

To those responding with the (obvious) insight that a distribution of probabilities collapses into a probability:

This is true, but there's more to it. Consider, for example, the following two experiments:

  • Alice flips a coin to see if it comes up Heads or Tails.

  • Bob carves a rough wooden disk, colors both sides, and flips it to see if it comes up Red or Blue.

To a very good first approximation, Pr[Alice flips Heads]=1/2. For lack of better information, Pr[Bob flips Red]=1/2 as well. In many cases, you should treat these probabilities identically. For example, if choosing between "$2 if Alice flips Heads" and "$3 if Alice flips Tails", you should pick the second option; you should do the same for Bob.

There are two (related) ways in which we can meaningfully say we're more uncertain about Bob's flip than Alice's.

  1. Repeated trials. If Alice flips her coin 100 times, there's only about a 0.0016% chance that she gets Heads more than 70 times. If Bob flips his disk 100 times, the chance of seeing Red more than 70 times is much higher: 30/101, for instance, if we start with a uniform distribution for Pr[Bob flips Red]. Here, it's meaningful to say that there's a true (frequentist) probability of Pr[Bob flips Red], that repeated trials will converge to the probability, and that we're not at all certain that they will converge to 1/2.

  2. Updating. If you see Alice flip her coin 5 times and it comes up Heads every time, then you most likely say "Huh, that's odd" and continue to expect Pr[Alice flips Heads] to be about 1/2 for the next flip. If you see Bob flip his disk 5 times and it comes up Red every time, then you begin to suspect that it's fairly likely to continue coming up Red.
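The two figures in the repeated-trials item can be checked directly. The sketch below is only an illustration: the function names are made up here, Alice's coin is assumed exactly fair, and Bob's disk is given the uniform prior mentioned above, under which every count of Reds from 0 to n is equally likely.

```python
from fractions import Fraction
from math import comb


def fair_coin_tail(n: int, k: int) -> Fraction:
    """P(more than k Heads in n flips), assuming an exactly fair coin."""
    favourable = sum(comb(n, i) for i in range(k + 1, n + 1))
    return Fraction(favourable, 2 ** n)


def uniform_prior_tail(n: int, k: int) -> Fraction:
    """P(more than k Reds in n flips) when the bias p is uniform on [0, 1].

    Integrating the binomial over a uniform p makes each count 0..n
    equally likely, so the tail is just (n - k) outcomes out of (n + 1).
    """
    return Fraction(n - k, n + 1)


alice = fair_coin_tail(100, 70)
bob = uniform_prior_tail(100, 70)
print(f"Alice: {float(alice):.4%}")
print(f"Bob:   {bob} (about {float(bob):.1%})")
```

Both experiments have the same single-flip probability of 1/2, yet the tail probabilities differ by several orders of magnitude.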

The second of these is much more important: there are few things in life for which we can take repeated independent samples, but many things in life for which we expect to learn additional information. Unfortunately, the second of these is also much more complicated.

We can't, you see, just stick with a probability distribution on an underlying parameter p = Pr[Bob flips Red] and update this probability distribution with new information. That helps a lot, but it doesn't help with everything. For instance, if we already updated on "Chad saw the disk come up Red, like, ten times in a row" we won't update very much on "Dana was there and she saw it, too." The only complete description of our uncertainty is a list of all the evidence we have collected.

(Of course, if we expect to see a bunch of independent trials and update based on those, we can do that with Beta distributions and such easily. But, as I mentioned, that doesn't always happen.)
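For the independent-trials case, here is a minimal sketch of a Beta update applied to the five-flips example. The prior parameters are assumptions chosen for illustration, not anything stated above: Beta(1, 1) is the uniform prior for Bob's disk, and Beta(500, 500) is an arbitrary stand-in for "very confident Alice's coin is close to fair".

```python
from fractions import Fraction


def beta_update(alpha: int, beta: int, successes: int, failures: int):
    """Conjugate update: add observed counts to the Beta parameters."""
    return alpha + successes, beta + failures


def predictive(alpha: int, beta: int) -> Fraction:
    """Probability that the next flip is a success (the posterior mean)."""
    return Fraction(alpha, alpha + beta)


# Both start with a predictive probability of exactly 1/2.
bob_prior = (1, 1)        # uniform over the disk's bias (assumption)
alice_prior = (500, 500)  # tightly concentrated near 1/2 (assumption)

# Each is then seen to come up Red/Heads five times in a row.
bob_post = beta_update(*bob_prior, successes=5, failures=0)
alice_post = beta_update(*alice_prior, successes=5, failures=0)

print("Bob:  ", predictive(*bob_post))    # 6/7
print("Alice:", predictive(*alice_post))  # 101/201
```

The same evidence moves Bob's estimate to 6/7 while Alice's barely leaves 1/2, which is the sense in which the two 50% estimates differ.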

[-][anonymous]6y 0

I was thinking in the context of Bayes' theorem. The articles that describe how updating on evidence works, using Bayes' theorem, never seem to include confidence intervals. Maybe I have just looked in the wrong places. I'll find out once I've gone through the links somervta gave me.

If you're doing one of those simple problems (e.g. a cancer test has false positive rate X and false negative rate Y, and the prior rate of cancer is Z), then you're not getting confidence intervals because you're assuming that X, Y, and Z are known exactly. If you have confidence intervals to input for X, Y, and Z, you will output confidence intervals as well.
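One way to see intervals going in and coming out is a small Monte Carlo sketch. Everything numeric here is hypothetical: the ranges for X, Y, and Z are invented, and treating each as independently uniform over its range is just one possible reading of what an input interval means.

```python
import random


def posterior(prior: float, false_pos: float, false_neg: float) -> float:
    """P(cancer | positive test) by Bayes' theorem."""
    true_pos = 1.0 - false_neg
    evidence = true_pos * prior + false_pos * (1.0 - prior)
    return true_pos * prior / evidence


def posterior_interval(prior_rng, fp_rng, fn_rng, n=10_000, level=0.9, seed=0):
    """Central interval for the posterior when the inputs are uncertain.

    Each input is sampled uniformly and independently from its
    (low, high) range; both choices are simplifying assumptions.
    """
    rng = random.Random(seed)
    draws = sorted(
        posterior(rng.uniform(*prior_rng), rng.uniform(*fp_rng), rng.uniform(*fn_rng))
        for _ in range(n)
    )
    lo = draws[int(n * (1 - level) / 2)]
    hi = draws[int(n * (1 + level) / 2) - 1]
    return lo, hi


# Exact inputs give a single number ...
point = posterior(prior=0.01, false_pos=0.10, false_neg=0.20)
# ... while ranges around the same values give a range of posteriors.
low, high = posterior_interval((0.005, 0.015), (0.05, 0.15), (0.10, 0.30))
print(f"point estimate: {point:.3f}")
print(f"90% interval:   {low:.3f} to {high:.3f}")
```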

Similarly, 2+2 will always give you 4, without a confidence interval attached. But if you add two numbers with confidence intervals, you'll get a number with a confidence interval, probably after you make some assumptions about independence or about what your intervals mean.

May I suggest a book on statistics? I am partial to Larry Wasserman's "All of Statistics," but Larry is not a Bayesian.

[-][anonymous]6y 3

guys, I'm not evil, you don't have to downvote me all the time

If you have a probability of probabilities, you can just collapse it into one probability. Suppose you're 50% sure that A has 80% probability, and 50% sure that it has 60% probability. Let B be the hypothesis that A has 80% probability.

P(A|B) = 0.8

P(A|!B) = 0.6

P(B) = 0.5

P(A&B) = P(A|B)P(B) = 0.8*0.5 = 0.4

P(A&!B) = P(A|!B)P(!B) = 0.6*0.5 = 0.3

P(A) = P(A&B)+P(A&!B) = 0.4+0.3 = 0.7

So you can just say that A has 70% probability and be done with it. No need for a confidence interval.
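The same calculation works for any number of hypotheses. A minimal sketch (the function name `collapse` is made up for illustration), using exact fractions to avoid floating-point noise:

```python
from fractions import Fraction


def collapse(beliefs):
    """Collapse (weight, probability) pairs into one probability.

    This is the law of total probability: P(A) = sum of P(A|H) * P(H)
    over mutually exclusive hypotheses H whose weights sum to 1.
    """
    total_weight = sum(w for w, _ in beliefs)
    if total_weight != 1:
        raise ValueError("hypothesis weights must sum to 1")
    return sum(w * p for w, p in beliefs)


# 50% sure A has probability 0.8, 50% sure it has probability 0.6.
p_a = collapse([
    (Fraction(1, 2), Fraction(4, 5)),
    (Fraction(1, 2), Fraction(3, 5)),
])
print(p_a)  # 7/10
```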
