Calibration for continuous quantities

[-]gjm16y50

You shouldn't need to do any integrals to show that the PIT gives a uniform distribution. Suppose Pr(X <= x) = p; then (assuming no jumps in the cdf) the PIT maps x to p. In other words, writing P for the random variable produced by the PIT, Pr(P <= p) = p, so P is uniform.
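A quick numerical check of this argument, as a minimal sketch rather than anything from the original post: it assumes X is exponential with rate 1 (an arbitrary choice of continuous distribution), applies the cdf to each draw, and compares the empirical Pr(P <= p) with p. The function names are made up for illustration.

```python
import math
import random

def exponential_cdf(x, rate=1.0):
    """Cdf of an exponential distribution: Pr(X <= x) = 1 - exp(-rate * x)."""
    return 1.0 - math.exp(-rate * x)

def pit_values(n, rate=1.0, seed=0):
    """Draw n samples of X and map each through its own cdf (the PIT)."""
    rng = random.Random(seed)
    return [exponential_cdf(rng.expovariate(rate), rate) for _ in range(n)]

def empirical_cdf_at(values, p):
    """Fraction of PIT values that are <= p; should be close to p if P is uniform."""
    return sum(v <= p for v in values) / len(values)

values = pit_values(100_000)
for p in (0.1, 0.25, 0.5, 0.9):
    print(p, round(empirical_cdf_at(values, p), 3))
```

If the argument holds, each printed pair should agree up to sampling noise.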

[-]Cyan16y00

Yup, that works. I would only caution that "assuming no jumps in the cdf" is not quite the right condition: singular distributions (e.g., the Cantor distribution) contain jumps, and the PIT works fine for them. The correct condition is that the random variable not have a discrete component.
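To see why a discrete component matters, here is a small sketch under an assumed toy distribution (not from the thread): X is exactly 0 with probability 0.3 and otherwise uniform on (0, 1). The point mass at 0 makes the cdf jump from 0 to 0.3, so the PIT can never land below 0.3.

```python
import random

def mixed_cdf(x):
    """Cdf of the toy mixture: point mass 0.3 at x = 0, plus 0.7 * Uniform(0, 1)."""
    if x < 0:
        return 0.0
    if x >= 1:
        return 1.0
    return 0.3 + 0.7 * x

def sample_mixed(rng):
    """Draw from the mixture: 0.0 with probability 0.3, else a uniform draw."""
    return 0.0 if rng.random() < 0.3 else rng.random()

rng = random.Random(1)
pits = [mixed_cdf(sample_mixed(rng)) for _ in range(100_000)]

# A uniform P would put about 20% of its mass at or below 0.2; here nothing lands there.
frac_below_02 = sum(p <= 0.2 for p in pits) / len(pits)
# Roughly 30% of the PIT values sit exactly on the jump at 0.3.
frac_at_atom = sum(p == 0.3 for p in pits) / len(pits)
print(frac_below_02, round(frac_at_atom, 3))
```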

[-]gjm16y00

Sure.

[-]RobinZ16y00

Pr(X <= x) is an integral, but I find this explanation clearer than the one in the OP. Upvoted.

[-]Sebastian_Hagen16y30

"This degree of over-confidence was chosen because I seem to recall reading on OB that if you ask a subject matter expert for a 95% interval, you tend to get back a reasonable 50% interval. Anyone know a citation for that?"

Not exactly; but you might be thinking of asking students for 99% intervals instead. That result was quoted in Planning Fallacy, which also lists the original sources.

[-]Cyan16y00

That might be it, but the memory that swirls foggily about in my mind has to do with engineers being asked to give intervals for the point of failure of dams...

[-]Sebastian_Hagen16y20

There is a result in Cognitive biases potentially affecting judgement of global risks, p. 17, which is about intervals given by engineers about points of dam failure. It really doesn't make your claim, but it does look like the kind of thing that could be misremembered in this way. Quoting the relevant paragraph:

Similar failure rates have been found for experts. Hynes and Vanmarke (1976) asked seven internationally known geotechnical engineers to predict the height of an embankment that would cause a clay foundation to fail and to specify confidence bounds around this estimate that were wide enough to have a 50% chance of enclosing the true height. None of the bounds specified enclosed the true failure height. Christensen-Szalanski and Bushyhead (1981) reported physician estimates for the probability of pneumonia for 1,531 patients examined because of a cough. At the highest calibrated bracket of stated confidences, with average verbal probabilities of 88%, the proportion of patients actually having pneumonia was less than 20%.

[-]Cyan16y00

Yup, I think the two links you found explain my misremembered factoid.

[-]sharpneli16y30

Very useful, considering that many variables can be approximated as continuous with good precision.

A small nitpick about "or any actual measurement of a continuous quantity": all actual measurements give rational numbers, so they are discrete.

[-]Cyan16y00

I agree with you. The bolded "or" in the quoted sentence below was accidental (and is now corrected), so this misunderstanding is likely my fault. The other "or" is an inclusive or.

"smooth" distributions over discrete quantities like dates of historical events, dollar amounts on the order of hundreds of thousands, or populations of countries, or any actual measurement of a continuous quantity

[-]Stuart_Armstrong16y20

Useful. Slightly technical. Fun. Upvoted.

[-]Cyan16y00

Thanks!

[-]Daniel_Burfoot16y20

Note the similarity between PIT values and bits. If you have a good model of a data set and use it to encode the data, then the bit string you get out will be nearly random (frequency of 1s = 50%, frequency of 10s = 25%, etc.). Analogously, when you have a good model, the PIT values should be uniform on [0,1]. A tendency of the PITs to clump up in one section of the unit interval corresponds to a randomness deficiency in a bit string.
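The clumping can be made concrete with a small simulation. This is only a sketch under assumed numbers: the data are standard normal, one forecaster predicts N(0, 1) and a hypothetical overconfident one predicts N(0, 0.5), i.e. intervals that are too narrow. The PIT values are binned into ten equal slices of [0, 1].

```python
import math
import random

def normal_cdf(x, mean=0.0, sd=1.0):
    """Cdf of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def pit_histogram(data, predicted_sd, bins=10):
    """Fraction of PIT values falling in each of `bins` equal slices of [0, 1]."""
    counts = [0] * bins
    for x in data:
        p = normal_cdf(x, sd=predicted_sd)
        # Guard the edge case p == 1.0, which would otherwise index past the end.
        counts[min(int(p * bins), bins - 1)] += 1
    return [c / len(data) for c in counts]

rng = random.Random(2)
data = [rng.gauss(0.0, 1.0) for _ in range(50_000)]  # true distribution: N(0, 1)

calibrated = pit_histogram(data, predicted_sd=1.0)     # matches the truth
overconfident = pit_histogram(data, predicted_sd=0.5)  # too narrow (assumed)

print([round(f, 3) for f in calibrated])
print([round(f, 3) for f in overconfident])
```

The calibrated histogram should come out roughly flat, while the overconfident one should pile up in the first and last bins, since true values keep landing in the tails of the too-narrow forecast.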
