Nope. Suppose I roll a 100-sided die, and all LessWrongers write down their centred 80% credible interval for where the answer should fall. If the LWers are rational and calibrated, that interval should be [10,90]. So the actual outcome will fall in everybody's credible interval or nobody's. The relevant averaging should happen across questions, not across predictors.
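A minimal sketch of the die example, to make the "everybody's or nobody's" point concrete. It assumes the die is labelled 1 to 100 and uses [11,90], which covers exactly 80 faces (the [10,90] above covers 81). The function names and the group size of 18 are illustrative assumptions, not anything from the survey itself.

```python
import random

def shared_roll_hits(n_forecasters=18, interval=(11, 90), seed=0):
    """One roll, many forecasters who all hold the same interval."""
    rng = random.Random(seed)
    roll = rng.randint(1, 100)  # a single shared outcome
    lo, hi = interval
    # Every forecaster is scored against the same roll.
    return roll, [lo <= roll <= hi for _ in range(n_forecasters)]

def hit_rate_across_questions(n_questions=10_000, interval=(11, 90), seed=0):
    """One forecaster, many independent rolls."""
    rng = random.Random(seed)
    lo, hi = interval
    hits = sum(lo <= rng.randint(1, 100) <= hi for _ in range(n_questions))
    return hits / n_questions

roll, hits = shared_roll_hits()
print(roll, sum(hits), "of", len(hits), "forecasters hit")
print(hit_rate_across_questions())
```

On a single shared roll the count of hits is either 0 or all 18, so the proportion of the group that hit says nothing about calibration. Only the hit rate across many questions comes out near 80%.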


Jotto999, 6y

So I thought about it, and I think you're correct. At least, if their error was correlated and nonrandom (which is very natural in many situations), then we wouldn’t expect 14 of 18 in the group to get it in their range. So you're right. I can imagine hypotheticals where "group calibration in one-offs" might mean something, but not here now that I think about it.

Instead, I should've just stuck to pointing out how far outside their ranges many of them were, rather than the proportion of the group that got it in their range. For example, suppose an anonymous forecaster places a 0.0000001% probability on some natural macro-scale event having outcome A instead of B on the next observation. Outcome A happens. That is strong evidence that they weren't just "incorrect with a lot of error"; they're probably really uncalibrated too.

Eyeballing the survey chart, many of those 80% CIs are so narrow that they imply the true figure was a many-sigma event. That's really incredible, and I would take it as evidence they're also uncalibrated (not just wrong in this one-off). That's a much better way to infer they're uncalibrated than the 14 of 18 thing.

Semi-related: I'm unclear about a detail in your dice example. You say a rational and calibrated interval “should be [10,90]” for the die roll. Why? A calibrated-over-time 80% confidence interval on such a die could be placed anywhere (e.g. [1,80] or [21,100]), so long as it is 80 units wide.
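A quick sketch to check that claim by enumeration, assuming a fair die labelled 1 to 100. The `coverage` helper is a hypothetical name introduced only for this illustration.

```python
from fractions import Fraction

def coverage(lo, hi, sides=100):
    """Exact probability that a fair die with faces 1..sides lands in [lo, hi]."""
    faces = range(1, sides + 1)
    return Fraction(sum(lo <= face <= hi for face in faces), sides)

# Three different placements, each 80 faces wide.
for interval in [(1, 80), (21, 100), (11, 90)]:
    print(interval, coverage(*interval))
```

All three placements give the same coverage, so calibration alone does not pick one out; it is the "centred" requirement in the original example that does.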


A quick and crude comparison of epidemiological expert forecasts versus Metaculus forecasts for COVID-19
