Confidence Confusion

[-]Qiaochu_Yuan8y80

I had exactly this argument with Critch several years ago. He was very strongly on the side of reporting all of the digits you have. I disagreed with him at the time but now I think he's right. As outside view support, I hear that Tetlock's superforecasters do noticeably worse if you round their probabilities off to the nearest multiple of 5 or 10, but I don't remember where I heard this and haven't read Superforecasting myself to corroborate.

As an inside view argument, rounding is extremely sensitive to how you choose to parameterize probabilities. Here are four options: you could choose to think in terms of probabilities, log probabilities, odds, or log odds. In each of these "coordinate systems" rounding has very different results. So mathematically it's not a very principled thing to do to a probability.
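To make the parameterization point concrete, here is a minimal Python sketch. The helper names `round_sig` and `round_in_coordinates` and the example value 31.5% are illustrative choices, not part of the original comment. It rounds the same probability to one significant figure in each of the four coordinate systems and maps the result back to a probability.

```python
import math

def round_sig(x, sig=1):
    """Round x to `sig` significant figures (hypothetical helper)."""
    if x == 0:
        return 0.0
    digits = sig - 1 - math.floor(math.log10(abs(x)))
    return round(x, digits)

def round_in_coordinates(p, sig=1):
    """Round p (strictly between 0 and 1) in four coordinate systems,
    then convert each rounded value back to a probability."""
    odds = p / (1 - p)
    rounded_odds = round_sig(odds, sig)
    rounded_log_odds = round_sig(math.log(odds), sig)
    return {
        "probability": round_sig(p, sig),
        "log probability": math.exp(round_sig(math.log(p), sig)),
        "odds": rounded_odds / (1 + rounded_odds),
        "log odds": 1 / (1 + math.exp(-rounded_log_odds)),
    }

if __name__ == "__main__":
    for name, value in round_in_coordinates(0.315).items():
        print(f"{name:>16}: {value:.3f}")
```

For 31.5% the four rounded answers all differ from one another, which is the sense in which "round to one digit" is not well defined until you pick a coordinate system.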

The thing I usually do, when asked to elicit a probability, is report a probability (usually 2 sig figs) and then also a subjective sense of how easy it would be to shift that probability by giving me more evidence / allowing me more time to think. I also sometimes straight up refuse to report a probability. The thing I generally prefer to do is to share my models instead of sharing my probabilities.

I think the thought experiment is dramatically underspecified. Who are Albert and Betty reporting probabilities to, and what will those probabilities be used for?

[-]Unnamed8y80

Scott mentioned that fact about superforecasters in his review; from what I remember the book doesn't add much detail beyond Scott's summary.

> One result is that while poor forecasters tend to give their answers in broad strokes – maybe a 75% chance, or 90%, or so on – superforecasters are more fine-grained. They may say something like “82% chance” – and it’s not just pretentious, Tetlock found that when you rounded them off to the nearest 5 (or 10, or whatever) their accuracy actually decreased significantly. That 2% is actually doing good work.

[-]Robert Miles8y60

Perhaps the principled way is to try representing your probability to the same number of significant figures as a probability, as a log probability, as odds, and as log odds, and then present whichever option happens to fall closest to your true estimate :p

[-]alkjash8y20

I think I'm most interested in the last question I posed: as a conversational default when I'm not interested in diving into models and computations, should I share all the digits or as many as my confidence allows?

[-]Qiaochu_Yuan8y50

I think you should share 2 digits.

[-]habryka8y70

I think you should share more digits. I sometimes say 33.5%, and experience it as meaningfully different from 34% or 33%.

This is obviously exacerbated around the ends of the probability spectrum. I.e. there is a massive difference between 99% and 99.5%, and it seems very important to feel comfortable distinguishing between them.

[-]Qiaochu_Yuan8y50

That's fair.

When I say 2 digits I mean 2 sig figs, so e.g. 0.05% is one digit. I think if you're reporting a probability near 99% it makes sense to report 1 minus that probability, to 2 (or 3, or more if you have them) sig figs.
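A rough sketch of why the ends of the spectrum matter, and of the "report 1 minus the probability" convention. The function names here are made up for illustration. It measures the gap between two probabilities in bits of log odds and formats the complement to a chosen number of significant figures.

```python
import math

def odds(p):
    """Odds in favour, for p strictly between 0 and 1."""
    return p / (1 - p)

def bits_between(p, q):
    """Distance between two probabilities, measured in bits of log odds."""
    return abs(math.log2(odds(q) / odds(p)))

def complement_report(p, sig=2):
    """Report a near-certain probability as '1 - x', with x to `sig` sig figs."""
    return f"1 - {1 - p:.{sig}g}"

if __name__ == "__main__":
    print(f"99% -> 99.5%: {bits_between(0.99, 0.995):.3f} bits")
    print(f"50% -> 50.5%: {bits_between(0.50, 0.505):.3f} bits")
    print(complement_report(0.9987))
```

Moving from 99% to 99.5% roughly doubles the odds (about 99:1 to about 199:1), while the same half-point shift near 50% barely changes them.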

[-]Richard_Ngo8y10

> The thing I usually do, when asked to elicit a probability, is report a probability (usually 2 sig figs) and then also a subjective sense of how easy it would be to shift that probability by giving me more evidence / allowing me more time to think.

What is the correct technical way to summarise the latter quantity (ease of shifting), in an idealised setting?

[-]Qiaochu_Yuan8y20

Uh, I dunno, something like, I currently have a belief about the probability distribution of kinds of evidence I expect to encounter in the future, and from there I can compute a probability distribution over what my posterior beliefs are after updating on that evidence, then compute some summary statistic of that distribution that measures how spread out it is. An easy setting in which this can be made completely formal is repeatedly flipping a coin of unknown bias.
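One way to sketch the coin case, under assumptions not stated in the comment: the belief about the bias is a Beta(a, b) distribution, the future evidence is n more flips, and the summary statistic is the standard deviation of the posterior probability of heads. All function names are hypothetical.

```python
import math

def beta_binomial_pmf(k, n, a, b):
    """P(k heads in n future flips) under a Beta(a, b) belief about the bias."""
    def log_beta(x, y):
        return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)
    return math.comb(n, k) * math.exp(log_beta(a + k, b + n - k) - log_beta(a, b))

def posterior_spread(a, b, n):
    """Standard deviation of the posterior P(heads) after n more flips,
    as predicted today by the Beta(a, b) belief."""
    prior_mean = a / (a + b)  # also the expected posterior mean
    variance = sum(
        beta_binomial_pmf(k, n, a, b) * ((a + k) / (a + b + n) - prior_mean) ** 2
        for k in range(n + 1)
    )
    return math.sqrt(variance)

if __name__ == "__main__":
    # Both beliefs report 50% today, but they are not equally easy to shift.
    print(f"Beta(1, 1),   10 flips: {posterior_spread(1, 1, 10):.3f}")
    print(f"Beta(50, 50), 10 flips: {posterior_spread(50, 50, 10):.3f}")
```

Both beliefs give 50% as the point estimate, but the uniform prior expects ten flips to move it far more than the Beta(50, 50) prior does. That gap is one candidate for the "ease of shifting" number.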

[-]gjm8y60

Despite the first sentence of this post, I don't think it's actually about probabilities. The same questions arise when you have any other sort of number to report.

The answer seems to me to be composed of one kinda-obvious part and one kinda-impossible-to-determine part.

The obvious part is that unless for some reason you're deliberately deceiving, you should do your best to convey the information you have, which includes both your best estimate of the number and how much you think you know about it (really it's something like a probability distribution over the possible values of the number) within whatever constraints you have -- e.g., on the attention span of whoever you're reporting the number to, or your insight into your beliefs.

The kinda-impossible part is figuring out how those considerations actually trade off. Usually you will have limited resources, limited insight into what you actually believe, an audience with limited patience, a social context in which numbers with lots of nonzero digits in them are taken as implicit claims to detailed knowledge, etc., and exactly what that means for how you should report the numbers is going to be (1) different each time, as all those factors vary, and (2) very difficult to determine.

If you're dealing with a fairly technical audience, or one strongly motivated to pay attention to the details of what you say, I think it should be OK to say things like "31.5% ± 2.5%". Otherwise, I suspect there usually is no way to avoid their understanding being seriously deficient, and you get to choose between saying 31.5% and misleading them about your confidence, and saying 30% and misleading them about your best point estimate.

[-]Unnamed8y60

Albert and Betty should share likelihood ratios, not posterior beliefs.
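A minimal sketch of what sharing likelihood ratios buys you, assuming a shared prior and evidence that is independent given the hypothesis. The likelihood ratios 3 and 2 are invented for illustration. Posterior odds are the prior odds multiplied by each likelihood ratio.

```python
import math

def combine_likelihood_ratios(prior, likelihood_ratios):
    """Update a shared prior probability with independent likelihood ratios."""
    posterior_odds = prior / (1 - prior) * math.prod(likelihood_ratios)
    return posterior_odds / (1 + posterior_odds)

if __name__ == "__main__":
    prior = 0.5
    albert_lr, betty_lr = 3.0, 2.0  # hypothetical values
    combined = combine_likelihood_ratios(prior, [albert_lr, betty_lr])
    # Naively averaging their individual posteriors gives a different answer.
    albert_alone = combine_likelihood_ratios(prior, [albert_lr])
    betty_alone = combine_likelihood_ratios(prior, [betty_lr])
    print(f"combined via likelihood ratios: {combined:.3f}")
    print(f"average of posteriors:          {(albert_alone + betty_alone) / 2:.3f}")
```

If the two pieces of evidence overlap, multiplying the ratios double-counts them, so the sketch only applies when the independence assumption holds.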

[-]alkjash8y20

This is definitely a major improvement in situations where people have mostly independent data to aggregate, but in real life independence seems to be fairly rare and using likelihood ratios would cause extremely unnatural updating. In the end you can't get around sharing all your models and data to solve the independence issue.

I guess my real question is: in a low-bandwidth/low-effort setting like casual conversation what is the best single number to share? If you had to design discourse norms surrounding sharing probabilities in this setting, is likelihood ratio really the right norm?

[-]Ben Pace8y20

Moved to frontpage.

[-]John Faben8y10

> If social norms indeed dictate that significant figures transmit confidence, might it be deceptive to report 31.5 instead of 30 in conversation about the dinosaur market?

Not if it's your bid in a market, yes if someone asks you for your probability estimate. Your best point estimate is obviously your best point estimate, and suffers from rounding.

> Betty’s landing capsule collides with a giant teapot in upper orbit and lands several hundred miles away from target. She still has to report her beliefs about the mysterious coin.

Why? If she has to offer someone a bet as to which side the coin lands on, she should probably offer even odds, but I can't see any situation where she has to report her probability but isn't able to report that this is based on nothing more than her prior.

[-]alkjash8y10

The thought experiment was an attempt to formalize the constraints of casual conversation, where it would feel to me pedantic to report additional details about a probability beyond a single number.
