Suspiciously balanced evidence

by gjm · 3 min read · 12th Feb 2020 · 24 comments


Inside/Outside View · Calibration · Rationality

What probability do you assign to the following propositions?

  • "Human activity has caused substantial increases in global mean surface temperature over the last 50 years and barring major policy changes will continue to do so over at least the next 50 years -- say, at least half a kelvin in each case."
  • "On average, Christians in Western Europe donated a larger fraction of their income last year to non-religious charitable causes than atheists."
  • "Over the next 10 years, a typical index-fund-like basket of US stocks will increase by more than 5% per annum above inflation."
  • "In 2040, most global electricity generation will be from renewable sources."

These are controversial and/or difficult questions. There are surely a lot of people who think they know the answers and will confidently proclaim that these propositions are just true or, as the case may be, false. But you are a sophisticated reasoner, able to think in probabilistic terms, and I expect your answers to questions like these mostly lie between p=0.1 and p=0.9 or thereabouts, just like mine. No foolish black-and-white thinking for the likes of us!

(I confess that my estimate for the first of those propositions is above 0.9. But it's not above 0.99.)

... But isn't it odd that the evidence should be so evenly balanced? No more than 3 bits or so either way from perfect equality? Shouldn't we expect that the totality of available evidence, if we could evaluate it properly, would make for a much larger imbalance? If we encounter only a small fraction of it (which we'd need to, to explain such evenly balanced results), shouldn't we expect that randomness in the subset we happen to encounter will in some cases make us see a large imbalance even if any given single piece of evidence is about as likely to go one way as the other? What's going on here?
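That "3 bits" figure is just the log-odds conversion: a probability of 0.9 corresponds to odds of 9:1, or about 3.17 bits of evidence in favour. A quick sketch of the conversion (the function name is mine, not from the post):

```python
import math

def bits_of_evidence(p):
    """Log-odds of probability p, measured in bits.

    p = 0.5 gives 0 bits; each doubling of the odds adds one bit."""
    return math.log2(p / (1 - p))

# Probabilities between 0.1 and 0.9 stay within ~3.17 bits of even odds.
print(round(bits_of_evidence(0.9), 2))   # 3.17
print(round(bits_of_evidence(0.1), 2))   # -3.17

# A more lopsided-sounding probability is not that much further out:
print(round(bits_of_evidence(0.99), 2))  # 6.63
```

The point of the scale: if each piece of evidence shifts you by a bit or so, staying within ±3 bits after weighing lots of evidence is the surprising balance the post is asking about.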

Let me add a bit of further fuel to the fire. I could have tweaked all those propositions somewhat -- more than 3%, or more than 7%, above inflation; more than 40%, or more than 60%, or 2050 instead of 2040. Surely that ought to change the probabilities quite a bit. But the answers I'd have given to the questions would still have probabilities between 0.1 and 0.9, and I bet others' answers would have too. Can things really be so finely enough balanced to justify this?

I can think of two "good" explanations (meaning ones that don't require us to be thinking badly) and one not-so-good one.

Good explanation #1: I chose propositions that I know are open to some degree of doubt or controversy. When I referred to "questions like these", you surely understood me to mean ones open to doubt or controversy. So questions where the evidence is, or seems to us to be, much more one-sided were filtered out. (For instance, I didn't ask about young-earth creationism, because I think it's almost certainly wrong and expect most readers here to feel the same way.) ... But isn't it strange that there are so many questions for which the evidence we have is so very balanced?

Good explanation #2: When assessing a question that we know is controversial but that seems one-sided to us, we tend to adjust our probabilities "inward" towards 1:1 as a sort of a nod to the "outside view". I think this, or something like it, is probably a very sensible idea. ... But I think it unlikely that many of us do it in a principled way, not least because it's not obvious how to.

Not-so-good explanation: We have grown used to seeing probability estimates as a sign of clear thought and sophistication, and every time we accompany some opinion with a little annotation like "(p=0.8)" we get a little twinge of pride at how we quantify our opinions, avoid black-and-white thinking, etc. And so it becomes a habit, and we translate an internal feeling of confidence-but-not-certainty into something like "p=0.8" even when we haven't done the sort of evidence-weighing that might produce an actual numerical result.

Now, I'm not sure which of two quite different conclusions I actually want to endorse.

  • "The temptation to push all probabilities for not-crazy-sounding things into the middle of the possible range is dangerous. We are apt to treat things as substantially-open questions that really aren't; to be timid where we should be bold. Let's overcome our cowardice and become more willing to admit when the evidence substantially favours one position over another."
  • "Our practice is better than our principles. Empirically, we make lots of mistakes even in cases where numerical evidence-weighing would lead us to probabilities close to 0 or 1. So we should continue to push our probability estimates inward. The challenge is to figure out a more principled way to do it."

Here is a possible approach that tries to combine the virtues of both:

  • Allow accumulating evidence to push your probability estimates towards 0 or 1; be unashamed by these extreme-sounding probabilities. BUT
  • Keep a separate estimate of how confident you are that your approach is correct; that your accumulation of evidence is actually converging on something like the right answer. THEN,
  • When you actually need a probability estimate, bring these together.

Suppose your "internal" probability estimate is p, your probability that your approach is correct is q, and your probability conditional on your approach being wrong is r. Then your overall probability estimate is qp + (1-q)r, and (holding r constant) in effect your internal probability estimates are linearly squashed into the interval from (1-q)r to q + (1-q)r. So, for instance, if you're 90% sure your approach is right and your best guess if your approach is all wrong is that the thing's 75% likely to be true, then your estimates are squashed into the range [7.5%, 97.5%].
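A minimal sketch of that squashing (variable names are mine, chosen to match the prose; p is the internal estimate, q the confidence in the approach, r the fallback probability):

```python
def squashed_probability(p_internal, q_approach_ok, r_if_wrong):
    """Overall estimate q*p + (1-q)*r.

    As p_internal ranges over [0, 1], the result ranges linearly
    over the interval [(1-q)*r, q + (1-q)*r]."""
    return q_approach_ok * p_internal + (1 - q_approach_ok) * r_if_wrong

# With q = 0.9 and r = 0.75, even the most extreme internal estimates
# land inside [7.5%, 97.5%]:
print(round(squashed_probability(0.0, 0.9, 0.75), 3))  # 0.075
print(round(squashed_probability(1.0, 0.9, 0.75), 3))  # 0.975
```

Note that the squashing is asymmetric unless r = 0.5: the fallback probability decides how much room is trimmed from each end.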

Cautionary note: There's an important error I've seen people make when trying to do this sort of thing (or encourage others to do it), which is to confuse the propositions "I'm thinking about this all wrong" and "the conclusion I'm fairly sure of is actually incorrect". Unless the conclusion in question is a very specific one, that's likely a mistake; the probability I've called r above matters and surely shouldn't be either 0 or 1.