I was chatting with Andrew Critch about the idea of Reacts on LessWrong.

Specifically, the part where I thought there are particular epistemic states that don’t have words yet, but should. And that a function of LessWrong might be to make various possible epistemic states more salient as options. You might have reacts for “approve/disapprove” and “agree/disagree”... but you might also want reactions that let you quickly and effortlessly express “this isn’t exactly false or bad but it’s subtly making this discussion worse.”

Fictionalized, paraphrased Critch said, “Hmm, this reminds me of some particular epistemic states I recently noticed that don’t have names.”

“Go on”, said I.

“So, you know the feeling of being uncertain? And how it feels different to be 60% sure of something, vs 90%?”

“Sure.”

“Okay. So here are two other states you might be in:

  • 75% sure that you’ll eventually be 99% sure,
  • 80% sure that you’ll eventually be 90% sure.”

He let me process those numbers for a moment.

...

Then he continued: “Okay, now imagine you’re thinking about a particular AI system you’re designing, which might or might not be alignable.

“If you’re feeling 75% sure that you’ll eventually be 99% sure that the AI is safe, this means you think that eventually you’ll understand the AI clearly enough to feel confident that you can turn it on without destroying humanity. Moreover, you expect to be able to convince other people that it can be turned on without destroying humanity.

“Whereas if you’re 80% sure that eventually you’ll be 90% sure that it’ll be safe, even in the future state where you’re better informed and more optimistic, you might still not actually be confident enough to turn it on. And even if for some reason you are, other people might disagree about whether you should turn it on.

“I’ve noticed people tracking how certain they are of something, without paying attention to whether their uncertainty is possible to resolve. And this has important ramifications for what kind of plans they can make. Some plans require near-certainty. Especially many plans that require group coordination.”

“Makes sense”, said I. “Can I write this up as a blogpost?”


I’m not quite sure about the best name here, but this seems like a useful concept to have a handle for. Something like “unresolvable uncertainty?”

15 comments

Minor clarification/nitpick: seems like it's not about whether the uncertainty is possible to resolve. Rather, it's about whether we currently see an easy path to resolve it, or whether we expect it to be resolved in the near future. If we don't have an easy path to resolving some uncertainty, then searching for such a path can be worthwhile - talking about whether the uncertainty is "possible" to resolve de-emphasizes that aspect.

Nod. Although I'll note "near future" can include "years or in some cases decades."

I think I must be the odd one out here in terms of comfort using probabilities close to 1 and 0. Because 90% and 99% are not "near certainty" to me.

How sure are you that the English guy who you've been told helped invent calculus and did stuff with gravity and optics was called "Isaac Newton"? We're talking about probabilities like 99.99999% here. (Conditioning on no dumb gotchas from human communication, e.g. me using a unicode character from a different language and claiming it's no longer the same, which has suddenly become much more salient to you and me both. An "internal" probability, if you will.)

Maybe it would help to think of this as about 20 bits of information past 50%? Every bit of information you can specify about something means you are assigning a more extreme probability distribution about that thing. The probability of the answer being "Isaac Newton" has a very tiny prior for any given question, and only rises to 50% after lots of bits of information. And if you could get to 50%, it's not strange that you could have quite a few more bits left over, before eventually running into the limits set by the reliability of your own brain.
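One way to make the "bits past 50%" framing concrete is to read it as log-odds in base 2. That reading is an assumption (the comment doesn't define the measure), and `bits_past_even` is a hypothetical helper name. A minimal sketch:

```python
import math


def bits_past_even(p: float) -> float:
    """Log-odds of p in base 2: 0 at p = 0.5, growing as p approaches 1.

    Assumption: "bits past 50%" means log2(p / (1 - p)).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log2(p / (1.0 - p))


# Probabilities mentioned in the post and in this comment.
for p in (0.6, 0.9, 0.99, 0.9999999):
    print(f"{p}: {bits_past_even(p):.2f} bits")
```

Under that reading, 90% is roughly 3 bits, 99% is roughly 7 bits, and 99.99999% is roughly 23 bits, which is in the same ballpark as the "about 20 bits" above.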

So when you say some plans require near certainty, I'm not sure if you mean what I mean but chose smaller probabilities, or if you mean a somewhat different point about social norms about when numbers are big/small enough that we are allowed to stop/start worrying about them. Or maybe you mean a third thing about legibility and communicability that is correlated with probability but not identical?

This feels highly related to the idea of Transparent, Opaque, and Knightian Uncertainty. You can have object level certainty/uncertainty, and then you can have meta uncertainty about WHICH type of environment you're in, which changes what strategies you should be using to mitigate the risk associated with the uncertainty.

you can have meta uncertainty about WHICH type of environment you're in, which changes what strategies you should be using to mitigate the risk associated with the uncertainty.

While I agree that it's helpful to recognize situations where it's useful to play more defensively than normal, I don't think "meta uncertainty" (or "Knightian uncertainty", as it's more typically called) is a good concept to use when doing so. This is because there is fundamentally no such thing as Knightian uncertainty; any purported examples of "Knightian uncertainty" can actually be represented just fine in the standard Bayesian expected utility framework in one of two ways: (1) by modifying your prior, or (2) by modifying your assignment of utilities.

I don't think it's helpful to assign a separate label to something that is, in fact, not a separate thing. Although humans do exhibit ambiguity aversion in a number of scenarios, ambiguity aversion is a bias, and we shouldn't be attempting to justify biased/irrational behavior by introducing additional concepts that are otherwise unnecessary. Nate Soares wrote a mini-sequence addressing this idea several years ago, and I really wish more people had read it (although if memory serves, it was posted during the decline of LW1.0, which may explain the lack of familiarity).

I seriously recommend that anyone unfamiliar with the sequence give it a read; it's not long, and it's exceptionally well-written. I already linked three of the posts above, so here's the last one.

I specifically define Knightian uncertainty (which is separate from my use of meta-uncertainty) in the linked post, as referring to specific strategic scenarios where naive STRATEGIES of making decisions with expected value fail, for a number of reasons (the distribution is changing too fast, the environment is adversarial, etc).

This is different from the typical definition in that it's not implying that you can't measure the uncertainty - the Bayesian epistemology still applies. Rather, it's claiming that there are other strategies of risk mitigation you should use, separate from your measurement of uncertainty, simply implied by the environment. This is, I think, what proponents of Knightian uncertainty are actually talking about, and it's not at odds with Bayesianism.

I'm confused. How does conservation of expected evidence come in here? When you say "90% sure" do you mean "assign 90% probability of truth to the hypothesis"? You can't expect that to change. Or do you mean "90% sure that my probability estimate of X% is correct"?

Your probability estimate ALREADY INCLUDES the probability distributions of future experiments (and of future research).


"I am 75% confident that hypothesis X is true--but if X really is true, I expect to gather more and more evidence in favor of X in the future, such that I expect my probability estimate of X to eventually exceed 99%. Of course, right now I am only 75% confident that X is true in the first place, so there is a 25% (subjective) chance that my probability estimate of X will decrease toward 0 instead of increasing toward 1."

That makes perfect sense. And you should probably make BOTH halves of the statement: I expect to increase my estimate to 90% or decrease it to 20%.

Another way to put this: I expect a large chance of a small update upwards, and a small chance of a large update downwards. This still conserves expected evidence.

On net, I expect to end up back where I started, EVEN though there's a higher chance I'll get evidence confirming my view.
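The arithmetic behind "a large chance of a small update upwards, and a small chance of a large update downwards" can be checked directly. The sketch below is illustrative only: the 80%/20% branch chances are assumed (this comment gives only the 90% and 20% endpoints), and the function names are hypothetical.

```python
def expected_posterior(outcomes):
    """Chance-weighted average of future estimates.

    outcomes: list of (chance_of_branch, estimate_in_that_branch).
    Conservation of expected evidence says this must equal the current estimate.
    """
    total = sum(chance for chance, _ in outcomes)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("branch chances must sum to 1")
    return sum(chance * estimate for chance, estimate in outcomes)


def required_downside(prior, chance_up, estimate_up):
    """Estimate in the 'down' branch that is forced by the current estimate."""
    return (prior - chance_up * estimate_up) / (1.0 - chance_up)


# Assumed example: 80% chance of ending at 90%, 20% chance of ending at 20%.
print(expected_posterior([(0.8, 0.9), (0.2, 0.2)]))

# Same numbers solved the other way round: what must the downside be?
print(required_downside(prior=0.76, chance_up=0.8, estimate_up=0.9))
```

With those assumed numbers the current estimate has to be about 76%: the likely small move up (76% to 90%) is exactly balanced by the unlikely large move down (76% to 20%). If `required_downside` returns something outside the 0 to 1 range, the stated combination of beliefs is not coherent.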

While a true Bayesian's estimate already includes the probability distributions of future experiments, in practice I don't think it's easy for us humans to do that. For instance, I know based on past experience that a documentary on X will not incorporate as much nuance and depth as an academic book on X. I *should* immediately reduce the strength of any update to my beliefs on X upon watching a documentary given that I know this, but it's hard to do in practice until I actually read the book that provides the nuance.

In a context like that, I definitely have experienced the feeling of "I am pretty sure that I will believe X less confidently upon further research, but right now I can't help but feel very confident in X."

Thank you - this is an important distinction. Are we talking about how something feels, or about probability estimates? I'd argue the error is in using numbers and probability notation to describe feelings of confidence that you haven't actually tried to be rational about.

The topic of illegible beliefs (related to aliefs), and of how to apply math to them, is virtually unexplored.

Are we talking about how something feels, or about probability estimates?

In practice what I'm trying to do with practices like calibration training is determine the latter from the former.

This feels like it's a specific instance of a more general thing around how fast I can converge on a guess about a distribution I'm sampling from. Imagine a scatterplot with data points added one at a time. There are both negative guesses (this point rules out these distributions) and positive guesses (it sorta looks like this will converge to a bimodal distribution). Depending on payoff structure and priors I might want to lean more heavily towards faster/sparser guesses. I'm not up on current ML but this has to be common enough to be a named thing.
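One way to picture "data points added one at a time" is a Bayesian update over a small, fixed set of candidate distributions, stopping once one candidate passes a confidence threshold. This is a sketch under stated assumptions, not the named technique the comment is asking about: the two candidates, their parameters, the 0.99 threshold, and all function names are made up for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Hypothetical candidates: one broad hump versus two narrow humps.
CANDIDATES = {
    "unimodal": lambda x: normal_pdf(x, 0.0, 2.0),
    "bimodal": lambda x: 0.5 * normal_pdf(x, -2.0, 0.7) + 0.5 * normal_pdf(x, 2.0, 0.7),
}


def update(posterior, x):
    """One Bayes step: weight each candidate by its likelihood of x, then renormalize."""
    weighted = {name: posterior[name] * CANDIDATES[name](x) for name in posterior}
    total = sum(weighted.values())
    return {name: w / total for name, w in weighted.items()}


def points_until_confident(samples, threshold=0.99):
    """Return (points used, leading candidate, posterior); points used is None if never confident."""
    posterior = {name: 1.0 / len(CANDIDATES) for name in CANDIDATES}
    for n, x in enumerate(samples, start=1):
        posterior = update(posterior, x)
        best = max(posterior, key=posterior.get)
        if posterior[best] >= threshold:
            return n, best, posterior
    return None, max(posterior, key=posterior.get), posterior


# Draw from the bimodal candidate and see how many points it takes to commit.
rng = random.Random(0)
samples = [rng.gauss(rng.choice([-2.0, 2.0]), 0.7) for _ in range(200)]
n, best, posterior = points_until_confident(samples)
print(n, best)
```

Lowering `threshold` corresponds to the "faster/sparser guesses" end of the tradeoff: fewer points are needed, at the cost of committing to the wrong candidate more often.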