Anthony DiGiovanni — LessWrong

Examples of awareness growth vs. logical updates

(Thanks to Lukas Finnveden for discussion that prompted these examples, and for authoring examples #3-#6 verbatim.)

A key concept in the theory of open-minded updatelessness (OMU) is "awareness growth", i.e., conceiving of hypotheses you hadn't considered before. It's helpful to gesture at "discovering crucial considerations" as examples of awareness growth. But not all crucial consideration discoveries are awareness growth. And we might think we don't need this OMU idea if awareness growth is just logical updating, i.e., you already had nonzero credence in some hypothesis, but you changed this credence purely by thinking more. What's the difference? Here are some examples.

1. I realize it's possible that I’m in a simulation.
   - Awareness growth. Because before realizing that, I simply hadn’t conceived of “I’m in a sim” as a way the world could be. (I don’t know how/why I could/should model myself-before-I-read-Bostrom as having had nonzero credence in “I’m in a sim”.)
2. I realize that the simulation argument implies (under such and such assumptions) I should have high credence in "I’m in a simulation".
   - Logical update. Because I’m not discovering a new way the world could be, I’m just learning of a logical implication governing my credences over ways the world could be.
3. Someone has conceived of the idea that they might be in a simulation ("like a video game!") but hasn't considered that simulations might be done for scientific investigations of the past. This updates their view on how likely it is that they're in a simulation, and also makes them think that more of the probability mass goes to the specific type "investigation of the past"-sim.
   - The first sentence is awareness growth "by refinement" (Steele and Stefansson 2021, Sec. 3.3.2). The more specific hypotheses you’re now aware of were covered by a coarser hypothesis you were previously aware of. And the second sentence is a logical update.
4. I see a rainbow-colored car. I had never previously explicitly pictured a rainbow-colored car, or thought about those words.
   - Pretty straightforward awareness growth.
5. I'm forecasting the likelihood of regime change in a country. I look up the base rates. I forecast based on them and some inside-view adjustments. Later on, the news comes in that there was a regime change due to chaos from a natural disaster. I hadn't conceptualized that possible contribution to regime change! But fortunately I had implicitly accounted for it via base rates that (it turns out) included some examples of natural disasters.
   - Also pretty straightforward awareness growth. It's just not the decision-relevant kind, to the extent that you already implicitly priced in “natural disasters lead to regime change” via the base rate. (I think in practice it will very often be ambiguous how precisely we’re pricing in hypotheses we’re unaware of, even if one might argue we’re always kinda sorta pricing them in if you squint.)
6. Someone learns Newtonian physics.
   - A mix of awareness growth and a logical update. E.g., the exact law “F = ma” wouldn’t have occurred to most people before learning any Newtonian physics; that's awareness growth. But some people might've both (a) conceived of an object getting pushed in a vacuum never slowing down, yet (b) not assigned very high P(object never slows down | object is pushed in a vacuum) before learning this was a law; that's a logical update.
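
Here's a minimal Python sketch of the distinction in the examples above, under assumptions that aren't in the original: beliefs are modeled as a dict from conceived-of hypotheses to credences, no new empirical evidence arrives between the two snapshots, and all the numbers and the `classify_change` helper are hypothetical. The point it tries to capture is that being unaware of a hypothesis is modeled as the key being absent, not as the key having credence 0.

```python
def classify_change(before: dict, after: dict) -> str:
    """Toy classifier for a change in beliefs (assumes no new empirical evidence).

    `before` and `after` map each hypothesis the agent has conceived of
    to a credence. A hypothesis the agent is unaware of is simply absent.
    """
    # The set of conceived-of hypotheses changed: expansion or refinement.
    if set(before) != set(after):
        return "awareness growth"
    # Same hypotheses, different credences: the agent only thought more.
    if any(abs(before[h] - after[h]) > 1e-9 for h in before):
        return "logical update"
    return "no change"


# Illustrative (made-up) credences loosely following examples 1-3.
before_bostrom = {"world is as it appears": 1.0}
after_conceiving = {"world is as it appears": 0.9, "simulation": 0.1}
after_argument = {"world is as it appears": 0.4, "simulation": 0.6}
after_refinement = {
    "world is as it appears": 0.4,
    "simulation (game)": 0.2,
    "simulation (study of the past)": 0.4,
}

print(classify_change(before_bostrom, after_conceiving))   # example 1
print(classify_change(after_conceiving, after_argument))   # example 2
print(classify_change(after_argument, after_refinement))   # example 3, first sentence
```

This sketch deliberately says nothing about how credences should be set after awareness growth; it only separates "the hypothesis space changed" from "the credences over a fixed space changed".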


Anthony DiGiovanni's Shortform

Anthony DiGiovanni2d60

Please, Don't Roll Your Own Metaethics

Anthony DiGiovanni22d30

tendency to "bite bullets" or accepting implications that are highly counterintuitive to others or even to themselves, instead of adopting more uncertainty

I find this contrast between "biting bullets" and "adopting more uncertainty" strange. The two seem orthogonal to me, as in, I've ~just as frequently (if not more often) observed people overconfidently endorse their pretheoretic philosophical intuitions, in opposition to bullet-biting.

Legible vs. Illegible AI Safety Problems

Anthony DiGiovanni1mo80

What other, perhaps slightly more complex or less obvious, crucial considerations are we still missing?

I agree this is very important. I've argued that if we appropriately price in missing crucial considerations,^[1] we should consider ourselves clueless about AI risk interventions (here and here).

^[1] Also relatively prosaic causal pathways we haven't thought of in detail, not just high-level "considerations" per se.

Noah Birnbaum's Shortform

Anthony DiGiovanni1mo40

A salient example to me: This post essentially consists of Paul briefly remarking on some mildly interesting distinctions about different kinds of x-risks, and listing his precise credences without any justification for them. It's well-written for what it aims to be (a quick take on personal views), but I don't understand why this post was so strongly celebrated.

MichaelDickens's Shortform

Anthony DiGiovanni2mo20

I'm curious if you think you could have basically written this exact post a year ago. Or if not, what's the relevant difference? (I admit this is partly a rhetorical question, but it's mostly not.)

Winning isn't enough

Anthony DiGiovanni3mo20

Oops, right. I think what's going on is:

1. "It's only permissible to bet at odds that are inside your representor" is only true if the representor is convex. If my credence in some proposition X is, say, P(X) = (0.2, 0.49) ∪ (0.51, 0.7), IIUC it's permissible to bet at 0.5. I guess the claim that's true is "It's only permissible to bet at odds in the convex hull of your representor".
2. But I'm not aware of an argument that representors should be convex in general.
   - If there is such an argument, my guess is that the way things would work is: We start with the non-convex set of distributions that seem no less reasonable than each other, and then add in whichever other distributions are needed to make it convex. But there would be no particular reason we'd need to interpret these other distributions as "reasonable" precise beliefs, relative to the distributions in the non-convex set we started with.
3. And, the kind of precise distribution P that would rationalize e.g. working on shrimp welfare seems to be the analogue of "betting at 0.5" in my example above. That is:
   - Our actual "set of distributions that seem no less reasonable than each other" would include some distributions that imply large positive long-term EV from working on shrimp welfare, and some that imply large negative long-term EV.
   - Whereas the distributions like P that imply vanishingly small long-term EV (given any evidence too weak to resolve our cluelessness w.r.t. long-term welfare) would lie in the convex hull. So betting at odds P would be permissible, and yet this wouldn't imply that P is "reasonable" as precise beliefs.
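
To make the representor vs. convex hull point concrete, here's a small Python sketch using the numbers from the example above. It only checks set membership for a one-dimensional credence in X; it doesn't model any decision rule for which bets are permissible, and the function names are hypothetical.

```python
# Representor for P(X): a union of open intervals, as in the example above.
REPRESENTOR = [(0.2, 0.49), (0.51, 0.7)]


def in_representor(p, intervals=REPRESENTOR):
    """True if p lies strictly inside one of the open intervals."""
    return any(lo < p < hi for lo, hi in intervals)


def convex_hull(intervals=REPRESENTOR):
    """Convex hull of a union of open intervals on the line: one open interval."""
    return (min(lo for lo, _ in intervals), max(hi for _, hi in intervals))


def in_convex_hull(p, intervals=REPRESENTOR):
    """True if p lies strictly inside the convex hull of the representor."""
    lo, hi = convex_hull(intervals)
    return lo < p < hi


print(convex_hull())          # (0.2, 0.7)
print(in_representor(0.5))    # False: 0.5 falls in the gap between the intervals
print(in_convex_hull(0.5))    # True: but 0.5 is inside the hull
```

So 0.5 is a point that convexifying adds to the set, without it having been among the credences we started with.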

Sorry, I don't understand the argument yet. Why is it clear that I should bet on odds P, e.g., if P is the distribution that the CCT says I should be represented by?

Thanks for explaining!

An intuitively compelling criterion is: these precise beliefs (which you are representable as holding) are within the bounds of your imprecise credences.

I think this is the step I reject. By hypothesis, I don't think the coherence arguments show that the precise distribution P that I can be represented as optimizing w.r.t. corresponds to (reasonable) beliefs. P is nothing more than a mathematical device for representing some structure of behavior. So I'm not sure why I should require that my representor — i.e., the set of probability distributions that would be no less reasonable than each other if adopted as beliefs^[1] — contains P.

^[1] I'm not necessarily committed to this interpretation of the representor, but for the purposes of this discussion I think it's sufficient.

Anthony DiGiovanni4mo30

Thanks, this was thought-provoking. I feel confused about how action-relevant this idea is, though.

For one, let's grant that (a) "researching considerations + basing my recommendation on the direction of the considerations" > (b) "researching considerations + giving no recommendation". This doesn't tell me how to compare (a) "researching considerations + basing my recommendation on the direction of the considerations" vs. (c) "not doing research". Realistically, the act of "doing research" would have various messy effects relative to, say, doing some neartermist thing — so I'd think (a) is incomparable with (c). (More on this here.)

But based on the end of your comment, IIUC you're conjecturing that we can compare plans based on a similar idea to your example even if no "research" is involved, just passively gaining info. If so:

- It seems like this wouldn't tell me to change anything about what I work on in between times when someone asks for my recommendation.
- Suppose I recommend that someone do more of [intervention that I've positively updated on]. Again, their act of investing more in that intervention will presumably have lots of messy side effects, besides "more of the intervention gets implemented" in the abstract. So I should only be clueful that this plan is better if I've "positively updated" on the all-things-considered set of effects of this person investing more in that intervention. (Intuitively this seems like an especially high bar.)

Leon Lang's Shortform

Anthony DiGiovanni4mo52

What more do you want?

Relevance to bounded agents like us, and not being sensitive to an arbitrary choice of language. More on the latter (h/t Jesse Clifton):

The problem is that Kolmogorov complexity depends on the language in which algorithms are described. Whatever you want to say about invariances with respect to the description language, this has the following unfortunate consequence for agents making decisions on the basis of finite amounts of data: For any finite sequence of observations, we can always find a silly-looking language in which the length of the shortest program outputting those observations is much lower than that in a natural-looking language (but which makes wildly different predictions of future data). For example, we can find a silly-looking language in which “the laws of physics have been as you think they are ‘til now, but tomorrow all emeralds will turn blue” is simpler than “all emeralds will stay green and the laws of physics will keep working”...
You might say, “Well we shouldn’t use those languages because they’re silly!” But what are the principles by which you decide a language is silly? We would suggest that you start with the actual metaphysical content of the theories under consideration, the claims they make about how the world is, rather than the mere syntax of a theory in some language.
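
A toy Python illustration of the language-dependence point in the quote above. This is a sketch under loud assumptions: token count stands in as a crude proxy for program length (it is not Kolmogorov complexity), the two "languages" are hypothetical hand-written dictionaries, and the "grue"/"bleen" primitives are borrowed from Goodman's new riddle of induction rather than from the quoted passage.

```python
T = 100  # hypothetical day on which emeralds would "turn blue"

# Both hypotheses make identical predictions for every day before T.
def predict(hypothesis, day):
    if hypothesis == "stay green":
        return "green"
    if hypothesis == "turn blue at T":
        return "green" if day < T else "blue"
    raise ValueError(hypothesis)

# Each language gives a description (token list) of each hypothesis.
# "grue" = green before T, blue after; "bleen" = blue before T, green after.
NATURAL = {
    "stay green": ["green"],
    "turn blue at T": ["green", "until", "T", "then", "blue"],
}
SILLY = {
    "stay green": ["grue", "until", "T", "then", "bleen"],
    "turn blue at T": ["grue"],
}

def simplest(language):
    """Hypothesis with the shortest description in the given language."""
    return min(language, key=lambda h: len(language[h]))

# All observations so far (days 0..T-1) fit both hypotheses equally well.
past_agrees = all(
    predict("stay green", d) == predict("turn blue at T", d) for d in range(T)
)

print(past_agrees)        # True
print(simplest(NATURAL))  # stay green
print(simplest(SILLY))    # turn blue at T
```

The data can't break the tie, so which hypothesis counts as "simplest" here is settled entirely by the choice of primitives.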
