Anthony DiGiovanni

Researcher at the Center on Long-Term Risk. All opinions my own.

Comments

Sorted by

Newest

Winning isn't enough

Anthony DiGiovanni · 17d

Oops, right. I think what's going on is:

- "It's only permissible to bet at odds that are inside your representor" is only true if the representor is convex. If my credence in some proposition X is, say, P(X) = (0.2, 0.49) ∪ (0.51, 0.7), IIUC it's permissible to bet at 0.5. I guess the claim that's true is "It's only permissible to bet at odds in the convex hull of your representor".
- But I'm not aware of an argument that representors should be convex in general.
  - If there is such an argument, my guess is that the way things would work is: We start with the non-convex set of distributions that seem no less reasonable than each other, and then add in whichever other distributions are needed to make it convex. But there would be no particular reason we'd need to interpret these other distributions as "reasonable" precise beliefs, relative to the distributions in the non-convex set we started with.
- And, the kind of precise distribution P that would rationalize e.g. working on shrimp welfare seems to be the analogue of "betting at 0.5" in my example above. That is:
  - Our actual "set of distributions that seem no less reasonable than each other" would include some distributions that imply large positive long-term EV from working on shrimp welfare, and some that imply large negative long-term EV.
  - Whereas the distributions like P that imply vanishingly small long-term EV (given any evidence too weak to resolve our cluelessness w.r.t. long-term welfare) would lie in the convex hull. So betting at odds P would be permissible, and yet this wouldn't imply that P is "reasonable" as precise beliefs.
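
The interval example above can be sketched in a few lines of code. This is only an illustration, assuming the representor for P(X) is the union of the two open intervals from the example; the function names are hypothetical and not from any library.

```python
# Hypothetical representor for P(X): a union of open intervals (not convex).
representor = [(0.2, 0.49), (0.51, 0.7)]

def in_representor(p, intervals):
    """True if p lies strictly inside one of the open intervals."""
    return any(lo < p < hi for lo, hi in intervals)

def in_convex_hull(p, intervals):
    """True if p lies in the convex hull of the union of open intervals,
    i.e. strictly between the smallest lower and largest upper endpoint."""
    lo = min(a for a, _ in intervals)
    hi = max(b for _, b in intervals)
    return lo < p < hi

# 0.5 is not a credence in the representor, but it is in the convex hull.
outside_set = not in_representor(0.5, representor)
inside_hull = in_convex_hull(0.5, representor)
```

On this toy model, 0.5 plays the role of the distribution P: it falls in the gap between the two intervals, so it is in the hull without being one of the credences the set started with.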

Winning isn't enough

Anthony DiGiovanni · 19d

Sorry, I don't understand the argument yet. Why is it clear that I should bet on odds P, e.g., if P is the distribution that the CCT says I should be represented by?

Winning isn't enough

Anthony DiGiovanni · 19d

Thanks for explaining!

> An intuitively compelling criterion is: these precise beliefs (which you are representable as holding) are within the bounds of your imprecise credences.

I think this is the step I reject. By hypothesis, I don't think the coherence arguments show that the precise distribution P that I can be represented as optimizing w.r.t. corresponds to (reasonable) beliefs. P is nothing more than a mathematical device for representing some structure of behavior. So I'm not sure why I should require that my representor — i.e., the set of probability distributions that would be no less reasonable than each other if adopted as beliefs^[1] — contains P.

^[1] I'm not necessarily committed to this interpretation of the representor, but for the purposes of this discussion I think it's sufficient.

Winning isn't enough

Anthony DiGiovanni · 22d

Thanks, this was thought-provoking. I feel confused about how action-relevant this idea is, though.

For one, let's grant that (a) "researching considerations + basing my recommendation on the direction of the considerations" > (b) "researching considerations + giving no recommendation". This doesn't tell me how to compare (a) with (c) "not doing research". Realistically, the act of "doing research" would have various messy effects relative to, say, doing some neartermist thing, so I'd think (a) is incomparable with (c). (More on this here.)

But based on the end of your comment, IIUC you're conjecturing that we can compare plans based on a similar idea to your example even if no "research" is involved, just passively gaining info. If so:

- It seems like this wouldn't tell me to change anything about what I work on in between times when someone asks for my recommendation.
- Suppose I recommend that someone do more of [intervention that I've positively updated on]. Again, their act of investing more in that intervention will presumably have lots of messy side effects, besides "more of the intervention gets implemented" in the abstract. So I should only be clueful that this plan is better if I've "positively updated" on the all-things-considered set of effects of this person investing more in that intervention. (Intuitively this seems like an especially high bar.)

Leon Lang's Shortform

Anthony DiGiovanni · 22d

> What more do you want?

Relevance to bounded agents like us, and not being sensitive to an arbitrary choice of language. More on the latter (h/t Jesse Clifton):

> The problem is that Kolmogorov complexity depends on the language in which algorithms are described. Whatever you want to say about invariances with respect to the description language, this has the following unfortunate consequence for agents making decisions on the basis of finite amounts of data: For any finite sequence of observations, we can always find a silly-looking language in which the length of the shortest program outputting those observations is much lower than that in a natural-looking language (but which makes wildly different predictions of future data). For example, we can find a silly-looking language in which “the laws of physics have been as you think they are ‘til now, but tomorrow all emeralds will turn blue” is simpler than “all emeralds will stay green and the laws of physics will keep working”...
>
> You might say, “Well we shouldn’t use those languages because they’re silly!” But what are the principles by which you decide a language is silly? We would suggest that you start with the actual metaphysical content of the theories under consideration, the claims they make about how the world is, rather than the mere syntax of a theory in some language.
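
For reference, the "invariances" mentioned in the quote are usually stated as the invariance theorem for Kolmogorov complexity. The notation below is an assumption for illustration and does not appear in the quoted passage: $K_U(x)$ is the length of the shortest program for $x$ in universal language $U$, and $c_{U,V}$ is a constant that depends on the two languages but not on $x$.

```latex
% Invariance theorem (standard statement; notation assumed for this sketch)
K_U(x) \;\le\; K_V(x) + c_{U,V} \qquad \text{for all finite strings } x .
```

The bound is uniform in $x$, but $c_{U,V}$ can be arbitrarily large depending on the choice of languages. So for any fixed, finite sequence of observations, the theorem alone does not rule out a language in which a "silly" hypothesis gets the shorter description, which seems to be the gap the quote is pointing at.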

Winning isn't enough

Anthony DiGiovanni · 1mo

Sorry this wasn't clear: In the context of this post, when we endorsed "use maximality to restrict your option set, and then pick on the basis of some other criterion", I think we were implicitly restricting to the special case where {permissible options w.r.t. the other criterion} ⊆ {permissible options w.r.t. consequentialism}. If that doesn't hold, it's not obvious to me what to do.

Regardless, it's not clear to me what alternative you'd propose in this situation that's less weird than choosing "saying 'yeah it's good'". (In particular I'm not sure if you're generally objecting to incomplete preferences per se, or to some way of choosing an option given incomplete preferences (w.r.t. consequentialism).)
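
To make the two-step procedure discussed above concrete, here is a rough sketch. It assumes one common formulation of maximality (an option is impermissible only if some alternative has strictly higher expected value under every distribution in the representor); the option names, numbers, and function names are all hypothetical.

```python
def maximal_options(ev):
    """Return the options that are not dominated.

    ev maps each option name to a list of expected values, one per
    distribution in the representor (same order for every option).
    Option a is dominated if some other option b has strictly higher
    expected value under every distribution.
    """
    def dominated(a):
        return any(
            b != a and all(x > y for x, y in zip(ev[b], ev[a]))
            for b in ev
        )
    return {a for a in ev if not dominated(a)}

# Toy numbers: two distributions in the representor, three options.
ev = {
    "A": [10.0, -10.0],
    "B": [-5.0, 5.0],
    "C": [-6.0, -11.0],  # worse than A (and B) under both distributions
}

permissible = maximal_options(ev)

# Second step: pick using some other criterion, restricted to the special
# case where its permissible set is a subset of the maximal set.
other_criterion = {"B"}
subset_holds = other_criterion <= permissible
choice = other_criterion & permissible
```

Here A and B are incomparable (each does better under one distribution), so both survive the first step, and the second criterion only breaks the tie. If `subset_holds` were false, the sketch would not say what to do, matching the caveat above.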

Daniel Kokotajlo's Shortform

Anthony DiGiovanni · 2mo

Ah sorry, I realized that "in expectation" was implied. It seems the same worry applies. "Effects of this sort are very hard to reliably forecast" doesn't imply "we should set those effects to zero in expectation". Cf. Greaves's discussion of complex cluelessness.

Tbc, I don't think Daniel should beat himself up over this either, if that's what you mean by "grade yourself". I'm just saying that insofar as we're trying to assess the expected effects of an action, the assumption that these kinds of indirect effects cancel out in expectation seems very strong (even if it's common).

Daniel Kokotajlo's Shortform

Anthony DiGiovanni · 2mo

> attempts to control such effects with 3d chess backfire as often as not

Taken literally, this sounds like a strong knife-edge condition to me. Why do you think this? Even if what you really mean is "close enough to 50/50 that the first-order effect dominates," that also sounds like a strong claim given how many non-first-order effects we should expect there to be (ETA: and given how out-of-distribution the problem of preventing AI risk seems to be).

AI 2027: What Superintelligence Looks Like

Anthony DiGiovanni · 3mo

(Replying now bc of the "missed the point" reaction:) To be clear, my concern is that someone without more context might pattern-match the claim "Anthony thinks we shouldn't have probabilistic beliefs" to "Anthony thinks we have full Knightian uncertainty about everything / doesn't think we can say any A is more or less likely than any B". From my experience having discussions about imprecision, conceptual rounding errors are super common, so I think this is a reasonable concern even if you personally find it obvious that "probabilistic" should be read as "using a precise probability distribution".

Clarifying “wisdom”: Foundational topics for aligned AIs to prioritize before irreversible decisions

Anthony DiGiovanni · 3mo

Sorry, to be clear, I don't claim LW has overlooked these topics (except unawareness and alternatives to classical Bayesian epistemology, which I do think have been quite severely neglected). The reason I wrote this post was that the following claims seem non-obvious:

- Thinking further about wisdom concepts these days is not just a distraction from "notkilleveryoneism".
- The concepts in the checklist do in fact seem to satisfy conditions (1)+(2) (the definition of "wisdom concepts"). (My impression is that it's somewhat common for people to think many of the concepts I list admit "objective" answers (i.e. just believe and do what "works" / has the best empirical track record), which all sufficiently intelligent agents will converge to. ETA: Relatedly, it might not be salient to some readers that the answer to "is this decision a catastrophic mistake?" could be sensitive to all these topics.)
- The sub-questions I list are open questions. (E.g., I expect it to be controversial that agents aren't necessarily rationally required to avoid diachronic sure losses.)