Anthony DiGiovanni

Researcher at the Center on Long-Term Risk. All opinions my own.

Sequences

  • The challenge of unawareness for impartial altruist action guidance

Comments (sorted by newest)

Noah Birnbaum's Shortform
Anthony DiGiovanni · 6d

A salient example to me: This post essentially consists of Paul briefly remarking on some mildly interesting distinctions about different kinds of x-risks, and listing his precise credences without any justification for them. It's well-written for what it aims to be (a quick take on personal views), but I don't understand why this post was so strongly celebrated.

MichaelDickens's Shortform
Anthony DiGiovanni · 15d

I'm curious if you think you could have basically written this exact post a year ago. Or if not, what's the relevant difference? (I admit this is partly a rhetorical question, but it's mostly not.)

Winning isn't enough
Anthony DiGiovanni · 2mo

Oops, right. I think what's going on is:

  • "It's only permissible to bet at odds that are inside your representor" is only true if the representor is convex. If my credence in some proposition X is, say, P(X) = (0.2, 0.49) U (0.51, 0.7), IIUC it's permissible to bet at 0.5. I guess the claim that's true is "It's only permissible to bet at odds in the convex hull of your representor".
  • But I'm not aware of an argument that representors should be convex in general.
    • If there is such an argument, my guess is that the way things would work is: We start with the non-convex set of distributions that seem no less reasonable than each other, and then add in whichever other distributions are needed to make it convex. But there would be no particular reason we'd need to interpret these other distributions as "reasonable" precise beliefs, relative to the distributions in the non-convex set we started with.
  • And, the kind of precise distribution P that would rationalize e.g. working on shrimp welfare seems to be the analogue of "betting at 0.5" in my example above. That is:
    • Our actual "set of distributions that seem no less reasonable than each other" would include some distributions that imply large positive long-term EV from working on shrimp welfare, and some that imply large negative long-term EV.
    • Whereas the distributions like P that imply vanishingly small long-term EV — given any evidence too weak to resolve our cluelessness w.r.t. long-term welfare — would lie in the convex hull. So betting at odds P would be permissible, and yet this wouldn't imply that P is "reasonable" as precise beliefs.
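
To make the interval example in the first bullet concrete, here's a minimal sketch (my own toy code, not part of the original discussion; only the interval endpoints come from the example above):

```python
# Toy illustration of a non-convex representor vs. its convex hull.
# The representor for P(X) is (0.2, 0.49) ∪ (0.51, 0.7); its hull is (0.2, 0.7).

REPRESENTOR = [(0.2, 0.49), (0.51, 0.7)]  # open intervals of "reasonable" credences

def in_representor(p: float) -> bool:
    """Is p one of the precise credences in the representor?"""
    return any(lo < p < hi for lo, hi in REPRESENTOR)

def in_convex_hull(p: float) -> bool:
    """Is p in the convex hull of the representor?"""
    return min(lo for lo, _ in REPRESENTOR) < p < max(hi for _, hi in REPRESENTOR)

for p in (0.3, 0.5, 0.8):
    print(p, in_representor(p), in_convex_hull(p))
# 0.3 -> True, True:   a "reasonable" credence, so betting at 0.3 is permissible
# 0.5 -> False, True:  betting at 0.5 is permissible (it's in the hull), yet 0.5
#                      is not itself in the representor
# 0.8 -> False, False: outside the hull, so betting at 0.8 is impermissible
```

The punchline matches the last bullet: a betting rate can be permissible because it sits in the convex hull without being one of the "reasonable" precise credences.
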
Winning isn't enough
Anthony DiGiovanni · 2mo

Sorry, I don't understand the argument yet. Why is it clear that I should bet at odds P, e.g., if P is the distribution that the CCT says I should be represented by?

Winning isn't enough
Anthony DiGiovanni · 2mo

Thanks for explaining! 

> An intuitively compelling criterion is: these precise beliefs (which you are representable as holding) are within the bounds of your imprecise credences.

I think this is the step I reject. By hypothesis, I don't think the coherence arguments show that the precise distribution P that I can be represented as optimizing w.r.t. corresponds to (reasonable) beliefs. P is nothing more than a mathematical device for representing some structure of behavior. So I'm not sure why I should require that my representor — i.e., the set of probability distributions that would be no less reasonable than each other if adopted as beliefs[1] — contains P.

[1] I'm not necessarily committed to this interpretation of the representor, but for the purposes of this discussion I think it's sufficient.

Winning isn't enough
Anthony DiGiovanni · 2mo

Thanks, this was thought-provoking. I feel confused about how action-relevant this idea is, though.

For one, let's grant that (a) "researching considerations + basing my recommendation on the direction of the considerations" > (b) "researching considerations + giving no recommendation". This doesn't tell me how to compare (a) "researching considerations + basing my recommendation on the direction of the considerations" vs. (c) "not doing research". Realistically, the act of "doing research" would have various messy effects relative to, say, doing some neartermist thing — so I'd think (a) is incomparable with (c). (More on this here.)

But based on the end of your comment, IIUC you're conjecturing that we can compare plans based on a similar idea to your example even if no "research" is involved, just passively gaining info. If so:

  • It seems like this wouldn't tell me to change anything about what I work on in between times when someone asks for my recommendation.
  • Suppose I recommend that someone do more of [intervention that I've positively updated on]. Again, their act of investing more in that intervention will presumably have lots of messy side effects, besides "more of the intervention gets implemented" in the abstract. So I should only be clueful that this plan is better if I've "positively updated" on the all-things-considered set of effects of this person investing more in that intervention. (Intuitively this seems like an especially high bar.)
Leon Lang's Shortform
Anthony DiGiovanni · 2mo

> What more do you want?

Relevance to bounded agents like us, and not being sensitive to an arbitrary choice of language. More on the latter (h/t Jesse Clifton):

> The problem is that Kolmogorov complexity depends on the language in which algorithms are described. Whatever you want to say about invariances with respect to the description language, this has the following unfortunate consequence for agents making decisions on the basis of finite amounts of data: For any finite sequence of observations, we can always find a silly-looking language in which the length of the shortest program outputting those observations is much lower than that in a natural-looking language (but which makes wildly different predictions of future data). For example, we can find a silly-looking language in which “the laws of physics have been as you think they are ‘til now, but tomorrow all emeralds will turn blue” is simpler than “all emeralds will stay green and the laws of physics will keep working”...
>
> You might say, “Well we shouldn’t use those languages because they’re silly!” But what are the principles by which you decide a language is silly? We would suggest that you start with the actual metaphysical content of the theories under consideration, the claims they make about how the world is, rather than the mere syntax of a theory in some language.
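
One toy way to see the language-dependence point (my own illustration; the token sets and costs below are made up, not anything from the quoted passage): treat a "language" as a set of primitive tokens, and score a hypothesis by how cheaply it can be spelled out in those primitives. Which hypothesis counts as "simpler" then just depends on which primitives the language happens to include.

```python
# Toy illustration (mine) of description length depending on the language.
# A "language" is a set of primitive tokens; primitives cost 1, and any token
# the language lacks must be built up from scratch (flat cost of 5 here).

def description_length(hypothesis, language, missing_cost=5):
    return sum(1 if token in language else missing_cost for token in hypothesis)

green_hypothesis = ["all", "emeralds", "green", "always"]
grue_hypothesis = ["all", "emeralds", "grue"]  # "grue" = green until T, then blue

natural_language = {"all", "emeralds", "green", "blue", "always", "until", "then"}
silly_language = {"all", "emeralds", "grue"}

for name, language in [("natural", natural_language), ("silly", silly_language)]:
    print(name,
          description_length(green_hypothesis, language),
          description_length(grue_hypothesis, language))
# natural: green=4,  grue=7  -> the "green" hypothesis looks simpler
# silly:   green=12, grue=3  -> the "grue" hypothesis looks simpler
```

Actual Kolmogorov complexity is about shortest programs rather than token counts, so this is only a caricature of the quoted point, but it shows why the choice of description language does substantive work.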

Winning isn't enough
Anthony DiGiovanni · 3mo

Sorry this wasn't clear: In the context of this post, when we endorsed "use maximality to restrict your option set, and then pick on the basis of some other criterion", I think we were implicitly restricting to the special case where {permissible options w.r.t. the other criterion} ⊆ {permissible options w.r.t. consequentialism}. If that doesn't hold, it's not obvious to me what to do.

Regardless, it's not clear to me what alternative you'd propose in this situation that's less weird than choosing "saying 'yeah it's good'". (In particular I'm not sure if you're generally objecting to incomplete preferences per se, or to some way of choosing an option given incomplete preferences (w.r.t. consequentialism).)
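
For concreteness, here's a minimal sketch of the maximality rule mentioned at the start of this comment, under one common formulation (an option is ruled out only if some alternative has strictly higher expected value under every distribution in the representor); the option names and numbers are made up:

```python
# Toy maximality check (made-up numbers). EVS[option][i] is the option's
# expected value under the i-th distribution in the representor.
EVS = {
    "intervention_A": [10.0, -5.0],  # great under P1, bad under P2
    "intervention_B": [-4.0, 8.0],   # bad under P1, great under P2
    "intervention_C": [-6.0, -1.0],  # beaten by B under both distributions
}

def dominated(option, evs):
    """True if some other option has strictly higher EV under every distribution."""
    return any(
        all(o > e for o, e in zip(evs[other], evs[option]))
        for other in evs if other != option
    )

permissible = [option for option in EVS if not dominated(option, EVS)]
print(permissible)  # ['intervention_A', 'intervention_B']
```

Maximality only narrows the field (here to A and B); the special case discussed above is when the other criterion then picks from within that narrowed set.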

Daniel Kokotajlo's Shortform
Anthony DiGiovanni · 4mo

Ah sorry, I realized that "in expectation" was implied. It seems the same worry applies. "Effects of this sort are very hard to reliably forecast" doesn't imply "we should set those effects to zero in expectation". Cf. Greaves's discussion of complex cluelessness.

Tbc, I don't think Daniel should beat himself up over this either, if that's what you mean by "grade yourself". I'm just saying that insofar as we're trying to assess the expected effects of an action, the assumption that these kinds of indirect effects cancel out in expectation seems very strong (even if it's common).

Daniel Kokotajlo's Shortform
Anthony DiGiovanni · 4mo

> attempts to control such effects with 3d chess backfire as often as not

Taken literally, this sounds like a strong knife-edge condition to me. Why do you think this? Even if what you really mean is "close enough to 50/50 that the first-order effect dominates," that also sounds like a strong claim given how many non-first-order effects we should expect there to be (ETA: and given how out-of-distribution the problem of preventing AI risk seems to be).

Posts (sorted by new)

  • Anthony DiGiovanni's Shortform (3 karma, 3y, 31 comments)
  • Resource guide: Unawareness, indeterminacy, and cluelessness (20 karma, 4mo, 0 comments)
  • Clarifying “wisdom”: Foundational topics for aligned AIs to prioritize before irreversible decisions (37 karma, 4mo, 2 comments)
  • 4. Why existing approaches to cause prioritization are not robust to unawareness (26 karma, 5mo, 0 comments)
  • 3. Why impartial altruists should suspend judgment under unawareness (24 karma, 5mo, 0 comments)
  • 2. Why intuitive comparisons of large-scale impact are unjustified (25 karma, 5mo, 0 comments)
  • 1. The challenge of unawareness for impartial altruist action guidance: Introduction (47 karma, 5mo, 6 comments)
  • Should you go with your best guess?: Against precise Bayesianism and related views (65 karma, 9mo, 15 comments)
  • Winning isn't enough (44 karma, 1y, 30 comments)
  • What are your cruxes for imprecise probabilities / decision rules? [Question] (36 karma, 1y, 33 comments)
  • Individually incentivized safe Pareto improvements in open-source bargaining (41 karma, 1y, 2 comments)