
«Boundaries» Sequence

Oct 31, 2022 by Andrew_Critch

In this short sequence of posts, I aim to trace a causal pathway from

  1. a key missing idea in the utility-theoretic foundations of game theory, leading to
  2. some problems I think I see in effective altruism discourse, leading further to
  3. gaps in some approaches to AI alignment, and finally, 
  4. implications for existential risk.

By default, I'm writing one post for each of the above points, since they have different epistemic statuses and can be debated separately.  Posts 1 and 3 will be somewhat technical and research-oriented, and cross-posted to the Alignment Forum, whereas 2 and 4 will be non-technical and community-oriented, and cross-posted to the EA Forum.  After that there might be more posts in the sequence, depending on the ensuing conversation.  In any case, I'll try to keep this index post updated with the full sequence.

Here goes!

Posts in this sequence:

  «Boundaries» Sequence (Index Post)
  «Boundaries», Part 1: a key missing concept from utility theory
  «Boundaries», Part 2: trends in EA's handling of boundaries
  «Boundaries», Part 3a: Defining boundaries as directed Markov blankets
  «Boundaries», Part 3b: Alignment problems in terms of boundaries