Value Learning

Oct 29, 2018 by Rohin Shah

This is a sequence investigating the feasibility of one approach to AI alignment: value learning.

Preface to the sequence on value learning (Rohin Shah)
Ambitious Value Learning
What is ambitious value learning? (Rohin Shah)
The easy goal inference problem is still hard (paulfchristiano)
Humans can be assigned any values whatsoever… (Stuart_Armstrong)
Latent Variables and Model Mis-Specification (jsteinhardt)
Model Mis-specification and Inverse Reinforcement Learning (Owain_Evans, jsteinhardt)
Future directions for ambitious value learning (Rohin Shah)
Goals vs Utility Functions

Ambitious value learning aims to give the AI the correct utility function in order to avoid catastrophe. Given how difficult that is, we revisit the arguments for using utility functions in the first place.

Intuitions about goal-directed behavior (Rohin Shah)
Coherence arguments do not entail goal-directed behavior (Rohin Shah)
Will humans build goal-directed agents? (Rohin Shah)
AI safety without goal-directed behavior (Rohin Shah)
Narrow Value Learning
What is narrow value learning? (Rohin Shah)
Ambitious vs. narrow value learning (paulfchristiano)
Human-AI Interaction (Rohin Shah)
Reward uncertainty (Rohin Shah)
The human side of interaction (Rohin Shah)
Following human norms (Rohin Shah)
Future directions for narrow value learning (Rohin Shah)
Conclusion to the sequence on value learning (Rohin Shah)