Value Learning

Oct 29, 2018

by rohinmshah

This is a sequence investigating the feasibility of one approach to AI alignment: value learning.

Preface to the sequence on value learning

Ambitious Value Learning

What is ambitious value learning?

The easy goal inference problem is still hard

Humans can be assigned any values whatsoever…

Latent Variables and Model Mis-Specification

Future directions for ambitious value learning

Goals vs Utility Functions

Ambitious value learning aims to give the AI the correct utility function in order to avoid catastrophe. Given the difficulty of that approach, we revisit the arguments for using utility functions in the first place.

Intuitions about goal-directed behavior

Coherence arguments do not imply goal-directed behavior

Will humans build goal-directed agents?

AI safety without goal-directed behavior

Narrow Value Learning

What is narrow value learning?

Ambitious vs. narrow value learning

Human-AI Interaction

Reward uncertainty

The human side of interaction

Following human norms

Future directions for narrow value learning

Conclusion to the sequence on value learning
