Value Learning

Oct 29, 2018

by rohinmshah

This is a sequence investigating the feasibility of one approach to AI alignment: value learning.

Preface to the sequence on value learning

Ambitious Value Learning

What is ambitious value learning?

The easy goal inference problem is still hard

Humans can be assigned any values whatsoever…

Latent Variables and Model Mis-Specification

Future directions for ambitious value learning

Goals vs Utility Functions

Ambitious value learning aims to avoid catastrophe by giving the AI the correct utility function. Given how difficult that turns out to be, we revisit the arguments for using utility functions in the first place.

Intuitions about goal-directed behavior

Coherence arguments do not imply goal-directed behavior

Will humans build goal-directed agents?

AI safety without goal-directed behavior

Narrow Value Learning

What is narrow value learning?

Ambitious vs. narrow value learning

Human-AI Interaction

Reward uncertainty

The human side of interaction

Following human norms

Future directions for narrow value learning

Conclusion to the sequence on value learning
