Value Learning

Oct 29, 2018

by rohinmshah

This is a sequence investigating the feasibility of one approach to AI alignment: ambitious value learning.

(The sequence will update with a second half on related topics in a few weeks.)

Preface to the Sequence on Value Learning

Ambitious Value Learning

What is ambitious value learning?

The easy goal inference problem is still hard

Humans can be assigned any values whatsoever…

Latent Variables and Model Mis-Specification

Model Mis-specification and Inverse Reinforcement Learning

Future directions for ambitious value learning

Goals vs Utility Functions

Ambitious value learning aims to avoid catastrophe by giving the AI the correct utility function. Given the difficulty of this approach, we revisit the arguments for framing AI goals as utility functions in the first place.

Intuitions about goal-directed behavior

Coherence arguments do not imply goal-directed behavior
