The mesa-optimizer paper, along with some other important intellectual contributions to AI alignment, has two important properties:

  • They are part of a research program, not an end result. Rough intuitions can absolutely be a useful guide which (hopefully eventually) helps us figure out what mathematical results are possible and useful.
  • They primarily point at problems rather than solutions. Because existential risk seems to me asymmetrically bad in comparison to potential technology upsides (large as those upsides may be), I just have diffe
...

AI Alignment Open Thread August 2019

by habryka · 1 min read · 4th Aug 2019 · 96 comments



Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

This is an experiment in having an Open Thread dedicated to AI Alignment discussion, hopefully enabling established and upcoming researchers to ask small questions they are confused about, share very early-stage ideas, and have lower-key discussions.