Well, any system that satisfies the Minimal Requirement is doing long term planning on some level. For example, if your AI is approval directed, it still needs to learn how to make good plans that will be approved. Once your system has a superhuman capability of producing plans somewhere inside, you should worry about that capability being applied in the wrong direction (in particular due to mesa-optimization / daemons). Also, even without long term planning, extreme optimization is dangerous (for example an approval directed AI might create some kind of memetic supervirus).

But, I agree that these arguments are not enough to be confident of the strong empirical claim.

AI Alignment Open Thread August 2019

by habryka 1 min read4th Aug 201996 comments


Ω 12

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

This is an experiment in having an Open Thread dedicated to AI Alignment discussion, hopefully enabling researchers and upcoming researchers to ask small questions they are confused about, share very early stage ideas and have lower-key discussions.