My thoughts: we can't really expect to prove something like "this ai will be beneficial". However, relying on empiricism to test our algorithms is very likely to fail, because it's very plausible that there's a discontinuity in behavior around the region of human-level generality of intelligence (specifically as we move to the upper end, where the system can understand things like the whole training regime and its goal systems). So I don't know how to make good guesses about the behavior of very capable systems except through mathematical analysis.

There ar

... (read more)

AI Alignment Open Thread August 2019

by habryka 1 min read4th Aug 201996 comments


Ω 12

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

This is an experiment in having an Open Thread dedicated to AI Alignment discussion, hopefully enabling researchers and upcoming researchers to ask small questions they are confused about, share very early stage ideas and have lower-key discussions.