I don't think you need to posit a discontinuity to expect tests to occasionally fail.

I suspect the crux is more about how bad a single failure of a sufficiently advanced AI is likely to be.

I'll admit I don't feel like I really understand the perspective of people who seem to think we'll be able to learn how to do alignment via trial and error (i.e., tolerating multiple failures). Here are some guesses why people might hold that sort of view:

  • We'll develop AI in a well-designed box, so we can do a lot of debugging and stress testing
…

AI Alignment Open Thread August 2019

by habryka · 1 min read · 4th Aug 2019 · 96 comments

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

This is an experiment in having an Open Thread dedicated to AI Alignment discussion, hopefully enabling researchers and upcoming researchers to ask small questions they are confused about, share very early-stage ideas, and have lower-key discussions.