I don't see any analog to mutually assured destruction, which seems like a pretty key feature with nukes.

Perhaps the appropriate analogy here would be two teams that each say, "The other team is going to get to AI first if we don't, and we prefer misalignment to losing, so we might as well push ahead." The disanalogy is that the situation isn't adversarial in the sense of being destructive (although it could be if the teams are enemies). But it's analogous in that both teams could decide to hold back, or both could decide to push ahead. If both push ahead, each ensures its own destruction in the case of misalignment.

AI Alignment Open Thread August 2019

by habryka, 4th Aug 2019, 96 comments


Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

This is an experiment in having an Open Thread dedicated to AI Alignment discussion, hopefully enabling researchers and upcoming researchers to ask small questions they are confused about, share very early-stage ideas, and have lower-key discussions.