What is corrigibility? / What are the right background readings on it?

by Ruby · 1 min read · 2nd May 2019 · 2 comments



Ryan Carey asks, in a related question, "What are some good examples of incorrigibility?" He provides the following overview:

The idea of corrigibility is roughly that an AI should be aware that it may have faults, and therefore allow and facilitate human operators to correct these faults. I'm especially interested in scenarios where the AI system controls a particular input channel that is supposed to be used to control it, such as a shutdown button, a switch used to alter its mode of operation, or another device used to control its motivation.

What would a more detailed understanding look like? What are the right things to read? I believe there is at least one MIRI paper and some Arbital posts. I'm writing this question to center my inquiry.


1 Answer

The Arbital entry is a very comprehensive and clear introduction.