This post is coauthored with Ryan Carey.
Much of the work on developing a corrigible agent has focused on ensuring that an AI will not manipulate its shutdown button, or any other device that the human operator would use to control it. Suppose, however, that the AI lacked any capacity to press its shutdown button, or to control its state indirectly. What, then, would remain of the problem of corrigibility? We think three criteria would remain. Before stating them, let U_N be whatever utility function we would like the agent to optimise normally, and let U_S be the shutdown utility function. (U_N may ...
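For reference, a standard way to combine these two functions (a minimal sketch of the utility-indifference setup from Soares et al.'s *Corrigibility* paper, not a definition from this post) is to have the agent's utility switch from U_N to U_S when the button is pressed:

$$
U(o) =
\begin{cases}
U_N(o) & \text{if the shutdown button is not pressed,} \\
U_S(o) & \text{if the shutdown button is pressed,}
\end{cases}
$$

where $o$ ranges over outcomes. The difficulty, on that formulation, is that an agent maximising $U$ generally has incentives to affect which branch is active; the supposition above removes those incentives by fiat, which is what leaves the three remaining criteria.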