This article addresses the interaction between the alignment of an artificial intelligence (AI) and the balance of power between the AI and its handlers. It aims to clarify some definitions used in a more general article, and to show that capability control and motivation control can have identical effects on the alignment of an AI. To avoid confusion with more general uses of the terms “capability control” and “motivation control”, I will here use the terms “extrinsic alignment” and “intrinsic alignment”. These are probably not new concepts, and I apologize to readers who already know them under different names or via a different formalism; I have not found a clear description elsewhere.
For the...