Intrinsic vs. Extrinsic Alignment
This article addresses the interaction between the alignment of an artificial intelligence (AI) and the power balance between the AI and its handlers. It aims to clarify some definitions from a more general article, and to show that capability control and motivation control may have identical effects on the alignment...
Jun 1, 2023