Outer Alignment is the Necessary Complement to AI 2027's Best Case Scenario
To the extent we believe more advanced training and control techniques will lead to aligned agents capable enough to strategically build successor agents -- and to solve inner alignment for those successors as a convergent instrumental goal -- we must also consider that inner alignment for successor systems can be...
Indeed. I am tempted to edit it to say "every guess" (instead of "every answer") to resolve this ambiguity, but I suspect it's well understood as-is?