Anatta-RLHF: Preventing "Benevolent Tyranny" via Causal Separation of Control and Contribution — LessWrong