On Preference Manipulation in Reward Learning Processes — LessWrong