Towards deconfusing wireheading and reward maximization — LessWrong