Model-based RL, Desires, Brains, Wireheading — LessWrong