Reinforcement, Preference and Utility — LessWrong