Reward uncertainty — LessWrong