Pascal's mugging in reward learning — LessWrong