Value Learning for Irrational Toy Models — LessWrong