Cooperative Inverse Reinforcement Learning vs. Irrational Human Preferences — LessWrong