IRL 1/8: Inverse Reinforcement Learning and the problem of degeneracy — LessWrong