Inverse Reinforcement Learning - History — LessWrong