IRL 8/8: Generative Adversarial Imitation Learning — LessWrong