Gamified narrow reverse imitation learning — LessWrong