Meta learning to gradient hack — LessWrong