x
Adversarial Training Against Goal Misgeneralization Is ELK-Hard — LessWrong