The Future of Aligning Deep Learning systems will probably look like "training on interp" — LessWrong