Is anyone developing optimisation-robust interpretability methods? — LessWrong