x
Automating Interpretability with Agents — LessWrong