Paper: Open Problems in Mechanistic Interpretability — LessWrong