Mechanistic interpretability through clustering — LessWrong