A “Scaling Monosemanticity” Explainer — LessWrong