SAE on activation differences — LessWrong