Sparse Autoencoder Feature Ablation for Unlearning — LessWrong