Training Superior Sparse Autoencoders for Instruct Models — LessWrong