Explaining "Taking features out of superposition with sparse autoencoders" — LessWrong