An Intuitive Explanation of Sparse Autoencoders for Mechanistic Interpretability of LLMs — LessWrong