Mechanistic Interpretability of Llama 3.2 with Sparse Autoencoders — LessWrong