Case Studies in Reverse-Engineering Sparse Autoencoder Features by Using MLP Linearization — LessWrong