HDBSCAN is Surprisingly Effective at Finding Interpretable Clusters of the SAE Decoder Matrix
*All authors have equal contribution. This is a short informal post about recent discoveries our team made when we: 1. Clustered feature vectors in the decoder matrices of Sparse Autoencoders (SAEs) trained on the residual stream of each layer of GPT-2 Small and Gemma-2B. 2. Visualized the clusters in 2D...