SAE Feature Matchmaking (Layer-to-Layer)
Last week I read Mechanistic Permutability: Match Features Across Layers, an interesting paper on matching features detected with Sparse Autoencoders (SAEs) across multiple layers of a Transformer neural network. In this paper, the authors study the problem of aligning SAE-extracted features across multiple layers of the network without having...