SAE It Out Loud: Cross-Model Feature Labeling with NLA Verbalizers — LessWrong