SAE It Across Models: Explaining Features With Foreign NLA Verbalizers — LessWrong