Cross-Layer Transcoders are incentivized to learn Unfaithful Circuits — LessWrong