Crafting Polysemantic Transformer Benchmarks with Known Circuits — LessWrong