Polysemantic Attention Head in a 4-Layer Transformer