Understanding the tensor product formulation in Transformer Circuits — LessWrong