Hypothesis on Composition Circuits in Vision Transformers — LessWrong