Characterizing Intrinsic Compositionality in Transformers with Tree Projections — LessWrong