What Transformers Learn When They Solve Majority!
Mar 31
Full post with figures, code, and experimental details: brokttv.github.io/Majority

Merrill et al. (2022) prove that a 1-layer transformer with saturated attention can recognize MAJORITY, placing such models strictly above AC⁰ in circuit complexity. Their proof is a hand-crafted existence result, one explicit weight configuration constructed to establish expressivity: set...
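To give intuition for why saturated attention suffices, here is a minimal sketch (not Merrill et al.'s actual construction): when all attention scores tie, saturated attention averages uniformly over every position, so a single head can compute the fraction of 1-tokens, and a threshold at 1/2 then decides MAJORITY.

```python
import numpy as np

def saturated_attention_majority(bits):
    """Decide MAJORITY with one saturated-attention head (illustrative sketch).

    Saturated attention places equal weight on all positions attaining the
    maximal score; with identical scores that means uniform averaging.
    The head's pooled output is the fraction of ones, and thresholding
    at 1/2 decides strict majority.
    """
    x = np.asarray(bits, dtype=float)            # token values: 0 or 1
    n = len(x)
    weights = np.full(n, 1.0 / n)                # uniform saturated attention (all scores tie)
    pooled = weights @ x                         # attention output = fraction of ones
    return bool(pooled > 0.5)                    # strictly more ones than zeros

print(saturated_attention_majority([1, 0, 1, 1]))  # True
print(saturated_attention_majority([0, 0, 1]))     # False
```

The interesting part of the formal result is precisely that this averaging behavior, which no AC⁰ circuit family can replicate for counting, falls out of saturated softmax attention.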