I Found Catastrophe Geometry in GPT-2's Residual Stream
tl;dr: When GPT-2 encounters an ambiguous token (a period that could be a decimal point or a sentence boundary), it resolves the ambiguity by crossing a fold: a low-dimensional decision boundary with the geometric properties predicted by catastrophe theory. The transition is sharp, directional, asymmetric, and context-dependent. I built...
Feb 201
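For readers new to the term, here is a minimal sketch of what a "fold" means in catastrophe theory. It uses the textbook normal form V(x) = x^3/3 + a*x, not anything measured in GPT-2, and the function name `fold_equilibria` is hypothetical, chosen only for this illustration. The thing to notice is that a stable state and an unstable state approach each other as the control parameter `a` rises toward zero, then both vanish at once. That sudden, one-directional loss of a stable state is the kind of behavior the tl;dr calls sharp and asymmetric.

```python
import math


def fold_equilibria(a):
    """Equilibria of the fold normal form V(x) = x**3 / 3 + a * x.

    Illustrative only: this is the standard one-parameter fold, not the
    quantity measured in the residual stream.

    dV/dx = x**2 + a = 0 has two real roots when a < 0, a single
    degenerate root at a == 0, and no real roots when a > 0.
    """
    if a > 0:
        return []          # past the fold: no equilibrium survives
    if a == 0:
        return [0.0]       # the fold point itself: the two roots have merged
    r = math.sqrt(-a)
    # V''(x) = 2x, so -r is a local maximum (unstable)
    # and +r is a local minimum (stable).
    return [-r, r]


if __name__ == "__main__":
    # Sweep the control parameter across the fold at a = 0.
    for a in (-1.0, -0.25, 0.0, 0.25):
        print(a, fold_equilibria(a))
```

Sweeping `a` from negative to positive shows the stable state existing right up to `a = 0` and then disappearing, with no gradual fade in between.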