Hello, my name is Benjamin "Frye" Kelley. This post describes some independent research I've been doing that expands on the paper *Progress Measures for Grokking via Mechanistic Interpretability* by Neel Nanda et al. I've been trying to understand how sinusoids move through a transformer, allowing it to grok modular addition. I've been tracing each wave (well... not every wave) through each operation the model performs to break down the algorithm it implements, and I believe I have some new insights. Normally I work with digital audio, so this is right up my alley!
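For readers unfamiliar with the paper, here is a minimal toy sketch (my own reconstruction, not code from my notebook or theirs) of the Fourier algorithm Nanda et al. found: the trained model represents the inputs a and b as sinusoids at a handful of key frequencies, and the later layers combine them via trig identities so that the logit for each candidate answer c is roughly proportional to a sum of cos(2πk(a + b − c)/p) terms, which peaks exactly at c = (a + b) mod p. The specific frequencies below are stand-ins, not the ones a real trained model selects.

```python
import numpy as np

p = 113                       # modulus used in the original grokking setup
key_freqs = [14, 35, 41, 52]  # hypothetical stand-ins for the model's key frequencies

def mod_add_logits(a, b):
    """Logit for each candidate answer c: sum over key frequencies of
    cos(2*pi*k*(a + b - c) / p). Every cosine hits 1 simultaneously
    only when a + b - c == 0 (mod p), so the sum peaks there."""
    c = np.arange(p)
    return sum(np.cos(2 * np.pi * k * (a + b - c) / p) for k in key_freqs)

a, b = 57, 99
pred = int(np.argmax(mod_add_logits(a, b)))
print(pred, (a + b) % p)  # both 43: the argmax recovers (a + b) mod p
```

Since p is prime, no nonzero value of a + b − c can zero out every frequency's phase at once, so the constructive peak at the correct answer is unique.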
This post is primarily a link to a Google Colab notebook I've put up. The gist is that I've been able to visualize symmetries that form in the attention layer of this transformer and persist through the rest of the model, which I believe are responsible for the generalizing algorithm. I've also run many tests to confirm the phases of these sinusoids; I'll post a second (very long) notebook of those tests if anyone is interested. The primary notebook also contains other, hopefully illuminating, observations about the effects of other parts of the model. There are one or two remaining mysteries that I expect to have clarity on in a week or so, but I welcome any feedback, corrections, and criticism.
Frye