Testing a resonant transformer idea
We've been experimenting with a small inference-time "attractor layer": basically, trying to get a lightweight stateful influence on generation without touching weights or training. Early results are interesting: stable perplexity, a small gain on one task, and a pretty spectacular failure mode that I think is worth sharing. This has some...
Nov 25, 2025