When Does the Local Learning Coefficient Track Circuit Formation?
There is a lot of justified hype right now around applying Singular Learning Theory (SLT) to mechanistic interpretability. We all desperately want a magic, noise-tolerant number that tells us, "Ah, yes, the model just sculpted a meaningful circuit here". Recently, the Local Learning Coefficient (LLC) has been floated as the...
May 81