Explaining grokking through circuit efficiency — LessWrong