Grokking Beyond Neural Networks — LessWrong