An interesting mathematical model of how LLMs work — LessWrong