Understanding the diffusion of large language models: summary — LessWrong