How does GPT-3 spend its 175B parameters? — LessWrong