GPT-2 XL's capacity for coherence and ontology clustering — LessWrong