Beyond Gaussian: Language Model Representations and Distributions — LessWrong