Transformer language models are doing something more general — LessWrong