Transformer trained on its own content? — LessWrong