Gradient Descent on Token Input Embeddings — LessWrong