Anomalous tokens reveal the original identities of Instruct models — LessWrong