How Far Apart Does a Model Think Its Tokens Are? — LessWrong