Investigating Internal Representations of Correctness in SONAR Text Autoencoders — LessWrong