Verbalised evaluation awareness in language models has little effect on their behaviour — LessWrong