Evaluation Awareness Scales Predictably in Open-Weights Large Language Models — LessWrong