Measuring Non-Verbalised Eval Awareness by Implanting Eval-Aware Behaviours — LessWrong