Training Deliberative Monitors for Black-Box Scheming Detection — LessWrong