Evaluating Superhuman Models with Consistency Checks — LessWrong