x
Interpreting Gradient Routing’s Scalable Oversight Experiment — LessWrong