Counterfactual oversight vs. training data — LessWrong