Counterfactual oversight vs. training data - History — LessWrong