Weak-To-Strong Generalization (W2SG) — LessWrong