OpenAI Superalignment: Weak-to-strong generalization — LessWrong