OpenAI lied about SFT vs. RLHF — LessWrong