Beginner's question about RLHF — LessWrong