Open Problems and Fundamental Limitations of RLHF — LessWrong