Starting Thoughts on RLHF — LessWrong