A philosopher's critique of RLHF — LessWrong