RLHF does not appear to differentially cause mode-collapse — LessWrong