RLHF is the worst possible thing done when facing the alignment problem — LessWrong