In the Short-Term, Why Couldn't You Just RLHF-out Instrumental Convergence? — LessWrong