Value drift threat models — LessWrong