Reward model hacking as a challenge for reward learning — LessWrong