Hackable Rewards as a Safety Valve? — LessWrong