On Agent Incentives to Manipulate Human Feedback in Multi-Agent Reward Learning Scenarios