Safely controlling the AGI agent reward function — LessWrong