Reward is the optimization target (of capabilities researchers) — LessWrong