Heroin model: AI "manipulates" "unmanipulatable" reward — LessWrong