Inverse reinforcement learning on self, pre-ontology-change — LessWrong