It was not until reading this that I really understood that I am in the habit of reasoning about myself as just a part of the environment.

The kicker is that we don't reason directly about ourselves as such; we use a simplified model of ourselves. And we're REALLY GOOD at using that model for causal reasoning, even when it is reflective and involves multiple levels of self-reflection and counterfactuals - at least when we bother to try. (We rarely try, because explicit modelling is cognitively demanding, and we usually fall back on defaults / conditioned reasoning. Sometimes that's OK.)

Example: It is 10PM. A 5-page report is due in 12 hours, at 10AM.

Default: Go to sleep at 1AM, set ala...

Decision Theory

by abramdemski, Scott Garrabrant, 31st Oct 2018 (1 min read, 37 comments)



Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

(A longer text-based version of this post is also available on MIRI's blog here, and the bibliography for the whole sequence can be found here.)

The next post in this sequence, 'Embedded Agency', will come out on Friday, November 2nd.

Tomorrow’s AI Alignment Forum sequences post will be 'What is Ambitious Value Learning?' in the sequence 'Value Learning'.