Notice (though you already know this) that many pickles dissolve once you accept two things: that identical agents make identical decisions (superrationality, as it were), and that agents which make different decisions in identical circumstances must necessarily be different agents. For example, in the 5&10 game an agent would examine its own algorithm, see that it leads to taking $10, and stop there. There is no "what would happen if you took a different action", because an agent taking a different action would not be you, not exactly. So, no Löbian obstacle. In return, you give up something a lot more emotionally valuable: the delusion of making conscious decisions. Pick your poison.
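
To make the 5&10 point concrete, here is a minimal toy sketch, not a formal treatment. It assumes the agent is a deterministic function and that "examining its own algorithm" can be modelled as simply running it. All names (`PAYOFFS`, `agent_algorithm`, `self_examine`, `different_agent`) are hypothetical and chosen for illustration only.

```python
from typing import Callable, Dict

# Toy payoffs for the 5&10 game: action name -> dollars received.
PAYOFFS: Dict[str, int] = {"take_5": 5, "take_10": 10}

Agent = Callable[[Dict[str, int]], str]


def agent_algorithm(payoffs: Dict[str, int]) -> str:
    """The agent's fixed, deterministic decision procedure (hypothetical)."""
    return max(payoffs, key=payoffs.get)


def self_examine(algorithm: Agent, payoffs: Dict[str, int]) -> str:
    """Predict one's own action by running one's own algorithm.

    Assumption: self-examination is modelled as plain execution.
    No counterfactual "what if I took the other action" is evaluated.
    """
    return algorithm(payoffs)


def different_agent(payoffs: Dict[str, int]) -> str:
    """A different algorithm, hence a different agent, in the same situation."""
    return min(payoffs, key=payoffs.get)


# The agent inspects itself, sees what it does, and stops there.
predicted = self_examine(agent_algorithm, PAYOFFS)

# An identical agent in identical circumstances makes the identical decision.
copy_action = agent_algorithm(dict(PAYOFFS))

# A different action in the same circumstances comes from a different agent.
other_action = different_agent(PAYOFFS)
```

In this sketch the only way to obtain a different action from the same payoffs is to swap in a different function, which is the sense in which "the agent taking a different action would not be you".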

Decision Theory

by abramdemski, Scott Garrabrant · 1 min read · 31st Oct 2018 · 37 comments



Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

(A longer text-based version of this post is also available on MIRI's blog here, and the bibliography for the whole sequence can be found here.)

The next post in this sequence, 'Embedded Agency', will come out on Friday, November 2nd.

Tomorrow’s AI Alignment Forum sequences post will be 'What is Ambitious Value Learning?' in the sequence 'Value Learning'.