The agent has been constructed such that Provable("5 is the best possible action") implies that 5 is the best (only!) possible action. Then by Löb's theorem, 5 is the only possible action. It cannot simultaneously be constructed such that Provable("10 is the best possible action") implies that 10 is the only possible action, because then it would also follow that 10 is the only possible action. That's not just our proof system being inconsistent; that's false!
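
The Löb step in the comment above can be written out symbolically. This is only a sketch: the notation is introduced here for illustration and is not the commenter's own. $\Box X$ stands for Provable("X"), and $P_n$ abbreviates "n is the best (only) possible action".

```latex
% Notation (assumed for illustration, not from the original comment):
%   \Box X : "X is provable in the agent's proof system"
%   P_n    : "n is the best (only) possible action"
\begin{align*}
&\text{Construction of the agent:} && \vdash \Box P_5 \rightarrow P_5 \\
&\text{L\"ob's theorem:} && \text{if } \vdash \Box X \rightarrow X \text{ then } \vdash X \\
&\text{Hence:} && \vdash P_5 \\[4pt]
&\text{If we also had:} && \vdash \Box P_{10} \rightarrow P_{10} \\
&\text{then by L\"ob again:} && \vdash P_{10} \\
&\text{But the two are incompatible:} && \vdash (P_5 \wedge P_{10}) \rightarrow \bot
\end{align*}
```

Taken together, the two constructions would yield both "the only possible action is 5" and "the only possible action is 10", which is the conflict the comment points at.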

Decision Theory

by abramdemski, Scott Garrabrant · 1 min read · 31st Oct 2018 · 37 comments

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

(A longer text-based version of this post is also available on MIRI's blog here, and the bibliography for the whole sequence can be found here.)

The next post in this sequence, 'Embedded Agency', will come out on Friday, November 2nd.

Tomorrow’s AI Alignment Forum sequences post will be 'What is Ambitious Value Learning?' in the sequence 'Value Learning'.