Toon Alfrink
Comments

Hard Problem Of Corrigibility
Toon Alfrink · 8y · 10

If we say that this uncertainty correlates with some outside physical object (intended to be the programmers), the default result in a sufficiently advanced agent is that you disassemble this object (the programmers) to learn everything about it on a molecular level, update fully on what you've learned according to whatever correlation that had with your utility function, and plunge on straight ahead.

Would this still happen if we assign high prior probability to utility functions that favor only a small target and yield hugely negative utility (say, negative one billion) otherwise? Would the information value of disassembling the programmers still outweigh the high probability that the utility function comes out negative?

In general, wouldn't this restrict the AI to taking baby steps until it is more certain about the target?
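A back-of-the-envelope sketch of the trade-off being asked about, with entirely made-up numbers (none of these values come from the post): if the prior puts most of its mass on utility functions that score the drastic action at around negative one billion, no plausible value of information recovers the expected loss, so the expected-utility calculation favors the cautious option.

```python
# Illustrative expected-utility comparison; every number below is an assumption.
# Drastic action: disassemble the programmers to learn the target exactly.
# Cautious action: a "baby step" that is roughly neutral under every candidate
# utility function the agent is uncertain over.

p_target = 0.01              # assumed prior mass on "the drastic action is actually fine"
u_hit = 1e6                  # assumed utility if it hits the small target
u_miss = -1e9                # assumed utility under the many candidates it violates
value_of_information = 1e5   # assumed gain from what the agent learns by acting

eu_drastic = p_target * u_hit + (1 - p_target) * u_miss + value_of_information
eu_baby_step = 0.0           # cautious action: negligible utility either way

print(f"E[U | drastic action] = {eu_drastic:,.0f}")   # -989,890,000
print(f"E[U | baby step]      = {eu_baby_step:,.0f}")  # 0
# Under this assumed prior, the information value would have to be on the order
# of the billion-point penalty before disassembling the programmers pays off,
# which is the intuition behind the "baby steps" question above.
```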

Wikitag Contributions

Intentional Communities wiki pages · 9 years ago · (+109/-13)
Intentional Communities wiki pages · 9 years ago · (+4687)