If we say that this uncertainty correlates with some outside physical
object (intended to be the programmers), the default result in a
sufficiently advanced agent is that you disassemble this object (the
programmers) to learn everything about it on a molecular level, update
fully on what you've learned according to whatever correlation it had
with your utility function, and plunge on straight ahead.
Would this still happen if we assign high prior probability to utility functions that favor only a small target and yield, say, negative one billion utility everywhere else? Would the information value of disassembling the programmers still outweigh the high probability that the realized utility comes out hugely negative?
Wouldn't this restrict the AI to baby steps, in general, until it is more certain about the target?
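To make the tradeoff concrete, here is a toy expected-utility sketch in Python. All of the numbers (the negative-one-billion penalty, the information value, the probabilities) and function names are illustrative assumptions, not anything specified in the discussion; the point is only that a large prior probability of a hugely negative outcome swamps the value of information unless that probability is driven very low.

```python
# Toy model (illustrative assumptions throughout): the agent's prior puts
# probability p_forbidden on "the intended utility function forbids
# disassembling the programmers (utility -1e9)" and 1 - p_forbidden on
# "disassembling is fine and fully reveals the target (worth +100)".
# A cautious "baby step" yields a small, safe +1 either way.

def expected_utility_disassemble(p_forbidden: float,
                                 penalty: float = -1e9,
                                 info_value: float = 100.0) -> float:
    """Expected utility of disassembling the programmers to learn the target."""
    return p_forbidden * penalty + (1.0 - p_forbidden) * info_value


def expected_utility_baby_step(step_value: float = 1.0) -> float:
    """Expected utility of a cautious action with negligible downside."""
    return step_value


if __name__ == "__main__":
    for p in (0.9, 0.5, 0.01, 1e-8):
        eu_d = expected_utility_disassemble(p)
        eu_b = expected_utility_baby_step()
        choice = "disassemble" if eu_d > eu_b else "baby step"
        print(f"P(forbidden)={p:g}: EU(disassemble)={eu_d:.3g}, "
              f"EU(baby step)={eu_b:.3g} -> {choice}")
```

Under these made-up numbers the agent only prefers disassembly once P(forbidden) falls far below the ratio of the information value to the penalty, which is exactly the "baby steps until it is more certain about the target" regime the question describes.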