Open thread, September 2-8, 2013

David_Gerard

The distinction I have in mind is that a self-modifying AI can come up with a new thinking algorithm to use and decide to trust it, whereas a non-self-modifying AI could come up with a new algorithm or whatever, but would be unable to trust the algorithm without sufficient justification.

Likewise, if an AI's decision-making algorithm is immutably hard-coded as "think about the alternatives and select the one that's rated the highest", then the AI would not be able to simply "write a new AI … and then just hand off all its tasks to it"; in order to do that, it would somehow have to make it so that the highest-rated alternative is always the one that the new AI would pick. (Of course, this is no benefit unless the rating system is also immutably hard-coded.)

I guess my idea in a nutshell is that instead of starting with a flexible system and trying to figure out how to make it safe, we should start with a safe system and try to figure out how to make it flexible. My major grounds for believing this, I think, is that it's probably going to be much easier to understand a safe but inflexible system than it is to understand a flexible but unsafe system, so if we take this approach, then the development process will be easier to understand and will therefore go better.

[anonymous]13y20

ChristianKl13y20

Likewise, if an AI's decision-making algorithm is immutably hard-coded as "think about the alternatives and select the one that's rated the highest", then the AI would not be able to simply "write a new AI … and then just hand off all its tasks to it"; in order to do that, it would somehow have to make it so that the highest-rated alternative is always the one that the new AI would pick.

You basically say that the AI should be unable to learn to trust a process that was effective in the past to also be effective in the future. I think that would restrict intelligence a lot.

1drethelin13y

immutably hard-coding something in is a lot easier to say than to do.