One could say that the thing you are reflecting on is some portion of yourself; that would perhaps be fair as far as 'limited reflection' goes.

We're certainly better at reflecting on some parts of ourselves than others. The ironic thing, though, is that when we look more closely and analyze just what it is that we are not reflecting on very well, we open up the can of worms we had previously been avoiding.

It occurs to me that it may be a good thing that we are limited in this regard and are not yet able to reflect well enough to reproduce our intelligence in the form of a self-reflective AI. If we could, we'd probably have gone extinct already.

If we could, we'd probably have gone extinct already.

(I've been trying to avoid the uFAI=extinction thing lately, as there are various reasons why that might not be the case. If we build uFAI we'll probably lose in some sense, maybe even to the extent that the common man will be able to notice we lost, but putting the emphasis on the extinction scenario might be inaccurate. Killing all the humans doesn't benefit the AI much in most scenarios and can easily incur huge costs, both causally and acausally. Do you disagree that it's worth avoiding conflating losing with extinction?)

Dmytry: In the context of the original post: suppose that an SGAI is logging some of its internal state to a log file, and then gains access to reading this log file, reasoning about it in the same way it reasons about the world and noticing correlations between its feelings and state and the contents of the log. Wouldn't that be the kind of reflection that we have? Is SGAI even logically possible without hard-coding some blind spot inside the AI about itself? Or maybe we're going to go extinct real soon now, because we lack the ability to reflect like this, and consequently didn't have a couple of thousand years to develop an effective theory of mind for FAI before we built the hardware.
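As a very rough illustration of the setup Dmytry describes, here is a minimal Python sketch. Everything in it (the scalar "frustration" variable, the log format, the correlation test) is invented for illustration, not a claim about how an SGAI would actually work: the agent writes part of its internal state to a log, then later reads that log back as ordinary data about the world and notices a correlation.

```python
import json
import random
import statistics

# Toy model only: an agent with an invented scalar "frustration" state.
# Each step it logs a slice of that state, then later reads the log back
# as if it were ordinary data about the world.

LOG_PATH = "agent_state.log"

def run_episode(steps=200):
    frustration = 0.0
    with open(LOG_PATH, "w") as log:
        for t in range(steps):
            # The outcome this step depends (negatively) on the internal state.
            reward = random.random() - frustration
            # The agent logs internal state and outcome without "understanding" either.
            log.write(json.dumps({"t": t, "frustration": frustration, "reward": reward}) + "\n")
            # Internal state drifts and responds to the outcome.
            frustration = min(1.0, max(0.0, frustration + random.uniform(-0.05, 0.15) - 0.1 * reward))

def reflect_via_log():
    # "Reflection" here is just reading the log like any other file and
    # noticing that one logged variable predicts another.
    with open(LOG_PATH) as log:
        records = [json.loads(line) for line in log]
    frustration = [r["frustration"] for r in records]
    reward = [r["reward"] for r in records]
    corr = statistics.correlation(frustration, reward)  # requires Python 3.10+
    print(f"correlation(frustration, reward) = {corr:.2f}")

if __name__ == "__main__":
    run_episode()
    reflect_via_log()
```

Nothing in reflect_via_log treats the log any differently from any other file; whether that counts as the kind of reflection we have is exactly the question being asked.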

Scenario analysis: semi-general AIs

by Will_Newsome · 1 min read · 22nd Mar 2012 · 66 comments

Are there any essays anywhere that go in depth about scenarios where AIs become somewhat recursive/general, in that they can write functioning code to solve diverse problems, but the AI reflection problem remains unsolved and thus limits the depth of recursion attainable by the AIs? Let's provisionally call such general but reflection-limited AIs semi-general AIs, or SGAIs. SGAIs might be of roughly smart-animal-level intelligence, e.g. they might have rudimentary communication/negotiation abilities and some ability to formulate narrowish plans of the sort that don't leave them susceptible to Pascalian self-destruction, wireheading, or the like.

At first blush, this scenario strikes me as Bad; AIs could take over all computers connected to the internet, totally messing stuff up as their goals/subgoals mutate and adapt to circumvent wireheading selection pressures, without ever reaching general intelligence. AIs might or might not cooperate with humans in such a scenario. I imagine any detailed existing literature on this subject would focus on computer security and intelligent computer "viruses"; does such literature exist anywhere?

I have various questions about this scenario, including:

  • How quickly should one expect temetic selective sweeps to reach ~99% fixation?
  • To what extent should SGAIs be expected to cooperate with humans in such a scenario? Would SGAIs be able to make plans that involve exchange of currency, even if they don't understand what currency is or how exactly it works? What do humans have to offer SGAIs?
  • How confident can we be that SGAIs will or won't have enough oomph to FOOM once they saturate and optimize/corrupt all existing computing hardware?
  • Assuming such a scenario doesn't immediately lead to a FOOM scenario, how bad is it? To what extent is its badness contingent on the capability/willingness of SGAIs to play nice with humans?
Those are the questions that immediately spring to mind, but I'd like to see who else has thought about this and what they've already considered before I cover too much ground.

My intuition says that thinking about SGAIs in terms of population genetics and microeconomics will somewhat counteract automatic tendencies to imagine cool stories rather than engage in dispassionate analysis. I'd like other suggestions for how to achieve that goal.
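As one crude way to put numbers on the fixation question in the list above, here is a minimal Python sketch of the standard deterministic haploid selective-sweep calculation. The selection coefficients, the initial frequency, and the host-pool size are arbitrary placeholders, and "generations" would still have to be mapped onto actual replication times for SGAI code.

```python
import math

# Toy haploid selective-sweep calculation: a variant with relative fitness
# (1 + s) spreading through a fixed pool of hosts. Under deterministic
# selection the odds p/(1-p) multiply by (1 + s) each generation, giving a
# logistic sweep. All numbers below are arbitrary placeholders.

def generations_to_fixation(s, p0, p_target=0.99):
    """Generations for a variant with advantage s to go from frequency p0 to p_target."""
    logit = lambda p: math.log(p / (1.0 - p))
    return (logit(p_target) - logit(p0)) / math.log(1.0 + s)

if __name__ == "__main__":
    pool = 10 ** 9                  # hypothetical number of compromised hosts
    p0 = 1.0 / pool                 # variant starts on a single host
    for s in (0.01, 0.1, 1.0, 10.0):
        g = generations_to_fixation(s, p0)
        print(f"s = {s:>5}: ~{g:,.0f} generations to 99% frequency")
```

With these made-up numbers, even a modest 10% fitness edge sweeps in a few hundred generations, which is why mapping "generation" onto wall-clock replication time matters so much for the question above.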
I'm confused that I don't see people talking about this scenario very much; why is that? Why isn't it the default expected scenario among futurologists? Or have I just not paid close enough attention? Is there already a name for this class of AIs? Is the name better than "semi-general AIs"?

Thanks for any suggestions/thoughts, and my apologies if this has already been discussed at length on LessWrong.

 
