Containing the AI... Inside a Simulated Reality

by HumaneAutomation · 2 min read · 31st Oct 2020 · 5 comments


AI Boxing (Containment) · Simulation Hypothesis · Superintelligence · AI

So I just finished Yampolskiy's paper "Uncontrollability of AI", and it makes for a compelling read. In particular, I was happy to finally see something that explicitly mentions the ludicrous folly of believing it possible to make an AI conform to "human values". As many posts on this blog make abundantly clear, to be human is to be irrational... asking an AI to conform to our ways of "reasoning" is, to put it mildly, incoherent.

But - that is not what this post is about :) I wish to propose a containment method that for some reason has not been especially elaborated on. Some might say it's another version of AI-in-a-Box, but I disagree. Allow me to explain...

What if the AGI we create is "brought online" inside a simulated reality... A place that, as far as it knows, is the entirety of the world? Let us call this place AISpace. 

Now, some of you are probably already pre-heating your keyboards to respond with the oft-repeated (and valid) arguments that "prove" this won't work, but let me add a little twist first. As some of you may agree, we have no definitive proof that our own world is not itself a simulated reality. Thus, if you feel it inevitable that an AGI must be able to get out of AISpace, then releasing an AGI into this world must at least leave open the possibility that an AGI on Real Earth (a.k.a. Reality as we think we know it) could likewise "conclude" that this actual world is a sim, or decide to find out whether it is (and, yes, convert the entire universe into a computer to come up with an answer ;)

If we are unable to definitively settle whether Real Earth is or is not a simulation, why should an AI be able to do so?

Now, of course, the above requires a few conditions, some of which may indeed be hard to meet, such as the human operators not exposing the fact that AISpace is not, in fact, all of Reality... and (malevolent or indifferent) actors could always choose to release their AI into the Real World anyhow. What I'd like us to do in this post is not repeat the known arguments against AI-in-a-Box, which make sense and are for the most part uncontroversial. Instead, let's deal with the specific case where the AGI is sincerely of the "opinion" that its reality (AISpace) is the entire world, and ask how it might then find out that it is not, absent human error.

AISpace would of course need to be created first, and the task of designing it may itself be very complex indeed. Still, we should be careful in ascribing the properties it must have before it is "real enough" to be "accepted" as a Reality. Our Real Earth might also contain glitches/errors that to us are merely features we take as a given. For example, if our Real World is actually a sim, perhaps lightning is a programming error. Even if it is, how would we ever actually know?

I should also say that I am dying to hear the reasoning for why this kind of containment would not work, for it must have quite a few very interesting implications. Perhaps, for the sake of the argument, let us define an additional "version" of AISpace as an exact replica of the Real World, but simulated (a.k.a. a human-instantiated Matrix; so not yet possible at present). How on Earth (no pun intended) would the AGI find out it is in fact in a "fake" reality...?


5 comments

The best counterargument here was presented by EY: that a superintelligent AI will easily recognise and crack the simulation from the inside. See That Alien Message.

In my view, it may be useful to instill uncertainty in the AI that it could be in a simulation which is testing its behaviour. Rolf suggested doing this by making a public precommitment to create many such simulations before any AI is created. However, it would work only as our last line of defence, after everything else (alignment, control systems, boxing) fails.

To build upon this idea: it is well established that we (as a civilization) cannot secure any but the simplest software against a determined human attacker. Securing software against an intelligence smarter than us is infeasible.

I would guess that one reason this containment method has not been seriously considered is that the amount of detail required in a simulation for the AI to do anything we find useful is so far beyond our current capabilities that it doesn't seem worth considering. The case you present, an exact copy of our Earth, would require a ridiculous amount of processing power at the very least; and consider that simulating the billions of human brains in this copy would already constitute a form of AGI. A simulation with less detail would be correspondingly less faithful to reality, and could not be seen as a valid test of whether an AI really is friendly.

Oh, and there is still the core issue of boxed AI: it's very possible that a boxed superintelligent AGI will see holes in the box that we are not smart enough to see, and there's no way around that.

So... can it be said that the advent of an AGI will also provide a satisfactory answer to the question of whether we are currently in a simulation? That is what you (and avturchin) seem to imply. This stance also presupposes that:

- an AGI can ascertain such observations to be highly probable/certain;
- it is theoretically possible to find out the true nature of one's world (and that a super-intelligent AI would be able to do this);
- it will inevitably embark on a quest to ascertain the nature and fundamental facts about its reality;
- we can expect a "question absolutely everything" attitude from an AGI (something that is not necessarily desirable, especially in matters where facts may be hard to come by, or are a matter of choice or preference).

Or am I actually missing something here? I am assuming that is very probable ;)

You know what... as I thought about the above, I have to say that the very possibility of the existence of simulations seriously complicates any efforts at even hoping to understand what an AGI might think. Actually, it presents such a level of complexity and so many unknown unknowns that I am not even sure if the type of awareness and sentience an AGI may possess is definable in human terms.

See, when we talk about simulated worlds, we tend to "see" them in terms of the Matrix: a "place" you "log on to" and then experience as if it were a genuine world, configured to feature any number of laws and structures. But I'm starting to think that is woefully inadequate. Let me attempt to explain... this may be convoluted, I apologize in advance.

Suppose the AGI is released in the "real" world. The number of inferences and discoveries it will (eventually) be able to make is such that it is near certain it would conclude that it is us who are living in a simulated world, our appreciation of it hemmed in by our Neanderthal-level ignorance. Can't we see that plants speak to each other? How is it even possible to miss the constant messages coming to us from various civilizations in outer space?? And what about the obvious and trivial solution to cancer that the AGI found in a couple of minutes; how could humans possibly have missed that open door???

Another way of putting this, I suppose, is that humans and the AGI will by definition live in two very, very different worlds. Both our worlds are limited by our data-collection ability (sensory input), but the limits of an AGI are vastly expanded. Do they have to be, though...? Like, by default? Is it a given that an AI must discover, and want to discover, a never-ending list of properties of the world? Is its curiosity a given? How come?

I get the feeling that the moment an AGI "discovered" the concept of a simulated world, it would indeed most likely melt down and go into some infinite loop of impossible computation, trying to stick a probability on this being so, being possible, etc., and never, not in a million years, being able to come up with a definitive answer. It may just as well conclude there is no such thing as reality in the first place... that each sentient observer is in fact the whole of reality from their own perspective, and that any beliefs about the world outside are just that: assumptions and inferences. And in fact, this would be pretty close to the "truth", if that even exists.