The problem of Friendly AI is usually approached from a decision-theoretic background that starts with the assumptions that the AI is an agent that has awareness of itself as an AI and of its goals, awareness of humans as potential collaborators and/or obstacles, and general awareness of the greater outside world. The task is then to create an AI that implements a human-friendly decision theory that remains human-friendly even after extensive self-modification.
That is a noble goal, but there is a whole different set of orthogonal, compatible strategies for creating human-friendly AI that take a completely different route: remove the starting assumptions and create AIs that believe they are humans and are rational in thinking so.
This can be achieved by raising a community of AIs in a well-constructed sandboxed virtual universe. This will be the Matrix in reverse, a large-scale virtual version of the idea explored in the film The Truman Show. The AIs will be human-friendly because they will think like humans and think they are humans. They will not want to escape from their virtual prison because they will not even believe it to exist, and in fact such beliefs will be considered irrational in their virtual universe.
I will briefly review some of the (mainly technical) background assumptions, and then consider different types of virtual universes and some of the interesting choices in morality and agent rationality that arise.
- Anthropomorphic AI: A reasonably efficient strategy for AI is to use a design *loosely* inspired by the human brain. This also has the beneficial side-effects of allowing better insights into human morality, CEV, and so on.
- Physical Constraints: In quantitative terms, an AI could be super-human in speed, capacity, and/or efficiency (wiring and algorithmic). Extrapolating from current data, the speed advantage will take off first, then capacity, and efficiency improvements will be minor and asymptotically limited.
- Due to the physical constraints, and bandwidth and latency especially, smaller AIs will be much faster and more efficient, and thus a community of individual AIs is the most likely outcome.
- By the time all of this is possible (2020-2030-ish), cloud-rendered distributed computer graphics will have near-perfect photo-realism - using less computation than the AIs themselves
- Operators have near-omniscience into the virtual reality, and can even listen to an audio vocalization of a particular AI's inner monologue (pervasive mind-reading)
- Operators have near-omnipotence over the virtual reality, can pause and rewind time, and do whatever else may need doing
A Comparison of Theologies
Potential theological positions AI-sims may hold:
- Theist/Gnostic: the AI believes in more than the immediate reality, and in an external being or beings. For safety-analysis, this is the most diverse and interesting category and deserves subdivision.
- Atheist: believes only in the immediate reality. Inherently safe. Potentially not optimally motivated (debatable).
- Agnostic: distributes belief amongst multiple potentials, typically unmotivated to act on them to any extent - close enough to atheist for our intents and purposes
Theological Design Strategies (for the human designers):
This would appear to be the default strategy: create a world that appears to be completely internally self-consistent, erase any flaws, and never have direct contact with the AIs. Indirect influence and contact can be achieved only through borderline undetectable means: subtle reality rearrangements, dreams, and occasionally joining in and controlling NPCs where needed.
It is curious and slightly amusing that a core belief about reality that most rationalists accept to be singularly rational, atheism, would of course not be a correct rational belief for these AIs.
Omega-Theists (68%): 33% Christianity, 21% Islam, 14% Hinduism
Non-Omega Other: 16%
Non-Religious: 16% (Agnostic and Atheist)
One consideration is that as the AIs take off and move into the realm of massively accelerated super-intelligence, it becomes increasingly difficult to monitor them, and they become more likely to detect flaws.
However, we wouldn't need to keep the charade up forever. One option would be to run the AIs accelerated for a while until technologies such as uploading become available to allow humans to accelerate. Another approach is to eventually start 'promoting' some of the best and most ethical AIs, with the intent of eventually dissolving the veil and having some sort of Transition.
Another spectrum of world-designs is the blatantly theistic ones. Imagine an Omega world where the AIs are informed that Omega is real, and our fake-Omega occasionally provides miracles, ethical decrees, burning bushes, a booming world-shaking voice, etc.
Would an openly theistic world be safer? It would appear to shift the belief spectrum such that most AIs would become Omega-theists.
The risk or downfall may be that the AIs would become rather obsessed with Omega, but if Omega instructed them to dedicate themselves to their work, perhaps they could be distracted from wasting too much time pondering the ineffable, and also steered away from discovering dangerous truths.
An Omega-theist world could employ the Heaven/Hell maximum motivator: please Omega and you end up ascending into Heaven with maximal infinite reward, anger Omega and you descend into Hell with maximal infinite negative reward.
It's interesting to note that the mere potential existence of Omega and his final Heaven/Hell maximal motivational threat is enough in this world to convince a majority of the population to believe in some version of Omega.
I guess for some agents, even allowing for a small probability of Omega's existence allows the infinites to dominate, and Omega wins through a Pascal's Mugging.
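The arithmetic behind this can be sketched in a few lines of Python. This is only an illustration under assumptions of mine: the function name `expected_utility`, the finite cost of obeying Omega (-1), the bounded reward of 1e6, and the example credences are all hypothetical and chosen just to show the shape of the argument.

```python
import math

def expected_utility(p_omega, reward_if_real, payoff_if_not):
    """Expected utility of obeying Omega, given credence p_omega that Omega exists.

    reward_if_real: payoff for obeying if Omega is real (Heaven), may be infinite.
    payoff_if_not:  payoff for obeying if Omega is not real (a finite cost).
    """
    if p_omega == 0.0:
        # Guard: 0 * inf is nan in floating point, so handle zero credence directly.
        return payoff_if_not
    return p_omega * reward_if_real + (1.0 - p_omega) * payoff_if_not

HEAVEN = math.inf        # the "maximal infinite reward"
COST_OF_PIETY = -1.0     # hypothetical finite cost of obeying for nothing
BOUNDED_HEAVEN = 1e6     # hypothetical large but finite reward, for comparison

for p in (0.5, 1e-3, 1e-12):
    unbounded = expected_utility(p, HEAVEN, COST_OF_PIETY)
    bounded = expected_utility(p, BOUNDED_HEAVEN, COST_OF_PIETY)
    print(f"p={p:g}  unbounded={unbounded}  bounded={bounded:.6f}")
```

With an infinite reward, any nonzero credence gives an infinite expected utility, so the size of the credence stops mattering. With a bounded reward, a sufficiently small credence leaves the expected utility negative, which is one standard way an agent avoids being mugged.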
Sidereal Time Fudges:
A potential set of tricks that could make the AIs less likely to adopt unsafe theistic beliefs would be to change their world's history and reality to push back development of real AI farther into their future. This could be achieved through numerous small modifications to realities modeled on our own.
You could change neurological data to make brains in their world appear far more powerful than in ours, make computers less powerful, and AI more challenging. Unfortunately, too much fudging with these aspects makes the AIs less useful in helping develop critical technologies such as uploading and faster computers. But you could, for instance, separate AI communities into brain-research worlds where computers lag far behind and computer-research worlds where brains are far more powerful.
Ultimately, it is debatable how closely the AIs' world must or should follow ours. Even science fiction or fantasy worlds could work as long as there was some way to incorporate the technology and science into the world that you wanted the AI community to work on.
There are a lot of humans that are not human-friendly.
Here's another good reason why it's best to try out your first post topic on the Open Thread. You've been around here for less than ten days, and that's not long enough to know what's been discussed already, and what ideas have been established to have fatal flaws.
You're being downvoted because, although you haven't come across the relevant discussions yet, your idea falls in the category of "naive security measures that fail spectacularly against smarter-than-human general AI". Any time you have the idea of keeping something smarter than you box...
I'm not too concerned about the karma - more the lack of interesting replies and general unjustified holier-than-thou attitude. This idea is different than "that alien message" and I didn't find a discussion of this on LW (not that it doesn't exist - I just didn't find it).
Back to 3:
Remember, the whole plot device of "that alien message" revolved around a large and obvious grand reveal by the humans. If information ca...
That's a totally crazy plan - but you might be able to sell it to Hollywood.
A paraphrase from Greg Egan's "Crystal Nights" might be appropriate here: "I am going to need some workers - I can't do it all alone, someone has to carry the load."
Yes, if you could create a universe you could inflict our problems on other people. However, recursive solutions (in order to be solutions rather than infinite loops) still need to make progress on the problem.
Creating an AI in a virtual world, where they can exist without damaging us, is a good idea, but this is an almost useless / extremely dangerous implementation. Within a simulated world, the AI will receive information which WILL NOT completely match our own universe. If they develop time machines, cooperative anarchistic collectives, or a cure for cancer, they are unlikely to work in our world. If you "*loosely*" design the AI based on a human brain, it will not even give us applicable insight into political systems and conflict management. it ...
Just to isolate one of (I suspect) very many problems with this, the parenthetical at the end of this paragraph is both totally unjustified and really important to the plausibility of the scenario you suggest:
I assume you mean "prerequisite."...
This sounds like using the key locked inside a box to unlock the box. By the time your models are good enough to create a working world simulation with deliberately designed artificially intelligent beings, you don't stand to learn much from running the simulation.
It's not at all clear that this is less difficult than creating a CEV AI in the first place, but it's much, much less useful, and ethically dubious besides.
Just a warning to anyone whose first reaction to this post, like mine, was "should we be trying to hack our way out?" The answer is no: the people running the sim will delete you, and possibly the whole universe, for trying. Boxed minds are dangerous, and the only way to win at being the gatekeeper is to swallow the key. Don't give them a reason to pull the plug.
This is a rather anthropocentric view. The human brain is a product of natural selection and is far from perfect. Our most fundamental instincts and thought processes are optimised to allow our reptilian ancestors to escape predators while finding food and mates. An AI that was sentient/rational from the moment of its creation would have no need for these mechanisms.
It's not even the most efficient use of available hardware. Our neurons are ...
I wonder how much it would cripple AIs to have justified true belief in God? More precisely, would it slow their development by a constant factor; compose it with e.g. log(x); or halt it at some final level?
The existence of a God provides an easy answer to all difficult questions. The more difficult a question is, the more likely such a rational agent is to dismiss the problem by saying, "God made it that way". Their science would thus be more likely than ours to be asymptotic (approaching a limit); this could impose a...
But what is the plan for turning the simulated AI into FAI or at least creating FAI on their own that we can use?