Neutral AI

by PhilGoetz · 3 min read · 27th Dec 2010 · 30 comments


Personal Blog

Unfriendly AI has goal conflicts with us.  Friendly AI (roughly speaking) shares our goals.  How about an AI with no goals at all?

I'll call this "neutral AI".  Cyc is a neutral AI.  It has no goals, no motives, no desires; it is inert unless someone asks it a question.  It then has a set of routines it uses to try to answer the question.  It executes these routines, and terminates, whether the question was answered or not.  You could say that it had the temporary goal to answer the question.  We then have two important questions:

  1. Is it possible (or feasible) to build a useful AI that operates like this?
  2. Is an AI built in this fashion significantly less-dangerous than one with goals?

Many people have answered the first question "no".  This would probably include Hubert Dreyfus (whose Heideggerian analysis of semantics was actually very good, but I would say misguided in its conclusions, because Dreyfus mistook "what AI researchers do today" for "what is possible using a computer"), Phil Agre, Rodney Brooks, and anyone who describes their work as "reactive", "behavior-based", or "embodied cognition".  We could also point to the analogous linguistic divide.  There are two general approaches to natural language understanding.  One descends from generative grammars and symbolic AI, is embodied by James Allen's book Natural Language Understanding, and belongs to the "program in the knowledge" camp that would answer the first question "yes".  The other has more kinship with construction grammars and machine learning, is embodied by Manning & Schütze's Foundations of Statistical Natural Language Processing, and its practitioners would be more likely to answer the first question "no".  (Eugene Charniak is noteworthy for having been prominent in both camps.)

The second question, I think, hinges on two sub-questions:

  1. Can we prevent an AI from harvesting more resources than it should for a question?
  2. Can we prevent an AI from conceiving the goal of increasing its own intelligence as a subgoal to answering a question?

The Jack Williamson story "With Folded Hands" (1947) tells how humanity was enslaved by robots given the order to protect humanity, who became... overprotective.  Or suppose a physicist asked an AI, "Does the Higgs boson exist?"  You don't want it to use the Earth to build a supercollider.  These are cases of using more resources than intended to carry out an order.

You may be able to build a Cyc-like question-answering architecture that would have no risk of doing any such thing.  It may be as simple as placing resource limitations on every question.  The danger is that if the AI is given a very thorough knowledge base that includes, for instance, an understanding of human economics and motivations, it may syntactically construct a plan to find the answer to a question that is technically within the resource limitations imposed, for instance by manipulating humans in ways that don't tweak its cost function.  This could lead to very big mistakes; but it isn't the kind of mistake that builds on itself, like a FOOM scenario.  The question is whether any of these very big mistakes would be irreversible.  My intuition is that there would be a power-law distribution of mistake sizes, with a small number of irreversible mistakes.  We might then figure out a reasonable way of determining our risk level.
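As a toy illustration of "resource limitations on every question": a solver that charges a fixed budget per inference step and terminates, answer or no answer, when the budget runs out.  Everything here (the `BudgetedSolver` class, the cost model) is a hypothetical sketch, not the architecture of Cyc or any real system.

```python
# Hypothetical sketch of a hard per-question resource cap for a
# neutral, question-answering AI. All names are illustrative.

class ResourceLimitExceeded(Exception):
    pass

class BudgetedSolver:
    def __init__(self, max_steps):
        self.max_steps = max_steps  # hard cap on inference steps per question
        self.steps = 0

    def charge(self, cost=1):
        self.steps += cost
        if self.steps > self.max_steps:
            raise ResourceLimitExceeded("question abandoned: budget exhausted")

    def answer(self, question, knowledge):
        # Trivial stand-in for inference: scan stored facts, paying one
        # unit of budget per fact examined. Terminates either way, as
        # the post describes: answer found, no answer, or budget spent.
        for fact in knowledge:
            self.charge()
            if question in fact:
                return fact
        return None

solver = BudgetedSolver(max_steps=100)
facts = ["the Higgs boson exists", "water boils at 100 C"]
print(solver.answer("Higgs", facts))  # → "the Higgs boson exists"
```

The point of the cap is that a plan like "build a supercollider" can never be represented as a single cheap step, so any plan that expensive is cut off mid-search; the failure mode discussed above is a plan whose *nominal* cost stays under the cap while its real-world side effects do not.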

If the answer to the second subquestion is "yes", then we probably don't need to fear a FOOM from neutral AI.

The short answer is, Yes, there are "neutral AI architectures" that don't currently have the risk either of harvesting too many resources, or of attempting to increase their own intelligence.  Many existing AI architectures are examples.  (I'm thinking specifically of "hierarchical task-network planning", which I don't consider true planning; it only allows the piecing together of plan components that were pre-built by the programmer.)  But they can't do much.  There's a power / safety tradeoff.  The question is how much power you can get in the "completely safe" region, and where the sweet spots are in that tradeoff outside the completely safe region.

If you could build an AI that did nothing but parse published articles to answer the question, "Has anyone said X?", that would be very useful, and very safe. I worked on such a program (SemRep) at NIH. It works pretty well within the domain of medical journal articles.  If it could take one step more, and ask, "Can you find a set of one to four statements that, taken together, imply X?", that would be a huge advance in capability, with little if any additional risk.  (I added that capability to SemRep, but no one has ever used it, and it isn't accessible through the web interface.)
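The two capabilities can be sketched concretely.  Suppose extracted statements are (subject, relation, object) triples; then "Has anyone said X?" is a direct lookup, and the one-to-four-statement version is a bounded search for a chain linking subject to object.  This is my own simplification, not SemRep's actual algorithm: in particular, treating a relation as transitively chainable is a deliberate shortcut.

```python
# Illustrative sketch (not SemRep's real machinery) of direct lookup
# versus finding a short chain of statements that together imply X.
from collections import deque

def said_directly(triple, statements):
    # "Has anyone said X?" -- X appears verbatim as an extracted triple.
    return triple in statements

def find_chain(subj, rel, obj, statements, max_len=4):
    # Breadth-first search for at most max_len triples sharing the
    # relation that link subj to obj. Chaining a relation transitively
    # is an assumption made for the sketch, not sound in general.
    queue = deque([(subj, [])])
    while queue:
        node, path = queue.popleft()
        if len(path) >= max_len:
            continue
        for (s, r, o) in statements:
            if s == node and r == rel:
                new_path = path + [(s, r, o)]
                if o == obj:
                    return new_path
                queue.append((o, new_path))
    return None

stmts = [("aspirin", "affects", "COX-1"),
         ("COX-1", "affects", "thromboxane"),
         ("thromboxane", "affects", "bleeding-time")]
print(find_chain("aspirin", "affects", "thromboxane", stmts))
```

The extra risk really is small: the search space is bounded by the corpus and the chain length, and nothing in it models the outside world.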

30 comments

What about narrow-domain AI? If, through neuroscience, we figure out more about how the brain comes up with and generalizes theories about, say, physics and math, couldn't we emulate just this part, so that we get a complicated calculator/theory generator that can't modify itself or do much else? Then it would only use the computing power we provide it with (won't turn the Earth into a computer) and we could use it to help prove Friendliness-related theorems or explore math/physics in general (on much faster timescales than today).

Theorem-proving is a narrow domain that operates entirely within an abstract logic, and so is probably completely safe. Theorem-generating would benefit from more general cognitive abilities, so is more open-ended and potentially dangerous.

If you're talking about an AI general enough to answer interesting questions, something that doesn't just recite knowledge from a database, but something that can actually solve problems by using and synthesizing information in novel ways (which I assume you are, if you're talking about preventing it from turning the Earth into a supercollider by putting limits on its resource usage), then you would need to solve the additional problem of constraining what questions it's allowed to answer — you don't want someone asking it for the source code for some other type of AI, for example. I suspect that this part of the problem is FAI-complete.

If you could build an AI that did nothing but parse published articles to answer the question, "Has anyone said X?", that would be very useful, and very safe. I worked on such a program (SemRep) at NIH. It works pretty well within the domain of medical journal articles.

If it could take one step more, and ask, "Can you find a set of one to four statements that, taken together, imply X?", that would be a huge advance in capability, with little if any additional risk.

I added that capability to SemRep, but no one has ever used it, and it isn't accessible through the web interface. (I introduced a switch that makes it dump its output as structured Prolog statements instead of as a flat file; you can then load them into a Prolog interpreter and ask queries, and it will perform Prolog inference.) In fact, I don't think anyone else is aware that capability exists; my former boss thought it was a waste of time and was angry with me for having spent a day implementing it, and has probably forgotten about it. It needs some refinement to work properly, because a search of, say, 100,000 article abstracts will find many conflicting statements. It needs to pick one of "A / not A" for every A found directly in an article, based on the number of and quality of assertions found in favor of each.
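The refinement described in the last two sentences, picking one of "A / not A" from conflicting assertions, might look something like the following.  The weighting scheme (summed quality scores per polarity) is my assumption for the sketch, not how SemRep scores anything.

```python
# Hedged sketch: when a corpus search turns up both "A" and "not A",
# choose one polarity per claim by weighing the supporting assertions.
# The count-times-quality scheme is an assumption, not SemRep's.

def resolve_polarity(assertions):
    # assertions: list of (claim, polarity, quality), where polarity is
    # True for "A", False for "not A", and quality is a score in (0, 1].
    scores = {}
    for claim, polarity, quality in assertions:
        pos, neg = scores.get(claim, (0.0, 0.0))
        if polarity:
            pos += quality
        else:
            neg += quality
        scores[claim] = (pos, neg)
    # keep whichever side has more weighted support (ties favor "A")
    return {claim: pos >= neg for claim, (pos, neg) in scores.items()}

found = [("drug-X treats disease-Y", True, 0.9),
         ("drug-X treats disease-Y", True, 0.6),
         ("drug-X treats disease-Y", False, 0.8)]
print(resolve_polarity(found))  # → {'drug-X treats disease-Y': True}
```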

How close do you have to get to natural language to do the search?

I've wondered whether a similar system could check legal systems for contradictions-- probably a harder problem, but not as hard as full natural language.

Most of the knowledge used is in its ontology. It doesn't try to parse sentences with categories like {noun, verb, adverb}; it uses categories like {drug, disease, chemical, gene, surgery, physical therapy}. It doesn't categorize verbs as {transitive, intransitive, etc.}; it categorizes verbs as, e.g., {increases, decreases, is-a-symptom-of}. When you build a grammar (by hand) out of word categories that are this specific, it makes most NLP problems disappear.

ADDED: It isn't really a grammar, either - it grabs onto the most-distinctive simple pattern first, which might be the phrase "is present in", and then says, "Somewhere to the left I'll probably find a symptom, and somewhere to the right I'll probably find a disease", and then goes looking for those things, mostly ignoring the words in-between.
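A toy version of that anchor-first strategy: find a distinctive phrase, then scan left for a term in the symptom category and right for one in the disease category, ignoring the words in between.  The categories and the sentence are made up for the sketch; this is the shape of the idea, not SemRep's pattern set.

```python
# Toy version of pattern-first extraction: anchor on a distinctive
# phrase, then look outward for terms in ontology categories, mostly
# ignoring the words in between. Categories here are invented.

SYMPTOM = {"fever", "tremor", "fatigue"}
DISEASE = {"malaria", "influenza", "parkinsonism"}

def extract(sentence, anchor="is present in"):
    if anchor not in sentence:
        return None
    left, right = sentence.split(anchor, 1)
    # nearest category match scanning away from the anchor on each side
    symptom = next((w for w in reversed(left.split()) if w in SYMPTOM), None)
    disease = next((w for w in right.split() if w in DISEASE), None)
    if symptom and disease:
        return (symptom, "is-a-symptom-of", disease)
    return None

print(extract("a persistent fever is present in nearly all malaria patients"))
# → ('fever', 'is-a-symptom-of', 'malaria')
```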

I don't know what you mean by 'ontology'. I thought it meant the study of reality.

I can believe that the language in scientific research (especially if you limit the fields) is simplified enough for the sort of thing you describe to work.

I don't know what you mean by 'ontology'. I thought it meant the study of reality.

See: http://en.wikipedia.org/wiki/Ontology_(information_science)

In computer science and information science, an ontology is a formal representation of knowledge as a set of concepts within a domain, and the relationships between those concepts. It is used to reason about the entities within that domain, and may be used to describe the domain.


If you're talking about an AI general enough to answer interesting questions, something that doesn't just recite knowledge from a database, but something that can actually solve problems by using and synthesizing information in novel ways (which I assume you are, if you're talking about preventing it from turning the Earth into a supercollider by putting limits on its resource usage), then you would need to solve the additional problem of constraining what questions it's allowed to answer

Nitpick: to some extent we already have weak AI that can answer interesting novel questions within very narrow knowledge bases. For example, the Robbins conjecture was proven with the assistance of an automated theorem prover. And Simon Colton made AIs that were able to make new, interesting mathematical definitions and make conjectures about them (see this paper). There's been similar work in biochemistry. So even very weak AIs can not only answer interesting questions but come up with new questions themselves.

Or control access (which you probably want to do for the source for any sort of AI, anyway).

(Are you sure you're not searching for additional restrictions to impose until the problem becomes FAI-complete?)

you don't want someone asking it for the source code for some other type of AI, for example.

But that is as easy as not being reckless with it. One can still deliberately crash a car, but cars are pretty safe.

The relative sizes of the spaces of safe and dangerous questions compare favorably to the relative sizes of the spaces of FAI and UFAI designs.

I suspect that this part of the problem is FAI-complete.

If so, that's not necessarily an argument against an oracle AI. It still may be 'as hard as creating FAI', but only because FAI can be made through an oracle AI (and all other paths being much harder).

For a very long time I assumed the first strong AI would be neutral (and I thought that hoping for a friendly AI first was both unrealistic and unnecessary). Now I'm unsure. Of course I'm pretty ignorant so you should take what I say with a grain of salt.

The most obvious objection is that almost all optimism about the power of an AGI comes from its potential ability to understand the world well enough to construct a better AI and then doing it, which is automatically ruled out by the notion of safety.

Moreover, as far as I can tell the most difficult problem facing society now is managing to build an AI which is smart for some reason you can understand (a prerequisite to being either safe or friendly) before we accidentally build an AI which is smart for some reason we can't understand (which is therefore likely to be unfriendly if you believe the SIAI).

to build an AI which is smart for some reason you can understand (a prerequisite to being either safe or friendly) before we accidentally build an AI which is smart for some reason we can't understand (which is therefore likely to be unfriendly if you believe the SIAI).

Entirely agreed, but to nitpick, an AI that's smart for some reason you understand is no more likely to be Friendly if you don't try to make it Friendly — it just allows you to try with a decent hope of success.

I have a personal slogan regarding this issue: "All intelligence is perceptual intelligence". Humans are excellent at perception and very poor at decision-making. We use our strong perceptual capacities to make up for the deficits in our planning system. This belief is based on introspection - I am able to predict very accurately, for example, that a certain configuration of photons really represents a vending machine, and if I insert a coin in the right place (another hard calculation), a soft drink will come out. In contrast to the confidence with which I can perform these perceptual feats, I have no strong opinion whether buying a soft drink is really a good idea.

Are you sure that this perception is not just due to your very small sample size of intelligent entities?

In any event, I'm not even sure, given the comparisons to other species. Chimpanzees are in some respects better at some aspects of visual perception than humans. And corvids can learn to use vending machines, in some cases with minimal direct prompting from humans. But we can make much more sophisticated decisions than they can. And we can benefit from using complicated language to communicate what other humans have learned. It seems that perception of the sort you mention is not what makes us very different.

That brings up another of my personal beliefs: human intelligence differs from animal intelligence only in quantitative respects: we have more memory, more neurons, a longer developmental phase, etc. Falsifiable prediction: if you took some chimps (or dolphins or dogs), bred or genetically engineered them to increase their brain size and developmental period so as to be comparable to humans, they would have no trouble learning human language.

But we can make much more sophisticated decisions than they can.

Also, my guess is that our best decision-making abilities come from reusing our perceptual powers to transform a decision problem into a perception problem. So when an engineer designs a complex system, he does so by using drawings, schematics, diagrams, etc that allow the decision (design) problem to be attacked using perceptual apparatus.

Credit for the falsifiable prediction - which has in fact been falsified: elephants have larger brains than we have, and a similar development period, and they can't learn language.

It's clear that our decision making abilities do reuse perceptual machinery, but it's also clear that some additional machinery was needed.

Elephants may have a language - of sorts:

"Elephant 'secret language' clues"

Whales almost certainly do. Their auditory systems are amazing compared to our own.

Elephants and whales both certainly have communication systems. So do dolphins, wolves, canaries, mice, ants, oak trees, slime molds etc. That's not the same as language.

http://en.wikipedia.org/wiki/Language defines "language" as being a human capability.

That seems like a joke to me, but sure, if you define the term "language" that way, then elephants and whales don't have it. What whales do have is a complex communication system which supports dialects and persistent culture.

Is there evidence that elephant communication supports dialects and culture?

Is there any consistent research being done on language modes of other species? Especially cephalopods.

Learning to have a conversation with a cephalopod ranks first on my list of "things that might be worth dropping everything to pursue that I personally would have a greater than 1% chance of actually accomplishing."

I think that you should have a 3rd sub-question of "Can we prevent the AI from significantly altering the system it wishes to answer a question about in order to answer the question?"

For a (ridiculous and slightly contrived) example:

"Tell me, as precisely as possible, how long I have to live." "2 seconds." "What?" Neutral AI kills human.

Here, rather than calculating an expected answer based on genetics, lifestyle, etc., the AI finds that it can accurately answer the question if it simply kills the human.

Less ridiculously:

"This frog is sick, what's wrong with it?" AI proceeds with all manner of destructive tests which, while conclusively finding out what's wrong with the frog, kill it.

If you wanted to make the frog better, that's a problem.

This is closely related to the issue of "stopping" - which has been previously discussed by me on:

http://alife.co.uk/essays/stopping_superintelligence/

A common framing of the discussion is in terms of "satisficing" - a term apparently coined by Herbert Simon.

There is a previous discussion of this whole issue on this site here.

There's another problem. Even if you get it so that it just answers questions using only its own processing power, if you ask it how to best make paperclips, it will most likely respond by designing a paperclip-maximizer.

Only if you ask it the most effective method of ultimately producing the largest number of paperclips, without upper limit and without regard to schedule and cost. But that's not something anybody wants to know in the first place. What a stationery manufacturer actually wants to know is the lowest cost method of producing a certain number of paperclips within a certain schedule, and the answer to that will not be a paperclip maximizer.

If you can afford to build a paperclip maximizer for less than it costs to produce a given number of paperclips, then if you want to build at least that many paperclips, the cheapest way will be to build a paperclip maximizer. It's not going to turn the world into paperclips in a reasonable time, but it probably will build enough within that time limit.

No it won't. A paperclip maximizer is basically a two-stage process: first, conquer the world, then start producing paperclips via an efficient manufacturing process. If what you want is to turn the whole world into paperclips, and you don't care about cost or schedule, then the first stage is necessary. If all you want is a billion paperclips as cheap as possible, then it's entirely irrational to incur the cost and schedule impact of the first stage when you could hop directly to the second stage and just build an efficient paperclip factory.

Good point. Perhaps it would build a paperclip-making von Neumann machine, or just a machine that maximizes paperclips within a given time.

Besides rwallace's point, I would think that there's a large class of useful questions you could ask that wouldn't produce such a dangerous response.

(Are you sure you're not searching for problems with the idea until it looks like it couldn't offer an advantage over building FAI?)