Points of Departure

by Eliezer Yudkowsky, 9th Sep 2008

Followup to: Anthropomorphic Optimism

If you've watched Hollywood sci-fi involving supposed robots, androids, or AIs, then you've seen AIs that are depicted as "emotionless".  In the olden days this was done by having the AI speak in a monotone pitch - while perfectly stressing the syllables, of course.  (I could similarly go on about how AIs that disastrously misinterpret their mission instructions, never seem to need help parsing spoken English.)  You can also show that an AI is "emotionless" by having it notice an emotion with a blatant somatic effect, like tears or laughter, and ask what it means (though of course the AI never asks about sweat or coughing).

If you watch enough Hollywood sci-fi, you'll run into all of the following situations occurring with supposedly "emotionless" AIs:

  1. An AI that malfunctions or otherwise turns evil, instantly acquires all of the negative human emotions - it hates, it wants revenge, and feels the need to make self-justifying speeches.
  2. Conversely, an AI that turns to the Light Side, gradually acquires a full complement of human emotions.
  3. An "emotionless" AI suddenly exhibits human emotion when under exceptional stress; e.g. an AI that displays no reaction to thousands of deaths, suddenly showing remorse upon killing its creator.
  4. An AI begins to exhibit signs of human emotion, and refuses to admit it.

Now, why might a Hollywood scriptwriter make those particular mistakes?

These mistakes seem to me to bear the signature of modeling an Artificial Intelligence as an emotionally repressed human.

At least, I can't seem to think of any other simple hypothesis that explains the behaviors 1-4 above.  The AI that turns evil has lost its negative-emotion-suppressor, so the negative emotions suddenly switch on.  The AI that turns from mechanical agent to good agent, gradually loses the emotion-suppressor keeping it mechanical, so the good emotions rise to the surface.  Under exceptional stress, of course the emotional repression that keeps the AI "mechanical" will immediately break down and let the emotions out.  But if the stress isn't so exceptional, the firmly repressed AI will deny any hint of the emotions leaking out - that conflicts with the AI's image of itself as emotionless.

It's not that the Hollywood scriptwriters are explicitly reasoning "An AI will be like an emotionally repressed human", of course; but rather that when they imagine an "emotionless AI", this is the intuitive model that forms in the background - a Standard mind (which is to say a human mind) plus an extra Emotion Suppressor.

Which all goes to illustrate yet another fallacy of anthropomorphism - treating humans as your point of departure, modeling a mind as a human plus a set of differences.

This is a logical fallacy because it warps Occam's Razor.  A mind that entirely lacks chunks of brainware to implement "hate" or "kindness", is simpler - in a computational complexity sense - than a mind that has "hate" plus a "hate-suppressor", or "kindness" plus a "kindness-repressor".  But if you start out with a human mind, then adding an activity-suppressor is a smaller alteration than deleting the whole chunk of brain.
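To make the two ways of measuring simplicity concrete, here is a toy sketch in Python.  Everything in it is an assumption made up for illustration: the component names, the arbitrary "sizes", and the two hypothetical functions `absolute_complexity` (the cost of specifying a mind from scratch) and `diff_from_human` (the cost of editing a human mind into the target).  It is not a real complexity measure, only a way to watch the ranking flip.

```python
# Toy model: a "mind" is a set of named components.
# The sizes below are arbitrary assumptions chosen for illustration only.
SIZES = {
    "reasoning": 10,
    "hate": 5,
    "kindness": 5,
    "hate_suppressor": 1,
    "kindness_suppressor": 1,
}

HUMAN = frozenset({"reasoning", "hate", "kindness"})
# A human plus suppressors: the "emotionally repressed human" model.
REPRESSED_HUMAN = HUMAN | {"hate_suppressor", "kindness_suppressor"}
# A mind that simply never had the emotional machinery.
EMOTIONLESS = frozenset({"reasoning"})


def absolute_complexity(mind):
    """Cost of specifying the mind from scratch: total size of its parts."""
    return sum(SIZES[part] for part in mind)


def diff_from_human(mind):
    """Cost of editing HUMAN into the mind: total size of every part
    that has to be added or deleted (the symmetric difference)."""
    return sum(SIZES[part] for part in mind ^ HUMAN)


if __name__ == "__main__":
    for name, mind in [("repressed human", REPRESSED_HUMAN),
                       ("emotionless", EMOTIONLESS)]:
        print(name, absolute_complexity(mind), diff_from_human(mind))
```

Under these made-up numbers, the emotionless mind costs 10 from scratch against 22 for the repressed human, but measured as a diff from the human baseline it costs 10 against 2.  The ordering reverses depending on which starting point you measure from.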

It's also easier for human scriptwriters to imagine themselves repressing an emotion, pushing it back, crushing it down, than it is for them to imagine deleting an emotion once and never having it come back.  The former is a mode that human minds can operate in; the latter would take neurosurgery.

But that's just a kind of anthropomorphism previously covered - the plain old ordinary fallacy of using your brain as a black box to predict something that doesn't work like it does.  Here, I want to talk about the formally different fallacy of measuring simplicity in terms of the shortest diff from "normality", i.e., what your brain says a "mind" does in the absence of specific instruction otherwise, i.e., humanness.  Even if you can grasp that something doesn't have to work just like a human, thinking of it as a human+diff will distort your intuitions of simplicity - your Occam-sense.
