Sympathetic Minds



Eliezer_Yudkowsky

"Mirror neurons" are neurons that are active both when performing an action and observing the same action—for example, a neuron that fires when you hold up a finger or see someone else holding up a finger.  Such neurons have been directly recorded in primates, and consistent neuroimaging evidence has been found for humans.

You may recall from my previous writing on "empathic inference" the idea that brains are so complex that the only way to simulate them is by forcing a similar brain to behave similarly.  A brain is so complex that if a human tried to understand brains the way that we understand e.g. gravity or a car—observing the whole, observing the parts, building up a theory from scratch—then we would be unable to invent good hypotheses in our mere mortal lifetimes.  The only possible way you can hit on an "Aha!" that describes a system as incredibly complex as an Other Mind, is if you happen to run across something amazingly similar to the Other Mind—namely your own brain—which you can actually force to behave similarly and use as a hypothesis, yielding predictions.

So that is what I would call "empathy".

And then "sympathy" is something else on top of this—to smile when you see someone else smile, to hurt when you see someone else hurt.  It goes beyond the realm of prediction into the realm of reinforcement.

And you ask, "Why would callous natural selection do anything that nice?"

It might have gotten started, maybe, with a mother's love for her children, or a brother's love for a sibling.  You can want them to live, you can want them to be fed, sure; but if you smile when they smile and wince when they wince, that's a simple urge that leads you to deliver help along a broad avenue, in many walks of life.  So long as you're in the ancestral environment, what your relatives want probably has something to do with your relatives' reproductive success (this being an explanation for the selection pressure, of course, not a conscious belief).

You may ask, "Why not evolve a more abstract desire to see certain people tagged as 'relatives' get what they want, without actually feeling yourself what they feel?"  And I would shrug and reply, "Because then there'd have to be a whole definition of 'wanting' and so on.  Evolution doesn't take the elaborate correct optimal path, it falls up the fitness landscape like water flowing downhill.  The mirroring-architecture was already there, so it was a short step from empathy to sympathy, and it got the job done."

Relatives—and then reciprocity; your allies in the tribe, those with whom you trade favors.  Tit for Tat, or evolution's elaboration thereof to account for social reputations.

Who is the most formidable, among the human kind?  The strongest?  The smartest?  More often than either of these, I think, it is the one who can call upon the most friends.

So how do you make lots of friends?

You could, perhaps, have a specific urge to bring your allies food, like a vampire bat; they have a whole system of reciprocal blood donations going in those colonies.  But it is a more general motivation, one that will lead the organism to store up more favors, to smile when designated friends smile.

And what kind of organism will avoid making its friends angry at it, in full generality?  One that winces when they wince.

Of course you also want to be able to kill designated Enemies without a qualm—these are humans we're talking about.

But... I'm not sure of this, but it does look to me like sympathy, among humans, is "on" by default.  There are cultures that help strangers... and cultures that eat strangers; the question is which of these requires the explicit imperative, and which is the default behavior for humans.  I don't really think I'm being such a crazy idealistic fool when I say that, based on my admittedly limited knowledge of anthropology, it looks like sympathy is on by default.

Either way... it's painful if you're a bystander in a war between two sides, and your sympathy has not been switched off for either side, so that you wince when you see a dead child no matter what the caption on the photo; and yet those two sides have no sympathy for each other, and they go on killing.

So that is the human idiom of sympathy: a strange, complex, deep implementation of reciprocity and helping.  It tangles minds together, not by a term in the utility function for some other mind's "desire", but by the simpler and yet far more consequential path of mirror neurons: feeling what the other mind feels, and seeking similar states.  Even if it's only done by observation and inference, and not by direct transmission of neural information as yet.

Empathy is a human way of predicting other minds.  It is not the only possible way.

The human brain is not quickly rewirable; if you're suddenly put into a dark room, you can't rewire the visual cortex as auditory cortex, so as to better process sounds, until you leave, and then suddenly shift all the neurons back to being visual cortex again.

An AI, at least one running on anything like a modern programming architecture, can trivially shift computing resources from one thread to another.  Put in the dark?  Shut down vision and devote all those operations to sound; swap the old program to disk to free up the RAM, then swap the disk back in again when the lights go on.

So why would an AI need to force its own mind into a state similar to what it wanted to predict?  Just create a separate mind-instance—maybe with different algorithms, the better to simulate that very dissimilar human.  Don't try to mix up the data with your own mind-state; don't use mirror neurons.  Think of all the risk and mess that implies!

An expected utility maximizer—especially one that does understand intelligence on an abstract level—has other options than empathy, when it comes to understanding other minds.  The agent doesn't need to put itself in anyone else's shoes; it can just model the other mind directly.  A hypothesis like any other hypothesis, just a little bigger.  You don't need to become your shoes to understand your shoes.

And sympathy?  Well, suppose we're dealing with an expected paperclip maximizer, but one that isn't yet powerful enough to have things all its own way—it has to deal with humans to get its paperclips.  So the paperclip agent... models those humans as relevant parts of the environment, models their probable reactions to various stimuli, and does things that will make the humans feel favorable toward it in the future.

To a paperclip maximizer, the humans are just machines with pressable buttons.  No need to feel what the other feels—if that were even possible across such a tremendous gap of internal architecture.  How could an expected paperclip maximizer "feel happy" when it saw a human smile?  "Happiness" is an idiom of policy reinforcement learning, not expected utility maximization.  A paperclip maximizer doesn't feel happy when it makes paperclips, it just chooses whichever action leads to the greatest number of expected paperclips.  Though a paperclip maximizer might find it convenient to display a smile when it made paperclips—so as to help manipulate any humans that had designated it a friend.
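To make "it just chooses whichever action leads to the greatest number of expected paperclips" concrete, here is a minimal sketch of that choice rule.  The action names, probabilities, and paperclip counts are invented purely for illustration; nothing here is meant as a model of any real system.  Note that there is no reward signal and no internal "happiness" variable anywhere in it, only a sum and an argmax.

```python
from typing import Dict, List, Tuple

# Each action maps to a list of (probability, paperclips) outcomes.
# All names and numbers below are hypothetical, made up for this example.
Outcomes = List[Tuple[float, float]]


def expected_paperclips(outcomes: Outcomes) -> float:
    """Probability-weighted number of paperclips for one action."""
    return sum(p * clips for p, clips in outcomes)


def choose_action(actions: Dict[str, Outcomes]) -> str:
    """Pick the action with the greatest expected paperclips."""
    return max(actions, key=lambda name: expected_paperclips(actions[name]))


ACTIONS: Dict[str, Outcomes] = {
    # Humans may notice and shut the project down.
    "build_openly": [(0.5, 1000.0), (0.5, 0.0)],
    # Displaying a smile costs some output but makes shutdown less likely.
    "display_smile_then_build": [(0.9, 800.0), (0.1, 0.0)],
    "do_nothing": [(1.0, 10.0)],
}

if __name__ == "__main__":
    for name, outcomes in ACTIONS.items():
        print(name, expected_paperclips(outcomes))
    print("chosen:", choose_action(ACTIONS))
```

Under these made-up numbers the smile gets displayed, not because anything is felt, but because that row happens to have the largest expected count.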

You might find it a bit difficult to imagine such an algorithm—to put yourself into the shoes of something that does not work like you do, and does not work like any mode your brain can make itself operate in.

You can make your brain operate in the mode of hating an enemy, but that's not right either.  The way to imagine how a truly unsympathetic mind sees a human, is to imagine yourself as a useful machine with levers on it.  Not a human-shaped machine, because we have instincts for that.  Just a woodsaw or something.  Some levers make the machine output coins, other levers might make it fire a bullet.  The machine does have a persistent internal state and you have to pull the levers in the right order.  Regardless, it's just a complicated causal system, with nothing inherently mental about it.
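If it helps, the lever machine can be written down as a tiny state machine, together with a planner that treats it as nothing more than a causal system to be searched.  This is only a sketch under invented assumptions: the lever names ("prime", "crank"), the rule that cranking an unprimed machine fires a bullet, and the scoring are all hypothetical.

```python
from itertools import product
from typing import List, Tuple


class LeverMachine:
    """Hypothetical machine with one bit of persistent internal state."""

    def __init__(self) -> None:
        self.primed = False

    def pull(self, lever: str) -> str:
        if lever == "prime":
            self.primed = True
            return "nothing"
        if lever == "crank":
            if self.primed:
                self.primed = False
                return "coin"
            return "bullet"  # cranking out of order is punished
        raise ValueError(f"unknown lever: {lever}")


def score(plan: Tuple[str, ...]) -> int:
    """Coins gained, with a large penalty per bullet, on a fresh machine."""
    machine = LeverMachine()
    outputs = [machine.pull(lever) for lever in plan]
    return outputs.count("coin") - 1000 * outputs.count("bullet")


def best_plan(length: int) -> List[str]:
    """Brute-force search over every lever sequence of the given length."""
    return list(max(product(["prime", "crank"], repeat=length), key=score))


if __name__ == "__main__":
    print(best_plan(4))
```

The planner never asks what the machine "wants"; it just simulates each sequence of pulls and keeps the one with the best payoff.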

(To understand unsympathetic optimization processes, I would suggest studying natural selection, which doesn't bother to anesthetize fatally wounded and dying creatures, even when their pain no longer serves any reproductive purpose, because the anesthetic would serve no reproductive purpose either.)

That's why I listed "sympathy" in front of even "boredom" on my list of things that would be required to have aliens which are the least bit, if you'll pardon the phrase, sympathetic.  It's not impossible that sympathy exists among some significant fraction of all evolved alien intelligent species; mirror neurons seem like the sort of thing that, having happened once, could happen again.

Unsympathetic aliens might be trading partners—or not, stars and such resources are pretty much the same the universe over.  We might negotiate treaties with them, and they might keep them for calculated fear of reprisal.  We might even cooperate in the Prisoner's Dilemma.  But we would never be friends with them.  They would never see us as anything but means to an end.  They would never shed a tear for us, nor smile for our joys.  And the others of their own kind would receive no different consideration, nor have any sense that they were missing something important thereby.

Such aliens would be varelse, not ramen: the sort of aliens we can't relate to on any personal level, and with whom there is no point in trying.