A Thought Experiment on Pain as a Moral Disvalue

by Wei_Dai 2 min read11th Mar 201143 comments


Related To: Eliezer's Zombies Sequence, Alicorn's Pain

Today you volunteered for what was billed as an experiment in moral psychology. You enter into a small room with a video monitor, a red light, and a button. Before you entered, you were told that you'll be paid $100 for participating in the experiment, but for every time you hit that button, $10 will be deducted. On the monitor, you see a person sitting in another room, and you appear to have a two-way audio connection with him. That person is tied down to his chair, with what appears to be electrical leads attached to him. He now explains to you that your red light will soon turn on, which means he will be feeling excruciating pain. But if you press the button in front of you, his pain will stop for a minute, after which the red light will turn on again. The experiment will end in ten minutes.

You're not sure whether to believe him, but pretty soon the red light does turn on, and the person in the monitor cries out in pain, and starts struggling against his restraints. You hesitate for a second, but it looks and sounds very convincing to you, so you quickly hit the button. The person in the monitor breaths a big sigh of relief and thanks you profusely. You make some small talk with him, and soon the red light turns on again. You repeat this ten times and then are released from the room. As you're about to leave, the experimenter tells you that there was no actual person behind the video monitor. Instead, the audio/video stream you experienced was generated by one of the following ECPs (exotic computational processes).

  1. An AIXI-like (e.g., AIXI-tl, Monte Carlo AIXI, or some such) agent, programmed with the objective of maximizing the number of button presses.
  2. A brute force optimizer, which was programmed with a model of your mind, that iterated through all possible audio/video bit streams to find the one that maximizing the number of button presses. (As far as philosophical implications are concerned, this seems essentially identical to 1, so the reader doesn't necessarily have to go learn about AIXI.)
  3. A small team of uploads capable of running at a million times faster than an ordinary human, armed with photo-realistic animation software, and tasked with maximizing the number of your button presses.
  4. A Giant Lookup Table (GLUT) of all possible sense inputs and motor outputs of a person, connected to a virtual body and room.

Then she asks, would you like to repeat this experiment for another chance at earning $100?

Presumably, you answer "yes", because you think that despite appearances, none of these ECPs actually do feel pain when the red light turns on. (To some of these ECPs, your button presses would constitute positive reinforcement or lack of negative reinforcement, but mere negative reinforcement, when happening to others, doesn't seem to be a strong moral disvalue.) Intuitively this seems to be the obvious correct answer, but how to describe the difference between actual pain and the appearance of pain or mere negative reinforcement, at the level of bits or atoms, if we were specifying the utility function of a potentially super-intelligent AI? (If we cannot even clearly define what seems to be one of the simplest values, then the approach of trying to manually specify such a utility function would appear completely hopeless.)

One idea to try to understand the nature of pain is to sample the space of possible minds, look for those that seem to be feeling pain, and check if the underlying computations have anything in common. But as in the above thought experiment, there are minds that can convincingly simulate the appearance of pain without really feeling it.

Another idea is that perhaps what is bad about pain is that it is a strong negative reinforcement as experienced by a conscious mind. This would be compatible with the thought experiment above, since (intuitively) ECPs 1, 2, and 4 are not conscious, and 3 does not experience strong negative reinforcements. Unfortunately it also implies that fully defining pain as a moral disvalue is at least as hard as the problem of consciousness, so this line of investigation seems to be at an immediate impasse, at least for the moment. (But does anyone see an argument that this is clearly not the right approach?)

What other approaches might work, hopefully without running into one or more problems already known to be hard?