"What's the worst that can happen?" goes the optimistic saying. It's probably a bad question to ask anyone with a creative imagination. Let's consider the problem on an individual level: it's not really the worst that can happen, but would nonetheless be fairly bad, if you were horribly tortured for a number of years. This is one of the worse things that can realistically happen to one person in today's world.
What's the least bad, bad thing that can happen? Well, suppose a dust speck floated into your eye and irritated it just a little, for a fraction of a second, barely enough to make you notice before you blink and wipe away the dust speck.
For our next ingredient, we need a large number. Let's use 3^^^3, written in Knuth's up-arrow notation:
- 3^3 = 27.
- 3^^3 = (3^(3^3)) = 3^27 = 7625597484987.
- 3^^^3 = (3^^(3^^3)) = 3^^7625597484987 = (3^(3^(3^(... 7625597484987 times ...)))).
3^^^3 is an exponential tower of 3s which is 7,625,597,484,987 layers tall. You start with 1; raise 3 to the power of 1 to get 3; raise 3 to the power of 3 to get 27; raise 3 to the power of 27 to get 7625597484987; raise 3 to the power of 7625597484987 to get a number much larger than the number of atoms in the universe, but which could still be written down in base 10, on 100 square kilometers of paper; then raise 3 to that power; and continue until you've exponentiated 7625597484987 times. That's 3^^^3. It's the smallest simple inconceivably huge number I know.
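For readers who want to check the small cases by hand, here is a minimal Python sketch of the recursion described above. The function names (`up_arrow`, `tower_prefix`, `digits_of_power`) are illustrative assumptions, not standard library calls, and the code is only usable for tiny inputs: 3^^^3 itself is far beyond anything a computer could evaluate.

```python
import math


def up_arrow(a: int, n: int, b: int) -> int:
    """Knuth's up-arrow: a followed by n arrows, then b. Tiny inputs only."""
    if n == 1:
        return a ** b          # one arrow is ordinary exponentiation
    if b == 0:
        return 1               # conventional base case for n >= 2
    # a ^(n) b = a ^(n-1) (a ^(n) (b - 1))
    return up_arrow(a, n - 1, up_arrow(a, n, b - 1))


def tower_prefix(base: int, layers: int) -> list[int]:
    """First few layers of the tower: start at 1, repeatedly raise base to it."""
    value = 1
    steps = []
    for _ in range(layers):
        value = base ** value
        steps.append(value)
    return steps


def digits_of_power(base: int, exponent: int) -> int:
    """Approximate decimal digit count of base ** exponent, via logarithms."""
    return math.floor(exponent * math.log10(base)) + 1


if __name__ == "__main__":
    print(up_arrow(3, 1, 3))       # 3^3
    print(up_arrow(3, 2, 3))       # 3^^3
    print(up_arrow(3, 3, 2))       # 3^^^2, which equals 3^^3
    print(tower_prefix(3, 3))      # the first three layers of the tower
    # Digit count of the fourth layer, estimated without computing it.
    print(digits_of_power(3, 7625597484987))
```

The fourth layer already has roughly 3.6 trillion decimal digits, so the sketch only estimates its length instead of computing it. 3^^^3 continues the same process for 7,625,597,484,987 layers.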
Now here's the moral dilemma. If neither event is going to happen to you personally, but you still had to choose one or the other:
Would you prefer that one person be horribly tortured for fifty years without hope or rest, or that 3^^^3 people get dust specks in their eyes?
I think the answer is obvious. How about you?
I wonder if some people's aversion to "just answering the question", which Eliezer notes many times in the comments, has to do with the perceived cost of signalling agreement with the premises.
It's straightforward to me that answering should take the question at face value: it's a thought experiment, and you're not being asked to commit to a course of action. And going by the question as asked, the answer for any utilitarian is "torture", since even a very small increment of suffering multiplied by a large enough number of people (or an infinite number) will outweigh a great amount of suffering by one person.
Signalling that would be highly problematic for some people because of what might be read into the answer. Does Eliezer expect that signalling assent here means signalling assent to other, as-yet-unknown conclusions he has drawn about whatever issue bears some resemblance to this one? Does Eliezer intend to codify the terms of this premise into the basis for a decision theory underlying the cognitive architecture of a putative Friendly AI? Does Eliezer think, in short, that the real world maps to his gedankenexperiment sufficiently well that the terms of this scenario can meaningfully stand in for decisions made in that domain by real actors (human or otherwise)?
For my own part I'd be very, very hesitant to signal any of that. Hence I find it difficult to answer the question as asked. It's analogous to my discomfort with the Ticking Time Bomb scenario: by a straight reading of the premise, you should accept the act of torturing the person who planted the bomb in exchange for a finite chance of finding and disabling it, thereby saving a million lives. The logic is internally consistent, but it doesn't map to any real-world situation I can plausibly imagine. In the real world torture is not terribly effective at eliciting confessions, and the scenario of a "ticking time bomb with a single suspect unwilling to talk mere minutes beforehand" has AFAIK never happened as presented and would be extremely difficult to set up.
I recognize the internal consistency, yet I'm troubled by my uncertainty about what the author thinks I'm signing up for when I reply.