First consider the following question, to make sure we're on the same page about moral reasoning: social consequences aside, is it morally correct to kill one person to create a million people who would not have otherwise existed? Let's suppose these people are whisked into existence on a spaceship travelling away from Earth at nearly light speed, and that they live healthy, happy lives but eventually die.

I'd argue that anyone who adheres to "shut up and multiply" (i.e. total utilitarianism) has to say yes. Is it better to create one such person than to donate $200 to Oxfam? Is one existing life worth more than a $200 million donation to Oxfam? The answers seem pretty clearly to be "yes" and "no"; chaining them together, a million new people are worth more than $200 million in donations, which in turn is worth more than the one life that would be taken.

Now, suppose we have a newly created superintelligent FAI that's planning how to fill the universe with human value. Should it first record everyone's brain, thus saving them, or should it do whatever it takes to expand as quickly as possible? It's hard to estimate how much getting everyone's brain recorded would slow things down, but it's certainly some sort of constraint; depending on the power of the FAI, my guess is somewhere between a second and a few hours. If the FAI is going to be filling the universe with computronium simulating happy, fulfilled humans at extremely high speeds, that's a big deal! A second's delay across the future light-cone of Earth could easily add up to more than the value of every currently living human's life. It may sound bad to kill everyone on Earth just to save a second (or to scan only a few thousand people for "research"), but that's only because of scope insensitivity. If only we understood just how good saving that second would be, maybe we would all agree that it is not only right but downright heroic to do so!
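
To make the scope of that claim concrete, here is a minimal back-of-envelope sketch. Every number in it (the simulation rate, the remaining lifespans, the length of the delay) is a hypothetical assumption chosen only to show the shape of the argument, not an estimate defended anywhere in this post.

```python
# Back-of-envelope sketch of the "one second's delay" claim.
# All figures below are made-up assumptions for illustration only.
# Approximation: treat the loss from a one-second delay as roughly
# one second of output at the eventual saturated rate.

current_humans = 8e9                 # rough number of people alive today
avg_remaining_years = 50.0           # assumed average remaining lifespan
value_of_current_lives = current_humans * avg_remaining_years   # ~4e11 life-years

sim_life_years_per_second = 1e15     # assumed output of a computronium-filled
                                     # light-cone, in happy life-years per second
delay_seconds = 1.0                  # assumed time spent scanning everyone's brain

cost_of_delay = sim_life_years_per_second * delay_seconds
print(cost_of_delay / value_of_current_lives)  # ~2500x under these assumptions
```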

A related scenario: an FAI that we are very, very sure correctly implements CEV sets up a universe in which everyone gets 20 years to live, starting from an adult transhuman state. It turns out that longer and longer life spans yield diminishing returns in value, and this is the best use of the available computational power. The transhumans have been modified not to have any anxiety or fear about death, and they agree this is the best way to do things. Their human ancestors' desire for immortality is viewed as deeply wrong, even barbaric. In short, all signs point to this really being the coherent extrapolated volition of humanity.

Setting aside opinions on whether either of these scenarios is plausible, I'd like to hear reactions to them as thought experiments. Is this a problem for total utilitarianism or for CEV? Is it an argument for "grabbing the banana" as a species and, if necessary, knowingly making an AI that does something other than the morally correct thing? Anyone care to bite the bullet?


To the extent your question is, "Suppose X is the correct answer. Is X the correct answer?", X is the correct answer. Outside of that supposition it probably isn't.

I don't think that's what I'm asking. Here's an analogy. A person X comes to the conclusion fairly late in life that the morally best thing they can think of to do is to kill themselves in a way that looks like an accident and will their sizable life insurance policy to charity. This conclusion isn't a reductio ad absurdum of X's moral philosophy, even if X doesn't like it. Regardless of this particular example, it could presumably be correct for a person to sacrifice themselves in a way that doesn't feel heroic, isn't socially accepted, and doesn't save the whole world but maybe only a few far-away people. I think most people in such a situation (who managed not to rationalize the dilemma away) would probably not do it.

So I'm trying to envision the same situation for humanity as a whole. Is there any situation that humanity could face that would make us collectively say "Yeah doing Y is right, even though it seems bad for us. But the sacrifice is too great, we aren't going to do it". That is, if there's room for space between "considered morality" and "desires" for an individual, is there room for space between them for a species?

Is there any situation that humanity could face that would make us collectively say "Yeah doing Y is right, even though it seems bad for us. But the sacrifice is too great, we aren't going to do it"

This is still probably not the question that you want to ask. Humans do incorrect things all the time, with excellent rationalizations, so "But the sacrifice is too great, we aren't going to do it" is not a particularly interesting specimen. To the extent that you think that "But the sacrifice is too great" is a relevant argument, you think that "Yeah doing Y is right" is potentially mistaken.

I guess the motivation for this post is in asking whether it is actually possible for a conclusion like that to be correct. I expect it might be, mainly because humans are not particularly optimized thingies, so it might be more valuable to use the atoms to make something else that's not significantly related to the individual humans. But again to emphasize the consequentialist issue: to the extent such judgment is correct, it's incorrect to oppose it; and to the extent it's correct to oppose it, the judgment is incorrect.

"But the sacrifice is too great" is a relevant argument, you think that "Yeah doing Y is right" is potentially mistaken.

I think I disagree with this. On a social and political level, the tendency to rationalize is so pervasive that it would sound completely absurd to say "I agree that it would be morally correct to implement your policy, but I advocate not doing it, because it will only help future generations, and screw those guys." In practice, when people try to motivate each other in the political sphere to do something, it is always accompanied by the claim that doing that thing is morally right. But it is in principle possible to try to get people not to do something by arguing "hey, this is really bad for us!" without arguing against its moral rightness. This thought experiment is a case where exactly this "let's grab the banana" position is supposed to be tempting.

People aren't motivated by morality alone -- people aren't required to do what they recognize to be morally correct.

e.g. a parent may choose their kid's life over the lives of a hundred other children, because they care more about their own child -- not because they think it's the morally correct thing to do.

Our moral sense is only one of the many things that motivate us.

Our moral sense is only one of the many things that motivate us.

I'm talking about extrapolated morality, which is not the same thing as moral sense (i.e. judgments accessible at the human level without doing much more computation). This extrapolated morality determines what should motivate you, but of course it's not what does motivate you, and neither is the non-extrapolated moral sense. In this sense it's incorrect to oppose extrapolated morality (you shouldn't do it), but you are in actuality motivated by other things, so you'll probably act incorrectly (in this sense).

Could you please point me in the direction of some discussion about 'extrapolated morality' (unless you mean CEV, in which case there's no need)?

CEV for individuals is vaguely analogous to what I'm referring to, but I don't know in any detail what I mean.

It's basically CEV for individuals, yeah.

In what sense 'should' individuals be motivated by their CEV rather than by their non-CEV preferences? Wouldn't breaking down the word 'should' in that previous sentence give you "Individuals want to achieve a state whereby they want to achieve what a perfect version of themselves would want to achieve rather than what they want to achieve"? Isn't that vaguely self-defeating?

It's more a useful definition of "should" than advice using a preexisting meaning for "should".

Congratulations, you have successfully answered the title!

Now, on to the actual post ...

[anonymous]

Why is this so heavily downvoted?

OP screwed up terminology and local conventions (vladimir_nesov is spot on), but the question seems serious. I can't think of a solution.

I don't want to die, but it might be that I have to die to save a marginal second, which turns out to be valuable to future humanity. You sunk my battleship.

IMO, future humanity can shove the marginal second up their ass for all I care; I want to live.

it might be that I have to die to save a marginal second, which turns out to be valuable to future humanity

Not "to future humanity", but instead according to your own extrapolated values. There might be no future humanity in this hypothetical (say, FAI determines that it's better for there to be no people). The interesting form of this paradox is where you get a stark conflict between your own extrapolated values and your intuitive perception of what you value.

Why is this so heavily downvoted?

I believe the criterion is "I want to see fewer things like this, so I'll downvote."

I assume Nyan was aware of this, but it raises the question: why did so many people decide they wanted to see fewer things like this?

For the lazy, the punchline:

And if an external objective morality does say that the universe should occupy some horrifying state... let's not even ask what you're going to do about that. No, instead I ask: What would you have wished for the external objective morality to be instead? What's the best news you could have gotten, reading that stone tablet?

Go ahead. Indulge your fantasy. Would you want the stone tablet to say people should die of old age, or that people should live as long as they wanted? If you could write the stone tablet yourself, what would it say?

Maybe you should just do that?

I mean... if an external objective morality tells you to kill people, why should you even listen?

No. Humans are not perfect moral reasoners; if he booted up a Friendly superintelligence and it mentioned black people, should a racist conclude he screwed up?

It's unlikely, to my mind, that the arguments presented by the OP are correct, but they are not trivially false. There are people who already espouse similar views; hell, there are people who believe we should all commit suicide now, to spare future generations the pain of living. To simply say "I don't want to do that" is a fully general counterargument, and if it were properly integrated you would be immune to any moral argument.

(You would also have to conclude that everyone but you has a wildly different utility function, which would rather defeat the purpose of CEV.)

Deep in the implementation of my utility function, I found this heuristic: if you divide the set of people into reference classes via any natural classification scheme, no such reference class should act like a utility monster to any other reference class.

What if some "people" can have unbounded utility functions while other "people" cannot? Do you think it's still appropriate to apply that heuristic?

[anonymous]

Are you trying to compare utility functions between people?

Well, if one "person" is a transhuman with the ability to experience more things in a minute than a present day human can in a lifetime, without the present day human's functional limits on such things as how many pleasure-associated chemicals can be circulating through its system at a time, or how much it can care about a whole lot of things at once relative to a few things, then those are probably going to result in some highly relevant differences in their utility functions.

We may not be able to compare utility functions between humans particularly well, but we can compare the utility functions of humans to, say, jellyfish, and say that along important metrics, jellyfish can scarcely be said to have utility functions by comparison.

If it's possible for a being to be to a human what a human is to a jellyfish in that respect, I think I'd rather be that sort of being.

Jack

The following is going to use a really crude ontology, but I think my point will survive it. Model the part of our brains that tells us what is moral as simply a subroutine of a larger utility pseudo-function that determines what we do (which also includes, say, selfish pleasures). It seems both possible and plausible that the "morality part" will return a value for some action that the larger function does not. And of course, this happens all the time. Most people have felt like they failed to live up to their own values. If you program an AI to do what a subroutine of your utility function says, then it is totally possible for the AI to ask you to do something you really don't want to do.
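
As a minimal sketch of that ontology (all the action names and numbers below are made up purely for illustration): an agent built to maximize only the morality subroutine can pick an action that the full preference function ranks below another option.

```python
# Toy model of morality as one subroutine of a larger preference function.
# Actions and values are hypothetical, chosen only to illustrate the gap.

def moral_value(action):
    # the "morality part": what the subroutine says is right
    return {"sacrifice_self": 10.0, "live_comfortably": 2.0}[action]

def selfish_value(action):
    # everything else that actually moves us
    return {"sacrifice_self": -50.0, "live_comfortably": 5.0}[action]

def total_preference(action):
    # the larger pseudo-function that determines what we in fact do
    return moral_value(action) + selfish_value(action)

actions = ["sacrifice_self", "live_comfortably"]

ai_request = max(actions, key=moral_value)        # "sacrifice_self"
what_we_do = max(actions, key=total_preference)   # "live_comfortably"
print(ai_request, what_we_do)                     # the AI asks for something we won't do
```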

It's a problem for total utilitarianism insofar as it is yet another bullet to bite, though this is really just a new description of the utility monster bullet. I'm not a total utilitarian and I find such bullets too bitter. But the mere existence of a gap between "what we should do" and "what we want to do" doesn't seem very important.

This is yet another poorly phrased, factually inaccurate post containing some unorthodox viewpoints that are unlikely to be taken seriously because people around here are vastly better at deconstructing others' arguments than fixing them for them.

Ignoring any formal and otherwise irrelevant errors, such as what utilitarianism actually is, I'll try to address the crucial questions, both to make Bundle_Gerbe's viewpoints more accessible to LW members and to make it clearer to him why they're not as obvious as he seems to think.

1: How does creating new life compare to preserving existing life in terms of utility or value?

Bundle_Gerbe seems to be of the view that they are of identical value. That's not a view I share, mostly because I don't assign any value to the creation of new life, but I must admit that I am somewhat confused (or undecided) about the value of existing human life, both in general and as a function of parameters such as remaining life expectancy. Maybe there's some kind of LW consensus I'm not aware of, but the whole issue seems like a matter of axioms to me rather than anything that could objectively be inferred from some sort of basic truth.

2: If creation of life has some positive value, does this value increase if creation happens earlier?

Not a question relevant to me, but it seems that this would partly depend on whether earlier creation implied a higher total number of lives, or just earlier saturation, for example because humans live forever and ultimately the only constraint is space. I'm not entirely certain I correctly understand Bundle_Gerbe's position on this, but it seems that his utility function is actually based on total lifetime rather than the number of human lives, meaning that two humans existing for one second each would be equivalent to one human existing for two seconds. That's kind of an interesting approach with lots of implied questions, such as whether travelling at high speeds would reduce value because of relativistic effects. (A toy sketch contrasting a total-lifetime rule with a per-person diminishing-returns rule follows after point 3.)

3: Is sacrificing personal lifetime to increase total humanity lifetime value a good idea?

If your utility function is based on total humanity lifetime value, and you're completely altruistic, sure. Most people don't seem to be all that altruistic, though. If I had to choose between saving one or two human beings, I would choose the latter option, but I'd never sacrifice myself to save a measly two humans. I would be very surprised if CEV turned out to require my death after 20 years, and in fact I would immediately reclassify the FAI in question as UFAI. Sounds like an interesting setup for an SF story, though.
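
Here is the toy sketch promised above, contrasting a pure total-lifetime aggregation with a per-person diminishing-returns one. The square-root shape and the specific numbers are assumptions for illustration, not anything claimed in the post; the point is only that diminishing returns on individual lifespan favor many shorter lives over a few long ones, which is what makes the 20-year scenario conceivable.

```python
import math

# Two toy aggregation rules over a fixed budget of life-years.
# The functional forms and numbers are illustrative assumptions only.

def u_total_lifetime(lifespans):
    # two people living 1 year each == one person living 2 years
    return sum(lifespans)

def u_diminishing(lifespans):
    # per-person diminishing returns to longer lives (sqrt as a stand-in)
    return sum(math.sqrt(t) for t in lifespans)

budget = 1000.0                      # life-years of computation to allocate
few_long = [budget / 10] * 10        # 10 people, 100 years each
many_short = [budget / 50] * 50      # 50 people, 20 years each

print(u_total_lifetime(few_long), u_total_lifetime(many_short))  # 1000.0 vs 1000.0
print(u_diminishing(few_long), u_diminishing(many_short))        # 100.0 vs ~223.6
```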

For what it's worth, I upvoted the post. Not because the case was particularly well presented, obviously, but because I think it's not completely uninteresting, and because I perceived some of the comments that got quite a few upvotes, such as Vladimir_Nesov's, as rather unfair.

That being said, the title is badly phrased and not very relevant either.

Thanks for this response. One comment about one of your main points: I agree that the tradeoff of number of humans vs. length of life is ambiguous. But to the extent our utility function favors numbers of people over total life span, that makes the second scenario more plausible, whereas if total life span is more important, the first is more plausible.

I agree with you that both the scenarios would be totally unacceptable to me personally, because of my limited altruism. I would badly want to stop it from happening, and I would oppose creating any AI that did it. But I disagree in that I can't say that any such AI is unfriendly or "evil". Maybe if I was less egoistic, and had a better capacity to understand the consequences, I really would feel the sacrifice was worth it.

If you would oppose an AI attempting to enforce a CEV that would be detrimental to you, but still classify it as FAI and not evil, then wouldn't that make you evil?

Obviously this is a matter of definitions, but it still seems to be the logical conclusion.

If your utility function is based on total humanity lifetime value, and you're completely altruistic, sure. Most people don't seem to be all that altruistic, though. If I had to choose between saving one or two human beings, I would choose the latter option, but I'd never sacrifice myself to save a measly two humans.

That seems like a bias/heuristic; people are known to be biased in favor of themselves, and there is instrumental value in more life to help people with.

That's not bias, it's subjective morals.

social consequences aside, is it morally correct to kill one person to create a million people who would not have otherwise existed?

How would a world in which it is morally correct to kill one person in order to create a million people look different than a world in which this is not the case?

Friendly AIs would behave differently, for one thing.

You may have to be a bit more specific. What in the FAI's code would look different between world 1 and world 2?

Define "moral" as referring to human ethics, whatever those may be. Define "Friendly" as meaning "does the best possible thing according to human ethics, whatever those may be." Define AI as superintelligence. Any Friendly AI, by these definitions, would behave differently depending on whether X is "moral".

Does that answer your question?

I don't assign any negative utility to ending a life. It's not like it's something you can experience. I suppose it might be, since all your experiences are about a change in brain state, but still, it lasts an instant. It can't be that bad. As such, of course I'd want an AI that would kill us all.

In general, I would consider it odd for someone to find killing that bad. At that point, it would probably be better to just design an AI to destroy the world, because if you don't, a large number of people will be born and die in the intervening time while waiting for FAI.

I don't assign any negative utility to ending a life.

Um ... most people don't want to die. That in and of itself would seem to suggest you may possibly have gone wrong somewhere in this line of reasoning.

More generally, are you a strict hedonic utilitarian? Because I can see how focusing solely on pleasure etc. could lead to that conclusion, but I think most LWers are closer to preference utilitarians.

That in and of itself would seem to suggest you may possibly have gone wrong somewhere in this line of reasoning.

It would suggest it, but it's not that strong evidence. Most people are okay with factory farming. Most people put little value on things they don't consider themselves responsible for.

More generally, are you a strict hedonic utilitarian?

I am but I could understand valuing anything you can experience. Valuing things that can't be experienced just seems silly. Would you value ice cream independent of your ability to taste it?

It would suggest it, but it's not that strong evidence. Most people are okay with factory farming. Most people put little value on things they don't consider themselves responsible for.

OTOH, in my experience at least, people become a lot less biased when it comes to themselves. Few people would want to be factory farmed ;)

Would you value ice cream independent of your ability to taste it?

Personally? No. But I can imagine a paperclipper that would gladly sacrifice its life to save the paperclip collection.

OTOH, in my experience at least, people become a lot less biased when it comes to themselves. Few people would want to be factory farmed ;)

Do they, or does their bias just change?

In my experience, people value themselves vastly more than they value other people. Ergo, if you replace them with someone else, they consider it a huge loss in utility.

It's possible that rationality tends to bias you away from your natural impassiveness to large groups and other "far" situations. There's a post on this, "Shut up and divide". But there do seem to be genuine biases leading to underestimates of their suffering, not just knowing about it and not caring.

[anonymous]

I don't assign any negative utility to ending a life. It's not like it's something you can experience.

What? Are you talking about decision theoretic utility or hedonic utility? You can't experience any decision-utility, and you can't make decisions based on hedonic utility (directly). What are you trying to do?

I don't assign decision theoretic utility to it. It doesn't seem like something could actually matter unless someone experiences it.