Upgrading moral theories to include complex values

by Ghatanathoah | 5 min read | 27th Mar 2013 | 21 comments

Personal Blog

Like many members of this community, reading the sequences has opened my eyes to a heavily neglected aspect of morality.  Before reading the sequences I focused mostly on how to best improve people's wellbeing in the present and the future.  However, after reading the sequences, I realized that I had neglected a very important question:  In the future we will be able to create creatures with virtually any utility function imaginable. What sort of values should we give the creatures of the future?  What sort of desires should they have, from what should they gain wellbeing?

Anyone familiar with the sequences should be familiar with the answer.  We should create creatures with the complex values that human beings possess (call them "humane values").  We should avoid creating creatures with simple values that only desire to maximize one thing, like paperclips or pleasure. 

It is important that future theories of ethics formalize this insight.  I think we all know what would happen if we programmed an AI with conventional utilitarianism:  It would exterminate the human race and replace them with creatures whose preferences are easier to satisfy (if you program it with preference utilitarianism) or creatures whom it is easier to make happy (if you program it with hedonic utilitarianism).  It is important to develop a theory of ethics that avoids this.

Lately I have been trying to develop a modified utilitarian theory that formalizes this insight.  My focus has been on population ethics.  I am essentially arguing that population ethics should not just focus on maximizing welfare; it should also focus on what sort of creatures it is best to create.  According to this theory of ethics, it is possible for a population with a lower total level of welfare to be better than a population with a higher total level of welfare, if the lower-welfare population consists of creatures that have complex humane values, while the higher-welfare population consists of paperclip or pleasure maximizers. (I wrote a previous post on this, but it was long and rambling; I am trying to make this one more accessible.)

One of the key aspects of this theory is that it does not necessarily rate the welfare of creatures with simple values as unimportant.  On the contrary, it considers it good for their welfare to be increased and bad for their welfare to be decreased.  Because of this, it implies that we ought to avoid creating such creatures in the first place, so it is not necessary to divert resources from creatures with humane values in order to increase their welfare. 

My theory does allow the creation of simple-value creatures for two reasons. One is if the benefits they generate for creatures with humane values outweigh the harms generated when humane-value creatures must divert resources to improving their welfare (companion animals are an obvious example of this).  The second is if creatures with humane values are about to go extinct, and the only choices are replacing them with simple value creatures, or replacing them with nothing.

So far I am satisfied with the development of this theory.  However, I have hit one major snag, and would love it if someone else could help me with it.  The snag is formulated like this:

1. It is better to create a small population of creatures with complex humane values (that has positive welfare) than a large population of animals that can only experience pleasure or pain, even if the large population of animals has a greater total amount of positive welfare.  For instance, it is better to create a population of humans with 50 total welfare than a population of animals with 100 total welfare.

2. It is bad to create a small population of creatures with humane values (that has positive welfare) and a large population of animals that are in pain.  For instance, it is bad to create a population of animals with -75 total welfare, even if doing so allows you to create a population of humans with 50 total welfare.

3.  However, if creating human beings wasn't an option, it seems like it might be okay to create a very large population of animals, the majority of which have positive welfare, but some of which are in pain.  For instance, it seems like it would be good to create a population of animals where one section of the population has 100 total welfare, and another section has -75, since the total welfare is 25.

The problem is that this leads to what seems like a circular preference.  If the population of animals with 100 welfare existed by itself it would be okay to not create it in order to create a population of humans with 50 welfare instead.  But if the population we are talking about is the one in (3) then doing that would result in the population discussed in (2), which is bad.

My current solution to this dilemma is to include a stipulation that a population with negative utility can never be better than one with positive utility.  This prevents me from having circular preferences about these scenarios.  But it might create some weird problems.  If population (2) is created anyway, and the humans in it are unable to help the suffering animals in any way, does that mean they have a duty to create lots of happy animals to get their population's utility up to a positive level?  That seems strange, especially since creating the new happy animals won't help the suffering ones in any way.  On the other hand, if the humans are able to help the suffering animals, and they do so by means of some sort of utility transfer, then it would be in their best interests to create lots of happy animals, to reduce the amount of utility each person has to transfer.
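To make the cycle concrete, here is a minimal Python sketch of the three judgments and of the stipulation. It is only an illustration under stated assumptions: each population is reduced to its total welfare, "bad to create" and "good to create" are read as comparisons against creating nothing (total 0), and all names in the code are hypothetical.

```python
# Each option is summarized by its total welfare only (numbers from the post).
EMPTY = "create nothing"                           # total 0 (assumed baseline)
ANIMALS_MIXED = "animals(+100) + animals(-75)"     # scenario (3), total 25
HUMANS_PLUS_PAIN = "humans(+50) + animals(-75)"    # scenario (2), total -25

TOTAL = {EMPTY: 0, ANIMALS_MIXED: 25, HUMANS_PLUS_PAIN: -25}

# Pairwise judgments written as (better, worse).
JUDGMENTS = [
    (ANIMALS_MIXED, EMPTY),             # (3): good to create
    (EMPTY, HUMANS_PLUS_PAIN),          # (2): bad to create
    (HUMANS_PLUS_PAIN, ANIMALS_MIXED),  # (1): swap the +100 animals for +50 humans
]


def has_cycle(pairs):
    """Return True if the 'better than' relation contains a cycle."""
    graph = {}
    for better, worse in pairs:
        graph.setdefault(better, []).append(worse)

    def visit(node, path):
        if node in path:
            return True
        return any(visit(nxt, path | {node}) for nxt in graph.get(node, []))

    return any(visit(start, frozenset()) for start in graph)


def apply_stipulation(pairs):
    """Drop any judgment that ranks a negative-total option above a positive-total one."""
    return [(b, w) for b, w in pairs if not (TOTAL[b] < 0 and TOTAL[w] > 0)]


print(has_cycle(JUDGMENTS))                     # True
print(has_cycle(apply_stipulation(JUDGMENTS)))  # False
```

The stipulation removes exactly one edge, the one that applies judgment (1) to the population in (3), which is why the cycle disappears.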

So far some of the solutions I am considering include:

1. Instead of focusing on population ethics, just consider complex humane values to have greater weight in utility calculations than pleasure or paperclips.  I find this idea distasteful because it implies it would be acceptable to inflict large harms on animals for relatively small gains for humans.  In addition, if the weight is not sufficiently great it could still lead to an AI exterminating the human race and replacing them with happy animals, since animals are easier to take care of and make happy than humans.

2. It is bad to create the human population in (2) if the only way to do so is to create a huge amount of suffering animals.  But once both populations have been created, if the human population is unable to help the animal population, they have no duty to create as many happy animals as they can.  This is because the two populations are not causally connected, and that is somehow morally significant. This makes some sense to me, as I don't think the existence of causally disconnected populations in the vast universe should bear any significance on my decision-making.

3. There is some sort of overriding consideration besides utility that makes (3) seem desirable.  For instance, it might be bad for creatures with any sort of values to go extinct, so it is good to create a population to prevent this, as long as its utility is positive on the net.  However, this would change in a situation where utility is negative, such as in (2).

4. Reasons to create a creature have some kind of complex rock-paper-scissors-type "trumping" hierarchy.  In other words, the fact that the humans have humane values can override the reasons to create happy animals, but it cannot override the reasons to not create suffering animals.  The reasons to create happy animals, however, can override the reasons to not create suffering animals.  I think that this argument might lead to inconsistent preferences again, but I'm not sure.

I find none of these solutions fully satisfying.  I would really appreciate it if someone could help me with solving this dilemma.  I'm very hopeful about this ethical theory, and would like to see it improved.

 

*Update.  After considering the issue some more, I realized that my dissatisfaction came from conflating two different scenarios.  I was considering the scenario "Animals with 100 utility and animals with -75 utility are created, no humans are created at all" to be the same as the scenario "Humans with 50 utility and animals with -75 utility are created, then the humans (before they get to experience their 50 utility) are killed/harmed in order to create more animals without helping the suffering animals in any way."  They are clearly not.

To make the analogy more obvious, imagine I was given a choice between creating a person who would experience 95 utility over the course of their life, or a person who would experience 100 utility over the course of their life.  I would choose the person with 100 utility.  But if the person destined to experience 95 utility already existed, but had not experienced the majority of that utility yet, I would oppose killing them and replacing them with the 100 utility person.

Or to put it more succinctly, I am willing to not create some happy humans to prevent some suffering animals from being created.  And if the suffering animals and happy humans already exist I am willing to harm the happy humans to help the suffering animals.  But if the suffering animals and happy humans already exist I am not willing to harm the happy humans to create some extra happy animals that will not help the existing suffering animals in any way.


It is bad to create a small population of creatures with humane values (that has positive welfare) and a large population of animals that are in pain. For instance, it is bad to create a population of animals with -75 total welfare, even if doing so allows you to create a population of humans with 50 total welfare.

Why do you believe this? I don't. Due to wild animal suffering, this proposition implies that it would have been better if no life had appeared on Earth, assuming average human/animal welfare and the human/animal ratio don't dramatically change in the future.

I do expect it to change in the far future as the human race (barring some extinction event) expands into space.

I am also a little skeptical of one of the author's premises. I would not give up a significant portion of my lifespan (probably less than a week at most) to avoid a painful, but relatively brief death. I am concerned about the suffering wild animals feel in their day-to-day life, but I don't think any painful deaths they experience are as significant as the author implies. I don't know enough about how frequent predator encounters, starvation and other such things are among animals to say whether the average of their day-to-day life is mostly pain. I'm guessing it's closer to neutral, but I can't be sure.

I have also read some studies that suggest fear may be much more harmful than pain to animals, I have no idea what that implies.

Then there's this, although I wouldn't take it seriously at all, and neither does the author.

Another weird idea I don't think anyone has considered before, what about the wants of animals, are they significant at all? It's well known that humans can want things that do not give them pleasure (i.e. not wanting to be told a comforting lie). It seems like that is true of animals as well. If I knock out the part of a rat's brain that likes food, and it still tries to get food (because it wants it) am I morally obligated to give it food?

Generally when I want things I don't enjoy I can divide those wants into ego-syntonic wants that I consider part of my "true self" (i.e. wanting to be told the truth, even if it's upsetting) versus ego-dystonic wants that I consider an encroachment on my true self that I want to eliminate (like wanting to eat yet another potato chip). Since animals are not sapient, and so lack any reflective "true self," does that mean none of their wants matter, or all of them? If an animal gets what it wants does that make up for pain it has experienced, or not?

Still, you make a good point, maybe I should revise my opinion on this.

Top-level posts don't use the Markdown that comments use; they're formatted in HTML, so you'll have to reformat those links.

Thanks so much! I can't believe I forgot about that. Fixed.

One of the key aspects of this theory is that it does not necessarily rate the welfare of creatures with simple values as unimportant. On the contrary, it considers it good for their welfare to be increased and bad for their welfare to be decreased. Because of this, it implies that we ought to avoid creating such creatures in the first place, so it is not necessary to divert resources from creatures with humane values in order to increase their welfare.

If you assign any positive utility at all, no matter how small, to creating happy low-complexity life, you end up having to create lots of happy viruses (they are happy when they can replicate). If you put a no-value threshold somewhere below humans, you run into other issues, like "it's OK to torture a cat".

Allowing yourself to be swayed by thought experiments of the form "If you value x at all, then if offered enough x, you would give up y, and you value one y so much more than one x!" is a recipe for letting scope insensitivity rewrite your utility function.

Be careful about making an emotional judgement on numbers bigger than you can intuitively grasp. If possible, scale down the numbers involved (Instead of 1 million x vs 1 thousand y, imagine 1 thousand x vs 1 y) before imagining each alternative as viscerally as possible.

In my own experience, not estimating a number, just thinking "Well, it's really big" basically guarantees I will not have the proper emotional response.

You would only create these viruses if the total utility of the viruses you can create with the resources at your disposal exceeds the utility of the humans you could make with these same resources. For instance, if you give a utility of 1 to a steel paperclip weighing 1 gram, then assuming a simple additive model (which I wouldn't, but that's beside the point) making one metric ton of paperclips has a utility of 1,000,000. If you give a utility of 1,000,000,000 to a steel sculpture weighing a ton, it follows that you will never make any paperclips unless you have less than a ton of iron. You will always make the sculpture, because it gives 1,000 times the utility for the exact same resources.
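The arithmetic in this comment can be checked with a short Python sketch. It assumes the simple additive model the comment itself sets aside, and the function and constant names are hypothetical.

```python
GRAMS_PER_METRIC_TON = 1_000_000

UTILITY_PER_PAPERCLIP = 1              # one 1-gram steel paperclip
UTILITY_PER_SCULPTURE = 1_000_000_000  # one 1-ton steel sculpture


def best_use(steel_grams):
    """Compare turning all the steel into paperclips against making
    as many whole sculptures as fit, under a purely additive model."""
    paperclip_utility = steel_grams * UTILITY_PER_PAPERCLIP
    sculpture_utility = (steel_grams // GRAMS_PER_METRIC_TON) * UTILITY_PER_SCULPTURE
    return "sculpture" if sculpture_utility > paperclip_utility else "paperclips"


# Utility of one ton of paperclips, and the sculpture's advantage per ton.
ton_of_paperclips = GRAMS_PER_METRIC_TON * UTILITY_PER_PAPERCLIP
print(ton_of_paperclips)                           # 1000000
print(UTILITY_PER_SCULPTURE // ton_of_paperclips)  # 1000
print(best_use(1_000_000))                         # sculpture
print(best_use(999_999))                           # paperclips
```

With less than a ton of steel no sculpture fits, so paperclips are the only option; at a ton or more the sculpture wins.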

You would only create these viruses if the total utility of the viruses you can create with the resources at your disposal exceeds the utility of the humans you could make with these same resources.

True, if you start with resource constraints, you can rig the utility scaling to overweigh more intelligent life. However, if you don't cheat and assign the weights before considering constraints, there is a large chance that the balance will tip the other way. Or if there is no obvious competition for resources. If you value creating at least mildly happy life, you ought to consider working on, say, silicon-based life, which does not compete with carbon-based life. Or maybe on using all this stored carbon in the ocean to create more plankton. In other words, it is easy to find a case where preassigned utilities lead to a runaway simple life creation imperative.

I have tried to get around this by creating a two step process. When considering whether or not to create a creature the first step is asking "Does it have humane values?" The second step is asking "Will it live a good life, without excessively harming others in the process?" If the answer to either of those questions is "no," then it is not good to create such a creature.

Now, it doesn't quite end there. If the benefits to others are sufficiently large then the goodness of creating a creature that fails the process may outweigh the badness of creating it. Creating an animal without humane values may still be good if such a creature provides companionship or service to a human, and the value of that outweighs the cost of caring for it. However, once such a creature is created we have a responsibility to include it in utility calculations along with everyone else. We have to make sure it lives a good life, unless there's some other highly pressing concern that outweighs it in our utility calculations.

Now, obviously creating a person with humane values who will not live a good life may still be a good thing as well, if they invent a new vaccine or something like that.

I think this process can avoid mandating we create viruses and mice, while also preserving our intuition that torturing cats is bad.

If you assign any positive utility at all, no matter how small, to creating happy low-complexity life, you end up having to create lots of happy viruses (they are happy when they can replicate).

You are using a rather expansive definition of "happy." I consider happiness to be a certain mental process that only occurs in the brains of sufficiently complex creatures. I consider it to not be synonymous with utility, which includes both happiness, and other desires people have.

You should create creatures with exactly your own values, regardless of everyone else's, unless your values are better fulfilled by compromising them.

I don't understand why your values specifically include values of embodied others.


Hmm. I don't think those priorities are necessarily circular, but that doesn't seem to mean they are correct: I seemed to have similar priorities, but after writing all of them out in this post, I think I may need to reconsider them: They seem quite depressing. (It's also entirely possible these AREN'T similar, in which case I still need to reconsider mine, but perhaps you don't need to reconsider this.)

They seem similar to those of a selfish ruler who cares mostly about his fellow nobles, but is concerned about being too selfish in helping his fellow nobles because if there is a majority of unhappy peasants, that's revolting. I'll list 3 priorities from A-C.

Priority A: If there are peasants, I don't want there to be a majority of them who are unhappy because of nobles, because that would hurt the happiness of nobles.

Priority B: I want there to be more happy nobles.

Priority C: I want there to be more peasants who are happy as well.

Now let me consider your points:

(1) It is better to create a small population of creatures with complex humane values (that has positive welfare) than a large population of animals that can only experience pleasure or pain, even if the large population of animals has a greater total amount of positive welfare. For instance, it is better to create a population of humans with 50 total welfare than a population of animals with 100 total welfare.

1 Fits. Priority B is more important than Priority C.

(2) It is bad to create a small population of creatures with humane values (that has positive welfare) and a large population of animals that are in pain. For instance, it is bad to create a population of animals with -75 total welfare, even if doing so allows you to create a population of humans with 50 total welfare.

2 Fits. If the only way to do Priority B is to break Priority A, don't do it.

(3) However, if creating human beings wasn't an option, it seems like it might be okay to create a very large population of animals, the majority of which have positive welfare, but some of which are in pain. For instance, it seems like it would be good to create a population of animals where one section of the population has 100 total welfare, and another section has -75, since the total welfare is 25.

3 Fits. If you can do Priority C without violating Priority A, then do it.

In essence: When someone is below you in some way: (A peasant, an animal, an other) you may end up helping them not because you want them to be happy, but because you are afraid of what would happen if they are not happy enough. Even if the fear is "But if they aren't happy enough, I'd feel guilty." but you don't dislike them. If you can give them happiness at no cost to you, sure.

Now that I've typed that out, I'm fairly sure I've acted like that on several different occasions. Which really makes me think I need to reevaluate my ethics. Because I act like that, but I can't actually justify acting like that now that I've noticed myself acting like that.

Although the reason it may be hard to justify is that I have to explain to others who are somewhat distant, "Well, I care about you, but I don't really CARE care, you know? Because of X, Y, and Z you see. You understand." That's not really something that's polite to say to someone's face, so if you say in public "Yeah, I don't really care about everyone who isn't my friends and family, I just want to be sufficiently nice to all of you that I don't get hurt out of spite or guilt." That's offensive.

And what really bothers me is that if I want to be honest, I do the same things to people that I care about at least some of the time, and they do the same thing to me some of the time too.

You're right to say that treating actual people that way seems pretty unpleasant. But the examples I gave involved creating new people and animals, not differentially treating existing people and animals in some fashion.

I don't see refusing to create a new person because of a certain set of priorities as morally equivalent to disrespectfully treating existing persons because of that set of priorities. If you say "I care about you, but I don't really CARE, you know?" to a person you've decided to not create, who are you talking to? Who have you been rude to? Who have you been offensive to? Nobody, that's who.

I agree with you that we have a duty to treat people with care and respect in real life. That's precisely why I oppose creating new people in certain circumstances. Because once they're created there is no turning back, you have to show them that respect and care, and you have to be genuine about it, even if doing so messes up your other priorities. I want to make sure that my duty to others remains compatible with my other priorities. And I don't see anything wrong with creating slightly fewer people in order to do so.

Or to put it another way, it seems like I have a switch that, when flipped, makes me go from considering a person in the sort of cynical way you described to considering them a fellow person that I love and care for and respect with all my heart. And what flips that switch from "cynical" to "loving and compassionate" is an affirmative answer to the question "Does this person actually exist, or are they certain to exist in the future?" I don't see this as a moral failing. Nonexistent people don't mind if you don't love or respect them.

Although the reason it may be hard to justify is that I have to explain to others who are somewhat distant, "Well, I care about you, but I don't really CARE care, you know? Because of X, Y, and Z you see. You understand." That's not really something that's polite to say to someone's face, so if you say in public "Yeah, I don't really care about everyone who isn't my friends and family, I just want to be sufficiently nice to all of you that I don't get hurt out of spite or guilt." That's offensive.

I see it as an unfortunate fact of limited resources. Caring about the entire world in enough detail is impossible because each of us only has a few neurons for every other person on Earth. Until we can engineer ourselves for more caring we will have to be satisfied with putting most other people into classes such that we care about the class as a whole and leave individual caring up to other individuals in that class. Being sufficiently nice to any particular individual of a class is probably the most you can rationally do for them barring extenuating evidence. If they are obviously out of the norm for their class (starving, injured, in danger, etc.) and you can immediately help them significantly more efficiently than another member of the class then you should probably spend more caring on them. This avoids the bystander effect by allowing you to only care about overall accidents and injuries in the class in general but also care specifically about an individual if you happen to be in a good position to help.

Otherwise try to maximize the utility of the class as a whole with regard to your ability to efficiently affect their utility, weighted against your other classes and individuals appropriately.

This isn't really 'upgrading' moral theories by taking account of moral intuitions, but rather 'trying to patch them up to accord with a first order intuition of mine, regardless of distortions elsewhere'.

You want to avoid wireheading-like scenarios by saying (in effect) 'fulfilled humane values have lexical priority over non-humane values', so smaller populations of happy humans can trump very large populations of happy mice, even if that would be a better utilitarian deal (I'm sceptical that's how the empirical calculus plays out, but that's another issue).

This just seems wrong on its face: it seems wrong to refuse to trade one happy human for e.g. a trillion (or quadrillion, 3^^3) happy mice (or octopi, or dolphins - just how 'humane' are we aiming for?). It seems even worse to trade one happy human (although you don't specify, this could be 'barely positive welfare' human life) for torturing unbounded numbers of mice or dolphins.

You seem to find the latter problem unacceptable (although I think you should find the first really bad too), so you offer a different patch: although positive human welfare lexically dominates positive animal welfare, it doesn't dominate negative animal welfare, which can be traded off, so happy humans at the expense of vast mice-torture is out. Yet this has other problems due to weird separability violations. Adjusting the example you give for clarity:

1) 100A
2) 1H + 100A - 50A
3) 0
4) 1H - 50A

The account seems to greenlight the move from 1 to 2, as the welfare of the animal population remains positive, and although it goes down, we don't care because human welfare trumps it. However, we cannot move from 3 to 4, as the total negative animal welfare trumps the human population. This seems embarrassing, particularly as our 100 happy animal welfare could be on an adjacent supercluster, and so virtually any trade of human versus animal interests is rendered right or wrong by total welfare across the universe.

As far as I can see, the next patch you offer (never accept that a population with overall negative welfare is better than one with overall positive welfare) won't help you avoid these separability concerns: 4 is still worse than 3, as it is net negative, but 2 is net positive, so is still better than 1.
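The separability worry can be sketched in a few lines of Python. This is one possible reading of the patched rule, not the author's own formalization, and it rests on labeled assumptions: the human's welfare is set to a hypothetical 10 (the comment only needs it to be below 50), and a total of exactly 0 is treated as non-negative.

```python
HUMAN_WELFARE = 10  # hypothetical value; any number below 50 gives the same result

# Each state is (human welfare, net animal welfare), numbered as in the comment.
STATES = {
    1: (0, 100),
    2: (HUMAN_WELFARE, 50),   # 1H + 100A - 50A
    3: (0, 0),
    4: (HUMAN_WELFARE, -50),  # 1H - 50A
}


def total(state):
    humans, animals = state
    return humans + animals


def better(new, old):
    """Patched rule as read here: a net-negative population never beats a
    non-negative one; otherwise human welfare has lexical priority."""
    if total(new) < 0 <= total(old):
        return False
    if total(old) < 0 <= total(new):
        return True
    return new > old  # tuple comparison: humans first, then animals


def delta(new, old):
    """Change in (human welfare, animal welfare) between two states."""
    return (new[0] - old[0], new[1] - old[1])


print(delta(STATES[2], STATES[1]) == delta(STATES[4], STATES[3]))  # True
print(better(STATES[2], STATES[1]))  # True
print(better(STATES[4], STATES[3]))  # False
```

The same change, one extra human at a cost of 50 animal welfare, is approved in one context and rejected in the other, purely because of the unrelated 100A already present.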

You want to assert nigh-deontological 'do not do this if X' to avoid RC-like edge cases. The problem with these sorts of rules is you need to set them to strict lexical priority so they really rule out the edge case (as you rightly point out, if you, like I, take the 'human happiness has a high weight' out, you still are forced to replace humans with stoned mice so long as enough stoned mice are on the other end of the scales). Yet lexical priority seems to lead to another absurd edge case where you have to refuse even humongous trade-offs: is one happy human really worth more than a gazillion happy mice? Or indeed torturing trillions of dolphins?

So then you have to introduce other nigh-deontological trumping rules to rule out these nasty edge cases (e.g. 'you can trade any amount of positive animal welfare for any increase in human welfare, but negative welfare has a 1-1 trade with human welfare'). But then these too have nasty edge cases, and worse, you can (as deontologists find) find these rules clash to give you loss of transitivity, loss of separability, dependence on irrelevant alternatives, etc.

FWIW, I think the best fix is empirical: just take humans to be capable of much greater positive welfare states than animals, such that you have to make really big trades of humans to animals to be worth it. Although this leaves open the possibility we'd end up with little complex value, it does not seem a likely possibility (I trade humans to mice at ~ 1000000 : 1, and I don't think mice are 1000000x cheaper). It also avoids getting edge cased, and thanks to classical util cardinally ordering states of affairs, you don't violate transitivity or separability.

As a second recommendation (with pre-emptive apologies for offense: I can't find a better way of saying this), I'd recommend going back to the population ethics literature (and philosophy generally) rather than trying to reconstitute ethical theory yourself. I don't think you know enough or have developed enough philosophical skill to make good inroads into these problems, and the posts you've made so far I think are unlikely to produce anything interesting or important in the field, and are considerably inferior to academic work ongoing.

By the way, 3^^3 = 3^27 is "only" 7625597484987, which is less than a quadrillion. If you want a really big number, you should add a third arrow (or use a higher number than three).
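A small Python sketch of Knuth's up-arrow notation confirms the figure. The function name is hypothetical, and the recursion is only practical for tiny inputs: adding a third arrow to 3 and 3 is far beyond anything a computer can evaluate.

```python
def up_arrow(a, n, b):
    """Knuth's up-arrow with n arrows: a ^...^ b, for n >= 1 and b >= 0.

    One arrow is ordinary exponentiation; each extra arrow iterates
    the operation with one fewer arrow.
    """
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return up_arrow(a, n - 1, up_arrow(a, n, b - 1))


print(up_arrow(3, 2, 3))           # 7625597484987
print(up_arrow(3, 2, 3) < 10**15)  # True: less than a quadrillion
```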

1) 100A
2) 1H + 100A - 50A
3) 0
4) 1H - 50A

I wouldn't actually condone the move from 1 to 2. I would not condone inflicting huge harms on animals to create a moderately well-off human. But I would condone never creating some happy animals in the first place. Not creating is not the same as harming. The fact that TU treats not creating an animal with 50 utility to be equivalent to inflicting 50 points of disutility on an animal that has 50 utility is a strike against it. If we create an animal we have a responsibility to make it happy, if we don't we're free to make satisfied humans instead (to make things simpler I'm leaving the option of painlessly killing the animals off the table).

My argument is that existing animals and existing humans have equal moral significance (or maybe humans have somewhat more if you're actually right about humans being capable of greater levels of welfare), but when deciding which to create, human creation is superior to animal (or paperclipper or wirehead) creation.

FWIW, I think the best fix is empirical: just take humans to be capable of much greater positive welfare states than animals, such that you have to make really big trades of humans to animals to be worth it. Although this leaves open the possibility we'd end up with little complex value, it does not seem a likely possibility

I lack your faith in this fix. I consider it almost certain that if we were to create a utilitarian AI it would kill the entire human race and replace it with creatures whose preferences are easier to satisfy. And by "easier to satisfy" I mean "simpler and less ambitious," not that the creatures are more mentally and physically capable of satisfying humane desires.

In addition to this, total utilitarianism has several other problems that need fixing, the most obvious being that it considers it a morally good act to kill someone destined to live a good long life and replace them with a new person whose overall utility is slightly higher than the utility of the remaining years of that person's life would have been (I know in practice doing this is impossible without side-effects that create large dis-utilities, but that shouldn't matter, it's bad because it's bad, not because it has bad side-effects).

Quite frankly I consider the Repugnant Conclusion to be far less horrible than the "Kill and Replace" conclusion. I'd probably accept total utilitarianism as a good moral system if the RC was all it implied. The fact that TU implies something strange like the RC in an extremely unrealistic toy scenario isn't that big a deal. But that's not all TU implies. The "Kill and Replace" conclusion isn't a toy scenario, there are tons of handicapped people that could be killed and replaced with able people right now. But we don't do that, because it's wrong.

I don't spend my days cursing the fact that various side effects with large disutility prevent me from killing handicapped people and creating able people to replace them. Individual people, once they are created, matter. Once someone has been brought into existence we have a greater duty to make sure they stay alive and happy than we do to create new people. There may be some vastly huge number of happy people that it's okay to kill one slightly less happy person in order to create, but that number should be way, way, way, way bigger than 1.

As a second recommendation (with pre-emptive apologies for offense: I can't find a better way of saying this), I'd recommend going back to the population ethics literature (and philosophy generally) rather than trying to reconstitute ethical theory yourself.

I've looked at some of the literature, and I've noticed there does not seem to be much interest in the main question I am interested in, which is, "Why make humans and not something else?" Peter Singer mentions it a few times in one essay I read, but he doesn't offer any answers; he just seems to accept it as obvious. I thought it was a fallow field that I might be able to do some actual work in.

Peter Singer also has the decency to argue that human beings are not replaceable, that killing one person to replace them with a slightly happier one is bad. But I have trouble seeing how his arguments work against total utilitarianism.

I consider it almost certain that if we were to create a utilitarian AI it would kill the entire human race and replace it with creatures whose preferences are easier to satisfy. And by "easier to satisfy" I mean "simpler and less ambitious," not that the creatures are more mentally and physically capable of satisfying humane desires.

It would not necessarily kill off humanity to replace it by something else, though. Looking at the world right now, many countries run smoothly, and others horribly, even though they are all inhabited and governed by humans. Even if you made the AI "prefer" human beings, it could still evaluate that "fixing" humanity would be too slow and costly and that "rebooting" it is a much better option. That is to say, it would kill all humans, restructure the whole planet, and then repopulate the planet with human beings devoid of cultural biases, ensuring plentiful resources throughout. But the genetic makeup would stay the exact same.

Once someone has been brought into existence we have a greater duty to make sure they stay alive and happy than we do to create new people. There may be some vastly huge number of happy people that it's okay to kill one slightly less happy person in order to create, but that number should be way, way, way, way bigger than 1.

Sure. Just add the number of deaths to the utility function with an appropriate multiplier, so that world states obtained through killing get penalized. Of course, an AI who wishes to get rid of humanity in order to set up a better world unobstructed could attempt to circumvent the limitation: create an infertility epidemic to extinguish humanity within a few generations, fudge genetics to tame it (even if it is only temporary), and so forth.
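As a rough illustration of the penalty idea, here is a minimal Python sketch. Everything in it is an assumption made for the example: the welfare numbers, the names `penalized_utility` and `death_penalty`, and the choice of a plain total-welfare sum as the baseline. It only shows that a large enough multiplier flips the verdict on a one-for-one kill-and-replace, and that a replacement achieved without any counted killing slips past the penalty.

```python
def total_welfare(population):
    """Plain total utilitarianism: sum of individual welfare levels."""
    return sum(population)


def penalized_utility(population, deaths, death_penalty):
    """Total welfare minus a fixed penalty per killing used to reach this world.

    `death_penalty` is a hypothetical multiplier chosen for illustration.
    """
    return total_welfare(population) - death_penalty * deaths


PENALTY = 1000.0  # assumed value; the thread does not specify one

before = [10.0, 8.0, 9.0]      # current world
replaced = [11.0, 8.0, 9.0]    # first person killed, slightly happier newcomer created
phased_out = [11.0, 11.0, 11.0]  # everyone replaced over time with no counted killing

# Unpenalized total utilitarianism approves of the replacement.
plain_gain = total_welfare(replaced) - total_welfare(before)

# With the penalty, the same world state is far worse than the status quo.
penalized_gain = (
    penalized_utility(replaced, deaths=1, death_penalty=PENALTY)
    - penalized_utility(before, deaths=0, death_penalty=PENALTY)
)

# A replacement that registers zero killings is not penalized at all.
phased_gain = (
    penalized_utility(phased_out, deaths=0, death_penalty=PENALTY)
    - penalized_utility(before, deaths=0, death_penalty=PENALTY)
)

print(plain_gain, penalized_gain, phased_gain)
```

The last case is the loophole described above: the penalty only bites on whatever the function counts as a death.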

Ultimately, though, it seems that you just want the AI to do whatever you want it to do and nothing you don't want it to do. I very much doubt there is any formalization of what you, me, or any other human really wants. The society we have now is the result of social progress that elders have fought tooth and nail against. Given that in general humans can't get their own offspring to respect their taboos, what if your grandchildren come to embrace some options that you find repugnant or disagree with your idea of utopia? What if the AI tells itself "I can't kill humanity now, but if I do this and that, eventually, it will give me the mandate"? Society is an iceberg drifting along the current, only sensing the direction it's going at the moment, but with poor foresight as to what the direction is going to be after that.

I've noticed there does not seem to be much interest in the main question I am interested in, which is, "Why make humans and not something else?"

Because we are humans and we want more of ourselves, so of course we will work towards that particular goal. You won't find any magical objective reason to do it. Sure, we are sentient, intelligent, complex, but if those were the criteria, then we would want to make more AI, not more humans. Personally, I can't see the utility of plastering the whole universe with humans who will never see more than their own little sector, so I would taper off utility with the number of humans, so that eventually you just have to create other stuff. Basically, I would give high utility to variety. It's more interesting that way.
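To make the tapering idea concrete, here is a small Python sketch. The logarithmic curve, the function name `tapered_utility`, and the population figures are all assumptions picked for illustration; any concave function of the head count would behave similarly.

```python
import math


def tapered_utility(counts):
    """Sum of log(1 + n) over kinds: each extra member of a kind adds less.

    The log curve is an assumed stand-in for "taper off utility with the
    number of humans"; it is not taken from the discussion itself.
    """
    return sum(math.log1p(n) for n in counts.values())


# Same total head count, distributed differently across hypothetical kinds.
monoculture = {"human": 1000}
mixed = {"human": 700, "kind_b": 200, "kind_c": 100}

print(tapered_utility(monoculture))
print(tapered_utility(mixed))
```

Under this assumed curve the mixed population scores higher than the monoculture of the same size, which is the sense in which tapering "eventually" forces the creation of other stuff.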

That is to say, it would kill all humans, restructure the whole planet, and then repopulate the planet with human beings devoid of cultural biases, ensuring plentiful resources throughout. But the genetic makeup would stay the exact same.

That would be bad, but it would still be way better than replacing us with paperclippers or orgasmium.

The society we have now is the result of social progress that elders have fought tooth and nail against.

That's true, but if it's "progress" then it must be progress towards something. Will we eventually arrive at our destination, decide society is pretty much perfect, and then stop? Is progress somehow asymptotic so we'll keep progressing and never quite reach our destination?

The thing is, it seems to me that what we've been progressing towards is greater expression of our human natures. Greater ability to do what the most positive parts of our natures think we should. So I'm fine with future creatures that have something like human nature deciding some new society I'm kind of uncomfortable with is the best way to express their natures. What I'm not fine with is throwing human nature out and starting from scratch with something new, which is what I think a utilitarian AI would do.

Because we are humans and we want more of ourselves, so of course we will work towards that particular goal. You won't find any magical objective reason to do it. Sure, we are sentient, intelligent, complex, but if those were the criteria, then we would want to make more AI, not more humans.

I didn't literally mean humans, I meant "Creatures with the sorts of goals, values, and personalities that humans have." For instance, if given a choice between creating an AI with human-like values, and creating a human sociopath, I would pick the AI. And it wouldn't just be because there was a chance the sociopath would harm others. I simply consider the values of the AI more worthy of creation than the sociopath's.

Personally, I can't see the utility of plastering the whole universe with humans who will never see more than their own little sector, so I would taper off utility with the number of humans, so that eventually you just have to create other stuff. Basically, I would give high utility to variety. It's more interesting that way.

I don't necessarily disagree. If having a large population of creatures with humane values and high welfare was assured then it might be better to have a variety of creatures. But I still think maybe there should be some limits on the sort of creatures we should create, i.e. lawful creativity. Eliezer has suggested that consciousness, sympathy, and boredom are the essential characteristics any intelligent creature should have. I'd love for there to be a wide variety of creatures, but maybe it would be best if they all had those characteristics.

That's true, but if it's "progress" then it must be progress towards something. Will we eventually arrive at our destination, decide society is pretty much perfect, and then stop? Is progress somehow asymptotic so we'll keep progressing and never quite reach our destination?

It's quite hard to tell. "Progress" is always relative to the environment you grew up in and on which your ideas and aspirations are based. At the scale of a human life, our trajectory looks a lot like a straight line, but for all we know, it could be circular. At every point on the circle, we would aim to follow the tangent, and it would look like that's what we are doing. However, as we move along, the tangent would shift ever so subtly and over the course of millennia we would end up doing a roundabout.

I am not saying that's precisely what we are doing, but there is some truth to it: human goals and values shift. Our environment and upbringing mold us very deeply, in a way that we cannot really abstract away. A big part of what we consider "ideal" is therefore a function of that imprint. However, we rarely ponder the fact that people born and raised in our "ideal world" would be molded differently and thus may have a fundamentally different outlook on life, including wishing for something else. That's a bit contrived, of course, but it would probably be possible to make a society which wants X when raised on Y, and Y when raised on X, so that it would constantly oscillate between X and Y. We would have enough foresight to figure out a simple oscillator, but if ethics were a kind of semi-random walk, I don't think it would be obvious. The idea that we are converging towards something might be a bit of an illusion due to underestimating how different future people will be from ourselves.
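The "simple oscillator" case can be written down in a few lines of Python. This is only a toy model under stated assumptions: two labels, `"X"` and `"Y"`, and a hypothetical rule `next_ideal` in which each generation wants whatever it was not raised on.

```python
def next_ideal(raised_on):
    """Toy rule: a generation raised on X wants Y, and vice versa."""
    return "Y" if raised_on == "X" else "X"


def simulate(start, generations):
    """Return the sequence of societies, each built to the previous one's ideal."""
    history = [start]
    for _ in range(generations):
        history.append(next_ideal(history[-1]))
    return history


print(simulate("X", 5))  # ['X', 'Y', 'X', 'Y', 'X', 'Y']
```

Each generation here sincerely experiences the move to the other state as progress, yet the sequence never converges.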

The thing is, it seems to me that what we've been progressing towards is greater expression of our human natures. Greater ability to do what the most positive parts of our natures think we should.

I suspect the negative aspects of our natures occur primarily when access to resources is strained. If every human is sheltered, well fed, has access to plentiful energy, and so on, there wouldn't really be any problems to blame on anyone, so everything should work fine (for the most part, anyway). In a sense, this simplifies the task of the AI: you ask it to optimize supply to existing demand and the rest is smooth sailing.

I didn't literally mean humans, I meant "Creatures with the sorts of goals, values, and personalities that humans have."

Still, the criterion is explicitly based on human values. Even if not human specifically, you want "human-like" creatures.

Eliezer has suggested that consciousness, sympathy, and boredom are the essential characteristics any intelligent creature should have. I'd love for there to be a wide variety of creatures, but maybe it would be best if they all had those characteristics.

Still fairly anthropomorphic (not necessarily a bad thing, just an observation). In principle, extremely interesting entities could have no conception of self. Sympathy is only relevant to social entities -- but why not create solitary ones as well? As for boredom, what makes a population of entities that seek variety in their lives better than one of entities who each have highly specialized interests (all different from each other)? As a whole, wouldn't the latter display more variation than the former? I mean, when you think about it, in order to bond with each other, social entities must share a lot of preferences, the encoding of which is redundant. Solitary entities with fringe preferences could thus be a cheap and easy way to increase variety.

Or how about creating psychopaths and putting them in controlled environments that they can destroy at will, or creating highly violent entities to throw in fighting pits? Isn't there a point where this is preferable to creating yet another conscious creature capable of sympathy and boredom?

Sympathy is only relevant to social entities -- but why not create solitary ones as well?

A creature that loves solitude might not necessarily be bad to create. But it would still be good to give it capacity for sympathy for pragmatic reasons, to ensure that if it ever did meet another creature it would want to treat it kindly and avoid harming it.

As for boredom, what makes a population of entities that seek variety in their lives better than one of entities who each have highly specialized interests (all different from each other)? As a whole, wouldn't the latter display more variation than the former?

It's not about having a specialized interest and exploring it. A creature with no concept of boredom would (to paraphrase Eliezer) "play the same screen of the same level of the same fun videogame over and over again." They wouldn't be like an autistic savant who knows one subject inside and out. They'd be little better than a wirehead. Someone with narrow interests still explores every single aspect of that interest in great detail. A creature with no boredom would find one tiny aspect of that interest and do it forever.

Or how about creating psychopaths and putting them in controlled environments that they can destroy at will, or creating highly violent entities to throw in fighting pits? Isn't there a point where this is preferable to creating yet another conscious creature capable of sympathy and boredom?

Yes, I concede that if there is a sufficient quantity of creatures with humane values, it might be good to create other types of creatures for variety's sake. However, such creatures could be dangerous, so we'd have to be very careful.

A creature that loves solitude might not necessarily be bad to create. But it would still be good to give it capacity for sympathy for pragmatic reasons, to ensure that if it ever did meet another creature it would want to treat it kindly and avoid harming it.

Fair enough, though at the level of omnipotence we're supposing, there would be no chance meetups. You might as well just isolate the creature and be done with it.

A creature with no concept of boredom would (to paraphrase Eliezer) "play the same screen of the same level of the same fun videogame over and over again."

Or it would do it once, and then die happy. Human-like entities might have a lifespan of centuries, and then you would have ephemeral beings living their own limited fantasy for thirty seconds. I mean, why not? We are all bound to repeat ourselves once our interests are exhausted -- perhaps entities could be made to embrace death when that happens.

Yes, I concede that if there is a sufficient quantity of creatures with humane values, it might be good to create other types of creatures for variety's sake. However, such creatures could be dangerous, so we'd have to be very careful.

I agree, though an entity with the power to choose the kind of creatures that come to exist probably wouldn't have much difficulty doing it safely.