I'm posting this article on behalf of Brian Tomasik, who authored it but is at present too busy to respond to comments.
Update from Brian: "As of 2013-2014, I have become more sympathetic to at least the spirit of CEV specifically and to the project of compromise among differing value systems more generally. I continue to think that pure CEV is unlikely to be implemented, though democracy and intellectual discussion can help approximate it. I also continue to feel apprehensive about the conclusions that a CEV might reach, but the best should not be the enemy of the good, and cooperation is inherently about not getting everything you want in order to avoid getting nothing at all."
I'm often asked questions like the following: If wild-animal suffering, lab universes, sentient simulations, etc. are so bad, why can't we assume that Coherent Extrapolated Volition (CEV) will figure that out and do the right thing for us?
Most of my knowledge of CEV is based on Yudkowsky's 2004 paper, which he admits is obsolete. I have not yet read most of the more recent literature on the subject.
Reason 1: CEV will (almost certainly) never happen
CEV is like a dream for a certain type of moral philosopher: Finally, the most ideal solution for discovering what we really want upon reflection!
The fact is, the real world is not decided by moral philosophers. It's decided by power politics, economics, and Darwinian selection. Moral philosophers can certainly have an impact through these channels, but they're unlikely to convince the world to rally behind CEV. Can you imagine the US military -- during its AGI development process -- deciding to adopt CEV? No way. It would adopt something that ensures the continued military and political dominance of the US, driven by mainstream American values. Same goes for China or any other country. If AGI is developed by a corporation, the values will reflect those of the corporation or the small group of developers and supervisors who hold the most power over the project. Unless that group is extremely enlightened, CEV is not what we'll get.
Anyway, this is assuming that the developers of AGI can even keep it under control. Most likely AGI will turn into a paperclipper or else evolve into some other kind of Darwinian force over which we lose control.
Objection 1: "Okay. Future military or corporate developers of AGI probably won't do CEV. But why do you think they'd care about wild-animal suffering, etc. either?"
Well, they might not, but if we make the wild-animal movement successful, then in ~50-100 years when AGI does come along, the notion of not spreading wild-animal suffering might be sufficiently mainstream that even military or corporate executives would care about it, at least to some degree.
If post-humanity does achieve astronomical power, it will only be through AGI, so there's high value in influencing the future developers of an AGI. For this reason I believe we should focus our meme-spreading on those targets. However, this doesn't mean they should be our only focus, for two reasons: (1) Future AGI developers will themselves be influenced by their friends, popular media, contemporary philosophical and cultural norms, etc., so if we can change those things, we will diffusely impact future AGI developers too. (2) We need to build our movement, and the lowest-hanging fruit for new supporters are those most interested in the cause (e.g., antispeciesists, environmental-ethics students, transhumanists). We should reach out to them to expand our base of support before going after the big targets.
Objection 2: "Fine. But just as we can advance values like preventing the spread of wild-animal suffering, couldn't we also increase the likelihood of CEV by promoting that idea?"
Sure, we could. The problem is, CEV is not an optimal thing to promote, IMHO. It's sufficiently general that lots of people would want it, so for ourselves, the higher leverage comes from advancing our particular, more idiosyncratic values. Promoting CEV is kind of like promoting democracy or free speech: It's fine to do, but if you have a particular cause that you think is more important than other people realize, it's probably going to be better to promote that specific cause than to jump on the bandwagon and do the same thing everyone else is doing, since the bandwagon's cause may not be what you yourself prefer.
Indeed, for myself, it's possible CEV could be a net bad thing, if it would reduce the likelihood of paperclipping -- a future which might (or might not) contain far less suffering than a future directed by humanity's extrapolated values.
Reason 2: CEV would lead to values we don't like
Some believe that morality is absolute, in which case a CEV's job would be to uncover what that is. This view is mistaken, for the following reasons: (1) Existence of a separate realm of reality where ethical truths reside violates Occam's razor, and (2) even if they did exist, why would we care what they were?
Yudkowsky and the LessWrong community agree that ethics is not absolute, so they have different motivations behind CEV. As far as I can gather, the following are two of them:
Motivation 1: Some believe CEV is genuinely the right thing to do
As Eliezer said in his 2004 paper (p. 29), "Implementing CEV is just my attempt not to be a jerk." Some may believe that CEV is the ideal meta-ethical way to resolve ethical disputes.
I have to differ. First, the set of minds included in CEV is totally arbitrary, and hence, so will be the output. Why include only humans? Why not animals? Why not dead humans? Why not humans that weren't born but might have been? Why not paperclip maximizers? Baby eaters? Pebble sorters? Suffering maximizers? Wherever you draw the line, there you're already inserting your values into the process.
And then once you've picked the set of minds to extrapolate, you still have astronomically many ways to do the extrapolation, each of which could give wildly different outputs. Humans have a thousand random shards of intuition about values that resulted from all kinds of little, arbitrary perturbations during evolution and environmental exposure. If the CEV algorithm happens to make some more salient than others, this will potentially change the outcome, perhaps drastically (butterfly effects).
Now, I would be in favor of a reasonable extrapolation of my own values. But humanity's values are not my values. There are people who want to spread life throughout the universe regardless of suffering, people who want to preserve nature free from human interference, people who want to create lab universes because it would be cool, people who oppose utilitronium and support retaining suffering in the world, people who want to send members of other religions to eternal torture, people who believe sinful children should burn forever in red-hot ovens, and on and on. I do not want these values to be part of the mix.
Maybe (hopefully) some of these beliefs would go away once people learned more about what these wishes really implied, but some would not. Take abortion, for example: Some non-religious people genuinely oppose it, and not for trivial, misinformed reasons. They have thought long and hard about abortion and still find it to be wrong. Others have thought long and hard and still find it to be not wrong. At some point, we have to admit that human intuitions are genuinely in conflict in an irreconcilable way. Some human intuitions are irreconcilably opposed to mine, and I don't want them in the extrapolation process.
Motivation 2: Some argue that even if CEV isn't ideal, it's the best game-theoretic approach because it amounts to cooperating on the prisoner's dilemma
I think the idea is that if you try to promote your specific values above everyone else's, then you're timelessly causing this to be the decision of other groups of people who want to push for their values instead. But if you decided to cooperate with everyone, you would timelessly influence others to do the same.
This seems worth considering, but I'm doubtful that the argument is compelling enough to take too seriously. I can almost guarantee that if I decided to start cooperating by working toward CEV, everyone else working to shape values of the future wouldn't suddenly jump on board and do the same.
Objection 1: "Suppose CEV did happen. Then spreading concern for wild animals and the like might have little value, because the CEV process would realize that you had tried to rig the system ahead of time by making more people care about the cause, and it would attempt to neutralize your efforts."
Well, first of all, CEV is (almost certainly) never going to happen, so I'm not too worried. Second of all, it's not clear to me that such a scheme would actually be put in place. If you're trying to undo pre-CEV influences that led to the distribution of opinions to that point, you're going to have a heck of a lot of undoing to do. Are you going to undo the abundance of Catholics because their religion discouraged birth control and so led to large numbers of supporters? Are you going to undo the over-representation of healthy humans because natural selection unfairly removed all those sickly ones? Are you going to undo the under-representation of dinosaurs because an arbitrary asteroid killed them off before CEV came around?
The fact is that who has power at the time of AGI will probably matter a lot. If we can improve the values of those who will have power in the future, this will in expectation lead to better outcomes -- regardless of whether the CEV fairy tale comes true.
There's a brief discussion of butterfly effects as a potential pitfall for CEV in this thread.
I have two objections.
Therefore, CEV. Regardless of FAI developments.
This issue and related ones were raised in this post and its comments.
I don't think that people valuing eternal torture of other humans is much of a concern, because they don't value it nearly as much as the people in question disvalue being tortured. I bet there are a lot more people who care about animals' feelings, and who care a lot more, than those who care about the aesthetics of brutality in nature. I think the majority of people have more instincts of concern for animals than their actions suggest, because now it is convenient to screw over animals as an externality of eating tasty food, and the animals suffering are ...
It is quite simple to make a LessWrong account, and it would be helpful so that you can respond to comments.
If you think it might be difficult to get the sufficient karma, you can also post a comment in the open thread asking for upvotes so that you can post. It's worked nicely before :)
There seem to be two objections here. The first is that CEV does not uniquely identify a value system; starting with CEV, you don't have actual values until you've identified the set of people/nonpeople you're including, an extrapolation procedure, and a reconciliation procedure. But when this is phrased as "the set of minds included in CEV is totally arbitrary, and hence, so will be the output," an essential truth is lost: while parts of CEV are left unspecified, other parts are specified, and so the output is not fully arbitrary. The set of CEV-compatibl...
The CEV of humanity is not likely to promote animal suffering. Most people don't value animal suffering. They value eating hamburgers, and aren't particularly bothered by the far away animal suffering that makes it possible for them to eat hamburgers. An FAI can give us hamburgers without causing any animal suffering.
Future humans may not care enough about animal suffering relative to other things, or may not regard suffering as being as bad as I do. As noted in the post, there are people who want to spread biological life as much as possible throughout the galaxy. Deep ecologists may actively want to preserve wild-animal suffering (Ned Hettinger: "Respecting nature means respecting the ways in which nature trades values, and such respect includes painful killings for the purpose of life support."). Future humans might run ancestor sims that happen to include astronomical numbers of sentient insects, most of which die (possibly painfully) shortly after birth. In general, humans have motivations to simulate minds similar to theirs, which means potentially a lot more suffering along for the ride.
This is a question about utilitarianism, not AI, but can anyone explain (or provide a link to an explanation of) why reducing the total suffering in the world is considered so important? I thought that we pretty much agreed that morality is based on moral intuitions, and it seems pretty counterintuitive to value the states of mind of people too numerous to sympathize with as highly as people here do.
It seems to me that reducing suffering in a numbers game is the kind of thing you would say is your goal because it makes you sound like a good person, rather ...
Another thing to worry about with CEV is that the nonperson predicates that whoever writes it decides on will cover things that you consider people, or that you would not like to see destroyed at the end of an instrumental simulation.
Humans probably have no built-in intuitions for distinguishing the things that deserve ethical consideration at the precision required for a nonperson predicate: one that can flag things as nonpersons in a way that is useful for instrumental simulations, and yet not flag a fully detailed simulation of you or me as a nonperson. We ...
I think that there's a misunderstanding about CEV going on.
I don't think an AI would just ask us what we want, and then do what suits most of us. It would consider how our brains work, and exactly what shards of value make us up. Intuition isn't a very good guide to what is the best decision for us - the point of CEV is that if we knew more about the world and ethics, we would do different things, and think different thoughts about ethics.
You might...
Why would the chicken have to learn to follow the ethics in order for its interests to be fully included in the ethics? We don't include cognitively normal human adults because they are able to understand and follow ethical rules (or, at the very least, we don't include them only in virtue of that fact). We include them because to them as sentient beings, their subjective well-being matters. And thus we also include the many humans who are unable to understand and follow ethical rules. We ourselves, of course, would want to be still included in case we lost the ability to follow ethical rules. In other words: Moral agency is not necessary for the status of a moral patient, i.e. of a being that matters morally.
The question is how we should treat humans and chickens (i.e. whether and how our decision-making algorithm should take them and their interests into account), not what social behavior we find among humans and chickens.
I agree that it is impossible to avoid inserting your values, and CEV does not work as a meta-ethical method of resolving ethical differences. However, it may be effective as a ...