In a recent Facebook post, Eliezer said:

You can believe that most possible minds within mind design space (not necessarily actual ones, but possible ones) which are smart enough to build a Dyson Sphere, will completely fail to respond to or care about any sort of moral arguments you use, without being any sort of moral relativist. Yes. Really. Believing that a paperclip maximizer won't respond to the arguments you're using doesn't mean that you think that every species has its own values and no values are better than any other.

And so I think part of the metaethics sequence went over my head.

I should re-read it, but I haven't yet. In the meantime I want to give a summary of my current thinking and ask some questions.

My current take on morality is that, unlike facts about the world, morality is a question of preference. The important caveats are:

  1. The preference set has to be consistent. Until we develop something akin to CEV, humans are probably stuck with a pre-morality where they behave and think over time in contradictory ways, and at the same time believe they have a perfectly consistent moral system.
  2. One can be mistaken about morality, but only in the sense that, unknown to them, they actually hold values different from what the deliberative part of their mind thinks they hold. An introspection failure or a logical error can cause the mistake. Once we identify ground values (not that this is practically feasible), "wrong" is a type error.
  3. It is OK to fight for one's morality. Just because it's subjective doesn't mean one can't push for it. So "moral relativism" in the strong sense isn't a consequence of morality being a preference. But "moral relativism" in the weak, technical sense (it's subjective) is.

I am curious about the following:

  • How does your current view differ from what I've written above?
  • How exactly does that differ from the thesis of the metaethics sequence? In the same post, Eliezer also said: "and they thought maybe I was arguing for moral realism...". I did kind of think that, at times.
  • I specifically do not understand this: "Believing that a paperclip maximizer won't respond to the arguments you're using doesn't mean that you think that every species has its own values and no values are better than any other." Unless "better" is used in the sense of "better according to my morality", but that would make the sentence barely worth saying.



The Metaethics Sequence is notoriously ambiguous: what follows is my best attempt at modelling what Eliezer believes (which is not necessarily what I believe, although I agree with most of it).

1) There is a deep psychological uniformity of values shared by every human being on things like fairness, happiness, freedom, etc.

2) This psychological layer is so ingrained in us that we cannot change it at will; it is impossible, for example, to make "fairness" mean "distributed according to a geometric series" instead of "distributed equally".

3) Concepts like morality, should, better, fair, etc. should be understood as the basic computations, instantiated by the class of human beings, that motivate actions. That is: morality = actions(humans).

4) Morality, being a computation, has the ontological status of being 'subjectively objective': it is a well-defined point in the space of all possible algorithms, but it does not become a physical pattern unless there is an agent performing it.

To answer your questions:

  • the most important difference from your point of view is that in (what I believe is) Eliezer's metaethics, morality is inevitable: since we cannot step outside our moral framework, it is epistemically necessary that whenever we encounter some truly alien entity, a conflict ensues.

  • that would require a post in itself. I like Eliezer's approach, and although I think it should be relaxed a little, I believe the general framework is correct.

  • I think that "better" here is doing all the work: by (3), better is to be understood in human terms. You can understand that other species will behave according to a different set of computations, but those computations aren't moral, and by point (2) it doesn't even make sense to ascribe a different morality to a paperclip maximizer. Its computations might be paperclippy, but they surely aren't moral.
    Of course a paperclip maximizer will not be moved by our moral arguments (since it's paperclippy and not moral), and of course our computation is right and Clippy's computation is wrong (because right/wrong is a computation that is part of morality).

Does the following make sense: A possible intuition pump would be to imagine a specific superintelligent mind with no consciousness, some kind of maximally stripped-down superintelligent algorithm, and ask whether the contents of its objective function constitute a "morality". The answer to this question is more obviously "no" than if you vaguely wonder about "possible minds".
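As a toy sketch of that intuition pump (all names and numbers here are made up for illustration, not anyone's actual proposal): a maximally stripped-down maximizer is just an argmax over an objective function, and nothing in such a function looks like a "morality".

```python
# Hypothetical toy example: a stripped-down maximizer is just
# argmax over an objective function. The function scores world-states;
# it contains nothing resembling moral content.

def paperclip_objective(world_state):
    # Score a candidate world-state purely by its paperclip count.
    return world_state["paperclips"]

def choose_action(candidate_states):
    # Pick whichever reachable world-state scores highest.
    return max(candidate_states, key=paperclip_objective)

states = [
    {"name": "make paperclips", "paperclips": 1000, "humans_happy": 0},
    {"name": "help humans", "paperclips": 3, "humans_happy": 100},
]
best = choose_action(states)
print(best["name"])  # -> make paperclips; "humans_happy" is simply never consulted
```

Asking whether `paperclip_objective` "constitutes a morality" has a more obviously negative answer than vaguely wondering about "possible minds": it is just a scoring rule, and no argument phrased in terms it doesn't score can move the maximizer.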

Or maybe imagine a superintelligent spider.

Where "superintelligent" doesn't mean "anthropomorphic", but it literally means "a spider that is astronomically more intelligent at being a spider". Like, building webs that use quantum physics. Using chaos theory to predict the movement of flies. Mad spider science beyond our imagination.

But when it comes to humans, the superintelligent spider cares about them exactly as much as a normal spider, that is not at all, unless they become a threat. Of course the superintelligent spider is better able to imagine how humans could become a threat in the future, and is better at using its spider science to eliminate humans. But it's still a spider with spider values (not some kind of a human mind trapped in a fluffy eight-legged body). It might overcome the limitations of the spider body, and change its form into a machine or whatever. But its mind would remain (an extrapolation of) a spider mind.

I think the point of this:

Believing that a paperclip maximizer won't respond to the arguments you're using doesn't mean that you think that every species has its own values and no values are better than any other.

is to say that there doesn't exist any relevant perspective from which "no values are better than any other" makes sense. Contrast with the case of "no ice cream flavor is better than any other", where there does exist such a perspective (that of maximizing ice cream delight by matching people to their preferred ice cream flavors). The claim is that importing any moral intuitions from the ice cream case to the morality case would be a mistake; that purely descriptive interpretations of "every species has its own values" may be true, but any normative content you might read into it ("every species has its own values [that are in some sense proper to it and appropriate for it to pursue]") is confused.

It's a good sign that you think it's going over your head. Most of the time when people think "metaethics" (or metaphysics, or even normal ethics) is going over their head, it's a sign that it's actually nonsense. It's always awkward trying to disprove nonsense, as the dope Australian philosopher David Stove argued.

EY and a few others have the ability to gaze into the technological machinations of a distant future involving artificial minds and their interactions with the coolest theoretical physics engineering feats. It's clear to all smart people this is an inevitable conclusion, right? Fine.

Okay, going back to ethics (sorry, metaethics), the point of the post. Moral relativism is nothing more than a concept we hold in our minds. Insofar as it classifies different human beliefs about the world, and predicts their actions, it's a useful term. It has no particularly profound meaning otherwise. It's nothing more than a personal belief about how others should behave. You can't test moral relativism; it has no fundamental property. The closest you can get to testing it is, as I just noted, asking how it predicts different human behaviors.

Again, you tried to break this down, which is understandable. But it's not possible to refute or break down absolute nonsense. Some paperclip maximizer doesn't have values? So it won't respond to some sort of 'argument' (which, to the paperclip maximizer, is an anthropomorphic, nonsensical set of information). And somehow this now connects to an argument that some other species will have some value, but it might be a bad one?

Please let me know if you think I'm missing something, or some context from previous stuff he's written that would change my interpretation of the writing above.

I suspect that in the quoted passage above, EY is stating that even if you are a moral realist, you can still believe in the possibility of a highly intelligent species/AI/whatever that does not share/respect your moral values. Nothing in the quoted passage suggests (to me) that EY is arguing for or believes in strong moral realism.

He might be attempting to broaden the reach of his arguments concerning the danger of unfriendly AGI by showing that even if moral realism were true, we should still be concerned about unfriendly AGI.