It seems to me that Wei_Dai's six hypotheses do a good job of covering a lot of the logical space. A good enough job that even though I've been professionally trained to think about this problem, I can't come up with any significantly different suggestions.

But maybe I'm being unimaginative (a side effect of training, often enough). If you think these are merely six of countless hypotheses, do you think you could come up with, say, two more?

If you think these are merely six of countless hypotheses, do you think you could come up with, say, two more?

Two more possible positions:

  • There is a great variety of possible consistent preferences that intelligent beings can have, and there are no facts about what one should value that apply to all possible intelligent beings. However, there are still facts about rationality that do apply to all intelligent beings. Also, if you narrow the scope from "intelligent beings" to "humans", most humans, when consistent, share similar p...

shminux (6y): OP discusses "facts about what everyone should value" (which is an odd use of the term "fact", by the way). His classification is:

There is a unique set of values which
  1. is a limit
  2. is an attractor of sorts

There is no unique set of values
  3. (I failed to understand what this item says)
  4. but you can come up with your own "consistent" (in some sense) set of preferences to optimize for
  5. you cannot come up with a consistent set of values (preferences?), though you can optimize for each one separately
  6. value is not something you can optimize for at all.

Eliezer's position is something like "1. but limited to humans/FAI only", which seems like a separate hypothesis.

Other options off the top of my head are that there can be multiple self-consistent limits or attractors, or that the notion of value only makes sense for humans or some subset of them. Or maybe a hard enough optimization attempt disturbs the value enough to change it, so one can only optimize so much without changing preferences. Or maybe the way to meta-morality is maximizing the diversity of moralities by creating/simulating a multiverse with all the ethical systems you can think of, consistent or inconsistent. Or maybe we should (moral "should") matrix-like break out of the simulation we are living in and learn about the level above us. Or that the concept of "intelligent being" is inconsistent to begin with. Or...

Options are many and none are testable, so, while it's good to ask grand questions, it's silly to try to give grand answers or classification schemes.

Six Plausible Meta-Ethical Alternatives

by Wei_Dai, 6th Aug 2014


In this post, I list six metaethical possibilities that I think are plausible, along with some arguments or plausible stories about how/why they might be true, where that's not obvious. A lot of people seem fairly certain in their metaethical views, but I'm not and I want to convey my uncertainty as well as some of the reasons for it.

  1. Most intelligent beings in the multiverse share similar preferences. This came about because there are facts about what preferences one should have, just like there exist facts about what decision theory one should use or what prior one should have, and species that manage to build intergalactic civilizations (or the equivalent in other universes) tend to discover all of these facts. There are occasional paperclip maximizers that arise, but they are a relatively minor presence or tend to be taken over by more sophisticated minds.
  2. Facts about what everyone should value exist, and most intelligent beings have a part of their mind that can discover moral facts and find them motivating, but those parts don't have full control over their actions. These beings eventually build or become rational agents with values that represent compromises between different parts of their minds, so most intelligent beings end up having shared moral values along with idiosyncratic values.
  3. There aren't facts about what everyone should value, but there are facts about how to translate non-preferences (e.g., emotions, drives, fuzzy moral intuitions, circular preferences, non-consequentialist values, etc.) into preferences. These facts may include, for example, what is the right way to deal with ontological crises. The existence of such facts seems plausible because if there were facts about what is rational (which seems likely) but no facts about how to become rational, that would seem like a strange state of affairs.
  4. None of the above facts exist, so the only way to become or build a rational agent is to just think about what preferences you want your future self or your agent to hold, until you make up your mind in some way that depends on your psychology. But at least this process of reflection is convergent at the individual level so each person can reasonably call the preferences that they endorse after reaching reflective equilibrium their morality or real values.
  5. None of the above facts exist, and reflecting on what one wants turns out to be a divergent process (e.g., it's highly sensitive to initial conditions, like whether or not you drank a cup of coffee before you started, or to the order in which you happen to encounter philosophical arguments). There are still facts about rationality, so at least agents that are already rational can call their utility functions (or the equivalent of utility functions in whatever decision theory ends up being the right one) their real values.
  6. There aren't any normative facts at all, including facts about what is rational. For example, it turns out there is no one decision theory that does better than every other decision theory in every situation, and there is no obvious or widely-agreed-upon way to determine which one "wins" overall.

(Note that for the purposes of this post, I'm concentrating on morality in the axiological sense (what one should value) rather than in the sense of cooperation and compromise. So alternative 1, for example, is not intended to include the possibility that most intelligent beings end up merging their preferences through some kind of grand acausal bargain.)

It may be useful to classify these possibilities using labels from academic philosophy. Here's my attempt: 1. realist + internalist 2. realist + externalist 3. relativist 4. subjectivist 5. moral anti-realist 6. normative anti-realist. (A lot of debates in metaethics concern the meaning of ordinary moral language, for example whether moral statements refer to facts or merely express attitudes. I mostly ignore such debates in the above list, because it's not clear what implications they have for the questions that I care about.)

One question LWers may have is, where does Eliezer's metaethics fall into this schema? Eliezer says that there are moral facts about what values every intelligence in the multiverse should have, but only humans are likely to discover these facts and be motivated by them. To me, Eliezer's use of language is counterintuitive, and since it seems plausible that there are facts about what everyone should value (or how each person should translate their non-preferences into preferences) that most intelligent beings can discover and be at least somewhat motivated by, I'm reserving the phrase "moral facts" for these. In my language, I think 3 or maybe 4 is probably closest to Eliezer's position.