Arguments for moral indefinability

Richard_Ngo

Epistemic status: I endorse the core intuitions behind this post, but am only moderately confident in the specific claims made.

Moral indefinability is the term I use for the idea that there is no ethical theory which provides acceptable solutions to all moral dilemmas, and which also has the theoretical virtues (such as simplicity, precision and non-arbitrariness) that we currently desire. I think this is an important and true perspective on ethics, and in this post will explain why I hold it, with the caveat that I'm focusing more on airing these ideas than constructing a watertight argument.

Here’s another way of explaining moral indefinability: let’s think of ethical theories as procedures which, in response to a moral claim, either endorse it, reject it, or do neither. Moral philosophy is an attempt to find the theory whose answers best match our intuitions about what answers ethical theories should give us (e.g. don’t cause unnecessary suffering), and whose procedure for generating answers best matches our meta-level intuitions about what ethical theories should look like (e.g. they should consistently apply impartial principles rather than using ad-hoc, selfish or random criteria). None of these desiderata are fixed in stone, though - in particular, we sometimes change our intuitions when it’s clear that the only theories which match those intuitions violate our meta-level intuitions. My claim is that eventually we will also need to change our meta-level intuitions in important ways, because it will become clear that the only theories which match them violate key object-level intuitions. In particular, this might lead us to accept theories which occasionally evince properties such as:

Incompleteness: for some claim A, the theory neither endorses nor rejects either A or ~A, even though we believe that the choice between A and ~A is morally important.
Vagueness: the theory endorses an imprecise claim A, but rejects every way of making it precise.
Contradiction: the theory endorses both A and ~A (note that this is a somewhat provocative way of framing this property, since we can always add arbitrary ad-hoc exceptions to remove the contradictions. So perhaps a better term is arbitrariness of scope: when we have both a strong argument for A and a strong argument for ~A, the theory can specify in which situations each conclusion should apply, based on criteria which we would consider arbitrary and unprincipled. Example: when there are fewer than N lives at stake, use one set of principles; otherwise use a different set).
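A toy sketch in Python may help make the "theories as procedures" framing concrete. Everything in it is an illustrative assumption rather than part of the argument: the names (Verdict, table_theory, scoped_theory and the three checks) are invented here, claims are plain strings, and negation is just a "~" prefix. It only shows what it would mean for a theory to exhibit each of the three properties on a given claim.

```python
from enum import Enum
from typing import Callable, Dict, Iterable


class Verdict(Enum):
    ENDORSE = "endorse"
    REJECT = "reject"
    NEITHER = "neither"


# Assumption: a "theory" is any procedure mapping a claim to a verdict.
Theory = Callable[[str], Verdict]


def negate(claim: str) -> str:
    """Toy negation: add a leading '~', or strip one if already present."""
    return claim[1:] if claim.startswith("~") else "~" + claim


def table_theory(verdicts: Dict[str, Verdict]) -> Theory:
    """Hypothetical helper: a theory given by a lookup table, silent elsewhere."""
    return lambda claim: verdicts.get(claim, Verdict.NEITHER)


def is_incomplete_on(theory: Theory, claim: str) -> bool:
    """The theory neither endorses nor rejects either A or ~A."""
    return (theory(claim) is Verdict.NEITHER
            and theory(negate(claim)) is Verdict.NEITHER)


def is_vague_on(theory: Theory, claim: str, precisifications: Iterable[str]) -> bool:
    """The theory endorses an imprecise claim but rejects every precise version."""
    precise = list(precisifications)
    return (theory(claim) is Verdict.ENDORSE
            and bool(precise)
            and all(theory(p) is Verdict.REJECT for p in precise))


def is_contradictory_on(theory: Theory, claim: str) -> bool:
    """The theory endorses both A and ~A."""
    return (theory(claim) is Verdict.ENDORSE
            and theory(negate(claim)) is Verdict.ENDORSE)


def scoped_theory(small_scale: Theory, large_scale: Theory,
                  threshold: int) -> Callable[[str, int], Verdict]:
    """Arbitrariness of scope: switch principles at an arbitrary cutoff N."""
    def judge(claim: str, lives_at_stake: int) -> Verdict:
        chosen = small_scale if lives_at_stake < threshold else large_scale
        return chosen(claim)
    return judge


# A made-up theory that shows all three properties on different claims.
toy = table_theory({
    "A": Verdict.ENDORSE,
    "~A": Verdict.ENDORSE,
    "help others": Verdict.ENDORSE,
    "help at least 3 people a day": Verdict.REJECT,
    "donate 10% of income": Verdict.REJECT,
})
precise_versions = ["help at least 3 people a day", "donate 10% of income"]

# Removing the contradiction on A by splitting scope at an arbitrary N = 1000.
judge = scoped_theory(table_theory({"A": Verdict.ENDORSE}),
                      table_theory({"A": Verdict.REJECT}),
                      threshold=1000)
```

Here the toy theory is contradictory on "A", incomplete on any claim missing from its table, and vague on "help others", while judge gives different verdicts on "A" depending only on which side of the cutoff the number of lives falls. The cutoff itself is exactly the kind of criterion we would consider arbitrary.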

Why take moral indefinability seriously? The main reason is that ethics evolved to help us coordinate in our ancestral environment, and did so not by giving us a complete decision procedure to implement, but rather by ingraining intuitive responses to certain types of events and situations. There were many different and sometimes contradictory selection pressures driving the formation of these intuitions - and so, when we construct generalisable principles based on our intuitions, we shouldn't expect those principles to automatically give useful or even consistent answers to very novel problems. Unfortunately, the moral dilemmas which we grapple with today have in fact "scaled up" drastically in at least two ways. Some are much greater in scope than any problems humans have dealt with until very recently. And some feature much more extreme tradeoffs than ever come up in our normal lives, e.g. because they have been constructed as thought experiments to probe the edges of our principles.

Of course, we're able to adjust our principles so that we are more satisfied with their performance on novel moral dilemmas. But I claim that in some cases this comes at the cost of those principles conflicting with the intuitions which make sense on the scales of our normal lives. And even when it's possible to avoid that, there may be many ways to make such adjustments whose relative merits are so divorced from our standard moral intuitions that we have no good reason to favour one over the other. I'll give some examples shortly.

A second reason to believe in moral indefinability is the fact that human concepts tend to be open texture: there is often no unique "correct" way to rigorously define them. For example, we all know roughly what a table is, but it doesn’t seem like there’s an objective definition which gives us a sharp cutoff between tables and desks and benches and a chair that you eat off and a big flat rock on stilts. A less trivial example is our inability to rigorously define what entities qualify as being "alive": edge cases include viruses, fires, AIs and embryos. So when moral intuitions are based on these sorts of concepts, trying to come up with an exact definition is probably futile. This is particularly true when it comes to very complicated systems in which tiny details matter a lot to us - like human brains and minds. It seems implausible that we’ll ever discover precise criteria for when someone is experiencing contentment, or boredom, or many of the other experiences that we find morally significant.

I would guess that many anti-realists are sympathetic to the arguments I’ve made above, but still believe that we can make morality precise without changing our meta-level intuitions much - for example, by grounding our ethical beliefs in what idealised versions of ourselves would agree with, after long reflection. My main objection to this view is, broadly speaking, that there is no canonical “idealised version” of a person, and different interpretations of that term could lead to a very wide range of ethical beliefs. I explore this objection in much more detail in this post. (In fact, the more general idea that humans aren’t really “utility maximisers”, even approximately, is another good argument for moral indefinability.) And even if idealised reflection is a coherent concept, it simply passes the buck to your idealised self, who might then believe my arguments and decide to change their meta-level intuitions.

So what are some pairs of moral intuitions which might not be simultaneously satisfiable under our current meta-level intuitions? Here’s a non-exhaustive list - the general pattern being clashes between small-scale perspectives, large-scale perspectives, and the meta-level intuition that they should be determined by the same principles:

Person-affecting views versus non-person-affecting views. Small-scale views: killing children is terrible, but not having children is fine, even when those two options lead to roughly the same outcome. Large-scale view: extinction is terrible, regardless of whether it comes about from people dying or people not being born.
The mere addition paradox, aka the repugnant conclusion. Small-scale view: adding happy people can only be an improvement. Large-scale view: a world consisting only of people whose lives are barely worth living is deeply suboptimal. (Note also Arrhenius' impossibility theorems, which show that you can't avoid the repugnant conclusion without making even greater concessions).
Weighing theories under moral uncertainty. I personally find OpenPhil's work on cause prioritisation under moral uncertainty very cool, and the fundamental intuitions behind it seem reasonable, but some of it (e.g. variance normalisation) has reached a level of abstraction where I feel almost no moral force from their arguments, and aside from an instinct towards definability I'm not sure why I should care.
Infinite and relativistic ethics. Same as above. See also this LessWrong post arguing against applying the “linear utility hypothesis” at vast scales.
Whether we should force future generations to have our values. On one hand, we should be very glad that past generations couldn't do this. But on the other, the future will probably disgust us, like our present would disgust our ancestors. And along with "moral progress" there'll also be value drift in arbitrary ways - in fact, I don't think there's any clear distinction between the two.

I suspect that many readers share my sense that it'll be very difficult to resolve all of the dilemmas above in a satisfactory way, but also have a meta-level intuition that they need to be resolved somehow, because it's important for moral theories to be definable. But perhaps at some point it's this very urge towards definability which will turn out to be the weakest link. I do take seriously Parfit's idea that secular ethics is still young, and there's much progress yet to be made, but I don't see any principled reason why we should be able to complete ethics, except by raising future generations without whichever moral intuitions are standing in the way of its completion (and isn't that a horrifying thought?). From an anti-realist perspective, I claim that perpetual indefinability would be better. That may be a little more difficult to swallow from a realist perspective, of course. My guess is that the core disagreement is whether moral claims are more like facts, or more like preferences or tastes - if the latter, moral indefinability would be analogous to the claim that there’s no (principled, simple, etc) theory which specifies exactly which foods I enjoy.

There are two more plausible candidates for moral indefinability which were the original inspiration for this post, and which I think are some of the most important examples:

Whether to define welfare in terms of preference satisfaction or hedonic states.
The problem of "maximisation" in utilitarianism.

I've been torn for some time over the first question, slowly shifting towards hedonic utilitarianism as problems with formalising preferences piled up. While this isn't the right place to enumerate those problems (see here for a previous relevant post), I've now become persuaded that any precise definition of which preferences it is morally good to satisfy will lead to conclusions which I find unacceptable. After making this update, I can either reject a preference-based account of welfare entirely (in favour of a hedonic account), or else endorse a "vague" version of it which I think will never be specified precisely.

The former may seem the obvious choice, until we take into account the problem of maximisation. Consider that a true (non-person-affecting) hedonic utilitarian would kill everyone who wasn't maximally happy if they could replace them with people who were (see here for a comprehensive discussion of this argument). And that for any precise definition of welfare, they would search for edge cases where they could push it to extreme values. In fact, reasoning about a "true utilitarian" feels remarkably like reasoning about an unsafe AGI. I don't think that's a coincidence: psychologically, humans just aren't built to be maximisers, and so a true maximiser would be fundamentally adversarial. And yet many of us also have strong intuitions that there are some good things, and it's always better for there to be more good things, and it’s best if there are most good things.

How to reconcile these problems? My answer is that utilitarianism is pointing in the right direction, which is “lots of good things”, and in general we can move in that direction without moving maximally in that direction. What are those good things? I use a vague conception of welfare that balances preferences and hedonic experiences and some of my own parochial criteria - importantly, without feeling like it's necessary to find a perfect solution (although of course there will be ways in which my current position can be improved). In general, I think that we can often do well enough without solving fundamental moral issues - see, for example, this LessWrong post arguing that we’re unlikely to ever face the true repugnant dilemma, because of empirical facts about psychology.

To be clear, this still means that almost everyone should focus much more on utilitarian ideas, like the enormous value of the far future, because in order to reject those ideas it seems like we’d need to sacrifice important object- or meta-level moral intuitions to a much greater extent than I advocate above. We simply shouldn’t rely on the idea that such value is precisely definable, nor that we can ever identify an ethical theory which meets all the criteria we care about.

I think I broadly agree with all the arguments to characterize the problem and to motivate indefinability as a solution, but I have different (meta-)meta-level intuitions about how palatable indefinability would be, and as a result of that, I'd say I have been thinking about similar issues in a differently drawn framework. While you seem to advocate for "salvaging the notion of 'one ethics'" while highlighting that we then need to live with indefinability, I am usually thinking of it in terms of: "Most of this is underdefined, and that’s unsettling at least in some (but not necessarily all) cases, and if we want to make it less underdefined, the notion of 'one ethics' has to give." Maybe one reason why I find indefinability harder to tolerate is that in my own thinking, the problem arises forcefully at an earlier/higher-order stage already, and therefore the span of views that "ethics" is indefinable about(?) is larger and already includes questions of high practical significance. Having said that, I think there are some important pragmatic advantages to an "ethics includes indefinability" framework, and that might be reason enough to adopt it. While different frameworks tend to differ in the underlying intuitions they highlight or move into the background, I think there is more than one parsimonious framework in which people can "do moral philosophy" in a complete and unconfused way. Translation between frameworks can be difficult though (which is one reason I started to write a sequence about moral reasoning under anti-realism, to establish starting points for disagreements, but then I got distracted – it’s on hold now).

Some more unorganized comments (apologies for "lazy" block-quote commenting):

Moral indefinability is the term I use for the idea that there is no ethical theory which provides acceptable solutions to all moral dilemmas, and which also has the theoretical virtues (such as simplicity, precision and non-arbitrariness) that we currently desire.

This idea seems correct to me. And as you indicate later in the paragraph, we can add that it’s plausible that the "theoretical virtues" are not well-specified either (e.g., there’s disagreement between people’s theoretical desiderata, or there’s vagueness in how to cash out a desideratum such as "non-arbitrariness").

My claim is that eventually we will also need to change our meta-level intuitions in important ways, because it will become clear that the only theories which match them violate key object-level intuitions.

This recommendation makes sense to me (insofar as one can still do that), but I don’t think it’s completely obvious. Because both meta-level intuitions and object-level intuitions are malleable in humans, and because there’s no(t obviously a) principled distinction between these two types of intuitions, it’s an open question to what degree people want to adjust their meta-level intuitions in order to not have to bite the largest bullets.

If the only reason people were initially tempted to bite the bullets in question (e.g., accept a counterintuitive stance like the repugnant conclusion) was because they had a cached thought that "Moral theories ought to be simple/elegant", then it makes a lot of sense to adjust this one meta-level intuition after the realization that it seems ungrounded. However, maybe "Moral theories ought to be simple/elegant" is more than just a cached thought for some people:

Some moral realists buy the "wager" that their actions matter infinitely more in case moral realism is true. I suspect that an underlying reason why they find this wager compelling is that they have strong meta-level intuitions about what they want morality to be like, and it feels to them that it’s pointless to settle for something other than that.

I’m not a moral realist, but I find myself having similarly strong meta-level intuitions about wanting to do something that is "non-arbitrary" and in relevant ways "simple/elegant". I’m confused about whether that’s literally the whole intuition, or whether I can break it down into another component. But motivationally it feels like this intuition is importantly connected to what makes it easy for me to go "all-in" for my ethical/altruistic beliefs.

A second reason to believe in moral indefinability is the fact that human concepts tend to be open texture: there is often no unique "correct" way to rigorously define them.

I strongly agree with this point. I think even very high-level concepts in moral philosophy or the philosophy of reason/self-interest are "open texture" like that. In your post you seem to start with an assumption that people have a rough, shared sense of what "ethics" is about. But if the fuzziness is already attacking at this very high level, it calls into question whether you can find a solution that seems satisfying to different people’s (fuzzy and underdetermined) sense of what the question/problem is even about.

For instance, there are the narrow interpretations such as "ethics as altruism/caring/doing good" (which I think roughly captures at least large parts of what you assume, and it also captures the parts I’m personally most interested in). There's also "ethics as cooperation or contract". And maybe the two blend into each other.

Then there’s the broader (I label it "existentialist") sense in which ethics is about "life goals" or "Why do I get up in the morning?". And within this broader interpretation of it, you suddenly get narrower subdomains like "realism about rationality" or "What makes up a person's self-interest?" where the connections to the other narrower domains (e.g. "ethics as altruism") are not always clear.

I think indefinability is a plausible solution (or meta-philosophical framework?) for all of these. But when the scope over which we observe indefinability becomes so broad, it illustrates why it might feel a bit frustrating for some people, because without clearly delineated concepts it can be harder to make progress, and so a framework in which indefinability plays a central role could in some cases obscure conceptual progress in subareas where one might be able to make such progress (at least at the "my personal morality" level, though not necessarily at the level of a "consensus morality").

(I’m not sure I’m disagreeing with you BTW; probably I’m just adding thoughts and blowing up the scope of your post.)

I would guess that many anti-realists are sympathetic to the arguments I’ve made above, but still believe that we can make morality precise without changing our meta-level intuitions much - for example, by grounding our ethical beliefs in what idealised versions of ourselves would agree with, after long reflection. My main objection to this view is, broadly speaking, that there is no canonical “idealised version” of a person, and different interpretations of that term could lead to a very wide range of ethical beliefs.

I agree. The second part of my comment here tries to talk about this as well.

And even if idealised reflection is a coherent concept, it simply passes the buck to your idealised self, who might then believe my arguments and decide to change their meta-level intuitions.

Yeah. I assume most of us are familiar with a deep sense of uncertainty about whether we found the right approach to ethical deliberation. And one can maybe avoid feeling this uncomfortable feeling of uncertainty by deferring to idealized reflection. But it’s not obvious that this lastingly solves the underlying problem: Maybe we’ll always feel uncertain whenever we enter the mode of "actually making a moral judgment". If I found myself as a virtual person who is part of a moral reflection procedure such as Paul Christiano's indirect normativity, I wouldn’t suddenly know and feel confident in how to resolve my uncertainties. And the extra power, and the fact that life in the reflection procedure would be very different from the world I currently know, introduce further risks and difficulties. I think there are still reasons why one might want to value particularly-open-ended moral reflection, but maybe it's important that people don’t use the uncomfortable feeling of "maybe I’m doing moral philosophy wrong" as their sole reason to value particularly-open-ended moral reflection. If the reality is that this feeling never goes away, then there seems something wrong with the underlying intuition that valuing particularly-open-ended moral reflection is by default the "safe" or "prudent" thing to do. (And I'm not saying it's wrong for people to value particularly-open-ended moral reflection; I suspect that it depends on one's higher-order intuitions: For every perspective there's a place where the buck stops.)

From an anti-realist perspective, I claim that perpetual indefinability would be better.

It prevents fanaticism, which is a big plus. And it plausibly creates more agreement, which is also a plus in some weirder sense (there's a "non-identity problem" type thing about whether we can harm future agents by setting up the memetic environment such that they'll end up having less easily satisfiable goals, compared to an alternative where they'd find themselves in larger agreement and therefore with more easily satisfiable goals). A drawback is that it can mask underlying disagreements and maybe harm underdeveloped positions relative to the status quo.

That may be a little more difficult to swallow from a realist perspective, of course. My guess is that the core disagreement is whether moral claims are more like facts, or more like preferences or tastes

That’s a good description. I sometimes use the analogy of "morality is more like career choice than scientific inquiry".

I don't think that's a coincidence: psychologically, humans just aren't built to be maximisers, and so a true maximiser would be fundamentally adversarial.

This is another good instrumental/pragmatic argument why anti-realists interested in shaping the memetic environment where humans engage in moral philosophy might want to promote the framing of indefinability rather than "many different flavors of consequentialism, and (eventually) we should pick".

Thanks for the detailed comments! I only have time to engage with a few of them:

Most of this is underdefined, and that’s unsettling at least in some (but not necessarily all) cases, and if we want to make it less underdefined, the notion of 'one ethics' has to give.

I'm not that wedded to 'one ethics', more like 'one process for producing moral judgements'. But note that if we allow arbitrariness of scope, then 'one process' can be a piecewise function which uses one subprocess in some cases and another in others.

I find myself having similarly strong meta-level intuitions about wanting to do something that is "non-arbitrary" and in relevant ways "simple/elegant". ...motivationally it feels like this intuition is importantly connected to what makes it easy for me to go "all-in" for my ethical/altruistic beliefs.

I agree that these intuitions are very strong, and they are closely connected to motivational systems. But so are some object-level intuitions like "suffering is bad", and so the relevant question is what you'd do if it were a choice between that and simplicity. I'm not sure your arguments distinguish one from the other in that context.

one can maybe avoid feeling this uncomfortable feeling of uncertainty by deferring to idealized reflection. But it’s not obvious that this lastingly solves the underlying problem

Another way of phrasing this point: reflection is almost always good for figuring out what's the best thing to do, but it's not a good way to define what's the best thing to do.

there's a "non-identity problem" type thing about whether we can harm future agents by setting up the memetic environment such that they'll end up having less easily satisfiable goals, compared to an alternative where they'd find themselves in larger agreement and therefore with more easily satisfiable goals

I hadn't heard of that before; I'm glad you mentioned it. Your comment (as a whole) was both interesting/insightful/etc. and long, and I'd be interested in reading any future posts you make.

For the record, this is probably my key objection to preference utilitarianism, but I didn't want to dive into the details in the post above (for a very long post about such things, see here).

Would it be fair to say that this post is mostly trying to move people who are currently at 3 or 4 in this list to positions 5 or 6?

If I had to define it using your taxonomy, then yes. However, it's also trying to do something broader. For example, it's intended to be persuasive to people who don't think of meta-ethics in terms of preferences and rationality at all. (The original intended audience was the EA forum, not LW).

Edit: on further reflection, your list is more comprehensive than I thought it was, and maybe the people I mentioned above actually would be on it even if they wouldn't describe themselves that way.

Another edit: maybe the people who are missing from your list are those who would agree that morality has normative force but deny that rationality does (except insofar as it makes you more moral), or at least are much more concerned with the former than the latter. E.g. you could say that morality is a categorical imperative but rationality is only a hypothetical imperative.

I know I often sound like a broken record, but I'd say this just keeps coming back to the fundamental uncertainty we have about the relationship between reality as it is and as we know it, and the impossibility of bringing those two into perfect, provable alignment. This is further complicated by the issue of whether or not the things we're dealing with, moral facts, exist or, if they do exist, exist mind-independently, a question to which it so far seems we are unlikely to find a solution unless we can find a synthesis over our existing notions of morality such that we become deconfused about what we were previously trying to point at with the handle "moral".

I think the least repugnant aspect of a perfect moral theory* to sacrifice might be simplicity, the way you mean it. (Though intuitively, a lot of conditions would have to be met for that to seem a reasonable move to make, personally.)

I'm not clear on how "moral undefinability" would look different from "defining morality is hard".

*General moral theory.

Would it be fair to say that moral indefinability is basically what Yudkowsky was talking about with his slogan "Value is complex"?

What about the stance of Particularism in moral philosophy? On the face of it, it seems very different, but I think it may be getting at a similar phenomenon.

I address (something similar to) Yudkowsky's view in the paragraph starting:

I would guess that many anti-realists are sympathetic to the arguments I’ve made above, but still believe that we can make morality precise without changing our meta-level intuitions much - for example, by grounding our ethical beliefs in what idealised versions of ourselves would agree with, after long reflection.

Particularism feels relevant and fairly similar to what I'm saying, although maybe with a bit of a different emphasis.