What we talk about when we talk about maximising utility

by Richard_Ngo, 5 min read, 24th Feb 2018, 18 comments

tl;dr: “Utility” is used on LW to mean what people want, but that’s not what's morally relevant. Utilitarians aren't trying to maximise this sort of utility, but rather "well-being".

Epistemic status: probably obvious to some, but this particular framing wasn't totally clear to me until recently, and the terminology is definitely ambiguous.

Use of the term “utility” on Less Wrong implicitly conflates two definitions. Consider person X. In the economic sense, X's utility corresponds to the things that X would choose to maximise; we can abstract this as a "utility function" which maps possible worlds to a real number. For example, if X would save the lives of their family even at the cost of their own life, then we'd say that X assigns higher utility to a world in which their family lives happily than one in which they do. This is perfectly reasonable and normal. (Some people argue that X is actually prioritising their own happiness because if they chose otherwise, they'd be miserable from guilt. But this seems like an implausible model of their actual reasoning; I don't think many people who would save their families over themselves would change their minds even if offered guaranteed happiness afterwards.) A similar definition of utility is used when reasoning about artificial agents; for example, LW Wiki says “Utility is how much a certain outcome satisfies an agent’s preferences”.

However, this makes it very confusing to talk about maximising utility as a moral goal. Taken literally, maximising (economic) utility means wanting the sum of all people’s utility functions to be as high as possible. (Edit: in the standard definition of economic utility, this is not well-defined, since utilities can't be compared between people. The following argument is one intuitive reason why we can't maximise even versions of economic-style utility which do allow interpersonal comparison, such as the ones I'll discuss later.) But by doing so, we are double-counting! Let's say I assign utility U to living a happy life, and utility U+1 to my wife living a happy life; my wife does the converse. If we both have happy lives, we have total utility 4U+2, which means that our lives should be prioritised over the lives of four other people who value their own lives just as highly, but don't care much about other people! This is bizarre, and gets more so when we consider that people might have many strong relationships. By this calculation method, a family of five people who all value each other over themselves have more total utility than 25 equally happy loners. Obviously maximising this sort of utility is not what standard utilitarians want.
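The double-counting arithmetic can be checked with a minimal Python sketch. The value U = 10 and the function names are illustrative assumptions, and the sketch assumes (as the argument does, for the sake of the reductio) that utilities can simply be added across people.

```python
U = 10.0  # hypothetical utility a person assigns to their own happy life


def family_total(n, u):
    """Summed utility of n happy people who each assign u to their own
    happy life and u + 1 to the happy life of each other member."""
    per_person = u + (n - 1) * (u + 1)
    return n * per_person


def loners_total(n, u):
    """Summed utility of n happy loners who each assign u to their own
    happy life and nothing to anyone else's."""
    return n * u


couple = family_total(2, U)        # equals 4U + 2
family_of_five = family_total(5, U)  # equals 25U + 20

print(couple > loners_total(4, U))           # the couple outweighs four loners
print(family_of_five > loners_total(25, U))  # five relatives outweigh 25 loners
```

For any U, the couple's total exceeds that of four loners by 2, and the family's total exceeds that of 25 loners by 20, which is the counterintuitive result described above.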

By contrast, “utility” as used in the context of utilitarianism and ethics in general (which I will from now on call well-being) is a metric of how good a life is for the person living it. There are various accounts of well-being; the two most prominent types are desire theories, and hedonic theories. Under the former, a person has high well-being if the things they desire actually occur, even if they never find out about them. This is basically the same as the definition of utility I outlined above - which means it faces exactly the same double-counting problem. Hedonic theories of well-being, on the other hand, imply that your well-being is a function of only your psychological state. There are many different functions that it could be - for example, ones which only care about suffering; or also care about pleasure; or also care about a sense of fulfillment and meaningfulness. The specifics don't matter for our purposes; let's accept the broad idea and see where it leads.

Unfortunately, it immediately leads us to a major problem: since well-being is distinct from utility, people’s actions aren’t a good guide to their actual well-being function. In fact, maximising the well-being of any group of people might be opposed by every person who is affected by the change! Consider first a group of size one: just me. Suppose my life's goal is to write the greatest novel ever, even though I know that slaving away to complete it will make me less happy than I could have been. I also know that if I ever stop working on it, I'll become lazy, my goals will change, and I'll settle for a happy but boring life. You decide that you could maximise my well-being by forcing me to stop working on it - and by the account above, you'd be doing a moral good even though I'd fight you tooth and nail.

One more example, this time with n=2: suppose I am about to suffer torture. Suppose also that I have a wife, who I love deeply, although she doesn't love me nearly as much; also, she has a higher pain tolerance than me. Now you intervene so that instead of me being tortured, my wife is tortured instead, without my knowledge. My well-being is now higher than it would have been, and the total well-being between the two of us is also higher (since she can bear the pain better). Yet if either of us heard about your plan, we would both strongly object.

Some people are willing to bite the bullet and say that we should just maximise hedonic well-being even if all people we are "benefiting" think we're making their lives worse. This implies that, all else being equal, it would be better to force everyone into experience machines, because psychological experiences are all that matter. At a certain point, accepting or rejecting this position comes down to a brute clash of intuitions. I think that my life would have less value if all my friends were secretly contemptuous of me, and all the things I learned throughout my life were actually wrong, and after my death I was despised - even if I never found out about any of those facts. Your mileage may vary.

The best compromise I can come up with is a solution in which your well-being is the sum of a desire-satisfaction function and a hedonic function - but where the desires we consider are limited to those about your own life. As always with morality, this is somewhat vague. For example, you might desire to have a child, and desire that the child has certain traits, and go into a certain career, and have a good life. Where does this stop becoming "about you"? I don't think there's any clear line to be drawn between desires that are and aren't about your own life, but if we want people’s desires to be morally relevant in a sensible way, we need to pick some boundary; even if they are all well-informed and reflectively consistent, we can't just classify them all as part of the "utility function" which should be maximised.

18 comments

A theorem that, as far as I know, isn't in the collective LW memescape and really should be, is Harsanyi's theorem. Loosely, it states that any decision-making procedure for making decisions on behalf of a collection of VNM agents satisfying a few straightforward axioms must be equivalent to maximizing a weighted sum of the VNM agents' utility functions. The sum has to be weighted since VNM utility functions are only well-defined up to a positive affine transformation, and Harsanyi's theorem doesn't provide any guidance in determining the weights. I don't know that I've seen a public discussion anywhere of the ramifications of this.

Harsanyi's theorem only applies to VNM agents whose beliefs about the world agree. For the general case where they don't, Critch proved a generalization in this paper, whose ramifications I also haven't seen discussed publicly. I haven't digested the details, but the gist is that the better your beliefs predict reality, the more weight your utility function gets.

A while ago I described something similar on the decision-theory-workshop mailing list, with two differences (I think they are improvements but YMMV):

1) It's easier to use UDT than VNM, because UDT agents have no beliefs, only a utility function that mixes beliefs with values (e.g. if you think a coin is/was fair, your utility is the average of your utilities in heads-world and tails-world). When two agents merge, you just take a weighted sum of their utility functions.

2) In some games, specifying the weighted sum is not enough. For example, in the dividing the dollar game, maximizing any weighted sum will give the whole dollar to one player, except the equally weighted sum which is indifferent. To achieve e.g. an equal division of the dollar, the agents need to jointly observe a coinflip while constructing the merged agent, or more generally to agree on a probability distribution over merged agents (with the restriction that it must lie on the Pareto frontier).
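A rough Python sketch of the dividing-the-dollar point, under assumptions not stated in the comment: each player's utility is taken to be linear in their share of the dollar, the possible splits are restricted to a coarse grid, and the function name `best_shares` is made up for illustration.

```python
def best_shares(weight_a, steps=10):
    """Shares for player A (on a grid from 0 to 1) that maximise the weighted sum
    weight_a * u_A + (1 - weight_a) * u_B, assuming u_A = share and u_B = 1 - share."""
    shares = [i / steps for i in range(steps + 1)]
    scores = [weight_a * s + (1 - weight_a) * (1 - s) for s in shares]
    top = max(scores)
    # Keep every share whose score ties with the maximum (up to rounding error).
    return [s for s, score in zip(shares, scores) if abs(score - top) < 1e-12]


print(best_shares(0.7))       # player A gets the whole dollar
print(best_shares(0.3))       # player B gets the whole dollar
print(len(best_shares(0.5)))  # equal weights: every split on the grid ties

# A 50/50 lottery over the two extreme merged agents gives each player
# an expected share of half a dollar.
expected_share_a = 0.5 * best_shares(0.7)[0] + 0.5 * best_shares(0.3)[0]
print(expected_share_a)
```

With linear utilities no fixed weighting singles out the equal split, which is why the sketch falls back on a lottery over merged agents to get an equal division in expectation.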

I have had some conversations about this in Berkeley and at FHI, and I think I remember some posts by Stuart Armstrong on this. So this hasn't fully avoided the landscape, though I agree that I haven't seen any particularly good coverage of this.

Taken literally, maximising (economic) utility means wanting the sum of all people’s utility functions to be as high as possible. But by doing so, we are double-counting! Let’s say I assign utility U to living a happy life, and utility U+1 to my wife living a happy life; my wife does the converse. If we both have happy lives, we have total utility 4U+2, which means that our lives should be prioritised over the lives of four other people who value their own lives just as highly, but don’t care much about other people! This is bizarre, and gets more so when we consider that people might have many strong relationships. By this calculation method, a family of five people who all value each other more than themselves have more total utility than 25 equally happy loners.

This is incorrect, and the mistake is a critical flaw in your reasoning; but, ironically, it is also incorrect in a second, totally different way, which makes your reasoning come out right in the end (but only by coincidence).

Mistake One: Decision-theoretic utility attaches to outcomes, and ‘outcomes’ here must be understood to be world-states, not components of world-states. In other words, your example would properly go like this:

Say you assign utility U to the outcome “I live a happy life, and so does my wife”; and your wife assigns utility U to the outcome “I live a happy life, and so does my husband”. Adding these gives us 2U; no double-counting occurs, nor is it ever possible for double-counting to occur.

However, celebrations are premature, because then there is…

Mistake Two: As per the Von Neumann–Morgenstern utility theorem (which tells us that the preferences of any agent from which a utility function may be constructed, a.k.a. any agent that “has” a utility function, must comply with the VNM axioms), an agent’s utility function is defined only up to positive affine transformation. This makes interpersonal utility comparison—and thus, any arithmetic, such as summation—impossible (i.e., undefined and meaningless).

This means that we cannot, in fact, add up two people’s decision-theoretic utilities for any given outcome.
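A small Python sketch of why summation is not meaningful for utilities defined only up to positive affine transformation. The two agents, their numbers, and the helper names are hypothetical; the point is only that rescaling one agent's utility function leaves that agent's preferences unchanged while flipping which outcome maximises the sum.

```python
def rescale(utilities, scale, shift):
    """Apply a positive affine transformation (scale > 0) to a utility function."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return {outcome: scale * u + shift for outcome, u in utilities.items()}


def best_by_sum(*agents):
    """Outcome that maximises the unweighted sum of the given utility functions."""
    outcomes = agents[0].keys()
    return max(outcomes, key=lambda o: sum(agent[o] for agent in agents))


alice = {"A": 1.0, "B": 0.0}  # Alice prefers outcome A
bob = {"A": 0.0, "B": 0.5}    # Bob prefers outcome B
bob_rescaled = rescale(bob, scale=10.0, shift=3.0)  # still prefers B to A

print(best_by_sum(alice, bob))           # the sum favours A
print(best_by_sum(alice, bob_rescaled))  # the sum now favours B
```

Both `bob` and `bob_rescaled` represent the same VNM preferences, so a summed "total utility" that depends on which representation is chosen carries no information unless some extra normalisation is assumed.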

I think your first objection is technically correct, but irrelevant to the point I was making; and your second objection is entirely consistent with my conclusion.

On "mistake one": I am using "I assign utility U to living a happy life" as a shorthand for something like "In general, the difference in utilities I assign between worlds in which I am happily alive, and worlds in which I am not, is U, all else being equal." This is a perfectly normal sort of phrasing; for example, the wikipedia page on utility says that it "represents satisfaction experienced by the consumer from a good." Do you object to this and any other talk of utility which isn't phrased in terms of world-states?

On "mistake two": I should have mentioned (and will edit to add) that economists don't endorse interpersonal comparison of economic utility. But I'm not endorsing it either: I'm explicitly flagging it as a philosophical mistake, and explaining one reason why attempts to do so are misguided. This is more useful than simply saying that it's ill-defined, because the latter leaves us to wonder why we can't just construct a new way to compare utilities between people - for example, in another comment cousin_it is basically arguing for economic-style utility + interpersonal comparison.

Re: mistake one:

Firstly, your redefinition of utility values assumes that the difference within any pair of world-states which differ in some fixed way is constant, regardless of what other properties those world-states have. That does not seem a likely assumption to me, and in any case must be stated explicitly and defended (and I expect you will have some difficulty defending it).

More importantly, even if we agree to this quite significant assumption, what then? If we understand that it’s world-states that we’re concerned with, then the notion of “double-counting” is simply inappropriate. Each person’s valuation of a given world-state counts separately. Why should it not? Importantly, I do not see how your objection about families, etc., can be constructed in such a framework—even if you do the transformation to “relative utilities” that you propose!

Re: mistake two:

If you agree that interpersonal utility comparison is a mistake, then we do seem to be on the same page.

On the other hand, if your stated reason for believing it to be a mistake is the “double-counting” issue, then that is a bad reason, because there is no double-counting! The right reason for viewing it as a mistake is that it’s simply undefined—mathematical nonsense.

Re mistake two:

Okay, so it's a mistake because it's simply undefined mathematical nonsense. Now let me define a new form of utility which differs from economic utility only by the fact that interpersonal comparisons are allowed, and occur in whatever way you think is most reasonable. How do you feel about using this new form of utility to draw moral conclusions? I think my arguments are relevant to that question.

Re mistake one:

I'm not assuming that the difference within any pair of world states which differ in a certain way is constant any more than an economist is when they say "let X be the utility that is gained from consuming one unit of good Y". Both are approximations, but both are useful approximations.

If you'd prefer, I can formalise the situation more precisely in terms of world-states. For each world-state, each member of the family assigns it utility equal to the number of family members still alive. So if they all die, that's 0. If they all survive, that's 5, and then the total utility from all of them is 25 (assuming we're working in my "new form of utility" from above, where we can do interpersonal addition).

Meanwhile each loner assigns 1 utility to worlds in which they survive, and 0 otherwise. So now, if we think that maximising utility is moral, we'd say it's more moral to kill 24 loners than one family of 5, even though each individual values their own life equally. I think that this conclusion is unacceptable, and so it is a reductio of the idea that we should maximise any quantity similar to economic utility.
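This world-state formalisation can be written out as a short Python sketch. It assumes, as the comment does, that utilities can be added across people; the constant and function names are illustrative.

```python
FAMILY_SIZE = 5
LONER_COUNT = 24


def total_utility(family_alive, loners_alive):
    """Summed utility of a world-state over all 29 people.

    Each of the 5 family members assigns the world a utility equal to the
    number of family members alive; each loner assigns 1 if they
    themselves are alive and 0 otherwise.
    """
    family_part = FAMILY_SIZE * family_alive
    loner_part = loners_alive
    return family_part + loner_part


family_saved = total_utility(family_alive=FAMILY_SIZE, loners_alive=0)
loners_saved = total_utility(family_alive=0, loners_alive=LONER_COUNT)

print(family_saved, loners_saved)  # the family-saving world scores higher
```

Under these assumptions the sum-maximising choice is to let the 24 loners die rather than the family of 5, even though each individual's valuation of their own survival is the same.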

Okay, so it’s a mistake because it’s simply undefined mathematical nonsense. Now let me define a new form of utility which differs from economic utility only by the fact that interpersonal comparisons are allowed, and occur in whatever way you think is most reasonable. How do you feel about using this new form of utility to draw moral conclusions?

My feeling about this new form of utility is that “this definition is incoherent”. It can’t be used to draw moral conclusions because it’s a nonsensical concept in the first place.

That interpersonal utility comparisons are impossible in VNM utility is not some incidental fact, it is an inevitable consequence of the formalism’s assumptions. If you believe a different formalism—one without that consequence—is possible, I should very much like to hear about it… not to mention the fact that if you were to discover such a thing, tremendous fame and glory, up to and possibly even including a Nobel Prize, would be yours!

I’m not assuming that the difference within any pair of world states which differ in a certain way is constant any more than an economist is when they say “let X be the utility that is gained from consuming one unit of good Y”.

Just because economists sometimes say a thing, does not make that thing any less nonsensical. (If you doubt this, read any of Oskar Morgenstern’s work, for instance.)

If you’d prefer, I can formalise the situation more precisely in terms of world-states. [details snipped]

What if the loner assigns 50 utility to worlds in which they survive? Or 500? Then would we say that it’s more moral to kill many families than to kill one loner?

This problem has absolutely nothing to do with any “double-counting”, and everything to do with the obvious absurdities that result when you simply allow anyone to assign any arbitrary number they like to world-states, and then treat those numbers as if, somehow, they are on the same scale. I should hardly need to point out how silly that is. (And this is before we get into the more principled issues with interpersonal comparisons, of course.)

The first question in any such scenario has to be: “Where are these numbers coming from, and what do they mean?” If we can’t answer it in a rigorous way, then the discussion is moot.

That interpersonal utility comparisons are impossible in VNM utility is not some incidental fact, it is an inevitable consequence of the formalism’s assumptions.

Any consequence of a formalism's assumptions is inevitable, so I don't see what you mean. This happens to be an inevitable consequence which you can easily change just by adding a normalisation assumption. The wikipedia page for social choice theory is all about how social choice theorists compare utilities interpersonally - and yes, Amartya Sen did win a Nobel prize for related work. Mostly they use partial comparison, but there have been definitions of total comparison which aren't "nonsensical".

The first question in any such scenario has to be: “Where are these numbers coming from, and what do they mean?” If we can’t answer it in a rigorous way, then the discussion is moot.

I agree that if you're trying to formulate a moral theory, then you need to come up with such numbers. My point is that, once you have come up with your numbers, then you need to solve the issue that I present. You may not think this is useful, but there are plenty of people who believe in desire utilitarianism; this is aimed at them.

Very clear thinking, thanks for writing that! I think the desire view is the only one that matters, and the right way to aggregate desires of many people is by negotiation (real or simulated). CEV is a proposal along these lines, though it's a huge research problem and nowhere near solved. Anyway, since we don't have a superpowered AI or a million years to negotiate everything, we should probably pick a subset of desires that don't need much negotiating (e.g. everyone wants to be healthy but few people want others to be sick) and point effective charity at that. Not sure the hedonic view should ever be used - you're right that it has unfixable problems.

I feel like "negotiation" is very handwavey. Can you explain what that looks like in a simple zero-sum situation?
For example, suppose that you can either save the lives of the family of 5 that I described above, or else save 20 loners who have no strong relationships; assume every individual has an equally strong desire to remain alive. How do we actually aggregate all their desires, without the problem of double counting?

The reason I think hedonic views are important is that desires can be arbitrarily weird. I don't want to endorse as moral a parent who raises their child with only one overwhelmingly strong desire - that the sky remains blue. Is that child's well-being therefore much higher than anyone else's, since everyone else has had some of their desires thwarted? More generally, I don't think a "desire" is a particularly well-defined concept, and wouldn't want it to be my main moral foundation.

If everyone in the family has X-strong desire that the family should live, and every loner has Y-strong desire to live, I'd save the family iff 5X>20Y. Does that make sense?

It makes sense, but I find it very counterintuitive, partly because it's not obvious to me whether the concept of "measuring desire" makes sense. Here are two ways that I might measure whether people have a stronger desire for A or B:

1) I hook up a brainwave reader to each person, and see how strong, emotional, or determined their feelings are about outcome A vs outcome B.

2) I ask each person whether they would swap outcome A for outcome B.

In the first case, it's plausible to me that each person's emotions are basically maxed out at the thought of either their own death, or their family's death (since we know people are very bad at having emotions which scale appropriately with numbers). So then X = Y, and you save the 20 people.

In the second case, assume that each person involved desires to continue living, personally, at about the same strength S. But then you ask each member of the family whether they'd swap that for someone else in their family surviving, and they'd say yes. So therefore each member of the family has total desire > 5S that their family survives, whereas each loner has desire S to survive themselves, and so you save the family.

Which one is closer to your view of measuring desire? 2 seems more intuitive to me, because it matches the decisions we'd actually make, but then I find the conclusion that it's more moral to save the family very strange.

The first case is closer to my view, but it's not about emotions getting maxed out. It's more about "voting power" being equalized between people, so you can't get a billion times more voting power by caring about a billion people. You only get a fixed amount of votes to spread between outcomes. That's how I imagine negotiation to work, though it's still very handwavey.
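One possible reading of the "fixed amount of votes" idea, as a rough Python sketch. The comment does not specify a mechanism, so the normalisation rule (each person's desire strengths are scaled to sum to one vote), the desire strengths, and all names here are assumptions made for illustration.

```python
def normalise(desires):
    """Scale one person's non-negative desire strengths so they sum to one vote."""
    total = sum(desires.values())
    if total <= 0:
        raise ValueError("a person needs at least one positive desire")
    return {target: strength / total for target, strength in desires.items()}


def votes_for(voters, targets):
    """Total normalised voting weight that `voters` place on the survival of `targets`."""
    return sum(
        weight
        for desires in voters
        for target, weight in normalise(desires).items()
        if target in targets
    )


# Five family members: each values every relative's life twice as much as their own.
family_names = {f"family_{i}" for i in range(5)}
family = [
    {name: (1.0 if name == f"family_{i}" else 2.0) for name in family_names}
    for i in range(5)
]

# Twenty loners: each cares only about their own life.
loner_names = {f"loner_{i}" for i in range(20)}
loners = [{f"loner_{i}": 1.0} for i in range(20)]

family_votes = votes_for(family, family_names)  # about 5 votes in total
loner_votes = votes_for(loners, loner_names)    # 20 votes in total

print(family_votes < loner_votes)
```

Under this reading, caring intensely about more people only redistributes a person's single vote, so the family of five carries about five votes against the loners' twenty.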

Okay, but now you've basically defined "increasing utility" out of existence? If voting power is roughly normalised, then it's roughly equally important to save the life of an immensely happy, satisfied teenager with a bright future, and a nearly-suicidal retiree who's going to die soon anyway, as long as staying alive is the strongest relevant desire for both. In fact, it's even worse: assuming the teenager has a strong unreciprocated crush, then I can construct situations where only 1/2 of their voting power will go towards saving themselves, so their life is effectively half as valuable as a loner's.

I don't think that's a big problem as long as there are enough people like you, whose altruistic desires slightly favor the teenager.

I don't see what the problem is. Utilitarianism says that there is, or ought to be, some objective utility function, the maximization of which is what determines "good" and "evil". This function need not be a linear combination of people's personal utility functions; it can be "well-being" as you describe. But this doesn't make it fundamentally different from other utility functions: it's simply a set of preferences (even if nobody in real life actually has these preferences in this precise order). Theoretically, if someone did possess this as their actual utility function they would be a perfectly good person, and if we knew exactly how to formulate it we could describe people's goodness or evilness based on how well their personal utility function aligned with this one.

What you've defined above is just morality in general: basically any moral theory can be expressed as a "nonlinear" function of some properties of individuals plus some properties of the world. For example, in deontology one nonlinearity is the fact that murdering someone is nearly-infinitely bad.

The key thing that utilitarianism does is claim that the function we should be maximising is roughly linear in well-being; my main point is clarifying that it shouldn't be linear in "utility" (in either a desire or an economic sense).