(Edit May 9th: I've gone and added a quick addendum to the end.)
In this post I'd basically like to collect some underappreciated points about utility functions that I've made in the comments of various places but which I thought were collecting into a proper, easily-referenceable post. The first part will review the different things referred to by the term "utility function", review how they work, and explain the difference between them. The second part will explain why -- contrary to widespread opinion on this website -- decision-theoretic utility functions really do need to be bounded.
(It's also worth noting that as a consequence, a number of the decision-theoretic "paradoxes" discussed on this site simply are not problems since they rely on unbounded decision-theoretic utility. An example is the original Pascal's Mugging (yes, I realize that term has since been applied to a bunch of things that have nothing to do with unbounded utility, but I mean the original problem).)
Anyway. Let's get on with it.
Part 1: "Utility function" refers to two different things that are often conflated and you should be sure you know which one you're talking about
The term "utility function" refers to two significantly different, but somewhat related, things, which are, due to the terminological and conceptual overlap, often conflated. This results in a lot of confusion. So, I want to cover the distinction here.
The two things called utility functions are:
- A function that describes the preferences of a given agent, assuming that agent's preferences satisfy certain rationality conditions, and does not depend on any particular ethical theory, but rather is useful in decision theory and game theory more generally. If I need to be unambiguous, I'll call this a decision-theoretic utility function.
- A function, used specifically in utilitarianism, that describes something like a person's preferences or happiness or something -- it's never been clearly defined, and different versions of utilitarianism suggest different ideas of what it should look like; these are then somehow aggregated over all people into an overall (decision-theoretic!) utility function (which is treated as if it were the decision-theoretic utility function describing what an ideal moral agent would do, rather than the preferences of any particular agent). If I need to be unambiguous, I'll call this an E-utility function.
(There's actually a third thing sometimes called a "utility function", which also gets confused with these other two, but this is a rarer and IMO less important usage; I'll get back to this in a bit.)
It's important to note that much discussion online conflates all of these and yields nonsense as a result. If you see someone talking nonsense about utility functions, before replying, it's worth asking -- are they mixing together different definitions of "utility function"?
So. Let's examine these in a bit more detail.
Decision-theoretic utility functions and their assumptions
Decision-theoretic utility functions describe the preferences of any consequentialist agent satisfying certain rationality conditions; by "it describes the agent's preferences", I mean that, given a choice between two options, the one yielding the higher expected utility is the one the agent chooses.
It's not obvious in advance that a rational agent's preferences need to be described by a utility function, but there are theorems guaranteeing this; Savage's theorem probably provides the best foundation for this, although the VNM theorem may be a little more familiar. (We'll discuss the difference between these two in more detail in the quick note below and discuss it further in the second part of this post.) Note that these functions are not entirely unique -- see below. Also note that these are conditions of rationality under uncertainty.
Again, a decision-theoretic utility function simply describes an agent's preferences. It has nothing to do with any particular idea of morality, such as utilitarianism. Although you could say -- as I've said above -- that it assumes a consequentialist agent, who cares only about consequences. So, any rational consequentialist agent has a decision-theoretic utility function; but only a utilitarian would admit the existence of E-utility functions. (While this doesn't exactly bear on the point here, it is worth noting that utilitarianism is a specific type of consequentialism and not identical with it!)
Note that real people will not actually obey the required rationality assumptions and thus will not actually have decision-theoretic utility functions; nonetheless, idealized rational agents, and therefore decision-theoretic utility functions are a useful abstraction for a number of purposes.
Decision-theoretic utility functions are usually stated as taking values in the real numbers, but they're only defined up to positive affine transformations (scaling by a positive constant, and translation); applying such a transformation to a utility function for an agent will yield another, equally-valid utility function. As such they may be better thought of not as taking values in R, exactly, but rather a sort of ordered 1-dimensional affine space over R. Outputs of a decision-theoretic utility function are not individually meaningful; in order to get meaningful numbers, with concrete meaning about the agent's preferences, one must take ratios of utilities, (a-b)/|c-d|. (Note the absolute value in the denominator but not the numerator, due to the importance of order.)
Decision-theoretic utility functions really need to be bounded -- a point seriously underappreciated on this website -- but I'll save discussion of that for the second part of this post.
A quick tangential note on probability and additivity
This is pretty tangential to the point of this post, but it's probably worth taking the time here to explain what the difference is between the Savage and VNM formalisms. (Well, one of two differences; the other will be discussed in the second part of this post, but as we'll see it's actually not such a difference.) The main difference is that the VNM theorem assumes that we already believe in the idea of probability -- it justifies decision-theoretic utility, but it does nothing to justify probability, it just assumes it. Savage's theorem, by contrast, provides a foundation for both probability and decision-theoretic utility simultaneously, based on just on rationality axioms about preferences, which is why I think it's the better foundation.
However, the probability measure it constructs need not actually be a probability measure as such, as it need only be finitely additive rather than countably additive. It's not clear what to make of this. Maybe countable additivity of probability just isn't necessary for a rational agent? It's hard to say. (If I'm not mistaken, the limiting probabilities of MIRI's logical inductor are merely (the analogue of) finitely additive, not countably additive, but I could be wrong about that...) But this is really off the point, so I'm just going to raise the question and then move on; I just wanted to mention it to ward off nitpicks on this point. As we'll see below, the choice of formalism doesn't actually matter much to my point here.
E-utility functions and their assumptions
This is the older meaning of the term if I'm not mistaken, but there is mostly not a lot to say about these because they're fairly ill-defined. They are, as mentioned above, specifically a utilitarian notion (not a general consequentialist notion). How to define these, as well as how to aggregate them, remain disputed.
Utilitarians say that one should try to maximize the expected value of the aggregated utility function, which means that the aggregated function is actually a weird sort of decision-theoretic utility function (corresponding to an ideal moral agent rather than any particular agent), not an E-utility function. One does not attempt to maximize expected value of E-utility functions.
One thing we can say about E-utility functions is that while only idealized rational agents have decision-theoretic utility functions and not real people, real people are supposed to have E-utility functions. Or at least so I gather, or else I don't see how utilitarianism makes sense?
Actually, one could say that it is not only utilitarians who rely on these -- there is also the notion of prioritarianism; one sometimes sees the term "aggregative consequentialism" to cover both of these (as well as other potential variants). But, because E-utility functions are so ill-defined, there is, as best I can tell, not really any meaningful distinction between the two. For example, consider a utilitarian theory that assigns to each agent p a real-valued E-utility function U_p, and aggregates them by summing. Let's suppose further that each U_p takes values in the nonnegative reals; then if we change the aggregation rule to summing the square roots of the U_p, we have changed our utilitarian theory into a prioritarian one. Except, instead of doing that, we could define U'_p = sqrt(U_p), and call U'_p the E-utilities; because there's no precise definition of E-utilities, there's nothing stopping us from doing this. But then the utilitarian theory described by the U'_p, describes exactly the same theory as the prioritarian theory described by the U_p! The theory could equally well be described as "utilitarian" or "prioritarian"; for this reason, unless one puts further restrictions on E-utility functions, I do not consider there to be any meaningful difference between the two.
As such, throughout this post I simply say "utilitarianism" rather than "aggregative consequentialism"; but if I'm wrong in identifying the two, well, whenever I say "utilitarianism" I really kind of mean "aggregative consequentialism". Hope that's OK.
Preference utilitarianism and Harsanyi's theorem (using decision-theoretic utility functions as E-utility functions)
Above I've made a point of emphasizing that decision-theoretic utility and E-utility functions are different things. But could there be cases where it makes sense to use one as the other? Specifically, to use decision-theoretic utility functions as E-utility functions? (The reverse clearly doesn't make much sense.)
Well, yes, that's basically what preference utilitarianism is! OK, precise formulations of preference utilitarianism may vary, but the idea is to use people's preferences as E-utility functions; and how are you going to encode people's preferences if not with decision-theoretic utility functions? (OK, this may only really work for a population of idealized agents, but it's still worth thinking about.)
Indeed, we can go further and formalize this with Harsanyi's theorem, which gives a series of moral assumptions (note: among them is that the agents in the population do indeed have decision-theoretic utility functions!) under which morality does indeed come down to maximizing a sort of aggregate of the population's decision-theoretic utility functions.
(Note that it also assumes that the population is fixed, which arguably assumes away a lot of the hard parts of utilitarianism, but it's still a useful starting point.)
But, what is this aggregation? If we think of the agent's utility functions as taking values in R, as they're usually thought of, then the aggregation consists of summing a utility function for each agent. But which one? As mentioned above, utility functions are only unique up to positive affine transformations. Harsanyi's theorem provides no guidance on which utility function to use for each agent -- how could it? They're all equally valid. And yet using different ones can yield very different (and meaningfully different) aggregated results, essentially letting you adjust weightings between agents! Except there's no meaningful notion of "equal weighting" to use as baseline. It's something of a problem.
(This is often discussed in terms of "weights", coefficients put in front of the utility functions; but I think this obscures the fundamental issue, in making it sound like there's a meaningful notion of "equal weights" when there really isn't.)
Still, despite these holes, preference utilitarianism and Harsanyi's theorem are definitely worth thinking about.
Brief note on that third sort of utility function
Finally, before we get to the second part of this post, I wanted to mention that third thing sometimes called a "utility function".
The term "utility function" is sometimes used for a real-valued function that describe's an agents deterministic preferences; i.e., if A and B are two options, and U the utility function, then the agent prefers A to B if and only if U(A) > U(B). Note the lack of any requirement here about expected value! This is a weaker sense than a decision-theoretic utility function as I described it above; any decision-theoretic utility function is one of these, but not vice-versa.
While you're occasionally encounter this, it's frankly a useless and even counterproductive notion. Why? Because fundamentally, it's the wrong abstraction for the situation. If uncertainty isn't coming into play, and you're only applying deterministic rationality constraints, then the right structure for describing an agent's preferences is a total preorder. Why would you introduce real numbers? That just restricts what you can express! Not every total preorder will embed in the real numbers. So, there isn't any sensible set of rationality conditions that will lead to this notion of utility function; they'll lead you instead to the idea of a total preorder, and then oops maybe that total preorder will fail to embed in R and the agent won't have a "utility function" in this sense.
Such a function is of course only unique up to order-preserving functions on R, meaning it's not very unique at all (one more sign of it being the wrong abstraction).
Why were such functions ever even used, when they're clearly the wrong abstraction? I think basically it's because a lot of people lack familiarity with mathematical structures, or how to build an abstraction to suit a set of requirements, and instead tend to just immediately reach for the real numbers as a familiar setting to put things in. (Honestly, that's probably why decision-theoretic utility functions were initially defined as R-valued as well; fortunately, in that case, it turns out to be the correct choice! The real numbers can indeed be quite useful...)
Of course, as discussed above, if your agent not only obeys requirements of deterministic rationality, but also requirements of rationality under uncertainty, then in fact they'll have a decision-theoretic utility function, taking values in R, and so will have one of these. So in that sense the assumption of taking these values in R is harmless. But still...
Part 2: Yes, decision-theoretic utility functions really do need to be bounded
OK. Now for the main point: Contrary to widespread opinion on this site, decision-theoretic utility functions really do need to be bounded.
First, I'm going to discuss this in terms of Savage's theorem. I realize this is the less familiar formalism for many here, but I think it's the better one; if you're not familiar with it I recommend reading my post on it. I'll discuss the point in terms of the more familiar VNM formalism shortly.
OK. So under Savage's formalism, well, Savage's theorem tells us that (under Savage's rationality constraints) decision-theoretic utility functions must be bounded. Um, OK, hm, that's not a very helpful way of putting it, is it? Let's break this down some more.
There's one specific axiom that guarantees the boundedness of utility functions: Savage's axiom P7. Maybe we don't need axiom P7? Is P7 really an important rationality constraint? It seems intuitive enough, like a constraint any rational agent should obey -- what rational agent could possibly violate it? -- but maybe we can do without it?
Let's hold that thought and switch tracks to the VNM formalism instead. I mean -- why all this discussion of Savage at all? Maybe we prefer the VNM formalism. That doesn't guarantee that utility functions are bounded, right?
Indeed, as usually expressed, the VNM formalism doesn't guarantee that utility functions are bounded... except the usual VNM formalism doesn't actually prove that utility functions do everything we want!
The point of a decision-theoretic utility function is that it describes the agent's preferences under uncertainty; given two gambles A and B, the one with the higher expected utility (according to the function) is the one the agent prefers.
Except, the VNM theorem doesn't actually prove this for arbitrary gambles! It only proves it for gambles with finitely many possible outcomes. What if we're comparing two gambles and one of them has infinitely many possible outcomes? This is something utility functions are often used for on this site, and a case I think we really do need to handle -- I mean, anything could potentially have infinitely many possible outcomes, couldn't it?
Well, in this case, the VNM theorem by itself provides absolutely no guarantee that higher expected utility actually describes the agent's preference! Our utility function might simply not work -- might simply fail to correctly describe the agent's preference -- once gambles with infintely many outcomes are involved!
Hm. How troublesome. OK, let's take another look at Savage and his axiom P7. What happens if we toss that out? There's no longer anything guaranteeing that utility functions are bounded. But also, there's no longer anything guaranteeing that the utility function works when comparing gambles with infinitely many outcomes!
Sounds familiar, doesn't it? Just like with VNM. If you don't mind a utility function that might fail to correctly describe your agent's preferences once infinite gambles get involved, then sure, utility functions can be unbounded. But, well, that's really not something we can accept -- we do need to be able to handle such cases; or at least, such cases are often discussed on this site. Which means bounded utility functions. There's not really any way around it.
And if you're still skeptical of Savage, well, this all has an analogue in the VNM formalism too -- you can add additional conditions to guarantee that the utility function continues to work even when dealing with infinite gambles, but you end up proving in addition that the utility function is bounded. I'm not so familiar with this, so I'll just point to this old comment by AlexMennen for that...
Anyway, point is, it doesn't really matter which formalism you use -- either you accept that utility functions are bounded, or you give up on the idea that utility functions produce meaningful results in the face of infinite gambles, and, as I've already said, the second of these is not acceptable.
Really, the basic reason should go through regardless of the particular formalism; you can't have both unbounded utility functions, and meaningful expected utility comparisons for infinite gambles, because, while the details will depend on the particular formalism, you can get contradictions by considering St. Petersburg-like scenarios. For instance, in Savage's formalism, you could set up two St. Petersburg-like gambles A and B such that the agent necessarily prefers A to B but also necessarily is indifferent between them; forcing the conclusion that in fact the agent's utility function must have been bounded, preventing this setup.
I'd like to note here a consequence of this I already noted in the intro -- a number of the decision-theoretic "paradoxes" discussed on this site simply are not problems since they rely on unbounded decision-theoretic utility. An example is the original Pascal's Mugging; yes, I realize that term has since been applied to a bunch of things that have nothing to do with unbounded utility, but the original problem, the one Yudkowsky was actually concerned with, crucially does.
And I mean, it's often been noted before that these paradoxes go away if bounded utility is assumed, but the point I want to make is stronger -- that the only reason these "paradoxes" seem to come up at all is because contradictory assumptions are being made! That utility functions can be unbounded, and that utility functions work for infinite gambles. One could say "utility functions have to be bounded", but from a different point of view, one could say "expected utility is meaningless for infinite gambles"; either of these would dissolve the problem, it's only insisting that neither of these are acceptable that causes the conflict. (Of course, the second really is unacceptable, but that's another matter.)
Does normalization solve the weighting problem? (I wouldn't bet on it)
One interesting note about bounded utility functions is that it suggests a solution to the weighting problem discussed above with Harsanyi's theorem; notionally, one could use boundedness to pick a canonical normalization -- e.g., choosing everyone's utility function to have infimum 0 and supremum 1. I say it suggests a solution rather than that it provides a solution, however, in that I've seen nothing to suggest that there's any reason one should actually do that other than it just seeming nice, which, well, is not really a very strong reason for this sort of thing. While I haven't thought too much about it, I'd bet someone can come up with an argument as to why this is actually a really bad idea.
(And, again, this would still leave the problem of population ethics, as well as many others, but still, in this idealized setting...)
Some (bad) arguments against boundedness
Finally, I want to take a moment here to discuss some arguments against boundedness that have come up here.
Eliezer Yudkowsky has argued against this (I can't find the particular comment at the moment, sorry) basically on the idea of that total utilitarianism in a universe that can contain arbitrarily many people requires unbounded utility functions. Which I suppose it does. But, to put it simply, if your ethical assumptions contradict the mathematics, it's not the mathematics that's wrong.
That's being a bit flip, though, so let's examine this in more detail to see just where the problem is.
Eliezer would point out that the utility function is not up for grabs. To which I can only say, yes, exactly -- except that this way of formulating it is slightly less than ideal. We should say instead, preferences are not up for grabs -- utility functions merely encode these, remember. But if we're stating idealized preferences (including a moral theory), then these idealized preferences had better be consistent -- and not literally just consistent, but obeying rationality axioms to avoid stupid stuff. Which, as already discussed above, means they'll correspond to a bounded utility function. So if your moral theory is given by an unbounded utility function, then it is not, in fact, a correct description of anyone's idealized preferences, no matter how much you insist it is, because you're saying that people's idealized (not real!) preferences are, essentially, inconsistent. (I mean, unless you claim that it's not supposed to be valid for infinite gambles, in which case it can I suppose be correct within its domain of applicability, but it won't be a complete description of your theory, which will need some other mechanism to cover those cases; in particular this means your theory will no longer be utilitarian, if that was a goal of yours, and so in particular will not be total-utilitarian.)
One could question whether the rationality constraints of Savage (or VNM, or whatever) really apply to an aggregated utility function -- above I claimed this should be treated as a decision-theoretic utility function, but is this claim correct? -- but I think we have to conclude that they do for the same reason that they apply to preferences of ideal agents, i.e., they're supposed to be a consistent set of preferences; an incosistent (or technically consistent but having obvious perversities) moral system is no good. (And one could imagine, in some idealized world, one's ethical theory being programmed as the preferences of an FAI, so...)
Basically the insistence on unbounded utility functions strikes me as, really, backwards reasoning -- the sort of thing that only makes sense if one starts with the idea of maximizing expected utility (and maybe not distinguishing too strongly between the two different things called "utility functions"), rather than if one starts from agents' actual preferences and the rationality constraints these must obey. If one remembers that utility functions are merely meant to describe preferences that obey rationality constraints, there's no reason you'd ever want them to be unbounded; the math rules this out. If one reasons backwards, however, and starts with the idea of utility functions, it seems like a harmless little variant (it isn't). So, I'd like to encourage everyone reading this to beware of this sort of backwards thinking, and to remember that the primary thing is agents' preferences, and that good rationality constraints are directly interpretable in terms of these. Whereas "the agent has a decision-theoretic utility function"... what does that mean, concretely? Why are there real numbers involved, where did those come from? These are a lot of very strong assumptions to be making with little reason! Of course, there are good reason to believe these strong-sounding claims, such as the use of real numbers specifically; but they make sense as conclusions, not assumptions.
Tangential note about other formalisms (or: I have an axe to grind, sorry)
One final tangential note: Eliezer Yudkowsky has occasionally claimed here that probability and decision-theoretic utility should be grounded not in Savage's theorem but rather in the complete class theorem (thus perhaps allowing unbounded utilities, despite the reasons above the particular formalism shouldn't matter?), but the arguments he has presented for this do not make any sense to me and as best I can tell contain a number of claims that are simply incorrect. Like, obviously, the complete class theorem cannot provide a foundation for probability, when it already assumes a notion of probability; I may be mistaken but it looks to me like it assumes a notion of decision-theoretic utility as well; and his claims about it requiring weaker assumptions than Savage's theorem are not only wrong but likely exactly backwards. Apologies for grinding this axe here, but given how this has come up here before I thought it was necessary. Anyway, see previous discussion on this point, not going to discuss it more here. (Again, sorry for that.)
Anyway I hope this has clarified the different things meant by the term "utility function", so you can avoid getting these mixed up in the future, and if you see confused discussion of them you can come in and de-confuse the issue.
...and yes, decision-theoretic utility functions really do need to be bounded.
Addendum May 9th: I should note, if for some reason you really want to bite the bullet and instead say, OK, utility functions don't apply to choices with infinitely many possible outcomes, then, well, I think that's silly, but it is consistent. But what I want to make clear here is that doing both that and insisting utility functions should be unbounded -- which is what you need for "paradoxes" as discussed above to come up -- is just inconsistent; you are at the least going to need to pick one of the horns of the dilemma