Underappreciated points about utility functions (of both sorts)

You should check out Abram's post on complete class theorems. He specifically addresses some of the concerns you mentioned in the comments of Yudkowsky's posts.

Also, my inner model of Jaynes says that the right way to handle infinities is not to outlaw them, but to be explicit and consistent about what limits we're taking.

You should check out Abram's post on complete class theorems. He specifically addresses some of the concerns you mentioned in the comments of Yudkowsky's posts.

So, it looks to me like what Abram is doing -- once he gets past the original complete class theorem -- is basically just inventing some new formalism along the lines of Savage. I think it is very misleading to refer to this as "the complete class theorem" -- how on earth was I supposed to know that *this* was what was being referred to when "the complete class theorem" was mentioned, when it resembles the original theorem so little (and it's the original theorem that was linked to)? -- and I don't see why it was necessary to invent this anew, but sure, I can accept that it presumably works, even if the details aren't spelled out.

But I must note that he starts out by saying that he's only considering the case when there's only a finite set of states of the world! I realize you weren't making a point about bounded utility here; but from that point of view, it is quite significant...

Also, my inner model of Jaynes says that the right way to handle infinities is not to outlaw them, but to be explicit and consistent about what limits we're taking.

I don't really understand what that means in this context. It is already quite explicit what limits we're taking: Given an action (a measurable function from states of the world to outcomes), take its expected utility, with regard to the [finitely-additive] probability on states of the world. (Which is implicitly a limit of sorts.)

I think this is another one of those comments that makes sense if you're reasoning backward, starting from utility functions, but not if you're reasoning forward, from preferences. If you look at things from a utility-functions-first point of view, then it looks like you're outlawing infinities (well, unboundedness that leads to infinities). But from a preferences-first point of view, you're not outlawing anything. You haven't outlawed unbounded utility functions, rather they've just failed to satisfy fundamental assumptions about decision-making (remember, if you don't have P7 your utility function is not guaranteed to return correct results about infinite gambles at all!) and so clearly do not reflect your idealized preferences. You didn't get rid of the infinity, it was simply never there in the first place; the idea that it might have been turned out to be mistaken.

But if we’re stating idealized preferences (including a moral theory), then these idealized preferences had better be consistent—and not literally just consistent, but obeying rationality axioms to avoid stupid stuff.

“Stupid” by what standard?

Surely it can’t be “by the standard of your preferences”—because your preferences are what’s being evaluated. What, then? Is there some other standard? But why should we accept this standard, if it is not contained within, or implied by, our preferences?

It is not at all obvious to me that the quoted claim is coherent or true. I should like to see it argued for, explicitly.

Relatedly:

So if your moral theory is given by an *unbounded* utility function, then it is *not*, in fact, a correct description of anyone’s idealized preferences, no matter how much you insist it is, because you’re saying that people’s idealized (not real!) preferences are, essentially, inconsistent.

Suppose I am saying exactly this. And what of it? What’s wrong with this?

an incosistent [sic] (or technically consistent but having obvious perversities) moral system is no good

Why not?

In other words: suppose that we have a choice between a moral system that fails to properly capture all of our preferences (but is consistent/etc.), and a moral system which is inconsistent (or perverse etc.), but captures all of our preferences. You say that we should *obviously* choose the former. Why? This choice does not seem obvious to me.

Building FAI that will correctly optimize something inconsistent seems like an even more daunting task than building FAI :-)

Sure, but that's a reason to research consistent values that are close to ours, so we have something to program into a certain kind of FAI. That's why people research "idealizing values", and I think it's a worthwhile direction. Figuring out how to optimize inconsistent values could be another direction, they are not mutually exclusive.

I think that’s a very dangerous direction. It seems like it would be all too easy for judgments of value ‘closeness’ to be made on the basis of possibility/convenience/etc. (i.e., “how easy it would be to program this into an FAI”), rather than… unbiased evaluation.

Furthermore, it seems to me that if you take any set of values “close to” your own, and then *optimize* for those values, that optimization itself will make these values less and less close to yours. (This would be especially true if there is no practical/meaningful way to optimize your actual values!)

These two things put together, which complement (in a negative and dangerous way) each other, make me very leery of the “research consistent values that are close to ours” approach.

I think it makes sense to worry about value fragility and shoehorning, but it's a cost-benefit thing. The benefits of consistency are large: it lets you prove stuff. And the costs seem small to me, because consistency requires nothing more than having an ordering on possible worlds. For example, if some possible world seems ok to you, you can put it at the top of the ordering. So assuming infinite power, any ok outcome that can be achieved by any other system can be achieved by a consistent system.

And even if you want to abandon consistency and talk about messy human values, OP's point still stands: unbounded utility functions are useless. They allow "St Petersburg inconsistencies" and disallow "bounded inconsistencies", but human values probably have both.

consistency requires nothing more than having an ordering on possible worlds. For example, if some possible world seems ok to you, you can put it at the top of the ordering. So assuming infinite power, any ok outcome that can be achieved by any other system can be achieved by a consistent system

This is an interesting point. I will have to think about it, thanks.

And even if you want to abandon consistency and talk about messy human values, OP’s point still stands: unbounded utility functions are useless.

To be clear, I take no position on this point in particular. My disagreements are as noted in my top-level comment—no more nor less. (You might say that I am questioning various aspects of the OP’s “local validity”. The broader point may stand anyway, or it may not; that is to be evaluated once the disagreements are resolved.)

Preferences cannot be inconsistent, only stated preferences which are not the same thing. So that seems like an impossible choice...

Preferences cannot be inconsistent

This is the first I’m hearing of this. As far as I’m aware, in all previous discussions on LW, inconsistent preferences have been taken to be, not only possible, but actual (and possessed by most humans). On what do you base your claim?

I'm responding here rather than deeper in the thread. It's not the whole response I wanted to do, which probably deserves an entire sequence, but it gets the ball rolling at least.

My whole quote was:

Preferences cannot be inconsistent, only stated preferences which are not the same thing.

Let's break that down a bit. First of all I'm taking "preferences" to be shorthand for "preferences over outcomes," which is to say a qualitative ranking of possible future worlds: E.g. getting no ice cream is bad, me getting ice cream is good, everyone getting ice cream is better, etc.

To quantify this, you can assign numeric values to outcomes in order of increasing preference. Now we have a utility function that scores outcomes. But despite adding numbers we are really just specifying a ranked distribution of outcomes based on preference. U(no ice cream) < U(I have ice cream) < U(everyone has ice cream).

Now what does it mean for preferences to be inconsistent? The simplest example would be the following:

U(no ice cream) < U(I have ice cream)

U(I have ice cream) < U(everyone has ice cream)

U(everyone has ice cream) < U(no ice cream)

If you've followed closely so far, alarm bells should be going off. This supposed ordering is nonsensical. No ice cream is worse than just me having ice cream, which is worse than everyone having ice cream... which is worse than no ice cream at all? We've looped around! There's no way to maximize this ordering in order to find the "best" outcome. All outcomes are "better" than all other outcomes, and at the same time "worse" as well.
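The loop described above can be checked mechanically. The following is a minimal sketch, assuming the three strict preferences are encoded as (worse, better) pairs; the names `prefers`, `maximal_outcomes`, and `has_consistent_ranking` are illustrative and not part of the original comment.

```python
# Hypothetical encoding of the three strict preferences above:
# a pair (worse, better) means "better is strictly preferred to worse".
prefers = {
    ("no ice cream", "I have ice cream"),
    ("I have ice cream", "everyone has ice cream"),
    ("everyone has ice cream", "no ice cream"),
}

outcomes = sorted({o for pair in prefers for o in pair})


def maximal_outcomes(outcomes, prefers):
    """Return the outcomes that nothing else is strictly preferred to."""
    return [o for o in outcomes if not any(worse == o for worse, _ in prefers)]


def has_consistent_ranking(outcomes, prefers):
    """True if the outcomes can be ranked so that every stated preference holds.

    Repeatedly peels off the outcomes that nothing remaining beats;
    getting stuck means the stated preferences contain a cycle.
    """
    remaining = set(outcomes)
    while remaining:
        top = [o for o in remaining
               if not any(w == o and b in remaining for w, b in prefers)]
        if not top:
            return False
        remaining -= set(top)
    return True


best = maximal_outcomes(outcomes, prefers)
consistent = has_consistent_ranking(outcomes, prefers)
print(best, consistent)
```

Under this encoding no outcome is maximal and no ranking exists, which is the sense in which the ordering cannot be maximized. Dropping any one of the three pairs makes a ranking possible again.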

Inconsistent utility functions ("preferences") are *nonsensical*. You can't reason about them. You can't use them to solve problems. And when you look at them in detail it doesn't really make sense as a preference either: how can one outcome be strictly better than another, and also strictly worse at the same time? Make up your mind already!

But of course when we consider a real problem like, say, abortion rights, our preferences *feel* conflicted. I would argue this is just what it feels like on the inside to have a weighted utility function, with some terms positive and some terms negative. If we consider the case of abortion, we may feel bad for the terminated potential human represented by the fetus, but feel good about the woman's control of their body or the prevention of an unhappy childhood stemming from an unwanted pregnancy. We also know that every real abortion circumstance is a complex jumble of a bunch of other factors, far too many to write here or even consider in one sitting in real life. So what we end up feeling is:

(positive term) + (negative term) + (...ugh...)

Most people seek reasons to ignore either the positive or negative terms so they can feel good or righteous about themselves for voting Blue or Green. But I would argue that in an issue which feels unresolved and conflicted, the term which dominates in magnitude is really the ...*ugh...* term. We can feel preferences for and against, but we are unable to weigh the balance because of the gigantic unresolved pool of *...ugh...* we have yet to wade into.

What this feels like from the inside is that your preferences are inconsistent: you feel simultaneous pull towards or against multiple outcomes, without a balance of weight to one side or the other.

And yes, like you mentioned in another comment, what I think you should do is delve deeper and explore that *...ugh...* field. You don't have to uncover literally every piece of evidence, but you do have to keep at it until it is balanced by the revealed positive and negative terms, at which point you will no longer feel so conflicted and your preferences will be clear.

Now inconsistent preferences often show up when considering things like virtue ethics: "Stealing is wrong. We want to be good people, so we should never steal." But what if I need to steal bread to feed my starving family? Providing for my family is also virtuous, is it not? "Inconsistency!" you shout.

**No.** I defy that notion. Nobody is born with a strong moral instinct against such a complex social virtue as a prohibition of theft, which requires a concept of other minds, a notion of property, and parent values of fairness and equality. At best you are perhaps born with a spectrum of hedonic likes and dislikes, a few social instincts, and based on child development trajectories, maybe also some instinctual values relating to fairness and social status. As you grow up you develop and/or have imprinted upon you some culturally transmitted terminal values, as well as the non-linear weights to these terminal values which make up your own personal utility function. Since you have to make choices day-to-day in which you evaluate outcomes, you also develop a set of heuristics for maximizing your utility functions: these are the actual reasons you have for the things you do, and include both terminal and instrumental values as well as functional heuristics. To an even smaller extent you also have a meta-level awareness of these goals and heuristics, which upon request you might translate into words. *These* are your stated preferences.

In circumstances that you are likely to encounter as a rationalist (e.g., no cult reprogramming), seeking reflective equilibrium should cause your instrumental goals and functional heuristics to more closely match your underlying utility function without changing the weights, thereby not altering your actual preferences, even if you do occasionally change your stated preferences as a result.

This comment has already gotten super long, and I've run out of time. There's more I wanted to say, on a more mathematical and AI basis, about how seeking reflective equilibrium *must* always create instrumental values which reflect stable terminal values under reflection. But that would be a post equally long and more math heavy...

You are talking, here, about preferences that are *intransitive*.

The von Neumann–Morgenstern utility theorem specifies four axioms which an agent’s preferences must conform to, in order for said preferences to be formalizable as a utility function. Transitivity of preferences is one of these axioms.

However, the VNM theorem is just a formal mathematical result: it says that *if, and only if*, an agent’s preferences comply with these four axioms, *then* there exists (up to positive affine transformation) a utility function which describes these preferences.
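The phrase "up to positive affine transformation" can be illustrated numerically. This is a small sketch with made-up utility numbers and two made-up gambles (none of them come from the thread): rescaling a utility function as `2*u + 5` leaves the ranking of the gambles by expected utility unchanged, while a transformation that is increasing but not affine (here a square root) can reverse it.

```python
import math


def expected_utility(gamble, u):
    """Expected utility of a gamble given as {outcome: probability}."""
    return sum(p * u[outcome] for outcome, p in gamble.items())


# Hypothetical utilities over three outcomes (illustrative numbers only).
u = {"none": 0.0, "me": 1.0, "all": 3.0}

# Two hypothetical gambles: a sure thing versus a coin flip.
sure_thing = {"me": 1.0}
coin_flip = {"none": 0.5, "all": 0.5}

# A positive affine transformation of u, and an increasing non-affine one.
affine = {o: 2.0 * x + 5.0 for o, x in u.items()}
concave = {o: math.sqrt(x) for o, x in u.items()}


def prefers_coin_flip(utility):
    return expected_utility(coin_flip, utility) > expected_utility(sure_thing, utility)


print(prefers_coin_flip(u), prefers_coin_flip(affine), prefers_coin_flip(concave))
```

With these numbers the coin flip is preferred under both `u` and its affine rescaling, but not under the square-root version, even though all three functions rank the individual outcomes identically.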

The axioms are often described as rules that a “rational agent” must comply with, or as being axioms of “rationality”, etc., but this is a tendentious phrasing—one which is in no way implicit in the theorem (which, again, is *only* a formally proved result in mathematics), nor presupposed by the theorem. Whether compliance with the VNM axioms is normative (or, equivalently, whether it constitutes, or is required by, “rationality”) is thus an open question.

(Note that whether the actual preferences of existing agents (i.e., humans) comply with the VNM axioms is *not* an open question—we know that they do not.)

It may interest you to know that, of the four VNM axioms, transitivity is one which I (like you) find intuitively and obviously normative. I cannot see any good reason to have preferences that are intransitive upon reflection; this would be clearly irrational.

But there are three other axioms: independence, continuity, and completeness. I do not find *any* of those three to be obviously normative. In fact, there are good reasons to reject each of the three. And my *actual* preferences do indeed violate *at least* the independence and continuity axioms.

If you search through my comment history, you will find discussions of this topic dating back many years (the earliest, I think, would have been around 2011; the most recent, only a few months ago). My opinion has not materially shifted, over this period; in other words, my views on this have been stable under reflection.

Thus we have the situation I have been describing: my preferences are “inconsistent” in a certain formal sense (namely, they are not VNM-compliant), and thus *cannot* be represented with a utility function. This property of my preferences is stable under reflection, and furthermore, I endorse it as normative.

*P.S.:* There are certain other things in your comment which I disagree with, but, as far as I can tell, all are immaterial to the central point, so I am ignoring them.

Note that whether the actual preferences of existing agents (i.e., humans) comply with the VNM axioms is *not* an open question—we know that they do not.

I defy the data. Give me a hard example please, or I don't think there's much benefit to continuing this.

Certainly I can do this (in fact, you can find several examples yourself by, as I said, looking through my comment history—but yes, I’m willing to dig them up for you).

But before I do, let me ask: what sorts of examples will satisfy you? After all, suppose I provide an example; you could then say: “ah, but *actually* this is not a VNM axiom violation, because these are not your *real* preferences—if you thought about it rationally, you would conclude that your real preferences should instead be so-and-so” (in a manner similar to what you wrote in your earlier comment). Then suppose I say “nope; I am unconvinced; these are definitely my real preferences and I refuse to budge on this—my preferences are not up for grabs, no matter what reasoning you adduce”. Then what? Would you, in such a case, accept my example as an existence proof of my claim? Or would you continue to defy the data?

Well I don't know how I would react without seeing it, which is why I'm asking. But yes my better-odds expectation is that it will only be apparently inconsistent and we'd either be able to unravel the real underlying terminal values or convincingly show that the ramifications of the resulting inconsistency are not compatible with your preferences. If you think that'd be a waste of your time you're free not to continue with this, with no assumed fault of course.

Well, let’s say this: I will take some time (when I can, sometime within the next few days) to find some of the comments in question, but if it turns out that you do think that none of the claimed examples are sufficient, then I make no promises about engaging with the proposed “unraveling of real underlying terminal values” or what have you—*that* part I do think is unlikely to be productive (simply because there is usually not much to say in response to “no, these really are my preferences, despite any of these so-called ‘contradictions’, ‘incompatibilities’, ‘inconsistencies’, etc.”—in other words, preferences are, generally, prior to everything else^{[1]}).

In the meantime, however, you might consider (for your own interest, if nothing else) looking into the existing (and quite considerable) literature on VNM axiom violations in the actual preferences of real-world humans. (The Wikipedia page on the VNM theorem should be a good place to start chasing links and citations for this.)

[1] This, of course, avoids the issue of higher-order preferences, which I acknowledge is an important complicating factor, but which I think ought to be dealt with as a special case, and with full awareness of what exactly is being dealt with. (Robin Hanson’s curve-fitting approach is the best framework I’ve seen for thinking about this sort of thing.)

You are proving that *if* preferences are well-defined, they also need to be consistent.

What does it feel like from the inside to have badly defined preferences? Presumably it feels like sometimes being unable to make decisions, which you report is the case.

You can't prove that preferences are consistent without first proving they are well defined.

The search term you want is “reflective equilibrium,” which reduces our beliefs about our preferences to a fixed point under the transform of incrementally resolving inconsistencies. The output is necessarily a single consistent set of preferences in practice. (In theory there could be a cyclic group representing an unresolvable inconsistency, but there are reasonable priors that can be chosen to avoid this.)

Yes, I’m familiar with the term. But what of it? Just because I can run some procedure (whether once or iteratively) on my preferences and get some output, which may be some altered set of preferences… doesn’t mean… well, anything. Yes, I can do this, if I were so inclined. (Or, alternatively, I could also… *not* do this.) What of it? How does that mean that I currently don’t have the preferences that I do, in fact, have? How does it prevent my current, actual preferences from being inconsistent?

Seeking reflective equilibrium isn’t meant to change your preferences. It is meant to refine or alter cached thoughts, which play a role in the production of stated preferences.

E.g. if I ask ‘is it okay to kill someone?’ and you say “no, never” with conviction. Then I follow with ‘what about self-defense?’ and you reply “ok, in self defense or the defense of others, but only if there is no other option.” Did your preferences change?

What I’m arguing is that you didn’t change your preferences, but rather updated your stated preferences based on a cache flush I initiated with my line of questioning.

I see. But then, whence this claim:

The output is necessarily a single consistent set of preferences in practice.

This doesn’t seem necessary at all, to me. Why do you say it is?

In fact, *in practice*—to take an immediate example—the output of reflection in *my* case has been to demonstrate that my preferences do not conform to the VNM axioms (and therefore cannot be represented with a utility function). Indeed this reflection process did not change my preferences, as you say. And yet the output was not ‘consistent’ in the way we’re discussing!

Would you say that I just haven’t reflected *enough*, and that *further* reflection would reveal that *actually*, my *real* preferences are, and have always been, ‘consistent’ in this way? But how would we verify this claim? (How much more reflection is ‘enough’?) Or how else would you resolve this apparent falsification of your claim?

I would like to give you a longer response, but I’m on the go and about to enter a long week of work meetings. Remind me if you don’t get a longer reply (and you still care).

I think it would help though to clarify: what do you mean by “feels inconsistent”? I hope it is okay to ask you a short question about the meaning of a common word :) It would help to have an example.

Er, sorry, but I didn’t use the phrase “feels inconsistent” (nor any other construction involving “feel”)… what are you referring to?

Sorry that was sloppy of me:

And yet the output was not ‘consistent’ in the way we’re discussing!

Oh—I wasn’t saying anything new there; I was just referring back to the first sentence of that paragraph:

In fact, *in practice*—to take an immediate example—the output of reflection in *my* case has been to demonstrate that my preferences do not conform to the VNM axioms (and therefore cannot be represented with a utility function).

Devil’s advocacy:

One answer to the above might be “we have a meta-preference to have a consistent morality”.

Well, fair enough, if so. However, if that is the *only* answer—if this is our *only* reason for preferring the consistent-but-inaccurate moral system to the accurate-but-inconsistent one—then we ought to get clear on this fact, *first*. Having our choice in such a dilemma be driven *only* by a meta-preference, and not by any other considerations, is a special case, and must be unambiguously identified before we attempt to resolve the issue.

We have to make choices, and it is not possible to make choices to maximize outcomes if we don’t have a consistent utility function. Since having a consistent utility function is a hard requirement of simply being an agent and having any effect on the world, I think it’s a reasonable requirement to have.

People say they have inconsistent utility functions. But then they go and make real life decisions anyway, so their actions imply a consistent utility function. Actions speak louder than words...

having a consistent utility function is a hard requirement of simply being an agent and having any effect on the world

I don’t know what you mean by this; it seems plainly false. I have effects on the world all the time (as do most people), and I don’t, as far as I can tell, have a consistent utility function (nor do most people).

People say they have inconsistent utility functions. But then they go and make real life decisions anyway, so their actions imply a consistent utility function.

But just the fact of making decisions doesn’t imply a utility function. What can you mean by this…?

But, to put it simply, if your ethical assumptions contradict the mathematics, it's not the mathematics that's wrong.

The mathematics includes axioms, and axioms certainly can be wrong. That is, they can be false of the things in the real world that they were invented in order to describe. As Einstein said, "As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality."

I haven't studied the proof of boundedness in detail, but it seems to be that unbounded utilities allow St. Petersburg-type combinations of them with infinite utilities, but since each thing is supposed to have finite utility, that is a contradiction. Or if infinite utilities are not immediately a problem, then by a more complicated argument, involving constructing multiple St. Petersburg-type combinations and demonstrating that the axioms imply that there both should and should not be a preference between them.

I believe that the first of those arguments is what Fishburn is alluding to in his paper "Bounded Expected Utility" (paywalled, also sci-hubbed) when he says that it is "easily seen to be bounded" (1st paragraph of section 4, p.1055). (Fishburn's book is rather too dense to speed-read all the way to his boundedness theorems.) He does not give details, but the argument that I conjecture from his text is that if there are unbounded utilities then one can construct a convex combination of infinitely many of them that has infinite utility (and indeed one can), contradicting the proof from his axioms that the utility function is a total function to the real numbers.

But by a similar argument, one might establish that the real numbers must be bounded, when instead one actually concludes that not all series converge and that one cannot meaningfully compare the magnitudes of divergent infinite series. Inf–Inf = NaN, as IEEE 754 puts it. All it takes is sufficient art in constructing the axioms to make them seem individually plausible while concealing the contradiction that will be sprung.

Individually plausible axioms do not necessarily have a plausible union.
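The IEEE 754 behaviour referred to above is easy to confirm. A minimal check using ordinary Python floats, which are IEEE 754 doubles on essentially all current platforms (an assumption about the runtime, not something stated in the thread):

```python
import math

inf = float("inf")

# Subtracting infinity from infinity has no meaningful value: the result is NaN.
difference = inf - inf
print(math.isnan(difference))

# NaN compares unequal to everything, including itself, so the "difference"
# cannot be used to say which of two infinite quantities is larger.
print(difference == difference, difference > 0, difference < 0)
```

So arithmetic on the limits alone gives no way to rank two divergent sums; any comparison has to be made on something finer, such as growth rates, and may simply be undefined.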

I note that in order to construct convex combinations of infinitely many states, Fishburn extends his axiom 0 to allow this. He does not label this extension separately as e.g. "Axiom 0*". So if you were to ask which of his axioms to reject in order to retain unbounded utility, it could be none of those labelled as such, but the one that he does not name, at the end of the first paragraph on p.1055. Notice that the real numbers satisfy Axiom 0 but not Axiom 0*. It is that requirement that all infinite convex combinations exist that surfaces later as the boundedness of the range of the utility function.

While searching out the original sources, I found a paper indicating that at least in 1993, bounded utility theorems were seen as indicating a problem with Savage's axioms: "Unbounded utility for Savage's "Foundations of Statistics" and Other Models", by Peter Wakker. There is another such paper from 2014. I haven't read them, but they indicate that proofs of boundedness of utility are seen as problems for the axioms, not discoveries that utility must be bounded.

I think you've misunderstood a fair bit. I hope you don't mind if I address this slightly out of order.

Or if infinite utilities are not immediately a problem, then by a more complicated argument, involving constructing multiple St. Petersburg-type combinations and demonstrating that the axioms imply that there both should and should not be a preference between them.

This is exactly what Fishburn does, as I mentioned above. (Well, OK, I didn't attribute it to Fishburn, I kind of implicitly misattributed it to Savage, but it was actually Fishburn; I didn't think that was worth going into.)

I haven't studied the proof of boundedness in detail, but it seems to be that unbounded utilities allow St. Petersburg-type combinations of them with infinite utilities, but since each thing is supposed to have finite utility, that is a contradiction.

He does not give details, but the argument that I conjecture from his text is that if there are unbounded utilities then one can construct a convex combination of infinitely many of them that has infinite utility (and indeed one can), contradicting the proof from his axioms that the utility function is a total function to the real numbers.

What you describe in these two parts I'm quoting is, well, not how decision-theoretic utility functions work. A decision-theoretic utility function is a function on outcomes, not on gambles over outcomes. You take expected utility of a gamble; you don't take utility of a gamble.

So, yes, if you have an unbounded decision-theoretic utility function, you can set up a St. Petersburg-style situation that will have infinite expected utility. But that is not by itself a problem! The gamble has infinite *expected* utility; no individual outcome has infinite utility. There's no contradiction yet.
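To make the distinction concrete, here is a small sketch of a St. Petersburg-style gamble. It assumes a hypothetical unbounded utility function in which outcome n has utility 2^n and occurs with probability 2^-n; these particular numbers are the standard textbook choice, not something specified in the comment.

```python
from fractions import Fraction


def utility(n):
    """Hypothetical unbounded utility: outcome n is worth 2**n (always finite)."""
    return 2 ** n


def probability(n):
    """Outcome n occurs with probability 2**-n; over n = 1, 2, ... these sum to 1."""
    return Fraction(1, 2 ** n)


def partial_expected_utility(terms):
    """Expected-utility contribution of the first `terms` outcomes."""
    return sum(probability(n) * utility(n) for n in range(1, terms + 1))


for terms in (10, 100, 1000):
    print(terms, partial_expected_utility(terms))
```

Each outcome contributes exactly 1 to the sum, so the partial sums grow without bound even though every individual outcome has a finite utility. The infinity belongs to the expectation of the gamble, not to any outcome.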

Of course, you then do get a contradiction when you attempt to compare two of these that have been appropriately set up, but...

But by a similar argument, one might establish that the real numbers must be bounded, when instead one actually concludes that not all series converge

What? I don't know what one might plausibly assume that might imply the boundedness of the real numbers.

...oh, I think I see the analogy you're going for here. But, it seems to rest on the misunderstanding of utility functions discussed above.

and that one cannot meaningfully compare the magnitudes of divergent infinite series.

Well, so, one must remember the goal here. So, let's start with divergent series, per your analogy. (I'm assuming you're discussing series of nonnegative numbers here, that diverge to infinity.)

So, well, there's any number of ways we could compare divergent series. We could just say that they sum to infinity, and so are equal in magnitude. Or we could try to do a more detailed comparison of their growth rates. That might not always yield a well-defined result though. So yeah. There's not any one universal way to compare magnitudes of divergent series, as you say; if someone asks, which of these two series is bigger, you might just have to say, that's a meaningless question. All this is as you say.

But that's not at all the situation we find ourselves in choosing between two gambles! If you reason backward, from the idea of utility functions, it might seem reasonable to say, oh, these two gambles are both divergent, so comparison is meaningless. But if you reason forward, from the idea of preferences... well, you have to pick one (or be indifferent). You can't just leave it undefined. Or if you have some formalism where preferences can be undefined (in a way that is distinct from indifference), by all means explain it... (but what happens when you program these preferences into an FAI and it encounters this situation? It has to pick. Does it pick arbitrarily? How is that distinct from indifference?)

That we have preferences between gambles is the whole thing we're starting from.

I note that in order to construct convex combinations of infinitely many states, Fishburn extends his axiom 0 to allow this. He does not label this extension separately as e.g. "Axiom 0*". So if you were to ask which of his axioms to reject in order to retain unbounded utility, it could be none of those labelled as such, but the one that he does not name, at the end of the first paragraph on p.1055. Notice that the real numbers satisfy Axiom 0 but not Axiom 0*. It is that requirement that all infinite convex combinations exist that surfaces later as the boundedness of the range of the utility function.

Sorry, but looking through Fishburn's paper I can't see anything like this. The only place where any sort of infinite combination seems to be mentioned is section 9, which is not relevant. Axiom 0 means one thing throughout and allows only finite convex combinations. I simply don't see where you're getting this at all.

(Would you mind sticking to Savage's formalism for simplicity? I can take the time to properly read Fishburn if for some reason you *insist* things have to be done this way, but otherwise for now I'm just going to put things in Savage's terms.)

In any case, in Savage's formalism there's no trouble in proving that the necessary actions exist -- you don't have to go taking convex combinations of anything, you simply directly construct the functions. You just need an appropriate partition of the set of world-states (provided by the Archimedean axiom he assumes, P6) and an appropriate set of outcomes (which comes from the assumption of unbounded utility). You don't have to go constructing other things and then doing some fancy infinite convex combination of them.

If you don't mind, I'd like to ask: could you just tell me *what in particular* in Savage's setup or axioms you find to be the probable weak point? If it's P7 you object to, well, I already discussed that in the post; if you get rid of that, the utility function may be unbounded but it's no longer guaranteed to give correct results when comparing infinite gambles.

While searching out the original sources, I found a paper indicating that at least in 1993, bounded utility theorems were seen as indicating a problem with Savage's axioms: "Unbounded utility for Savage's "Foundations of Statistics" and Other Models", by Peter Wakker. There is another such paper from 2014. I haven't read them, but they indicate that proofs of boundedness of utility are seen as problems for the axioms, not discoveries that utility must be bounded.

I realize a number of people see this as a problem. Evidently they have some intuition or argument that disagrees with the boundedness of utility. Whatever this intuition or argument is, I would be very surprised if it were as strong as the argument that utility must be bounded. There's no question that assumptions *can* be bad. I just think the reasons to think these *are* bad that have been offered, are seriously flimsy compared to the reasons to think that they're good. So I see this as basically a sort of refusal to take the math seriously. (Again: Which axiom should we throw out, or what part of the setup should we rework?)

Or if you have some formalism where preferences can be undefined (in a way that is distinct from indifference), by all means explain it... (but what happens when you program these preferences into an FAI and it encounters this situation? It has to pick. Does it pick arbitrarily? How is that distinct from indifference?)

A short answer to this (something longer later) is that an agent need not have preferences between things that it is impossible to encounter. The standard dissolution of the St. Petersburg paradox is that nobody can offer that gamble. Even though each possible outcome is finite, the offerer must be able to cover every possible outcome, requiring that they have infinite resources.

Since the gamble cannot be offered, no preferences between that gamble and any other need exist. If your axioms require both that preference must be total and that St. Petersburg gambles exist, I would say that that is a flaw in the axioms. Fishburn (*op. cit.*, following Blackwell and Girshick, an inaccessible source) requires that the set of gambles be closed under infinitary convex combinations. I shall take a look at Savage's axioms and see what in them is responsible for the same thing.

Looking at the argument from the other end, at what point in valuing numbers of intelligent lives does one approach an asymptote, bearing in mind the possibility of expansion to the accessible universe? What if we discover that the habitable universe is vastly larger than we currently believe? How would one discover the limits, if there are any, to one's valuing?

Fishburn (op. cit., following Blackwell and Girshick, an inaccessible source) requires that the set of gambles be closed under infinitary convex combinations.

Again, I'm simply not seeing this in the paper you linked? As I said above, I simply do not see anything like that outside of section 9, which is irrelevant. Can you point to where you're seeing this condition?

I shall take a look at Savage's axioms and see what in them is responsible for the same thing.

In the case of Savage, it's not any particular axiom, but rather the setup. An action is a function from world-states to outcomes. If you can construct the function, the action (gamble) exists. That's all there is to it. And the relevant functions are easy enough to construct, as I described above; you use P6 (the Archimedean condition, which also allows flipping coins, basically) to construct the events, and we have the outcomes by assumption. You assign the one to the other and there you go.

(If you don't want to go getting the book out, you may want to read the summary of Savage I wrote earlier!)

A short answer to this (something longer later) is that an agent need not have preferences between things that it is impossible to encounter. The standard dissolution of the St. Petersburg paradox is that nobody can offer that gamble. Even though each possible outcome is finite, the offerer must be able to cover every possible outcome, requiring that they have infinite resources. Since the gamble cannot be offered, no preferences between that gamble and any other need exist.

So, would it be fair to sum this up as "it is not necessary to have preferences between two gambles if one of them takes on unbounded utility values"? Interesting. That doesn't strike me as wholly unworkable, but I'm skeptical. In particular:

- Can we phrase this without reference to utility functions? It would say a lot more for the possibility if we can.
- What if you're playing against Nature? A gamble can be any action; and in a world of unbounded utility functions, why should one believe that any action must have some bound on how much utility it can get you? Sure, sure, second law of thermodynamics and all that, but that's just a feature of the particular universe we happen to live in, not something that reshapes your preferences. (And if we were taking account of that sort of thing, we'd probably just say, oh, utility is bounded after all, in a kind of stupid way.) Notionally, it could be discovered to be wrong! It won't happen, but it's not probability literally 0.

Or are you trying to cut out a more limited class of gambles as impossible? I'm not clear on this, although I'm not certain it affects the results.

Anyway, yeah, as I said, my main objection is that I see no reason to believe that, if you have an unbounded utility function, Nature cannot offer you a St. Petersburg game. Or I mean, to the extent I do see reasons to believe that, they're facts about the particular universe we happen to live in, that notionally could be discovered to be wrong.

Looking at the argument from the other end, at what point in valuing numbers of intelligent lives does one approach an asymptote, bearing in mind the possibility of expansion to the accessible universe? What if we discover that the habitable universe is vastly larger than we currently believe? How would one discover the limits, if there are any, to one's valuing?

This is exactly the sort of argument that I called "flimsy" above. My answer to these questions is that none of this is relevant.

Both of us are trying to extend our ideas about preferences from ordinary situations to extraordinary ones. (Like, I agree that some sort of total utilitarianism is a good heuristic for value under the conditions we're familiar with.) This sort of extrapolation, to an unfamiliar realm, is always potentially dangerous. The question then becomes, what sort of tools can we expect to continue to work, without needing any sort of adjustment to the new conditions?

I do not expect speculation about the particular form our preferences would take under these unusual conditions to be trustworthy. Whereas basic coherence conditions had damn well better continue to hold, or else we're barely even talking about sensible preferences anymore.

Or, to put it differently, my answer is, *I don't know, but the answer must satisfy basic coherence conditions*. There's simply no way that the idea that decision-theoretic utility has to increase linearly with the number of intelligent lives is on anywhere near as solid ground as that! The mere fact that it's stated in terms of a utility function in the first place, rather than in terms of something more basic, is something of a smell. Complicated statements we're not even entirely sure how to formulate can easily break in a new context. Short simple statements that *have* to be true for reasons of simple coherence don't break.

(Also, some of your questions don't seem to actually appreciate what a bounded utility function would mean. It wouldn't mean taking an unbounded utility function and then applying a cap to it. It would just mean something that naturally approaches 1 as things get better and 0 as things get worse. There is no *point* at which it approaches an asymptote; that's not how asymptotes work. There *is* no limit to one's valuing; presumably utility 1 does not actually occur. Or, at least, that's how I infer it would have to work.)
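To make that shape concrete, here is a minimal sketch of one function that behaves this way. The formula n/(n+1) and the name `bounded_utility` are purely illustrative assumptions, not anything taken from Savage, Fishburn, or the post; the only point is that such a function keeps strictly increasing and never actually attains 1.

```python
from fractions import Fraction

def bounded_utility(n: int) -> Fraction:
    """Hypothetical bounded utility of n good things (illustrative only).

    Strictly increasing in n, always below 1, and never equal to 1.
    Exact rational arithmetic avoids floating-point rounding up to 1.0.
    """
    return Fraction(n, n + 1)

# Each step up in n is still an improvement, however large n already is.
samples = [0, 1, 10, 10**6, 10**30]
values = [bounded_utility(n) for n in samples]
```

Nothing here is a cap applied to an unbounded function: there is no n at which the values stop growing, they just grow by ever smaller amounts.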

Again, I'm simply not seeing this in the paper you linked? As I said above, I simply do not see anything like that outside of section 9, which is irrelevant. Can you point to where you're seeing this condition?

In Fishburn's "Bounded Expected Utility", page 1055, end of first paragraph (as cited previously):

However, we shall for the present take P = P_d (for any σ-algebra that contains each {x}) since this is the Blackwell-Girshick setting. Not only is P_d an abstract convex set, but also if P_i ∈ P_d and α_i ≥ 0 for i = 1, 2, ... and Σ α_i = 1, then Σ α_i P_i ∈ P_d.

That depends on some earlier definitions, e.g. P_d is a certain set of probability distributions (the “d” stands for “discrete”) defined with reference to some particular σ-algebra, but the important part is that last infinite sum: this is where all infinitary convex combinations are asserted to exist. Whether that is assigned to "background setup" or "axioms" does not matter. It has to be present, to allow the construction of St. Petersburg gambles.

Will address the rest of your comments later.

(This is more properly a followup to my sibling comment, but posting it here so you'll see it.)

I already said that I think that thinking in terms of infinitary convex combinations, as you're doing, is the wrong way to go about it; but it took me a bit to put together why that's definitely the wrong way.

Specifically, it assumes probability! Fishburn, in the paper you link, assumes probability, which is why he's able to talk about why infinitary convex combinations are or are not allowed (I mean, that and the fact that he's not necessarily considering arbitrary actions).

Savage doesn't assume probability! So if you want to disallow certain actions... how do you specify them? Or if you want to talk about convex combinations of actions -- not just infinitary ones, *any* ones -- how do you even define these?

In Savage's framework, you have to *prove* that if two actions can be described by the same probabilities and outcomes, then they're equivalent. E.g., suppose action A results in outcome X with probability 1/2 and outcome Y with probability 1/2, and suppose action B meets that same description. Are A and B equivalent? Well, yes, but that requires proof, because maybe A and B take outcome X on *different* sets of probability 1/2. (OK, in the two-outcome case it doesn't really require "proof", rather it's basically just his definition of probability; but the more general case requires proof.)

So, until you've established that theorem, that it's meaningful to combine gambles like that, and that the particular events yielding the probabilities aren't relevant, one can't really meaningfully define convex combinations at all. This makes it pretty hard to incorporate them into the setup or axioms!

More generally this should apply not only to Savage's particular formalism, but any formalism that attempts to ground probability as well as utility.

Anyway yeah. As I think I already said, I think we should think of this in terms not of, what combinations of actions yield permitted actions, but rather whether there should be forbidden actions at all. (Note btw in the usual VNM setup there aren't any forbidden actions either! Although there infinite gambles are, while not forbidden, just kind of ignored.) But this is in particular why trying to put it in terms of convex combinations as you've done doesn't really work from a fundamentals point of view, where there is no probability yet, only preferences.

I already said that I think that thinking in terms of infinitary convex combinations, as you're doing, is the wrong way to go about it; but it took me a bit to put together why that's definitely the wrong way.

Specifically, it assumes probability! Fishburn, in the paper you link, assumes probability, which is why he's able to talk about why infinitary convex combinations are or are not allowed (I mean, that and the fact that he's not necessarily considering arbitrary actions).

Savage doesn't assume probability!

Savage doesn't assume probability or utility, but their construction is a mathematical consequence of the axioms. So although they come later in the exposition, they mathematically exist as soon as the axioms have been stated.

So if you want to disallow certain actions... how do you specify them?

I am still thinking about that, and may be some time.

As a general outline of the situation, you read P1-7 => bounded utility as modus ponens: you accept the axioms and therefore accept the conclusion. I read it as modus tollens: the conclusion seems wrong, so I believe there is a flaw in the axioms. In the same way, the axioms of Euclidean geometry seemed very plausible as a description of the physical space we find ourselves in, but conflicts emerged with phenomena of electromagnetism and gravity, and eventually they were superseded as descriptions of physical space by the geometry of differential manifolds.

It isn't possible to answer the question "which of P1-7 would I reject?" What is needed to block the proof of bounded utility is a new set of axioms, which will no doubt imply large parts of P1-7, but might not imply the whole of any one of them. If and when such a set of axioms can be found, P1-7 can be re-examined in their light.

Oh, so that's what you're referring to. Well, if you look at the theorem statements, you'll see that P=P_d is an axiom that is explicitly called out in the theorems where it's assumed; it's *not* implicitly part of Axiom 0 like you asserted, nor is it more generally left implicit at all.

but the important part is that last infinite sum: this is where all infinitary convex combinations are asserted to exist. Whether that is assigned to "background setup" or "axioms" does not matter. It has to be present, to allow the construction of St. Petersburg gambles.

I really think that thinking in terms of infinitary convex combinations is the wrong way to go about this here. As I said above: You don't get a St. Petersburg gamble by taking some fancy convex combination, you do it by just constructing the function. (Or, in Fishburn's framework, you do it by just constructing the distribution; same effect.) I guess without P=P_d you do end up relying on closure properties in Fishburn's framework, but Savage's framework just doesn't work that way at all; and Fishburn with P=P_d, well, that's not a closure property. Rather what Savage's setup, and P=P_d have in common, is that they're, like, arbitrary-construction properties: If you can make a thing, you can compare it.

A further short answer. In Savage's formulation, from P1-P6 he derives Theorem 4 of section 2 of chapter 5 of his book, which is linear interpolation in any interval. Clearly, linear interpolation does not work on an interval such as [17,Inf], therefore there cannot be any infinitely valuable gambles. St. Petersburg-type gambles are therefore excluded from his formulation.

Savage does not actually prove bounded utility. Fishburn did this later, as Savage footnotes in the edition I'm looking at, so Fishburn must be tackled. Theorem 14.5 of Fishburn's book derives bounded utility from Savage's P1-P7. His proof seems to construct a St. Petersburg gamble from the supposition of unbounded utility, deriving a contradiction. I shall have to examine further how his construction works, to discern what in Savage's axioms allows the construction, when P1-P6 have already excluded infinitely valuable gambles.

Savage does not actually prove bounded utility. Fishburn did this later, as Savage footnotes in the edition I'm looking at, so Fishburn must be tackled.

Yes, it was actually Fishburn that did that. Apologies if I carelessly implied it was Savage.

IIRC, Fishburn's proof, formulated in Savage's terms, is in Savage's book, at least if you have the second edition. Which I think you must, because otherwise that footnote wouldn't be there at all. But maybe I'm misremembering? I think it has to be though...

In Savage's formulation, from P1-P6 he derives Theorem 4 of section 2 of chapter 5 of his book, which is linear interpolation in any interval.

I don't have the book in front of me, but I don't recall any discussion of anything that could be called linear interpolation, other than the conclusion that expected utility works for finite gambles. Could you explain what you mean? I also don't see the relevance of intervals here? Having read (and written a summary of) that part of the book I simply don't know what you're talking about.

Clearly, linear interpolation does not work on an interval such as [17,Inf], therefore there cannot be any infinitely valuable gambles. St. Petersburg-type gambles are therefore excluded from his formulation.

I still don't know what you're talking about here, but I'm familiar enough with Savage's formalism to say that you seem to have gotten quite lost somewhere, because this all sounds like nonsense.

From what you're saying, the impression that I'm getting is that you're treating Savage's formalism like Fishburn's, where there's some a priori set of actions under consideration, and so we need to know closure properties about that set. But, that's not how Savage's formalism works. Rather the way it works is that actions are just functions (possibly with a measurability condition -- he doesn't discuss this but you probably want it) from world-states to outcomes. If you can construct the action as a function, there's no way to exclude it.

I shall have to examine further how his construction works, to discern what in Savage's axioms allows the construction, when P1-P6 have already excluded infinitely valuable gambles.

Well, I've already described the construction above, but I'll describe it again. Once again though, you're simply wrong about that last part; that last statement is not only incorrect, but fundamentally incompatible with Savage's whole approach.

Anyway. To restate the construction of how to make a St. Petersburg gamble. (This time with a little more detail.) An action is simply a function from world-states to outcomes.

By assumption, we have a sequence of outcomes a_i such that U(a_i) >= 2^i and such that U(a_i) is strictly increasing.

We can use P6 (which allows us to "flip coins", so to speak) to construct events E_i (sets of world-states) with probability 1/2^i.

Then, the action G that takes on the value a_i on the set E_i is a St. Petersburg gamble.

For the particular construction, you take G as above, and also G', which is the same except that G' takes the value a_1 on E_0, instead of the value a_0.

Savage proves in the book (although I think the proof is due to Fishburn? I'm going by memory) that given two gambles, both of which are preferred to any essentially bounded gamble, the agent must be indifferent between them. (The proof uses P7, obviously -- the same thing that proves that expected utility works for infinite gambles at all. I don't recall the actual proof offhand and don't feel like trying to reconstruct it right now, but anyway I think you have it in front of you from the sounds of it.) And we can show both these gambles are preferred to any essentially bounded gamble by comparing to truncated versions of themselves (using sure-thing principle) and using the fact that expected utility works for essentially bounded gambles. Thus the agent must be indifferent between G and G'. But also, by the sure-thing principle (P2 and P3), the agent must prefer G' to G. That's the contradiction.
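As a rough illustration of what "just construct the function" means, here is a toy sketch. It rests on several assumptions of mine that are not Savage's notation: world-states are modeled as finite tuples of fair coin flips (standing in for the events P6 provides), events are indexed from 0 so that E_i has probability 1/2^(i+1), and outcomes are plain string labels. The names `G`, `G_prime`, and `event_probability` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def first_heads_index(state):
    """Index of the first 'H' in a tuple of flips, or None if all are tails."""
    for i, flip in enumerate(state):
        if flip == "H":
            return i
    return None

def G(state):
    """Action G: a function from world-states to outcomes, a_i on event E_i."""
    i = first_heads_index(state)
    return None if i is None else f"a_{i}"

def G_prime(state):
    """Same as G, except that it yields a_1 (not a_0) on the event E_0."""
    i = first_heads_index(state)
    if i is None:
        return None
    return "a_1" if i == 0 else f"a_{i}"

def event_probability(i, n_flips):
    """Probability of E_i over equally weighted states of n_flips fair flips."""
    states = list(product("HT", repeat=n_flips))
    hits = sum(1 for s in states if first_heads_index(s) == i)
    return Fraction(hits, len(states))
```

The two actions differ only on E_0, where `G_prime` gives the strictly better outcome; that is the piece the sure-thing principle picks up on. The finite number of flips is only a limitation of the toy model, not of the construction itself.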

Edit: Earlier version of this comment misstated how the proof goes

Oops, turns out I *did* misremember -- Savage does not in fact put the proof in his book. You have to go to Fishburn's book.

I've been reviewing all this recently and yeah -- for anyone else who wants to get into this, I'd recommend getting Fishburn's book ("Utility Theory for Decision Making") in addition to Savage's "Foundations of Statistics". Because in addition to the above, what I'd also forgotten is that *Savage leaves out a bunch of the proofs*. It's really annoying. Thankfully in Fishburn's treatment he went and actually elaborated all the proofs that Savage thought it OK to skip over...

(Also, stating the obvious, but get the second edition of "Foundations of Statistics", as it fixes some mistakes. You probably don't want *just* Fishburn's book, it's fairly hard to read by itself.)

Just to sketch out the contradiction between unbounded utilities and gambles involving infinitely many outcomes a bit more explicitly.

If your utility function is unbounded, we can consider the following wager: You win 2 utils with probability 1/2, 4 utils with probability 1/4, and so on. The expected utility of this wager is infinite. (If there are no outcomes with utility exactly 2, 4, etc., we can award more - this is possible because utility is unbounded).

Now consider these wagers on a (fair) coinflip:

- A: Play the above game if heads, pay out 0 utils if tails
- B: Play the above game if heads, pay out 100000 utils if tails

(0 and 100000 can be any two non-equal numbers).

Both of these wagers have infinite expected utility, so we must be indifferent between them. But since they agree on heads, and B is strictly preferred to A on tails, we must prefer B (since tails occurs with positive probability).
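The arithmetic can be checked on truncated versions of the two wagers. This is only a sketch: cutting the game off after a fixed number of rounds is my assumption for the sake of computation, and the function names are hypothetical.

```python
from fractions import Fraction

def truncated_game_eu(n_rounds):
    """Expected utility of the doubling game cut off after n_rounds terms.

    Term i pays 2**i utils with probability 1/2**i, so each term adds 1.
    """
    return sum(Fraction(1, 2**i) * 2**i for i in range(1, n_rounds + 1))

def truncated_wager_eu(n_rounds, tails_payout):
    """Fair coin: heads plays the truncated game, tails pays tails_payout."""
    half = Fraction(1, 2)
    return half * truncated_game_eu(n_rounds) + half * tails_payout

# Wager A pays 0 on tails, wager B pays 100000 on tails.
gap_at_10 = truncated_wager_eu(10, 100000) - truncated_wager_eu(10, 0)
gap_at_40 = truncated_wager_eu(40, 100000) - truncated_wager_eu(40, 0)
```

At every truncation B is ahead of A by the same fixed amount, while both totals grow without bound as more rounds are included. That is the tension: the expected-utility comparison says "both infinite", the term-by-term comparison says "B is better".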

Yes, thanks, I didn't bother including it in the body of the post but that's basically how it goes. Worth noting that this:

Both of these wagers have infinite expected utility, so we must be indifferent between them.

...is kind of shortcutting a bit (at least as Savage/Fishburn[0] does it; he proves indifference between things of infinite expected utility separately after proving that expected utility works when it's finite), but that is the essence of it, yes.

(As for the actual argument... eh, I don't have it in front of me and don't feel like rederiving it...)

[0]I initially wrote Savage here, but I think this part is actually due to Fishburn. Don't have the book in front of me right now though.

Is there a reason we can't just solve this by proposing arbitrarily large bounds on utility instead of *infinite* bounds? For instance, if we posit that utility is bounded by some arbitrarily high value *X*, then the wager can only pay out values *X* for probabilities below 1/*X*:

sum(from *i*=1 to *i*=log2(*X*)) (1/2^*i*)(2^*i*) + sum(from *i*=log2(*X*)+1 to infinity) (1/2^*i*)(*X*)

The above, for any arbitrarily large *X*, is clearly finite (the former term is a bounded summation and the latter term is a convergent geometric series). So, we can believe that wager B is better for any arbitrarily large bound on our utility function.
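A quick numerical sketch of that claim, under assumptions of mine: the cap is taken to be a power of two, X = 2^k, and the capped game pays min(2^i, X) with probability 1/2^i. The function name is hypothetical.

```python
from fractions import Fraction

def capped_game_eu(k):
    """Expected utility of the doubling game with payouts capped at X = 2**k.

    Rounds 1..k pay their full 2**i; every later round pays only the cap.
    """
    cap = 2**k
    # Bounded summation: each of the first k rounds contributes exactly 1.
    uncapped_part = sum(Fraction(1, 2**i) * 2**i for i in range(1, k + 1))
    # Geometric tail: sum over i > k of cap / 2**i equals cap / 2**k.
    capped_tail = Fraction(cap, 2**k)
    return uncapped_part + capped_tail
```

Under these assumptions the total is log2(X) + 1, finite for every cap but growing without limit as the cap is raised.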

This might seem unsatisfactory but for problems like

Eliezer Yudkowsky has argued against this (I can't find the particular comment at the moment, sorry) basically on the idea of that total utilitarianism in a universe that can contain arbitrarily many people requires unbounded utility functions.

it seems easier to just reject the claim that our universe can contain *infinite* people and instead just go with the assumption that it can contain *X* people, where

Is there a reason we can't just solve this by proposing arbitrarily large bounds on utility instead of infinite bounds? For instance, if we posit that utility is bounded by some arbitrarily high value X, then the wager can only payout values X for probabilities below 1/X.

I'm not sure what you're asking here. An individual decision-theoretic utility function can be bounded or it can be unbounded. Since decision-theoretic utility functions can be rescaled arbitrarily, naming a precise value for the bounds is meaningless; so like we could just assume the bounds are 0 below and 1 above.
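For anyone who wants the rescaling point spelled out, here is a small sketch. The utility function `u`, the outcome set, and the two gambles are made up for illustration; the only fact being shown is that a positive affine rescaling leaves every expected-utility comparison unchanged.

```python
from fractions import Fraction

def expected_utility(gamble, u):
    """Gamble is a list of (probability, outcome) pairs with finite support."""
    return sum(p * u(x) for p, x in gamble)

def rescale(u, lo, hi):
    """Positive affine transform of u sending the value lo to 0 and hi to 1."""
    return lambda x: Fraction(u(x) - lo, hi - lo)

# Hypothetical utility on outcomes 0..10, with range [7, 37].
def u(x):
    return 3 * x + 7

v = rescale(u, lo=7, hi=37)  # same preferences, range now [0, 1]

half = Fraction(1, 2)
g1 = [(half, 0), (half, 10)]   # coin flip between worst and best outcome
g2 = [(Fraction(1), 4)]        # outcome 4 for sure

prefers_g1_under_u = expected_utility(g1, u) > expected_utility(g2, u)
prefers_g1_under_v = expected_utility(g1, v) > expected_utility(g2, v)
```

Both versions rank the gambles the same way, which is why the particular numbers 7 and 37 (or 0 and 1) carry no information about the preferences themselves.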

So, I mean, yeah, you can make the problem go away by assuming bounded utility, but if you were trying to say something more than that, a bounded utility that is somehow "closer" to unbounded utility, then no such notion is meaningful.

Apologies if I've misunderstood what you're trying to do.

So, I mean, yeah, you can make the problem go away by assuming bounded utility, but if you were trying to say something more than that, a bounded utility that is somehow "closer" to unbounded utility, then no such notion is meaningful.

Say our utility function assigns an actual thing in the universe with value **V1** and the utility function is bounded by value **X**. What I'm saying is that we can make the problem go away by assuming bounded utility *but without* actually having to define the ratio between **V1** and **X** as a specific finite number (this would not change upon scaling).

This means that, if your utility function is something like "number of happy human beings", you don't have to worry about your utility function breaking if the maximum number of happy human beings is larger than you expected since you never have to define such an expectation. See my sub-sub-reply to Eigil Rischel's sub-reply for elaboration.

OK, so going by that you're suggesting, like, introducing varying caps and then taking limits as the cap goes to infinity? It's an interesting idea, but I don't see why one would expect it to have anything to do with preferences.

Yes, I think that's a good description.

I don't see why one would expect it to have anything to do with preferences.

In my case, it's a useful distinction because I'm the kind of person who thinks that showing that a *real thing* is infinite requires an infinite amount of information. This means I can say things like "my utility function scales upward linearly with the number of happy people" without things breaking because it is essentially impossible to convince me that any finite set of actions could legitimately cause a *literally infinite* number of happy people to exist.

For people who believe they could achieve *actually* infinitely high values in their utility functions, the issues you point out still hold. But I think my utility function is bounded by something eventually even if I can't tell you what that boundary actually is.

Apologies, but it sounds like you've gotten some things mixed up here? The issue is boundedness of utility functions, not whether they can take on infinity as a value. I don't think anyone here is arguing that utility functions don't need to be finite-valued. All the things you're saying seem to be related to the latter question rather than the former, or you seem to be possibly conflating them?

In the second paragraph perhaps this is just an issue of language -- when you say "infinitely high", do you actually mean "arbitrarily high"? -- but in the first paragraph this does not seem to be the case.

I'm also not sure you understood the point of my question, so let me make it more explicit. Taking the idea of a utility function and modifying it as you describe is what I called "backwards reasoning" above -- starting from the idea of a utility function, rather than starting from preferences. Why should one believe that modifying the idea of a utility function would result in something that is meaningful about preferences, without any sort of theorem to say that one's preferences must be of this form?

Ahh, thanks for clarifying. I think what happened was that your modus ponens was my modus tollens -- so when I think about my preferences, I ask "what conditions do my preferences need to satisfy for me to avoid being exploited or undoing my own work?" whereas you ask something like "if my preferences need to correspond to a bounded utility function, what should they be?" [1]. As a result, I went on a tangent about infinity to begin exploring whether my modified notion of a utility function would break in ways that regular ones wouldn't.

Why should one believe that modifying the idea of a utility function would result in something that is meaningful about preferences, without any sort of theorem to say that one's preferences must be of this form?

I agree, one shouldn't conclude anything without a theorem. Personally, I would approach the problem by looking at the infinite wager comparisons discussed earlier and trying to formalize them into additional rationality conditions. We'd need

- an axiom describing what it means for one infinite wager to be "strictly better" than another.
- an axiom describing what kinds of infinite wagers it is rational to be indifferent towards

Then, I would try to find a decisioning-system that satisfies these new conditions *as well* as the VNM-rationality axioms (where VNM-rationality applies). If such a system exists, these axioms would probably bar it from being represented fully as a utility function. If it didn't, that'd be interesting. In any case, whatever happens will tell us more about either the structure our preferences should follow or the structure that our rationality-axioms should follow (if we cannot find a system).

Of course, maybe my modification of the idea of a utility function turns out to show such a decisioning-system exists by construction. In this case, modifying the idea of a utility function would help tell me that my preferences should follow the structure of that modification as well.

Does that address the question?

[1] From your post:

We should say instead, *preferences* are not up for grabs -- utility functions merely encode these, remember. But if we're stating idealized preferences (including a moral theory), then these idealized preferences had better be consistent -- and not literally just consistent, but obeying rationality axioms to avoid stupid stuff. Which, as already discussed above, means they'll correspond to a bounded utility function.

Ahh, thanks for clarifying. I think what happened was that your modus ponens was my modus tollens -- so when I think about my preferences, I ask "what conditions do my preferences need to satisfy for me to avoid being exploited or undoing my own work?" whereas you ask something like "if my preferences need to correspond to a bounded utility function, what should they be?" [1]

That doesn't seem right. The whole point of what I've been saying is that we can write down some simple conditions that ought to be true in order to avoid being exploitable or otherwise incoherent, and then it follows as a conclusion that they have to correspond to a [bounded] utility function. I'm confused by your claim that you're asking about conditions, when you haven't been talking about conditions, but rather ways of modifying the idea of decision-theoretic utility.

Something seems to be backwards here.

I agree, one shouldn't conclude anything without a theorem. Personally, I would approach the problem by looking at the infinite wager comparisons discussed earlier and trying to formalize them into additional rationality conditions. We'd need

- an axiom describing what it means for one infinite wager to be "strictly better" than another.
- an axiom describing what kinds of infinite wagers it is rational to be indifferent towards

I'm confused here; it sounds like you're just describing, in the VNM framework, the strong continuity requirement, or in Savage's framework, P7? Of course Savage's P7 doesn't directly talk about these things, it just implies them as a consequence. I believe the VNM case is similar although I'm less familiar with that.

Then, I would try to find a decisioning-system that satisfies these new conditions as well as the VNM-rationality axioms (where VNM-rationality applies). If such a system exists, these axioms would probably bar it from being represented fully as a utility function.

That doesn't make sense. If you *add* axioms, you'll only be able to conclude *more* things, not fewer. Such a thing will necessarily be representable by a utility function (that is valid for finite gambles), since we have the VNM theorem; and then additional axioms will just add restrictions. Which is what P7 or strong continuity do!

Thanks for the reply. I re-read your post and your post on Savage's proof and you're right on all counts. For some reason, it didn't actually click for me that P7 was introduced to address unbounded utility functions and boundedness was a consequence of taking the axioms to their logical conclusion.

Well, it's worth noting that P7 is introduced to address gambles with infinitely many possible outcomes, regardless of whether those outcomes are bounded or not (which is the reason I argue above you can't just get rid of it). But yeah. Glad that's cleared up now! :)

This is a reasonable way to resolve the paradox, but note that you're required to fix the max number of people ahead of time - and it can't change as you receive evidence (it must be a maximum across all possible worlds, and evidence just restricts the set of possible worlds). This essentially resolves Pascal's mugging by fixing some large number X and assigning probability 0 to claims about more than X people.

This essentially resolves Pascal's mugging by fixing some large number X and assigning probability 0 to claims about more than X people.

I understand why this is from a theoretical perspective: if you define *X* as a finite number, then an "infinite" gamble with low probability can have lower expected value than a finite gamble. It also seems pretty clear that increasing

But from a practical perspective, why do we have to define *X* in greater detail than just "it's a very large finite number but I don't know what it is" and then compare analytically? That is to say

- comparing infinite-gambles to finite gambles by analytically showing that, for large enough X, one of them is higher value than the other
- comparing infinite-gambles to finite gambles by analytically showing that, for large enough **X**, the infinite-gamble is higher value than the finite gamble
- compare finite gambles to finite gambles as normal

Another way to think about this is that, when we decide to take an action, we shouldn't use the function

because we know *X* is a finite number and taking the limit washes out the importance of any terms that don't scale with

If we analogize Action A and Action B to wager A and wager B, we see that the ">" evaluator returns FALSE for all *X* larger than some value of

However, if we analogize Action A to "take Pascal's Mugging" and Action B to "Don't do that", we see that at some finite *X*, the "EV(Pascal's Mugging) > EV(No Pascal's Mugging)" function will return TRUE and always return TRUE for larger values of

And obviously, for all finite gambles, the Expected Values of the finite gambles become independent of *X* for large enough

Pardon the dust on this post; the LaTeX display is acting up.

But, because E-utility functions are so ill-defined, there is, as best I can tell, not really any meaningful distinction between the two. For example, consider a utilitarian theory that assigns to each agent p a real-valued E-utility function U_p, and aggregates them by summing.

If you're not making a prioritarian aggregate utility function by *summing* functions of individual utility functions, the mapping of a prioritarian function to a utility function doesn't always work. Prioritarian utility functions, for instance, can do things like rank-order everyone's utility functions and then sum each individual utility raised to the negative-power of the rank-order ... or something*. They allow interactions *between* individual utility functions in the aggregate function that are not facilitated by the direct summing permitted in utilitarianism.

But then the utilitarian theory described by the U'_p, describes exactly the same theory as the prioritarian theory described by the U_p! The theory could equally well be described as "utilitarian" or "prioritarian"; for this reason, unless one puts further restrictions on E-utility functions, I do not consider there to be any meaningful difference between the two.

So from a mathematical perspective, it is possible to represent many prioritarian utility functions as conventional utilitarian utility functions. However, from an intuitive perspective, they mean different things:

- If you take a bunch of people's individual utilities and aggregate them by summing the square roots, you're implying: "we care about improving the welfare of worse-off people *more* than we care about improving the welfare of better-off people"
- If you put the square-root *into* the utility functions, you're implying "we believe that whatever-metric-is-going-in-the-square-root provides diminishing returns on individual welfare as it increases."

This doesn't practically affect the decision-making of a moral agent but it does reflect different underlying philosophies -- which affects the kinds of utility functions people might propose.

*[EDIT: what I was thinking of was something like \sum (a)^(-s_i) U_i where s_i is the rank-order of U_i in the sequence of all experiences individual utility functions. If a is below 1, this ensures that the welfare improvement of a finite number of low-welfare beings will be weighted more highly than the welfare improvement of any amount of higher welfare beings (for specific values of "low welfare" and "high welfare"). There's a paper on this that I can't find right now.]

If you're not making a prioritarian aggregate utility function by summing functions of individual utility functions, the mapping of a prioritarian function to a utility function doesn't always work. Prioritarian utility functions, for instance, can do things like rank-order everyone's utility functions and then sum each individual utility raised to the negative-power of the rank-order ... or something*. They allow interactions between individual utility functions in the aggregate function that are not facilitated by the direct summing permitted in utilitarianism.

This is a good point. I might want to go back and edit the original post to account for this.

So from a mathematical perspective, it is possible to represent many prioritarian utility functions as conventional utilitarian utility functions. However, from an intuitive perspective, they mean different things:

This doesn't practically affect the decision-making of a moral agent but it does reflect different underlying philosophies -- which affects the kinds of utility functions people might propose.

Sure, I'll agree that they're different in terms of ways of thinking about things, but I thought it was worth pointing out that in terms of what they actually propose they are largely indistinguishable without further constraints.

I don't believe that infinite gambles are a thing. In fact they feel almost self-evidently at best an approximation.

you can't have both unbounded utility functions, and meaningful expected utility comparisons for infinite gambles

Are you sure about this? My intuition is that there should be a formalism where the utility functions can be unbounded so long as their expectations remain well defined and bounded. The price to pay is obviously that your agent won't have well defined preferences *for all gambles*, but often we're talking about an agent's preferences on a *specific* gamble or subset of gambles; in those cases, certain unbounded utilities should be fine.

By "a specific gamble" do you mean "a specific pair of gambles"? Remember, preferences are between two things! And you hardly need a utility function to express a preference between a single pair of gambles.

I don't understand how to make sense of what you're saying. Agent's preferences are the starting point -- preferences as in, given a choice between the two, which do you pick? It's not clear to me how you have a notion of preference that allows for this to be undefined (the agent can be *indifferent*, but that's distinct).

I mean, you could try to come up with such a thing, but I'd be pretty skeptical of its meaningfulness. (What happens if you program these preferences into an FAI and then it hits a choice for which its preference is undefined? Does it act arbitrarily? How does this differ from indifference, then? By lack of transitivity, maybe? But then that's effectively just nontransitive indifference, which seems like it would be a problem...)

I think your comment is the sort of thing that sounds reasonable if you reason backward, starting from the idea of expected utility, but will fall apart if you reason forward, starting from the idea of preferences. But if you have some way of making it work, I'd be interested to hear...

By "a specific gamble" do you mean "a specific pair of gambles"? Remember, preferences are between two things! And you hardly need a utility function to express a preference between a single pair of gambles.

This is true, then it would only be between a specific subset of gambles.

I don't understand how to make sense of what you're saying. Agent's preferences are the starting point -- preferences as in, given a choice between the two, which do you pick? It's not clear to me how you have a notion of preference that allows for this to be undefined (the agent can be indifferent, but that's distinct).

I mean, you could try to come up with such a thing, but I'd be pretty skeptical of its meaningfulness. (What happens if you program these preferences into an FAI and then it hits a choice for which its preference is undefined? Does it act arbitrarily? How does this differ from indifference, then? By lack of transitivity, maybe? But then that's effectively just nontransitive indifference, which seems like it would be a problem...)

I think you should be able to set things up so that you never encounter a pair of gambles where this is undefined. I'll illustrate with an example. Suppose you start with a prior over the integers, such that:

p(n) = (C/F(n)) where F(n) is a function that grows really fast and C is a normalization constant. Then the set of gambles that we're considering would be posteriors on the integers given that they obey certain properties. For instance, we could ask the agent to choose between the posterior over integers given that n is odd vs the posterior given that n is even.

I'm pretty sure that you can construct an agent that behaves as if it had an unbounded utility function in this case. So long as the utility associated with an integer n grows sufficiently slower than F(n), all expectations over posteriors on the integers should be well defined.

If you were to build an FAI this way, it would never end up in a belief state where the expected utility diverges between two outcomes. The expected utility would be well defined over any posterior on its prior, so its choice given a pair of gambles would also be well defined for any belief state it could find itself in.
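A minimal numerical sketch of the construction above, under assumptions that are not in the original comment: the prior is taken to be p(n) = 2^(-n) on n = 1, 2, ... (so F(n) = 2^n and C = 1), and the unbounded utility is taken to be U(n) = n. All function names are hypothetical. Truncating the sums at larger and larger cutoffs, the conditional expected utilities given "n is odd" and "n is even" both settle to finite values, so the comparison is well defined even though U is unbounded.

```python
from fractions import Fraction

def prior(n):
    """Assumed prior p(n) = 2**-n on n = 1, 2, ... (F(n) = 2**n, C = 1)."""
    return Fraction(1, 2 ** n)

def utility(n):
    """Assumed unbounded utility; grows much more slowly than F(n)."""
    return n

def conditional_expected_utility(event, cutoff):
    """E[U | event], with the infinite sums truncated at n = cutoff."""
    support = [n for n in range(1, cutoff + 1) if event(n)]
    mass = sum(prior(n) for n in support)
    weighted = sum(prior(n) * utility(n) for n in support)
    return weighted / mass

def is_odd(n):
    return n % 2 == 1

def is_even(n):
    return n % 2 == 0

for cutoff in (10, 50, 200):
    eu_odd = float(conditional_expected_utility(is_odd, cutoff))
    eu_even = float(conditional_expected_utility(is_even, cutoff))
    print(cutoff, eu_odd, eu_even)
```

With these assumed choices the truncated values approach 5/3 and 8/3, so this agent would prefer the "even" posterior. This only illustrates convergence for one prior and one family of events; it is not a proof that preferences stay well defined for every gamble the agent might be offered.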

Huh. This would need some elaboration, but this is definitely the most plausible way around the problem I've seen.

Now (in Savage's formalism) actions are just functions from world-states to outcomes (maybe with a measurability condition), so regardless of your prior it's easy to construct the relevant St. Petersburg gambles if the utility function is unbounded. But seems like what you're saying is, if we *don't* allow arbitrary actions, then the prior could be such that, not only are none of the permitted actions St. Petersburg gambles, but also this remains the case even after future updates. Interesting! Yeah, that just might be workable...

(**Edit** May 9th: I've gone and added a quick addendum to the end.)

In this post I'd basically like to collect some underappreciated points about utility functions that I've made in the comments of various places but which I thought were worth collecting into a proper, easily-referenceable post. The first part will review the different things referred to by the term "utility function", review how they work, and explain the difference between them. The second part will explain why -- contrary to widespread opinion on this website -- decision-theoretic utility functions really do need to be bounded.

(It's also worth noting that as a consequence, a number of the decision-theoretic "paradoxes" discussed on this site simply *are not problems* since they rely on unbounded decision-theoretic utility. An example is the original Pascal's Mugging (yes, I realize that term has since been applied to a bunch of things that have nothing to do with unbounded utility, but I mean the original problem).)

Anyway. Let's get on with it.

## Part 1: "Utility function" refers to two different things that are often conflated and you should be sure you know which one you're talking about

The term "utility function" refers to two significantly different, but somewhat related, things, which are, due to the terminological and conceptual overlap, often conflated. This results in a lot of confusion. So, I want to cover the distinction here.

The two things called utility functions are:

- A *decision-theoretic utility function*.
- An *E-utility* function.

(There's actually a third thing sometimes called a "utility function", which also gets confused with these other two, but this is a rarer and IMO less important usage; I'll get back to this in a bit.)

It's important to note that much discussion online conflates all of these and yields nonsense as a result. If you see someone talking nonsense about utility functions, before replying, it's worth asking -- are they mixing together different definitions of "utility function"?

So. Let's examine these in a bit more detail.

## Decision-theoretic utility functions and their assumptions

Decision-theoretic utility functions describe the preferences of any consequentialist agent satisfying certain rationality conditions; by "it describes the agent's preferences", I mean that, given a choice between two options, the one yielding the higher expected utility is the one the agent chooses.
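As a small illustration of what "describes the agent's preferences" means here, the sketch below computes expected utility for gambles with finitely many outcomes and picks the higher one. The outcome names, probabilities, and utility values are made up, and `expected_utility` and `prefers` are hypothetical helpers rather than anything from the post. It also checks that a positive affine transformation of the utility function leaves the choice unchanged.

```python
def expected_utility(gamble, utility):
    """gamble: dict mapping outcome -> probability (finitely many outcomes)."""
    return sum(p * utility[outcome] for outcome, p in gamble.items())

def prefers(gamble_a, gamble_b, utility):
    """True if the agent described by `utility` strictly prefers A to B."""
    return expected_utility(gamble_a, utility) > expected_utility(gamble_b, utility)

# Made-up utilities over three outcomes.
utility = {"apple": 0.0, "banana": 1.0, "cherry": 3.0}

gamble_a = {"apple": 0.5, "cherry": 0.5}   # expected utility 1.5
gamble_b = {"banana": 1.0}                 # expected utility 1.0

# A positive affine transformation of the same utilities (scale by 2, shift by 7).
rescaled = {outcome: 2.0 * u + 7.0 for outcome, u in utility.items()}

print(prefers(gamble_a, gamble_b, utility))   # True
print(prefers(gamble_a, gamble_b, rescaled))  # True
```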

It's not obvious in advance that a rational agent's preferences need to be described by a utility function, but there are theorems guaranteeing this; Savage's theorem probably provides the best foundation for this, although the VNM theorem may be a little more familiar. (We'll discuss the difference between these two in more detail in the quick note below and discuss it further in the second part of this post.) Note that these functions are not *entirely* unique -- see below. Also note that these are conditions of rationality under *uncertainty*.

Again, a decision-theoretic utility function simply describes an agent's preferences. It has nothing to do with any particular idea of morality, such as utilitarianism. Although you could say -- as I've said above -- that it assumes a consequentialist agent, who cares only about consequences. So, any rational consequentialist agent has a decision-theoretic utility function; but only a utilitarian would admit the existence of E-utility functions. (While this doesn't *exactly* bear on the point here, it is worth noting that utilitarianism is a specific type of consequentialism and not identical with it!)

Note that real people will not actually obey the required rationality assumptions and thus will not actually have decision-theoretic utility functions; nonetheless, idealized rational agents, and therefore decision-theoretic utility functions, are a useful abstraction for a number of purposes.

Decision-theoretic utility functions are usually stated as taking values in the real numbers, but they're only defined up to positive affine transformations (scaling by a positive constant, and translation); applying such a transformation to a utility function for an agent will yield another, equally-valid utility function. As such they may be better thought of not as taking values in **R**, exactly, but rather a sort of ordered 1-dimensional affine space over **R**. Outputs of a decision-theoretic utility function are not individually meaningful; in order to get meaningful numbers, with concrete meaning about the agent's preferences, one must take ratios of utility differences, (a-b)/|c-d|. (Note the absolute value in the denominator but not the numerator, due to the importance of order.)

Decision-theoretic utility functions really need to be bounded -- a point seriously underappreciated on this website -- but I'll save discussion of that for the second part of this post.
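Returning to the ratio (a-b)/|c-d| mentioned above, here is a short check of why it is meaningful when individual utilities are not. The symbols α, β and the primed values are introduced here purely for illustration: suppose every utility x is replaced by x' = αx + β with α > 0. Then

```latex
\[
\frac{a' - b'}{\lvert c' - d' \rvert}
= \frac{(\alpha a + \beta) - (\alpha b + \beta)}{\lvert (\alpha c + \beta) - (\alpha d + \beta) \rvert}
= \frac{\alpha\,(a - b)}{\alpha\,\lvert c - d \rvert}
= \frac{a - b}{\lvert c - d \rvert}.
\]
```

The translation β cancels in each difference and the positive scale α cancels between numerator and denominator, so the ratio (including its sign) is the same for every equally valid representation of the agent's preferences.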

## A quick tangential note on probability and additivity

This is pretty tangential to the point of this post, but it's probably worth taking the time here to explain what the difference is between the Savage and VNM formalisms. (Well, one of two differences; the other will be discussed in the second part of this post, but as we'll see it's actually *not* such a difference.) The main difference is that the VNM theorem assumes that we already believe in the idea of probability -- it justifies decision-theoretic utility, but it does nothing to justify probability, it just assumes it. Savage's theorem, by contrast, provides a foundation for both probability and decision-theoretic utility simultaneously, based just on rationality axioms about preferences, which is why I think it's the better foundation.

However, the probability measure it constructs need not actually be a probability measure as such, as it need only be finitely additive rather than countably additive. It's not clear what to make of this. Maybe countable additivity of probability just isn't necessary for a rational agent? It's hard to say. (If I'm not mistaken, the limiting probabilities of MIRI's logical inductor are merely (the analogue of) finitely additive, not countably additive, but I could be wrong about that...) But this is really off the point, so I'm just going to raise the question and then move on; I just wanted to mention it to ward off nitpicks on this point. As we'll see below, the choice of formalism doesn't actually matter much to my point here.

## E-utility functions and their assumptions

This is the older meaning of the term if I'm not mistaken, but there is *mostly* not a lot to say about these because they're fairly ill-defined. They are, as mentioned above, specifically a utilitarian notion (*not* a general consequentialist notion). How to define these, as well as how to aggregate them, remain disputed.

Utilitarians say that one should try to maximize the expected value of the aggregated utility function, which means that the aggregated function is actually a weird sort of decision-theoretic utility function (corresponding to an ideal moral agent rather than any particular agent), not an E-utility function. One does not attempt to maximize expected value of E-utility functions.

One thing we *can* say about E-utility functions is that while only idealized rational agents have decision-theoretic utility functions and not real people, real people *are* supposed to have E-utility functions. Or at least so I gather, or else I don't see how utilitarianism makes sense?

Actually, one could say that it is not only utilitarians who rely on these -- there is also the notion of prioritarianism; one sometimes sees the term "aggregative consequentialism" to cover both of these (as well as other potential variants). But, because E-utility functions are so ill-defined, there is, as best I can tell, not really any meaningful distinction between the two. For example, consider a utilitarian theory that assigns to each agent p a real-valued E-utility function U_p, and aggregates them by summing. Let's suppose further that each U_p takes values in the nonnegative reals; then if we change the aggregation rule to summing the square roots of the U_p, we have changed our utilitarian theory into a prioritarian one. Except, instead of doing that, we could define U'_p = sqrt(U_p), and call U'_p the E-utilities; because there's no precise definition of E-utilities, there's nothing stopping us from doing this. But then the utilitarian theory described by the U'_p describes exactly the same theory as the prioritarian theory described by the U_p! The theory could equally well be described as "utilitarian" or "prioritarian"; for this reason, unless one puts further restrictions on E-utility functions, I do not consider there to be any meaningful difference between the two.
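The relabeling argument above can be checked on a toy case. This is only a sketch: the E-utility numbers are invented, and the two aggregation helpers are hypothetical names for "sum" and "sum of square roots".

```python
import math

def aggregate_sum(e_utilities):
    """'Utilitarian' aggregation: plain sum of E-utilities."""
    return sum(e_utilities)

def aggregate_sum_of_sqrt(e_utilities):
    """'Prioritarian' aggregation: sum of square roots of E-utilities."""
    return sum(math.sqrt(u) for u in e_utilities)

# Made-up nonnegative E-utilities U_p for three people in two worlds.
world_1 = [1.0, 4.0, 9.0]
world_2 = [0.0, 0.0, 16.0]

# Relabel: U'_p = sqrt(U_p).
world_1_relabeled = [math.sqrt(u) for u in world_1]
world_2_relabeled = [math.sqrt(u) for u in world_2]

# Prioritarian theory on U_p versus utilitarian theory on U'_p.
print(aggregate_sum_of_sqrt(world_1), aggregate_sum(world_1_relabeled))  # 6.0 6.0
print(aggregate_sum_of_sqrt(world_2), aggregate_sum(world_2_relabeled))  # 4.0 4.0
```

Both descriptions assign 6.0 to the first world and 4.0 to the second, so they rank every pair of worlds identically; only the label attached to the numbers differs.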

As such, throughout this post I simply say "utilitarianism" rather than "aggregative consequentialism"; but if I'm wrong in identifying the two, well, whenever I say "utilitarianism" I really kind of mean "aggregative consequentialism". Hope that's OK.

## Preference utilitarianism and Harsanyi's theorem (using decision-theoretic utility functions as E-utility functions)

Above I've made a point of emphasizing that decision-theoretic utility and E-utility functions are different things. But could there be cases where it makes sense to use one as the other? Specifically, to use decision-theoretic utility functions as E-utility functions? (The reverse clearly doesn't make much sense.)

Well, yes, that's basically what preference utilitarianism is! OK, precise formulations of preference utilitarianism may vary, but the idea is to use people's preferences as E-utility functions; and how are you going to encode people's preferences if not with decision-theoretic utility functions? (OK, this may only really work for a population of idealized agents, but it's still worth thinking about.)

Indeed, we can go further and formalize this with Harsanyi's theorem, which gives a series of moral assumptions (note: among them is that the agents in the population do indeed have decision-theoretic utility functions!) under which morality does indeed come down to maximizing a sort of aggregate of the population's decision-theoretic utility functions.

(Note that it also assumes that the population is fixed, which arguably assumes away a lot of the hard parts of utilitarianism, but it's still a useful starting point.)

But, what is this aggregation? If we think of the agents' utility functions as taking values in **R**, as they're usually thought of, then the aggregation consists of summing *a* utility function for each agent. But which one? As mentioned above, utility functions are only unique up to positive affine transformations. Harsanyi's theorem provides no guidance on *which* utility function to use for each agent -- how could it? They're all equally valid. And yet using different ones can yield very different (and *meaningfully* different) aggregated results, essentially letting you adjust weightings between agents! Except there's no meaningful notion of "equal weighting" to use as baseline. It's something of a problem.

(This is often discussed in terms of "weights", coefficients put in front of the utility functions; but I think this obscures the fundamental issue, in making it sound like there's a meaningful notion of "equal weights" when there really isn't.)
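To make the weighting problem concrete, here is a toy sketch with two agents and two options. The agents, options, and numbers are all invented, and `aggregate` and `best_option` are hypothetical helpers. Rescaling one agent's utility function by a positive constant leaves that agent's own preferences untouched, yet flips which option the sum favors.

```python
def aggregate(option, utility_functions):
    """Sum one chosen utility function per agent at the given option."""
    return sum(u[option] for u in utility_functions)

def best_option(utility_functions, options=("x", "y")):
    """Option with the highest aggregate utility."""
    return max(options, key=lambda option: aggregate(option, utility_functions))

# Made-up decision-theoretic utilities for two agents over two options.
alice = {"x": 1.0, "y": 0.0}   # Alice prefers x
bob = {"x": 0.0, "y": 0.6}     # Bob prefers y

# An equally valid representation of Bob's preferences (scaled by 10).
bob_rescaled = {option: 10.0 * u for option, u in bob.items()}

print(best_option([alice, bob]))           # x
print(best_option([alice, bob_rescaled]))  # y
```

Nothing in either agent's preferences singles out one of the two representations of Bob as the "right" one to sum, which is the gap described above.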

Still, despite these holes, preference utilitarianism and Harsanyi's theorem are definitely worth thinking about.

## Brief note on that third sort of utility function

Finally, before we get to the second part of this post, I wanted to mention that third thing sometimes called a "utility function".

The term "utility function" is sometimes used for a real-valued function that describes an agent's *deterministic* preferences; i.e., if A and B are two options, and U the utility function, then the agent prefers A to B if and only if U(A) > U(B). Note the lack of any requirement here about expected value! This is a weaker sense than a decision-theoretic utility function as I described it above; any decision-theoretic utility function is one of these, but not vice-versa.

While you'll occasionally encounter this, it's frankly a useless and even counterproductive notion. Why? Because fundamentally, it's *the wrong abstraction for the situation*. If uncertainty isn't coming into play, and you're only applying deterministic rationality constraints, then the right structure for describing an agent's preferences is a total preorder. Why would you introduce real numbers? That just restricts what you can express! Not every total preorder will embed in the real numbers. So, there isn't any sensible set of rationality conditions that will lead to this notion of utility function; they'll lead you instead to the idea of a total preorder, and then oops maybe that total preorder will fail to embed in **R** and the agent won't have a "utility function" in this sense.

Such a function is of course only unique up to order-preserving functions on **R**, meaning it's not very unique at all (one more sign of it being the wrong abstraction).

Why were such functions ever even used, when they're clearly the wrong abstraction? I think basically it's because a lot of people lack familiarity with mathematical structures, or how to build an abstraction to suit a set of requirements, and instead tend to just immediately reach for the real numbers as a familiar setting to put things in. (Honestly, that's probably why decision-theoretic utility functions were initially defined as **R**-valued as well; fortunately, in that case, it turns out to be the correct choice! The real numbers can indeed be quite useful...)

Of course, as discussed above, if your agent not only obeys requirements of deterministic rationality, but also requirements of rationality under uncertainty, then in fact they'll have a decision-theoretic utility function, taking values in **R**, and so will have one of these. So in that sense the assumption of taking these values in **R** is harmless. But still...

## Part 2: Yes, decision-theoretic utility functions really do need to be bounded

OK. Now for the main point: Contrary to widespread opinion on this site, decision-theoretic utility functions really do need to be bounded.

First, I'm going to discuss this in terms of Savage's theorem. I realize this is the less familiar formalism for many here, but I think it's the better one; if you're not familiar with it I recommend reading my post on it. I'll discuss the point in terms of the more familiar VNM formalism shortly.

OK. So under Savage's formalism, well, Savage's theorem tells us that (under Savage's rationality constraints) decision-theoretic utility functions must be bounded. Um, OK, hm, that's not a very helpful way of putting it, is it? Let's break this down some more.

There's one specific axiom that guarantees the boundedness of utility functions: Savage's axiom P7. Maybe we don't need axiom P7? Is P7 really an important rationality constraint? It seems intuitive enough, like a constraint any rational agent should obey -- what rational agent could possibly violate it? -- but maybe we can do without it?

Let's hold that thought and switch tracks to the VNM formalism instead. I mean -- why all this discussion of Savage at all? Maybe we prefer the VNM formalism. *That* doesn't guarantee that utility functions are bounded, right?

Indeed, as usually expressed, the VNM formalism doesn't guarantee that utility functions are bounded... except the usual VNM formalism doesn't actually prove that utility functions do everything we want!

The point of a decision-theoretic utility function is that it describes the agent's preferences under uncertainty; given two gambles A and B, the one with the higher expected utility (according to the function) is the one the agent prefers.

Except, the VNM theorem doesn't actually prove this for arbitrary gambles! It only proves it for gambles with finitely many possible outcomes. What if we're comparing two gambles and one of them has infinitely many possible outcomes? This is something utility functions are often used for on this site, and a case I think we really do need to handle -- I mean, *anything* could potentially have infinitely many possible outcomes, couldn't it?

Well, in this case, the VNM theorem by itself provides *absolutely no guarantee* that higher expected utility actually describes the agent's preference! Our utility function might simply *not work* -- might simply fail to correctly describe the agent's preference -- once gambles with infinitely many outcomes are involved!

Hm. How troublesome. OK, let's take another look at Savage and his axiom P7. What happens if we toss that out? There's no longer anything guaranteeing that utility functions are bounded. But also, there's no longer anything guaranteeing that the utility function works when comparing gambles with infinitely many outcomes!

Sounds familiar, doesn't it? Just like with VNM. If you don't mind a utility function that might fail to correctly describe your agent's preferences once infinite gambles get involved, then sure, utility functions can be unbounded. But, well, that's really not something we can accept -- we *do* need to be able to handle such cases; or at least, such cases are often discussed on this site. Which means bounded utility functions. There's not really any way around it.

And if you're still skeptical of Savage, well, this all has an analogue in the VNM formalism too -- you can add additional conditions to guarantee that the utility function continues to work even when dealing with infinite gambles, but you end up proving in addition that the utility function is bounded. I'm not so familiar with this, so I'll just point to this old comment by AlexMennen for that...

Anyway, point is, it doesn't really matter which formalism you use -- either you accept that utility functions are bounded, or you give up on the idea that utility functions produce meaningful results in the face of infinite gambles, and, as I've already said, the second of these is not acceptable.

Really, the basic reason should go through regardless of the particular formalism; you can't have both unbounded utility functions, and meaningful expected utility comparisons for infinite gambles, because, while the details will depend on the particular formalism, you can get contradictions by considering St. Petersburg-like scenarios. For instance, in Savage's formalism, you could set up two St. Petersburg-like gambles A and B such that the agent necessarily prefers A to B but also necessarily is indifferent between them; forcing the conclusion that in fact the agent's utility function must have been bounded, preventing this setup.
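A rough numerical illustration of why unbounded utilities and infinite gambles clash, under assumptions that are not in the original: the gamble yields outcome n with probability 2^(-n), and we compare an unbounded utility u(n) = 2^n with a bounded one u(n) = 1 - 2^(-n). This does not reproduce the Savage-style argument with two gambles A and B; it only shows that the partial sums of expected utility grow without limit in the unbounded case and settle down in the bounded case. The function names are hypothetical.

```python
def partial_expected_utility(utility, terms):
    """Sum of p(n) * utility(n) for n = 1..terms, with p(n) = 2**-n."""
    return sum(utility(n) / 2 ** n for n in range(1, terms + 1))

def unbounded(n):
    """Assumed unbounded utility: each term of the sum contributes exactly 1."""
    return 2 ** n

def bounded(n):
    """Assumed bounded utility: always strictly below 1."""
    return 1 - 2 ** -n

for terms in (10, 100, 1000):
    print(terms,
          partial_expected_utility(unbounded, terms),
          partial_expected_utility(bounded, terms))
```

In the unbounded case the partial sum after k terms is exactly k, so there is no finite expected utility to compare against anything; in the bounded case the partial sums converge (here toward 2/3).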

I'd like to note here a consequence of this I already noted in the intro -- a number of the decision-theoretic "paradoxes" discussed on this site simply *are not problems* since they rely on unbounded decision-theoretic utility. An example is the original Pascal's Mugging; yes, I realize that term has since been applied to a bunch of things that have nothing to do with unbounded utility, but the original problem, the one Yudkowsky was actually concerned with, crucially does.

And I mean, it's often been noted before that these paradoxes go away if bounded utility is assumed, but the point I want to make is stronger -- that the only reason these "paradoxes" seem to come up at all is because contradictory assumptions are being made! That utility functions can be unbounded, *and* that utility functions work for infinite gambles. One could say "utility functions have to be bounded", but from a different point of view, one could say "expected utility is meaningless for infinite gambles"; *either* of these would dissolve the problem, it's only insisting that *neither* of these is acceptable that causes the conflict. (Of course, the second really is unacceptable, but that's another matter.)

## Does normalization solve the weighting problem? (I wouldn't bet on it)

One interesting note about bounded utility functions is that boundedness suggests a solution to the weighting problem discussed above with Harsanyi's theorem; notionally, one could use boundedness to pick a canonical normalization -- e.g., choosing everyone's utility function to have infimum 0 and supremum 1. I say it suggests a solution rather than that it provides a solution, however, in that I've seen nothing to suggest that there's any reason one should actually do that other than it just seeming nice, which, well, is not really a very strong reason for this sort of thing. While I haven't thought too much about it, I'd bet someone can come up with an argument as to why this is actually a really bad idea.

(And, again, this would still leave the problem of population ethics, as well as many others, but still, in this idealized setting...)
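
For concreteness, here is a minimal sketch of what such a normalization would look like, in Python. It assumes a finite set of outcomes, so that infimum and supremum are just minimum and maximum; the function `normalize` and the agents `alice` and `bob` are hypothetical, made up for illustration. It shows only what the rescaling does (it removes the freedom to rescale each agent's utility function by a positive affine transformation), not that doing so is a good idea.

```python
def normalize(utilities):
    """Rescale a utility function so its minimum is 0 and its maximum is 1."""
    lowest = min(utilities.values())
    highest = max(utilities.values())
    if highest == lowest:
        # A constant utility function has no range to rescale.
        raise ValueError("utility function is constant; normalization is undefined")
    return {
        outcome: (value - lowest) / (highest - lowest)
        for outcome, value in utilities.items()
    }


# Two hypothetical agents, each with utilities on an arbitrary scale.
alice = {"x": -10.0, "y": 0.0, "z": 30.0}
bob = {"x": 5.0, "y": 1.0, "z": 3.0}

# Equal-weight aggregate of the normalized utility functions.
alice_normalized = normalize(alice)
bob_normalized = normalize(bob)
aggregate = {
    outcome: alice_normalized[outcome] + bob_normalized[outcome] for outcome in alice
}

# Rescaling Alice's utilities by a positive affine map leaves the result unchanged.
alice_rescaled = {outcome: 3 * value + 7 for outcome, value in alice.items()}
print(normalize(alice_rescaled) == alice_normalized)
print(aggregate)
```

Whether equal weights on the normalized functions are the right weights is exactly the question this normalization does not answer.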

## Some (bad) arguments against boundedness

Finally, I want to take a moment here to discuss some arguments against boundedness that have come up here.

Eliezer Yudkowsky has argued against this (I can't find the particular comment at the moment, sorry) basically on the grounds that total utilitarianism in a universe that can contain arbitrarily many people requires unbounded utility functions. Which I suppose it does. But, to put it simply, if your ethical assumptions contradict the mathematics, it's not the mathematics that's wrong.

That's being a bit flip, though, so let's examine this in more detail to see just where the problem is.

Eliezer would point out that the utility function is not up for grabs. To which I can only say, yes, exactly -- except that this way of formulating it is slightly less than ideal. We should say instead, preferences are not up for grabs -- utility functions merely encode these, remember. But if we're stating idealized preferences (including a moral theory), then these idealized preferences had better be consistent -- and not literally just consistent, but obeying rationality axioms to avoid stupid stuff. Which, as already discussed above, means they'll correspond to a bounded utility function. So if your moral theory is given by an unbounded utility function, then it is not, in fact, a correct description of anyone's idealized preferences, no matter how much you insist it is, because you're saying that people's idealized (not real!) preferences are, essentially, inconsistent. (I mean, unless you claim that it's not supposed to be valid for infinite gambles, in which case it can I suppose be correct within its domain of applicability, but it won't be a complete description of your theory, which will need some other mechanism to cover those cases; in particular this means your theory will no longer be utilitarian, if that was a goal of yours, and so in particular will not be total-utilitarian.)

One could question whether the rationality constraints of Savage (or VNM, or whatever) really apply to an aggregated utility function -- above I claimed this should be treated as a decision-theoretic utility function, but is this claim correct? -- but I think we have to conclude that they do for the same reason that they apply to preferences of ideal agents, i.e., they're supposed to be a consistent set of preferences; an inconsistent (or technically consistent but having obvious perversities) moral system is no good. (And one could imagine, in some idealized world, one's ethical theory being programmed as the preferences of an FAI, so...)

Basically the insistence on unbounded utility functions strikes me as, really, backwards reasoning -- the sort of thing that only makes sense if one starts with the idea of maximizing expected utility (and maybe not distinguishing too strongly between the two different things called "utility functions"), rather than if one starts from agents' actual preferences and the rationality constraints these must obey. If one remembers that utility functions are merely meant to describe preferences that obey rationality constraints, there's no reason you'd ever want them to be unbounded; the math rules this out. If one reasons backwards, however, and starts with the idea of utility functions, it seems like a harmless little variant (it isn't). So, I'd like to encourage everyone reading this to beware of this sort of backwards thinking, and to remember that the primary thing is agents' preferences, and that good rationality constraints are directly interpretable in terms of these. Whereas "the agent has a decision-theoretic utility function"... what does that mean, concretely? Why are there real numbers involved, where did those come from? These are a lot of very strong assumptions to be making with little reason! Of course, there are good reasons to believe these strong-sounding claims, such as the use of real numbers specifically; but they make sense as conclusions, not assumptions.

## Tangential note about other formalisms (or: I have an axe to grind, sorry)

One final tangential note: Eliezer Yudkowsky has occasionally claimed here that probability and decision-theoretic utility should be grounded not in Savage's theorem but rather in the complete class theorem (thus perhaps allowing unbounded utilities, despite the reasons given above for why the particular formalism shouldn't matter?), but the arguments he has presented for this do not make any sense to me and as best I can tell contain a number of claims that are simply incorrect. Like, obviously, the complete class theorem cannot provide a foundation for probability, when it already assumes a notion of probability; I may be mistaken but it looks to me like it assumes a notion of decision-theoretic utility as well; and his claims about it requiring weaker assumptions than Savage's theorem are not only wrong but likely exactly backwards. Apologies for grinding this axe here, but given how this has come up here before I thought it was necessary. Anyway, see previous discussion on this point; I'm not going to discuss it more here. (Again, sorry for that.)

## Conclusion?

Anyway I hope this has clarified the different things meant by the term "utility function", so you can avoid getting these mixed up in the future, and if you see confused discussion of them you can come in and de-confuse the issue.

...and yes, decision-theoretic utility functions really do need to be bounded.

Addendum May 9th: I should note, if for some reason you really want to bite the bullet and instead say, OK, utility functions don't apply to choices with infinitely many possible outcomes, then, well, I think that's silly, but it is consistent. But what I want to make clear here is that doing both that and insisting utility functions should be unbounded -- which is what you need for "paradoxes" as discussed above to come up -- is just inconsistent; you are at the least going to need to pick one of the horns of the dilemma