Against the Linear Utility Hypothesis and the Leverage Penalty

14th Dec 2017

47 comments

Absolutely it is the case that utility should be bounded. However as best I can tell you've left out the most fundamental reason why, so I think I should explain that here. (Perhaps I should make this a separate post?)

The basic question is: Where do utility functions come from? Like, why should one model a rational agent as having a utility function at all? The answer of course is either the VNM theorem or Savage's theorem, depending on whether or not you're pre-assuming probability (you shouldn't, really, but that's another matter). Right, both these theorems take the form of, here's a bunch of conditions any rational agent should obey, let's show that such an agent must in fact be acting according to a utility function (i.e. trying to maximize its expected value).

Now here's the thing: The utility functions output by Savage's theorem are always bounded. Why is that? Well, essentially, because otherwise you could set up a St. Petersburg paradox that would contradict the assumed rationality conditions (in short, you can set up two gambles, both of "infinite expected utility", but where one dominates the other, and show that both A. the agent must prefer the first to the second, but also B. the agent must be indifferent between them, contradiction). Thus we conclude that the utility function must be bounded.
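To make the divergence concrete, here is a minimal sketch, not Savage's actual argument. It assumes the textbook St. Petersburg gamble (outcome k has probability 2^-k and payoff 2^k) and compares a linear utility with a bounded one; the particular bounded function x / (1 + x) is a hypothetical choice made only for illustration.

```python
from fractions import Fraction

def partial_expected_utility(utility, n_terms):
    """Sum P(outcome k) * utility(payoff k) over the first n_terms outcomes.

    Outcome k (k = 1, 2, ...) has probability 2**-k and payoff 2**k.
    """
    return sum(Fraction(1, 2**k) * utility(2**k) for k in range(1, n_terms + 1))

def linear(x):
    # Unbounded utility: utility equals the payoff.
    return Fraction(x)

def bounded(x):
    # Bounded utility (illustrative assumption): always strictly below 1.
    return Fraction(x, 1 + x)

for n in (10, 20, 40):
    print(n, partial_expected_utility(linear, n),
          float(partial_expected_utility(bounded, n)))
```

With the linear utility every outcome contributes exactly 1, so the partial sums grow without limit; with the bounded utility they settle down to a finite value.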

OK, but what if we base things around the VNM theorem, then? It requires pre-assuming the notion of probability, but the utility functions output by the VNM theorem aren't guaranteed to be bounded.

Here's the thing: The VNM theorem *only guarantees that the utility function it outputs works for finite gambles*. Seriously. The VNM theorem gives *no* guarantee that the agent is acting according to the specified utility function when presented with a gamble with infinitely many possible outcomes, only when presented with a gamble with finitely many outcomes.

Similarly, with Savage's theorem, the assumption that forces utility functions to be bounded -- P7 -- is the same one that guarantees that the utility function works for infinite gambles. You can get rid of P7, and you'll no longer be guaranteed to get a bounded utility function, but neither will you be guaranteed that the utility function will work for gambles with infinitely many possible outcomes.

This means that, fundamentally, if you want to work with infinite gambles, you need to only be talking about bounded utility functions. If you talk about infinite gambles in the context of unbounded utility functions, well, you're basically talking nonsense, because there's just absolutely no guarantee that the utility function you're using applies in such a situation. The problems of unbounded utility that Eliezer keeps pointing out, that he insists we need to solve, really are just straight contradictions arising from him making bad assumptions that need to be thrown out. Like, they all stem from him assuming that unbounded utility functions work in the case of infinite gambles, and there simply *is no such guarantee*; not in the VNM theorem, not in Savage's theorem.

If you're assuming infinite gambles, you need to assume bounded utility functions, or else you need to accept that in cases of infinite gambles the utility function doesn't actually apply -- making the utility function basically useless, because, well, everything has infinitely many possible outcomes. Between a utility function that remains valid in the face of infinite gambles, and unbounded utility, it's pretty clear you should choose the former.

And between Savage's axiom P7 and unbounded utility, it's pretty clear you should choose the former. Because P7 is an assumption that directly describes a rationality condition on the agent's preferences, a form of the sure-thing principle, one we can clearly see had better be true of any rational agent; while unbounded utility... means what, exactly, in terms of the agent's preferences? Something, certainly, but not something we obviously need. And in fact we don't need it.

As best I can tell, Eliezer keeps insisting we need unbounded utility functions out of some sort of commitment to total utilitarianism or something along the lines of such (that's my summary of his position, anyway). I would consider that to be on **much** shakier ground (there are *so* many nonobvious large assumptions for something like that to even make sense, seriously I'm not even going into it) than obvious things like the sure-thing principle, or that a utility function is nearly useless if it's not valid for infinite gambles. And like I said, as best I can tell, Eliezer keeps assuming that the utility function is valid in such situations even though there's nothing guaranteeing this; and this assumption is just in contradiction with his assumption of an unbounded utility function. He should keep the validity assumption (which we need) and throw out the unboundedness one (which we don't).

That, to my mind, is the most fundamental reason we should only be considering bounded utility functions!

I'm not familiar with Savage's theorem, but I was aware of the things you said about the VNM theorem, and in fact, I often bring up the same arguments you've been making. The standard response that I hear is that some probability distributions cannot be updated to without an infinite amount of information (e.g. if a priori the probability of the nth outcome is proportional to 1/3^n, then there can't be any evidence that could occur with nonzero probability that would convince you that the probability of the nth outcome is 1/2^n for each n), and there's no need for a utility function to converge on gambles that it is impossible even in theory for you to be convinced are available options.
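The 1/3^n example can be spelled out as follows. This is a sketch under my own normalization, with outcomes indexed from n = 1, a prior p_n proportional to 1/3^n and a target q_n = 1/2^n; E stands for any evidence with P(E) > 0.

```latex
\begin{aligned}
p_n &= 2\cdot 3^{-n}, \qquad q_n = 2^{-n}, \qquad n = 1, 2, \dots \\
P(n \mid E) &= \frac{P(E \mid n)\,p_n}{P(E)} \;\le\; \frac{p_n}{P(E)} \\
\frac{q_n}{p_n} &= \frac{1}{2}\left(\frac{3}{2}\right)^{n} \;\longrightarrow\; \infty
\quad\text{as } n \to \infty
\end{aligned}
```

If the posterior P(n | E) equalled q_n for every n, the second line would force q_n / p_n ≤ 1 / P(E) for all n, which the third line rules out for any P(E) > 0.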

When I ask why they assume that their utility function should be valid on those infinite gambles that are possible for them to consider, if they aren't assuming that their preference relation is closed in the strong topology (which implies that the utility function is bounded), they'll say something like that their utility function not being valid where their preference relation is defined seems weirdly discontinuous (in some sense that they can't quite formalize and definitely isn't the preference relation being closed in the strong topology), or that the examples I gave them of VNM-rational preference relations for which the utility function isn't valid for infinite gambles all have some other pathology, like that there's an infinite gamble which is considered either better than all of or worse than all of the constituent outcomes, and there might be a representation theorem saying something like that has to happen, even though they can't point me to one.

Anyway, I agree that that's a more fundamental reason to only consider bounded utility functions, but I decided I could probably be more convincing by abandoning that line of argument, and showing that if you sweep convergence issues under the rug, unbounded utility functions still suggest insane behavior in concrete situations.

Huh, I only just saw this for some reason. Anyway, if you're not familiar with Savage's theorem, that's why I wrote the linked article here about it! :)

The problems of unbounded utility that Eliezer keeps pointing out, that he insists we need to solve, really are just straight contradictions arising from him making bad assumptions that need to be thrown out. Like, they all stem from him assuming that unbounded utility functions work in the case of infinite gambles

Just to be clear, you're not thinking of 3↑↑↑3 when you talk about infinite gambles, right?

I'm not sure I know what argument of Eliezer's you're talking about when you reference infinite gambles. Is there an example you can link to?

He means gambles that can have infinitely many different outcomes. This causes problems for unbounded utility functions because of the Saint Petersburg paradox.

But the way you solve the St Petersburg paradox in real life is to note that nobody has infinite money, nor infinite time, and therefore it doesn't matter if your utility function spits out a weird outcome for it because you can have a prior of 0 that it will actually happen. Am I missing something?

Huh, I only just saw this for some reason.

Anyway yes AlexMennen has the right of it.

I don't have an example of Eliezer's remarks to hand. By which I mean, I remember seeing them on old LW, but I can't find them at the moment. (Note that I'm interpreting what he said... very charitably. What he actually said made considerably less sense, but we can perhaps steelman it as a strong commitment to total utilitarianism.)

I'm not sure your refutation of the leverage penalty works. If there really are 3 ↑↑↑ 3 copies of you, your decision conditioned on that may still not be to pay. You have to compare

P(A real mugging will happen) x U(all your copies die)

against

P(fake muggings happen) x U(lose five dollars) x (expected number of copies getting fake-mugged)

where that last term will in fact be proportional to 3 ↑↑↑ 3. Even if there is an incomprehensibly vast matrix, its Dark Lords are pretty unlikely to mug you for petty cash. And this plausibly does make you pay in the Muggle case, since P(fake muggings happen) is way down if 'mugging' involves tearing a hole in the sky.
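A rough sketch of that comparison follows. Every number in it is made up, `should_pay` and `decision` are hypothetical helper names, and a merely large integer stands in for 3 ↑↑↑ 3, which cannot be represented. It also assumes that U(all your copies die) scales linearly with the number of copies, so that the number of copies cancels out of the comparison.

```python
from fractions import Fraction

def should_pay(p_real, loss_if_real, p_fake, loss_per_payment, expected_fake_mugged):
    """Return True if refusing has the larger expected loss of the two products."""
    expected_loss_refusing = p_real * loss_if_real
    expected_loss_paying = p_fake * loss_per_payment * expected_fake_mugged
    return expected_loss_refusing > expected_loss_paying

def decision(n_copies, p_fake):
    # All numbers below are illustrative assumptions, not estimates.
    p_real = Fraction(1, 10**12)        # P(a real mugging will happen)
    value_per_copy = 10**6              # disutility of one copy dying, in dollars
    fake_fraction = Fraction(1, 10**3)  # share of copies who meet a fake mugger
    return should_pay(
        p_real,
        n_copies * value_per_copy,      # U(all your copies die), linear in n_copies
        p_fake,
        5,                              # U(lose five dollars)
        fake_fraction * n_copies,       # expected number of copies getting fake-mugged
    )

# Ordinary mugging: fake muggings are likely, and the answer ignores n_copies.
print(decision(10**6, Fraction(1, 2)), decision(10**100, Fraction(1, 2)))
# Muggle case: P(fake muggings happen) is far lower when the sky gets torn open.
print(decision(10**100, Fraction(1, 10**9)))
```

Under these assumptions the number of copies drops out entirely, and only the drop in P(fake muggings happen) flips the answer toward paying.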

Yes, it looks like you're right. I'll think about this and probably write a follow-up later. Edit: I have finally written that follow-up.

Good points (so I upvoted), but the post could be half as long and make the same points better.

I think I disagree with your approach here.

I, and I think most people in practice, use reflective equilibrium to decide what our ethics are. This means that we can notice that our ethical intuitions are insensitive to scope, but also that upon reflection it seems like this is wrong, and thus adopt an ethics different from that given by our naive intuition.

When we're trying to use logic to decide whether to accept an ethical conclusion counter to our intuition, it's no good to document what our intuition currently says as if that settles the matter.

A priori, 1,000 lives at risk may seem just as urgent as 10,000. But we think about it, and we do our best to override it.

And in fact, I fail pretty hard at it. I'm pretty sure the amount I give to charity wouldn't be different in a world where the effectiveness of the best causes were an order of magnitude different. I suspect this is true of many; certainly anyone following the Giving What We Can pledge is using an ancient Schelling Point rather than any kind of calculation. But that doesn't mean you can convince me that my "real" ethics doesn't care how many lives are saved.

When we talk about weird hypotheticals like Pascallian deals, we aren't trying to figure out what our intuition says; we're trying to figure out whether we should overrule it.

When you use philosophical reflection to override naive intuition, you should have explicit reasons for doing so. A reason for valuing 10,000 lives 10 times as much as 1,000 lives is that both of these are tiny compared to the total number of lives, so if you valued them at a different ratio, this would imply an oddly sharp bend in utility as a function of lives, and we can tell that there is no such bend because if we imagine that there were a few thousand more or fewer people on the planet, our intuitions about that particular tradeoff would not change. This reasoning does not apply to decisions affecting astronomically large numbers of lives, and I have not seen any reasoning that does which I find compelling.

It is also not true that people are trying to figure out whether to overrule their intuition when they talk about Pascal's mugging; typically, they are trying to figure out how to justify not overruling their intuition. How else can you explain the preponderance of shaky "resolutions" to Pascal's mugging that accept the Linear Utility Hypothesis and nonetheless conclude that you should not pay Pascal's mugger, when "I tried to estimate the relevant probabilities fairly conservatively, multiplied probabilities times utilities, and paying Pascal's mugger came out far ahead" is usually not considered a resolution?

This is all basically right.

However, as I said in a recent comment, people do not actually have utility functions. So in that sense, they have neither a bounded nor an unbounded utility function. They can only try to make their preferences less inconsistent. And you have two options: you can pick some crazy consistency very different from normal, or you can try to increase normality at the same time as increasing consistency. The second choice is better. And in this case, the second choice means picking a bounded utility function, and the first choice means choosing an unbounded one, and going insane (because agreeing to be mugged is insane.)

Yes, that's true. The fact that humans are not actually rational agents is an important point that I was ignoring.

I've changed my mind. I agree with this.

The probability of you getting struck by lightning and dying while making your decision is .

The probability of you dying by a meteor strike, by an earthquake, by .... is .

The probability that you don't get to complete your decision for one reason or the other is .

It doesn't make sense then to entertain probabilities vastly lower than , but not entertain probabilities much higher than .

What happens is that our utility is bounded at or below .

This is because we ignore probabilities at that order of magnitude via revealed preference theory.

If you value at times more utility than , then you are indifferent between exchanging for a times chance of .

I'm not indifferent between exchanging for a chance of anything, so my utility is bounded at . It is possible that others deny this is true for them, but they are being inconsistent. They ignore other events with higher probability which may prevent them from deciding, but consider events they assign vastly lower probability.

Yes, this is the refutation for Pascal's mugger that I believe in, although I never got around to writing it up like you did. However, I disagree with you that it implies that our utilities must be bounded. All the argument shows is that ordinary people never assign to events enormous utility values without also assigning them commensurately low probabilities. That is, normative claims (i.e., claims that certain events have certain utility assigned to them) are judged fundamentally differently from factual claims, and require more evidence than merely the complexity prior. In a moral intuitionist framework this is the fact that anyone can say that 3^^^3 lives are suffering, but it would take living 3^^^3 years and getting to know 3^^^3 people personally to feel the 3^^^3 times utility associated with this event.

I don't know how to distinguish the scenarios where our utilities are bounded and where our utilities are unbounded but regularized (or whether our utilities are sufficiently well-defined to distinguish the two). Still, I want to emphasize that the latter situation is possible.

All the argument shows is that ordinary people never assign to events enormous utility values without also assigning them commensurately low probabilities.

Under certain assumptions (which perhaps could be quibbled with), demanding that utility values can't grow too fast for expected utility to be defined actually does imply that the utility function must be bounded.

unbounded but regularized

I'm not sure what you mean by that.

The assumption I'm talking about is that the state of the rest of the universe (or multiverse) does not affect the marginal utility of there also being someone having certain experiences at some location in the uni-/multi-verse.

Now, I am not a friend of probabilities / utilities separately; instead, consider your decision function.

Linearity means that your decisions are independent of observations of far parts of the universe. In other words, you have one system over which your agent optimizes expected utility; and now compare it to the situation where you have two systems. Your utility function is linear iff you can make decisions locally, that is, without considering the state of the other system.

Clearly, almost nothing has a linear decision / utility function.

I think people mistake the following (amazing) heuristic for linear utility: If there are very many local systems, and you have a sufficiently smooth utility and probability distribution for all of them, then you can do mean-field: You don't need to look, the law of large numbers guarantees strong bounds. In this sense, you don't need to couple all the systems, they just all couple to the mean-field.

To be more practical: Someone might claim to have almost linear (altruistic) utility for QALYs over the next 5 years (so time-discounting is irrelevant). Equivalently, whether some war in the Middle East is terrible or not does not influence his/her malaria-focused charity work (say, he/she only decides on this specific topic).

Awesome, he/she does not need to read the news! And this is true to some extent, but becomes bullshit at the tails of the distribution. (The news becomes relevant if e.g. the nukes fly, because they bring you into a more nonlinear regime for utility; on the other hand, given an almost fixed background population, log-utility and linear-utility are indistinguishable to first order, by Taylor's theorem.)
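The log-versus-linear point can be checked numerically. This is a small sketch in which the background population of 7 billion and the helper names are my own assumptions; it compares the exact gain in log-utility from adding people with the first-order (linear) approximation.

```python
import math

def log_utility_gain(background, extra):
    # Exact gain in log-utility: log(background + extra) - log(background).
    return math.log1p(extra / background)

def linearized_gain(background, extra):
    # First-order Taylor term: d/dN log(N) = 1/N, times the change in N.
    return extra / background

def relative_error(background, extra):
    exact = log_utility_gain(background, extra)
    return abs(linearized_gain(background, extra) - exact) / exact

background = 7_000_000_000  # assumed background population
for extra in (1_000, 1_000_000, 7_000_000_000):
    print(extra, relative_error(background, extra))
```

For changes that are tiny next to the background the two utilities agree almost exactly; once the change is comparable to the whole population, the linear approximation is badly off, which is the nonlinear regime described above.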

Re Pascal's muggle: Obviously your chance of getting a stroke and hallucinating weird stuff outweighs your chance of witnessing magic. I think that it is quite clear that you can forget about the marginal cost of giving 5 bucks to an imaginary mugger before the ambulance arrives to maybe save you; decision-theoretically, you win by precommitting to pay the mugger and call an ambulance once you observe something sufficiently weird.

There are a number of good reasons why one would refuse to pay the Pascal's Muggle, and one is based on computational complexity: **Small probabilities are costly. One needs more computing resources to evaluate tiny probabilities accurately.** Once they are tiny enough, your calculations will take too long to be of interest. And the numbers under consideration exceed 10^122, the largest number of quantum states in the observable universe, by so much, you have no chance of making an accurate evaluation before the heat death of the universe.

Incidentally, this is one of the arguments in physics against the AMPS black hole firewall paradox: Collecting the outgoing Hawking radiation takes way longer than the black hole evaporation time.

This also matches the intuitive approach. If you are confronted by something as unbelievable as "In the sky above, a gap edged by blue fire opens with a horrendous tearing sound", as Eliezer colorfully puts it, then the first thing to realize is that you have no way to evaluate the probability of this event being "real" without much more work than you can possibly do in your lifetime. Or at least in the time the Mugger wants you to make a decision. Even to choose whether to precommit to something like that requires more resources than you are likely to have. So Eliezer's statement "*If you assign superexponentially infinitesimal probability to claims of large impacts, then apparently you should ignore the possibility of a large impact even after seeing huge amounts of evidence.*" has two points of failure based on computational complexity:

- Assigning an accurate superexponentially infinitesimal probability is computationally very expensive.
- Figuring out the right amount of updating after "seeing huge amounts of [extremely surprising] evidence" is also computationally very expensive.

So the naive approach, "this is too unbelievable to take seriously, no way I can figure out what is true, what is real and what is fake with any degree of confidence, might as well not bother" is actually the reasonable one.

If you have an unbounded utility function, then putting a lot of resources into accurately estimating arbitrarily tiny probabilities can be worth the effort, and if you can't estimate them very accurately, then you just have to make do with as accurate an estimate as you can make.

I promoted this to featured, in small part because I'm happy about additions to important conversations (Pascal's mugging), in medium part because of the good conversation it inspired (both in the comments and in this post by Zvi), and in large part because I want people to know that these types of technical posts are really awesome and totally a good fit for the LW frontpage, and if they contain good ideas they'll totally be promoted.

I agree with your conclusions and really like the post. Nevertheless I would like to offer a defense of rejecting Pascal's Mugging even with an unbounded utility function, although I am not all that confident that my defense is actually correct.

Warning: *slight rambling and half-formed thoughts below. Continue at own risk.*

If we wish to consider the probability of the event [a stranger blackmails me with the threat to harm 3↑↑↑3 people] we should have some grasp of the likelihood of the claim [there exists a person/being that can willfully harm 3↑↑↑3 people]. There are two reasons I can think of why this claim is so problematic that perhaps we should assign it on the order of 1/3↑↑↑3 probability.

Firstly, while the Knuth up-arrow notation is cutely compact, I think it is helpful to consider the set of claims [there exists a being that can willfully harm *n* people] for each value of *n*. Each claim implies all those before it, so the probabilities of this sequence should be decreasing. From this point of view I find it not at all strange that the probability of this claim can make explicit reference to the number *n* and behave as something like *1/n*. The point I'm trying to make is that while 3↑↑↑3 is only 5 symbols, the claim to be able to harm that many people is **so much more extraordinary** than a lot of similar claims that we might be justified in assigning it really low probability. Compare it to an artificial lottery where we force our computer to draw a number between 0 and 10^10^27 (deliberately chosen larger than the 'practical limit' referred to in the post). I think we can justly assign the claim [The number 3 will be the winning lottery ticket] a probability of 1/10^10^27. Something similar is going on here: there are so many hypotheses about being able to harm *n* people that in order to sum the probabilities to 1 we are forced to assign on the order of 1/n probability to each of them.

Secondly (and I hope this reinforces the slight rambling above) consider how you might be convinced that this stranger can harm 3↑↑↑3 people, as opposed to only having the ability to harm 3↑↑↑3 - 1 people. I think the 'tearing open the sky' magic trick wouldn't do it - this will increase our confidence in the stranger being very powerful by extreme amounts (or, more realistically, convince us that we've gone completely insane), but I see no reason why we would be forced to assign significant probability to this stranger being able to harm 3↑↑↑3 people, instead of 'just' 10^100 or 10^10^26 people or something. Or in more Bayesian terms - which evidence *E* is more likely if this stranger can harm 3↑↑↑3 people than if it can harm, say, only 10^100 people? Which *E* satisfies P(E|stranger can harm 3↑↑↑3 people) > P(E|stranger can harm 10^100 people but not 3↑↑↑3 people)? Any suggestion along the lines of 'present 3↑↑↑3 people and punch them all in the nose' justifies having a prior of 1/3↑↑↑3 for this event, since showing that many people really is evidence with that likelihood ratio. But, hypothetically, if all such evidence *E* is of this form, are our actions then not consistent with the infinitesimal prior, since we require this particular likelihood ratio before we consider the hypothesis likely?

I hope I haven't rambled too much, but I think that the Knuth up-arrow notation is hiding the complexity of the claim in Pascal's Mugging, and that the evidence required to convince me that a being really has the power to do as is claimed has a likelihood ratio close to 3↑↑↑3:1.

It is not true that if someone has the ability to harm n people, then they also necessarily have the ability to harm exactly m of those people for any m<n, so it isn't clear that P(there is someone who has the ability to harm n people) monotonically decreases as n increases. Unless you meant *at least* n people, in which case that's true, but still irrelevant since it doesn't even establish that this probability approaches a limit of 0, much less that it does so at any particular rate.

Compare it to an artificial lottery where we force our computer to draw a number between 0 and 10^10^27 (deliberately chosen larger than the 'practical limit' referred to in the post). I think we can justly assign the claim [The number 3 will be the winning lottery ticket] a probability of 1/10^10^27.

The probability that a number chosen randomly from the uniform distribution on integers from 0 to 10^10^27 is 3 is indeed 1/10^10^27, but I wouldn't count that as an empirical hypothesis. Given any particular mechanism for producing integers that you are very confident implements a uniform distribution on the integers from 0 to 10^10^27, the probability you should assign to it producing 3 is still much higher than 1/10^10^27.

Or in more Bayesian terms - which evidence E is more likely if this stranger can harm 3↑↑↑3 people than if it can harm, say, only 10^100 people?

The stranger says so, and has established some credibility by providing very strong evidence for other a priori very implausible claims that they have made.

Any suggestion along the lines of 'present 3↑↑↑3 people and punch them all in the nose' justifies having a prior of 1/3↑↑↑3 for this event, since showing that many people really is evidence with that likelihood ratio.

That's not evidence that is physically possible to present to a human, and I don't see why you say its likelihood ratio is around 1:3↑↑↑3.

I think I will try writing my reply as a full post, this discussion is getting longer than is easy to fit as a set of replies. You are right that my above reply has some serious flaws.

Another approach might be to go meta. Assume that there are many dire threats theoretically possible which, if true, would justify a person in the sole position to stop them, doing so at near any cost (from paying a penny or five pounds, all the way up to the person cutting their own throat, or pressing a nuke launching button that would wipe out the human species). Indeed, once the size of action requested in response to the threat is maxed out (it is the biggest response the individual is capable of making), all such claims are functionally identical - the magnitude of the threat beyond that needed to max out the response is irrelevant. In this context, there is no difference between **3↑↑↑3** and **3↑↑↑↑3**.

But what policy for responding to claims of such threats should a species have, in order to maximise expected utility?

The moral hazard from encouraging such claims to be made falsely needs to be taken into account.

It is that moral hazard which has to be balanced against a pool of money that, species wide, should be risked on covering such bets. Think of it this way: suppose I, Pascal's Policeman, were to make the claim "On behalf of the time police, in order to deter confidence tricksters, I hereby guarantee that an additional utility will be added to the multiverse equal in magnitude to the sum of all offers made by Pascal Muggers that happen to be telling the truth (if any), in exchange for your not responding positively to their threats or offers."

It then becomes a matter of weighing the evidence presented by different muggers and policemen.

While reading your article I had trouble continuing to take your perspective in good faith after I got to this point:

**"For instance, suppose you're given a choice between the following two options: 1: Humanity grows into a vast civilization of 10^100 people living long and happy lives, or 2: a 10% chance that humanity grows into a vast civilization of 10^102 people living long and happy lives, and a 90% chance of going extinct right now. I think almost everyone would pick option 1, and would think it crazy to take a reckless gamble like option 2. But the Linear Utility Hypothesis says that option 2 is much better."**

This seems like selectively choosing a utility function that does not weigh the negative utility of the state of 'anti-lives' that having NO civilization of people living long and happy lives anywhere in the universe would represent.

I think you could tune a linear relationship of these negative values and accurately get the behavior that people have regarding these options.

This seems a lot like picking the weakest and least credible possible argument to use as a way to refute the entire idea. Which made it much more difficult for me to read the rest of your article with the benefit of the doubt I would have preferred to have held throughout.

The Linear Utility Hypothesis does imply that there is no extra penalty (on top of the usual linear relationship between population and utility) for the population being zero, and it seems to me that it is common for people to assume the Linear Utility Hypothesis unmodified by such a zero-population penalty. Furthermore, a zero-population penalty seems poorly motivated to me, and still does not change the answer that Linear Utility Hypothesis + zero-population penalty would suggest in the thought experiment that you quoted, since you can just talk about populations large enough to dwarf the zero-population penalty.

Refuting a weak argument for a hypothesis is not a good way to refute the hypothesis, but that's not what I'm doing; I'm refuting weak *consequences* of the Linear Utility Hypothesis, and "X implies Y, but not Y" is a perfectly legitimate form of argument for "not X".

1: Humanity grows into a vast civilization of 10^100 people living long and happy lives, or 2: a 10% chance that humanity grows into a vast civilization of 10^102 people living long and happy lives, and a 90% chance of going extinct right now. I think almost everyone would pick option 1, and would think it crazy to take a reckless gamble like option 2. But the Linear Utility Hypothesis says that option 2 is much better. Most of the ways people respond to Pascal's mugger don't apply to this situation, since the probabilities and ratios of utilities involved here are not at all extreme.

This is not an airtight argument.

Extinction of humanity is not 0 utils, it's negative utils. Let the utility of human extinction be -X.

If X > 10^101, then a linear utility function would pick option 1.

Linear Expected Utility (LEU) of option 1:

1.0(10^100) = 10^100.

LEU of option 2:

0.9(-X) + 0.1(10^102) = 10^101 + 0.9(-X)

10^101 - 0.9X < 10^100

-0.9X < 10^100 - 10^101

-0.9X < -9(10^100)

X > 10^101.
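The threshold can be double-checked with exact arithmetic. This is a minimal sketch in which the function names are my own and utilities are counted in lives, as in the derivation above.

```python
from fractions import Fraction

def leu_option_1():
    # Certain outcome: 10^100 people.
    return Fraction(10**100)

def leu_option_2(extinction_penalty):
    # 90% chance of extinction (utility -X), 10% chance of 10^102 people.
    return Fraction(9, 10) * (-extinction_penalty) + Fraction(1, 10) * 10**102

threshold = 10**101
print(leu_option_2(threshold) == leu_option_1())      # indifferent exactly at X = 10^101
print(leu_option_2(threshold + 1) < leu_option_1())   # option 1 wins for larger X
print(leu_option_2(0) > leu_option_1())               # option 2 wins with no penalty
```

All three comparisons hold, so option 1 is preferred exactly when X exceeds 10^101.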

I place the extinction of humanity pretty highly, as it curtails any possible future. So X is always at least as high as the utopia. I would not accept any utopia where the P of human extinction was > 0.51, because the negutility of human extinction outweighs utility of utopia for any possible utopia.

Extinction of humanity just means humanity not existing in the future, so the Linear Utility Hypothesis does imply its value is 0. If you make an exception and add a penalty for extinction that is larger than the Linear Utility Hypothesis would dictate, then the Linear Utility Hypothesis applied to other outcomes would still imply that when considering sufficiently large potential future populations, this extinction penalty becomes negligible in comparison.
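
A rough illustration of why a fixed extinction penalty eventually stops mattering under linear utility. The helper below is hypothetical; it keeps the 10% / 90% split and the 100-fold population ratio from the thought experiment and lets the sure population N vary. Algebraically, the gamble wins whenever 10N - 0.9X > N, that is, whenever N > X/10.

```python
from fractions import Fraction

def gamble_beats_sure_thing(sure_population, extinction_penalty):
    """True if, under linear utility, a 10% shot at 100x the population
    (with a 90% chance of extinction worth -extinction_penalty) has
    strictly higher expected utility than the sure population."""
    gamble = (Fraction(1, 10) * (100 * sure_population)
              - Fraction(9, 10) * extinction_penalty)
    return gamble > sure_population
```

For example, with a penalty of 2 * 10^101 the sure thing wins at N = 10^100, but the gamble wins again once N exceeds 2 * 10^100.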

See this thread. **There is no finite number of lives reaching utopia for which I would accept Omega's bet at a 90% chance of extinction**.

Human extinction now for me is worse than losing 10 trillion people, if the global population was 100 trillion.

My disutility of extinction isn't just the number of lives lost. It involves the termination of all future potential of humanity, and I'm not sure how to value that, but see the bolded.

I don't assign a disutility to extinction, and my preference with regards to extinction is probably lexicographic with respect to some other things (see above).

That's a reasonable value judgement, but it's not what the Linear Utility Hypothesis would predict.

My point is that for me, extinction is not equivalent to losing the current number of lives now.

Human extinction now for me is worse than losing 10 trillion people, if the global population was 100 trillion.

This is because extinction destroys all potential future utility. It destroys the potential of humanity.

I'm saying that extinction can't be evaluated normally, so you need a better example to state your argument against LUH.

Extinction now is worse than losing X people, if the global human population is 10X, regardless of how large X is.

That position above is **independent** of the linear utility hypothesis.

It was specified that the total future population in each scenario was 10^100 and 10^102. These numbers are the future people that couldn't exist if humanity goes extinct.

I think the most natural argument for the Linear Utility Hypothesis is implicit in your third sentence: the negation of the LUH is that the state of the rest of the universe *does* affect the marginal utility of someone having certain experiences. This seems to imply that you value the experiences not for their own sake, but only for their relation to the rest of humanity. Now your "small-scale counterexample" seems to me pointing out that people value experiences not *only* for their own sake but also for their relations with other experiences. But this does not contradict that people also do value the experiences for their own sake.

So maybe we need LUH+: the utility has a linear term plus some term that depends on relations.

I don't know what to make of Pascal's Mugger. I don't think we have very good intuitions about it. Maybe we should pay the mugger (not that this is an offer).

Here's a weakened version of LUH+ that might be more plausible: maybe you value experiences for their own sake, but once an experience is instantiated somewhere in the universe you don't care about instantiating it again. There are enough possible experiences that this variation doesn't do much to change "astronomical waste" type arguments, but it does at least bring Pascal's mugger down to a somewhat reasonable level (nowhere near 3^^^3).

I don't see the statement of the negation of LUH as a compelling argument for LUH.

LUH+ as you described it is somewhat underspecified, but it's an idea that might be worth looking into. That said, I think it's still not viable in its current form because of the other counterexamples.

maybe you value experiences for their own sake, but once an experience is instantiated somewhere in the universe you don't care about instantiating it again.

I don't think the marginal value of an experience occurring should be 0 if it also occurs elsewhere. That assumption (together with the many-worlds interpretation) would suggest that quantum suicide should be perfectly fine, but even people who are extremely confident in the many-worlds interpretation typically feel that quantum suicide is about as bad an idea as regular suicide.

Furthermore, I don't think it makes sense to classify pairs of experiences as "the same" or "not the same". There may be two conscious experiences that are so close to identical that they may as well be identical as far as your preferences over how much each conscious experience occurs are concerned. But there may be a long chain of conscious experiences, each differing imperceptibly from the previous, such that the experiences on each end of the chain are radically different. I don't think it makes sense to have a discontinuous jump in the marginal utility of some conscious experience occurring as it crosses some artificial boundary between those that are "the same" and those that are "different" from some other conscious experience. I do think it makes sense to discount the marginal value of conscious experiences occurring based on similarity to other conscious experiences that are instantiated elsewhere.

I think usually when the question is whether one thing affects another, the burden of proof is on the person who says it does affect the other. Anyway, the real point is that people do claim to value experiences for their own sake as well as for their relations to other experiences, and translated into utility terms this seems to imply LUH+.

I'm not sure what you mean by "the other counterexamples": I think your first example is just demonstrating that people's System 1s mostly replace the numbers 10^100 and 10^102 by 100 and 102 respectively, not indicating any deep fact about morality. (To be clear, I mean that people's sense of how "different" the 10^100 and 10^102 outcomes are seems to be based on the closeness of 100 and 102, not that people don't realize that one is 100 times bigger.)

LUH+ is certainly underspecified as a utility function, but as a hypothesis about a utility function it seems to be reasonably well specified?

Introducing MWI seems to make the whole discussion a lot more complicated: to get a bounded utility function you need the marginal utility of an experience to decrease with respect to how many similar experiences there are, but MWI says there are always a lot of similar experiences, so marginal utility is always small? And if you are dealing with quantum randomness then LUH just says you want to maximize the total quantum measure of experiences, which is a monotonicity hypothesis rather than a linearity hypothesis, so it's harder to see how you can deny it. Personally I don't think that MWI is relevant to decision theory: I think I only care about what happens in this quantum branch.

I agree with you about the problems that arise if you want a utility function to depend on whether experiences are "the same".

I think the counterexamples I gave are clear proof that LUH is false, but as long as we're playing burden of proof tennis, I disagree that the burden of proof should lie on those who think LUH is false. Human value is complicated, and shouldn't be assumed by default to satisfy whatever nice properties you think up. A random function from states of the universe to real numbers will not satisfy LUH, and while human preferences are far from random, if you claim that they satisfy any particular strong structural constraint, it's on you to explain why. I also disagree that "valuing experiences for their own sake" implies LUH; that somewhat vague expression still sounds compatible with the marginal value of experiences decreasing to zero as the number of them that have already occurred increases.

Preferences and bias are deeply intertwined in humans, and there's no objective way to determine whether an expressed preference is due primarily to one or the other. That said, at some point, if an expression of preference is sufficiently strongly held, and the arguments that it is irrational are sufficiently weak, it gets hard to deny that preference has anything to do with it, even if similar thought patterns can be shown to be bias. This is where I'm at with 10^100 lives versus a gamble on 10^102. I'm aware scope insensitivity is a bias that can be shown to be irrational in most contexts, and in those contexts, I tend to be receptive to scope sensitivity arguments, but I just cannot accept that I "really should" take a gamble that has a 90% chance of destroying everything, just to try to increase population 100-fold after already winning pretty decisively. Do you see this differently? In any case, that example was intended as an analog of Pascal's mugging with more normal probabilities and ratios between the stakes, and virtually no one actually thinks it's rational to pay Pascal's mugger.

And if you are dealing with quantum randomness then LUH just says you want to maximize the total quantum measure of experiences, which is a monotonicity hypothesis rather than a linearity hypothesis, so it's harder to see how you can deny it.

What are you trying to say here? If you only pay attention to ordinal preferences among sure outcomes, then the restriction of LUH to this context is a monotonicity hypothesis, which is a lot more plausible. But you shouldn't restrict your attention only to sure outcomes like that, since pretty much every choice you make involves uncertainty. Perhaps you are under the misconception that epistemic uncertainty and the Born rule are the same thing?

this quantum branch

That doesn't actually mean anything. You are instantiated in many Everett branches, and those branches will each in turn split into many further branches.

Okay. To be fair I have two conflicting intuitions about this. One is that if you upload someone and manage to give him a perfect experience, then filling the universe with computronium in order to run that program again and again with the same input isn't particularly valuable; in fact I want to say it's not any more valuable than just having the experience occur once.

The other intuition is that in "normal" situations, people are just different from each other. And if I want to evaluate how good someone's experience is, it seems condescending to say that it's less important because someone else already had a similar experience. How similar can such an experience be, in the real world? I mean they are different people.

There's also the fact that the person having the experience likely doesn't care about whether it's happened before. A pig on a factory farm doesn't care that its experience is basically the same as the experience of any other pig on the farm, it just wants to stop suffering. On the other hand it seems like this argument could apply to the upload as well, and I'm not sure how to resolve that.

Regarding 10^100 vs 10^102, I recognize that there are a lot of ways in which having such an advanced civilization could be counted as "winning". For example there's a good chance we've solved Hilbert's sixth problem by then, which in my book is a pretty nice achievement. And of course you can only do it once. But does it, or any similar metric that depends on the boolean existence of civilization, really compare to the 10^100 lives that are at stake here? It seems like the answer is no, so tentatively I would take the gamble, though I could imagine being convinced out of it. Of course, I'm aware I'm probably in a minority here.

It seems like people's response to Pascal's Mugging is partially dependent on framing; e.g. the "astronomical waste" argument seems to get taken at least slightly more seriously. I think there is also a non-utilitarian flinching at the idea of a decision process that could be taken advantage of so easily -- I think I agree with the flinching, but recognize that it's not a utilitarian instinct.

I do realize that epistemic uncertainty isn't the same as the Born rule; that's why I wrote "if you are dealing with quantum randomness" -- i.e. if you are dealing with a situation where all of your epistemic uncertainty is caused by quantum randomness (or in MWI language, where all of your epistemic uncertainty is indexical uncertainty). Anyway, it sounds like you agree with me that under this hypothesis MWI seems to imply LUH, but you think that the hypothesis isn't satisfied very often. Nevertheless, it's interesting that whether randomness is quantum or not seems to be having consequences for our decision theory. Does it mean that we want more of our uncertainty to be quantum, or less? Anyway, it's hard for me to take these questions too seriously since I don't think MWI has decision-theoretic consequences, but I thought I would at least raise them.

Almost every ordinary use of language presumes that it makes sense to talk about the Everett branch that one is currently in, and about the branch that one will be in in the future. Of course these are not perfectly well-defined concepts, but since when have we restricted our language to well-defined concepts?

Okay. To be fair I have two conflicting intuitions about this. One is that if you upload someone and manage to give him a perfect experience, then filling the universe with computronium in order to run that program again and again with the same input isn't particularly valuable; in fact I want to say it's not any more valuable than just having the experience occur once.

The other intuition is that in "normal" situations, people are just different from each other. And if I want to evaluate how good someone's experience is, it seems condescending to say that it's less important because someone else already had a similar experience. How similar can such an experience be, in the real world? I mean they are different people.

My intuition here is that the more similar the experiences are to each other, the faster their marginal utility diminishes.

There's also the fact that the person having the experience likely doesn't care about whether it's happened before.

That's not clear. As I was saying, an agent having an experience has no way of referring to one instantiation of itself separately from other identical instantiations of the agent having the same experience, so presumably the agent cares about all instantiations of itself in the same way, and I don't see why that way must be linear.

Regarding 10^100 vs 10^102, I recognize that there are a lot of ways in which having such an advanced civilization could be counted as "winning". For example there's a good chance we've solved Hilbert's sixth problem by then, which in my book is a pretty nice achievement. And of course you can only do it once. But does it, or any similar metric that depends on the boolean existence of civilization, really compare to the 10^100 lives that are at stake here? It seems like the answer is no, so tentatively I would take the gamble, though I could imagine being convinced out of it.

To be clear, by "winning", I was referring to the 10^100 flourishing humans being brought into existence, not glorious intellectual achievements that would be made by this civilization. Those are also nice, but I agree that they are insignificant in comparison.

Anyway, it sounds like you agree with me that under this hypothesis MWI seems to imply LUH, but you think that the hypothesis isn't satisfied very often.

I didn't totally agree; I said it was more plausible. Someone could care how their copies are distributed among Everett branches.

Nevertheless, it's interesting that whether randomness is quantum or not seems to be having consequences for our decision theory. Does it mean that we want more of our uncertainty to be quantum, or less?

Yes, I am interested in this. I think the answer to your latter question probably depends on what the uncertainty is about, but I'll have to think about how it depends on that.

Hmm. I'm not sure that reference works the way you say it does. If an upload points at itself and the experience of pointing is copied, it seems fair to say that you have a bunch of individuals pointing at themselves, not all pointing at each other. Not sure why other forms of reference should be any different. Though if it does work the way you say, maybe it would explain why uploads seem to be different from pigs... unless you think that the pigs can't refer to themselves except as a group either.

I disagree with your conclusion (as I explained here), but upvoting anyway for the content of the post.

Also, it is possible to have a linear utility function and still reject Pascal's mugger if:

- Your linear utility function is bounded.
- You apply my rule.

Thus, LUH is independent of rejecting Pascal's mugger?

This result has even stronger consequences than that the Linear Utility Hypothesis is false, namely that utility is bounded.

Not necessarily. You don't need to suppose bounded utility to explain rejecting Pascal's mugger.

My reason for rejecting Pascal's mugger is this.

If there exists a set of states E_j such that:

- P(E_j) < epsilon.
- There does not exist E_k (E_k is not a subset of E_j and P(E_k) < epsilon).

Then I ignore E_j in decision making in singleton decision problems. In iterated decision problems, the value for epsilon depends on the number of iterations.

I don't have a name for this principle (and it is an ad-hoc patch I added to my decision theory to prevent EU from being dominated by tiny probabilities of vast utilities).

This patch is different from bounded utility, because you might ignore a set of states in a singleton problem, but consider the same set in an iterated problem.
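
One possible reading of this rule, as a deliberately simplified sketch. It assumes that "ignoring" a low-probability outcome means leaving it out of the expected-utility sum without renormalizing, it does not model the second condition on E_k, and it treats epsilon as a plain parameter because the comment does not say how epsilon should scale with the number of iterations. All names and numbers here are hypothetical.

```python
from fractions import Fraction

def plain_expected_utility(outcomes):
    """Ordinary expected utility over (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)

def truncated_expected_utility(outcomes, epsilon):
    """Expected utility that skips any outcome whose probability is
    below epsilon (assumption: no renormalization of the rest)."""
    return sum(p * u for p, u in outcomes if p >= epsilon)

# Hypothetical mugging: paying costs 5 utils; with a tiny probability
# the mugger is honest and paying turns out to be worth a vast amount.
tiny = Fraction(1, 10**30)
pay = [(1 - tiny, Fraction(-5)), (tiny, Fraction(10) ** 100)]
refuse = [(Fraction(1), Fraction(0))]
```

With these numbers, plain expected utility favors paying, while the truncated version with epsilon = 10^-9 favors refusing. The utility function itself is unchanged and unbounded; only the treatment of tiny probabilities differs.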

[Roughly the second half of this is a reply to: Pascal's Muggle]

There's an assumption that people often make when thinking about decision theory, which is that utility should be linear with respect to amount of stuff going on. To be clear, I don't mean linear with respect to amount of money/cookies/etc that you own; most people know better than that. The assumption I'm talking about is that the state of the rest of the universe (or multiverse) does not affect the marginal utility of there also being someone having certain experiences at some location in the uni-/multi-verse. For instance, if 1 util is the difference in utility between nothing existing, and there being a planet that has some humans and other animals living on it for a while before going extinct, then the difference in utility between nothing existing and there being n copies of that planet should be n utils. I'll call this the Linear Utility Hypothesis. It seems to me that, despite its popularity, the Linear Utility Hypothesis is poorly motivated, and a very poor fit to actual human preferences.

The Linear Utility Hypothesis gets implicitly assumed a lot in discussions of Pascal's mugging. For instance, in Pascal's Muggle, Eliezer Yudkowsky says he “[doesn't] see any way around” the conclusion that he must be assigning a probability at most on the order of 1/3↑↑↑3 to the proposition that Pascal's mugger is telling the truth, given that the mugger claims to be influencing 3↑↑↑3 lives and that he would refuse the mugger's demands. This implies that he doesn't see any way that influencing 3↑↑↑3 lives could not have on the order of 3↑↑↑3 times as much utility as influencing one life, which sounds like an invocation of the Linear Utility Hypothesis.

One argument for something kind of like the Linear Utility Hypothesis is that there may be a vast multiverse that you can influence only a small part of, and unless your utility function is weirdly nondifferentiable and you have very precise information about the state of the rest of the multiverse (or if your utility function depends primarily on things you personally control), then your utility function should be locally very close to linear. That is, if your utility function is a smooth function of how many people are experiencing what conditions, then the utility from influencing 1 life should be 1/n times the utility of having the same influence on n lives, because n is inevitably going to be small enough that a linear approximation to your utility function will be reasonably accurate, and even if your utility function isn't smooth, you don't know what the rest of the universe looks like, so you can't predict how the small changes you can make will interact with discontinuities in your utility function. This is a scaled-up version of a common argument that you should be willing to pay 10 times as much to save 20,000 birds as you would be willing to pay to save 2,000 birds. I am sympathetic to this argument, though not convinced of the premise that you can only influence a tiny portion of what is actually valuable to you. More importantly, this argument does not even attempt to establish that utility is globally linear, and counterintuitive consequences of the Linear Utility Hypothesis, such as Pascal's mugging, often involve situations that seem especially likely to violate the assumption that all choices you make have tiny consequences.

I have never seen anyone provide a defense of the Linear Utility Hypothesis itself (actually, I think I've been pointed to the VNM theorem for this, but I don't count that because it's a non sequitur; the VNM theorem is just a reason to use a utility function in the first place, and does not place any constraints on what that utility function might look like), so I don't know of any arguments for it available for me to refute, and I'll just go ahead and argue that it can't be right because actual human preferences violate it too dramatically. For instance, suppose you're given a choice between the following two options: 1: Humanity grows into a vast civilization of 10^100 people living long and happy lives, or 2: a 10% chance that humanity grows into a vast civilization of 10^102 people living long and happy lives, and a 90% chance of going extinct right now. I think almost everyone would pick option 1, and would think it crazy to take a reckless gamble like option 2. But the Linear Utility Hypothesis says that option 2 is much better. Most of the ways people respond to Pascal's mugger don't apply to this situation, since the probabilities and ratios of utilities involved here are not at all extreme.
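
To make the comparison concrete, here is a minimal sketch of the two options under a linear utility function and under one arbitrary bounded alternative. The bounded function n / (n + 10^100) is purely an assumption chosen for illustration, not a proposal for what human utility looks like; extinction is scored as a population of 0, as the Linear Utility Hypothesis implies.

```python
from fractions import Fraction

def linear_utility(population):
    """Utility proportional to population, per the Linear Utility Hypothesis."""
    return Fraction(population)

def bounded_utility(population, scale=10**100):
    """A hypothetical bounded utility: rises from 0 toward 1."""
    return Fraction(population, population + scale)

def expected_utility(lottery, utility):
    """lottery is a list of (probability, population) pairs."""
    return sum(p * utility(n) for p, n in lottery)

option_1 = [(Fraction(1), 10**100)]
option_2 = [(Fraction(1, 10), 10**102), (Fraction(9, 10), 0)]
```

Under `linear_utility`, option 2 comes out exactly ten times better than option 1. Under this particular `bounded_utility`, option 1 scores 1/2 and option 2 scores 10/101, so the ranking reverses.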

There are smaller-scale counterexamples to the Linear Utility Hypothesis as well. Suppose you're offered the choice between: 1: continue to live a normal life, which lasts for n more years, or 2: live the next year of a normal life, but then instead of living a normal life after that, have all your memories from the past year removed, and experience that year again n more times (your memories getting reset each time). I expect pretty much everyone to take option 1, even if they expect the next year of their life to be better than the average of all future years of their life. If utility is just a naive sum of local utility, then there must be some year which has at least as much utility in it as the average year, and just repeating that year every year would thus increase total utility. But humans care about the relationship that their experiences have with each other at different times, as well as what those experiences are.

Here's another thought experiment that seems like a reasonable empirical test of the Linear Utility Hypothesis: take some event that is familiar enough that we understand its expected utility reasonably well (for instance, the amount of money in your pocket changing by $5), and some ludicrously unlikely event (for instance, the event in which some random person is actually telling the truth when they claim, without evidence, to have magic powers allowing them to control the fates of arbitrarily large universes, and saying, without giving a reason, that the way they use this power is dependent on some seemingly unrelated action you can take), and see if you become willing to sacrifice the well-understood amount of utility in exchange for the tiny chance of a large impact when the large impact becomes big enough that the tiny chance of it would be more important if the Linear Utility Hypothesis were true. This thought experiment should sound very familiar. The result of this experiment is that basically everyone agrees that they shouldn't pay the mugger, not only at much higher stakes than the Linear Utility Hypothesis predicts should be sufficient, but even at arbitrarily large stakes. This result has even stronger consequences than that the Linear Utility Hypothesis is false, namely that utility is bounded. People have come up with all sorts of absurd explanations for why they wouldn't pay Pascal's mugger even though the Linear Utility Hypothesis is true about their preferences (I will address the least absurd of these explanations in a bit), but there is no better test for whether an agent's utility function is bounded than how it responds to Pascal's mugger. 
If you take the claim “My utility function is unbounded”, and taboo “utility function” and "unbounded", it becomes “Given outcomes A and B such that I prefer A over B, for any probability p>0, there is an outcome C such that I would take B rather than A if it lets me control whether C happens instead with probability p.” If you claim that one of these claims is true and the other is false, then you're just contradicting yourself, because that's what “utility function” means. That can be roughly translated into English as “I would do the equivalent of paying the mugger in Pascal's mugging-like situations”. So in Pascal's mugging-like situations, agents with unbounded utility functions don't look for clever reasons not to do the equivalent of paying the mugger; they just pay up. The fact that this behavior is so counterintuitive is an indication that agents with unbounded utility functions are so alien that you have no idea how to empathize with them.
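
The tabooed definition can be put in numbers. The sketch below assumes one particular reading: choosing B together with a probability-p chance of C is modeled as the lottery "C with probability p, otherwise B", which beats A exactly when p*u(C) + (1-p)*u(B) > u(A). The function names and the optional `supremum` parameter are hypothetical.

```python
from fractions import Fraction

def required_utility(u_a, u_b, p):
    """Utility that C must exceed so that a p chance of C (otherwise B)
    beats A for sure: solves p*u(C) + (1-p)*u(B) > u(A) for u(C)."""
    return u_b + (u_a - u_b) / p

def some_outcome_suffices(u_a, u_b, p, supremum=None):
    """With no upper bound on utility (supremum=None) a good enough C
    always exists; with a bound, only if the requirement lies below it."""
    return supremum is None or required_utility(u_a, u_b, p) < supremum
```

With u(A) = 1, u(B) = 0 and p = 1/10, C needs utility above 10. An agent whose utilities never exceed 5 has no such C, while an unbounded agent has one for every p > 0, however small.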

The “least absurd explanation” I referred to for why an agent satisfying the Linear Utility Hypothesis would reject Pascal's mugger, is, of course, the leverage penalty that Eliezer discusses in Pascal's Muggle. The argument is that any hypothesis in which there are n people, one of whom has a unique opportunity to affect all the others, must imply that a randomly selected one of those n people has only a 1/n chance of being the one who has influence. So if a hypothesis implies that you have a unique opportunity to affect n people's lives, then this fact is evidence against this hypothesis by a factor of 1:n. In particular, if Pascal's mugger tells you that you are in a unique position to affect 3↑↑↑3 lives, the fact that you are the one in this position is 1 : 3↑↑↑3 evidence against the hypothesis that Pascal's mugger is telling the truth. I have two criticisms of the leverage penalty: first, that it is not the actual reason that people reject Pascal's mugger, and second, that it is not a correct reason for an ideal rational agent to reject Pascal's mugger.
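
A small sketch of how the leverage penalty cancels the stakes. It treats the penalty as a 1:n likelihood ratio applied to some prior odds, and uses ordinary large integers as stand-ins for 3↑↑↑3, which is far too large to represent. The function names and the prior odds of 1:1000 are assumptions for illustration.

```python
from fractions import Fraction

def leverage_penalized_probability(prior_odds, n):
    """Probability after applying 1:n evidence against a hypothesis
    that starts at the given prior odds."""
    odds = Fraction(prior_odds) / n
    return odds / (1 + odds)

def expected_lives_at_stake(prior_odds, n):
    """Expected number of lives affected: n times the penalized probability."""
    return n * leverage_penalized_probability(prior_odds, n)
```

However large n gets, the expected number of lives at stake stays just below the prior odds, so under this penalty the size of the mugger's claim never raises the expected payoff.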

The leverage penalty can't be the actual reason people reject Pascal's mugger because people don't actually assign probability as low as 1/3↑↑↑3 to the proposition that Pascal's mugger is telling the truth. This can be demonstrated with thought experiments. Consider what happens when someone encounters overwhelming evidence that Pascal's mugger actually is telling the truth. The probability of the evidence being faked can't possibly be less than 1 in 10^10^26 or so (this upper bound was suggested by Eliezer in Pascal's Muggle), so an agent with a leverage prior will still be absolutely convinced that Pascal's mugger is lying. Eliezer suggests two reasons that an agent might pay Pascal's mugger anyway, given a sufficient amount of evidence: first, that once you update to a probability of something like 10^100 / 3↑↑↑3, and multiply by the stakes of 3↑↑↑3 lives, you get an expected utility of something like 10^100 lives, which is worth a lot more than $5, and second, that the agent might just give up on the idea of a leverage penalty and admit that there is a non-infinitesimal chance that Pascal's mugger may actually be telling the truth. Eliezer concludes, and I agree, that the first of these explanations is not a good one. I can actually demonstrate this with a thought experiment. Suppose that after showing you overwhelming evidence that they're telling the truth, Pascal's mugger says “Oh, and by the way, if I was telling the truth about the 3↑↑↑3 lives in your hands, then X is also true,” where X is some (a priori fairly unlikely) proposition that you later have the opportunity to bet on with a third party. 
Now, I'm sure you'd be appropriately cautious in light of the fact that you would be very confused about what's going on, so you wouldn't bet recklessly, but you probably would consider yourself to have some special information about X, and if offered good enough odds, you might see a good opportunity for profit with an acceptable risk, which would not have looked appealing before being told X by Pascal's mugger. If you were really as confident that Pascal's mugger was lying as the leverage prior would imply, then you wouldn't assume X was any more likely than you thought before for any purposes not involving astronomical stakes, since your reason for believing X is predicated on you having control over astronomical stakes, which is astronomically unlikely.

So after seeing the overwhelming evidence, you shouldn't have a leverage prior. And despite Eliezer's protests to the contrary, this does straightforwardly imply that you never had a leverage prior in the first place. Eliezer's excuse for using a leverage prior before but not after seeing observations that a leverage prior predicts are extremely unlikely is computational limitations. He compares this to the situation in which there is a theorem X that you aren't yet aware you can prove, and a lemma Y that you can see is true and you can see implies X. If you're asked how likely X is to be true, you might say something like 50%, since you haven't thought of Y, and then when asked how likely X&Y is to be true, you see why X is probably true, and say something like 90%. This is not at all analogous to a “superupdate” in which you change priors because of unlikely observations, because in the case of assigning probabilities to mathematical claims, you only need to think about Y, whereas Eliezer is trying to claim that a superupdate can only happen when you actually observe that evidence, and just thinking hypothetically about such evidence isn't enough. 
A better analogy to the situation with the theorem and lemma would be when you initially say that there's a 1 in 3↑↑↑3 chance that Pascal's mugger was telling the truth, and then someone asks what you would think if Pascal's mugger tore a hole in the sky, showing another copy of the mugger next to a button, and repeating the claim that pushing the button would influence 3↑↑↑3 lives, and then you think “oh in that case I'd think it's possible the mugger's telling the truth; I'd still be pretty skeptical, so maybe I'd think there was about a 1 in 1000 chance that the mugger is telling the truth, and come to think of it, I guess the chance of me observing that evidence is around 10^-12, so I'm updating right now to a 10^-15 chance that the mugger is telling the truth.” Incidentally, if that did happen, then this agent would be very poorly calibrated, since if you assign a probability of 1 in 3↑↑↑3 to a proposition, you should assign a probability of at most 10^15 / 3↑↑↑3 to ever justifiably updating that probability to 10^-15. If you want a well-calibrated probability for an absurdly unlikely event, you should already be thinking about less unlikely ways that your model of the world could be wrong, instead of waiting for strong evidence that your model of the world actually is wrong, and plugging your ears and shouting “LA LA LA I CAN'T HEAR YOU!!!” when someone describes a thought experiment that suggests that the overwhelmingly most likely way the event could occur is for your model to be incorrect. But Eliezer perplexingly suggests ignoring the results of these thought experiments unless they actually occur in real life, and doesn't give a reason for this other than “computational limitations”, but, uh, if you've thought of a thought experiment and reasoned through its implications, then your computational limitations apparently aren't strict enough to prevent you from doing that. 
Eliezer suggests that the fact that probabilities must sum to 1 might force you to assign near-infinitesimal probabilities to certain easy-to-state propositions, but this is clearly false. Complexity priors sum to 1. Those aren't computable, but as long as we're talking about computational limitations, by Eliezer's own estimate, there are far fewer than 10^10^26 mutually disjoint hypotheses a human is physically capable of even considering, so the fact that probabilities sum to 1 cannot force you to assign a probability less than 1 in 10^10^26 to any of them (and you probably shouldn't; I suggest a “strong Cromwell's rule” that empirical hypotheses shouldn't be given probabilities less than 10^-10^26 or so). And for the sorts of hypotheses that are easy enough to describe that we actually do so in thought experiments, we're not going to get upper bounds anywhere near that tiny.

And if you do assign a probability of 1/3↑↑↑3 to some proposition, what is the empirical content of this claim? One possible answer is that this means that the odds at which you would be indifferent to betting on the proposition are 1 : 3↑↑↑3, if the bet is settled with some currency that your utility function is close to linear with respect to across such scales. But the existence of such a currency is under dispute, and the empirical content to the claim that such a currency exists is that you would make certain bets with it involving arbitrarily extreme odds, so this is a very circular way to empirically ground the claim that you assign a probability of 1/3↑↑↑3 to some proposition. So a good empirical grounding for this claim is going to have to be in terms of preferences between more familiar outcomes. And in terms of payoffs at familiar scales, I don't see anything else that the claim that you assign a probability of 1/3↑↑↑3 to a proposition could mean other than that you expect to continue to act as if the probability of the proposition is 0, even conditional on any observations that don't give you a likelihood ratio on the order of 1/3↑↑↑3. If you claim that you would superupdate long before then, it's not clear to me what you could mean when you say that your current probability for the proposition is 1/3↑↑↑3.

There's another way to see that bounded utility functions, not leverage priors, are Eliezer's (and also pretty much everyone's) true rejection to paying Pascal's mugger, and that is the following quote from Pascal's Muggle: “I still feel a bit nervous about the idea that Pascal's Muggee, after the sky splits open, is handing over five dollars while claiming to assign probability on the order of 10^9/3↑↑↑3 that it's doing any good.” This is an admission that Eliezer's utility function is bounded (even though Eliezer does not admit that he is admitting this) because the rational agents whose utility functions are bounded are exactly (and tautologically) characterized by those for which there exists a probability p>0 such that the agent would not spend [fixed amount of utility] for probability p of doing any good, no matter what the good is. An agent satisfying the Linear Utility Hypothesis would spend $5 for a 10^9/3↑↑↑3 chance of saving 3↑↑↑3 lives. Admitting that it would do the wrong thing if it was in that situation, but claiming that that's okay because you have an elaborate argument that the agent can't be in that situation even though it can be in situations in which the probability is lower and can also be in situations in which the probability is higher, strikes me as an exceptionally flimsy argument that the Linear Utility Hypothesis is compatible with human values.
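To make the contrast concrete, here is a minimal sketch of the two kinds of agent facing the post-sky-splitting offer. Everything named here is a hypothetical illustration: `N` stands in for 3↑↑↑3, utility is measured in "lives", the $5 is arbitrarily priced at a thousandth of a life, and the bound of 10^12 is an arbitrary choice.

```python
from fractions import Fraction


def linear_agent_pays(p, lives_at_stake, cost):
    """Linear utility: expected value is p times the number of lives."""
    return Fraction(p) * lives_at_stake > cost


def bounded_agent_pays(p, utility_of_good, cost, utility_bound):
    """Bounded utility: no good is worth more than utility_bound."""
    return Fraction(p) * min(utility_of_good, utility_bound) > cost


N = 10**100                # hypothetical stand-in for 3↑↑↑3
p = Fraction(10**9, N)     # probability after the sky splits open
cost = Fraction(1, 1000)   # assumed utility cost of $5, in lives
bound = 10**12             # assumed upper bound on utility

# The linear agent's expected gain is 10^9 lives however large N is, so it pays.
print(linear_agent_pays(p, N, cost))

# The bounded agent's expected gain is at most p * bound, so it declines.
print(bounded_agent_pays(p, N, cost, bound))

# Below this probability the bounded agent declines whatever good is on offer.
p_threshold = cost / bound
print(bounded_agent_pays(p_threshold, 10**1000, cost, bound))
```

In the linear case `N` cancels out of the expected value, which is why no choice of stakes can make that agent decline at this probability. The bounded agent has a fixed threshold `cost / bound` below which it never pays.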

I also promised a reason that the leverage penalty argument is not a correct reason for rational agents (regardless of computational constraints) satisfying the Linear Utility Hypothesis to not pay Pascal's mugger. This is that in weird situations like this, you should be using updateless decision theory: figuring out which policy has the best a priori expected utility and implementing that policy, instead of trying to make sense of weird anthropic arguments before updatefully coming up with a strategy. Now consider the following hypothesis: “There are 3↑↑↑3 copies of you, and a Matrix Lord will approach one of them while disguised as an ordinary human, inform that copy about his powers and intentions without offering any solid evidence to support his claims, and then kill the rest of the copies iff this copy declines to pay him $5. None of the other copies will experience or hallucinate anything like this.” Of course, this hypothesis is extremely unlikely, but there is no assumption that some randomly selected copy coincidentally happens to be the one that the Matrix Lord approaches, and thus no way for a leverage penalty to force the probability of the hypothesis below 1/3↑↑↑3. This hypothesis and the Linear Utility Hypothesis suggest that having a policy of paying Pascal's mugger would have consequences 3↑↑↑3 times as important as not dying, which is worth well over $5 in expectation, since the probability of the hypothesis couldn't be as low as 1/3↑↑↑3. The fact that actually being approached by Pascal's mugger can be seen as overwhelming evidence against this hypothesis does nothing to change that.

Edit: I have written a follow-up to this.