# 18

This is Pascal's Mugging: Someone comes to you and says, "Give me five dollars, and I'll use my powers from outside the matrix to grant you 4^^^^4 years of fun." And they're lying, of course, but under a Solomonoff prior, the probability that they're not, though surely very small, isn't going to be less than one in 3^^^3; and so if you shut up and multiply, it's clear that the expected utility of paying up outweighs the expected utility of anything sensible you might be doing with those five dollars, and therefore—

Well, fortunately, if you're afraid that your utility-maximizing AI will end up paying all its money to the first clever mugger to come along and ask: never to worry! It will do so only if it can't think of anything better to do with five dollars, after all. So to avoid being mugged, all it has to do is to think of a harebrained scheme for spending \$5 that has more than a one-in-4^^^4 chance of providing 5^^^^5 years of fun. Problem solved.

If, however, you would like to be there be a chance greater than one-in-hell that your AI ends up doing something actually useful, you'll need to do something else. And the simplest answer is to adopt a bounded utility function: any positive singularity gives at least 50 utils, a billion years gives 80 utils, a googol years gives 99 utils, a googolplex years gives 99.9 utils, and 4^^^^4 years of fun give 100 utils (minus epsilon).

This will, indeed, solve the problem. Probability of getting mugged: used to be one (minus epsilon, of course); has now been brought down to zero. That's right: zero.

(Plus epsilon.)

But let's suppose that the impossible happens, and the universe turns out to be able to support TREE(100) years of fun, and we've already lived out 4^^^^4 of them, and the AI has long since folded up operations and faded out of existence because humanity has become sufficiently sane that we no longer need it—

And lo, someone comes to you and says, "Alas, you're not really experiencing 4^^^^4 years of fun here; you're really a mere billion-year-old living in a very convincing simulation. Give me five dollars, and I'll use my powers from outside the matrix to extend your lifespan to a googol years."

And they're lying, of course — but it has been a long time indeed since you last faced a choice that could make a difference of nineteen whole utils...

*

If you truly have a bounded utility function, you must agree that in this situation, paying up is exactly what you'd want to do. Even though it means that you will not experience 4^^^^4 years of fun, even conditional on the universe being capable of supporting TREE(100) of them.

[ETA: To clarify, by "4^^^^4", I really mean any number so large that your utility function assigns (100 - epsilon) utils to it. It's possible to have a utility function where this is only true for infinite numbers which are so incredibly infinite that, given a particular formal language, their definition is so long and complicated that no mere human-sized mind could comprehend it. See this comment thread for discussion of bounded utility functions that assign significant weight to very large lifetimes.]

# 18

Pingbacks
New Comment

under a Solomonoff prior, the probability that they're not, though surely very small, isn't going to be less than one in 3^^^3

Can you prove that? If not, is there some fairly compelling reason to believe that it is true anyway?

but it has been a long time indeed since you last faced a choice that could make a difference of nineteen whole utils...

You could have your utility function only count future events, so that you always have 100 utils at stake no matter how long you have lived, instead of having 100 utils at stake if you were just born but only 1 if you have lived for a googol years. I guess this does seem like an ugly hack, though.

If not, is there some fairly compelling reason to believe that it is true anyway?

The information required to describe your body is about an exabyte. Once you have a simulated body, getting answers out is trivial, so we'll call an exabyte an upper limit on what information you could tell someone. 10^18 ish. This means that if you have a utility function, you aren't able to imagine situations complicated enough to have a simplicity prior below 10^-10^18. That is, one part in 1 followed by 10^18 zeroes.

So, in Knuth up arrow notation, how big a reward do I need to promise someone to totally overwhelm the admittedly large resolution offered by our prior probabilities? Let's do an example with tens. 10^10^18 looks a lot like 10^10^10, which is 10^^3. What happens if we go up to 10^^4? Then it's 10^10^10^10, or 10^(10^(10^10)), or 10^(10^^3). That is, it's ten to the number that we were just considering. So just by incrementing this second-exponent, we've raised our answer to an exponent. So offering rewards big enough to totally wash out our prior resolution turns out to be easy.

You could have your utility function only count future events

That runs into problems - like you'd dump toxic waste in your house as long as you only got sick far in the future.

The information required to describe your body is about an exabyte. Once you have a simulated body, getting answers out is trivial, so we'll call an exabyte an upper limit on what information you could tell someone. 10^18 ish. This means that if you have a utility function, you aren't able to imagine situations complicated enough to have a simplicity prior below 10^-10^18. That is, one part in 1 followed by 10^18 zeroes.

Hm, that's an interesting point. On the other hand, "Robin Hanson has suggested penalizing the prior probability of hypotheses which argue that we are in a surprisingly unique position to affect large numbers of other people who cannot symmetrically affect us. Since only one in 3^^^^3 people can be in a unique position to ordain the existence of at least 3^^^^3 other people who can't have a symmetrical effect on this one person, the prior probability would be penalized by a factor on the same order as the utility." (source: LW wiki; I couldn't find where Robin actually said that)

In other words, you can represent the hypothesis with so little information because you can cheat by referring to yourself with a small amount of information, no matter how much information it would take to specify you objectively.

That runs into problems - like you'd dump toxic waste in your house as long as you only got sick far in the future.

Why?

Robin's argument relies on infinite certainty in a particular view of anthropic questions. It penalizes the probability significantly, but doesn't on its own defeat infinity concerns.

If you use EDT, then Robin's argument cashes out as: "if there are 3^^^^3 people, then the effects of my decisions via the typical copies of me are multiplied up by O(3^^^^3), while the effects of my decisions via the lottery winner aren't." So then the effects balance out, and you are down to the same reasoning as if you accepted the anthropic argument. But now you get a similar conclusion even if you assign 1% probability to "I have no idea what's going on re: anthropic reasoning."

Do you think that works?

(Infinity still gets you into trouble with divergent sums, but this seems to work fine if you have a finite but large cap on the value of the universe.)

Coincidentally I just posted on this without having seen the OP.

Yes, but then you're acting on probabilities of ludicrous utilities again, an empirical "stabilizing assumption" in Bostrom's language.

That runs into problems - like you'd dump toxic waste in your house as long as you only got sick far in the future.

Why?

Say that living 50 more years without getting sick was 90 utilons, and the maximum score was 100. This means that there are only 10 utilons with which to describe the quality of your life between 50 years from now and the far future - being healthy 51 years from now is worth only 1/10 as being healthy now. So for each day you can use as you wish this year, you'd be willing to spend 10 days bedridden, or doing boring work, or in jail 50 years from now.

So in a word, procrastination. And because the utility function is actually shifting over time so that it stays 100-points-max, each point in time looks the same - there's no point where they'd stop procrastinating, once they started, unless the rate of work piling up changed.

That's a problem with any sort of discounting, but only counting future events in your utility function does not change that. It doesn't matter whether the next 50 can get you 90 out of 100 available future utils or .09 out of .1 available future utils (where the other 99.9 were determined in the past); your behavior will be the same.

I agree for the typical implementation of discounting - though if someone just had a utility function that got non-exponentially smaller as the numbers on the calendar got bigger, you could see some different behavior.

Hm, you're right. For nonexponential discounting, future!you discounts differently than you want it to if it resets its utility, but not if it doesn't.

An exabyte? Really? 8e18 bits?

(Most values to one sig fig)

Estimate 4e28 total nucleons in the human body (60Kg of nucleons) it takes about 100 bits to describe the number of nucleons. each nucleon takes about two bits of information to specify type (proton, neutron, antiproton, antineutron) Figure that a human is about 6'x2'x1'; that's 1e35x4e34x2e34 plank units. With 8e108 unique locations within that rectangle, each nucleon needs 362 bits to describe location.

Without discussing momentum, it already takes 10^41 bits to describe the location of every nucleon. Any benefit you gain from the distribution of matter not being uniform you lose when describing what the distribution of matter is in a human.

With a discussion of momentum, there is no way to describe an arbitrary nucleon in fixed space, since there is no upper limit to the amount of momentum of a nucleon.

How did you estimate an exabyte for the information content of a human?

How did you estimate an exabyte for the information content of a human?

Googled it. If I had to guess why the number I found was so much smaller, it's probably because they had a scheme more like "for each molecule, describe it and place it with precision much better than its thermal vibration," and maybe a few bits to describe temperature.

But yes, even 10^your number of bits will be smaller than 10^^5. Even if we tracked each possible quantum state that's localized in my body, which would be exponentially more intensive than tracking individual particles, that just means we might have to (but probably not) bump up to a reward of 10^^6 in order to swamp my prior probabilities.

Whatever the information content is, unless you have an argument that it's actually infinite, it's going to be smaller than 3^^^^3.

Describe momentum in fixed space. It is acceptable to have an upper limit on momentum, provided that that limit can be arbitrarily high. It is not acceptable for two different vectors to be described differently.

I'm not sure what the smallest unit of angular measure is, or if it is continuous. If it is continuous, there's no way to describe momentum in finite bits.

You could have your utility function only count future events, so that you always have 100 utils at stake no matter how long you have lived...

If you're capable of self-modification, this wouldn't last long. Your current utility function would be better maximized if future!you would uphold it, instead of his own twisted version, so you modify yourself so that future!you would keep the same utility function.

Not necessarily. Future!you doesn't have much control over events that happen in its past, so you don't have much incentive to ensure that it cares about them. Also, you know that yourself 4^^^^4 years from now will not actually be only a billion years old.

[This comment is no longer endorsed by its author]Reply

Perhaps the mugger claims that there were only a billion fun years.

Good point. My reasoning only applies to situations in which you have information that future!you doesn't, and come to think of it, messing with future!you's utility function seems like an odd way to handle that problem even then.

[-][anonymous]9y 1

I can just write "3^^^3". The interpreter prefix to that to turn it into a concrete universe-possibility is finite and rather small (1Mb to be conservative)

3^^^3*2^(-2^20) is still totally huge.

Edit: but then we're actually interested in relative utility. (which just makes things worse, but for the sake of completeness...) Do the same thing with a couple billion utils and maybe 750kb for a program that simulates a more plausible universe.

(I'm not sure how to actually prove anything in SI, where "prove" would satisfy a mathematician. I'm not even sure it's possible to prove anything interesting)

"I'll take you up on your offer and I would also like to borrow \$5 at 20% APR amortizing over (4^^^^4 - 100) years, if you defer the first 1200 monthly payments." Surely the mugger couldn't refuse such a lucrative offer, and the 4^^^^4 years of fun will clearly be worth the total interest (a dollar a year to live 4^^^^4 years? Sign me up).

In general a reverse-mugging that is conditional on the truth of the mugger's claim seems like the best response.

"Uh... the matrix lords that I am representing don't want us to make that deal. They'll only give you 4^^^^4 years of fun if you give me \$5 right now and I keep it."

As I noted at the beginning of the post, the real problem isn't an actual intelligent mugger, the real problem is that you're able to come up with a harebrained scheme all by yourself, if you consistently maximize expected utility.

I realized that after thinking a little longer. There's nothing from preventing me from believing "There's an Omega who will give me 4^^^^4 years of fun if I decrease my utility by X right now."

One way to approach it is by the likelihood that such a hypothesis could pay rent, in which case the possibility of trade with the Omega is a way to test the truth of the statement. If there's an entity that can dramatically improve my life I would be willing to pay for it with reasonable economic means. If that entity refuses all sane economic means it dramatically lowers the probability that it is a true offer.

Another approach is to consider the opportunity cost of paying this particular entity some of my negative utility when some other entity may offer a better deal. I assume there are an uncountable amount of Omegas who might offer me deals for every possible future utility if I give up some utility now. I only have so much utility that I can reasonably give up right now. Which Omega(s) do I pick to give up my utility to? I certainly shouldn't choose the most expensive Omega that yields the lowest expected return; I should pick the most efficient Omega for my cost in negative utility. What does that Omega look like? Certainly not one that offers 4^^^^4 years of fun for \$5, but one that offers 4^^^^4 years of fun for less than \$5 (or even pays me to experience it). Perhaps I only have \$5, and spending it now would prevent me from ever accepting another deal from a different Omega. Ultimately what this hypothetical choice reduces to is choosing the earliest likely payoff so that I have more money in the future to consider more deals, e.g. skip out on paying Omega \$5 today when I could invest it in the local economy and earn \$6 tomorrow and have even more utility to pay potential Omegas with. I have to assume that any particular instance of Omega is roughly as likely as any other; a mugger appearing and offering me a purported deal from a particular Omega is no proof that the particular Omega exists unless there is additional proof such as being able to take out a loan repayable in the distant future or a verifiable insurance policy against the failure of Omega to deliver on the promise.

EDIT: I should probably try to quantify this. U(deal_N)*P(deal_N_is_true) - U(cost_N) is the expected utility of giving up cost_N utility in exchange for deal_N utility from a potential entity Omega_N. Unless I have some evidence that P(deal_X_is_true) is greater than P(deal_N_is_true) for some X and every other N, I should simply pick the most efficient deal. If P(deal_X_is_true | evidence_X_is_true)*U(deal_X) - U(cost_X) > U(deal_N)*P(deal_N_is_true) - U(cost_N) for some X and all other N, then I should pick X. Is a mugger appearing and offering me a particular deal, X, evidence that X is true? No, unless the mugger (or another entity I can trade with, including myself if I find actual evidence that X is true) also believes the deal sufficiently strongly to sell me insurance. In other words the mugger has to believe the deal will pay rent in order for me to believe it will pay rent.

If your utility function depends on the size of the universe---for example, if you are an "average utilitarian" who averages across all of the possible holes in reality where there might be an observer moment---then you may not run into these problems. When you learn that the universe supports TREE(100) computational steps, you downgrade the value of the life you lived so far to something negligible. But the value of the universe that lies ahead of you is still 1, just the same as ever.

(The instantiation I have in mind is UDASSA; I was prompted to post this by the recent discussion of Pascal's mugging.)

I could totally imagine this having some other wacky behavior though...

True. Of course, average utilitarianism seems whacky to me as such, in particular if physics allows you to influence the size of the universe; it might lead you to choose a small universe if you have a better chance of filling it with people having fun.

Just to confirm... I take it from this that you consider a large universe intrinsically more valuable than a small one, and therefore any value system that leads one to choose a small universe doesn't capture your values. Have I understood you?

Not sure, so let me phrase differently: I consider a universe with a larger number of people all leading interesting, fulfilling, diverse lives as intrinsically more valuable than a universe with a smaller number of such people; e.g., I consider a universe of size googolplex containing 10^15 such people to be more valuable than a universe of size googol containing 10^12 such people. I don't at all feel like an FAI should choose the second option because it has a much larger fraction of (fulfilling lives per slots in the universe where a fulfilling life could be).

OK, thanks.

"If you truly have a bounded utility function"

A bounded utility function that doesn't assign any significant marginal weight, e.g. 0.00000000001, to TREE(100), "largest number I can formulate in a year," or infinite years of fun.

It can only apply significant marginal weight for so long. Just weight until it gets to 100-epsilon utils, and then mug.

Marginal weight at infinite years is interesting. That would likely mean that, after a certain amount of fun, you just put all your resources to trying to get infinite fun.

It can only apply significant marginal weight for so long. Just weight until it gets to 100-epsilon utils, and then mug.

Yes, that was my point (maybe should have been more explicit about it).

Though it's also worth pointing out that with a utility function like Carl is alluding to (where utilities are significantly different if the lifespans are noticeably different to humans), if you've lived for 3^^^3 years and the universe looks capable of supporting 2^(3^^^3) years of fun but not more than that, you will worry about the tiny probability of TREE(100) more than about realizing the 2^(3^^^3) that are actually possible.

Though it's also worth pointing out that with a utility function like Carl is alluding to (where utilities are significantly different if the lifespans are noticeably different to humans),

To be more explicit: my take on this sort of thing is to smear out marginal utility across our conceptual space of such measures:

For years of life I would assign weight to at least (and more than) these regions:

• Human happy lifespan
• Thousands of years of moderately transhuman happiness
• Billions of years of posthuman existence with limited energy budget
• Jupiter-brains
• Atoms-in-the-universe scale
• Exponential growth (to various finite scales)
• Families of 'gigantic finite number' functions found thus far
• Giant finite number functions that could be represented with ludicrous brain-size/resources
• Infinite life-years
• Increase the measure of an infinite lifespan in various ways in an infinite world
• Various complications of infinity (uncountably many life-years, etc :)

I am also tempted to throw in some relative measures:

• Achieving at least fraction X of the life-years I could have obtained (with various numbers and procedures for generating numbers for this) given the laws of physics that in fact occur
• Achieving as many life-years as proportion Y of my generational cohort/Earthlings/whatever
• Achieving life-years in some relationship to the scale of the accessible universe

Simple conceptual space (that can be represented in a terrestrial brain) is limited, and if one cannot 'cover all the bases' one can spread oneself widely enough not to miss opportunities for easy wins when the gains "are...noticeably different to humans." And that seems pretty OK with me.

"Marginal weight at infinite years is interesting. That would likely mean that, after a certain amount of fun, you just put all your resources to trying to get infinite fun."

With these large finite numbers you exhaust all the possible brain states of humans or Jupiter-brains almost at the beginning. Then you have to cycle or scale your sensations and cognition up (which no one has suggested above), and I am not so drastically motivated to be galaxy-sized and blissfully cycling than planet-sized and blissfully cycling. Infinite life-years could be qualitatively different from the ludicrous finite lifespans in not having an end, which is a feature that I can care about.

Carl, thanks for writing this up! I may as well unpack and say that this is pretty much how I have been thinking about the problem, too (though I hadn't considered the idea of relative measures), and I still think I prefer biting the attendant bullets that I can see to the alternatives. But I do at least find it -- well -- worth pointing out that if we in fact achieve one of the higher strata, and we want to be time-consistent, it looks like we're going to stop living our lives on the mainline probability; i.e., if the universe is of size 3^^^3, it seems like we'll spend almost all of the available resources on trying to crack the matrix (even if there is no indication that we live in a matrix) and only an infinitesimal -- combinatorially small -- fraction on actually having fun.

Yes, I do think that this is probably what I will on reflection find to be the right thing, because the combinatorially small fraction pretty much looks like 3^^^3 from my current vantage point and even my middle-distance extrapolations, and as we self-modify to grow larger, since we want to be time-consistent and not regret being time-consistent, we'll design our future selves such that we'll keep feeling that this is the right tradeoff (i.e., this is much better than starting out with a near-certainty of not having fun at all, because our FAI puts all resources into trying to find infinite laws of physics). So perhaps it is simply appropriate (to humanity's utility function) that immense brains spend most of their resources guarding against events of infinitesimal probabilities. But it's sufficiently non-obvious that it at least seems worth keeping in mind.

(Also, amended the post with a note that by "4^^^^4", I really mean "whatever is so large that it is only epsilon away from the upper bound".)

But I do at least find it -- well -- worth pointing out that if we in fact achieve one of the higher strata, and we want to be time-consistent, it looks like we're going to stop living our lives on the mainline probability;

Indeed.

• Families of 'gigantic finite number' functions found thus far

• Giant finite number functions that could be represented with ludicrous brain-size/resources

These strike me as basically the same thing relative to my imagination. The biggest numbers mathematicians can describe using the fast-growing hierarchy for the largest computable ordinals are already too gigantic to... well... they're already too gigantic. Taking the Ackermann function as primitive, I still can't visualize the Goodstein sequence of 16, never mind 17, and I think that's somewhere around w^(w^2) in the fast-growing hierarchy.

The jump to uncomputable numbers / numbers that are unique models of second-order axioms would still be a large further jump, though.

After having read this over twice and read all the comments, I still find myself confused. Isn't the bounded mugging simply predicated on a really strange definition of a utility function? The "simplest" bounded utility function I can think of is time-invariant, in the sense that it should output the same utilities, or the same preference ordering over your choices, whether you're 4^^^^4 years old or merely a billion.

The idea that as you live longer you "accumulate" utility seems to run counter to the basic definition of a utility function (e.g. nyan sandwich's recent post). This idea can be salvaged by saying that your utility function has a term in it for how long you've been alive, and having the bound decrease in proportion to that, but I'm just really not sure why you would want to do that. Without this time term, it seems that you'd simply consider whether gaining ~19 utils * miniscule probability is worth it, and reject the offer in the usual way, since you still have plenty of options for getting ~100.

The solution is that a loss of five dollars only incurs a loss of Epsilon utils in this situation, so you will indeed pay up if you do the expected utility calculation (although I'm not sure I agree that you would want to).

[This comment is no longer endorsed by its author]Reply

under a Solomonoff prior, the probability that they're not, though surely very small, isn't going to be less than one in 3^^^3;

Prove it.

Presumably the idea is that 3^^^3 is computable by a fairly short program (for an abstract Turing machine -- the universe does not contain the resources to actually run it), and so the Solomonoff prior assigns hypotheses involving the number a probability vastly larger than 1/3^^^3. Hence Solomonoff prior + large numbers computable by small programs --> Pascal's mugging.

This could be taken as an argument against using the Solomonoff prior for anything (besides the obvious one, that it's uncomputable). It raises all short hypotheses containing gigantic numbers to prominence, compared with computationally random numbers of the same magnitude, merely because there are short programs that compute them.

Presumably the idea is that 3^^^3 is computable by a fairly short program

Irrelevant.

Solomonoff induction evaluates programs that generate strings of possible future observations conditioned on all your past observations (or equivalently, programs that generate strings of past and future observations, and counts only those that generate a string whose prefix matches your previous observations).
These programs correspond to models of the universe (observed from your particular point of view).

Models of the universe where life goes on as we know it until at a certain point and then suddenly you get 4^^^^4 years of fun are presumbly much more complex than models of the universe where life keeps going on as we know it.

In particular, if Solomonoff induction assigns to models where starting from now you get X years of fun a total probabilty p(X) that asymptotically decreases fast enough with X so that U(X) * p(X) also decreases, it will not be subject to Pascal's mugging.
All reasonable light-tailed probability distribution would have that property w.r.t. any reasonable utility function.
I see no reason to believe that the Solomonoff distribution would behave otherwise.

All reasonable light-tailed probability distribution would have that property w.r.t. any reasonable utility function.

Ahem.

Consider a probabilistic model over the world which is a light-tailed probability distribution w.r.t. any physical quantity. Light-tailed means that p(X) decreases at least exponentially with X.

If your utility function U grows sub-exponentially w.r.t. any physical quantity X (typical utility functions grow sub-linearly), then p(X) * U(X) decreases at least exponentially with X, therfore the mugging fails.

Consider a probabilistic model over the world which is a light-tailed probability distribution w.r.t. any physical quantity. Light-tailed means that p(X) decreases at least exponentially with X.

We do not know what physical quantities exist, and Solomonoff induction requires us to consider all computable possibilities compatible with observations so far. Any distribution p so light-tailed as to go to zero faster than every computable function must itself be uncomputable, and therefore inaccessible to Solomonoff induction. Likewise universally sub-exponentially growing utility functions.

We do not know what physical quantities exist, and Solomonoff induction requires us to consider all computable possibilities compatible with observations so far.

Yes.

Any distribution p so light-tailed as to go to zero faster than every computable function must itself be uncomputable, and therefore inaccessible to Solomonoff induction.

It doesn't have to decrease faster than every computable function, only to decrease at least as fast as an exponential function with negative exponent.

Likewise universally sub-exponentially growing utility functions.

Solomonoff induction doesn't try to learn your utility function.
Clearly, if your utility function is super-exponential, then p(X) * U(X) may diverge even if p(X) is light-tailed.

I can't imagine how you could possibly think that for Solomonoff induction, p(X) would decrease exponentially for any reasonable physical quantity X. Do you really think that the number of bits required to specify a universe with X >= X_0 grows linearly with X_0?

Any natural number X_0 can be generated by a program of a length KolgomorovComplexity(X_0), which is at most log_2(X_0) + C bits, where C is a constant dependent on the UTM and not dependent on X_0.
Numbers with a special form such as 4^^^^4, obviously have a Kolmogorov complexity much smaller than log_2(X_0) + C.

But that's besides the point.

In the scenario we are considering, the agent makes the prediction after it has been offered the deal. Solomonoff induction considers only programs which are consistent with the observation history, which includes the observation of having been offered the deal. Therefore, all these programs already include the specification of the amount of valuables being offered (X = 4^^^^4 years of fun, in this example).
This means that the 2^-(KolgomorovComplexity(X)) factor in the probability estimate is normalized out.

Yes, the relevant probability is conditional on having been offered the deal. I still have no idea how this could possibly help.

Let's say that your utility is logarithmic in X, the years of fun. What you need to avoid the mugging is that P(more than X years of fun iff I pay up | deal for X years offered) goes to zero faster than log(X) grows. But the number of bits in the program "offer n^^^n years of fun, and grant this iff subject pays up" grows only linearly with n, while log(n^^^n) grows, well, superexponentially. Normalizing by the total weight of all programs that offer the deal makes the probability go further up, not down.

What you need to avoid the mugging is that P(more than X years of fun iff I pay up | deal for X years offered) goes to zero faster than log(X) grows

No.

In order to avoid the mugging you need just that
P(more than X years of fun iff I pay up | deal for X years offered) (log(X) - MarginalU(\$5)) < P(Y years of fun iff I pay up | deal for X years offered) (log(Y) - MarginalU(\$5))
where Y is the number of years of fun you will get anyway if the mugger gives you nothing (assuming, for simplicity, that the mugger either gives you X or more years of fun or gives you nothing, but the argument can be generalized to scenarios where the mugger may give you Z < X years of fun)

the number of bits in the program "offer n^^^n years of fun, and grant this iff subject pays up" grows only linearly with n

That seems incorrect.

I think that the source of your confusion stems from the fact that the program length of the (shortest) program "write n^^^n on the tape" is K(n^^^n) <= log_2(n) .
But the length of programs consistent with "offer n^^^n years of fun, and grant this iff subject pays up" must grow faster than K(n^^^n).

Suppose that length of the shortest program A(n^^^n) = "offer n^^^n years of fun, and grant this iff subject pays up" grows with K(n^^^n).
Then, up to some additive constant, Len(A(n^^^n)) is the length of the concatenation of two shortest programs:
W(n^^^n) = "write n^^^n on the tape" and
B = "read X from the tape, offer X years of fun, and grant this iff subject pays up"
where Len(W(n^^^n)) = K(n^^^n) and Len(B) is a constant independent of n^^^n.

Consider the posterior:
post = p(grant this iff subject pays up | offer n^^^n years of fun) = p("offer n^^^n years of fun, and grant this iff subject pays up") / p("offer n^^^n years of fun")

Define the shortest program O(n^^^n) = "offer n^^^n years of fun".
Again, if Len(O(n^^^n)) grows with K(n^^^n), then up to some additive constant Len(O(n^^^n)) is the length of the concatenation of W(n^^^n) and program C = "read X from the tape, offer X years of fun" which doesn't depend on n^^^n.

That is,
Len(A(n^^^n)) = K(n^^^n) + Len(B)
Len(O(n^^^n)) = K(n^^^n) + Len(C)

Therefore
post =~= 2^-(Len(A(n^^^n)) - Len(O(n^^^n))) =
= 2^-((K(n^^^n) + Len(B)) - (K(n^^^n) + Len(C))) =
= 2^-(Len(B) - Len(C))

As you can see this posterior doesn't depend on n^^^n or even on n, which is clearly inconsistent with the notion (formalized in a theorem by Solomonoff) that Solomonoff induction learns an accurate model of the environment.
Therefore, the assumption that there is a shortest program consistent with "offer n^^^n years of fun, and grant this iff subject pays up" whose length grows with K(n^^^n) must be incorrect.

What you need to avoid the mugging is that P(more than X years of fun iff I pay up | deal for X years offered) goes to zero faster than log(X) grows

No.

Okay, you're saying that as X goes up, the probability of getting X years of fun even if you don't pay up also goes up, because any program that offers a deal of X years has to include a specification of the number X? So the expected utility of not paying up doesn't stay constant as we vary X, but increases with X (at least asymptotically, for very large X)?

Well, you're right on that, and that's in fact a point I hadn't considered, thanks. But I was replying to this:

In particular, if Solomonoff induction assigns to models where starting from now you get X years of fun a total probabilty p(X) that asymptotically decreases fast enough with X so that U(X) * p(X) also decreases, it will not be subject to Pascal's mugging.

If by this you mean something else than that P(more than X years of fun iff I pay up | deal for X years offered) log(X) -> 0, then I don't understand what you mean by p(X). If you mean e.g. p(X) = P(deal for X years offered), then why would p(X) U(X) -> 0 avoid Pascal's mugging? (Not that it does go to zero.)

As you can see this posterior doesn't depend on n^^^n or even on n, which is clearly inconsistent with the notion (formalized in a theorem by Solomonoff) that Solomonoff induction learns an accurate model of the environment.

That theorem says, roughly (actually I'm just giving a particular consequence), that given a particular world program, after seeing a certain finite number of bits produced by this program, Solomonoff induction will predict all future bits correctly. The particular number of bits needed initially depends, of course, on the program. (More generally: Given a computable probability distribution over world programs, with probability one there is some number of bits after which Solomonoff induction's conditional distribution over subsequent bits will equal the "true" conditional distribution.) Of course, Solomonoff's theorem only allows the agent to observe the environment, not interact with it, but that doesn't seem to be the issue here (we can consider Hutter's variants instead).

You are not keeping the world program fixed, and for each world program considered, you only talk about what happens after the agent has a certain number of bits (which is fixed given the world program, i.e. you don't let it tend to infinity), so the theorem does not apply.

What antecedent do you want to deny in your argument, anyway? If your argument worked, it would still work if we replaced "grant X years of fun" by "write the symbol FUN on the tape X times". But there is certainly a program B that reads X from the internal tape, writes X to the output tape, reads a symbol from the input tape, and writes FUN X times iff the input symbol is ACCEPT, and similarly for C and W(n^^^n).