Torture vs. Dust vs. the Presumptuous Philosopher: Anthropic Reasoning in UDT

An important point about the Presumptuous Philosopher is that so far the apparent universe *has* kept on getting bigger and bigger, and someone back at the dawn of time who said, "I don't know *how* exactly, but I predict that as we learn more about reality our model of the world will just keep getting bigger and bigger and bigger" would have been exactly and shockingly *right*.

At this point I apply the meta-principle of rationality, "Don't criticize people when they are right, even if you disagree with how they got there - wait to criticize their methods for an occasion when they are actually wrong." I.e., the Presumptuous Philosopher is a poor battleground on which to attack SIA, because if used as a heuristic in the past - correctly, not to stake everything on particular theories, but to say that whatever winning theory would make the apparent universe larger - it would have racked up an unbroken string of victories.

> I.e., the Presumptuous Philosopher is a poor battleground on which to attack SIA, because if used as a heuristic in the past - correctly, not to stake everything on particular theories, but to say that whatever winning theory would make the apparent universe larger - it would have racked up an unbroken string of victories.

I don't think that that's right. Back when it was Big Bang vs Steady State, Steady State lost, even though it implied a larger Universe.

Maybe the *ultimate* winning theory will hold the Universe to be larger still. But, insofar as there have yet been victories, that one did not go to the one advocating the larger universe.

Hello from five years in the future. Multiverse theories of what caused the Big Bang are now taken very seriously. Creation just got larger *again*.

The Presumptuous Philosopher doesn't think that in every particular clash the bigger hypothesis wins, because there are many possible ways for the universe to be big. The Presumptuous Philosopher just says the universe eventually turns out to be big. Big Bang won over Steady State, but in an open universe (which I believe this is) the Big Bang universe is "infinite" anyway. Now, infinity is a special case, and I suspect that the configuration space may be finite; but that's a separate story - the point is, the universe still turned out to be very big, just not in the particular Steady State way.

Suppose we modify the situation so that there is an infinite list of possible Theories of Everything, T1, T2, T3, ..., predicting a trillion observers, a trillion trillion observers, a trillion trillion trillion observers, and so on. Additionally, from some other prior info gleaned from scientific insight, the probability of Tn being true is 2^(-n).

SIA here would suggest we live in the universe with the most observers... which in this case doesn't exist.

Huh! This provides a surprisingly non-zero amount of justification for an intuition I thought was probably just me being stupid - namely my attempt to rescale my utilities as fractions of the total number of observers or integral of observer-moments inside a universe, for purposes of cross-universe comparison.

I don't think this is a good argument against SIA. The most natural interpretation of SIA gives a posterior of zero for each Ti, so it doesn't actually say we "live in the universe with the most people".

"T{i+1} is half a trillion times as likely as Ti" doesn't actually imply "T{i+1} is *more* likely than Ti", because those probabilities are both zero.
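The divergence discussed in this sub-thread can be checked numerically. The following is only a sketch under stated assumptions: it takes Tn to predict (10^12)^n observers with prior 2^(-n), as in the comment above, and it truncates the list at a finite cutoff (the cutoff and the function name `sia_posterior` are illustrative, not part of the original setup).

```python
from fractions import Fraction

TRILLION = 10**12

def sia_posterior(num_theories):
    """SIA posterior over T1..TN when the infinite list is cut off at N theories."""
    # Unnormalized SIA weight of Tn: prior 2^(-n) times observer count (10^12)^n.
    weights = [Fraction(TRILLION**n, 2**n) for n in range(1, num_theories + 1)]
    total = sum(weights)
    return [w / total for w in weights]

# Print the posterior of T1 and of the last theory for a few cutoffs.
for cutoff in (2, 5, 10):
    posterior = sia_posterior(cutoff)
    print(cutoff, float(posterior[0]), float(posterior[-1]))
```

Each weight is half a trillion times the previous one, so the normalizing sum grows without bound as the cutoff increases. The posterior of any fixed Ti therefore shrinks toward zero, while nearly all the mass sits on whichever theory happens to be last before the cutoff.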

Even if we somehow constrained the number of observers to be bounded above by (e.g.) the number of cubed Planck lengths in the universe, our hypothesis under SIA would be... that all such cubes did in fact contain an observer?

Reasons I hate the Presumptuous Philosopher thought experiment:

- It depends partly on question-begging. "We all know SIA is false, therefore the philosopher's claims are ridiculous, therefore SIA is false."
- The philosopher's confidence is only really warranted if he's 100% sure of SIA, which he shouldn't be, and this makes his posterior much more ridiculous than it would be if he were e.g. only 95% sure of SIA. The thought experiment sneaks in our philosophical uncertainty as an anti-SIA intuition.
- The Nobel committee won't award him the prize because he's just assumed SIA, not justified it. Suppose some other dude assumed inverse-SIA and came to the opposite conclusion - is the Nobel committee obligated to award the prize to one of these prestige-grubbing jackasses just because one of them inevitably gets the right answer? If he instead wrote an airtight thousand-page treatise proving SIA completely, and then applied that to the T1-vs-T2 question, I think the Nobel committee would be more likely to give him the prize.

It is bad to apply statistics when you don't in fact have large numbers - we have just one universe (at least until the many-worlds theory is better established - and anyway, the exposition didn't mention it).

I think the following problem is equivalent to the one posed: It is late at night, you're tired, and it's dark and you're driving down an unfamiliar road. Then you see two motels, one to the right of the street, one to the left, both advertising vacant rooms. You know from a visit years ago that one has 10 rooms, the other has 100, but you can't tell which is which (though you do remember that the larger one is cheaper). Anyway, you're tired, so you just choose the one on the right at random, check in, and go to sleep. As you wake up in the morning, what are your chances that you find yourself in the larger motel? Does the number of rooms come into it? (Assume both motels are 90% full.)

The paradox is that while the other hotel is not counterfactual, it might as well be - the problem will play out the same. Same with the universe - there aren't actually two universes with probabilities on which one you'll end up in.

For a version where the Bayesian update works, you'd not go to the motel directly, but go to a tourist information stall that directs visitors to either the smaller or the larger motel until both are full - in that case, expect to wake up in the larger one. In this case, we have not one world, but two, and then the reasoning holds.

But if there's only one motel, because the other burnt down (and we don't know which), we're back to 50/50.
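The two motel variants can be compared with a small exact calculation. This is a sketch, not the commenter's own model: it assumes the information stall hands out each vacant room with equal probability, and the names `p_large_coin_flip` and `p_large_via_stall` are made up for illustration.

```python
from fractions import Fraction

ROOMS = {"small": 10, "large": 100}
OCCUPANCY = Fraction(9, 10)  # both motels are 90% full, as in the comment

def p_large_coin_flip():
    """Driver picks a side of the road at random; room counts never enter."""
    return Fraction(1, 2)

def p_large_via_stall():
    """Stall assigns the guest to a vacant room chosen uniformly (assumption)."""
    vacant = {name: count * (1 - OCCUPANCY) for name, count in ROOMS.items()}
    return vacant["large"] / sum(vacant.values())

print("random side of the road:", p_large_coin_flip())
print("assigned by the stall:  ", p_large_via_stall())
```

Under these assumptions the room count only matters when something actually allocates guests across both motels. The burnt-down case behaves like the coin flip, since only one motel is ever in play.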

I know that "fuzzy logic" tries to mix statistics and logic, and many AIs use it to deal with uncertain assertions, but statistics can be misapplied so easily that you seem to have a problem here.

I think attempting to link epistemology to morality is going down the wrong path. Instead, it seems that morality should be built on epistemology. Instead of comparing dollars to the disaster, we should put everything in the same currency. So, for example, each individual could have the option to choose either:

a) Get one dust speck in their eye if they are in the large world

b) Get a hundred dust specks in their eye if they are in the small world

Converting the hundred dust specks into torture just unnecessarily complicates the problem.

Ultimately, this path leads to such unenlightening answers as there being a preference among programs that the agent should execute (with no uncertainty anywhere), and so choice of action is just choice among these programs, which is correct if it's according to that preference (order) -- a "moral" decision. UDT surfaces some mistaken assumptions in the usual decision theories, mainly in situations of unusual levels of craziness, but it doesn't actually answer any of the questions. It just heals the mistakes and pushes the buck to unspecified "preference".

I don't quite understand your comment. When you say "this path leads to such unenlightening answers" what path are you referring to? If you mean the path of considering anthropic reasoning problems in the UDT framework, I don't see why that must be unenlightening. It seems to me that we can learn something about the nature of both anthropic reasoning and preferences in UDT through such considerations.

For example, if someone has strong intuitions or arguments for or against SIA, that would seem to have certain implications about his preferences in UDT, right?

I think that Wei has a point: it is in principle possible to hold preferences and an epistemology such that, via his link, you are contradicting yourself.

For example, if you believe SIA and think that you should be a utility-maximizer, then you are committed to risking a 50% probability of killing someone to save $1, which many people may find highly counter-intuitive.

The "Presumptuous Philosopher" thought experiment does not seem like a convincing argument against the Self-Indication Assumption to me. Rather, the Self-Indication Assumption seems pretty self-evident.

The philosopher doesn't get a Nobel prize - but it's not because she is wrong - it is because she is stating the obvious.

> In UDT, no Bayesian updating occurs, and in particular, you don't update on the fact that you exist

To be honest, I am suspicious of both UDT and Modal Realism. I find that I simply do not care what happens in mathematically possible structures other than the world I am actually in; in a sense this validates Wei's claim that

> Updateless Decision Theory converts anthropic reasoning problems into ethical problems

since I do not care what happens in mathematically possible structures that are incompatible with what I have already observed about the real world around me, I may as well have updated on my own existence in this world anyway.

In the case where either theory T1 or T2 is true, I care about whichever world is actually real, so my intuition is that we should pay the $1, which causes me to believe that I implicitly reject SIA.

> I find that I simply do not care what happens in mathematically possible structures other than the world I am actually in

I think this is where Counterfactual Mugging comes in. If you expect to encounter CM-like situations in the future, then you'd want your future self to care what happens in mathematically possible structures other than the world it is "actually in", since that makes the current you better off.

UDT might be too alien, so that you can't make yourself use it even if you want to (so your future self won't give $100 to Omega no matter what the current you wants), but AI seems to be a good application for it.

> in mathematically possible structures other than the world it is "actually in"

The Counterfactual Mugging previously discussed involved probabilities over mathematical facts, e.g. the value, in binary, of the nth digit of pi. If that digit does turn out to be a 0, the counterfactual mugging pays off only in a mathematically *im*possible structure.

The point of the comment you replied to was that "simply do not care what happens in mathematically possible structures other than the world I am actually in" may be true of SforSingularity, but consideration of Counterfactual Mugging shows that it shouldn't be elevated to a general moral principle, and in fact he would prefer his own future self to not follow that. To make that point, I only need a version of CM with a physical coin.

The version of CM with a mathematical coin is trickier. But I think under UDT, since you don't update on what Omega tells you about the coin result, you continue to think that both outcomes are possible. You only think something is mathematically impossible, if you come to that conclusion through your internal computations.

> The version of CM with a mathematical coin is trickier. But I think under UDT, since you don't update on what Omega tells you about the coin result, you continue to think that both outcomes are possible. You only think something is mathematically impossible, if you come to that conclusion through your internal computations.

You don't "update" on your own mathematical computations either.

The data you construct or collect is about what *you* are, and by extension what your actions are and thus what is their effect, not about what is possible in the abstract (more precisely: what you could think possible in other situations). That's the trick with mathematical uncertainty: since you can plan for situations that turn out to be impossible, you need to take that planning into account in other situations. This is what you do by factoring the impossible situations in the decision-making: accounting for your own planning for those situations, in situations where you don't know them to be impossible.

I don't get this either, sorry. Can you give an example where "You don't "update" on your own mathematical computations either" makes sense?

Here's how I see CM-with-math-coin going in more detail. I think we should ask the question: suppose you think that Omega may come in a moment to CM you using the n-th bit of pi; what would you prefer your future self to do, assuming that you can compute the n-th bit of pi, either now or later? If you can compute it now, clearly you'd prefer your future self to not give $100 to Omega if the bit is 0.

What if you can't compute it now, but you can compute it later? In that case, you'd prefer your future self to not give $100 to Omega if *it* computes that the bit is 0. Because suppose the bit is 1, then Omega will simulate/predict your future self, and the simulated self will compute that the bit is 1 and give $100 to Omega, so Omega will reward you. And if the bit is 0, then Omega will not get $100 from you.

Since by "updating" on your own computation, you win both ways, I don't see why you shouldn't do it.

(I converted this comment to a top-level post. See Counterfactual Mugging and Logical Uncertainty. A little bit is left here in the original notation, as a reply.)

I assume you recast the problem this way: if the n-th bit of pi is 1, then Omega maybe gives you $10000, and if the bit is 0, then Omega asks for the $100.

If the bit is 1, Omega's simulation of you *can't* conclude that the bit is 0, because the bit is 1. Omega doesn't compute what you'll predict in reality, it computes what you *would* do *if the bit was 0* (which it isn't, in reality, in this case where it isn't). And as you suggested, you decline to give away the $100 if the bit is 0, thus Omega's simulation *of the counterfactual* will say that you wouldn't oblige, and you won't get the $10000.
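The disagreement can be made concrete with a toy payoff table. This is a sketch, assuming the recast problem above ($10000 reward, $100 request) and assuming a 0.5 credence that the bit is 1 before it has been computed. The function names and that credence are illustrative, not taken from the thread.

```python
REWARD = 10_000  # what Omega pays if the bit is 1 and you would have paid
COST = 100       # what Omega asks for if the bit is 0

def payoff(bit, pays_when_asked):
    """Payoff for a policy, described by what it does when asked for $100."""
    if bit == 1:
        # Omega evaluates the counterfactual "the bit is 0" and rewards only
        # policies that would hand over the money in that situation.
        return REWARD if pays_when_asked else 0
    return -COST if pays_when_asked else 0

def expected_payoff(pays_when_asked, p_bit_is_1=0.5):
    """Expected payoff under logical uncertainty about the bit (assumed 0.5)."""
    return (p_bit_is_1 * payoff(1, pays_when_asked)
            + (1 - p_bit_is_1) * payoff(0, pays_when_asked))

print("pay when asked:   ", expected_payoff(True))
print("refuse when asked:", expected_payoff(False))
```

On this reading, "refuse once I have computed that the bit is 0" is the same policy as refusing whenever asked, so it collects nothing in the case where the bit is 1.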

Of course from a physical point of view, (e.g. from the point of view of Many Worlds QM or the lower Tegmark levels) there are lots of human instances around in the multiverse, all thinking that their particular bit of the multiverse is "real". Clearly, they cannot all be right. This is somewhat worrying; naive ideas about our little part of the universe being real, and the rest imaginary, are probably a "confusion", so we end up (as Wei D says) having to turn our old-fashioned epistemological intuitions into ethical principles; principles such as "I only care about the world that I am actually in", or we have to leave ourselves open to turning into madmen who do bizarre things to themselves for expected reward in other possible universes.

And formalizing "the universe I am actually in" may not be easy; unless we are omniscient, we cannot have enough data to pin down where exactly in the multiverse we are.

Re: 1 and 2, if you've got subjectively strong (but actually harmful) intuitions in one area, then the link allows contagion. Intuitions should be used to generate experiments, not structures.

I'm not sure how intuitions about morality and epistemology can be used to generate experiments. Do you have any suggestions or examples?

But anyway, perhaps I should have used the word "belief" rather than "intuition".

In this post, I'd like to examine whether Updateless Decision Theory can provide any insights into anthropic reasoning. Puzzles/paradoxes in anthropic reasoning are what prompted me to consider UDT originally, and this post may be of interest to those who do not consider Counterfactual Mugging to provide sufficient motivation for UDT.

The Presumptuous Philosopher is a thought experiment that Nick Bostrom used to argue against the Self-Indication Assumption. (SIA: Given the fact that you exist, you should (other things equal) favor hypotheses according to which many observers exist over hypotheses on which few observers exist.)

To make this example clearer as a decision problem, let's say that the consequence of carrying out the "simple experiment" is a very small cost (one dollar), and the consequence of just assuming T2 is a disaster down the line if T1 turns out to be true (we create a power plant based on T2, and it blows up and kills someone).

In UDT, no Bayesian updating occurs, and in particular, you don't update on the fact that you exist. Suppose in CDT you have a prior P(T1) = P(T2) = .5 before taking into account that you exist, then translated into UDT you have Σ P(V_i) = Σ P(W_i) = .5, where V_i and W_i are world programs where T1 and T2 respectively hold. Anthropic reasoning occurs as a result of considering the *consequences* of your decisions, which are a trillion times greater in T2 worlds than in T1 worlds, since your decision algorithm S is called about a trillion times more often in W_i programs than in V_i programs.

Perhaps by now you've noticed the parallel between this decision problem and Eliezer's Torture vs. Dust Specks. The very small cost of the simple physics experiment is akin to getting a dust speck in the eye, and the disaster of wrongly assuming T2 is akin to being tortured. By not doing the experiment, we can save one dollar for a trillion individuals in exchange for every individual we kill.
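The trade-off in this decision problem can be sketched as a short calculation. The sketch rests on assumptions that go beyond the post: utilities are added linearly across individuals, a death is given a dollar-equivalent cost, and every call of S in a T1 world that skips the experiment leads to one death. The function name `skip_minus_experiment` is hypothetical.

```python
from fractions import Fraction

def skip_minus_experiment(death_cost, calls_ratio=10**12, p_t1=Fraction(1, 2)):
    """Expected gain from skipping the experiment, per call of S in a T1 world.

    death_cost  -- dollar-equivalent cost of one death (an assumed quantity)
    calls_ratio -- how many times more often S is called in T2 worlds
    p_t1        -- total prior measure of the T1 world programs
    """
    p_t2 = 1 - p_t1
    # T1 worlds: each skipped experiment saves $1 but costs one life.
    gain_t1 = p_t1 * (1 - death_cost)
    # T2 worlds: S runs calls_ratio times as often, and each call saves $1.
    gain_t2 = p_t2 * calls_ratio
    return gain_t1 + gain_t2

for cost in (10**6, 10**12 + 1, 10**13):
    print(cost, skip_minus_experiment(cost))
```

With equal priors the sign flips at a death cost of about a trillion dollars, which matches the "one dollar for a trillion individuals" framing. A different aggregation rule, such as discounting individuals by a complexity-based measure, would move that break-even point.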

In general, Updateless Decision Theory converts anthropic reasoning problems into ethical problems. I can see three approaches to taking advantage of this:

Personally, I have vacillated between 1 and 2. I've argued, based on 1, that we should discount the values of individuals by using a complexity-based measure. And I've also argued, based on 2, that perhaps the choice of an epistemic prior is more or less arbitrary (since objective morality seems unlikely to me). So I'm not sure what the right answer is, but this seems to be the right track to me.