Self-indication assumption is wrong for interesting reasons

by neq1, 16th Apr 2010


The self-indication assumption (SIA) states that

Given the fact that you exist, you should (other things equal) favor hypotheses according to which many observers exist over hypotheses on which few observers exist.

The reason this is a bad assumption might not be obvious at first.  In fact, I think it's very easy to miss.

Argument for SIA posted on Less Wrong

First, let's take a look at an argument for SIA that appeared at Less Wrong (link).  Two situations are considered.

1.  Imagine that there are 99 people in rooms that have a blue door on the outside (1 person per room).  One other person is in a room with a red door on the outside.  It was argued that you are in a blue door room with probability 0.99.

2.  Same situation as above, but first a coin is flipped.  If heads, the red door person is never created.  If tails, the blue door people are never created.  You wake up in a room and know these facts.  It was argued that you are in a blue door room with probability 0.99.

So why is the answer in scenario 1 correct and the answer in scenario 2 incorrect?  The first thing we have to be careful about is not treating yourself as special.  The fact that you woke up just tells you that at least one conscious observer exists.

In scenario 1 we basically just need to know what proportion of conscious observers are in a blue door room.  The answer is 0.99.

In scenario 2 you never would have woken up in a room if you hadn't been created.  Thus, the fact that you exist is something we have to take into account.  We don't want to estimate P(randomly selected person, regardless of whether they exist, is in a blue door room).  That would be ignoring the fact that you exist.  Instead, the fact that you exist tells us that at least one conscious observer exists.  Again, we want to know what proportion of conscious observers are in blue door rooms.  Well, there is a 50% chance (if the coin landed heads) that all conscious observers are in blue door rooms, and a 50% chance (if it landed tails) that all conscious observers are in red door rooms.  Thus, the marginal probability of a conscious observer being in a blue door room is 0.5.
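The two calculations above can be sketched in a few lines of Python. This is only an illustration of the reasoning in this post: the `worlds` table and the `blue_fraction` helper are names made up for the example, and the averaging rule (weight each possible world by its probability, then take the proportion of observers within that world) is the one argued for here, not the one used in the Less Wrong argument.

```python
from fractions import Fraction

# Scenario 2: the coin flip decides which world gets created.
# Each entry: (probability of the world, blue-door observers, red-door observers)
worlds = {
    "heads": (Fraction(1, 2), 99, 0),  # red door person never created
    "tails": (Fraction(1, 2), 0, 1),   # blue door people never created
}

def blue_fraction(blue, red):
    """Proportion of the observers who actually exist that are behind blue doors."""
    return Fraction(blue, blue + red)

# Scenario 1: a single world in which all 100 people exist.
p_blue_scenario1 = blue_fraction(99, 1)

# Scenario 2: average the within-world proportion over the possible worlds.
p_blue_scenario2 = sum(p * blue_fraction(b, r) for p, b, r in worlds.values())

print(p_blue_scenario1, p_blue_scenario2)  # prints: 99/100 1/2
```

Weighting each world by its number of observers instead would give 0.99 in scenario 2 as well, which is the step this post disputes.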

The flaw in the more detailed Less Wrong proof (see the post) is in the move from step C to step D.  The *you* being referred to in step A might not exist to be asked the question in step D.  You have to take that into account.

General argument for SIA and why it's wrong

Let's consider the assumption more formally.

Assume that the number of people to be created, N, is a random draw from a discrete uniform distribution [1] on {1,2,...,Nmax}.  Thus, P(N=k)=1/Nmax, for k=1,...,Nmax.  Assume Nmax is large enough so that we can effectively ignore finite sample issues (this is just for simplicity).

Assume M = Nmax*(Nmax+1)/2 possible people exist, and we arbitrarily label them 1,...,M.  After the size of the world, say N=n, is determined, we randomly draw n people from the M possible people.

After the data are collected we find out that person x exists.

We can apply Bayes' theorem to get the posterior probability:

P(N=k|x exists)=k/M, for k=1,...,Nmax.
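To make the Bayes step explicit, here is a small exact check. It is a sketch under two assumptions that are not spelled out above: the n people are drawn without replacement, so that P(x exists | N=k) = k/M, and Nmax is set to a small illustrative value (`N_MAX = 10`) rather than a large one. The function name is made up for the example.

```python
from fractions import Fraction

N_MAX = 10                    # small illustrative value; the post assumes a large Nmax
M = N_MAX * (N_MAX + 1) // 2  # number of possible people, as defined above

def posterior_given_x_exists(k):
    """P(N=k | x exists) by Bayes' theorem with a uniform prior on N."""
    prior = Fraction(1, N_MAX)
    # Assumption: k distinct people are drawn from M, so a fixed x is included w.p. k/M.
    likelihood = Fraction(k, M)
    # P(x exists) = sum over j of P(N=j) * P(x exists | N=j), which works out to 1/N_MAX.
    evidence = sum(Fraction(1, N_MAX) * Fraction(j, M) for j in range(1, N_MAX + 1))
    return prior * likelihood / evidence

posterior = {k: posterior_given_x_exists(k) for k in range(1, N_MAX + 1)}
print(posterior[1], posterior[N_MAX])  # prints: 1/55 2/11
```

With this choice of M, the evidence term equals 1/Nmax and cancels against the prior, which is why the posterior reduces to k/M.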

The prior probability was uniform, but the posterior favors larger worlds.  QED.

Well, not really.

The flaw here is that we conditioned on person x existing, but person x only became of interest after we saw that they existed (peeked at the data).

What we really know is that at least one conscious observer exists -- there is nothing special about person x.

So, the correct conditional probability is:

P(N=k|someone exists)=1/Nmax, for k=1,...,Nmax.
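The same kind of exact check works for this conditional probability. Again this is only a sketch: `N_MAX = 10` is an illustrative value, and `posterior` and `someone_exists` are hypothetical helper names. The key point is that every world has at least one person, so "someone exists" has likelihood 1 for every k.

```python
from fractions import Fraction

N_MAX = 10  # small illustrative value; the post assumes a large Nmax

def posterior(likelihood):
    """Posterior over N = 1..N_MAX for a uniform prior and a given likelihood function."""
    prior = Fraction(1, N_MAX)
    joint = {k: prior * likelihood(k) for k in range(1, N_MAX + 1)}
    evidence = sum(joint.values())
    return {k: j / evidence for k, j in joint.items()}

def someone_exists(k):
    # Every world has k >= 1 people, so at least one observer always exists.
    return Fraction(1)

post_someone = posterior(someone_exists)
print(post_someone[1], post_someone[N_MAX])  # prints: 1/10 1/10
```

Because the likelihood is the same constant for every k, the data cannot shift the prior.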

Thus, prior=posterior and SIA is wrong.


The flaw with SIA that I highlighted here is that it treats you as special, as if you were labeled ahead of time.  But the reality is, no matter who was selected, they would think they are the special person.  "But I exist, I'm not just some arbitrary person.  That couldn't happen in a small world.  It's too unlikely."  In reality, the fact that I exist just means someone exists. I only became special after I already existed (peeked at the data and used it to construct the conditional probability).

Here's another way to look at it.  Imagine that a random number between 1 and 1 trillion was drawn.  Suppose 34,441 was selected.  If someone then asked what the probability of selecting that number was, the correct answer is 1 in 1 trillion.  They could then argue, "that's too unlikely of an event.  It couldn't have happened by chance."  However, because they didn't identify the number(s) of interest ahead of time, all we really can conclude is that a number was drawn, and drawing a number was a probability 1 event.
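A quick simulation shows the difference between naming the number ahead of time and naming it after the draw. This is a rough sketch with made-up parameters: the range is shrunk from 1 trillion to 1,000 so that hits are observable, and the target 441, the seed, and the variable names are arbitrary choices for the example.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
RANGE = 1000            # stand-in for "1 trillion", kept small for illustration
TRIALS = 20000
target = 441            # number of interest, named BEFORE any draw

pre_hits = 0   # draws that match the number named in advance
post_hits = 0  # draws that match a number chosen after looking at the draw

for _ in range(TRIALS):
    draw = rng.randint(1, RANGE)
    if draw == target:
        pre_hits += 1
    post_hoc_target = draw  # "peeking": pick the number of interest after seeing it
    if draw == post_hoc_target:
        post_hits += 1

pre_rate = pre_hits / TRIALS    # close to 1/RANGE
post_rate = post_hits / TRIALS  # exactly 1.0: some number is always drawn
print(pre_rate, post_rate)
```

The post-hoc match always happens, so observing it carries no information, just as "a number was drawn" is a probability 1 event.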

I give more examples of this here.

I think Nick Bostrom is getting at the same thing in his book (page 125):

..your own existence is not in general a ground for thinking that hypotheses are more likely to be true just by virtue of implying that there is a greater total number of observers. The datum of your existence tends to disconfirm hypotheses on which it would be unlikely that any observers (in your reference class) should exist; but that’s as far as it goes. The reason for this is that the sample at hand—you—should not be thought of as randomly selected from the class of all possible observers but only from a class of observers who will actually have existed. It is, so to speak, not a coincidence that the sample you are considering is one that actually exists. Rather, that’s a logical consequence of the fact that only actual observers actually view themselves as samples from anything at all

Related arguments are made in this LessWrong post.  

[1] For simplicity, I'm assuming a uniform prior; the prior isn't the issue here.