Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

I've written before that different theories of anthropic probability are really answers to different questions. In this post I'll try to be as clear as possible on what that means, and explore the implications.

Introduction

One of Nick Bostrom's early anthropic examples involved different numbers of cars in different lanes. Here is a modification of that example:

You're driving along, when you turn into a dark tunnel and are automatically shunted into the left or the right lane. You can't see whether there are any other cars in your dark lane, but the car radio announces "there are 99 cars in the right lane and 1 in the left lane".

Given that, what is your probability of being in the left lane?

That probability is obviously 1%. More interesting than that answer is that there are multiple ways of reaching it, and each of these ways corresponds to answering a slightly different question. This leads to my ultimate answer about anthropic probability:

  • Each theory of anthropic probability corresponds to answering a specific, different question about proportions. These questions are equivalent in non-anthropic settings, so each of them feels potentially like a "true" extension of probability to anthropics. Paradoxes and confusion in anthropics result from confusing one question with another.

So if I'm asked "what's the 'real' anthropic probability of X?", my answer is: tell me what you mean by probability, and I'll tell you what the answer is.

0. The questions

If X is a feature that you might or might not have (like being in a left lane), here are several questions that might encode the probability of X:

  1. What proportion of potential observers have X?
  2. What proportion of potential observers exactly like you have X?
  3. What is the average proportion of potential observers with X?
  4. What is the average proportion of potential observers exactly like you with X?

We'll look at each of these questions in turn[1], and see what they imply in anthropic and non-anthropic situations.

1. Proportion of potential observers: SIA

We're trying to answer "Given that, what is your probability of being in the left lane?" The "that" means being in the tunnel in the above situation, so we're actually looking for a conditional probability, best expressed as:

  1. What proportion of the potential observers, who are in the tunnel in the situation above, are also in the left lane?

The answer for that is an immediate "one in a hundred", or 1%, since we know there are 100 drivers in the tunnel, and 1 of them is in the left lane. There may be millions of different tunnels, in trillions of different potential universes; but, assuming we don't need to worry about infinity[2], we can count 100 observers in the tunnel in that situation for each observer in the left lane.

1.1 Anthropic variant

Let's now see how this approach generalises to anthropic problems. Here is an anthropic version of the tunnel problem, based on the incubator version of the Sleeping Beauty problem:

A godly AI creates a tunnel, then flips a fair coin. If the coin comes out heads, it creates one person in the tunnel; if it comes out tails, it creates 99 people.

You've just woken up in this tunnel; what is the probability that the coin was heads?

So, we want to answer:

  1. What proportion of the potential observers, who are in the tunnel, are also in a world where the coin was heads?

We can't just count off observers within the same universe here, since the 1 and the 99 observers don't exist in the same universe. But we can pair up universes here: for each universe where the coin flip goes heads (1 observer), there is another universe of equal probability where the coin flip goes tails (99 observers).

So the answer to the proportion of potential observers question remains 1%, just as in the non-anthropic situation.

This is exactly the "self-indication assumption" (SIA) version of probability, which counts observers in other potential universes as if they existed in a larger multiverse of potential universes[3].
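The counting argument above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sia_probability` and the world encoding are made up here, and the 1-versus-99 split is taken from the "one in a hundred" answer above.

```python
from fractions import Fraction

def sia_probability(worlds, has_feature):
    """Probability-weighted proportion of potential observers with a feature.

    `worlds` is a list of (probability, observers) pairs, where `observers`
    is a list of labels. This encoding is an illustrative assumption.
    """
    total = Fraction(0)
    with_feature = Fraction(0)
    for prob, observers in worlds:
        for obs in observers:
            total += prob              # each observer counts with its world's weight
            if has_feature(obs):
                with_feature += prob
    return with_feature / total

# Non-anthropic tunnel: a single world with 1 left-lane and 99 right-lane drivers.
tunnel = [(Fraction(1), ["left"] + ["right"] * 99)]
p_left = sia_probability(tunnel, lambda o: o == "left")

# Incubator variant: heads world has 1 observer, tails world has 99.
incubator = [(Fraction(1, 2), ["heads"]), (Fraction(1, 2), ["tails"] * 99)]
p_heads = sia_probability(incubator, lambda o: o == "heads")

print(p_left, p_heads)  # 1/100 1/100
```

Both cases reduce to the same count, which is the sense in which this question extends the ordinary probability to the anthropic setting.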

2. Proportion of potential observers exactly like you: SIA again

Let's now look at the second question:

  1. What proportion of the potential observers exactly like you, who are in the tunnel in the situation above, are also in the left lane?

The phrase "exactly like you" is underdefined: do you require that the other yous be made of exactly the same material, in the same location, etc.? I'll cash out the phrase as meaning "has had the same subjective experience as you". So we can cash out the left-lane probability as:

  1. What proportion of the potential observers, with the same subjective experiences as you, who are in the tunnel in the situation above, are also in the left lane?

We can't count off observers within the same universe for this, as the chance of having multiple observers with the same subjective experience in the same universe is very low, unless there are huge numbers of observers.

Instead, assume that one in observers in the tunnel have the same subjective experiences as you. This proportion[4] must be equal for an observer in the left and right lanes. If it weren't, you could deduce information about which lane you were in just from your experiences - so the proportion being equal is the same thing as the lane and your subjective experiences being independent. For any given little , this gives the following proportions (where "Right 1 not you" is short for "the same world as 'Right 1 you,' apart from the first person on the right, who is replaced with a non-you observer"):

So the proportion of observers in the right/left lane with your subjective experience is a fixed fraction of the proportion of observers in the right/left lane. When comparing those two proportions, that common fraction cancels out, and we get 1%, as before.

2.1 Anthropic variant

Ask the anthropic version of the question:

  1. What proportion of the potential observers who are in the tunnel, with the same subjective experiences as you, are also in a world where the coin was heads?

Then the same argument as above shows this is also 1% (where "Tails 1 not you" is short for "the same world as 'Tails 1 you,' apart from the first tails person, who is replaced with a non-you observer"):

This is still SIA, and reflects the fact that, for SIA, the reference class doesn't matter, as long as it includes the observers subjectively indistinguishable from you. So questions about you are the same whether we talk about "observers" or "observers with the same subjective experiences as you".
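A minimal sketch of why the reference class drops out for SIA, assuming (purely for illustration) that one in `k` observers shares your subjective experiences, independently of the coin. The name `k` and the function name are hypothetical, not taken from the post.

```python
from fractions import Fraction

def sia_heads_given_like_you(k, n_heads=1, n_tails=99):
    """SIA proportion of heads-world observers among observers 'like you'.

    Assumes a fraction 1/k of observers is subjectively identical to you in
    both worlds; k is an illustrative parameter.
    """
    like_you = Fraction(1, k)
    heads_weight = Fraction(1, 2) * n_heads * like_you
    tails_weight = Fraction(1, 2) * n_tails * like_you
    # The factor 1/k appears in numerator and denominator, so it cancels.
    return heads_weight / (heads_weight + tails_weight)

results = {k: sia_heads_given_like_you(k) for k in (1, 10, 10**6)}
print(results)
```

Whatever value of `k` is used, the result stays at 1/100, because the "like you" fraction multiplies both sides of the comparison.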

3. Average proportions of observers: SSA

We now turn to the next question:

  1. What is the average proportion of potential observers in the left lane, relative to the average proportion of potential observers in the tunnel?

Within a given world, say there are $N$ observers not in the tunnel and $t$ tunnels, so $N + 100t$ observers in total.

The proportion of observers in the left lane is $\frac{t}{N + 100t}$ while the proportion of observers in the tunnel is $\frac{100t}{N + 100t}$. The ratio of these proportions is $1:100$.

Then notice that if the left-lane proportion and the tunnel proportion are in a $1:100$ ratio in every possible world, their averages are in a $1:100$ ratio as well[5], giving the standard probability of 1%.

3.1 Anthropic variant

The anthropic variant of the question is then:

  1. What is the average proportion of potential observers in a world where the coin was heads, relative to the average proportion of potential observers in the tunnel?

Within a given world, ignoring the coin, say there are $N$ observers not in the tunnel, and $t$ tunnels. Let's focus on the case with one tunnel, $t = 1$. Then the coin toss splits this world into two equally probable worlds, the heads world, $W_H$, with $N + 1$ observers, and the tails world, $W_T$, with $N + 99$ observers.

The proportion of observers in tunnels in $W_H$ is $\frac{1}{N+1}$. The proportion of observers in tunnels in $W_T$ is $\frac{99}{N+99}$. Hence, across these two worlds, the average proportion of observers in tunnels is the average of these two, specifically

$$\frac{1}{2}\left(\frac{1}{N+1} + \frac{99}{N+99}\right).$$

If $N$ is zero, this is $1$; this is intuitive, since $N = 0$ means that all observers are in tunnels, so the average proportion of observers in tunnels is $1$.

What about the proportion of observers in the tunnels in the heads worlds? Well, this is $\frac{1}{N+1}$ in the heads world, and $0$ in the tails world, so the average proportion is:

$$\frac{1}{2}\left(\frac{1}{N+1} + 0\right).$$

If $N$ is zero, this is $1/2$: the average between $1$, the heads world proportion for $N = 0$ in $W_H$ (all observers are heads world observers in tunnels), and $0$, the proportion of heads world observers in the tails world $W_T$.

Taking the ratio $(1/2)/1$, the answer to that question is $1/2$. This is the answer given by the "self-sampling assumption" (SSA), which gives the $1/2$ response in the sleeping beauty problem (of which this is a variant).

In general, the ratio would be:

$$\frac{\frac{1}{N+1}}{\frac{1}{N+1} + \frac{99}{N+99}} = \frac{N + 99}{100N + 198}.$$

If $N$ is very large, this is approximately $1/100$, i.e. the same answer as SIA would give. This shows that, for SSA, the reference class of observers is important. The value of $N$, the number of observers that are not in a tunnel, determines the probability estimate. So how we define observers will determine our probability[6].

So, for a given pair of equally likely worlds, $W_H$ and $W_T$, the ratio of question 3. varies between $1/2$ and $1/100$. This holds true for multiple tunnels as well. And it's not hard to see that this implies that, averaging across all worlds, we also get a ratio between $1/2$ (all observers in the reference class are in tunnels) and $1/100$ (almost no observers in the reference class are in tunnels).
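To make the dependence on the reference class concrete, here is a small sketch that computes the ratio of average proportions for the one-tunnel case. The parameter `n_outside` is a name chosen here for the number of reference-class observers who are not in the tunnel; the 1 and 99 are the heads and tails populations assumed throughout.

```python
from fractions import Fraction

def ssa_heads(n_outside, n_heads=1, n_tails=99):
    """Ratio of average proportions (question 3) for a single tunnel.

    n_outside: observers in the reference class who are not in the tunnel
    (an illustrative parameter name).
    """
    tunnel_in_heads_world = Fraction(n_heads, n_outside + n_heads)
    tunnel_in_tails_world = Fraction(n_tails, n_outside + n_tails)
    # Average proportion of observers in the tunnel over the two worlds.
    avg_in_tunnel = (tunnel_in_heads_world + tunnel_in_tails_world) / 2
    # Average proportion of heads-world tunnel observers (zero in the tails world).
    avg_heads_in_tunnel = (tunnel_in_heads_world + 0) / 2
    return avg_heads_in_tunnel / avg_in_tunnel

for n in (0, 1, 100, 10**6):
    print(n, float(ssa_heads(n)))
```

With no outside observers the result is exactly 1/2, and it drifts towards 1/100 as `n_outside` grows, which is the SSA-to-SIA range described above.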

4. Average proportions of observers exactly like you: FNC

Almost there! We have a last question to ask:

  1. What is the average proportion of potential observers in the left lane, with the same subjective experiences as you, relative to the average proportion of potential observers in the tunnel, with the same subjective experiences as you?

I'll spare you the proof that this gives 1% again, and turn directly to the anthropic variant:

  1. What is the average proportion of potential observers in a world where the coin was heads, with the same subjective experiences as you, relative to the average proportion of potential observers in the tunnel, with the same subjective experiences as you?

By the previous section, this is the SSA probability with the reference class of "observers with the same subjective experiences as you". This turns out to be full non-indexical conditioning (FNC), which involves conditioning on any possible observation you've made, no matter how irrelevant. It's known that if all the observers have made the same observations, this reproduces SSA, but that as the number of unique observations increases, this tends to SIA.

That's because FNC is inconsistent - the odds of heads to tails change based on irrelevant observations which change your subjective experience. Here we can see what's going on: FNC is SSA with the reference class of observers with the same subjective experiences as you. But this reference class is variable: as you observe more, the size of the reference class changes, decreasing[7] because others in the reference class will observe something different to what you do.

But SSA is not consistent across reference class changes! So FNC is not stable across new observations, even if those observations are irrelevant to the probability being estimated.

For example, imagine that we started, in the tails world, with all copies exactly identical to you, and then you make a complex observation. Then that world will split in many worlds where there are no exact copies of you (since none of them made exactly the same observation as you), a few worlds where there is one copy of you (that made the same observation as you), and many fewer worlds where there are more than one copy of you:

In the heads world, we only have no exact copies and one exact copy. We can ignore the worlds without observers exactly like us, and concentrate on the worlds with a single observer like us (this represents the vast majority of the probability mass). Then, since there are 99 possible locations in the tails world and 1 in the heads world, we get a ratio of roughly 99:1 for tails over heads:

This gives a ratio of roughly 100:1 for "any coin result" over heads, and shows why FNC converges to SIA.
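The convergence can be checked numerically with a toy model. The assumption here, which is mine and not spelled out in the post, is that each observer independently ends up with exactly your observations with probability `q`; FNC then conditions on at least one such observer existing, which matches SSA with the "exactly like you" reference class.

```python
def fnc_heads(q, n_heads=1, n_tails=99):
    """P(heads | at least one observer has exactly your observations).

    q is the assumed chance that any given observer matches your
    observations; it is an illustrative modelling parameter.
    """
    p_exists_heads = 1 - (1 - q) ** n_heads
    p_exists_tails = 1 - (1 - q) ** n_tails
    # Fair coin, so the prior 1/2 cancels in the ratio.
    return p_exists_heads / (p_exists_heads + p_exists_tails)

for q in (1.0, 0.5, 0.01, 1e-6):
    print(q, fnc_heads(q))
```

When `q` is 1 (everyone observes the same thing) this gives the SSA answer of 1/2; as `q` shrinks, because observations become more detailed, it approaches the SIA answer of 1/100.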

5. What decision to make: ADT

There's a fifth question you could ask:

  1. What is the best action I can take, given what I know about the observers, our decision algorithms, and my utility function?

This transforms the probability question into a decision-theoretic question. I've posted at length on Anthropic Decision Theory, which is the answer to that question. Since I've done a lot of work on that already, I won't be repeating that work here. I'll just point out that "what's the best decision" is something that can be computed independently of the various versions of "what's the probability".

5.1 How right do you want to be?

An alternate characterisation of the SIA and SSA questions could be to ask, "If I said 'I have X', would I want most of my copies to be correct (SIA) or my copies to be correct in most universes (SSA)?"

These can be seen as having two different utility functions (one linear in copies that are correct, one that gives rewards in universes where my copies are correct), and acting to maximise them. See the post here for more details.
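As a rough sketch of these two scoring rules in the incubator problem, assuming 1 copy on heads, 99 on tails, and no observers outside the tunnel. The function names and the `WORLDS` table are illustrative only.

```python
from fractions import Fraction

# world -> (probability, number of copies created); illustrative encoding
WORLDS = {"heads": (Fraction(1, 2), 1), "tails": (Fraction(1, 2), 99)}

def expected_correct_copies(guess):
    """Utility linear in the number of copies that guessed correctly."""
    return sum(p * n for world, (p, n) in WORLDS.items() if world == guess)

def expected_correct_universes(guess):
    """Utility that rewards each universe in which the copies are correct."""
    return sum(p for world, (p, n) in WORLDS.items() if world == guess)

for guess in WORLDS:
    print(guess, expected_correct_copies(guess), expected_correct_universes(guess))
```

Scoring by copies favours "tails" at 99 to 1, which normalises to the SIA 1/100 for heads; scoring by universes is indifferent, matching the SSA 1/2.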

6. Some "paradoxes" of anthropic reasoning

Given the above, let's look again at some of the paradoxes of anthropic reasoning. I'll choose three: the Doomsday argument, the presumptuous philosopher, and Robin Hanson's take on grabby aliens.

6.1 Doomsday argument

The Doomsday argument claims that the end of humanity is likely to be at hand - or at least more likely than we might think.

To see how the argument goes, we could ask "what proportion of humans will be in the last of all humans who have ever lived in their universe?" The answer to that is, tautologically[8], .

The simplest Doomsday argument would then reason from that, saying "with probability, we are in the last of humans in our universe, so, with probability, humanity will end in this universe before it reaches times the human population to date."

What went wrong there? The use of the term "probability", without qualifiers. The sentence slipped from using probability in terms of ratios within universes (the SSA version) to ratios of which universes we find ourselves in (the SIA version).

As an illustration, imagine that the godly AI creates either world (with humans), (with humans), (with humans), or (with humans). Each option is with probability . These humans are created in numbered rooms, in order, starting at room .

Then we might ask:

  • A. What proportion of humans are in the last of all humans created in their universe?

That proportion is undefined for . But for the other worlds, the proportion is (e.g. humans through for , humans through for etc...). Ignoring the undefined world, the average proportion is also .

Now suppose we are created in one of those rooms, and we notice that it is room number . This rules out worlds and ; but the average proportion remains .

But we might ask instead:

  • B. What proportion of humans in room are in the last of all humans created in their universe?

As before, humans being in room eliminates worlds and . The worlds and are equally likely, and each have one human in room . In , we are in the last of humans; in , we are not. So the answer to question B is .

Thus the answer to A is , the answer to B is , and there is no contradiction between these.

Another way of thinking of this: suppose you play a game where you invest a certain amount of coins. With probability , your money is multiplied by ten; with probability , you lose everything. You continue re-investing the money until you lose. This is illustrated by the following diagram (with the initial investment indicated by green coins):

Then it is simultaneously true that:

  1. of all the coins you earnt were lost the very first time you invested them, and
  2. You have only chance of losing any given investment.

So being more precise about what is meant by "probability" dissolves the Doomsday argument.
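The difference between questions A and B can be illustrated with a toy calculation. All the numbers below are hypothetical stand-ins chosen for this sketch (four equally likely worlds of 0, 10, 100 and 1000 humans, rooms numbered from 1, "last 90%", and an observed room number of 100); they are not asserted to be the values used above.

```python
from fractions import Fraction

WORLD_SIZES = [0, 10, 100, 1000]  # hypothetical, equally likely worlds
LAST = Fraction(9, 10)            # hypothetical "last 90%" threshold
ROOM = 100                        # hypothetical observed room number

def in_last_fraction(room, size):
    """True if `room` (numbered from 1) is among the last 90% of `size` rooms."""
    return room > size * (1 - LAST)

def question_a(sizes):
    """Average, over non-empty worlds, of the within-world proportion."""
    props = [
        Fraction(sum(in_last_fraction(r, s) for r in range(1, s + 1)), s)
        for s in sizes if s > 0
    ]
    return sum(props) / len(props)

def question_b(sizes, room):
    """Proportion of occupants of `room`, across worlds, in the last 90%."""
    compatible = [s for s in sizes if s >= room]
    hits = sum(in_last_fraction(room, s) for s in compatible)
    return Fraction(hits, len(compatible))

print(question_a(WORLD_SIZES), question_b(WORLD_SIZES, ROOM))  # 9/10 1/2
```

Under these assumptions, A stays at 9/10 even after discarding the worlds too small to contain the observed room, while B comes out at 1/2. Both are correct answers to different questions.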

6.2 Presumptuous philosopher

Nick Bostrom introduced the presumptuous philosopher thought experiment to illustrate a paradox of SIA:

It is the year 2100 and physicists have narrowed down the search for a theory of everything to only two remaining plausible candidate theories: T1 and T2 (using considerations from super-duper symmetry). According to T1 the world is very, very big but finite and there are a total of a trillion trillion observers in the cosmos. According to T2, the world is very, very, very big but finite and there are a trillion trillion trillion observers. The super-duper symmetry considerations are indifferent between these two theories. Physicists are preparing a simple experiment that will falsify one of the theories. Enter the presumptuous philosopher: “Hey guys, it is completely unnecessary for you to do the experiment, because I can already show you that T2 is about a trillion times more likely to be true than T1!”

The first thing to note is that the presumptuous philosopher (PP) may not even be right under SIA. We could ask:

  • A. What proportion of the observers exactly like the PP are in the T2 universes relative to the T1 universes?

Recall that SIA is independent of reference class, so adding "exactly like the PP" doesn't change this. So, what is the answer to A.?

Now, T2 universes have a trillion times more observers than the T1 universes, but that doesn't necessarily mean that the PP is more likely to be in them. Suppose that everyone in these universes knows their rank of birth; for the PP it's the number 24601.

Then since all universes have more than 24601 inhabitants, the PP is equally likely to exist in T1 universes as in T2 universes; the proportion is therefore 1:1 (interpreting "the super-duper symmetry considerations are indifferent between these two theories" as meaning "the two theories are equally likely").

Suppose, however, that the PP does not know their rank, and the T2 universes are akin to a trillion independent copies of the T1 universes, each of which has an independent chance of generating an exact copy of the PP.

Then SIA would indeed shift the odds by a factor of a trillion, giving a proportion of roughly a trillion to one in favour of T2. But this is not so much a paradox, as the PP is correctly thinking "if all the exact copies of me in the multiverse of possibilities were to guess we were in T2 universes, only one in a trillion of them would be wrong".

But if instead we were to ask:

    1. What is the average proportion of PPs among other observers, in T1 versus T2 universes?

Then we would get the SSA answer. If the PPs know their birth rank, this is a proportion of a trillion to one in favour of T1 universes. That's because there is just one PP in each universe, and a trillion times more people in the T2 universes, which dilutes the proportion.

If the PP doesn't know their birth rank, then this proportion is the same[9] in the T1 and T2 universes. In probability terms, this would mean a "probability" of 50% for T1 and T2.
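The four cases (SIA or SSA, birth rank known or unknown) can be tabulated with a short sketch. The observer counts come from Bostrom's description; the function names, the equal prior, and the modelling of "copies of the PP" as a simple count per universe are assumptions made for this illustration.

```python
from fractions import Fraction

N1, N2 = 10**24, 10**36  # observers under T1 and T2 (trillion trillion / trillion trillion trillion)
PRIOR = Fraction(1, 2)   # "equally likely" reading of the symmetry considerations

def sia_t2(copies_t1, copies_t2):
    """SIA-style weight: prior times the number of exact PP copies."""
    w1, w2 = PRIOR * copies_t1, PRIOR * copies_t2
    return w2 / (w1 + w2)

def ssa_t2(copies_t1, copies_t2):
    """SSA-style weight: prior times the proportion of PPs among all observers."""
    w1 = PRIOR * Fraction(copies_t1, N1)
    w2 = PRIOR * Fraction(copies_t2, N2)
    return w2 / (w1 + w2)

known_rank = (1, 1)           # exactly one PP (rank 24601) in either kind of universe
unknown_rank = (1, N2 // N1)  # PP copies scale with population: a trillion times more in T2

for label, copies in (("known rank", known_rank), ("unknown rank", unknown_rank)):
    print(label, float(sia_t2(*copies)), float(ssa_t2(*copies)))
```

With a known rank, SIA gives 1/2 and SSA strongly favours T1; with an unknown rank the roles swap, with SIA strongly favouring T2 and SSA giving 1/2.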

6.3 Anthropics and grabby aliens

The other paradoxes of anthropic reasoning can be treated similarly to the above. Now let's look at a more recent use of anthropics, due to Robin Hanson, Daniel Martin, Calvin McCarter, and Jonathan Paulson.

The basic scenario is one in which a certain number of alien species are "grabby": they will expand across the universe, at almost the speed of light, and prevent any other species of intelligent life from evolving independently within their expanding zone of influence[10].

Humanity has not noticed any grabby aliens in the cosmos; so we are not within their zone of influence. If they had started nearby and some time ago - say within the Milky Way and half a million years ago - then they would be here by now.

What if grabby aliens recently evolved a few billion light years away? Well, we wouldn't see them until a few billion years have passed. So we're fine. But if humans had instead evolved several billion years in the future, then we wouldn't be fine: the grabby aliens would have reached this location before then, and prevented us evolving, or at least would have affected us.

Robin Hanson sees this as an anthropic solution to a puzzle: why did humanity evolve early, i.e. only 13.8 billion years after the Big Bang? We didn't evolve as early as we possibly could - the Earth is a latecomer among Earth-like planets. But the smaller stars will last for trillions of years. Most habitable epochs in the history of the galaxy will be on planets around these small stars, way into the future.

One possible solution to this puzzle is grabby aliens. If grabby aliens are likely (but not too likely), then we could only have evolved in this brief window before they reached us. I mentioned that SIA doesn't work for this (for the same reason that it doesn't care about the Doomsday argument). Robin Hanson then responded:

If your theory of the universe says that what actually happened is way out in the tails of the distribution of what could happen, you should be especially eager to find alternate theories in which what happened is not so far into the tails. And more willing to believe those alternate theories because of that fact.

That is essentially Bayesian reasoning. If you have two theories, T1 and T2, and your observations are very unlikely given T1 but more likely given T2, then this gives extra weight to T2.
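For concreteness, here is the update as a minimal sketch. The theory labels T1 and T2 and the likelihood values (1/100 versus 1/2) are made-up numbers for illustration only.

```python
from fractions import Fraction

def bayes_update(priors, likelihoods):
    """Posterior over theories: prior times likelihood, renormalised."""
    weights = {theory: priors[theory] * likelihoods[theory] for theory in priors}
    total = sum(weights.values())
    return {theory: w / total for theory, w in weights.items()}

posterior = bayes_update(
    priors={"T1": Fraction(1, 2), "T2": Fraction(1, 2)},
    # Hypothetical: the observation is unlikely under T1, fairly likely under T2.
    likelihoods={"T1": Fraction(1, 100), "T2": Fraction(1, 2)},
)
print(posterior)
```

The theory under which the observation was less surprising ends up with most of the posterior weight (here 50/51 for T2).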

Here we could have three theories:

  1. T0: "There are grabby aliens nearby"
  2. T1: "There are grabby aliens a moderate distance away"
  3. T2: "Any grabby aliens are very far away"

T0 can be ruled out by the fact that we exist. Theory T1 posits that humans could not have evolved much later than we did (or else the grabby aliens would have stopped us). Theory T2 allows for the possibility that humans evolved much later than we did. So, from T2's perspective, it is "surprising" that we evolved so early; from T1's perspective, it isn't, as this is the only possible window.

But by "theory of the universe", Robin Hanson meant not only the theory of how the physical universe was, but the anthropic probability theory. The main candidates are SIA and SSA. SIA is indifferent between T1 and T2. But SSA prefers T1 (after updating on the time of our evolution). So we are more surprised under SIA than under SSA, which, in Bayesian/Robin reasoning, means that SSA is more likely to be correct.

But let's not talk about anthropic probability theories; let's instead see what questions are being answered. SIA is equivalent with asking the question:

  1. What proportion of universes with humans exactly like us have moderately close grabby aliens (T1) versus very distant grabby aliens (T2)?

Or, perhaps more relevant to our future:

  1. In what proportion of universes with humans exactly like us would those humans, upon expanding in the universe, encounter grabby aliens (T1) or not encounter them (T2)?

In contrast, the question SSA is asking is:

  1. What is the average proportion of humans among all observers, in universes where there are moderately close grabby aliens (T1) versus very distant grabby aliens (T2)?

If we were launching an interstellar exploration mission, and were asking ourselves what "the probability" of encountering grabby alien life was, then question 1. seems a closer phrasing of that than question 2. is.

And question 2. has the usual reference class problems. I said "observers", but I could have defined this narrowly as "human observers"; in which case it would have given a more SIA-like answer. Or I could have defined it expansively as "all observers, including those that might have been created by grabby aliens"; in that case SSA ceases to prioritise T1 theories and may prioritise T2 ones instead. In that case, humans are indeed "way out in the tails", given T1: we are the very rare observers that have not seen or been created by grabby aliens.

In fact, the same reasoning that prefers SSA in the first place would have preferences over the reference class. The narrowest reference classes are the least surprising - given that we are humans in the 21st century with this history, how surprising is it that we are humans in the 21st century with this history? - so they would be "preferred" by this argument.

But the real response is that Robin is making a category error. If we substitute "question" for "theory", we can transform his point into:

If your question about the universe gets a very surprising answer, you should be especially eager to ask alternate questions with less surprising answers. And more willing to believe those alternate questions.


  1. We could ask some variants of questions 3. and 4., by maybe counting causally disconnected segments of universes as different universes (this doesn't change questions 1. and 2.). We'll ignore this possibility in this post. ↩︎

  2. And also assuming that the radio's description of the situation is correct! ↩︎

  3. Notice here that I've counted off observers with other observers that have exactly the same probability of existing. To be technical, the question which gives SIA probabilities should be "what proportion of potential observers, weighted by their probability of existing, have X?" ↩︎

  4. More accurately: probability-weighted proportion. ↩︎

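  5. Let $\mathcal{W}$ be a set of worlds and $p$ a probability distribution over $\mathcal{W}$, and write $a(W)$ and $b(W)$ for the left-lane and tunnel proportions in world $W$, with $a(W) = b(W)/100$ in every world. Then the expectation of $a$ is $\sum_{W \in \mathcal{W}} p(W)\,a(W) = \sum_{W \in \mathcal{W}} p(W)\,b(W)/100 = \frac{1}{100}\sum_{W \in \mathcal{W}} p(W)\,b(W)$, which is $1/100$ times the expectation of $b$. ↩︎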

  6. If we replace "observers" with "observer moments", then this question is equivalent with the probability generated by the Strong Self-Sampling Assumption (SSSA). ↩︎

  7. If you forget some observations, your reference class can increase, as previously different copies become indistinguishable. ↩︎

  8. Assuming the population is divisible by . ↩︎

  9. As usual with SSA and this kind of question, this depends on how you define the reference class of "other observers", and who counts as a PP. ↩︎

  10. This doesn't mean they will sterilise planets or kill other species; just that any being evolving within their control will be affected by them and know that they're around. Hence grabby aliens are, by definition, not hidden from view. ↩︎

10 comments

Didn't check LessWrong for a month, almost missed this post. I have followed your work on anthropics for quite some time. I want to ask a quick question:

An alternate characterisation of the SIA and SSA questions could be to ask, "If I said 'I have X', would I want most of my copies to be correct (SIA) or my copies to be correct in most universes (SSA)?"

What if I just want to be correct. Not giving any thought to other copies at all? Do you consider that response invalid?

For some context, consider this example: tonight during my sleep an alien Omega is going to toss a coin. If it lands Tails, it splits me in halves and makes two copies of me by molecularly cloning the missing half of each. The process is accurate enough that the resulting copies won't be able to tell. If Heads, Omega won't do anything. After waking up, Omega will ask one/both copies about the coin toss (to guess the result or give a probability).

Now I wake up from the experiment; how should I guess the result? I think it can be answered without even caring about what the other copy of me (if it exists) thinks. I can participate in this experiment repeatedly and will experience about equal numbers of Heads and Tails.

In the classical sleeping beauty problem, if I guess the coin was tails, I will be correct in 50% of the experiments, and in 67% of my guesses. Whether you score by "experiments" or by "guesses" gives a different optimal performance.

In the classical sleeping beauty problem, if I guess the coin was tails, I will be correct in 50% of the experiments, and in 67% of my guesses.

In this case, how do you define "my guesses"? Does that mean guesses made by the same physical person? That would lead to different answers in the sleeping beauty problem vs the above cloning/splitting problem.

To translate my position from the previous comment would be "my guesses" are primitively clear to me because I have the subjective experience of them. So when faced with the question "would I want most of my copies to be correct (SIA) or my copies to be correct in most universes (SSA)?", I would simply say "I just want myself to be correct." Do you think that is an invalid position?

Put it in the classical sleeping beauty problem, the exact duration of the experiment is inconsequential. The two awakenings can be one day apart (as the usual formulation), or a week apart, or an hour apart. It is still the same experiment/problem. I think we can all agree on that.

So imagine I wake up in the experiment. I can enter another iteration of the experiment right away, as long as the second experiment finishes before the potentially incoming memory wipe of the first experiment. For example, let the first experiment have awakenings 1 day apart, the second experiment awakenings 1/2 day apart, the third experiment awakenings 1/4 day apart, etc. Theoretically, the experiment can be repeated infinitely many times. (This is of course assuming the actual awakenings take insignificant time and the memory wipes happen just before the second awakening.) When repeating the sleeping beauty problem this way, I have a clear track of how many iterations I have entered, what my guesses were in those iterations, and how many of them were right. And that number would approach 50% if I guess tails every time.

In this process, I never have to consider "this awakening" as a member of any reference class. Do you think "keeping the score" this way invalid?

In this process, I never have to consider "this awakening" as a member of any reference class. Do you think "keeping the score" this way invalid?

Different ways of keeping the score give different answers. So, no, I don't think that's invalid.

But by "theory of the universe", Robin Hanson meant not only the theory of how the physical universe was, but the anthropic probability theory. The main candidates are SIA and SSA. SIA is indifferent between T1 and T2. But SSA prefers T1 (after updating on the time of our evolution).

SIA is not indifferent between T1 and T2. There are way more humans in world T1 than in world T2 (since T2 requires life to be very uncommon, which would imply that humans are even more uncommon), so SIA thinks world T1 is much more likely. After all, the difference between SIA and SSA is that SIA thinks that universes with more observers are proportionally more likely; so SIA will always think aliens are more likely than SSA does.

Previously, I thought this was in conflict with the fact that humans didn't seem to be particularly early (ie., if life is common, it's surprising that there aren't any aliens around 13.8 billion years into the universe's life span). I ran the numbers, and concluded that SIA still thought that we'd be very likely to encounter aliens (though most of the linked post instead focuses on answering the decision-relevant question "how much of potentially-colonisable space would be colonised without us?", evaluated ADT-style).

After having read Robin's work, I now think humans probably are quite early, which would imply that (given SIA/ADT) it is highly overdetermined that aliens are common. As you say, Robin's work also implies that SSA agrees that aliens are common. So that's nice: no matter which of these questions we ask, we get a similar answer.

I didn't fully define those theories, and, indeed, if they depended on commonness of life, then SIA would prefer T1.

But if I posited instead that T1 and T2 differ only in the propensity for aliens to become grabby or not, then SIA would indeed be indifferent between them.

Good point, I didn't think about that. That's the old SIA argument for there being a late filter.

The reason I didn't think about it is because I use SIA-like reasoning in the first place because it pays attention to the stakes in the right way: I think I care about acting correctly in universes with more copies of me almost-proportionally more. But I also care more about universes where civilisations-like-Earth are more likely to colonise space (ie become grabby), because that means that each copy of me can have more impact. That kind-of cancels out the SIA argument for a late filter, mostly leaving me with my priors, which points toward a decent probability that any given civilisation colonises space in a grabby manner.

Also: if Earth-originating intelligence ever becomes grabby, that's a huge Bayesian update in favor of other civilisations becoming grabby, too. So regardless of how we describe the difference between T1 and T2, SIA will definitely think that T1 is a lot more likely once we start colonising space, if we ever do that.

So regardless of how we describe the difference between T1 and T2, SIA will definitely think that T1 is a lot more likely once we start colonising space, if we ever do that.

SIA isn't needed for that; standard probability theory will be enough (as our becoming grabby is evidence that grabbiness is easier than expected, and vice-versa).

I think there's a confusion with SIA and reference classes and so on. If there are no other exact copies of me, then SIA is just standard Bayesian update on the fact that I exist. If theory T_i has prior probability p_i and gives a probability q_i of me existing, then SIA changes its probability to q_i*p_i (and renormalises).

Effects that increase the expected number of other humans, other observers, etc... are indirect consequences of this update. So a theory that says life in general is easy also says that me existing is easy, so gets boosted. But "Earth is special" theories also get boosted: if a theory claims life is very easy but only on Earth-like planets, then those also get boosted.
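The update described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sia_update`, the three theory labels, and every number below are made up for the example and do not come from any real estimate. It shows that a "life is easy everywhere" theory and a "life is easy, but only on Earth-like planets" theory get the same boost when they assign the same probability to me existing.

```python
def sia_update(priors, existence_probs):
    """Reweight each prior p_i by q_i (probability the theory gives to my existing), then renormalise."""
    weighted = [p * q for p, q in zip(priors, existence_probs)]
    total = sum(weighted)
    if total == 0:
        raise ValueError("every theory gives zero probability to my existing")
    return [w / total for w in weighted]


# Hypothetical theories; labels and numbers are illustrative assumptions only.
theories = [
    "life is hard everywhere",
    "life is easy everywhere",
    "life is easy, but only on Earth-like planets",
]
priors = [0.5, 0.25, 0.25]           # p_i
existence_probs = [0.001, 0.1, 0.1]  # q_i: probability of me existing under each theory

posteriors = sia_update(priors, existence_probs)

for name, prior, posterior in zip(theories, priors, posteriors):
    print(f"{name}: prior {prior:.3f} -> posterior {posterior:.3f}")
```

Nothing in the function refers to the number of other observers; any boost to theories with more observers comes only through the q_i values fed in.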

SIA isn't needed for that; standard probability theory will be enough (as our becoming grabby is evidence that grabbiness is easier than expected, and vice-versa).
I think there's a confusion with SIA and reference classes and so on. If there are no other exact copies of me, then SIA is just standard Bayesian update on the fact that I exist. If theory T_i has prior probability p_i and gives a probability q_i of me existing, then SIA changes its probability to q_i*p_i (and renormalises).

Yeah, I agree with all of that. In particular, SIA updating on us being alive on Earth is exactly as if we sampled a random planet from space, discovered it was Earth, and discovered it had life on it. Of course, there are also tons of planets that we've seen that don't look like they have life on them.

But "Earth is special" theories also get boosted: if a theory claims life is very easy but only on Earth-like planets, then those also get boosted.

I sort-of agree with this, but I don't think it matters in practice, because we update down on "Earth is unlikely" when we first observe that the planet we sampled was Earth-like.


Here's a model: Assume that there's a conception of "Earth-like planet" such that life-on-Earth is exactly equal evidence for life emerging on any Earth-like planet, and 0 evidence for life emerging on other planets. This is clearly a simplification, but I think it generalises. "Earth-like planet" could be any rocky planet, any rocky planet with water, any rocky planet with water that was hit by an asteroid X years into its lifespan, etc.

Now, if we sample a planet (Earth) and notice that it's Earth-like and has life on it, we do two updates:

  • Noticing that Earth is an Earth-like planet should update us towards thinking that Earth-like planets are common in the universe.
  • Noticing that life emerged on Earth should update us towards thinking that life has a high probability of emerging on Earth-like planets.

If we don't know anything else about the universe yet, these two updates should collectively imply an update towards life-is-common that is just as big as if we hadn't done this decomposition, and just updated on the hypothesis "how common is life?" in the first place.
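The equivalence claimed here can be checked with a small exact calculation. The sketch below rests on assumptions that are not in the original comment: `e` is the fraction of planets that are Earth-like, `l` is the probability of life on an Earth-like planet, both get an independent uniform prior on a coarse grid, and "how common is life" is taken to be `f = e * l`. It compares the two-step update with a single update on `f` directly.

```python
from collections import defaultdict
from fractions import Fraction

# Assumed toy prior: e and l independent and uniform on {0.1, 0.2, ..., 1.0}.
grid = [Fraction(k, 10) for k in range(1, 11)]
prior = {(e, l): Fraction(1, len(grid) ** 2) for e in grid for l in grid}

# Route 1: decomposed update on the sampled planet (Earth).
# Step 1: it is Earth-like (likelihood e). Step 2: it has life (likelihood l).
step1 = {(e, l): p * e for (e, l), p in prior.items()}
step2 = {(e, l): w * l for (e, l), w in step1.items()}
norm = sum(step2.values())
mean_decomposed = sum(e * l * w for (e, l), w in step2.items()) / norm

# Route 2: collapse the prior onto f = e * l, then update once on
# "the sampled planet has life" (likelihood f).
prior_f = defaultdict(Fraction)
for (e, l), p in prior.items():
    prior_f[e * l] += p
weighted_f = {f: p * f for f, p in prior_f.items()}
norm_f = sum(weighted_f.values())
mean_direct = sum(f * w for f, w in weighted_f.items()) / norm_f

prior_mean = sum(e * l * p for (e, l), p in prior.items())
print("prior mean of f:         ", float(prior_mean))
print("posterior mean, 2 steps: ", float(mean_decomposed))
print("posterior mean, direct:  ", float(mean_direct))
```

The two routes agree because the combined likelihood `e * l` depends on the parameters only through `f`, so the choice of grid and prior affects the numbers but not the agreement.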

Now, let's say we start observing the rest of the universe. Let's assume this happens via sampling random planets and observing (a) whether they are/aren't Earth-like (b) whether they do/don't have life on them.

  • If we sample a non-Earth-like planet, we update towards thinking that Earth-like planets aren't common.
  • If we sample an Earth-like planet without life, we update towards thinking that Earth-like planets have a lower probability of supporting life.

I haven't done the math, but I'm pretty sure that it doesn't matter which of these we observe. The update on "How common is life?" will be the same regardless. So the existence of "Earth is special"-hypotheses doesn't matter for our best-guess of "How common is life?", if we only consider the impact of observing planets with/without Earth-like features and life.


Of course, observing planets isn't the only way we can learn about the universe. We can also do science, and reason about the likely reasons that life emerged, and how common those things ought to be.

That means that if you can come up with a strong theoretical argument (that isn't just based on observing how many planets are Earth-like and/or had life on them, including Earth) that some feature of Earth significantly boosts the probability of life and that that feature is extremely rare in the universe at-large, then that would be a solid argument for why to expect life to be rare in the universe. However, note that you'd have to argue that it was extremely rare. If we're assuming that grabby aliens could travel over many galaxies, then we've already observed evidence that grabby life is sufficiently rare to not yet have appeared in any of a very large number of planets in any of a very large number of galaxies. Your theoretical reasons to expect life to be rare would have to assert that it's even rarer than that to impact the results.

Someday your review of this will mention subjective probability questions in addition to the frequency questions :P The "I have a probabilistic model of the world and I want to compute anthropic probabilities using this model" kind of thing.

A related interesting approach to anthropics is Solomonoff induction. You treat the entire process generating your subjective experience as a Turing machine and ask about what it's likely to do next. In some sense this drops the whole notion of "world" out entirely, and just generally breaks the mold