Updateless anthropics

by Stuart_Armstrong, 20th Feb 2011



Three weeks ago, I set out to find a new theory of anthropics, to try and set decision theory on a firm footing with respect to copying, deleting copies, merging them, correlated decisions, and the presence or absence of extra observers. I've since come full circle, and realised that UDT already has a built-in anthropic theory that resolves a lot of the problems that had been confusing me.

The theory is simple, and is essentially a rephrasing of UDT: if you are facing a decision X, and trying to figure out the utility of X=a for some action a, then calculate the full expected utility of X being a, given the objective probabilities of each world (including those in which you don't exist).

As usual, you have to consider the consequences of X=a for all agents who will make the same decision as you, whether they be exact copies, enemies, simulations or similar-minded people. However, your utility will have to do more work than is usually realised: notions such as selfishness or altruism with respect to your copies have to be encoded in the utility function, and will result in substantially different behaviour.

The rest of the post is a series of case studies illustrating this theory. Utility is assumed to be linear in cash for convenience.

Sleeping with the Presumptuous Philosopher 

The first test case is the Sleeping Beauty problem.


In its simplest form, this involves a coin toss; if it comes out heads, one copy of Sleeping Beauty is created. If it comes out tails, two copies are created. Then the copies are asked at what odds they would be prepared to bet that the coin came out tails. You can assume either that the different copies care for each other in the manner I detailed here, or more simply that all winnings will be kept by a future merged copy (or an approved charity).

Then the algorithm is simple: the two worlds have equal probability. Let X be the decision where Sleeping Beauty decides between a contract that pays out $1 if the coin is heads, versus one that pays out $1 if the coin is tails. If X="heads" (to use an obvious shorthand), then Sleeping Beauty will expect to make $1*0.5, as she is offered the contract once. If X="tails", then the total return of that decision is $1*2*0.5, as copies of her will be offered the contract twice, and they will all make the same decision. So Sleeping Beauty will follow the SIA 2:1 betting odds of tails over heads.
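The arithmetic above can be checked with a minimal sketch. The function name `copy_altruistic_return` and its parameters are illustrative assumptions, not part of the original post; the only inputs taken from the text are the fair coin, one copy on heads, two copies on tails, and a $1 payout per winning contract.

```python
from fractions import Fraction

def copy_altruistic_return(bet, copies_on_tails=2, payout=1):
    """Expected total winnings, summed over all copies, if every copy picks `bet`.

    Assumes (as in the text) that winnings are pooled, e.g. kept by a merged copy.
    """
    p_world = Fraction(1, 2)  # objective probability of each coin outcome
    if bet == "heads":
        # Heads world: a single copy is offered the contract once.
        return p_world * 1 * payout
    if bet == "tails":
        # Tails world: every copy is offered the contract and all decide alike.
        return p_world * copies_on_tails * payout
    raise ValueError("bet must be 'heads' or 'tails'")

heads_return = copy_altruistic_return("heads")
tails_return = copy_altruistic_return("tails")
odds_tails_over_heads = tails_return / heads_return
print(heads_return, tails_return, odds_tails_over_heads)
```

Raising `copies_on_tails` to a few thousand gives the "extreme Sleeping Beauty" odds in the same way.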

Variants such as "extreme Sleeping Beauty" (where thousands of copies are created on tails) will behave in the same way; if it feels counter-intuitive to bet at thousands-to-one odds that a fair coin landed tails, it's the fault of expected utility itself, as the rewards of being right dwarf the costs of being wrong.

But now let's turn to the Presumptuous Philosopher, a thought experiment that is often confused with Sleeping Beauty. Here we have exactly the same setup as "extreme Sleeping Beauty", but the agents (the Presumptuous Philosophers) are mutually selfish. Here the return to X="heads" remains $1*0.5. However the return to X="tails" is also $1*0.5, since even if all the Presumptuous Philosophers in the "tails" universe bet on "tails", each one will still only get $1 in utility. So the Presumptuous Philosopher should only take even (1:1) SSA betting odds on the result of the coin flip.
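A hedged sketch of the contrast between the two agents, under the same assumptions as the text (fair coin, $1 payout, many copies on tails). The function `expected_return`, the `selfish` flag and the value N = 1000 are hypothetical names and numbers chosen for illustration.

```python
from fractions import Fraction

def expected_return(bet, copies_on_tails, selfish, payout=1):
    """Expected utility of the policy "every copy picks `bet`".

    selfish=True  : an agent counts only its own winnings (Presumptuous Philosopher).
    selfish=False : an agent counts the winnings of all copies (Sleeping Beauty).
    """
    p_world = Fraction(1, 2)
    copies_paid = 1 if bet == "heads" else copies_on_tails
    counted = 1 if selfish else copies_paid
    return p_world * counted * payout

N = 1000  # illustrative number of copies in the tails world
pp_heads = expected_return("heads", N, selfish=True)
pp_tails = expected_return("tails", N, selfish=True)
sb_heads = expected_return("heads", N, selfish=False)
sb_tails = expected_return("tails", N, selfish=False)
print(pp_heads, pp_tails, sb_heads, sb_tails)
```

The only thing that changes between the two agents is which winnings the utility function counts; the world probabilities are the same in both calls.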

So SB acts as though she follows the self-indication assumption (SIA), while the PP acts as though he follows the self-sampling assumption (SSA). This remains true if we change the setup so that only one agent is given a betting opportunity in the tails universe. Then the objective probability of any one agent being asked is low, so both SB and PP model the "objective probability" of the tails world, given that they have been asked to bet, as being low. However, SB gains utility if any of her copies is asked to bet and receives a profit, so the strategy "if I'm offered $1 if I guess correctly whether the coin is heads or tails, I will say tails" gets her $1*0.5 utility whether or not she is the specific one who is asked. Betting heads nets her the same result, so SB will give SIA 1:1 odds in this case.

On the other hand, the PP will only gain utility in the very specific world where he himself is asked to bet. So his gain from the updateless "if I'm offered $1 if I guess correctly whether the coin is heads or tails, I will say tails" is tiny, as he's unlikely to be asked to bet. Hence he will offer the SSA odds that make heads a much more "likely" proposition.
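The single-bet variant can be sketched the same way. This is an illustration under stated assumptions: `single_bet_return` and N = 1000 are hypothetical, and the one agent asked in the tails world is assumed to be picked uniformly at random.

```python
from fractions import Fraction

def single_bet_return(bet, copies_on_tails, selfish, payout=1):
    """Expected utility of the policy "if asked, say `bet`" when only one agent is asked."""
    p_world = Fraction(1, 2)
    if bet == "heads":
        # Heads world: the lone agent is certainly the one asked.
        return p_world * payout
    if selfish:
        # Tails world: a selfish agent only gains if he himself is the one asked.
        p_i_am_asked = Fraction(1, copies_on_tails)
        return p_world * p_i_am_asked * payout
    # Tails world: a copy-altruistic agent gains whichever copy is asked.
    return p_world * payout

N = 1000  # illustrative number of copies in the tails world
sb_odds = single_bet_return("heads", N, selfish=False) / single_bet_return("tails", N, selfish=False)
pp_odds = single_bet_return("heads", N, selfish=True) / single_bet_return("tails", N, selfish=True)
print(sb_odds, pp_odds)
```

Here `sb_odds` and `pp_odds` are the heads-to-tails ratios of expected utility for each agent.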

The Doomsday argument

Now, using SSA odds brings us back into the realm of the classical Doomsday argument. How is it that Sleeping Beauty is immune to the Doomsday argument while the Presumptuous Philosopher is not? Which one is right; is the world really about to end?

Asking about probabilities independently of decisions is meaningless here; instead, we can ask what would agents decide in particular cases. It's not surprising that agents will reach different decisions on such questions as, for instance, existential risk mitigation, if they have different preferences.

Let's do a very simplified model, where there are two agents in the world, and one of them is approached at random to see if they would pay $Y to add a third agent. Each agent derives a (non-indexical) utility of $1 for the presence of this third agent, and nothing else happens in the world to increase or decrease anyone's utility.

First, let's assume that each agent is selfish about their indexical utility (their cash in the hand). If the decision is to not add a third agent, all will get $0 utility. If the decision is to add a third agent, then there are three agents in the world, and one of them will be approached to lose $Y. Hence the expected utility is $(1-Y/3).

Now let us assume the agents are altruistic towards each other's indexical utilities. Then the expected utility of not adding a third agent is still $0. If the decision is to add a third agent, then there are three agents in the world, and one of them will be approached to lose $Y - but all will value that loss at the same amount. Hence the expected utility is $(1-Y).

So if $Y=$2, for instance, the "selfish" agents will add the third agent, and the "altruistic" ones will not. So generalising this to more complicated models describing existential risk mitigation schemes, we would expect SB-type agents to behave differently to PP-types in most models. There is no sense in asking which one is "right" and which one gives the more accurate "probability of doom"; instead ask yourself which better corresponds to your own utility model, hence what your decision will be.
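A small sketch of the two utility calculations in this toy model. The function `utility_of_adding` and its parameter names are assumptions made for illustration; the $1 benefit, the three agents and the formulas 1-Y/3 and 1-Y are taken from the text.

```python
from fractions import Fraction

def utility_of_adding(cost, altruistic, benefit=1, agents_after=3):
    """Expected utility, for one agent, of the policy "pay `cost` to add a third agent".

    Not adding the agent is worth 0 in both cases.
    """
    cost = Fraction(cost)
    if altruistic:
        # Everyone values the payer's loss in full, whoever pays.
        return benefit - cost
    # A selfish agent bears the loss only if it is the one approached.
    return benefit - cost / agents_after

Y = 2
selfish_value = utility_of_adding(Y, altruistic=False)
altruistic_value = utility_of_adding(Y, altruistic=True)
print(selfish_value > 0, altruistic_value > 0)
```

With Y = 2 the selfish value is positive and the altruistic value is negative, matching the decisions described above.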

Psy-Kosh's non-anthropic problem

Cousin_it has a rephrasing of Psy-Kosh's non-anthropic problem to which updateless anthropics can be illustratively applied:

You are one of a group of 10 people who care about saving African kids. You will all be put in separate rooms, then I will flip a coin. If the coin comes up heads, a random one of you will be designated as the "decider". If it comes up tails, nine of you will be designated as "deciders". Next, I will tell everyone their status, without telling the status of others. Each decider will be asked to say "yea" or "nay". If the coin came up tails and all nine deciders say "yea", I donate $1000 to VillageReach. If the coin came up heads and the sole decider says "yea", I donate only $100. If all deciders say "nay", I donate $700 regardless of the result of the coin toss. If the deciders disagree, I don't donate anything.

We'll set aside the "deciders disagree" and assume that you will all reach the same decision. The point of the problem was to illustrate a supposed preference inversion: if you coordinate ahead of time, you should all agree to say "nay", but after you have been told you're a decider, you should update in the direction of the coin coming up tails, and say "yea".

From the updateless perspective, however, there is no mystery here: the strategy "if I were a decider, I would say nay" maximises utility both for the deciders and the non-deciders.
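As a quick check of the shared-utility case, here is a minimal sketch that compares the two unanimous strategies by expected donation. The function name `expected_donation` is illustrative; the payoffs ($1000, $100, $700) and the fair coin come from the problem statement.

```python
from fractions import Fraction

def expected_donation(strategy):
    """Expected donation when every decider follows the same `strategy`."""
    p_heads = p_tails = Fraction(1, 2)
    if strategy == "nay":
        return p_heads * 700 + p_tails * 700
    if strategy == "yea":
        return p_heads * 100 + p_tails * 1000
    raise ValueError("strategy must be 'yea' or 'nay'")

nay_value = expected_donation("nay")
yea_value = expected_donation("yea")
print(nay_value, yea_value)
```

Everyone in the group values the donation equally, so being told "you are a decider" does not change which policy has the higher expected value.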

But what if the problem were rephrased in a more selfish way, with the non-deciders not getting any utility from the setup (maybe they don't get to see the photos of the grateful saved African kids), while the deciders got the same utility as before? Then the strategy "if I were a decider, I would say yea" maximises your expected utility, because non-deciders get nothing, thus reducing the expected utility gains and losses in the world where the coin came out heads. This is similar to SIA odds, again.
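The selfish variant can be sketched by weighting each world by the chance that you personally are a decider (1 in 10 on heads, 9 in 10 on tails, as in the problem statement). The function name `selfish_expected_utility` is an assumption for illustration, and utility is taken to equal the dollar amount donated.

```python
from fractions import Fraction

def selfish_expected_utility(strategy):
    """Expected utility for one person who only benefits when they are a decider."""
    p_heads = p_tails = Fraction(1, 2)
    p_decider_heads = Fraction(1, 10)  # one decider out of ten
    p_decider_tails = Fraction(9, 10)  # nine deciders out of ten
    payoff_heads, payoff_tails = (100, 1000) if strategy == "yea" else (700, 700)
    return (p_heads * p_decider_heads * payoff_heads
            + p_tails * p_decider_tails * payoff_tails)

selfish_yea = selfish_expected_utility("yea")
selfish_nay = selfish_expected_utility("nay")
print(selfish_yea, selfish_nay)
```

The world probabilities are unchanged; only the utility function differs from the shared case, and that is enough to flip the preferred strategy.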

That second model is similar to the way I argued for SIA with agents getting created and destroyed. That post has been superseded by this one, which pointed out the flaw in the argument, which was (roughly speaking) not considering setups like Psy-Kosh's original model. So once again, whether utility is broadly shared or not affects the outcome of the decision.

The Anthropic Trilemma

Eliezer's anthropic trilemma was an interesting puzzle involving probabilities, copying, and subjective anticipation. It inspired me to come up with a way of spreading utility across multiple copies which was essentially a Sleeping Beauty copy-altruistic model. The decision process going with it is then the same as the updateless decision process outlined here. Though initially it was phrased in terms of SIA probabilities and individual impact, the isomorphism between the two can be seen here.


I remember you linked me to Radford Neal's paper (pdf) on Full Non-indexical Conditioning. I think FNC is a much nicer way to think about problems like these than SSA and SIA, but I guess you disagree?

To save others from having to wade through the paper, which is rather long, I'll try to explain what FNC means:

First, let's consider a much simpler instance of the Doomsday Argument: At the beginning of time, God tosses a coin. If heads then there will only ever be one person (call them "M"), who is created, matures and dies on Monday, and then the world ends. If tails then there will be two people, one ("M") who lives and dies on Monday and another ("T") on Tuesday. As this is a Doomsday Argument, we don't require that T is a copy of M.

M learns that it's Monday but is given no (other) empirical clues about the coin. M says to herself "Well, if the coin is heads then I was certain to find myself here on Monday, but if it's tails then there was a 1/2 chance that I'd find myself experiencing a Tuesday. Applying Bayes' theorem, I deduce that there's a 2/3 chance that the coin is heads, and that the world is going to end before tomorrow."

Now FNC makes two observations:

  1. The event "it is Monday today" is indexical. However, an "indexical event" isn't strictly speaking an event. (Because an event picks out a set of possible worlds, whereas an indexical event picks out a set of possible "centered worlds".) Since it isn't an event, it makes no sense to treat it as 'data' in a Bayesian calculation.
  2. (But apart from that) the best way to do an update is to update on everything we know.

M takes these points to heart. Rather than updating on "it is Monday" she instead updates on "there once was a person who experienced [complete catalogue of M's mental state] and that person lived on Monday."

If we ignore the (at best) remote possibility that T has exactly the same experiences as M (prior to learning which day it is) then the event above is independent of the coin toss. Therefore M should calculate a posterior probability of 1/2 that the coin is heads.
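The two updates can be compared in a short sketch. The helper `posterior_heads` and the value `q` are hypothetical: `q` stands for the (tiny, unspecified) probability that someone with M's exact mental state exists on Monday, assumed equal under heads and tails as the argument above requires.

```python
from fractions import Fraction

def posterior_heads(likelihood_heads, likelihood_tails, prior_heads=Fraction(1, 2)):
    """Bayes' theorem for a two-hypothesis coin toss."""
    prior_tails = 1 - prior_heads
    numerator = prior_heads * likelihood_heads
    return numerator / (numerator + prior_tails * likelihood_tails)

# M's original reasoning: "I find myself on Monday" has probability 1 on heads
# and 1/2 on tails.
indexical_update = posterior_heads(Fraction(1), Fraction(1, 2))

# FNC-style update on a non-indexical event whose probability q is the same
# in both worlds; the exact value of q is an arbitrary placeholder.
q = Fraction(1, 10**6)
fnc_update = posterior_heads(q, q)
print(indexical_update, fnc_update)
```

Because `q` cancels, the FNC-style posterior does not depend on its value.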

On discovering that it's Monday, M gains no evidence that the end of the world is nigh. Notice that we've reached this conclusion independently of decision theory.

If M is 'altruistic' towards T, valuing him as much as she values herself, then she should be prepared to part with one cube of chocolate in exchange for a guarantee that he'll get two if he exists. If M is 'selfish' then the exchange rate changes from 1:2 to 1:infinity. These exchange rates are not probabilities. It would be very wrong to say something like "the probability that M gives to T's existence only makes sense when we specify M's utility function, and in particular it changes from 1/2 to 0 if M switches from 'altruistic' to 'selfish'".

I used to be a great believer in FNC, but I've found it's flawed. The main problem is that it's not time-consistent.

For instance, suppose you start with some identical copies, each of whom is going to flip a coin twenty times. FNC says that before they flip, they should not believe that they are in a large universe, because they are identical.

However, after they have flipped, they will be nearly certainly very different, and so will believe that they are in a large universe.

So they know that after they flip the coin, their probability of being in a large universe will have increased, no matter what they see.

The problem isn't just restricted to when you start with identical copies; whenever you increase your memory size by one bit, say, then FNC will be slightly inconsistent (because (1+e)^-n is approximately 1-ne for small e, but not exactly).

Yes, that is definitely a problem! The variation of FNC which I described in the final section of my UDT post has each person being allowed to help themselves to a uniform random number in [0,1] - i.e. infinitely many random "coin flips", as long as they don't try to actually use the outcomes.

This solves the problem you mention, but others arise:

  1. It's hard to see how to give an independent justification of this trick.
  2. More importantly, Eliezer's tale of the Ebborians demonstrates that we can go continuously from one copy to two copies.

Actually, using (2), and variations alpha to gamma, I think I can construct a continuum of variations on Sleeping Beauty which stretch from one where the answer is unambiguously 1/3 (or 1/11 as in the link) to one where it's unambiguously 1/2.

OK, I recant and denounce myself - the idea that any sensible variation of the Sleeping Beauty problem must have a 'canonical' answer is wrong, and FNC is broken.

OK, I recant and denounce myself - the idea that any sensible variation of the Sleeping Beauty problem must have a 'canonical' answer is wrong, and FNC is broken.

Very admirable stance to take :-) I wish I could claim I found the problem and immediately renounced SIA and FNC, but it was a long process :-)

Btw, a variant similar to your alpha to gamma was presented in my post http://lesswrong.com/lw/18r/avoiding_doomsday_a_proof_of_the_selfindication ; I found the problem with that in http://lesswrong.com/lw/4fl/dead_men_tell_tales_falling_out_of_love_with_sia/

Also, I don't think probabilities are useful objects on their own; only the decisions that result from them. Different theories can get different probabilities but always the same decision...

I suppose I'm being obtuse about this, but please help me find my way through this argument.

  1. The event "it is Monday today" is indexical. However, an "indexical event" isn't strictly speaking an event. (Because an event picks out a set of possible worlds, whereas an indexical event picks out a set of possible "centered worlds".) Since it isn't an event, it makes no sense to treat it as 'data' in a Bayesian calculation.

Isn't this argument confounded by the observation that an indexical event "It is Tuesday today", in the process of ruling out several centered possible worlds--the ones occurring on Monday--also happens to rule out an entire uncentered world? If it's not an event, how does it make sense to treat it as data in a Bayesian calculation that rules out Heads? If that wasn't the event that entered into the Bayesian calculation, what was?

If that wasn't the event that entered into the Bayesian calculation, what was?

The Bayesian calculation only needs to use the event "Tuesday exists" which is non-indexical (though you're right - it is entailed by "today is Tuesday").

The problem with indexical events is that our prior is a distribution over possible worlds, and there doesn't seem to be any non-arbitrary way of deriving a distribution over centered worlds from a distribution over uncentered ones. (E.g. Are all people equally likely regardless of lifespan, brain power, state of wakefulness etc.? What if people are copied and the copies diverge from one another? Where does the first 'observer' appear in the tree of life? etc.)

The Bayesian calculation only needs to use the event "Tuesday exists"

I can't follow this. If "Tuesday exists" isn't indexical, then it's exactly as true on Monday as it is on Tuesday, and furthermore as true everywhere and for everyone as it is for anyone.

there doesn't seem to be any non-arbitrary way of deriving a distribution over centered worlds from a distribution over uncentered ones.

Indeed, unless you work within the confines of a finite toy model. But why go in that direction? What non-arbitrary reason is there not to start with centered worlds and try to derive a distribution over uncentered ones? In fact, isn't that the direction scientific method works in?

I can't follow this. If "Tuesday exists" isn't indexical, then it's exactly as true on Monday as it is on Tuesday, and furthermore as true everywhere and for everyone as it is for anyone.

Well, in my toy model of the Doomsday Argument, there's only a 1/2 chance that Tuesday exists, and the only way that a person can know that Tuesday exists is to be alive on Tuesday. Do you still think there's a problem?

Indeed, unless you work within the confines of a finite toy model.

Even in toy models like Sleeping Beauty we have to somehow choose between SSA and SIA (which are precisely two rival methods for deriving centered from uncentered distributions.)

What non-arbitrary reason is there not to start with centered worlds and try to derive a distribution over uncentered ones? In fact, isn't that the direction scientific method works in?

That's a very good, philosophically deep question! Like many lesswrongers, I'm what David Chalmers would call a "Type-A materialist" which means that I deny the existence of "subjective facts" which aren't in some way reducible to objective facts.

Therefore, I think that centered worlds can be regarded in one of two ways: (i) as nonsense or (ii) as just a peculiar kind of uncentered world: a "centered world" really just means an "uncentered world that happens to contain an ontologically basic, causally inert 'pointer' towards some being and an ontologically basic, causally inert catalogue of its 'mental facts'". However, because a "center" is causally inert, we can never acquire any evidence that the world has a "center".

(I'd like to say more but really this needs a lot more thought and I can see I'm already starting to ramble...)

I'm what David Chalmers would call a "Type-A materialist" which means that I deny the existence of "subjective facts" which aren't in some way reducible to objective facts.

The concerns Chalmers wrote about focused on the nature of phenomenal experience, and the traditional dichotomy between subjective and objective in human experience. That distinction draws a dividing line way off to the side of what I'm interested in. My main concern isn't with ineffable consciousness, it's with cognitive processing of information, information defined as that which distinguishes possibilities, reduces uncertainty and can have behavioral consequences. Consequences for what/whom? Situated epistemic agents, which I take as ubiquitous constituents of the world around us, and not just sentient life-forms like ourselves. Situated agents that process information don't need to be very high on the computational hierarchy in order to be able to interact with the world as it is, use representations of the world as they take it to be, and entertain possibilities about how well their representations conform to what they are intended to represent. The old 128MB 286 I had in the corner, which was too underpowered to run even a current version of Linux, was powerful enough to implement an instantiation of a situated Bayesian agent. I'm completely fine with stipulating that it had about as much phenomenal or subjective experience as a chunk of pavement. But I think there are useful distinctions totally missed by Chalmers' division (which I'm sure he's aware of, but not concerned with in the paper you cite), between what you might call objective facts and what you might call "subjective facts", if by the latter you include essentially indexical and contextual information, such as de se and de dicto information, as well as de re propositions.

Therefore, I think that centered worlds can be regarded in one of two ways: (i) as nonsense or (ii) as just a peculiar kind of uncentered world: a "centered world" really just means an "uncentered world that happens to contain an ontologically basic, causally inert 'pointer' towards some being and an ontologically basic, causally inert catalogue of its 'mental facts'". However, because a "center" is causally inert, we can never acquire any evidence that the world has a "center".

(On Lewis's account, centered worlds are generalizations of uncentered ones, which are contained in them as special cases.) From the point of view of a situated agent, centered worlds are epistemologically prior, about as patently obvious as the existence of "True", "False" and "Don't Know", and the uncentered worlds are secondary, synthesized, hypothesized and inferred. The process of converting limited indexical information into objective, universally valid knowledge is where all the interesting stuff happens. That's what the very idea of "calibration" is about. To know whether they (centered worlds or the other kind) are ontologically prior it's just too soon for me to tell, but I feel uncomfortable prejudging the issue on such strict criteria without a more detailed exploration of the territory on the outside of the walled garden of God's Own Library of Eternal Verity. In other words, with respect to that wall, I don't see warrant flowing from inside out, I see it flowing from outside in. I suppose that's in danger of making me an idealist, but I'm trying to be a good empiricist.

A few thoughts on cousin_it's problem:

  1. When you calculate the expected outcome for the "deciders say nay" strategy and the "deciders say yea" strategy, you already know that the deciders will be deciders. So "you are a decider" is not new information (relative to that strategy), and you shouldn't change your answer. (It may be new information relative to other strategies, where the one making the decision is an individual that wasn't necessarily going to be told "you are the decider" for the original problem. If you're told, "you are the decider", you should still conclude with 90% probability that the coin is tails.)

  2. (Possibly a rephrasing of 1.) If the deciders in the tails universe come to the same conclusion as the deciders in the heads universe about the probability of which universe they're in, one might conclude that they didn't actually get useful information about which universe they're in.

  3. (Also a rephrasing of 1.) The deciders do a pretty good job of predicting what universe they're in individually, but the situation is contrived to give the one wrong decider nine times the decision-making power. (Edit: And since you know about that trap in advance, you shouldn't fall into it.)

  4. (Isomorphic?) Perhaps "there's a 90% probability that I'm in the 'tails' universe" is the wrong probability to look at. The relevant probability is, "if nine hypothetical individuals are told 'you're a decider', there's only a 10% probability that they're all in the tails universe".
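The 90% figure mentioned in point 1 follows from a standard Bayesian update on "I am a decider", sketched here. The function name `p_tails_given_decider` is illustrative; the group size and decider counts come from the problem statement.

```python
from fractions import Fraction

def p_tails_given_decider(group_size=10, deciders_heads=1, deciders_tails=9):
    """Posterior probability of tails for one person told "you are a decider"."""
    p_heads = p_tails = Fraction(1, 2)
    like_heads = Fraction(deciders_heads, group_size)  # chance of being picked on heads
    like_tails = Fraction(deciders_tails, group_size)  # chance of being picked on tails
    return (p_tails * like_tails) / (p_tails * like_tails + p_heads * like_heads)

posterior = p_tails_given_decider()
print(posterior)
```

This is the individual's posterior only; it says nothing by itself about which joint strategy has the higher expected donation.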

[-][anonymous]11y 0
Cousin_it has a rephrasing of Psy-Kosh's non-anthropic problem

One attack on that:

The relevant probability isn't "given that I'm a decider, it's 90% likely that the coin is tails" it's "given 9 hypothetical deciders, it's 10% likely that they're all in the 'tails' universe." In fact, when deciding on the
