EDT with updating double counts

Interesting! Here's one way to look at this:

EDT+SSA-with-a-minimal-reference-class behaves like UDT in anthropic dilemmas where updatelessness doesn't matter.
I think SSA with a minimal reference class is roughly equivalent to "notice that you exist; exclude all possible worlds where you don't exist; renormalize"
In large worlds where your observations have sufficient randomness that observers of all kinds exist in all worlds, the SSA update step cannot exclude any world. You're updateless by default. (This is the case in the 99% example above.)
In small or sufficiently deterministic worlds, the SSA update step can exclude some possible worlds.
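
To make the update step concrete, here is a minimal Python sketch. It assumes each possible world can be summarised by the set of observer-states that exist in it. The function name, the world labels and the "sees YES"/"sees NO" states are all illustrative assumptions, not something defined in the post.

```python
from fractions import Fraction

def minimal_ssa_update(prior, observers, me):
    """Exclude worlds that contain no observer in state `me`, then renormalize.

    prior:     dict mapping world -> prior probability
    observers: dict mapping world -> set of observer-states existing in that world
    me:        the observer-state I find myself in
    """
    kept = {w: p for w, p in prior.items() if me in observers[w]}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no world contains an observer in this state")
    return {w: p / total for w, p in kept.items()}

prior = {"X": Fraction(1, 2), "not-X": Fraction(1, 2)}

# Small, deterministic worlds: each world contains only one kind of observer.
small = {"X": {"sees YES"}, "not-X": {"sees NO"}}

# Large, random worlds: observers of both kinds exist in every world.
large = {"X": {"sees YES", "sees NO"}, "not-X": {"sees YES", "sees NO"}}

small_posterior = minimal_ssa_update(prior, small, "sees NO")
large_posterior = minimal_ssa_update(prior, large, "sees NO")
```

In the small case the world "X" is excluded and all the probability moves to "not-X". In the large case nothing can be excluded, so the posterior equals the prior, which is the "updateless by default" behaviour described above.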

In "normal" situations, the fact that it excludes worlds where you don't exist doesn't have any implications for your decisions — because your actions will normally not have any effects in worlds where you don't exist.
But in situations like transparent Newcomb, this means that you will now not care about non-existent copies of yourself.

Basically, EDT behaves fine without updating. Excluding worlds where you don't exist is one kind of updating that doesn't change your behavior in normal situations. Whether you do this or not will determine whether you act updateless in situations like transparent Newcomb that happen in small or sufficiently deterministic worlds. (In large and sufficiently random worlds, you'll act updateless regardless.)

Viewed like this, the SSA part of EDT+SSA looks unnecessary and strange, especially since I think you do want to act updateless in situations like transparent Newcomb.

[-]paulfchristiano4y50

I feel like the part where you "exclude worlds where 'you don't exist' " should probably amount to "exclude worlds where your current decision doesn't have any effects"---it's not clear in what sense you "don't exist" if you are perfectly correlated with something in the world. And of course renormalizing makes no difference, it's just expressing the fact that both sides of the bet get scaled down. So if that's your operationalization, then it's also just a description of something that automatically happens inside of the utility calculation.

(I do think it's unclear whether selfish agents "should" be updateless in transparent newcomb.)

[-]Lukas Finnveden4y60

Yes, with that operationalisation, the update has no impact on actions. (Which makes it even more clear that the parsimonious choice is to skip it.)

(I do think it's unclear whether selfish agents "should" be updateless in transparent newcomb.)

Yeah. It might be clearer to think about this as a 2-by-2 grid, with "Would you help a recent copy of yourself that has had one divergent experience from you?" on one axis and "Would you help a version of yourself that would naively be seen as non-existent?" (e.g. in transparent Newcomb) on the other.

It seems fairly clear that it's reasonable to answer "yes" to both of these.
It's possible that a selfish agent could sensibly answer "no" to both of them.

But perhaps we can exclude the other options.

Answering "yes" to the former and "no" to the latter would correspond to only caring about copies of yourself that 'exist' in the naive sense. (This is what the version of EDT+SSA that I wrote about in my top-level comment would do.) Perhaps this could be excluded as relying on philosophical confusion about 'existence'.
Answering "no" to the former and "yes" to the latter might correspond to something like... only caring about versions of yourself that you have some particular kind of (counterfactual) continuity or connection with. (I'm making stuff up here.) Anyway, maybe this could be excluded as necessarily relying on some confusion about personal identity.

[-]JBlack4y30

Doesn't "sufficient randomness in observations" just mean that you split the possible worlds further by conditional probability of observations given actual world-state? You can still eliminate the ones where the observers don't observe what you observed.

For example "I observe that the calculator says NO" doesn't let you eliminate worlds where the correct answer is YES, but it does let you eliminate all worlds where you observe that the calculator says YES. So "notice that you (an observer who sees NO) exist; exclude all possible worlds where you don't exist (because observers in that world see YES); renormalize" still does some work.

[-]Lukas Finnveden22d20

There's an even stronger argument against EDT+SSA: That it can be diachronically dutch-booked. See Conitzer (2017). (H/t Anthony DiGiovanni for the link.)

I find this satisfying, since it more cleanly justifies that EDT shouldn't be combined with any empirical updating whatsoever. (Not sure what's the situation with logical updates.)

(The update that Paul suggests in a parallel comment, "exclude worlds where your current decision doesn't have any effects", would of course still work — but it transparently doesn't serve any decision-relevant purpose and doesn't seem philosophically appealing either, to me.)

[-]JBlack4y7-9

I'm not sure what this is, but it's not EDT.

The correct decision under EDT is the action A that maximizes Sum P(O_i | A) U(O_i, A), where the O_i are the possible outcomes and U is the utility function given outcome O_i for action A. For this agent, the utility is not a function of O_i and A, so EDT cannot be applied.
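
For concreteness, here is a minimal Python sketch of that formula. The helper name `edt_choice` and the toy numbers (a single bet that wins $10 with probability 0.99 and loses $1000 otherwise) are illustrative assumptions. This is the generic textbook rule, not a claim about how the agent in the post is defined.

```python
from fractions import Fraction

def edt_choice(actions, outcomes, prob, utility):
    """Return the action A maximizing sum_i P(O_i | A) * U(O_i, A)."""
    def expected_utility(action):
        return sum(prob(o, action) * utility(o, action) for o in outcomes)
    return max(actions, key=expected_utility)

# Toy single-agent bet (assumed numbers): the calculator is right 99% of the time.
P_WIN = Fraction(99, 100)

def prob(outcome, action):
    # In this toy example the outcome probabilities do not depend on the action.
    return P_WIN if outcome == "win" else 1 - P_WIN

def utility(outcome, action):
    if action == "pass":
        return 0
    return 10 if outcome == "win" else -1000

choice = edt_choice(["bet", "pass"], ["win", "lose"], prob, utility)
```

Here the expected utility of betting is 0.99 × 10 - 0.01 × 1000 = -0.1, so the rule picks "pass".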

[-]paulfchristiano4y60

I'm using EDT to mean the agent that calculates expected utility conditioned on each statement of the form "I take action A" and then chooses the action for which the expected utility is highest. I'm not sure what you mean by saying the utility is not a function of O_i. Isn't "how much money me and my copies earn" a function of the outcome?

(In your formulation I don't know what P(|A) means, given that A is an action and not an event, but if I interpret it as "Probability given that I take action A" then it looks like it's basically what I'm doing?)

[-]JBlack4y1-2

The "me and my copies" that this agent bases its utility on are split across possible worlds with different outcomes. EDT requires a function that maps an action and an outcome to a utility value, and no such function exists for this agent.

Edit: as an example, what is the utility of this agent winning $1000 in a game where they don't know the chance of winning? They don't even know themselves what their own utility is, because their utility doesn't just depend upon the outcome. If you credibly tell them afterward that they were nearly certain to win, they value the same $1000 much more than if you tell them that there was a 1 in a million chance that they would win.

For this sort of agent that values nonexistent and causally-disconnected people, we need some different class of decision theory altogether, and I'm not sure it can even be made rationally consistent.

[-]abramdemski5mo20

From the OP:

We live in a very big universe where many copies of me all face the exact same decision. This seems plausible for a variety of reasons; the best one is accepting an interpretation of quantum mechanics without collapse (a popular view).

The copies in almost all of the decision problems mentioned are spread out across a big world, not across "possible worlds". EG:

If both agents exist and they are just in separate worlds, then there is no conflict between their values at all, and they always push the button.

"Worlds" here means literal planets, rather than the "possible worlds" of philosophy. Hence, it can all be accommodated in one big outcome.

The one exception to this I'm noticing is the final case mentioned:

Suppose that only one agent exists. Then it feels weird, seeing button “B,” to press the button knowing that it causes you to lose $1 in the real, actually-existing world. But in this case I think the problem comes from the sketchy way we’re using the word “exist”—if copy B gets money based on copy A’s decision, then in what sense exactly does copy A “not exist”? What are we to make of the version of copy A who is doing the same reasoning, and is apparently wrong about whether or not they exist? I think these cases are confusing from a misuse of “existence” as a concept rather than updatelessness per se.

However, the text is obviously noting that there is something off about this case.

I admit that it is common, in discussion of UDT, to let the outcome be a function of the full policy, including actions taken in alternate possible worlds (even sometimes including impossible possible worlds, IE, contradictory worlds). However, it can always be interpreted as some kind of simulation taking place within the actual world (usually, in Omega's imagination).

[-]Ben1234y*60

I think the agent should take the bet, and the double counting is actually justified. Epistemic status: Sleep deprived.

The number of clones that end up betting along with the agent is an additional effect of its decision that EDT-with-update is correctly accounting for. Since "calculator says X" is evidence that "X = true", selecting only clones that saw "calc says X" gives you better odds. What seems like a superfluous second update is really an essential step -- computing the number of clones in each branch.

Consider this modification: All N clones bet iff you do, using their own calculator to decide whether to bet on X or ¬X.

This reformulation is just the basic 0-clones problem repeated, and it recommends no bet.

if X, EVT = -100 = 0.99 × N winners × $10 - 0.01 × N losers × $1000
if ¬X, EVT = -100 = 0.99 × N winners × $10 - 0.01 × N losers × $1000

Now recall the "double count" calculation for the original problem.

if X, EVT = 9900 = 0.99 × N winners × $10
if ¬X, EVT = -10000 = -0.01 × N losers × $1000

Notice what's missing: The winners when ¬X and, crucially, the losers when X. This is a real improvement in value -- if you're one of the clones when X is true, there's no longer any risk of losing money.
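
The arithmetic in both calculations can be checked with a short Python sketch. It assumes N = 1000, the value implied by the totals of -100 and 9900 above. The function names are illustrative.

```python
N = 1000            # assumed number of clones (implied by the totals above)
WIN, LOSS = 10, 1000

right = N * 99 // 100   # clones whose calculator gave the correct answer
wrong = N - right       # clones whose calculator erred

def all_clones_bet():
    """Modified problem: every clone bets on whatever its own calculator says."""
    return right * WIN - wrong * LOSS

def only_x_seers_bet(x_is_true):
    """Original problem: only clones that saw "calc says X" bet, and they bet on X."""
    if x_is_true:
        return right * WIN    # the bettors are the 0.99 N winners
    return -wrong * LOSS      # the bettors are the 0.01 N losers

modified_total = all_clones_bet()
original_if_x = only_x_seers_bet(True)
original_if_not_x = only_x_seers_bet(False)
```

With these assumptions the modified problem totals -100 whether or not X holds, while the original problem totals 9900 if X and -10000 if ¬X.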

[-]Nora Belrose4y40

Yeah, I think this is right. It seems like the whole problem arises from ignoring the copies of you who see "X is false." If your prior on X is 0.5, then really the behavior of the clones that see "X is false" should be exactly analogous to yours, and if you're going to be a clone-altruist you should care about all the clones of you whose behavior and outcomes you can easily predict.

I should also point out that this whole setup assumes that there are 0.99N clones who see one calculator output and 0.01N clones who see the opposite, but that's really going to depend on what exact type of multiverse you're considering (quantum vs. inflationary vs. something else) and what type of randomness is injected into the calculator (classical or quantum). But if you include both the "X is true" and "X is false" copies then I think it ends up not mattering.

[-]Lukas Finnveden2y*20

Maybe interesting: I think a similar double-counting problem would appear naturally if you tried to train an RL agent in a setting where:

"Reward" is proportional to an estimate of some impartial measure of goodness.
There are multiple identical copies of your RL algorithm (including: they all use the same random seed for exploration).

In a repeated version of the calculator example (importantly: where in each iteration, you randomly decide whether the people who saw "true" get offered a bet or the people who saw "false" get offered a bet — never both), the RL algorithms would learn that, indeed:

99% of the time, they're in the group where the calculator doesn't make an error
and on average, when they get offered a bet, they will get more reward afterwards if they take it than if they don't.

This happens because, when the RL agents lose money, there are fewer agents that associate negative reinforcement with having taken a bet just before, whereas whenever they gain money, there are more agents that associate positive reinforcement with having taken a bet just before. So the total amount of reinforcement is greater in the latter case, and the RL agents learn to bet (even though this loses them money on average).
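
A rough Python sketch of this effect, computing expectations exactly rather than training anything. The assumptions are mine: N = 1000 identical agents, a $10 win and a $1000 loss per bettor (borrowed from the calculator example), every offered agent takes the bet, and each agent's reward equals the total money across all agents.

```python
from fractions import Fraction

N = 1000                      # assumed number of identical agents
ACCURACY = Fraction(99, 100)  # the calculator is right 99% of the time
WIN, LOSS = 10, -1000         # assumed per-bettor payoffs

def round_outcomes():
    """List (probability, number of bettors, total money) for each round type."""
    right = int(N * ACCURACY)  # agents whose calculator was correct
    wrong = N - right          # agents whose calculator erred
    return [
        (Fraction(1, 2), right, right * WIN),   # the correct group is offered the bet
        (Fraction(1, 2), wrong, wrong * LOSS),  # the erring group is offered the bet
    ]

def expected_money_per_round():
    """Impartial view: average total money per round if offered agents always bet."""
    return sum(p * total for p, _, total in round_outcomes())

def expected_reward_per_offer():
    """Agent's view: average reward following a bet, weighted by who experiences it."""
    weight = sum(p * bettors for p, bettors, _ in round_outcomes())
    weighted = sum(p * bettors * total for p, bettors, total in round_outcomes())
    return weighted / weight

money = expected_money_per_round()
reward = expected_reward_per_offer()
```

Under these assumptions, betting loses $50 per round on average, yet the average reward that follows an accepted bet is +9701, because 990 agents experience each win and only 10 experience each loss.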

[-]Daniel Kokotajlo2y20

Suppose I conclude from this that I shouldn't update.

How does this interact with logical updates / learning more math / conceiving of additional hypotheses / etc.? A priori stuff. Can I still learn/evolve my credences over time in those ways, or do I have to freeze them also?

[-]ESRogs4y20

Then it feels weird, seeing button “B,” to press the button knowing that it causes you to lose $1 in the real, actually-existing world.

Was that supposed to be "seeing button 'A'"? (since A was the one who stands to lose a dollar, and B the one who stands to gain a dollar)
