Abstract: A boring, long-winded account of some extremely basic ideas is given.

### The "Coin Universe"

Imagine a universe split into subsystems A and B, where causal influences go from A to B but not vice versa. A is extremely simple - each day either a "heads event" or "tails event" takes place, which is visible to observers in B as a red or green patch at a certain place in the sky. In fact, coin events are the only 'communication' between A and B.

Naturally, the observers in B would like to understand the pattern of coin events so they formulate some theories.

### Two Rival Theories

**Theory 1:** Every day, the universe 'splits' into two. The entire contents of B are 'copied' somehow. One copy sees a heads event, the other copy sees a tails event.

**Theory 2:** There is no splitting. Instead, some kind of (deterministic or stochastic) process in A is producing the sequence of events. As a special case (**Theory 2a**) it could be that each coin event is independent and random, and has a probability 1/2 of being heads. (Let's also write down another special case (**Theory 2b**) where each coin event has probability 9/10 of being heads.)

### The Position of Jupiter

Imagine that we're primitive astronomers trying to understand the trajectory of Jupiter through the sky. We take for granted that our observations are indexed by a variable called "time" and that a complete theory of Jupiter's trajectory would have the form: "Position of Jupiter at time t = F(t)" for some function F that we can calculate. Suppose such a theory is formulated.

However, if we believe in a point P such that P is "the position of Jupiter" then the theory does not resolve all of our uncertainty about P, for it merely tells us: "If you ask the question at time t then the answer is F(t)." If the question is asked "timelessly" then there is no unique answer. There isn't even a probability distribution over the set of possible answers, because there is no 'probability distribution over time'.^{1}

### Theories 1 and 2 are Incommensurable

Can the scientists in the Coin Universe decide empirically between Theories 1 and 2a? Imagine that our scientists already have a 'theory of everything' for B's own physics, and that coin events are the only phenomena about which there remains controversy.

In fact, the same problem arises for *every* variation of Theory 2 (except those which sometimes predict a probability of 1 or 0). Variations of Theory 2 can be tested against each other, but not against Theory 1.

### Why Should 'Splitting' Entail That Probabilities Aren't Well Defined?

If the universe is splitting then a 'history of the universe' looks like a branching tree rather than a line. Now, I'm taking for granted an 'objective' concept of probability in which the set Ω of possible histories of the universe has the structure of a 'probability space', so that the only things which can be assigned ('objective') probabilities are subsets of Ω. So for any event E with a well-defined probability, and any possible history H, E must either contain all of H or else none of it. Hence, it makes no sense to look at some branch B within H and ask about "the probability that B is true". (Any more than we can look at a particular person in the world and ask "what is the probability of being this person?")

A natural response might be as follows:

Surely if time is a branching tree then all we have to do to define the probabilities of branches is say that the probability of a 'child node' is the probability of its parent divided by the number of 'children'. So we could simulate Theory 2a within a purely deterministic universe by having one 'heads branch' and one 'tails branch' each day of the simulation, or we could simulate Theory 2b instead by having nine 'heads branches' and only one 'tails'.

Observers in the first simulation would experience 1/2 probabilities of heads, while observers in the second would experience 9/10 probabilities.

(Note: We can make this idea of "experiencing 1/2" or "experiencing 9/10" more vivid by supposing that 'days' happen extremely quickly, so that experiencing 1/2 would mean that the relevant patch of sky is flickering between red and green so fast that it looks yellow, whereas 9/10 would equate to an orangey-red.)
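The branch-counting rule in this "natural response" can be sketched in a few lines of Python. This is only an illustration: the function name `heads_count_distribution` and the three-day horizon are assumptions made for the example, not part of the argument.

```python
from fractions import Fraction
from itertools import product

def heads_count_distribution(children, days):
    """Return {number_of_heads: probability} over all leaves of the tree.

    `children` lists the child branches created each day, e.g. "HT".
    Every split shares the parent's probability equally among its children,
    so each leaf ends up with probability 1 / len(children) ** days.
    """
    leaf_prob = Fraction(1, len(children) ** days)
    dist = {}
    for leaf in product(children, repeat=days):
        heads = leaf.count("H")
        dist[heads] = dist.get(heads, Fraction(0)) + leaf_prob
    return dist

# One heads branch and one tails branch per day: mimics Theory 2a.
theory_2a = heads_count_distribution("HT", days=3)
# Nine heads branches and one tails branch per day: mimics Theory 2b.
theory_2b = heads_count_distribution("H" * 9 + "T", days=3)

print(theory_2a[3], theory_2b[3])
```

Under this rule three heads in a row gets probability 1/8 in the first tree and 729/1000 in the second, matching Theories 2a and 2b. Note that the code simply assumes the equal-share rule; it says nothing about why that rule should be the right one.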

### The Lesson of the Ebborians

Consider the Ebborian universe. It has five dimensions: three 'ordinary' spatial dimensions, one of time and an 'extra' dimension reserved for 'splitting'. If you were to draw a cross section of the Ebborian universe along the temporal and the 'extra' dimension, you would see a branching tree. Suppose for simplicity that all such cross sections look alike, so that each time the universe 'splits' it happens everywhere simultaneously. Now, a critical point in Eliezer's fable is that the branches have thickness, and the subjective probability of finding yourself in a child branch is supposed to be proportional to the *square* of that branch's thickness. For my purposes I want branches to have widths, such that the width of a parent branch equals the sum of the widths of its children, but I want to discard the idea of squaring. Imagine that the only times the universe 'splits' are once a day when a red or green light appears somewhere in the sky, which the Ebborians call a "heads event" or "tails event" respectively. Hmm, this sounds familiar...

Prima facie it seems that we've reconstructed a version of the Coin Universe which (a) contains "splitting" and (b) contains "objective probabilities" (which "clearly" ought to be proportional to the widths of branches).

What I want to ask is: why exactly should 'width along the extra dimension' be proportional to 'probability'? One possible answer would be "that's just what the extra dimension *is*. It's *intrinsically* a 'dimension of probability'." That's fine, I guess, but then I want to say that the difference between this Coin Universe and one described by Theory 2 is purely verbal. But now suppose the extra dimension is just an ordinary spatial dimension (whatever that means). Then where does the rule 'probability = thickness' come from, when there are so many other possibilities? E.g. "At any branch point, the ratio of the probability of the left branch to the probability of the right branch is 9 * (width of left branch) to 1 * (width of right branch)." If this was the rule then even though uniform branch widths may suggest that Theory 2a is correct, the 'experienced probabilities' would be those of Theory 2b. (If days were extremely rapid, the sky would look orange-red rather than yellow.)

If the extra dimension is not explicitly a 'dimension of probability' then the 'experienced probabilities' will be indeterminate without a 'bridge law' connecting width and probability. But the difference between ("extra dimension is spatial" + bridge law connecting branch widths and probability) and ("extra dimension is probability" + bridge law connecting probabilities with epiphenomenal 'branch widths') is purely verbal.
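To make the dependence on the bridge law concrete, here is a minimal sketch in Python. The function `apply_bridge_law` and its `weights` parameter are hypothetical names introduced for illustration; the law "probability proportional to weight times width" is just the family of rules suggested by the 9-to-1 example above.

```python
from fractions import Fraction

def apply_bridge_law(widths, weights):
    """Hypothetical bridge law: probability proportional to weight * width."""
    scores = [Fraction(weight) * Fraction(width)
              for width, weight in zip(widths, weights)]
    total = sum(scores)
    return [score / total for score in scores]

uniform_widths = [1, 1]  # left and right branches equally wide

# 'probability = width': the experienced probabilities of Theory 2a.
plain_rule = apply_bridge_law(uniform_widths, weights=[1, 1])
# '9 * (left width) to 1 * (right width)': those of Theory 2b.
skewed_rule = apply_bridge_law(uniform_widths, weights=[9, 1])

print(plain_rule, skewed_rule)
```

The same branch widths give 1/2 and 1/2 under one law and 9/10 and 1/10 under the other, so the widths alone do not fix the probabilities.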

So ultimately the only two possibilities are (i) the extra dimension is a 'dimension of probability', and there is at best 'epiphenomenal splitting'; or else (ii) the probabilities are indeterminate.

Of various possible conclusions, one in particular seems worth noting down: If we are attempting to simulate a Coin Universe by computing all of its branches at once, then regardless of how we 'tag' or 'weight' the branches to indicate their supposed probabilities, we should not think that we are thereby affecting the experiences of the simulated beings. (So ignoring 'externalities' there's no moral imperative that we should prefer two copies of a happy simulation and one of a sad simulation over two 'sad' simulations and one 'happy', any more than that we should stick pieces of paper to the computer cases saying "probability 9/10" and "probability 1/10".)

### Implications For MWI?

MWI emphatically does *not* assert that time 'splits into branches', so it's not immediately clear that there are any implications for MWI or what they would be if there were. For what it's worth, my current way of thinking is that a quantum theory is neither "deterministic" nor "probabilistic" but just "quantum". I'm beginning to suspect that MWI is what you get when you mistakenly try to conceive of a quantum theory as deterministic. Two things in particular have led me to this view: (i) Scott Aaronson's lecture and (ii) this paper which goes a considerable way towards demolishing what I had previously taken to be one of the strongest reasons for 'believing in' many worlds. However, my thoughts on this are extremely half-baked and subject to revision.

### Is Probability Reducible?

It's conspicuous that the discussion above presupposes that probabilities - "real probabilities" - are or might be 'built in' at the 'ground floor' of reality. However, others have made ingenious attempts to show how (our concepts and perceptions of) probability can arise perfectly well even if the universe doesn't presuppose it. I'm not averse to this project - in fact it parallels Dennett's strategy in the philosophy of mind, namely to show how it can 'seem like' we have 'qualia' even in a world where no such things exist.

Anyway, I seem to be converging onto user cousin_it's statement: "Perhaps counterintuitively, the easiest way for probabilities to arise is *not* by postulating 'different worlds' that you could 'end up' in starting from now."

^{1} Perhaps Julian Barbour would disagree. However, for the purposes of my discussion, I'm presupposing the naive 'common sense' view of time where 'the facts' about a (classical) universe are exhausted precisely when we've specified "the state of affairs at every moment of time". Another possible objection is that because Jupiter's position in the sky repeats cyclically, we can define 'time averages' after all. Well, let's just suppose that these astronomers are able to detect the slight deviations due to e.g. the solar wind pushing Jupiter away and lengthening its orbit. (If you're still bothered, imagine replacing 'position of Jupiter' with 'brightness of the Sun', which is gradually increasing on a geological timescale.)

The many worlds interpretation is not just splitting so that it can be deterministic.

Imagine in your example that someone wanted to know if it was random or splitting. They point a telescope through the fifth dimension, and see an identical universe, except with the opposite event. Someone suggests that maybe it's mostly probabilistic, but there's a split at the end. The person then manages to look closely enough to distinguish the next twenty or so universes. They're told that it just branches further than they thought before.

This is what it is with quantum physics. We know there's multiple universes because we can test for them. Any quantum theory requires a configuration space for all the relevant particles. The only question is if it has one giant configuration space for the whole universe, or has little configuration spaces, which otherwise already follow all the laws of the giant one, that randomly split and combine.

My motivation for suggesting that "MWI is what you get when you mistakenly try to conceive of a quantum theory as deterministic" is the following:

First, imagine we have a simple deterministic universe like the Life universe. Forget about quantum theory for now, and suppose that we're going to build a simulation of a "Coin universe" within the Life universe. Suppose we decide to do it by simulating all branches at once, and 'tagging them' somehow to indicate their relative probabilities. Then the "tags" will be epiphenomenal, and 'from the inside' the beings will experience a universe where "Theory 1" is true. In other words, the probabilities we assign won't affect the experiences of the simulated beings.

Now, I want to say that this branching simulation is "what you get when you mistakenly try to model the coin universe as a deterministic universe".

OK, now let's replace the coin universe with a universe where quantum mechanics is the 'theory of everything'. Now we *could* simulate it within the Life universe by deterministically modelling the entire wavefunction, and that might even be the only way of doing so, but it isn't clear to me that this wouldn't cause some or all of the information about probabilities to become "epiphenomenal" in the same way as before. As Steane says:

Hanson's ingenious concept of "Mangled Worlds" might be exactly what I need to reassure myself that a deterministic simulation of the entire wavefunction would 'feel the same from the inside' as a genuine quantum universe. Armok_Gob was right to mention it. But then I'm just an "interested layperson", and no physicists or philosophers of physics besides Hanson himself ever seem to mention Mangled Worlds, so I'm not quite sure what to make of it.

That's not really analogous. What makes distinct MWI branches distinct is that they have 'decohered' and can no longer interact in any detectable way. [Disclaimer: I know this isn't an 'absolute' notion - that the 'off-diagonal elements' don't vanish entirely.]

Now, quantum interference can be illustrated with an experimental apparatus like figure one where to show that photons always exit the same way you need to take into account all of the possible paths a photon could take. However, since the experiment can only end one way, it all takes place within one "branch". (There is no decoherence.) The fact that the different photon paths interfere with each other means that they're not in "different worlds".

(Many worlds doesn't "explain" quantum interference, in spite of what Deutsch might have you believe. I don't think that was ever its "purpose", to be fair.)

Now, in principle you could have macroscopic superpositions - e.g. you could do a two-slit experiment with people rather than electrons. But it's better to say that the concept of "other worlds" breaks down if you push it too far, than to say we can thereby detect "other worlds".

Anyway, this is all rather confusing and confused - I haven't really worked out in my own mind if/how the "branching vs probabilities" discussion relates to MWI.

If the MWI branches are "close" and haven't completely decohered, it's possible to detect them. If they're far away, it's not. Similarly, if the universes are close by in the fifth dimension, you might be able to make them out. If they're far away, and you have to look through a hundred universes to make it out, it's essentially impossible. The method of detecting them is different, but the principle is the same.

You can detect it, but it doesn't happen? Isn't that like saying that the universe doesn't exist, but we experience things as if it did?

You need to know the potential energy of every point in configuration space in order to find out the probability of a given event. How can it matter if it isn't involved?

I don't understand. It explains it. So does any interpretation beyond pure particle. Its purpose is to explain away wavefunction collapse and the process of particles getting entangled. The laws regarding those in the Copenhagen interpretation are bizarre, and the results are indistinguishable from just assuming that everything is always entangled, and wavefunctions never collapse, which is the MWI.

We can detect other worlds if they're close enough, but not if they're too far away. This isn't just limited to the MWI. The Copenhagen interpretation follows the same laws with entangled particles. We've never been able to detect wavefunction collapse, so decoherence getting to the point where we can't detect the interference must happen first.

This is no different than saying that the next twenty branches exist, and maybe a few hundred more, but after a billion, the concept of other branches breaks down.

It's also like saying that Earth is made of atoms, and the rest of our solar system is made of atoms, but we aren't remotely capable of discerning atoms in other solar systems, so the concept of "being made of atoms" broke down.

This has largely turned into a semantic dispute about the "correct" meaning of the term "world" in the context of MWI.

You're using it to mean "summand with respect to the position basis" whereas I'm using it to mean "summand with respect to a decomposition of the Hilbert space into subspaces large enough that elements of two distinct subspaces represent 'distinct macrostates'". (Where "macroscopic distinctness" is not and does not pretend to be precisely definable.)

Right after the photon in the Mach-Zehnder apparatus splits, you see two worlds corresponding to the two different positions of the photon, whereas I see only a single world because all macroscopic variables still have determinate values. (Or rather, their values are still as close to being determinate as they ever are.)

In my use of the term "worlds" it is correct to say that the notion of "other worlds" breaks down if you push it too far (ultimately this is because the boundary between the "micro" and "macro" domains cannot be rigorously defined.) In your use of the term "worlds" it is trivially true that, at any given time, the state vector is uniquely expressible as a superposition of "worlds".

I don't want to say too much in defense of my usage, except that I *think* mine is the standard one. You might like to read this by the way. (Not to resolve our dispute, but because it's awesome.)

Sorry, I can't see how your questions relate to my statement.

The reason I say it doesn't explain it is that the notion of "constructive and destructive interference" between different possibilities is deeply bizarre. Simply declaring that all possibilities exist doesn't explain why two possibilities can *cancel each other out*. But again, I suspect this is partly just a dispute over the semantics of "explain".

ETA: I have to acknowledge a bait-and-switch on my part. Whereas in my previous comment I was seeking to characterise worlds directly in terms of decoherence, now I'm characterizing them by way of a third concept, namely "macroscopic distinctness", which "under normal circumstances (i.e. not doing a two-slit experiment with people)" guarantees decoherence.

It was a misunderstanding you cleared up by specifying what you meant by "world".

The interference isn't between probabilities. They don't contain sufficient information. It's between the amplitudes. Going from amplitudes to probabilities is the weird part. It's not explained by any interpretation.

Good thing I didn't say that, then!

Above, I said of MWI "I don't think that was ever its "purpose"."

How do you tell the difference between those from inside a branch?

The difference between what?

If you want to know how to tell the difference between the MWI of quantum mechanics and a single branch theory, there's no experiment I can give, because there's no such thing as a single branch theory.

The Schroedinger equation gives the behavior of a single particle in a potential field. If you want to model two particles, you have to use the configuration space.

Random and Splitting.

Though, I just realized that I should just google around and look for papers on the subject.

Collapse interpretation, IIRC Bohmian interpretation, unreal MWI, etc.

They still involve the whole MWI, just on a smaller scale.

The difference between randomly choosing when scale gets to a certain point and splitting is that splitting isn't total. There's still some interference between any two universes no matter how far apart they are. It's just that when there's macroscopic differences, the interference is astronomically small.

Interesting post. I have a few questions...

Do you think it's fair to say that the question could be rephrased as "After a fundamentally probabilistic event, how can you tell if every outcome happened (in different universes, weighted by probability) versus only one outcome happened, with a particular probability?"

What does A do in that universe? Or does it not really exist in that model?

Not quite. I don't think there is a meaningful (as opposed to 'verbal') difference between the two options you've described. (Eliezer might say that the difference doesn't "pay any rent" in terms of subjective anticipations.)

The section about incommensurability is trying to argue that there is no "Bayesian" way for a believer in probabilities to dissuade a believer in (probabilityless) branches, or vice versa. This is disturbing because there might actually be a 'fact of the matter' about which of Theory 1 and Theory 2 is correct. (After all, we could simulate a coin universe either by simulating all branches *or* by pruning with the help of a random number generator.)

A is copied as well. Sorry, I didn't make that very clear.

Agreed. Yeah, I can't really figure out what difference that causes right now...

What different anticipations would Theory 1 vs 2 cause?

Ah. Thanks.

In that case, what influence does A have on B?

Well, a believer in Theory 2b anticipates 'heads event tomorrow' with 90% probability. A believer in Theory 1 doesn't think the concept of 'tomorrow' makes sense unless you've also specified tomorrow's coin event. Hence, the most they're prepared to say is that they anticipate heads with 100% probability at time (tomorrow, heads) and heads with 0% probability at time (tomorrow, tails).

OK, let's forget about "A and B". The "coin universe" is just a universe where there is a "coin event" each day, which can be observed but not influenced.

Could a Theory 1 believer have the same expectation based on anthropic reasoning?

Like, let's say that Theory 1b is that the universe splits into 10, with 9 universes having heads and 1 having tails. 9 out of 10 observers would see heads, so they could say that they expect heads event tomorrow with 90% probability.

I don't think so. Let me show you the mental image which motivated this entire post:

Imagine that you're about to simulate both branches of the second day of a 'coin universe', with the following equipment:

(i) a single 'heads' computer weighing seven tons, with a sticker on the side saying "probability 4/5", and (ii) six 'tails' computers each weighing 500 grams, running identical programs, and having stickers on the side saying "probability 1/30".

Now I can't see any reason why the *number of computers* as opposed to the weights or the stickers should be the relevant input into the simulated beings' anthropic reasoning (assuming they knew on day 1 how they were going to be simulated on day 2).

In order to do any anthropic reasoning, they need some kind of "bridge law" which takes all of these (and possibly other) factors and outputs probability weights for heads and tails. But it seems that any choice of bridge law is going to be hopelessly arbitrary.

(If you equalised all but one of the factors, and were left with just the numbers of computers, the weights or the stickers, then you would have a canonical way of assigning probabilities, but just because it "leaps out at you" doesn't mean it's "correct".)
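Here is a rough sketch of how far apart three candidate bridge laws land on the equipment described above. I'm assuming "tons" means metric tonnes (so seven tons is 7,000,000 grams), and `heads_probability` and the three scoring rules are names made up for this illustration.

```python
from fractions import Fraction

# (outcome, weight in grams, sticker probability) for each machine.
# Assumption: "seven tons" is read as 7 metric tonnes.
machines = ([("heads", 7_000_000, Fraction(4, 5))]
            + [("tails", 500, Fraction(1, 30))] * 6)

def heads_probability(score):
    """Normalise per-machine scores into a probability of heads."""
    totals = {"heads": Fraction(0), "tails": Fraction(0)}
    for outcome, grams, sticker in machines:
        totals[outcome] += score(grams, sticker)
    return totals["heads"] / (totals["heads"] + totals["tails"])

by_count = heads_probability(lambda grams, sticker: Fraction(1))
by_weight = heads_probability(lambda grams, sticker: Fraction(grams))
by_sticker = heads_probability(lambda grams, sticker: sticker)

print(by_count, by_weight, by_sticker)
```

Counting computers gives heads 1/7, weighing them gives 7000/7003, and reading the stickers gives 4/5. Nothing in the code privileges one scoring rule over the others, which is the point.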

Going back to your Theory 1b, I want to ask why *counting up numbers of universes* should be the correct bridge law (as opposed to, say, counting the number of *distinct* universes, or one of infinitely many other weird and wonderful bridge laws).

The dilemma for you is that you can either (i) stipulate that all you *mean* by "splitting into 10 universes" is that each of the 10 has probability 1/10 (in which case we're back to Theory 2b) or else (ii) you need to somehow justify an arbitrary choice of 'bridge law' (which I don't think is possible).

I'm gonna go ahead and notice that I'm confused.

That being said, (and simplifying to the computers simulating one person rather than a universe)

I think that the number of computers is more relevant than weight and stickers in that it determines how many people are running. If you simulate someone on a really big computer, no matter how big the computer is, you're only simulating them once. Similarly, you can slap a different sticker on, and nothing would really change observer-wise.

If you simulate someone 5 times, then there are five simulations running, and there's 5 conscious experiences going on.

I can easily imagine that counting the number of distinct experiences is the proper procedure. Even if the person is on 5 computers, they're still experiencing the exact same things.

However, you can still intervene in any of those 5 simulations without affecting the other 4 simulations. That's probably part of the reason that I think that the fact that the person's on 5 different computers matters.

But that wouldn't contradict the idea that, until an intervention makes things different, only distinct universes should be counted.

I'm trying to figure out if there are any betting rules which would make you want to choose different ways of assigning probability, kind of like how this was approached.

All the ones that I've come up with so far involve scoring across universes, which seems like a cheap way out, and still doesn't boil down to predictions that the universe's inhabitants can test.

I like this term.

What if the computers in question are two-dimensional like Ebborians? Then splitting a computer down the middle has the effect of going from one computer weighing x to two computers each weighing x/2. Why should this splitting operation mean that you're simulating 'twice as many people'? And how many people are you simulating if the computer is only 'partially split'?

The LW consensus on Sleeping Beauty is that there is no such thing as "SB's correct subjective probability that the coin is tails" unless one specifies 'betting rules' and even then the only meaning that "subjective probability" has is "the probability assignments that prevent you from having negative expected winnings". (Where the word 'expected' in the previous sentence relates only to the coin, not the subjective probabilities.)

So in terms of the "fact vs value" or "is vs ought" distinction, there is no purely "factual" answer to Sleeping Beauty, just some strategies that maximize value.
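As a small illustration of how the 'betting rules' fix the answer, here is a sketch in Python. It assumes the usual Sleeping Beauty setup (a fair coin, one awakening on heads, two on tails) and a ticket that pays 1 if the coin was tails; the function name and the `settle_per_awakening` flag are invented for the example.

```python
from fractions import Fraction

# Assumed standard setup: fair coin, 1 awakening on heads, 2 on tails.
AWAKENINGS = {"heads": 1, "tails": 2}
COIN = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}

def expected_winnings(price, settle_per_awakening):
    """Expected profit from buying, at `price`, a ticket paying 1 on tails.

    If `settle_per_awakening` is True the bet is made at every awakening;
    otherwise it is made once per run of the experiment.
    """
    total = Fraction(0)
    for outcome, p_coin in COIN.items():
        tickets = AWAKENINGS[outcome] if settle_per_awakening else 1
        payoff = (1 if outcome == "tails" else 0) - price
        total += p_coin * tickets * payoff
    return total

# Break-even prices differ between the two betting rules.
print(expected_winnings(Fraction(2, 3), settle_per_awakening=True))
print(expected_winnings(Fraction(1, 2), settle_per_awakening=False))
```

Both lines print 0: the break-even price for tails is 2/3 when bets are settled at every awakening and 1/2 when they are settled once per experiment. The expectation is taken over the coin only, as in the parenthetical above.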

It comes from the philosophy of mind.

Is it possible to predict the final state given only the initial state? If so, it's deterministic. If not, it's probabilistic.

I would think "probabilistic" should be reserved for things that are actually governed by probabilities. As you know, amplitudes don't really work like probabilities; if they did, the MWI hypothesis would be unneeded.

You would expect so, but amplitudes just don't work like probabilities. (Of course, if you are considering the whole wavefunction as the state, then it is deterministic.)

By the way, you might like to read the Scott Aaronson lecture I linked to in my post. Here's a quote:

Yeah but you can make a "probabilistic" system look "deterministic" as long as you define the "state" in such a way as it includes the entire distribution.

Of course, a person could never *observe* that 'final state', but neither can a person observe the entire wavefunction.

For instance, you're only allowed to extract one bit of information about the spin of a given electron, even though the wavefunction (of the spin of a single electron) looks like a point on the surface of a sphere. This is analogous to how, given a {0,1}-valued random quantity, when you observe it you only extract one bit of information about it, even though its expectation value could have been anywhere in the interval [0,1].

My motto here is that if a theory is *assigning weights to possible worlds* then it's as far away from being deterministic as it's possible to be.

So, it's probabilistic?

Read and then get back to me if you still don't understand where I'm coming from.

I'm not sure how much of a parallel can be drawn between probability and their extension of it.

Probability is a state of your knowledge. Quantum superposition has nothing to do with how much you know.

One last thing - make of it what you will.

Suppose you have access to a 'true random number generator', and you read off a string of N random bits. You also take N electrons, and you prepare the spins of the electrons such that the i-th electron has spin "up" if the i-th bit is 1, or else "down" if the i-th bit is 0.

Now here's an interesting fact: There is no experiment anyone could do to determine whether you had chosen "up"/"down" or "left"/"right" as spin directions for your electrons.

In other words, quantum uncertainty and probabilistic uncertainty can combine 'seamlessly' in such a way that it's impossible to say where one ends and the other begins.
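For anyone who wants to check this, here is a minimal sketch for a single electron, in plain Python. It relies on the standard fact that measurement statistics depend only on the density matrix; I'm taking "left"/"right" to mean spin along the x axis, and the helper names `outer` and `equal_mixture` are my own.

```python
import math

def outer(state):
    """Density matrix |state><state| of a pure spin state (2 amplitudes)."""
    return [[a * b.conjugate() for b in state] for a in state]

def equal_mixture(state_a, state_b):
    """Density matrix of a 50/50 classical mixture of two pure states."""
    rho_a, rho_b = outer(state_a), outer(state_b)
    return [[0.5 * (rho_a[i][j] + rho_b[i][j]) for j in range(2)]
            for i in range(2)]

s = 1 / math.sqrt(2)
up, down = [1 + 0j, 0j], [0j, 1 + 0j]
# Assumption: "left"/"right" are the spin states along the x axis.
right, left = [s + 0j, s + 0j], [s + 0j, -s + 0j]

rho_up_down = equal_mixture(up, down)
rho_left_right = equal_mixture(left, right)

print(rho_up_down)
print(rho_left_right)
```

Up to floating-point rounding, both mixtures come out as half the identity matrix, so no measurement on the electron can tell them apart. For N independently prepared electrons the joint state is the tensor product of N such matrices either way.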

Two things to say:

determined by the values of an *algorithmically random* sequence. The algorithmic randomness would be a property of the sequence itself, not anyone's knowledge of it.

Quantum superposition has "quite a lot" to do with the Born probabilities, and (according to you) the Born probabilities, being mere probabilities, have everything to do with how much you know.

I'm not saying a quantum universe *is* a probabilistic one. But that's really the whole point - it's neither probabilistic nor deterministic (except in the same vacuous sense that you can make it *look* deterministic *if* you carry the entire distribution around with you).

How do you get your hands on an algorithmically random sequence? If our physics isn't objectively probabilistic, then we can't even simulate Theory 2.

This is relevant: http://hanson.gmu.edu/mangledworlds.html

Yeah, I've heard of Mangled Worlds. It sounds ingenious, but ideally I'd like to see some evidence of physicists or philosophers of physics besides Hanson himself taking it seriously (before I invest the time necessary to understand it properly).