Previously in series: Decoherence is Pointless
Followup to: Where Experience Confuses Physicists

One serious mystery of decoherence is where the Born probabilities come from, or even what they are probabilities of.  What does the integral over the squared modulus of the amplitude density have to do with anything?

This was discussed by analogy in "Where Experience Confuses Physicists", and I won't repeat arguments already covered there.  I will, however, try to convey exactly what the puzzle is, in the real framework of quantum mechanics.

A professor teaching undergraduates might say:  "The probability of finding a particle in a particular position is given by the squared modulus of the amplitude at that position."

This is oversimplified in several ways.

First, for continuous variables like position, amplitude is a density, not a point mass.  You integrate over it.  The integral over a single point is zero.
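A minimal numerical sketch of that point, assuming a one-dimensional Gaussian amplitude density (the Gaussian, the wavenumber `k`, and the function names are illustrative choices of mine, not anything the physics dictates):  the integral of the squared modulus over an interval shrinks toward zero as the interval shrinks toward a single point.

```python
import cmath
import math

def amplitude(x, k=2.0):
    """Hypothetical amplitude density: a normalized Gaussian envelope times a phase."""
    envelope = (1.0 / math.pi) ** 0.25 * math.exp(-x * x / 2.0)
    return envelope * cmath.exp(1j * k * x)

def squared_modulus_integral(a, b, steps=10_000):
    """Midpoint-rule integral of |amplitude(x)|^2 over the interval [a, b]."""
    width = (b - a) / steps
    return sum(abs(amplitude(a + (i + 0.5) * width)) ** 2 for i in range(steps)) * width

# The whole line carries (approximately) all of the squared modulus.
total = squared_modulus_integral(-10.0, 10.0)

# Ever-narrower intervals around x = 0 carry ever less of it.
shrinking = [squared_modulus_integral(-h, h) for h in (1.0, 0.1, 0.01, 0.001)]
```

The values in `shrinking` fall roughly in proportion to the interval width, which is what it means for the amplitude to be a density rather than a point mass.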

(Historical note:  If "observing a particle's position" invoked a mysterious event that squeezed the amplitude distribution down to a delta point, or flattened it in one subspace, this would give us a different future amplitude distribution from what decoherence would predict.  All interpretations of QM that involve quantum systems jumping into a point/flat state, which are both testable and have been tested, have been falsified.  The universe does not have a "classical mode" to jump into; it's all amplitudes, all the time.)

Second, a single observed particle doesn't have an amplitude distribution.  Rather the system containing yourself, plus the particle, plus the rest of the universe, may approximately factor into the multiplicative product of (1) a sub-distribution over the particle position and (2) a sub-distribution over the rest of the universe.  Or rather, the particular blob of amplitude that you happen to be in, can factor that way.

So what could it mean, to associate a "subjective probability" with a component of one factor of a combined amplitude distribution that happens to factorize?

Recall the physics for:

(Human-BLANK * Sensor-BLANK) * (Atom-LEFT + Atom-RIGHT)
        =>
(Human-LEFT * Sensor-LEFT * Atom-LEFT) + (Human-RIGHT * Sensor-RIGHT * Atom-RIGHT)

Think of the whole process as reflecting the good-old-fashioned distributive rule of algebra.  The initial state can be decomposed—note that this is an identity, not an evolution—into:

(Human-BLANK * Sensor-BLANK) * (Atom-LEFT + Atom-RIGHT)
        =
(Human-BLANK * Sensor-BLANK * Atom-LEFT) + (Human-BLANK * Sensor-BLANK * Atom-RIGHT)

We assume that the distribution factorizes.  It follows that the term on the left, and the term on the right, initially differ only by a multiplicative factor of Atom-LEFT vs. Atom-RIGHT.

If you were to immediately take the multi-dimensional integral over the squared modulus of the amplitude density of that whole system,

Then the ratio of the all-dimensional integral of the squared modulus over the left-side term, to the all-dimensional integral over the squared modulus of the right-side term,

Would equal the ratio of the lower-dimensional integral over the squared modulus of the Atom-LEFT, to the lower-dimensional integral over the squared modulus of Atom-RIGHT,

For essentially the same reason that if you've got (2 * 3) * (5 + 7), the ratio of (2 * 3 * 5) to (2 * 3 * 7) is the same as the ratio of 5 to 7.

Doing an integral over the squared modulus of a complex amplitude distribution in N dimensions doesn't change that.
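Here is a small sketch of the ratio argument, assuming the factors have been discretized into a handful of configurations with made-up complex amplitudes (the lists `rest`, `atom_left`, and `atom_right` are hypothetical stand-ins, and sums stand in for the integrals):

```python
from itertools import product

def squared_norm(amplitudes):
    """Sum of squared moduli: the discrete stand-in for the integral."""
    return sum(abs(a) ** 2 for a in amplitudes)

# Hypothetical discretized factors: a few configurations for the rest of the
# system (human plus sensor), and a few for each blob of the atom's factor.
rest = [0.3 + 0.4j, -0.5j, 0.2]
atom_left = [0.6, 0.3j]
atom_right = [0.1 - 0.2j]

# The joint amplitude of each term is the product of the factors' amplitudes.
left_term = [r * a for r, a in product(rest, atom_left)]
right_term = [r * a for r, a in product(rest, atom_right)]

joint_ratio = squared_norm(left_term) / squared_norm(right_term)
atom_ratio = squared_norm(atom_left) / squared_norm(atom_right)
```

Up to floating-point rounding, `joint_ratio` and `atom_ratio` agree, because the common factor `squared_norm(rest)` divides out, just as the (2 * 3) did.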

There's also a rule called "unitary evolution" in quantum mechanics, which says that quantum evolution never changes the total integral over the squared modulus of the amplitude density.

So if you assume that the initial left term and the initial right term evolve, without overlapping each other, into the final LEFT term and the final RIGHT term, they'll have the same ratio of integrals over etcetera as before.
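A hedged sketch of that step, assuming each term lives in its own pair of coordinates (a stand-in for non-overlapping regions of configuration space) and evolves under its own 2x2 unitary.  The helper `rotation_with_phase` and all the numbers are illustrative assumptions:

```python
import cmath
import math

def rotation_with_phase(theta, phi):
    """A 2x2 unitary: a real rotation by theta times an overall phase e^(i*phi)."""
    p = cmath.exp(1j * phi)
    c, s = math.cos(theta), math.sin(theta)
    return [[p * c, -p * s], [p * s, p * c]]

def apply(matrix, vector):
    """Matrix-vector product over complex numbers."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def squared_norm(vector):
    return sum(abs(v) ** 2 for v in vector)

# Each term is tracked in its own coordinates, so the two never overlap.
left_before = [0.6, 0.3j]
right_before = [0.1 - 0.2j, 0.0]

left_after = apply(rotation_with_phase(0.7, 1.3), left_before)
right_after = apply(rotation_with_phase(-0.4, 2.1), right_before)

ratio_before = squared_norm(left_before) / squared_norm(right_before)
ratio_after = squared_norm(left_after) / squared_norm(right_after)
```

The individual amplitudes move around, but each term's total squared modulus is untouched, so `ratio_after` matches `ratio_before`.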

What all this says is that,

If some roughly independent Atom has got a blob of amplitude on the left of its factor, and a blob of amplitude on the right,

Then, after the Sensor senses the atom, and you look at the Sensor,

The integrated squared modulus of the whole LEFT blob, and the integrated squared modulus of the whole RIGHT blob,

Will have the same ratio,

As the ratio of the squared moduli of the original Atom-LEFT and Atom-RIGHT components.

This is why it's important to remember that apparently individual particles have amplitude distributions that are multiplicative factors within the total joint distribution over all the particles.

If a whole gigantic human experimenter made up of quintillions of particles,

Interacts with one teensy little atom whose amplitude factor has a big bulge on the left and a small bulge on the right,

Then the resulting amplitude distribution, in the joint configuration space,

Has a big amplitude blob for "human sees atom on the left", and a small amplitude blob of "human sees atom on the right".

And what that means, is that the Born probabilities seem to be about finding yourself in a particular blob, not the particle being in a particular place.

But what does the integral over squared moduli have to do with anything?  On a straight reading of the data, you would always find yourself in both blobs, every time.  How can you find yourself in one blob with greater probability?  What are the Born probabilities, probabilities of?  Here's the map—where's the territory?

I don't know.  It's an open problem.  Try not to go funny in the head about it.

This problem is even worse than it looks, because the squared-modulus business is the only non-linear rule in all of quantum mechanics.  Everything else—everything else—obeys the linear rule that the evolution of amplitude distribution A, plus the evolution of the amplitude distribution B, equals the evolution of the amplitude distribution A + B.
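To make the contrast concrete, here is a toy sketch, assuming a two-component state and a fixed 2x2 unitary as the "evolution" (both are illustrative assumptions, as are the particular amplitudes).  Evolution distributes over addition; the squared modulus does not.

```python
import cmath
import math

def evolve(state, theta=0.7, phi=1.3):
    """Hypothetical linear evolution: a fixed 2x2 unitary applied to a 2-component state."""
    p = cmath.exp(1j * phi)
    c, s = math.cos(theta), math.sin(theta)
    return [p * (c * state[0] - s * state[1]),
            p * (s * state[0] + c * state[1])]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def squared_norm(state):
    return sum(abs(v) ** 2 for v in state)

state_a = [0.6, 0.3j]
state_b = [-0.2 + 0.1j, 0.4]

# Linear rule: evolving the sum equals summing the evolutions (gap is ~0).
linear_gap = max(
    abs(x - y)
    for x, y in zip(evolve(add(state_a, state_b)),
                    add(evolve(state_a), evolve(state_b)))
)

# Squared modulus: the sum's value is NOT the sum of the values (gap is nonzero).
born_gap = squared_norm(add(state_a, state_b)) - (squared_norm(state_a) + squared_norm(state_b))
```

For these numbers `born_gap` comes out to about -0.24:  the cross term between A and B, which is exactly the part a linear rule could never produce.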

When you think about the weather in terms of clouds and flapping butterflies, it may not look linear on that higher level.  But the amplitude distribution for weather (plus the rest of the universe) is linear on the only level that's fundamentally real.

Does this mean that the squared-modulus business must require additional physics beyond the linear laws we know—that it's necessarily futile to try to derive it on any higher level of organization?

But even this doesn't follow.

Let's say I have a computer program which computes a sequence of positive integers that encode the successive states of a sentient being.  For example, the positive integers might describe a Conway's-Game-of-Life universe containing sentient beings (Life is Turing-complete) or some other cellular automaton.

Regardless, this sequence of positive integers represents the time series of a discrete universe containing conscious entities.  Call this sequence Sentient(n).

Now consider another computer program, which computes the negative of the first sequence:  -Sentient(n).  If the computer running Sentient(n) instantiates conscious entities, then so too should a program that computes Sentient(n) and then negates the output.

Now I write a computer program that computes the sequence {0, 0, 0...} in the obvious fashion.

This sequence happens to be equal to the sequence Sentient(n) + -Sentient(n).

So does a program that computes {0, 0, 0...} necessarily instantiate as many conscious beings as both Sentient programs put together?
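For concreteness, here is a toy version of the three programs, with a Life glider on a small torus standing in for Sentient(n).  The glider is of course not sentient, and the integer encoding, grid size, and function names are my own assumptions for illustration.

```python
def life_step(cells, size):
    """One Game of Life step on a size-by-size torus; cells is a set of (row, col)."""
    counts = {}
    for (r, c) in cells:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if (dr, dc) != (0, 0):
                    key = ((r + dr) % size, (c + dc) % size)
                    counts[key] = counts.get(key, 0) + 1
    # A cell is alive next step with 3 neighbors, or with 2 if already alive.
    return {cell for cell, n in counts.items() if n == 3 or (n == 2 and cell in cells)}

def encode(cells, size):
    """Pack the grid into a positive integer: one bit per cell plus a leading 1 bit."""
    bits = sum(1 << (r * size + c) for (r, c) in cells)
    return (1 << (size * size)) | bits

def universe(steps, size=8):
    """Stand-in for Sentient(n): successive encoded states of a glider."""
    cells = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
    states = []
    for _ in range(steps):
        states.append(encode(cells, size))
        cells = life_step(cells, size)
    return states

sentient = universe(10)                 # Sentient(n)
negated = [-s for s in sentient]        # -Sentient(n)
zeros = [0] * 10                        # {0, 0, 0...} computed "in the obvious fashion"
summed = [a + b for a, b in zip(sentient, negated)]
```

The list `summed` equals `zeros` element for element, yet nothing in the computation of `zeros` ever ran a Life step.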

Admittedly, this isn't an exact analogy for "two universes add linearly and cancel out".  For that, you would have to talk about a universe with linear physics, which excludes Conway's Life.  And then in this linear universe, two states of the world both containing conscious observers—world-states equal but for their opposite sign—would have to cancel out.

It doesn't work in Conway's Life, but it works in our own universe!  Two quantum amplitude distributions can contain components that cancel each other out, and this demonstrates that the number of conscious observers in the sum of two distributions, need not equal the sum of conscious observers in each distribution separately.
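A crude sketch of that non-additivity, assuming an amplitude distribution can be caricatured as a dictionary from labeled configurations to complex amplitudes, and that "count the configurations with nonzero amplitude" is a stand-in for counting observers.  Both are loud simplifications on my part:

```python
def add_distributions(a, b, tolerance=1e-12):
    """Pointwise sum of two amplitude distributions given as {configuration: amplitude}."""
    total = {}
    for dist in (a, b):
        for config, amp in dist.items():
            total[config] = total.get(config, 0) + amp
    # Drop configurations whose amplitudes cancelled.
    return {config: amp for config, amp in total.items() if abs(amp) > tolerance}

def blob_count(dist):
    """Number of configurations carrying nonzero amplitude."""
    return len(dist)

first = {"world-1": 0.5 + 0j, "world-2": 0.5j}
second = {"world-2": -0.5j, "world-3": 0.5 + 0j}
combined = add_distributions(first, second)
```

Each of `first` and `second` has two blobs, but `combined` has only two rather than four, because the amplitudes at "world-2" cancel.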

So it actually is possible that we could pawn off the only non-linear phenomenon in all of quantum physics onto a better understanding of consciousness.  The question "How many conscious observers are contained in an evolving amplitude distribution?" has obvious reasons to be non-linear.


Robin Hanson has made a suggestion along these lines.


Decoherence is a physically continuous process, and the interaction between LEFT and RIGHT blobs may never actually become zero.

So, Robin suggests, any blob of amplitude which gets small enough, becomes dominated by stray flows of amplitude from many larger worlds.

A blob which gets too small, cannot sustain coherent inner interactions—an internally driven chain of cause and effect—because the amplitude flows are dominated from outside.  Too-small worlds fail to support computation and consciousness, or are ground up into chaos, or merge into larger worlds.

Hence Robin's cheery phrase, "mangled worlds".

The cutoff point will be a function of the squared modulus, because unitary physics preserves the squared modulus under evolution; if a blob has a certain total squared modulus, future evolution will preserve that integrated squared modulus so long as the blob doesn't split further.  You can think of the squared modulus as the amount of amplitude available to internal flows of causality, as opposed to outside impositions.

The seductive aspect of Robin's theory is that quantum physics wouldn't need interpreting.  You wouldn't have to stand off beside the mathematical structure of the universe, and say, "Okay, now that you're finished computing all the mere numbers, I'm furthermore telling you that the squared modulus is the 'degree of existence'."  Instead, when you run any program that computes the mere numbers, the program automatically contains people who experience the same physics we do, with the same probabilities.

A major problem with Robin's theory is that it seems to predict things like, "We should find ourselves in a universe in which very few decoherence events have already taken place," which tendency does not seem especially apparent.

The main thing that would support Robin's theory would be if you could show from first principles that mangling does happen; and that the cutoff point is somewhere around the median amplitude density (the point where half the total amplitude density is in worlds above the point, and half beneath it), which is apparently what it takes to reproduce the Born probabilities in any particular experiment.
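To pin down what "median" means here, a minimal sketch, assuming each world is summarized by a single number for its measure (its integrated squared modulus).  The function name and the sample measures are hypothetical, and the code only locates the halfway point; it does nothing to show that mangling happens or that this cutoff reproduces the Born probabilities.

```python
def median_measure_cutoff(measures):
    """Return the largest measure m such that worlds with measure at least m
    together hold at least half of the total measure."""
    total = sum(measures)
    running = 0.0
    for m in sorted(measures, reverse=True):
        running += m
        if running >= total / 2:
            return m
    raise ValueError("measures must be non-empty and positive")

# Hypothetical world measures, largest worlds first.
world_measures = [0.4, 0.3, 0.2, 0.05, 0.03, 0.02]
cutoff = median_measure_cutoff(world_measures)
```

With these made-up numbers the cutoff lands on the 0.3 world:  the two largest worlds hold about 70% of the measure, while the largest alone holds only 40%.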

What's the probability that Hanson's suggestion is right?  I'd put it under fifty percent, which I don't think Hanson would disagree with.  It would be much lower if I knew of a single alternative that seemed equally... reductionist.

But even if Hanson is wrong about what causes the Born probabilities, I would guess that the final answer still comes out equally non-mysterious.  Which would make me feel very silly, if I'd embraced a more mysterious-seeming "answer" up until then.  As a general rule, it is questions that are mysterious, not answers.

When I began reading Hanson's paper, my initial thought was:  The math isn't beautiful enough to be true.

By the time I finished processing the paper, I was thinking:  I don't know if this is the real answer, but the real answer has got to be at least this normal.

This is still my position today.


Part of The Quantum Physics Sequence

Next post: "Decoherence as Projection"

Previous post: "Decoherent Essences"


I guess I was too quick to assume that mangled worlds involved some additional process. Oops.

Unless there is a surprising amount of coherence between worlds with different lottery outcomes, this mangled worlds model should still be vulnerable to my lottery winning technique (split the world a bunch of times if you win).


I haven't commented in a while. I'm just curious, are there any non-physicists who are able to follow this whole quantum series? I gave up some posts ago.


You wouldn't have to stand off beside the mathematical structure of the universe, and say, "Okay, now that you're finished computing all the mere numbers, I'm furthermore telling you that the squared modulus is the 'degree of existence'."

Instead, you'd have to stand off beside the mathematical structure of the universe, and say, "Okay, now that you're finished computing all the mere numbers, I'm furthermore telling you that the world count is the 'degree of existence'."

Roland: yes, at least one. Where did you give up and why?

A major problem with Robin's theory is that it seems to predict things like, "We should find ourselves in a universe in which lots of decoherence events have already taken place," which tendency does not seem especially apparent.

Actually the theory suggests we should find ourselves in a state with near the least feasible number of past decoherence events. Yes, it is not clear if this in fact holds, and yes I'd put the chance of something like mangled worlds being right as more like 1/4 or 1/3.

Thanks to Eliezer's QM series, I'm starting to have enough background to understand Robin's paper (kind of, maybe). And now that I do (kind of, maybe), it seems to me that Robin's point is completely demolished by Wallace's points about decoherence being continuous rather than discrete and therefore there being no such thing as a number of discrete worlds to count.

There seems to be nothing to resolve between the probabilities given by measure and the probabilities implied by world count if you simply say that measure is probability.

Eliezer objects. We're... (read more)

Reading through this, and Hanson's quick overview page of mangled worlds, I was wondering the same thing myself. For some reason though, seeing you ask the question I hadn't quite verbalized put the answer right on the tip of my tongue: for the same reason Einstein was so sure of General Relativity. The modulus squared law conflicts with a regularity in the form that the fundamental laws seem to take, specifically their linear evolution, and Eliezer puts stock in that regularity. In fact, he does so sufficiently to let him elevate any theory which accounts for the data while holding the regularity far above those that don't, similar to how Einstein picked GR out of hypothesis space. The benefit of the mangled worlds interpretation is that while the universe-amplitude-blobs do have measure (a non-linear element), it is irrelevant to what actually happens. It really only comes into play when trying to understand the interaction between the universe-amplitude-blobs, but it doesn't play a part in actually describing that interaction. For example, the possible mangling of a world of small measure would be described by normal linear quantum evolution, but since the calculations are not very nice, we can determine whether it would be mangled using that measure. Thus, we are using the measure as a mathematical shortcut to determine generalized behavior, but all evolution is linear, and observations can be explained without the extra hypothesis that "measure is probability".


My understanding of Eli's beef with the Born rule is this (he can correct me if I'm wrong): the Born rule appears to be a bridging rule in fundamental physics that directly tells us something about how qualia bind to the universe. This seems odd. Furthermore, if the binding of qualia to the universe is given by a separate fundamental bridging rule independent of the other laws of physics, then the zombie world really is logically possible, or in other words epiphenomenalism is true. (Just postulate a universe with all the laws of physics except Born... (read more)

"Eli argues against epiphenomenalism on the grounds that if epiphenomenalism is true, then the correlation between beliefs (which are qualia) with our statements and actions (which are physical processes) is just a miraculous coincidence." Supposing he does, I must point out that it is false to say that beliefs are qualia. In fact, beliefs are part of the intentional stance. That is well worked out in Dennett's book by the same name. The intentional level can be accounted for in physical terms (see, for instance, "Kinds of Minds" by Dennett to see how intentionality unfolds from genes to amoebas to Karl Popper). One could insist on being a phenomenal realist, and say that beliefs are both an intentional interpretation of a physical system that can be accounted for without the aid of qualia, and furthermore that there is another aspect of beliefs that is the experiential aspect, the qualia-ness of them. Even holding such a position, one needs only to explain our beliefs insofar as they are physically causally effective upon the world (for instance causing us to talk about qualia, beliefs, etc.). So if there are beliefs as intentional descriptions of organisms, AND in addition beliefs as qualia, the second kind is UTTERLY unexplainable by its very nature. There is no need to account for them, because we have no reason to believe they exist, since if they did, they would not figure in our theories, being causally inefficient.

None of the confusion over duplication and quantum measures seems unique to beings with qualia; any Bayesian system capable of anthropic reasoning, it would seem, should be surprised the universe is orderly. So maybe either the confusion is separate from and deeper than experience, or AIXItl has qualia.


As I understand it (someone correct me if I'm wrong), there are two problems with the Born rule: 1) It is non-linear, which suggests that it's not fundamental, since other fundamental laws seem to be linear

2) From my reading of Robin's article, I gather that the problem with the many-worlds interpretation is: let's say a world is created for each possible outcome (countable or uncountable). In that case, the vast majority of worlds should end up away from the peaks of the distribution, just because the peaks only occupy a small part of any distribution.

Rob... (read more)

The observer's consciousness is still involved. Imagine that the Born rule isn't a law of the universe itself, but of consciousness. The universe evaluates all branches. Consciousness follows the branches in weights following the Born rule. The conscious observer always finds themselves down a series of branches that were selected by the Born rule, and it's easy for them to take measurements to confirm this. The Mathematica 5000 machine that's come down this series of branches has made measurements from experiments and has found that the Born rule has held. It only comes up with this result because this is the version of the machine that has followed the observer's consciousness through the branches. In the raw universe, most worlds have the Mathematica 5000 machine finding that Born's rule does not hold; these aren't the worlds that conscious observers usually find themselves in though.

Nick: I don't understand the connection to quantum mechanics.

The argument that I commonly see relating quantum mechanics to anthropic reasoning is deeply flawed. Some people seem to think that many worlds means there are many "branches" of the wavefunction and we find ourselves in them with equal probability. In this case, they argue, we should expect to find ourselves in a disorderly universe. However, this is exactly what the Born rule (and experiment!) does not say. Rather, the Born rule says that we are only likely to find ourselves in states... (read more)

In this case epiphenomenalism would be true (since qualia have no effect on the physical world), but the correlation would not be a coincidence (since the physical world directly causes qualia).

But the nature of the experiences we claimed to have would not depend in any way on the properties of these hypothetical 'qualia'. There would be no event in the physical world that would be affected by them - they would not, in fact, exist.

Epiphenomenalism is never true, because it contains a contradiction in terms.

Here's a different question which may be relevant: why unitary transforms?

That is, if you didn't in the first place know about the Born rule, what would be a (even semi) intuitive justification for the restriction that all "reasonable" transforms/time evolution operators have to conserve the squared magnitude?

Given the Born rule, it seems rather obvious, but the Born rule itself is what currently appears to be suspiciously out of place. So, if that arises out of something more basic, then why the unitary rule in the first place?

Stephen, thanks for your thoughts on Eli's thoughts. I'm going to have to think on them further - after all these helpful posts I can pretend I understand quantum mechanics, but pretending to understand how conscious minds perceive a single point in configuration space instead of blobs of amplitude is going to take more work.

I will point out, though, that the question of how consciousness is bound to a particular branch (and thus why the Born rule works like it does) doesn't seem that much different from how consciousness is tied to a particular point in ... (read more)

"Given the Born rule, it seems rather obvious, but the Born rule itself is what currently appears to be suspiciously out of place. So, if that arises out of something more basic, then why the unitary rule in the first place?"

While not an answer, I know of a relevant comment. Suppose you assume that a theory is linear and preserves some norm. What norm might it be? Before addressing this, let's say what a norm is. In mathematics a norm is defined to be some function on vectors that is only zero for the all zeros vector, and obeys the triangle i... (read more)

This seems to be true, but with the small note that you should add multiplication of the coordinates by -1 [by any number from the unit circle if the space is taken over complex numbers] and their compositions with permutations to the allowed isomorphisms. Never heard about this though, interesting. However this does not generalize to all the norms. As Douglas noted below one can imagine a norm simply as a central-symmetric convex body. And there are plenty of those. Now if we can fix a finite subgroup of space rotations and symmetries that strictly contains all the coordinate permutations and central-symmetry then we are done, since one can simply take the convex hull of the orbit of some point as your desired norm. Symmetries and rotations of a regular 100-gon on the plane would work for example. Hmm, something fishy is going on with signs in the whole argument and here I am completely lost. What if I take a 2x2 matrix with all entries equal to 1/2 and a vector (1/2, -1/2)? Probably the full formulation by Scott would help. Does anybody have a link?
Thank you. Nice paper. Signs are treated accurately there of course. However call to "formal functions" in the end of the proof seems wacky at best. Formalizing it looks harder to me than the initial statement. At this point it should be easier to just look at the smoothness degrees of the norm on x_i = 0 hyperplanes. If anybody knows what was meant, however, please clarify.

"I will point out, though, that the question of how consciousness is bound to a particular branch (and thus why the Born rule works like it does) doesn't seem that much different from how consciousness is tied to a particular point in time or to a particular brain when the Spaghetti Monster can see all brains in all times and would have to be given extra information to know that my consciousness seems to be living in this particular brain at this particular time."


More generally, it seems to me that many objections people raise about the fo... (read more)

Psy-Kosh, the amplitudes of everything everywhere could be changing by a constant modulus and phase, without it being noticed. But if it were possible for you to carry out some physical process that changed the squared modulus of the LEFT blob as a whole, without splitting it and without changing the squared modulus of the RIGHT blob, then you would be able to use this physical process to change the ratio of the squared moduli of LEFT and RIGHT, hence control the outcome of arbitrary quantum experiments by invoking it selectively.

It would be an Outcome Pu... (read more)

Stephen: Thanks. First, not everything corresponding to a length or such obeys that particular rule... consider the Lorentz metric... any "lightlike" vector has a norm of zero, for instance, and yet that particular metric is rather useful physically. :) (admittedly, you get that via the minus sign, and if your norm is such that it treats all the components in some sense equivalently, you don't get that... well, what about norms involving cross terms?)

More to the subject... why is any norm preserved? That is, why only allow norm preserving transfor... (read more)


Good example with the Lorentz metric.

Invariance of norm under permutations seems a reasonable assumption for state spaces. On the other hand, I now realize the answer to my question about whether permutation invariance narrows things down to p-norms is no. A simple counterexample is a linear combination of two different p-norms.
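A quick numerical check of that counterexample, assuming we take the sum of the 1-norm and the 2-norm on three coordinates (the helper names and the "implied p" test are my own illustrative choices).  If a norm were a constant multiple of some p-norm, then the norm of k ones divided by the norm of a single one would be k**(1/p), so the p implied by k = 2 and by k = 3 would have to agree.

```python
import math

def p_norm(x, p):
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def mixed_norm(x):
    """A linear combination of two different p-norms: the 1-norm plus the 2-norm."""
    return p_norm(x, 1) + p_norm(x, 2)

def implied_p(norm, k):
    """The p that a scaled p-norm would need in order to match norm on k ones."""
    ones = [1.0] * k + [0.0] * (3 - k)
    unit = [1.0, 0.0, 0.0]
    return math.log(k) / math.log(norm(ones) / norm(unit))

# Permutation invariance: reordering the coordinates leaves the value unchanged.
same_under_permutation = math.isclose(mixed_norm([1, -2, 3]), mixed_norm([3, 1, -2]))

# For a genuine p-norm the two implied exponents coincide; for the mix they do not.
p_from_two = implied_p(mixed_norm, 2)
p_from_three = implied_p(mixed_norm, 3)
```

The two implied exponents come out near 1.30 and 1.28 respectively, so no single p fits, even though the norm is symmetric under permutations.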

I think there might be a good reason to think in terms of norm-preserving maps. Namely, suppose the norms can be anything but the individual amplitudes don't matter, only their ratios do. That is, states are identified not ... (read more)

I'm struck by guilt for having spoken of "ratios of amplitudes". It makes the proposal sound more specific and fully worked-out than it is. Let me just replace that phrase in my previous post with the vaguer notion of "relative amplitudes".

Stephen: Is the point you're making basically along the lines of "vector as geometric object rather than list of numbers"?

Sure, I buy that. Heck, I'm naturally inclined toward that perspective at this time. (In part because I have been studying GR lately)

Aaanyways, so I guess basically what you're saying is that all operators corresponding to time evolution or whatever are just rotations or such in the space? And why the 2-norm instead of, say, the 1-norm? why would the universe "prefer" to preserve the sum of the squared magnitudes rathe... (read more)

@Roland: My physics and maths is patchy but I'm still just about following (the posts - some comments are way too advanced) though it is hard work for some bits. Lots of slow re-reading, looking things up and revising old posts, but it's worth it.

If you're determined enough, try reading the posts a few at a time (instead of one a day) starting a few posts before where you got stuck, and make sure you "get" each one before you move on, even if it means an hour on another web source studying the thing you don't understand in Eliezer's explanation.


"Or did I completely and utterly misunderstand what you were trying to say?"

No, you are correctly interpreting me and noticing a gap in the reasoning of my preceding post. Sorry about that. I re-looked-up Scott's paper to see what he actually said. If, as you propose, you allow invertible but non-norm-preserving time evolutions and just re-adjust the norm afterwards then you get FTL signalling, as well as obscene computational power. The paper is here.

A major problem with Robin's theory is that it seems to predict things like, "We should find ourselves in a universe in which lots of decoherence events have already taken place," which tendency does not seem especially apparent.

Actually the theory suggests we should find ourselves in a state with near the least feasible number of past decoherence events

I don't understand this - doesn't decoherence occur all the time, in every quantum interaction between all amplitudes all the time? So, like for every amplitude separate enough to be a "particle"... (read more)

Ramana Kumar
I'd also love to know the answer to Peter's question... A similar question is whether we should expect all worlds to eventually become mangled (assuming the "mangled worlds" model). I understand "world" to mean "somewhat isolated blob of amplitude in an amplitude distribution" - is that right?
The answer to Peter's question is: no, decoherence doesn't happen with a constant rate and it certainly doesn't happen on the Planck time scale. The answer to your question is that "mangled worlds" is a collapse theory: some worlds get mangled and go away, leaving other worlds.
Ramana Kumar
Then I'm still unclear about what a world is. Care to explain?
Ramana Kumar
Eliezer gave a simpler answer to my question: "yes". (I'm still not sure what yours means.) Back to Peter's question. What makes you say decoherence doesn't happen on the Planck time scale? Can you explain that further?
Any given instance of decoherence is an interaction between two or more particles. And all known interactions take rather longer than Planck time. There probably are enough decoherence events in the universe that at least one occurs somewhere in each Planck time unit. But that doesn't instantly decohere everything. Other objects remain coherent until they interact with the decohered system, which is limited by the rate at which information propagates (both latency and bandwidth) (unless of course they decohere on their own). I.e., after a blob of amplitude has split, the sub-blobs are only separated along some dimensions of configuration space, and retain the same cross-section along the rest of the dimensions (hence "factors").
Okay, given one sub-decoherence event per Planck time, somewhere in the universe, propagating throughout it at some rate less than or equal to the speed of light... we either have constant (one per Planck time or less) full decoherence events after some fixed time as each finishes propagating sufficiently, or we have no full decoherence events at all as the sub-decoherences fail to decohere the whole sufficiently. The latter seems more realistic, especially given the light speed limit, as the expansion of space can completely causally isolate two parts of the universe, preventing the propagation of the decoherence. So, with this understood, we're left to determine how large a portion of the universe has to be decohered to qualify as a "decoherence event" in terms of the many worlds theories which rely on the term. I honestly doubt that, once a suitable determination has been made, the events will be infrequent in almost any sense of the word. It really does seem, given the massive quantities of interactions in our universe (even just the causally linked subspace of it we inhabit), that the frequency of decoherence events should be ridiculously high. And given some basic uniformity assumptions, the rate should be quite regular too.

Stephen: I don't have a postscript viewer.

Wait, I thought the superpower stuff only happens if you allow nonlinear transforms, not just nonunitary. Let's add an additional restriction: let's actually throw in some notion of locality, but even with the locality, abandon unitaryness. So our rules are "linear, local, invertible" (no rescaling afterwards... not defining a norm to preserve in the first place)... or does locality necessitate unitarity? (is unitarity a word? Well, you know what I mean. Maybe I should say orthogonality instead?)

Well, actu... (read more)

"If you didn't know squared amplitudes corresponded to probability of experiencing a state, would you still be able to derive "nonunitary operator -> superpowers?""

Scott looks at a specific class of models where you assume that your state is a vector of amplitudes, and then you use a p-norm to get the corresponding probabilities. If you demand that the time evolutions be norm-preserving then you're stuck with permutations. If you allow non-norm-preserving time evolution, then you have to readjust the normalization before calculating ... (read more)

Stephen: Aaah, okay. And yeah, that's why I said no rescaling.

I mean, if one didn't already have the "probability of experiencing something is linear in p-norm..." thing, would one still be able to argue superpowers?

From your description, it looks like he still has to use the principle of "probability of experiencing something proportional to p-norm" to justify the superpowers thing.

Browsed through the paper, and, if I interpreted it right, that is kinda what it was doing... Assume there's some p-norm corresponding to probability. But ma... (read more)

Are all the norms that are invariant under permutation of the indices p-norms?

Well, you answered that exact question, but here's a description of all norms (on a finite dimensional real vector space): a norm determines the set of all vectors of norm less than or equal to 1. This set is convex and symmetric under inverting sign (if you wanted complex, you'd have to allow multiplication by complex units). It determines the norm: the norm of a vector is the amount you have to scale the set to envelop the vector. Any set satisfying those conditions determines a norm.

So th... (read more)
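The "scale the set until it envelops the vector" construction can be sketched numerically. This is a rough illustration only: `gauge`, `l1_ball`, and `l2_ball` are hypothetical names, and the code assumes the membership test really does describe a convex, sign-symmetric set containing a neighbourhood of the origin.

```python
def gauge(vec, in_unit_ball, iterations=100):
    """Approximate the smallest t >= 0 such that vec / t lies in the unit ball."""
    if all(x == 0 for x in vec):
        return 0.0
    lo, hi = 0.0, 1.0
    # Grow the upper bound until the scaled-down vector fits inside the set.
    while not in_unit_ball([x / hi for x in vec]):
        hi *= 2.0
    # Bisect between a scale that is too small (lo) and one that is big enough (hi).
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if in_unit_ball([x / mid for x in vec]):
            hi = mid
        else:
            lo = mid
    return hi

def l1_ball(vec):
    """Membership test for the unit ball of the 1-norm."""
    return sum(abs(x) for x in vec) <= 1.0

def l2_ball(vec):
    """Membership test for the unit ball of the 2-norm."""
    return sum(x * x for x in vec) <= 1.0
```

Calling `gauge([3.0, 4.0], l2_ball)` recovers the Euclidean length and `gauge([3.0, -4.0], l1_ball)` the sum of absolute values, up to the bisection tolerance. Any other membership test satisfying the stated conditions could be passed in the same way.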

Weren't the Born probabilities successfully derived from decision theory for the MWI in 2007 by Deutsch: "Probabilities used to be regarded as the biggest problem for Everett, but ironically, they are now its most powerful success" -

Wei Dai
There are a couple of recent papers on this topic:
  • A formal proof of the Born rule from decision-theoretic assumptions by David Wallace
  • Has the Born rule been proven? by J. Finkelstein
I personally find Finkelstein's response/counterargument convincing.
Hm, Wei_Dai(2009) seems to have a notion of rationality that is quite permissive if he's convinced by Finkelstein. If rationality isn't in fact permissive and instead stringently requires diachronic consistency (exceptionlessness, updatelessness, pre-rational priors) then I don't think Finkelstein's arguments are convincing. And there are positive arguments, e.g. by Derek Parfit, that rationality is normatively "thick".

If anyone can produce a cellular automaton model that can create circles like those which relate to the inverse square of distance or the stuff of early wave mechanics, I think I can bridge the MWI view and the one universe of many fidgetings view that I cling to. I know of one other person who has a similar idea; unfortunately his idea has a bizarre quantity which is the square root of a meter.

Consider for example what "scattering experiments" show, in a context of imagining that the universe is made of fields and that only "observation" makes a manifestation in a small region of space? I mean, suppose we think of the "observations" as being our detecting the impacts of the "scattered" electrons rather than the scatterings themselves. (IOW, we don't consider "mere" interactions to be observations - whatever that means.) But then why and how did the waves representing the electrons scatter as if o... (read more)

Re: "If anyone can produce a cellular automata model that can create circles like those which relate to the inverse square of distance" Producing such a cellular automaton model is trivial. See my: Gallery: Java CA program that made the images:

My guess is that Born's Rule is related to the Solomonoff Prior. Consider a program P that takes 4 inputs:

  • boundary conditions for a wavefunction
  • a time coordinate T
  • a spatial region R
  • a random string

What P does is take the boundary conditions, use Schrödinger's equation to compute the wavefunction at time T, then sample the wavefunction using the Born probabilities and the random input string, and finally output the particles in the region R and their relative positions.

Suppose this program, along with the inputs that cause it to output the descrip... (read more)
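A toy version of the sampling step of P can be written down under heavy simplifying assumptions. In this sketch the configuration space is a finite list of cells, the wavefunction is a list of complex amplitudes, the random string is a string of '0'/'1' characters, and the Schrödinger evolution is delegated to a caller-supplied function `evolve`, which is a hypothetical stand-in rather than a real solver. All names are invented for illustration.

```python
def bits_to_unit_interval(bits):
    """Interpret a string of '0'/'1' characters as a binary fraction in [0, 1)."""
    return sum(int(b) / 2 ** (i + 1) for i, b in enumerate(bits))

def born_sample(amplitudes, random_bits):
    """Pick a cell index with probability |amplitude|^2 / total, driven by random_bits."""
    weights = [abs(a) ** 2 for a in amplitudes]
    total = sum(weights)  # assumed nonzero
    u = bits_to_unit_interval(random_bits) * total
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if u < cumulative:
            return index
    return len(weights) - 1  # only reached in degenerate cases such as total == 0

def program_p(evolve, boundary_conditions, t, region, random_bits):
    """Toy P: evolve, sample one cell by the Born weights, report it if it lies in region."""
    amplitudes = evolve(boundary_conditions, t)  # hypothetical evolution step
    index = born_sample(amplitudes, random_bits)
    return index if index in region else None
```

For instance, with a trivial `evolve` that returns its input unchanged, `program_p(lambda bc, t: bc, [3, 4], 0.0, {1}, "1")` samples from weights 9 and 16 and reports cell 1. A real P would output particle positions rather than a single cell index.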

The Solomonoff prior depends on the encoding of algorithms, the Born rule doesn't. Or am I missing anything?
Wei Dai
That seems like a general argument against the whole Solomonoff Induction approach. I'd be happy to see the dependence on an encoding of algorithms removed, but until someone finds a way to do so, it doesn't seem to be a deal-breaker. I think my claim should apply to any encoding of algorithms one might use that isn't contrived specifically to make it false.
Is it possible (I'm not sure it makes sense to ask about easy) under our physics to build an intelligence that optimizes (or at least a structure that propagates itself) according to some metric other than the Born Rule? If not, then it should be anthropically unsurprising that we perceive probability as squared amplitude, even if there is no law of physics to that effect. Otoh if it is possible, then you could have a TOE from which you can't derive how to compute probability, and there's nothing wrong with that, because then there really is another way to interpret probability that other people in the universe (though of course not in our Everett branch) may be using. Fair rephrasing?
Hello Wei Dai. Your paradigm is a bit opaque to me. There's a cosmology here which involves programs, program outputs, and probability distributions over each, but I can't tell what's supposed to exist. Just the program outputs? The program outputs and the programs? Does the program correspond to "basic physical law", and program output to "the physical world"? If I try to abstract away from the metaphysical idiosyncrasies, the idea seems to be that Born's rule is true because the worlds which function according to Born's rule are the majority of the worlds in which sentient beings show up. Well, it could be true. But here's an interesting Bohmian fact: if you start out with an ensemble of Bohmian worlds deviating from the Born distribution, they will actually converge on it, solely due to Bohmian dynamics. (See quant-ph/0403034.) So something like the Bohmian equation of motion may actually be the more fundamental fact.
Wei Dai
In general, I think what exists are mathematical structures, which include computations as a subclass. Thanks for the link. That looks interesting, and I have a couple of questions that maybe you can help me with.
1. Why do they converge to the Born distribution? The authors make an analogy with thermal relaxation, but there is a standard explanation of the second law of thermodynamics in terms of sizes of macrostates in configuration space, and I don't see what the equivalent explanation is for Bohmian relaxation.
2. What about decoherence? Suppose you have a wavefunction that has decohered into two approximately non-interacting branches occupying different parts of configuration space. If you start with a Bohmian world that belongs to one branch, then in all likelihood its future evolution will stay within that branch, right? Now if you take an ensemble of Bohmian worlds that all belong to that branch, how will it converge to the Born distribution, which occupies both branches?
3. This is more of an objection to the Bohmian ontology than a question. If you look at Bohmian Mechanics as a computation, it consists of two parts: (1) evolution of the wavefunction, and (2) evolution of a point in configuration space, guided by the wavefunction. But it seems like all of the real work is being done in part 1. If you wanted to simulate a quantum system, for example, it seems sufficient to just do part 1, and then sample the resulting wavefunction according to Born's rule, and part 2 adds more complexity and computational burden without any apparent benefit.
"Why do they converge to the Born distribution?" Let's distinguish two versions of this question. First version: why does a generic non-Born ensemble of Bohmian worlds tend to become Born-like? I think the technical answer is to be found in footnote 9 and the discussion around equation 20. But ultimately I think it will come back to a Liouville theorem in the space of distributions. There is some natural metric under which the Born-like distributions are the majority. (Or perhaps it is that non-Born regions are traversed relatively quickly.) Second version: why does an individual Bohmian world contain a Born distribution of outcomes? This follows from the first part. An individual Bohmian world consists of a universal wavefunction and a quasiclassical trajectory. If you pick just a few of the classical variables, you can construct a corresponding reduced density matrix in the usual fashion, and a reduced Bohmian equation of motion in which the evolution of those variables depends on that density matrix and on influences coming from all the degrees of freedom that were traced over. So when you look at all the instances, within a single Bohmian history, of a particular physical process, you are looking at an ensemble of noisy Bohmian microhistories. The argument above suggests that even if this starts as a non-Born ensemble, it will evolve into a Born-like ensemble. The only complication is the noise factor. But it is at least plausible that in the majority of Bohmian worlds, this nonlocal noise is just noise and does not introduce an anti-Born tendency. From an all-worlds-exist perspective, which we both favor, I would summarize as follows: (1) the Born distribution is the natural measure on the subset of worlds consisting of the Bohmian worlds (2) most Bohmian worlds will exhibit an internal Born distribution of physical outcomes. At present these are conjectures rather than theorems, but I would consider them plausible conjectures in the light of Valentini's wor
Epistemic hygiene alert!
More specifically, to replace my previous summary comment: the above statement sounds kind of redeemable, but it's so vague and common-sensually absurd that I think it makes a negative contribution. Things like this need to be said clearly, or not at all. It invites all sorts of kookery, not just in the format of presentation, but in one's own mind as well.
Wei Dai
Huh, that's a surprising response. I thought that at least the intended meaning would be obvious for someone familiar with the Solomonoff Prior. I guess "vague" I can address by making my claim mathematically precise, but why "common-sensually absurd"?
Re absurd: It's not clear why you would say something like the quote.
Wei Dai
I was hoping that it would trigger an insight in someone who might solve this mystery for me. As I said, I'm not sure how to develop it into a full answer myself (but it might be related to this other vague/possibly-absurd idea). Perhaps I'm abusing this community by presenting ideas that are half-formed and "epistemically unhygienic", but I expect that's not a serious danger. It seems like a promising direction to explore, that I don't see anyone else exploring (kind of like UDT until recently). I have too many questions I'd like to see answered, and not enough time and ability to answer them all myself.
Wei Dai
I just read in Scott Aaronson's Quantum Computing, Postselection, and Probabilistic Polynomial-Time that if the exponent in the probability rule was anything other than 2, then we'd be able to do postselection without quantum suicide and solve problems in PP. (See Page 6, Theorem 6.) The same is true if quantum mechanics was non-linear. Given that, my conjecture is implied by one that says "sentience is unlikely to evolve in a world where problems in PP (which is probably strictly harder than PH, which is probably strictly harder than NP) can be easily solved" (presumably because intelligence wouldn't be useful in such a world).
Interesting. What would such a world look like? I imagine instead of a selection pressure for intelligence there would be a selection pressure for raw memory, so that you could perfectly model any creature with less memory than yourself. It seems that this would be a very intense pressure, since the upper hand is essentially guaranteed superiority, and you would ultimately wind up with galaxy sized computers running through all possible simulations of other galaxy sized computers. I never put much stock in the simulation hypothesis, because I couldn't see why an entity capable of simulating our universe would derive any value from doing it. This scenario makes me rethink that a little. In any case, while this is another potential reason why the rule must be 2 in our universe, it still doesn't shed any light on the mechanism by which our subjective experience follows this rule.
Wei Dai
I don't know. I don't have a very good understanding of regular quantum computing, much less the non-Born "fantasy" quantum computers that Aaronson used in his paper. But I'm going to guess that your speculation is probably wrong, unless you happen to be an expert in this area. These things tend not to be very intuitive at all.
I honestly can't imagine my evolution story is right. It just seemed like an immensely fun opportunity for speculation.

The Transactional Interpretation of QM resolves the mystery of where this nonlinear squared modulus comes from quite neatly. On that basis alone, I'm surprised that Eliezer doesn't even mention it as a serious rival to MWI.


Don't the transactional interpretation's followers claim that standard QM gives the wrong result on the Afshar experiment? Or is that not all of them?
Cramer argues that both Copenhagen and MWI are inconsistent with the results of the Afshar experiment.
Yeah, but he's wrong. Almost no physicists accept his argument as mathematically valid. If the transactional interpretation does give different results, then it is incompatible with experiment.
If you're talking about the Afshar experiment, Unruh demolished that convincingly. We don't need to take it on trust that Afshar is wrong. However, Afshar and Cramer were only ever arguing about the interpretation of the results of Afshar's experiment, not what those results would be. It would be most unwise to rule out the transactional interpretation just because its inventor subsequently said something foolish.
See the grandparent; Cramer justified the transactional interpretation by saying that it was the only interpretation able to give the correct result for the Afshar experiment. This being wrong removes much of the claimed evidence.
Sure. I think the bottom line is that the Afshar experiment doesn't give empirical support, or even 'philosophical support', to any interpretation. It's a wild goose chase.

First of all - great sequence! I had a lot of 'I see!'-moments reading it. I study physics, but often the clear picture gets lost in the standard approach and one is left with a lot of calculating techniques without any intuitive grasp of the subject. After reading this I became very fond of tutoring the course on quantum mechanics and always tried to give some deeper insight (much of which was taken from here) in addition to just explaining the exercises. If I am correct, the world mangling theory just tries to explain some anomalies, but the rule of squa... (read more)

Hm, just read the article again and saw that much of this was already explained there. But the essential point is that although the full information of a system is given by the amplitude distribution over all possible configurations, this information is not accessible to another system. When we try to couple the system to another (for example, by copying the state), this only respects the pure 'classical' states as described above. Thus it is possible to ask the question 'how much do these two states have in common', where one classical state compared with itself gives 1 and compared with another one gives 0. If we want to also be able to compare mixed states, the notion of a scalar product comes in. The squared modulus is just the comparison of a state with itself, which is constantly 1 - obviously, the state has a hell of a lot in common with itself.

Suppose that the probability of an observer-moment is determined by its complexity, instead of the probability of a universe being determined by its complexity and the probability of an observation within that universe being described by some different anthropic selection.

You can specify a particular human's brain by describing the universal wave function and then pointing to a brain within that wave function. Now the mere "physical existence" of the brain is not relevant to experience; it is necessary to describe precisely how to extract a descr... (read more)

Does your argument work as a post hoc explanation of any regular system of physics and sampling laws, provided you're an observer that finds itself within it?

Could the flow of amplitude between blobs we normally think of as separated following a measurement possibly explain the quantum field theory prediction/phenomenon of vacuum fluctuations?

Nope. Vacuum fluctuations happen because the field that tells you whether there's a particle there or not behaves like a quantum thing and not a classical thing, and you end up with a non-boring vacuum state for the same reason atoms have non-boring ground states rather than collapsing in on themselves. Weird as all get out, but not quantum-mechanics-breaking, and measured reasonably well by the Casimir effect (though also horribly wrong because of the cosmological constant problem, but that's a problem for quantum gravity to sort out, not one that can be solved by big changes to already-tested parts of quantum mechanics).

I'm a bit puzzled by the problem here. What's wrong with the interpretation that the Born probabilities just are the limiting frequencies in infinite independent repetitions of the same experiment? Further, that these limiting frequencies really are defined because the universe really is spatially infinite, with infinitely many causally isolated regions. There is nothing hypothetical at all about the infinite repetition - it actually happens.

My understanding is that in such a universe model, the Everett-Wheeler version of quantum theory makes a precise pre... (read more)

That gave me, if I am not mistaken, the last piece of the puzzle. Let's just take the naive definition of probability: the relative frequency of outcomes as N goes to infinity. Now prepare N systems independently in the state a|0>+b|1> (with a and b taken real and a²+b²=1, to keep the notation simple). Now measure one after another, coupling the measurement device to each system in turn. At first we have
  (a|0>+b|1>)^N |0>.
Now the first one is measured:
  (a|0>+b|1>)^(N-1) (a|0,0>+b|1,1>)
where the number after the comma denotes the state of the measuring device, which just counts the number of measured ones. After the second measurement we have
  (a|0>+b|1>)^(N-2) (a²|00,0>+ab|01,1>+ab|10,1>+b²|11,2>).
Since the two states ab|01,1> and ab|10,1> are not distinguished by the measurement, the basis should be changed, and this is the crucial point: |01>+|10> has a length of sqrt(2), so if we change the basis to |+>=(|01>+|10>)/sqrt(2), we have
  (a|0>+b|1>)^(N-2) (a²|00,0>+sqrt(2)ab|+,1>+b²|11,2>).
The coefficients are like those in the binomial theorem, but note the square root! Continuing, we will get something similar to a binomial distribution:
  sum(k=0..N: sqrt(N!/(k!(N-k)!)) a^(N-k) b^k |...,k>).
Now it remains to prove that for k/N not equal to b² the amplitudes go to zero as N goes to infinity. This is equivalent to the square of the amplitude going to zero (this is just to make the calculation easier, it does not have anything to do with the Born rule). For |...,k> it is
  c_k² = N!/(k!(N-k)!) (a²)^(N-k) (b²)^k,
which becomes a Gaussian distribution for large N, with mean at k=Nb² and variance Na²b². So at k/N=b²+d it has a value proportional to exp(-(Nd)²/(2Na²b²))=exp(-Nd²/(2a²b²)) --> 0 as N --> inf. So a time capsule where the records indicate that some quantum experiment has been performed a great number of times and the Born rule is broken will have an amplitude that goes to zero (yeah, I just read Barbour's book).
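The concentration claimed in this comment can be checked numerically for moderate N. The sketch below is an illustration under stated assumptions: `p_one` stands for the squared amplitude of the outcome the device counts, the function names are invented, and the binomial-form squared amplitude is evaluated in log space to avoid overflow. It sums the squared amplitudes of all counter states whose recorded frequency k/N differs from `p_one` by at least d.

```python
import math

def squared_amplitude(n, k, p_one):
    """C(n, k) * p_one^k * (1 - p_one)^(n - k), computed in log space (0 < p_one < 1)."""
    log_value = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                 + k * math.log(p_one) + (n - k) * math.log(1.0 - p_one))
    return math.exp(log_value)

def weight_outside(n, p_one, d):
    """Total squared amplitude on counter states with |k/n - p_one| >= d."""
    return sum(squared_amplitude(n, k, p_one)
               for k in range(n + 1) if abs(k / n - p_one) >= d)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, weight_outside(n, 0.3, 0.05))
```

The printed totals shrink as N grows, in line with the Gaussian estimate in the comment. The sketch says nothing about why squared amplitude should be the relevant measure; it only evaluates the quantity the comment computes.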
Yes, this is called the Finkelstein-Hartle theorem (D. Finkelstein, Transactions of the New York Academy of Sciences 25, 621 (1963); J. B. Hartle, Am. J. Phys. 36, 704 (1968)). This theorem is the basis for constructing a limit operator for the relative frequency when there are infinitely many independent repetitions of a measurement, and showing that the product wave-function is an exact eigenstate of the relative frequency operator. Unfortunately, it seems that Hartle's construction of the frequency operator wasn't quite right, and needed to be generalized. (E. Farhi, J. Goldstone, and S. Gutmann, Ann. Phys. 192, 368 (1989)). Even so, the critics are still picky about the construction. There is a line of criticism that infinite frequency operators can be constructed arbitrarily as functions over Hilbert space, and unless you already know the Born rule, you won't know how to construct one sensibly (so that the Hartle derivation is circular). However this seems unfair, because if you want the relative frequency operator to obey the Kolmogorov axioms of probability then it has to coincide with the Born rule, something which is another long-standing result called Gleason's theorem. (The squared modulus of the amplitude is the only function of the measure which follows the axioms of probability.) Hence the full derivation is: 1) (Postulate) If the wavefunction is in an eigenstate of a measurement operator, then the measurement will with certainty have the corresponding eigenvalue. 2) (Postulate) Probability is relative frequency over infinitely many independent repetitions. 3) (Postulate) Relative frequency follows the Kolmogorov axioms of probability. 4) (Gleason's theorem) Relative frequency must converge to the Born rule (squared modulus of amplitude) over infinitely many repetitions, or it won't be able to follow the Kolmogorov axioms. 5) (Hartle's theorem, as strengthened by Farhi et al) There is a unique definition of the relative frequency operator over i

Perhaps I'm being too simplistic, but I see a decent explanation that doesn't get as far into the weeds as some of the others. It's proportional to the square because both the event being observed and the observer need to be in the same universe. If the particle can be in A or B, the odds are:

P(A)&O(A) = A^2

P(B)&O(B) = B^2

P(A)&O(B) = Would be AB, but this is physically impossible.

P(B)&O(A) = Would be AB, but this is physically impossible.

Squares fall out naturally.

There are a number of reasons this solution does not work. Here is one problem with the solution that does not require any discussion of the formalism or interpretation of quantum theory: According to you, the location of the particle and the location of the observer are correlated (this follows from the fact that some combinations are physically impossible). If that's the case, you can't calculate the probability of the conjunction by multiplying the probabilities of the conjuncts. That only works if the conjuncts are uncorrelated. More broadly, based on what you propose here I don't think you have sufficient understanding of quantum mechanics to fully appreciate the nature of the problem or the kind of solution that would be required. Your comment suggests several fairly fundamental misunderstandings about the theory. I hope this doesn't come off as impolite or condescending. It's the kind of thing I'd want someone to say to me if they genuinely believed it (although that in itself doesn't entail that it isn't impolite or condescending).
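The point about correlated conjuncts can be made concrete with a small joint table. This is a generic probability illustration with made-up numbers and hypothetical helper names, not a model of any quantum system: when the observer's record always matches the particle's location, the joint probability equals the marginal, not the product of the marginals.

```python
def marginals(joint):
    """Given a dict mapping (particle, observer) -> probability, return both marginals."""
    particle, observer = {}, {}
    for (p, o), prob in joint.items():
        particle[p] = particle.get(p, 0.0) + prob
        observer[o] = observer.get(o, 0.0) + prob
    return particle, observer

def is_independent(joint, tol=1e-12):
    """True if every joint entry equals the product of its marginals (within tol)."""
    particle, observer = marginals(joint)
    return all(abs(joint.get((p, o), 0.0) - particle[p] * observer[o]) <= tol
               for p in particle for o in observer)

# Perfectly correlated case: the mismatched combinations have probability zero.
correlated = {("A", "A"): 0.3, ("B", "B"): 0.7, ("A", "B"): 0.0, ("B", "A"): 0.0}

# Independent case with the same marginals: every entry is a product.
independent = {("A", "A"): 0.09, ("A", "B"): 0.21, ("B", "A"): 0.21, ("B", "B"): 0.49}
```

In the `correlated` table the probability of "particle at A and observer sees A" is 0.3, while multiplying the marginals would give 0.09. The multiplication rule only reproduces the table in the `independent` case.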
I didn't expect something that simple had escaped everyone's notice (though I suppose I should have said that more explicitly in my post) - I threw it out there because it made sense at first glance and had no immediately obvious problems, not because I figured I had definitely cracked the problem. Easier to see if there's a known response than to try to figure it out myself. So no, I'm not annoyed by your response. And I do think I see what you're getting at. Oh well, it was worth a shot.