
Decoherence is Simple

6th May 2008


An epistle to the physicists:

When I was but a little lad, my father, a PhD physicist, warned me sternly against meddling in the affairs of physicists; he said that it was hopeless to try to comprehend physics without the formal math. Period. No escape clauses. But I had read in Feynman’s popular books that if you really understood physics, you ought to be able to explain it to a nonphysicist. I believed Feynman instead of my father, because Feynman had won the Nobel Prize and my father had not.

It was not until later, when I was reading the Feynman Lectures in fact, that I realized that my father had given me the simple and honest truth. No math = no physics.

By vocation I am a Bayesian, not a physicist. Yet although I was raised not to meddle in the affairs of physicists, my hand has been forced by the occasional gross misuse of three terms: simple, falsifiable, and testable.

The foregoing introduction is so that you don’t laugh, and say, “Of course I know what those words mean!” There is math here. What follows will be a restatement of the points in Belief in the Implied Invisible, as they apply to quantum physics.

Let’s begin with the remark that started me down this whole avenue, of which I have seen several versions; paraphrased, it runs:

The many-worlds interpretation of quantum mechanics postulates that there are vast numbers of other worlds, existing alongside our own. Occam’s Razor says we should not multiply entities unnecessarily.

Now it must be said, in all fairness, that those who say this will usually also confess:

But this is not a universally accepted application of Occam’s Razor; some say that Occam’s Razor should apply to the laws governing the model, not the number of objects inside the model.

So it is good that we are all acknowledging the contrary arguments, and telling both sides of the story—

But suppose you had to calculate the simplicity of a theory.

The original formulation of William of Ockham stated:

Lex parsimoniae: Entia non sunt multiplicanda praeter necessitatem.

“The law of parsimony: Entities should not be multiplied beyond necessity.”

But this is qualitative advice. It is not enough to say whether one theory seems more simple, or seems more complex, than another—you have to assign a number; and the number has to be meaningful, you can’t just make it up. Crossing this gap is like the difference between being able to eyeball which things are moving “fast” or “slow,” and starting to measure and calculate velocities.

Suppose you tried saying: “Count the words—that’s how complicated a theory is.”

Robert Heinlein once claimed (tongue-in-cheek, I hope) that the “simplest explanation” is always: “The woman down the street is a witch; she did it.” Eleven words—not many physics papers can beat that.

Faced with this challenge, there are two different roads you can take.

First, you can ask: “The woman down the street is a what?” Just because English has one word to indicate a concept doesn’t mean that the concept itself is simple. Suppose you were talking to aliens who didn’t know about witches, women, or streets—how long would it take you to explain your theory to them? Better yet, suppose you had to write a computer program that embodied your hypothesis, and output what you say are your hypothesis’s predictions—how big would that computer program have to be? Let’s say that your task is to predict a time series of measured positions for a rock rolling down a hill. If you write a subroutine that simulates witches, this doesn’t seem to help narrow down where the rock rolls—the extra subroutine just inflates your code. You might find, however, that your code necessarily includes a subroutine that squares numbers.

Second, you can ask: “The woman down the street is a witch; she did what?” Suppose you want to describe some event, as precisely as you possibly can given the evidence available to you—again, say, the distance/time series of a rock rolling down a hill. You can preface your explanation by saying, “The woman down the street is a witch,” but your friend then says, “What did she do?,” and you reply, “She made the rock roll one meter after the first second, nine meters after the third second…” Prefacing your message with “The woman down the street is a witch,” doesn’t help to compress the rest of your description. On the whole, you just end up sending a longer message than necessary—it makes more sense to just leave off the “witch” prefix. On the other hand, if you take a moment to talk about Galileo, you may be able to greatly compress the next five thousand detailed time series for rocks rolling down hills.

If you follow the first road, you end up with what’s known as Kolmogorov complexity and Solomonoff induction. If you follow the second road, you end up with what’s known as Minimum Message Length.

Ah, so I can pick and choose among definitions of simplicity?

No, actually the two formalisms in their most highly developed forms were proven equivalent.

And I suppose now you’re going to tell me that both formalisms come down on the side of “Occam means counting laws, not counting objects.”

More or less. In Minimum Message Length, so long as you can tell your friend an exact recipe they can mentally follow to get the rolling rock’s time series, we don’t care how much mental work it takes to follow the recipe. In Solomonoff induction, we count bits in the program code, not bits of RAM used by the program as it runs. “Entities” are lines of code, not simulated objects. And as said, these two formalisms are ultimately equivalent.
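The distinction between counting code and counting simulated objects can be illustrated with a rough Python sketch. Source-text length is used here only as a crude stand-in for program length in bits (true Kolmogorov complexity is uncomputable), and the names `SOURCE` and `positions` are invented for this illustration.

```python
# Toy illustration: a short program can generate a very large number of
# "objects". Source length is a crude stand-in for description length.

SOURCE = """
def positions(n_seconds):
    # Galileo-style law: distance grows as the square of elapsed time.
    return [t * t for t in range(1, n_seconds + 1)]
"""

namespace = {}
exec(SOURCE, namespace)            # define positions() from the source text
positions = namespace["positions"]

code_bits = 8 * len(SOURCE.encode("utf-8"))   # size of the "theory"
small = positions(3)                          # [1, 4, 9]
large = positions(100_000)                    # many more simulated values

print("code size in bits:", code_bits)
print("values generated:", len(small), "vs", len(large))
# The code size is identical in both runs; only the working memory differs.
```

Under this way of counting, asking the same law to describe a hundred thousand data points costs nothing extra in "entities"; adding a second law would.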

Now before I go into any further detail on formal simplicity, let me digress to consider the objection:

So what? Why can’t I just invent my own formalism that does things differently? Why should I pay any attention to the way you happened to decide to do things, over in your field? Got any experimental evidence that shows I should do things this way?

Yes, actually, believe it or not. But let me start at the beginning.

The conjunction rule of probability theory states:

$P(X, Y) \leq P(X)$

For any propositions X and Y, the probability that “X is true, and Y is true,” is less than or equal to the probability that “X is true (whether or not Y is true).” (If this statement sounds not terribly profound, then let me assure you that it is easy to find cases where human probability assessors violate this rule.)

You usually can’t apply the conjunction rule $P(X, Y) \leq P(X)$ directly to a conflict between mutually exclusive hypotheses. The conjunction rule only applies directly to cases where the left-hand-side strictly implies the right-hand-side. Furthermore, the conjunction rule is just an inequality; it doesn’t give us the kind of quantitative calculation we want.

But the conjunction rule does give us a rule of monotonic decrease in probability: as you tack more details onto a story, and each additional detail can potentially be true or false, the story’s probability goes down monotonically. Think of probability as a conserved quantity: there’s only so much to go around. As the number of details in a story goes up, the number of possible stories increases exponentially, but the sum over their probabilities can never be greater than 1. For every story “X and Y,” there is a story “X and ¬Y.” When you just tell the story “X,” you get to sum over the possibilities Y and ¬Y.

If you add ten details to X, each of which could potentially be true or false, then that story must compete with $2^{10} - 1$ other equally detailed stories for precious probability. If on the other hand it suffices to just say X, you can sum your probability over $2^{10}$ stories

((X and Y and Z and ...) or (X and ¬Y and Z and ...) or ...).
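The counting argument can be checked numerically. The following is a minimal sketch that assumes an arbitrary, randomly generated joint distribution over X and ten yes/no details; the distribution and all variable names are made up for illustration, and any other joint distribution would give the same inequality.

```python
import itertools
import random

random.seed(0)
N_DETAILS = 10

# Hypothetical joint distribution over X and ten yes/no details.
# Each "story" is a tuple (x, d1, ..., d10) of booleans.
stories = list(itertools.product([False, True], repeat=N_DETAILS + 1))
weights = [random.random() for _ in stories]
total = sum(weights)
prob = {s: w / total for s, w in zip(stories, weights)}

# P(X): sum over every assignment of the ten details, with X true.
x_stories = [s for s in stories if s[0]]
p_x = sum(prob[s] for s in x_stories)

# Any single fully detailed story with X true cannot exceed P(X).
most_probable_detailed = max(prob[s] for s in x_stories)

print(len(x_stories))                 # 1024 == 2**10 detailed stories
print(most_probable_detailed <= p_x)  # True
```

Each fully detailed story is one term in the sum that makes up P(X), which is why it can never come out ahead of the bare story "X".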

The “entities” counted by Occam’s Razor should be individually costly in probability; this is why we prefer theories with fewer of them.

Imagine a lottery which sells up to a million tickets, where each possible ticket is sold only once, and the lottery has sold every ticket at the time of the drawing. A friend of yours has bought one ticket for $1—which seems to you like a poor investment, because the payoff is only $500,000. Yet your friend says, “Ah, but consider the alternative hypotheses, ‘Tomorrow, someone will win the lottery’ and ‘Tomorrow, I will win the lottery.’ Clearly, the latter hypothesis is simpler by Occam’s Razor; it only makes mention of one person and one ticket, while the former hypothesis is more complicated: it mentions a million people and a million tickets!”

To say that Occam’s Razor only counts laws, and not objects, is not quite correct: what counts against a theory are the entities it must mention explicitly, because these are the entities that cannot be summed over. Suppose that you and a friend are puzzling over an amazing billiards shot, in which you are told the starting state of a billiards table, and which balls were sunk, but not how the shot was made. You propose a theory which involves ten specific collisions between ten specific balls; your friend counters with a theory that involves five specific collisions between five specific balls. What counts against your theories is not just the laws that you claim to govern billiard balls, but any specific billiard balls that had to be in some particular state for your model’s prediction to be successful.

If you measure the temperature of your living room as 22 degrees Celsius, it does not make sense to say: “Your thermometer is probably in error; the room is much more likely to be 20 °C. Because, when you consider all the particles in the room, there are exponentially vastly more states they can occupy if the temperature is really 22 °C—which makes any particular state all the more improbable.” But no matter which exact 22 °C state your room occupies, you can make the same prediction (for the supervast majority of these states) that your thermometer will end up showing 22 °C, and so you are not sensitive to the exact initial conditions. You do not need to specify an exact position of all the air molecules in the room, so that is not counted against the probability of your explanation.

On the other hand—returning to the case of the lottery—suppose your friend won ten lotteries in a row. At this point you should suspect the fix is in. The hypothesis “My friend wins the lottery every time” is more complicated than the hypothesis “Someone wins the lottery every time.” But the former hypothesis is predicting the data much more precisely.

In the Minimum Message Length formalism, saying “There is a single person who wins the lottery every time” at the beginning of your message compresses your description of who won the next ten lotteries; you can just say “And that person is Fred Smith” to finish your message. Compare to, “The first lottery was won by Fred Smith, the second lottery was won by Fred Smith, the third lottery was…”
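As a back-of-the-envelope sketch of that compression, the two messages can be compared in bits. The figure of 50 bits for stating "one person wins every time" is purely an assumption for illustration, as are the variable names; the million ticket holders and ten drawings come from the lottery example above.

```python
import math

N_PLAYERS = 1_000_000   # one ticket holder per ticket, as in the lottery above
N_DRAWS = 10
PREFIX_BITS = 50        # assumed cost of stating "one person wins every time"

bits_per_name = math.log2(N_PLAYERS)   # just under 20 bits to name one winner

# Message A: name the winner of each drawing separately.
plain_bits = N_DRAWS * bits_per_name

# Message B: pay for the prefix once, then name the single winner once.
prefixed_bits = PREFIX_BITS + bits_per_name

print(round(plain_bits, 1), round(prefixed_bits, 1))
```

The prefix earns its place only because it shortens the total message. For a single drawing, the same prefix would make the message longer, and it would be better left off.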

In the Solomonoff induction formalism, the prior probability of “My friend wins the lottery every time” is low, because the program that describes the lottery now needs explicit code that singles out your friend; but because that program can produce a tighter probability distribution over potential lottery winners than “Someone wins the lottery every time,” it can, by Bayes’s Rule, overcome its prior improbability and win out as a hypothesis.
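A small sketch of how the evidence overcomes the prior penalty, under assumed numbers: suppose singling out your friend costs 50 extra bits of program (a hypothetical figure), so the prior odds against the "friend always wins" program are $2^{50}$ to 1, and suppose a fair lottery gives any particular ticket holder a one-in-a-million chance per drawing.

```python
from fractions import Fraction

N_PLAYERS = 1_000_000
EXTRA_BITS = 50   # assumed extra program length needed to single out the friend

# Prior odds of "friend always wins" against "someone wins", from 2^-length.
prior_odds = Fraction(1, 2 ** EXTRA_BITS)

# Each drawing the friend wins has likelihood 1 under "friend always wins"
# and 1/N_PLAYERS under a fair lottery.
likelihood_ratio_per_win = Fraction(N_PLAYERS, 1)

def posterior_odds(n_wins):
    """Odds of the rigged hypothesis after n_wins consecutive wins."""
    return prior_odds * likelihood_ratio_per_win ** n_wins

for n in (1, 2, 3, 10):
    print(n, float(posterior_odds(n)))
```

With these assumed numbers the more complex hypothesis is still behind after two wins and ahead after three; by the tenth win the odds in its favor are overwhelming.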

Any formal theory of Occam’s Razor should quantitatively define, not only “entities” and “simplicity,” but also the “necessity” part.

Minimum Message Length defines necessity as “that which compresses the message.”

Solomonoff induction assigns a prior probability to each possible computer program, with the entire distribution, over every possible computer program, summing to no more than 1. This can be accomplished using a binary code where no valid computer program is a prefix of any other valid computer program (“prefix-free code”), e.g. because it contains a stop code. Then the prior probability of any program P is simply $2^{-L(P)}$, where $L(P)$ is the length of P in bits.
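To make the prefix-free requirement concrete, here is a minimal sketch using four made-up bit strings as stand-ins for programs. The codewords and function names are hypothetical; the point is only that prefix-freeness keeps the total of $2^{-L(P)}$ at or below 1 (the Kraft inequality).

```python
from fractions import Fraction

# A hypothetical prefix-free set of "programs", written as bit strings.
programs = ["0", "10", "110", "1110"]

def is_prefix_free(codes):
    """True if no codeword is a proper prefix of another."""
    return not any(
        a != b and b.startswith(a) for a in codes for b in codes
    )

def prior(program):
    """Prior probability 2^(-L(P)), with L(P) the length in bits."""
    return Fraction(1, 2 ** len(program))

total = sum(prior(p) for p in programs)
print(is_prefix_free(programs), total)   # True 15/16
```

Without the prefix-free condition the sum can exceed 1: the strings "0", "1", "00", and "01" would already total 3/2.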

The program P itself can be a program that takes in a (possibly zero-length) string of bits and outputs the conditional probability that the next bit will be 1; this makes P a probability distribution over all binary sequences. This version of Solomonoff induction, for any string, gives us a mixture of posterior probabilities dominated by the shortest programs that most precisely predict the string. Summing over this mixture gives us a prediction for the next bit.

The upshot is that it takes more Bayesian evidence—more successful predictions, or more precise predictions—to justify more complex hypotheses. But it can be done; the burden of prior improbability is not infinite. If you flip a coin four times, and it comes up heads every time, you don’t conclude right away that the coin produces only heads; but if the coin comes up heads twenty times in a row, you should be considering it very seriously. What about the hypothesis that a coin is fixed to produce HTTHTT… in a repeating cycle? That’s more bizarre—but after a hundred coinflips you’d be a fool to deny it.
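The coin examples can be run through a toy version of the mixture described above. This is a sketch, not Solomonoff induction proper: it uses only three hand-written predictors instead of all programs, and the description lengths of 5, 10, and 20 bits are assumptions chosen for illustration.

```python
from fractions import Fraction

# Each hypothesis: (assumed description length in bits, predictor).
# A predictor maps the bits seen so far to P(next bit is 1), with 1 = heads.
def fair(history):
    return Fraction(1, 2)

def always_heads(history):
    return Fraction(1)

def htt_cycle(history):
    # Deterministic repeating cycle H, T, T, H, T, T, ...
    return Fraction(1) if len(history) % 3 == 0 else Fraction(0)

HYPOTHESES = {
    "fair": (5, fair),
    "always_heads": (10, always_heads),
    "htt_cycle": (20, htt_cycle),
}

def posterior(data):
    """Posterior over hypotheses, starting from weights 2^-length."""
    weights = {}
    for name, (length, predict) in HYPOTHESES.items():
        w = Fraction(1, 2 ** length)
        for i, bit in enumerate(data):
            p_one = predict(data[:i])
            w *= p_one if bit == 1 else 1 - p_one
        weights[name] = w
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

def predict_next(data):
    """Mixture prediction for the next bit: posterior-weighted average."""
    post = posterior(data)
    return sum(post[n] * HYPOTHESES[n][1](data) for n in HYPOTHESES)

four = posterior([1] * 4)
twenty = posterior([1] * 20)
cycle = posterior([1, 0, 0] * 33 + [1])   # 100 flips of HTTHTT...

print(float(four["always_heads"]), float(twenty["always_heads"]))
print(float(cycle["htt_cycle"]))
```

With these assumed lengths, four heads leave the always-heads hypothesis at a posterior of one third, twenty heads push it above 0.999, and a hundred flips of the repeating cycle leave the longer cycle hypothesis as the only one with any weight.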

Standard chemistry says that in a gram of hydrogen gas there are six hundred billion trillion hydrogen atoms. This is a startling statement, but there was some amount of evidence that sufficed to convince physicists in general, and you particularly, that this statement was true.

Now ask yourself how much evidence it would take to convince you of a theory with six hundred billion trillion separately specified physical laws.

Why doesn’t the prior probability of a program, in the Solomonoff formalism, include a measure of how much RAM the program uses, or the total running time?

The simple answer is, “Because space and time resources used by a program aren’t mutually exclusive possibilities.” It’s not like the program specification, that can only have a 1 or a 0 in any particular place.

But the even simpler answer is, “Because, historically speaking, that heuristic doesn’t work.”

Occam’s Razor was raised as an objection to the suggestion that nebulae were actually distant galaxies—it seemed to vastly multiply the number of entities in the universe. All those stars!

Over and over, in human history, the universe has gotten bigger. A variant of Occam’s Razor which, on each such occasion, would label the vaster universe as more unlikely, would fare less well under humanity’s historical experience.

This is part of the “experimental evidence” I was alluding to earlier. While you can justify theories of simplicity on mathy sorts of grounds, it is also desirable that they actually work in practice. (The other part of the “experimental evidence” comes from statisticians / computer scientists / Artificial Intelligence researchers, testing which definitions of “simplicity” let them construct computer programs that do empirically well at predicting future data from past data. Probably the Minimum Message Length paradigm has proven most productive here, because it is a very adaptable way to think about real-world problems.)

Imagine a spaceship whose launch you witness with great fanfare; it accelerates away from you, and is soon traveling at $0.9c$. If the expansion of the universe continues, as current cosmology holds it should, there will come some future point where, according to your model of reality, you don’t expect to be able to interact with the spaceship even in principle; it has gone over the cosmological horizon relative to you, and photons leaving it will not be able to outrace the expansion of the universe.

Should you believe that the spaceship literally, physically disappears from the universe at the point where it goes over the cosmological horizon relative to you?

If you believe that Occam’s Razor counts the objects in a model, then yes, you should. Once the spaceship goes over your cosmological horizon, the model in which the spaceship instantly disappears, and the model in which the spaceship continues onward, give indistinguishable predictions; they have no Bayesian evidential advantage over one another. But one model contains many fewer “entities”; it need not speak of all the quarks and electrons and fields composing the spaceship. So it is simpler to suppose that the spaceship vanishes.

Alternatively, you could say: “Over numerous experiments, I have generalized certain laws that govern observed particles. The spaceship is made up of such particles. Applying these laws, I deduce that the spaceship should continue on after it crosses the cosmological horizon, with the same momentum and the same energy as before, on pain of violating the conservation laws that I have seen holding in every examinable instance. To suppose that the spaceship vanishes, I would have to add a new law, ‘Things vanish as soon as they cross my cosmological horizon.’ ”

The decoherence (a.k.a. many-worlds) version of quantum mechanics states that measurements obey the same quantum-mechanical rules as all other physical processes. Applying these rules to macroscopic objects in exactly the same way as microscopic ones, we end up with observers in states of superposition. Now there are many questions that can be asked here, such as

“But then why don’t all binary quantum measurements appear to have 50/50 probability, since different versions of us see both outcomes?”

However, the objection that decoherence violates Occam’s Razor on account of multiplying objects in the model is simply wrong.

Decoherence does not require the wavefunction to take on some complicated exact initial state. Many-worlds is not specifying all its worlds by hand, but generating them via the compact laws of quantum mechanics. A computer program that directly simulates quantum mechanics to make experimental predictions, would require a great deal of RAM to run—but simulating the wavefunction is exponentially expensive in any flavor of quantum mechanics! Decoherence is simply more so. Many physical discoveries in human history, from stars to galaxies, from atoms to quantum mechanics, have vastly increased the apparent CPU load of what we believe to be the universe.

Many-worlds is not a zillion worlds worth of complicated, any more than the atomic hypothesis is a zillion atoms worth of complicated. For anyone with a quantitative grasp of Occam’s Razor that is simply not what the term “complicated” means.

As with the historical case of galaxies, it may be that people have mistaken their shock at the notion of a universe that large, for a probability penalty, and invoked Occam’s Razor in justification. But if there are probability penalties for decoherence, the largeness of the implied universe, per se, is definitely not their source!

The notion that decoherent worlds are additional entities penalized by Occam’s Razor is just plain mistaken. It is not sort-of-right. It is not an argument that is weak but still valid. It is not a defensible position that could be shored up with further arguments. It is entirely defective as probability theory. It is not fixable. It is bad math. $2 + 2 = 3$.

Tags: Quantum Mechanics, Occam's Razor, Physics, Solomonoff induction, World Modeling


[-]Paul_Crowley18y10

Tomorrow I will address myself to accusations I have encountered that decoherence is "unfalsifiable" or "untestable", as the words "falsifiable" and "testable" have (even simpler) probability-theoretic meanings which would seem to be violated by this usage.

Doesn't this follow trivially from the above? No experiment can determine whether or not we have souls, but that counts against the idea of souls, not against the idea of their absence. If decoherence is the simpler theory, then lack of falsifiability counts against the other guys, not against it.

[-]steven18y10

If the expansion of the universe continues, as current cosmology holds it should, there will come some future point where - according to your model of reality - you don't expect to be able to interact with the spaceship even in principle; it has gone over the cosmological horizon relative to you, and photons leaving it will not be able to outrace the expansion of the universe.

IIRC for this to be true the universe's expansion has to accelerate, and the acceleration has to stay bounded above zero forever. (IIRC this is still considered the most probable case.)

[-]Nick_Tarleton18y90

No experiment can determine whether or not we have souls

Really? Not attempted uploading? Microphysical examination of a living brain? Tests for reliable memories of past lives, or reliable mediums?

[+]Caledonian218y-90

[-]Recovering_irrationalist18y80

If you're covering this later I'll wait, but I ask now in case my confusion means I'm misunderstanding something.

Why isn't nearly everything entangled with nearly everything else around it by now? Why is there still a significant amount of quantum independence around? Or does it just look that way because entangled subconfigurations tend to get split off by decoherence, so branches retain a reasonable amount of non-entangledness within their branch? Sorry if this is a daft or daftly phrased question.

[-]Luke_A_Somers14y120

It IS. That was something Eliezer said waay back when, and he was right. Entanglement is a very ordinary state of affairs.

Your educated guess is correct.

[-]Shmi14y70

Indeed, a thoroughly entangled world looks classical, however paradoxical it might sound. Regardless of the adopted interpretation.

[-]RobinHanson18y40

To be fair, one could shy away from saying all those branches are real due to the difficulty of squaring the Born rule with the equal probability calculations that seem to follow from that view. Without something like mangled worlds, one can be tempted by an objective collapse view, as that at least gives a coherent account of the Born rule.

[-]Peter_Turney18y50

(The other part of the "experimental evidence" comes from statisticians / computer scientists / Artificial Intelligence researchers, testing which definitions of "simplicity" let them construct computer programs that do empirically well at predicting future data from past data. Probably the Minimum Message Length paradigm has proven most productive here, because it is a very adaptable way to think about real-world problems.)

I once believed that simplicity is the key to induction (it was the topic of my PhD thesis), but I no longer believe this. I think most researchers in machine learning have come to the same conclusion. Here are some problems with the idea that simplicity is a guide to truth:

(1) Solomonoff/Gold/Chaitin complexity is not computable in any reasonable amount of time.

(2) The Minimum Message Length depends entirely on how a situation is represented. Different representations lead to radically different MML complexity measures. This is a general problem with any attempt to measure simplicity. How do you justify your choice of representation? For any two hypotheses, A and B, it is possible to find a representation X such that complexity(A) < complexity(B) and another representation Y such that complexity(A) > complexity(B).

(3) Simplicity is merely one type of bias. The No Free Lunch theorems show that there is no a priori reason to prefer one type of bias over another. Therefore there is nothing special about a bias towards simplicity. A bias towards complexity is equally valid a priori.

http://www.jair.org/papers/paper228.html http://en.wikipedia.org/wiki/No_free_lunch_in_search_and_optimization http://en.wikipedia.org/wiki/Inductive_bias

[-]Constant218y10

Without something like mangled worlds, one can be tempted by an objective collapse view, as that at least gives a coherent account of the Born rule.

Does it really account for it in the sense of explain it? I don't think so. I think it merely says that the collapsing occurs in accordance with the Born rule. But we can also simply say that many-worlds is true and the history of our fragment of the multiverse is consistent with the Born rule. Admittedly, this doesn't explain why we happen to live in such a fragment but merely asserts that we do, but similarly, the collapse view does not (as far as I know) explain why the collapse occurs in the frequencies it does but merely asserts that it does.

[-]Enginerd18y00

"No math = no physics"

I would say that as a practical matter, this is true, because often, many theories make the same qualitative prediction, but different quantitative ones. Take the effect of gravity on light, for instance: in Newtonian gravity, light is affected the same as matter, but in General Relativity, the effect is larger. Another example would be flat-Earth theory gravity versus Newtonian. Flat-Earthers would say that the Earth is constantly accelerating upwards at 9.8 m/s^2. To a high level of precision, this matches the idea that objects are attracted by G M / R^2. The difference becomes large at high altitudes (large R), where it is quantitatively different, but qualitatively the same.

One could probably get by setting up experiments where the only possible results are (same, different), but that's really the same as defining numbers in terms of what they lie between; i.e., calculating sqrt(2) by calculating the largest number < sqrt(2) and the smallest number > sqrt(2).

The rate of scientific progress jumped enormously after Newton, as people began thinking more and more quantitatively, and developed tools accordingly. This is not an accident.

[-]Kevin_Dick18y10

I just had a thought, probably not a good one, about Many Worlds. It seems like there's a parallel here to the discovery of Natural Selection and understanding of Evolution.

Darwin had the key insight about how selection pressure could lead to changes in organisms over time. But it's taken us over 100 years to get a good handle on speciation and figure out the detailed mechanisms of selecting for genetic fitness. One could argue that we still have a long way to go.

Similarly, it seems like we've had this insight that QM leads to Many Worlds due to decoherence. But it could take quite a while for us to get a good handle on what happens to worlds and figure out the detailed mechanisms of how they progress.

But it was pretty clear that Darwin was right long before we had worked out the details. So I guess it doesn't bother me that we haven't worked out the details of what happens to the Many Worlds.

[-]Psy-Kosh18y40

Eliezer: Could you maybe clarify a bit the difference between Kolmogorov Complexity and MML? The first is "shortest computer program that outputs the result" and the second one is... "shortest info that can be used to figure out the result"? I.e., I'm not quite sure I understand the difference here.

However, as to the two being equivalent, I thought I'd seen something about the second one being used because the first was sometimes uncomputable, in the "to solve it in the general case, you'd have to have a halting oracle" sense.

Caledonian: Math isn't so much a language as various notations for math are a language. Basically, if you use "non math" to express exactly the same thing as math, you have to turn up the precision and "legalese" to the point that you practically are describing math, just using a language not meant for it, right?

[-]Caledonian218y10

I think you have it backwards. When we use language in a very precise and specific way, stripping out ambiguity and the potential for alternate meanings, we call the result 'mathematics'.

[-]Psy-Kosh18y10

Caledonian: Er... isn't that what I was saying? (That is, that's basically what I meant. What did you think I meant?)

[-][anonymous]18y00

If you head down this reductionism, Occam's razor route, doesn't the concept of a human become explanatorily redundant? It will be simpler to precisely predict what a human will do without invoking the intentional stance and just modelling the underlying physics.

[-]Paul_Crowley18y00

Nick Tarleton: sadly, it's my experience that it's futile to try and throw flour over the dragon.

[-]Lake18y10

Hang on. @ Caledonian and Psy-Kosh: Surely mathematical language is just language that refers to mathematical objects - numbers and suchlike. Precise, unambiguous language doesn't count as mathematics unless it meets this condition.

[-]Caledonian218y-10

Surely mathematical language is just language that refers to mathematical objects - numbers and suchlike. Precise, unambiguous language doesn't count as mathematics unless it meets this condition.

Is logic mathematics? I assert that precise, unambiguous language necessarily refers to mathematical objects, because 'mathematics' is precise, unambiguous language. All math is language. Not all language is math.

Of course, I also assert that mathematics is a subset of science, so consider that our basic worldviews might be very different.

[-]HL18y10

"No math = no physics"

Tell that to Copernicus, Gilbert, Galvani, Volta, Oersted and Faraday, to mention a few. And it's not like even Galileo used much more than the high-school arithmetic of his day for his elucidation of acceleration.

[-]Rolf_Nelson218y10

Some physicists speak of "elegance" rather than "simplicity". This seems to me a bad idea; your judgments of elegance are going to be marred by evolved aesthetic criteria that exist only in your head, rather than in the exterior world, and should only be trusted inasmuch as they point towards smaller, rather than larger, Kolmogorov complexity.

Example:

In theory A, the ratio of tiny dimension #1 to tiny dimension #2 is finely-tuned to support life.

In theory B, the ratio of the mass of the electron to the mass of the neutrino is finely-tuned to support life.

An "elegance" advocate might favor A over B, whereas a "simplicity" advocate might be neutral between them.

[-]happyseaurchin16y-40

i came to this dormant thread from the future: http://lesswrong.com/lw/1k4/the_contrarian_status_catch22/1ckj.

Seems to me there is a mismapping of multiple worlds wrt quantum physics and the multiple worlds we create subjectively. I personally steer clear of physics and concern myself more with the subjective realities we create. This seems to me to be more congruent with the material that eliezer presents here, ie wrt logic and occam's razor, and what he presents in the article linked above, ie wrt contrarian dynamics and feelings of satisfaction et al.

[-]XiXiDu15y20

I'm stuck, what does ~ denote?

For every story "X∧Y", there is a story "X∧~Y". When you just tell the story "X", you get to sum over the possibilities Y and ~Y.

Y and ~Y, where ~Y does read as ... ?

[-]CarlShulman15y60

Not-Y, i.e. Y is false.

[-]XiXiDu15y00

I see, thank you. Does it add too much noise if I ask such questions, should I rather not yet read the sequences if I sometimes have to inquire about such matters? Or should I ask somewhere else?

I was looking up this table of mathematical symbols that stated that ~ does read as 'has distribution' and stopped looking any further since I'm dealing with probabilities here. I guess it should have been obvious to me to expect a logical operator. I was only used to the notation not and ¬ as the negation of a proposition. I'll have to adjust my perceived intelligence downwards.

[-]Vaniver15y10

Others that get used a lot in various contexts are !Y and Y^c (for Y complement). There are probably more, but I can't think of them at the moment.

[-]Sniffnoy15y10

Y with a bar over it also gets used (though be careful as this more commonly means closure of Y, or, well, quite a few other things...)

[-]orthonormal15y30

No, these are reasonable questions to ask.

[-]CarlShulman15y100

There's no problem with asking a clarifying question like that, which might help other lurkers and can be answered quickly without huge amounts of work.

By the way, there's no need for such self-deprecating comments about your education or intelligence. It's socially a bit off-putting to talk about the topic, and it risks coming across as disingenuous. Just ask your questions without such supplication.

[-]wnoise15y30

"Not Y".

The prefix tilde, "~", is commonly used as an ASCII approximation for logical negation, in place of the more mathematically formal notation of "¬" (U+00AC, HTML entity "&not;", and LaTeX "\lnot"). On that page are a few other common notations.

[-][anonymous]15y00

When you just tell the story "X", you get to sum over the possibilities Y and ~Y.

Maybe this is a stupid question, but shouldn't it mean "When you just tell the story "Y", you get to sum over the possibilities Y and ~Y." ?
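The quoted sentence is the marginalization rule: P(X) = P(X∧Y) + P(X∧~Y), so telling only the story "X" means summing over both possibilities for the extra detail Y. A minimal Python sketch follows; the joint probabilities are invented for illustration and do not come from the post.

```python
# Hypothetical joint probabilities over two binary claims X and Y.
# The numbers are made up for illustration; they only need to sum to 1.
joint = {
    (True, True): 0.10,    # X and Y
    (True, False): 0.30,   # X and not-Y
    (False, True): 0.20,   # not-X and Y
    (False, False): 0.40,  # not-X and not-Y
}


def prob_x(joint_table):
    """P(X), obtained by summing over Y and ~Y: P(X and Y) + P(X and not-Y)."""
    return joint_table[(True, True)] + joint_table[(True, False)]


p_x = prob_x(joint)
p_x_and_y = joint[(True, True)]

# The bare story "X" is never less probable than the more detailed story "X and Y".
print(f"P(X) = {p_x:.2f}, P(X and Y) = {p_x_and_y:.2f}")
```

With any valid table, P(X) comes out at least as large as P(X∧Y), which is the point of the quoted passage.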

[-]helm15y40

I have been working with decoherence in experimental physics. It confuses me that you want to use it as a synonym for the Many-Worlds theory.

[-]Luke_A_Somers14y20

MWI is the supposition that there is nothing else to the fundamental laws of nature except QM. Decoherence is the main tricky point in the bridge between QM and our subjective experiences.

With decoherence, a collapse postulate is superfluous. With decoherence, you don't need a Bohmian 'real thing' or whatever he calls it. QM is simply the way things are. You can stick with it, and MWI follows directly.

[-]Shmi14y00

The collapse postulate is just a visualization, just like the MWI is. The Born projection rule is the only "real" thing, and it persists through MWI or any other "I". So no, the MWI does not follow directly, unless you strip it of all ontological meaning.

[-]TAG6mo20

No, because decoherence isn't necessarily multi-branch decoherence. Single-way decoherence works like collapse, except that it's not fundamental or instantaneous.

[-]MrCheeze15y-20

This isn't quite what your post was about, but one thing I've never understood is how anyone could possibly find "the universe is totally random" to be a MORE simple explanation.

[This comment is no longer endorsed by its author]

[-]EricHerboso14y130

I did not get a chance to read this entry until four years after it was published, but it nonetheless ended up correcting a long-held flawed view I had on the Many Worlds Interpretation. Thank you for opening up my eyes to the idea that Occam's razor applies to rules, and not entities in a system. You have no idea as to how embarrassed I feel for having so drastically misunderstood the concept before now.

Incidentally, I wrote a blog entry on how this article changed my mind which seems to have generated additional discussion on this issue.

[-]RussellThor13y00

How do you explain this with many worlds, while avoiding non-locality? http://arxiv.org/pdf/1209.4191v1.pdf If results such as these are easy to explain/predict, can the many worlds theory gain credibility by predicting such things?

[-]Luke_A_Somers11y30

Glib: start with the initial states. Propagate time as specified. Observe the result come out. That's how. MWI is quantum mechanics with no modifications, so its predictions match what quantum mechanics predicts, and quantum mechanics happens to be local.

Fuller: The first moment of decoherence is when photon 1 is measured. At this point, we've split the world according to its polarization.

Then we have photons 2 and 3 interfere. They are then shortly measured, so we've split the world again in several parts. This allows post-selection for when the two photons coming out go to different detectors. When that criterion is met, then we're in the subspace with photon 2 being a match for photon 3, which is the same subspace as 4 being a match for photon 1.

When they measure photon 4, it proceeds to match photon 1.

Under MWI, this is so unsurprising that I'm having a hard time justifying performing the experiment except as a way of showing off cool things we can do with coherence.

Now, as for whether this was local, note that the procedure involved ignoring the events we don't want. That sort of thing is allowed to be non-local since it's an artifact of data analysis. It's not like photon 1 made photon 4 do anything. THAT would be non-local. But the force was applied by post-selection.

Of COURSE... you don't NEED to look at it through the lens of MWI to get that. Even Copenhagen would come up with the right answer to that, I think. Actually, I'm not sure why it would be a surprising result even under Copenhagen. Post-selection creates correlations! News at 11.

[-]CCC11y20

Under MWI, this is so unsurprising that I'm having a hard time justifying performing the experiment except as a way of showing off cool things we can do with coherence.

Then the justification is simple; it either provides evidence in favour of MWI (and Copenhagen and any other theory that predicts the expected result) or it shatters all of them.

Scientists have to do experiments to which the answer is obvious - failing to do so leads to the situation where everybody knows that heavier objects fall faster than lighter objects because nobody actually checked that.

[-]Lumifer11y10

everybody knows that heavier objects fall faster than lighter objects

Because this happens to be mostly true. Air resistance is a thing.

actually checked that

Actually, if I remember the high school physics anecdote correctly, the trouble for the idea that heavy objects fall faster than light ones began when a certain scientist asked a hypothetical question: what would happen if you drop a light and a heavy object at the same time, but connect them with a string?

[-]Vaniver11y00

Wikipedia has a longer version of the thought experiment.

[-]CCC11y30

Because this happens to be mostly true. Air resistance is a thing.

Got nothing to do with weight, though. An acre of tissue paper, spread out flat, will still fall more slowly than a ten cent coin dropped edge-first.

Actually, if I remember the high school physics anecdote correctly, the trouble for the idea that heavy objects fall faster than light ones began when a certain scientist asked a hypothetical question: what would happen if you drop a light and a heavy object at the same time, but connect them with a string?

Well, yes. (Galileo, wasn't it?) Doesn't affect my point, though - the basics do need to be checked occasionally.

[-]Lumifer11y-20

Got nothing to do with weight, though.

Nothing? Are you quite sure about that? :-)

[-]CCC11y30

Hmmm... let me consider it.

In an airless void, the answer is no - the mass terms of the force-due-to-gravity and the acceleration-due-to-force equations cancel out, and weight has nothing to do with the speed of the falling object.

In the presence of air resistance, however... the force from air resistance depends on how much air the object hits (which in turn depends on the shape of the object), and how fast relative to the object the air is moving. The force applied by air resistance is independent of the mass (but dependent on the shape and speed of the object) - but the acceleration caused by that force is dependent on the mass (f=ma). Therefore, the acceleration due to air resistance does depend partially on the mass of the object.

Okay, so not quite "nothing", but mass is not the most important factor to consider in these equations...

[-]Lumifer11y00

I don't know how you decide what's more and what's less important in physics equations :-/

If I tell you I dropped a sphere two inches in diameter from 200 feet up, can you calculate its speed at the moment it hits the ground? Without knowing its weight, I don't think you can.

[-]CCC11y00

I don't know how you decide what's more and what's less important in physics equations :-/

Predictive power. The more accurate a prediction I can make without knowing the value of a given variable, the less important that variable is.

If I tell you I dropped a sphere two inches in diameter from 200 feet up, can you calculate its speed at the moment it hits the ground? Without knowing its weight, I don't think you can.

Ugh, imperial measures. Do you mind if I work with a five-centimetre sphere dropped from 60 metres?

A sphere is quite an aerodynamic shape; so I expect, for most masses, that air friction will have a small to negligible impact on the sphere's final velocity. I know that the acceleration due to gravity is 9.8m/s^2, and so I turn to the equations of motion; v^2 = v_0^2+2*a*s (where v_0 is the starting velocity). Starting velocity v_0 is 0, a is 9.8, s is 60m; thus v^2 = (0*0)+(2*9.8*60) = 1176, therefore v = about 34.3m/s. Little slower than that because of air resistance, but probably not too much slower. (You'll also notice that I'm not using the radius of the sphere anywhere in this calculation). It's an approximation, yes, but it's probably fairly accurate... good enough for many, though not all purposes.

Now, if I know the mass but not the shape, it's a lot harder to justify the "ignore air resistance" step...
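The drag-free estimate above can be reproduced in a few lines. This is a minimal sketch of the same equation of motion, v^2 = v_0^2 + 2*a*s; the function name is a hypothetical helper, and the calculation deliberately ignores air resistance, as the comment does.

```python
import math

G = 9.8  # m/s^2, the value used in the comment above


def vacuum_impact_speed(height_m, v0=0.0, g=G):
    """Speed after falling height_m with no drag: v^2 = v0^2 + 2*g*s."""
    return math.sqrt(v0 ** 2 + 2.0 * g * height_m)


# 60 m drop from rest, as in the comment; neither mass nor radius appears anywhere.
print(f"{vacuum_impact_speed(60.0):.1f} m/s")
```

Because neither mass nor shape enters the formula, this number is only an upper bound on the real impact speed in air.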

[-]Lumifer11y00

You're doing the middle-school physics "an object dropped in vacuum" calculation :-) If you want to get a number that takes air resistance into account you need college-level physics.

So, since you've mentioned accuracy, how accurate is your 34.3 m/s value? Can you give me some kind of confidence intervals?

[-]CCC11y00

You're doing the middle-school physics "an object dropped in vacuum" calculation :-)

Yes, exactly. Because for many everyday situations, it's close enough.

So, since you've mentioned accuracy, how accurate is your 34.3 m/s value? Can you give me some kind of confidence intervals?

No, I can't. In order to do that, I would need, first of all, to know how to do the air resistance calculation - I can probably look that up, but it's going to be complicated - and, importantly, some sort of probability distribution for the possible masses of the ball (knowing the radius might help in estimating this).

Of course, the greater the mass of the ball, the more accurate my value is, because the air resistance will have less effect; in the limit, if the ball is a hydrogen balloon, I expect it to float away and never actually hit the ground at all, while in the other limit, if the ball is a tiny black hole, I expect it to smash into the ground at exactly the calculated value (and then keep going).

[-]Lumifer11y30

No, I can't.

And thus we get back to the question of what's important in physics equations.

But let's do a numerical example for fun.

Our ball is 5 cm in diameter, so its volume is about 65.5 cm³. Let's make it out of wood, say, bamboo. Its density is about 0.35 g/cm³ so the ball will weigh about 23 g.

Let's calculate its terminal velocity, that is, the speed at which drag exactly balances gravity. The formula is v = sqrt(2mg/(pAC)) where m is mass (0.023 kg), g is the same old 9.8, p is air density which is about 1.2 kg/m³, A is projected area and since we have a sphere it's 19.6 cm² or 0.00196 m², and C is the drag coefficient which for a sphere is 0.47.

v = sqrt(2 * 0.023 * 9.8 / (1.2 * 0.00196 * 0.47)) = 20.2 m/s

So the terminal velocity of a 5 cm diameter bamboo ball is about 20 m/s. That is quite a way off your estimate of 34.3 and we got there without using things like hollow balls or aerogel :-)
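The terminal-velocity arithmetic above can be checked with a short script. This is a sketch using the same formula and the same inputs as the comment (C = 0.47, air density 1.2 kg/m³, g = 9.8); the function names are hypothetical helpers, and the mass is derived from the stated diameter and density rather than rounded to 23 g, so the result differs slightly in the last digit.

```python
import math


def sphere_mass(diameter_m, density_kg_m3):
    """Mass of a solid sphere from its diameter and material density."""
    radius = diameter_m / 2.0
    return density_kg_m3 * (4.0 / 3.0) * math.pi * radius ** 3


def terminal_velocity(mass_kg, diameter_m, drag_coeff=0.47, air_density=1.2, g=9.8):
    """v = sqrt(2*m*g / (rho*A*C)), the speed at which drag balances gravity."""
    area = math.pi * (diameter_m / 2.0) ** 2  # projected area of a sphere
    return math.sqrt(2.0 * mass_kg * g / (air_density * area * drag_coeff))


# Bamboo ball from the comment: 5 cm diameter, about 0.35 g/cm^3 = 350 kg/m^3.
m_bamboo = sphere_mass(0.05, 350.0)
print(f"mass = {m_bamboo * 1000:.1f} g, "
      f"terminal velocity = {terminal_velocity(m_bamboo, 0.05):.1f} m/s")
```

The script gives a mass of about 23 g and a terminal velocity of about 20 m/s, in line with the hand calculation.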

[-]CCC11y10

To be fair, a light ball is exactly where my estimate is known to be least accurate. Let's consider, rather, a ball with a density of 1 - one that neither floats nor sinks in water. (Since, in my experience, many things sink in water and many, but not quite as many, things float in it, I think it makes a reasonable guess for the average density of all possible balls). Then you have m=0.0655kg, and thus:

v = sqrt(2 * 0.0655 * 9.8 / (1.2 * 0.00196 * 0.47)) = 34.0785 m/s

...okay, if it was falling in a vacuum it would have reached that speed, but it's had air resistance all the way down, so it's probably not even close to that. (And if it had been dropped from, say, 240m, then I would have calculated a value of close on 70 m/s, which would have been even more wildly out).

So, I will admit, it turns out that mass is a good deal more important than I had expected - also, air resistance has a larger effect than I had anticipated.
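To put a number on "probably not even close", one can use the standard closed-form result for a drop from rest with quadratic drag: v(h) = v_t * sqrt(1 - exp(-2*g*h / v_t^2)), where v_t is the terminal velocity. The sketch below assumes a constant drag coefficient and constant air density throughout the fall; the function name is a hypothetical helper, and the two terminal velocities are the values worked out in the comments above.

```python
import math


def impact_speed_with_drag(height_m, v_terminal, g=9.8):
    """Speed after falling height_m from rest with quadratic (v^2) drag.

    Uses v(h) = v_t * sqrt(1 - exp(-2*g*h / v_t**2)).
    Assumes a constant drag coefficient and constant air density.
    """
    x = 2.0 * g * height_m / v_terminal ** 2
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return v_terminal * math.sqrt(-math.expm1(-x))


# Terminal velocities taken from the two worked examples above.
for label, v_t in [("bamboo ball", 20.2), ("density-1 ball", 34.08)]:
    v = impact_speed_with_drag(60.0, v_t)
    print(f"{label}: v_t = {v_t} m/s, impact speed from 60 m = {v:.1f} m/s")
```

Under these assumptions the density-1 ball lands at roughly 27 m/s rather than 34 m/s, and the bamboo ball is already close to its terminal velocity.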

[-][anonymous]11y00

while in the other limit, if the ball is a tiny black hole, I expect it to smash into the ground at exactly the calculated value

Nope, because in that case, your value of g would be significantly higher than 9.8 m/s^2.

[This comment is no longer endorsed by its author]

[-]btrettel11y50

A sphere is quite an aerodynamic shape

(Engineer with a background in fluid dynamics here.)

A sphere is quite unaerodynamic. Its drag coefficient is about 10 times higher than that of a streamlined body (at a relevant Reynolds number). You have boundary layer separation off the back of the sphere, which results in a large wake and consequently high drag.

The speed as a function of time for an object with a constant drag coefficient dropping vertically is known and it is a direct function of mass. If I learned anything from making potato guns, it's that in general, dragless calculations are pretty inaccurate. You'll get the trend right in many cases with a dragless calculation, but in general it's best to not assume drag is negligible unless you've done the math or experiment to show that it is in a particular case.
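For the constant-drag-coefficient case mentioned above, the textbook result for a drop from rest is v(t) = v_t * tanh(g*t / v_t), with v_t = sqrt(2mg/(ρAC)). The sketch below is illustrative only: the function names are hypothetical, and the sphere size, drag coefficient (0.47) and air density (1.2 kg/m³) are borrowed from the earlier comments in this thread.

```python
import math


def terminal_velocity(mass_kg, diameter_m, drag_coeff=0.47, air_density=1.2, g=9.8):
    """v_t = sqrt(2*m*g / (rho*A*C)) for a sphere of the given diameter."""
    area = math.pi * (diameter_m / 2.0) ** 2
    return math.sqrt(2.0 * mass_kg * g / (air_density * area * drag_coeff))


def speed_at_time(t_s, v_terminal, g=9.8):
    """Speed t_s seconds after release from rest: v(t) = v_t * tanh(g*t / v_t)."""
    return v_terminal * math.tanh(g * t_s / v_terminal)


# Two 5 cm spheres that differ only in mass (values from the thread's examples).
for mass_kg in (0.023, 0.0655):
    v_t = terminal_velocity(mass_kg, 0.05)
    speeds = ", ".join(f"{speed_at_time(t, v_t):.1f}" for t in (1.0, 2.0, 3.0))
    print(f"m = {mass_kg * 1000:.1f} g: v_t = {v_t:.1f} m/s, v(1s, 2s, 3s) = {speeds} m/s")
```

Mass enters only through v_t, which is why the heavier sphere keeps accelerating after the lighter one has levelled off.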

[-]CCC11y00

A sphere is quite unaerodynamic.

Huh. I thought the fact that it got continually and monotonically bigger until a given point and then monotonically smaller meant at least some aerodynamics in the shape. I did not even consider the wake...

The speed as a function of time for an object with a constant drag coefficient dropping vertically is known and it is a direct function of mass.

Well. I stand corrected, then. Evidently drag has a far bigger effect than I gave it credit for.

...proportional to the square root of the mass, given all other factors are unchanged, I see.

[-]btrettel11y30

It's better than a flat plate perpendicular to the flow. Most people seem to not expect that the back of the object affects the drag, but there's a large low pressure zone due to the wake. With high pressure in the front and low pressure in the back (along with a somewhat negligible skin friction contribution), the drag is considerable. So you need to target both the front and back to have a low drag shape. Most aerodynamic shapes trade pressure drag for skin friction drag, as the latter is small (if the Reynolds number is high).

[-]Lumifer11y00

For "an aerodynamic shape" my intuition first gives me a stylized drop: hemispheric in the front and a long tail thinning to a point in the back. But after a couple of seconds it decides that a spindle shape would probably be better :-)

[-]btrettel11y10

The "teardrop" shape is pretty good, though the name is a fair bit misleading as droplets almost never look like that. Their shape varies in time depending on the flow conditions.

Not quite sure what you mean by spindle shape, but I'm sure a variety of shapes like that could be pretty good. For the front, it's important to not have a flat tip. For the back, you'd want a gradual decay of the radius to prevent the fluid from separating off the back, creating a large wake. These are the heuristics.

Which shape objects have minimum drag is a fairly interesting subject. The shape with minimum wave drag (i.e., supersonic flow) is known, but I'm not sure there are any general proofs for other flow regimes. Perhaps it doesn't matter much, as we already know a bunch of shapes with low drag. The real problem seems to be getting these shapes adopted, as (for example) cars don't seem to be bought on rational bases like engineering. This should not be surprising.

[-]Lumifer11y00

cars don't seem to be bought on rational bases like engineering.

Of course, but I don't see it as a bad thing. Typically when people buy cars they have a collection of must-haves and then from the short list of cars matching the must-haves, they pick what they like. I think it's a perfectly fine method of picking cars. Compare to picking clothes, for example...

[-]Luke_A_Somers11y00

The problem is, we've done much, MUCH more stringent tests than this. It's like, after checking the behavior of pendulums and falling objects of varying weights and lengths and areas, over vast spans of time and all regions of the globe, and in centrifuges, and on pulleys... we went on to then check if two identical objects would fall at the same speed if we dropped one when the other landed.

Anyway, I didn't say it shouldn't be done. I support basic experiments on QM, but I'd like them to push the envelope in interesting ways rather than, well, this.

[-]dcleve8y-20

Karl Popper already dealt with the problem of Occam's Razor not being usable based on complexity. He recast it as predictive utility. When one does that, the prediction of Many Worlds is untestable in principle -- i.e. not ever wrong.

[-]TAG3y*20

The Deutsch-Yudkowsky argument for the Many Worlds Interpretation states that you can take the core of Quantum Mechanics -- the Schrödinger wave equation, and the projection postulate -- remove the projection postulate (also known as collapse and reduction), and end with a simpler theory that is still adequate to explain observation. The idea is that entanglement can replace collapse: a scientist observing a superposed state becomes entangled with it, and effectively splits into two, each having made a definite observation.

Moreover Yudkowsky, following David Deutsch, holds the many worlds interpretation to be obviously correct, in contrast to the majority of philosophers and physicists, who regard the problem of interpreting QM as difficult and unsolved.

This has some problems.

(Which are to do with the specific argument, and the level of certainty ascribed to it. To say that you cannot be certain about a claim is not to say it is false. To point out that one argument for a claim does not work is likewise not to say that the claim itself is false. There could be better arguments for these versions of many worlds, or better many worlds theories, for that matter).

The Problems.

The first thing to note is that there is more than one quantum mechanical many worlds theory. What splitting is... how complete and irrevocable it is... varies between particular theories. So does the rate of splitting, so does the mechanism of splitting.

The second thing to note is that many worlders are pointing at something implied by the physical formalism and saying "that's a world"... but whether it qualifies as a world is a separate question from whether it's in the formalism, and a separate kind of question from whether it is really there in the formalism. One would expect a world, or universe, to be large, stable, non-interacting, and so on. It's possible to have an interpretation without collapse or worlds. A successful MWI needs to jump three hurdles: empirical correctness, mathematical correctness and conceptual correctness -- actually having worlds.

The third problem to note is that all outstanding issues with MWI are connected in some way with the quantum mechanical basis... a subject about which Deutsch and Yudkowsky have little to say.

Coherence versus Decoherence

There is an approach to MWI based on coherent superpositions, and a version based on decoherence. These are (for all practical purposes) incompatible opposites, but are treated as interchangeable in Yudkowsky's writings.

Quantum superposition is a fundamental principle of quantum mechanics that states that linear combinations of solutions to the Schrödinger equation are also solutions of the Schrödinger equation. This follows from the fact that the Schrödinger equation is a linear differential equation in time and position. (WP)

Coherent superpositions are straightforwardly implied by the core mathematics of Quantum mechanics. They are small scale in two senses: they can go down to the single particle level, and it is difficult to maintain large coherent superpositions even if you want to. They are also possibly observer dependent, reversible, and continue to interact (strictly speaking, interfere) after "splitting". The last point is particularly problematical, because if large scale coherent superpositions exist, that would create naked eye, macroscopic evidence, e.g. ghostly traces of a world where the Nazis won. All in all, a coherent superposition isn't a world you could live in.

I said complex coherent superpositions are difficult to maintain. What destroys them? Environmentally induced decoherence!

Interference phenomena are a well-known and crucial aspect of quantum mechanics, famously exemplified by the two-slit experiment. There are many situations, however, in which interference effects are artificially or spontaneously suppressed. The theory of decoherence is precisely the study of such situations. (SEP)

Decoherence tries to explain why we don't notice "quantum weirdness" in everyday life -- why the world of our experience is a more-or-less classical world. From the standpoint of decoherence, sure there might not be any objective fact about which slit an electron went through, but there is an objective fact about what you ate for breakfast this morning: the two situations are not the same!

The basic idea is that, as soon as the information encoded in a quantum state "leaks out" into the external world, that state will look locally like a classical state. In other words, as far as a local observer is concerned, there's no difference between a classical bit and a qubit that's become hopelessly entangled with the rest of the universe.

(http://scottaaronson.com/democritus)

Decoherence is the study of interactions between a quantum system (generally a very small number of microscopic particles like electrons, photons, atoms, molecules, etc. - often just a single particle) and the larger macroscopic environment, which is normally treated "classically," that is, by ignoring quantum effects, but which decoherence theorists study quantum mechanically. Decoherence theorists attribute the absence of macroscopic quantum effects like interference (which is a coherent process) to interactions between a quantum system and the larger macroscopic environment.(www.informationphilosopher.com)

Decoherent branches are necessarily large, since decoherence is a high level phenomenon. They are also stable, non-interacting and irreversible... everything that would be intuitively expected of a "world". But there is no empirical evidence for them (in the plural), nor are they obviously supported by the core mathematics of quantum mechanics, the Schrödinger equation.

We have evidence of small scale coherent superposition, since a number of observed quantum effects depend on it, and we have evidence of decoherence, since complex superpositions are difficult to maintain. What we don't have evidence of is decoherence into multiple branches. From the theoretical perspective, decoherence is a complex, entropy-like process which occurs when a complex system interacts with its environment. But without decoherence, MW doesn't match observation. So there is no theory of MW that is both simple and empirically adequate, contra Yudkowsky and Deutsch.

The original, Everettian, approach is based on coherence. (Yudkowsky says "Macroscopic decoherence, a.k.a. many-worlds, was first proposed in a 1957 paper by Hugh Everett III"... but the paper doesn't mention decoherence [1].) As such, it fails to predict classical observations -- at all -- it fails to predict the appearance of a broadly classical universe. If everything is coherently superposed, so are observers... but the naturally expected experience of an observer in coherent superposition with themselves is that they function as a single observer making ambiguous, superposed observations... not two observers each making an unambiguous, classical observation, and each unaware of the other. Such observers would only ever see superpositions of dead and living cats, etc.

(A popular but mistaken idea is that full splitting happens microscopically, at every elementary interaction. But that would make complex superpositions non-existent, whereas a number of instruments and technologies depend on them -- so it's empirically false).

Later, post 1970s, many world theorists started to include decoherence to make the theory more empirically adequate, but inasmuch as it is additional structure, it places the simplicity of MWI in doubt. In the worst case, the complexity is SWE+decoherence+preferred basis, whereas in the best case, it's SWE alone, because decoherence is implicit in SWE, and preferred basis is implicit in decoherence. Decoherentists hope to show that the theory can be reduced to core QM, such as the Schrödinger equation, but it currently uses more complex math, the "reduced density matrix". The fact that this research is ongoing is strong evidence that the whole problem was not resolved by Everett's 1957 paper. In any case, without a single definitive mechanism of decoherence, there is no definitive answer to "how complex is MWI".

And single-universe decoherence is quite feasible. Decoherence adds something to many worlds, but many worlds doesn't add anything to decoherence.

So, coherent superpositions exist, but their components aren't worlds in any intuitive sense; and decoherent branches would be worlds in the intuitive sense, but decoherence isn't simple. Also, theoretically and observationally, decoherence could be a single world phenomenon. Those facts -- the fact that it doesn't necessarily involve multi-way branching, and the fact that it is hard to evaluate its complexity because there is not a single satisfactory theory for it -- mean it is not a "slam dunk" in Yudkowsky's sense.

The Yudkowsky-Deutsch claim is that there is a single MW theory, which explains everything that needed explaining, and is obviously simpler than its rivals. But coherence doesn't save appearances, and decoherence, while more workable, is not known to be simple. So neither theory has both virtues.

[1] Which makes the term "Everett branch" rather confusing. The writer possibly means a decohered branch, under the mistaken assumption that Everett was talking about them. Everett's dissertation can be found here
