**Followup to:** The prior of a hypothesis does not depend on its complexity

Eliezer wrote:

In physics, you can get absolutely clear-cut issues. Not in the sense that the issues are trivial to explain. [...] But when I say "macroscopic decoherence is simpler than collapse" it is actually *strict* simplicity; you could write the two hypotheses out as computer programs and count the lines of code.

Every once in a while I come across some belief in my mind that clearly originated from someone smart, like Eliezer, and stayed unexamined because after you hear and check 100 correct statements from someone, you're not about to check the 101st quite as thoroughly. The above quote is one of those beliefs. In this post I'll try to look at it closer and see what it really means.

Imagine you have a physical theory, expressed as a computer program that generates predictions. A natural way to define the Kolmogorov complexity of that theory is to find the length of the shortest computer program that *generates* your program, as a string of bits. Under this very natural definition, the many-worlds interpretation of quantum mechanics is almost certainly simpler than the Copenhagen interpretation.

But imagine you refactor your prediction-generating program and make it shorter; does this mean the physical theory has become simpler? Note that after some innocuous refactorings of a program expressing some physical theory in a recognizable form, you may end up with a program that expresses a *different* set of physical concepts. For example, if you take a program that calculates classical mechanics in the Lagrangian formalism, and apply multiple behavior-preserving changes, you may end up with a program whose internal structures look distinctly Hamiltonian.

Therein lies the rub. Do we really want a definition of "complexity of physical theories" that tells apart theories making the same predictions? If our formalism says Hamiltonian mechanics has a higher prior probability than Lagrangian mechanics, which is demonstrably mathematically equivalent to it, something's gone horribly wrong somewhere. And do we even want to define "complexity" for physical theories that don't make any predictions at all, like "glarble flargle" or "there's a cake just outside the universe"?

At this point, the required fix to our original definition should be obvious: cut out the middleman! Instead of finding the shortest algorithm that writes your *algorithm* for you, find the shortest algorithm that outputs the same *predictions*. This new definition has many desirable properties: it's invariant under refactorings, doesn't discriminate between equivalent formulations of classical mechanics, and refuses to specify a prior for something you can never ever test by observation. Clearly we're on the right track here, and the original definition was just an easily fixable mistake.
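To make the amended definition concrete, here is a minimal toy sketch. Everything in it is an assumption for illustration: the "theories" are hypothetical Python expressions in `n`, source length in characters stands in for program length in bits, and equivalence is only checked on a finite set of inputs (true equivalence of programs is undecidable in general). It is not a way to compute real K-complexity, which is uncomputable.

```python
# Toy illustration only: the candidate "programs" are hypothetical Python
# expressions in n, and character count stands in for length in bits.
CANDIDATES = {
    "loop_sum": "sum(i for i in range(n + 1))",
    "closed_form": "n * (n + 1) // 2",
    "squares": "n * n",
}

# Finite test inputs; agreement here is only evidence of equivalence.
INPUTS = range(20)


def predictions(source):
    """Evaluate a candidate on every test input and return its outputs."""
    return tuple(eval(source, {"n": n}) for n in INPUTS)


def prediction_complexity(candidates):
    """Map each distinct prediction tuple to the length of its shortest candidate."""
    shortest = {}
    for source in candidates.values():
        key = predictions(source)
        shortest[key] = min(len(source), shortest.get(key, len(source)))
    return shortest


by_predictions = prediction_complexity(CANDIDATES)
for name, source in CANDIDATES.items():
    # Columns: name, program-level length, prediction-level length.
    print(name, len(source), by_predictions[predictions(source)])
```

In this toy, `loop_sum` and `closed_form` have different program-level lengths but share one prediction-level length, because they produce the same outputs on every tested input.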

But this easily fixable mistake... was the entire reason for Eliezer "choosing Bayes over Science" and urging us to do the same. The many-worlds interpretation makes the same testable predictions as the Copenhagen interpretation right now. Therefore by the amended definition of "complexity", by the *right and proper* definition, they are equally complex. The truth of the matter is not that they express different hypotheses with equal prior probability - it's that they express the *same* hypothesis. I'll be the first to agree that there are very good reasons to prefer the MWI formulation, like its pedagogical simplicity and beauty, but K-complexity is not one of them. And there may even be good reasons to pledge your allegiance to Bayes over the scientific method, but this is not one of them either.

**ETA:** now I see that, while the post is kinda technically correct, it's horribly confused on some levels. See the comments by Daniel_Burfoot and JGWeissman. I'll write an explanation in the discussion area.

**ETA 2:** done, look here.

MWI and Copenhagen do not make the same predictions in all cases, just in testable ones. There is a simple program that makes the same predictions as MWI in all cases. There appears to be no comparably simple program that makes the same predictions as Copenhagen in all cases. So, if you gave me some complicated test which could not be carried out today, but on which the predictions of MWI and Copenhagen differed, and asked me to make a prediction about what would happen if the experiment was somehow run (it seems likely that such experiments will be possible at some point in the extremely distant future) I would predict that MWI will be correct with overwhelming probability. I agree that if some other "more complicated" theory made the same predictions as MWI in every case, then K-complexity would not give good grounds to decide between them.

I guess the fundamental disagreement is that you think MWI and Copenhagen are the same theory because discriminating between them is right now far out of reach. But I think the existence of any situation where they make different hypotheses is precisely sufficient to consider them different theories. I don't know why "testable"...

Yes. As you said, simpler theories have certain advantages over complex theories, such as the possibility of a deeper understanding of what's going on. Of course, in that case we shouldn't exactly optimize the K-complexity of their presentation; we should optimize an informal notion of simplicity or ease of understanding. But complexity of specification is probably useful evidence for those other metrics that are actually useful.

The error related to your preceding post would be to talk about varying probability of differently presented equivalent theories, but I don't remember that happening.

Your "fix" seems problematic too, if it doesn't allow belief in the implied invisible

If you look at the definition of the Solomonoff prior, you'll notice that it's actually a weighted sum over many programs that produce the desired output. This means that a potentially large number of programs, corresponding in this case to different formulations of physics, combine to produce the final probability of the data set.

So what's really happening is that all formulations that produce identical predictions are effectively collapsed into an equivalence class, which has a higher probability than any individual formulation.
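A small numeric sketch of this point, under loud assumptions: the program names, lengths and outputs below are made up, and each program simply gets weight 2^(-length in bits). A real Solomonoff prior sums over all programs of a universal prefix machine and is not computable, so this only illustrates the arithmetic of the weighted sum.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical toy data: (program name, length in bits, output it produces).
# The lengths are invented for illustration, not measured from any real theory.
PROGRAMS = [
    ("formulation_a", 10, "data"),
    ("formulation_b", 11, "data"),
    ("formulation_c", 13, "data"),
    ("other_theory", 10, "other"),
]


def weight(length_bits):
    """Weight of a single program: 2 to the power of minus its length."""
    return Fraction(1, 2 ** length_bits)


def class_weights(programs):
    """Sum the weights of all programs that produce the same output."""
    totals = defaultdict(Fraction)
    for _name, length_bits, output in programs:
        totals[output] += weight(length_bits)
    return dict(totals)


totals = class_weights(PROGRAMS)
best_single = max(weight(bits) for _name, bits, out in PROGRAMS if out == "data")

print("class weight for 'data':", totals["data"])
print("heaviest single formulation:", best_single)
```

The class weight is at least as large as the weight of its shortest member, and strictly larger whenever more than one formulation produces the same output.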

"Therein lies the rub. Do we really want a definition of "complexity of physical theories" that tells apart theories making the same predictions? "

Yes.

"Evolution by natural selection occurs" and "God made the world and everything in it, but did so in such a way as to make it look
*exactly* as if evolution by natural selection occurred" make the same predictions in all situations. You can do perfectly good science with either hypothesis, but the latter postulates an extra entity - it's a less useful way of thinking abou...

Two arguments - or maybe two formulations of the one argument - for complexity reducing probability, and I think the juxtaposition explains why it doesn't feel like complexity should be a straight-up penalty for a theory.

The *human-level* argument for complexity reducing probability is something like: A∩B is more probable than A∩B∩C because the second has three fault-lines, so to speak, and the first only has two, so the second is more likely to crack. *edit: equally or more likely, not strictly more likely.* (For engineers out there; I have found this metaphor to...

I think that while a sleek decoding algorithm and a massive look-up table might be mathematically equivalent, they differ markedly in what sort of process actually carries them out, at least from the POV of an observer on the same 'metaphysical level' as the process. In this case, the look-up table is essentially the program-that-lists-the-results, and the algorithm is the shortest description of how to get them. The equivalence is because, in some kind of sense, process and results imply each other. In my mind, this is a bit like some kind of space-like-inf...
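The contrast between a look-up table and a short algorithm can be sketched in a few lines. This is a toy under stated assumptions: the squaring rule and the domain of 1000 inputs are arbitrary choices, and the length of the source text is only a crude stand-in for description length.

```python
# Toy comparison: the squaring rule and the domain size are arbitrary
# illustrative choices, and text length is a crude proxy for description length.
ALGORITHM_SOURCE = "lambda n: n * n"
DOMAIN = range(1000)

algorithm = eval(ALGORITHM_SOURCE)

# The "program-that-lists-the-results": every answer written out in full.
TABLE_SOURCE = repr({n: algorithm(n) for n in DOMAIN})
lookup_table = eval(TABLE_SOURCE)

# Both descriptions give the same result on every input in the domain.
same_results = all(lookup_table[n] == algorithm(n) for n in DOMAIN)

print("same results:", same_results)
print("algorithm length:", len(ALGORITHM_SOURCE))
print("table length:", len(TABLE_SOURCE))
```

The two descriptions agree on every input, yet the written-out table is far longer than the rule that generates it, and it grows with the domain while the rule does not.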

Prediction making is not a fundamental attribute that hypotheses have. What distinguishes hypotheses is what they are saying is really going on. We use that to make predictions.

The waters get muddy when dealing with fundamental theories of the universe. In a more general case: If we have two theories which lead to identical predictions of the behavior of an impenetrable black box, but say different things about the interior, then we should choose the simpler one. If at some point in the futu...

Suppose, counterfactually, that Many Worlds QM and Collapse QM really always made the same predictions, and so you want to say they are both the same theory QM. It still makes sense to ask what is the complexity of Many Worlds QM and how much probability does it contribute to QM, and what is the complexity of Collapse QM and how much probability does it contribute to QM. It even makes sense to say that Many Worlds QM has a strictly smaller complexity, and contributes more probability, and is the better formulation.

"But imagine you refactor your prediction-generating program and make it shorter; does this mean the physical theory has become simpler?"

Yeah, (given the caveats already mentioned by Vladimir), as any physical theory *is* a prediction-generating program. A theory that isn't a prediction-generating program isn't a theory at all.

At least Quantum Immortality is something that isn't the same under the MWI and under any other interpretation of QM.

There is no QI outside the MWI. Do you postulate any quantum immortal suicider in your MWI branch? No? Why not?

I just added to the post.