Epistemic status: solid math, uncertain applicability. Would love to hear from someone who knows more physics than I do!

Informal Overview

Here's a pretty obvious fact from probability theory:

Let A and B be two disjoint events, meaning that at most one of them can happen. For example, if you flip a coin it might land heads, or it might land tails, or it might do neither, perhaps slipping into a heating duct never to be seen again. But it definitely won't come up both heads and tails. Then:

P(A or B) = P(A) + P(B)

One of the surprising aspects of quantum mechanics is that this fact does not hold in a quantum system.

In a quantum system, an event does not directly have a probability: a real number between 0 and 1. Rather, it has an amplitude: a complex number of magnitude at most 1.

What's the meaning of this amplitude? The Born rule says that the probability that an event will be observed is the square of the magnitude of the amplitude of that event. Labeling this function Born, we have:

Born(a) = |a|²

The surprising fact from quantum mechanics is that when there are two ways, A and B, for something to happen, and you want to know the probability that it happens, you don't add the probabilities of A and B together. You add their amplitudes. For example, if A has some nonzero amplitude a and B has the opposite amplitude −a, then "A or B" has amplitude 0. Looking at their probabilities (obtained via the Born rule), we get:

P(A) = Born(a) = |a|²
P(B) = Born(−a) = |a|²
P(A or B) = Born(a + (−a)) = Born(0) = 0

We combined two events that could happen, to get one that can't!
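Here is a minimal Python sketch of this cancellation. The helper name `born` and the amplitudes 0.5 and -0.5 are illustrative assumptions chosen for the demonstration, not values from a real experiment.

```python
def born(amplitude: complex) -> float:
    """Born rule: probability is the squared magnitude of the amplitude."""
    return abs(amplitude) ** 2


# Assumed, illustrative amplitudes: two ways for the same thing to happen,
# with exactly opposite amplitudes.
amp_a = 0.5
amp_b = -0.5

p_a = born(amp_a)
p_b = born(amp_b)
# Amplitudes are added first; the Born rule is applied to the sum.
p_a_or_b = born(amp_a + amp_b)

print(p_a, p_b, p_a_or_b)
```

With these assumed values each event alone has probability 0.25, while the combined event has probability 0.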

However, in this post I'd like to explain how in a poorly calibrated experiment, this strange quantum behavior will vanish and be replaced with ordinary probability. First I'll give the basic explanation. Then I'll follow up with an analysis of an experiment.

(I've been using sloppy language. You wouldn't normally call events A and B disjoint, nor use the word "or". Though if you look at the setup of an actual experiment that leads to this, and describe it with ordinary language, "disjoint" and "or" are pretty fair words to use to describe the situation. I think the reason we avoid them when talking about quantum mechanics is because the behavior is so strange.)

What happens if you don't calibrate an experiment

In a quantum mechanical experiment, the phase of an amplitude can be very sensitive to the placement of the parts you use. (The phase of a complex number, when it is expressed in polar coordinates, is its angle.) For example, a photon's phase continuously changes as it travels. Every 700nm (for red light), it goes full circle. So if your experiment involves a photon traveling from one component to another, and their placement is off by 350nm, that will multiply the amplitude by -1. And if they're off by 175nm, that will multiply the amplitude by i. You might guess that quantum experiments are often very finicky and need careful calibration. You'd be right.
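To make the numbers concrete, here is a small Python sketch of the phase factor picked up over a path-length offset. It assumes the convention that the phase advances as exp(2πi·d/λ); the opposite sign convention would give the complex conjugate. The function name `phase_factor` is hypothetical.

```python
import cmath
import math

RED_WAVELENGTH_NM = 700.0  # wavelength quoted in the text for red light


def phase_factor(offset_nm: float, wavelength_nm: float = RED_WAVELENGTH_NM) -> complex:
    """Factor multiplying the amplitude when the path length changes by offset_nm.

    Assumes the phase advances by one full turn per wavelength.
    """
    return cmath.exp(2j * math.pi * offset_nm / wavelength_nm)


for offset in (0.0, 175.0, 350.0, 700.0):
    print(offset, phase_factor(offset))
```

Up to floating-point rounding, an offset of 175nm gives i, 350nm gives -1, and 700nm returns to 1.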

As a result, it seems plausible that for some experiments, if you don't calibrate the experiment properly, the phases of the amplitudes in it will be unpredictable, and sufficiently so that it could be accurate to model each of them as an independent and uniformly random variable. In this post, I explore what happens if we make this assumption.

Taking the above example, with amplitude a for A and −a for B, this means that instead of having a combined amplitude of a + (−a) = 0, it would have a combined amplitude of:

sa + t(−a)

where s and t are independent and uniformly random complex variables of magnitude 1.

This is an unusual kind of model: we have a probability distribution over a quantum amplitude. But that's exactly what we should have! We're uncertain about the phases, and when you're uncertain about something you model it with a probability distribution.

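However, what we care about here is the probability, rather than the combined amplitude. To find it, we should take the expected value of the Born rule applied to the amplitude (writing a for the amplitude of A, −a for the amplitude of B, and s and t for their random phases):

E[Born(sa + t(−a))]

Remember that s and t are uniformly random variables of magnitude 1. We can simplify this expression by factoring out s, which does not change the magnitude, and writing w = t/s, which is again uniformly random with magnitude 1:

E[Born(sa + t(−a))] = E[|a − wa|²]

This is the average squared distance from the origin to a random point on a circle of radius |a| centered at a. You can solve this with an integral, and you get 2|a|².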
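For readers who want the missing step, here is a sketch of the calculation without an explicit integral. It uses symbols that are my own labels: $a$ and $b$ for two arbitrary amplitudes, and $s$ and $t$ for independent, uniformly random complex numbers of magnitude 1. The only facts used are $|s| = |t| = 1$ and $\mathbb{E}[s] = \mathbb{E}[t] = 0$.

```latex
\begin{aligned}
\mathbb{E}\,|sa + tb|^2
  &= \mathbb{E}\big[(sa + tb)\,\overline{(sa + tb)}\big] \\
  &= \mathbb{E}\big[|s|^2 |a|^2 + |t|^2 |b|^2 + 2\,\mathrm{Re}(s\bar{t}\,a\bar{b})\big] \\
  &= |a|^2 + |b|^2 + 2\,\mathrm{Re}\big(a\bar{b}\;\mathbb{E}[s]\,\mathbb{E}[\bar{t}]\big) \\
  &= |a|^2 + |b|^2 .
\end{aligned}
```

Setting $b = -a$ gives $2|a|^2$, the case of two opposite amplitudes.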

Which suggests that the quantum-mechanical behavior has vanished! We now have this, which looks just like ordinary probability:

P(A or B) = P(A) + P(B)

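And indeed, if instead of starting with two opposite amplitudes, we start with arbitrary amplitudes a and b, you can do the same math and get:

E[Born(sa + tb)] = Born(a) + Born(b)

where s and t are independent, uniformly random complex numbers of magnitude 1.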

That is, the probabilities of disjoint events add! We started with the rules of quantum mechanics, and obtained a rule from ordinary probability theory.
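This can be checked numerically. The sketch below is my own check, not part of the original derivation: it replaces each uniformly random phase with a grid of equally spaced points on the unit circle and averages the Born probability over all pairs. The names `born`, `unit_phases`, and `expected_born`, and the sample amplitudes, are hypothetical.

```python
import cmath
import math


def born(amplitude: complex) -> float:
    """Born rule: squared magnitude of the amplitude."""
    return abs(amplitude) ** 2


def unit_phases(n: int) -> list:
    """n equally spaced points on the unit circle (a stand-in for a uniform phase)."""
    return [cmath.exp(2j * math.pi * k / n) for k in range(n)]


def expected_born(a: complex, b: complex, n: int = 64) -> float:
    """Average of Born(s*a + t*b) over an n-by-n grid of phases s and t."""
    phases = unit_phases(n)
    total = sum(born(s * a + t * b) for s in phases for t in phases)
    return total / (n * n)


# Arbitrary illustrative amplitudes.
a, b = 0.3 + 0.4j, -0.2j
print(expected_born(a, b), born(a) + born(b))
```

For the squared magnitude the grid average agrees with the exact expectation up to rounding, because the cross term averages to zero over equally spaced phases.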

Possible explanation of the Born rule?

This feels to me like it's close to an explanation of the Born rule. The explanation would go like this:

In our big, messy, classical world we're never certain of the phase of any amplitude. And in the small, precise world of carefully crafted quantum experiments, we pin down the phases precisely. The purpose of the Born rule is to describe the boundary between these two models of the world. And the Born rule turns a deterministic quantum model (which only has amplitudes, not probabilities), into a probabilistic classical model (which only has probabilities, not amplitudes). To justify this leap, we should show that (i) there is legitimate uncertainty that cannot be eliminated by a better model, and (ii) that the laws of probability theory hold in our proposed probabilistic model. For (i), we are legitimately uncertain about the phases of the amplitudes, and there is no feasible way to fix this. For (ii), I have given an argument that disjoint probabilities add under the Born rule, which is the most important of the laws of probability theory.

The trouble with this explanation is that it's only appropriate to add amplitudes together in particular situations, and phases are only random in particular situations, and it's not clear whether these sets of situations overlap much. In the rest of this post, I precisely describe a situation in which they do overlap. Hopefully this can give us a clearer picture of how realistic this explanation is.

The rest of the post

Specifically, I work through one experiment (similar to a Mach-Zehnder interferometer), showing in more detail how you can get into the above situation of adding amplitudes that have random phases. To give an outline:

  1. I describe three experiments with partially-silvered mirrors. Importantly, they are uncalibrated, i.e. their components are not precisely placed. I give a precise definition of what counts as "uncalibrated".
  2. I show that using the Born rule, Born(a) = |a|², the outcome of the experiment can be predicted with classical probability, but using the (incorrect, but plausible) alternative rule of taking the magnitude |a|, it cannot be.

The Experiments

The three experiments use a single-photon light source, partially-silvered mirrors, and a photon detector. The second and third experiments are minor variations on the first. In every experiment, what the experimenter cares about is whether the detector beeps.

Background: Partially-silvered mirrors

Here are diagrams for how a photon interacts with a partially silvered mirror. It can either go straight through or reflect off, and the amplitude of each possibility depends on how silvered the mirror is (an unsilvered mirror would just be glass), and how it is oriented:

(The amplitude α is an unspecified real number between 0 and 1. If it's 1/√2, then the mirror is half-silvered. If α is smaller it's less than half-silvered, and if α is bigger it's more than half-silvered.)

Let's work through an example of how to use these diagrams to predict the outcome of an experiment made up of some mirrors, a single-photon light source, and a photon detector. To determine the probability that the detector beeps, you should:

  1. Consider all paths from the light source to the detector.
  2. Determine the amplitude of each path, by multiplying the amplitudes of each of its steps.
  3. To determine the amplitude of the outcome in which the detector beeps, add together the amplitudes of each path that leads to the detector.
  4. To determine the probability of the outcome, use the Born rule: take the absolute value of the amplitude and square it.
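The four rules above can be sketched in Python as follows. Step 1 (listing the paths) is left to the caller, who passes each path as a list of step amplitudes. The function names and the numeric amplitudes are hypothetical, chosen only for illustration.

```python
def born(amplitude: complex) -> float:
    """Rule 4: squared magnitude of the amplitude."""
    return abs(amplitude) ** 2


def path_amplitude(steps) -> complex:
    """Rule 2: multiply the amplitudes of the steps along one path."""
    amplitude = 1 + 0j
    for step in steps:
        amplitude *= step
    return amplitude


def detection_probability(paths) -> float:
    """Rules 3 and 4: add the path amplitudes, then apply the Born rule."""
    total = sum(path_amplitude(steps) for steps in paths)
    return born(total)


# Hypothetical step amplitudes, not read off any diagram in this post.
one_path = [[0.6, 0.8]]
two_cancelling_paths = [[0.5, 0.6], [-0.5, 0.6]]

print(detection_probability(one_path))
print(detection_probability(two_cancelling_paths))
```

The second example shows interference: each path alone would give a nonzero probability, but their amplitudes cancel before the Born rule is applied.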

Those are the rules. Let's apply them to a simple example experiment:

Applying the rules:

  1. There's only one path that leads to the detector. It goes right, down, right.
  2. We multiply the amplitudes we see along the path ( and ) to get .
  3. If there were other paths, we'd add their contributions too. But there's only one so our total amplitude is .
  4. By the Born rule, the probability of the detector beeping is .

That's all there is to it.

First experiment: Uncalibrated Mach-Zehnder interferometer

Here's the setup of the first experiment:

Note that B and C are partially silvered, but need not be exactly half-silvered. I call their amplitude of reflection and respectively. There's one tricky thing going on here, on the path from D to the detector. This amplitude is either or , depending on whether the incoming photon came from B or C, respectively. (This follows from the diagrams I gave for the mirrors. I'm not just making it up!)

(If you know what a Mach-Zehnder interferometer is, this setup is the same except that B and C are only partially-silvered, allowing the photon to sometimes escape into the air.)

There's one other crucial feature for this setup, which is the point of this post. I gave an intuitive argument in the introduction that an experiment like this is extremely sensitive to the placement of its parts and must be carefully calibrated. I'm going to assume that this setup was not carefully calibrated, its parts are not fastened down, and as a result the phase of a photon taking any path is, for all practical purposes, uniformly random. More precisely:

Strong Uncalibration Assumption: The phase change of a photon traveling through the apparatus can be accurately modeled as uniformly random. Furthermore, the phase change of a photon is independent from one run of the experiment to the next, and the phase change along the top path is independent of the phase change along the bottom path.

Honestly, I'm not sure how realistic this assumption is; I'm relying on intuition here. But if it does hold, it gives a very satisfying account of how quantum mechanics degenerates into ordinary probability when things are uncalibrated. I'd love to hear from a physicist on the matter.

Calculating the amplitude

Let's see how that works. First, what is the amplitude that the detector beeps?

There are two paths to consider: A-B-D-detector, and A-C-D-detector. Their contributions are:

However, that is only true if the experiment is perfectly calibrated. We are assuming instead that it is uncalibrated, and that the phases of the paths can be modeled as uniformly random.

I'll be talking a lot about uniformly random phases, so I'll introduce some notation for it. Define ⨀ to be the probability distribution of a complex number chosen uniformly at random from the unit circle. For example:

  • r⨀ is a uniformly random amplitude with magnitude r.
  • c + r⨀ is an amplitude chosen uniformly at random from a circle with radius r centered at c.
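As a rough illustration of this notation, the following Python sketch draws samples from the uniform distribution on the unit circle, then scales and shifts them. The helper `sample_unit_phase`, the fixed seed, and the values of `center` and `radius` are all assumptions made for the example.

```python
import cmath
import math
import random


def sample_unit_phase(rng: random.Random) -> complex:
    """Draw one complex number uniformly at random from the unit circle."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return cmath.exp(1j * angle)


rng = random.Random(0)  # fixed seed so runs are repeatable
center, radius = 0.5 + 0j, 0.25  # hypothetical values for illustration

# Scaling a unit phase gives a random amplitude of fixed magnitude.
scaled = [radius * sample_unit_phase(rng) for _ in range(5)]
# Adding a constant moves the circle so it is centered at `center`.
shifted = [center + radius * sample_unit_phase(rng) for _ in range(5)]

print([round(abs(z), 6) for z in scaled])
print([round(abs(z - center), 6) for z in shifted])
```

Every scaled sample has magnitude `radius`, and every shifted sample sits at distance `radius` from `center`; only the angle varies.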

Using this notation, we can correct our calculation of the total amplitude by accounting for the random phases caused by the uncalibrated setup:

(I'm calling this AMPLITUDE_1, as it's the result of the first experiment.)

Because the negative signs make no difference when multiplied by a uniformly random phase, this becomes:

Second experiment:

The second experiment is like the first, but with half-silvered mirror A removed:

With this setup, there is only one path to consider, A-B-D-detector. The amplitude of this path (removing an unnecessary negative sign) is:

Third experiment:

The third experiment is like the first, but with the half-silvered mirror at A replaced with a full mirror:

Now there is only the path A-C-D-detector, with amplitude:

These uncalibrated experiments act classically

Using the Born rule, we can determine the probability of the photon being detected in each of the three experiments. Since in each case we have a probability distribution over a quantum amplitude, we will take the expected value of the Born probability:

(I omit the proofs, but have evaluated these integrals by hand and checked the results with Wolfram Alpha.)

This is classical behavior! Naive classical reasoning would proceed as follows:

  • In the first experiment, the photon hits the half-silvered mirror at A, and either goes straight or reflects down with equal probability.
  • If it goes straight through A, then it has the same chance of hitting the detector as the photon in the second experiment.
  • If it reflects off of A, then it has the same chance of hitting the detector as the photon in the third experiment.
  • Thus the probability of the detector going off in the first experiment is the average of the probability that it goes off in the second and the probability that it goes off in the third.

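Which matches the conclusion: the probability of detection in the first experiment is the average of the probabilities of detection in the second and third experiments.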

We can also work backwards, and take it as a given that this uncalibrated experiment should act classically. Doing so yields an equation that must constrain the Born rule:

One obvious alternative to the Born rule is to simply take the magnitude of the amplitude, |a| (why would you square it?). However, doing so violates the above equation, so it would not produce classical probability. I'm not sure if this equation has a unique solution, but it is at least quite restrictive.
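Here is a rough numerical comparison of the two rules. It rests on assumptions that I am making explicit: x and y stand for the (hypothetical) single-path amplitudes of the second and third experiments, each path in the first experiment picks up an extra factor of magnitude 1/√2 at the half-silvered mirror A, and the random phases are approximated by a grid of equally spaced points on the unit circle. All function names are illustrative.

```python
import cmath
import math


def born(amplitude: complex) -> float:
    """Born rule: squared magnitude."""
    return abs(amplitude) ** 2


def magnitude_rule(amplitude: complex) -> float:
    """Alternative rule: plain magnitude, with no squaring."""
    return abs(amplitude)


def unit_phases(n: int) -> list:
    """n equally spaced points on the unit circle."""
    return [cmath.exp(2j * math.pi * k / n) for k in range(n)]


def first_experiment(rule, x: complex, y: complex, n: int = 64) -> float:
    """Expected probability with both paths open and independent random phases."""
    phases = unit_phases(n)
    total = sum(rule((s * x + t * y) / math.sqrt(2)) for s in phases for t in phases)
    return total / (n * n)


def classical_average(rule, x: complex, y: complex) -> float:
    """Average of the single-path probabilities (second and third experiments)."""
    return 0.5 * rule(x) + 0.5 * rule(y)


x, y = 0.5, 0.5  # hypothetical single-path amplitudes
for rule in (born, magnitude_rule):
    print(rule.__name__, first_experiment(rule, x, y), classical_average(rule, x, y))
```

Under these assumptions the two numbers agree for the Born rule and differ for the magnitude rule, which is the sense in which the alternative fails to reproduce classical probability.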

12 comments

I would need to think about this more to be sure, but from my first read it seems as if your idea can be mapped to decoherence.

The maths you are using looks a bit different from what I am used to, but I am somewhat confident that your uncalibrated experiment is equivalent to a suitably defined decohering quantum channel. The amplitudes that you are calculating would be transition amplitudes from the prepared initial state to the measured final state. (Denoting the initial state as |i>, the final state as |f> and the time evolution operator as U, your amplitudes would be <f|U|i> in the notation of the linked Wikipedia article.) The go-to method for describing statistical mixtures over quantum states or transition amplitudes is to switch from wave-functions and operators to density matrices and quantum channels (physics lectures about open quantum systems or quantum computing will introduce these concepts). They should be equivalent to (more accurately: a super-set of) your averaging over s and t for the uncalibrated experiment: one can define a time evolution operator for fixed values of s and t, and then take the probability-weighted integral (compare the operator-sum representation in the Wikipedia article) to arrive at the corresponding channel.

Regarding all the interesting aspects regarding the Born rule, I cannot contribute at the moment.

Thanks for all the pointers! I was, somewhat embarrassingly, unaware of the existence of that whole field.

I'm glad if this was helpful.

I was also surprised to learn about this formalism at my university, as it wasn't mentioned in either the introductory or the advanced lecture on QM, but it turns out to be very helpful for understanding how/when classical mechanics can be a good approximation in a QM universe.

You've got roughly the right ideas; your setup is basically the standard setup of quantum statistical mechanics. In particular, we usually use a density matrix to represent classical uncertainty over the quantum state of a system.

I haven't closely read the details on the hypothetical experiments yet, but I want to comment on the technical details of the quantum mechanics at the beginning.

In quantum mechanics, probabilities of mutually exclusive events still add: P(A∨B)=P(A)+P(B). However, things like "particle goes through slit 1 then hits spot x on screen" and "particle goes through slit 2 then hits spot x on screen" aren't such mutually exclusive events.

This may seem like I'm nit-picking, but I'd like to make the point by example. Let's say we have a state where . If we simply add the complex amplitudes to try to calculate , we get 0; in actuality, we should get as we expect from classical logic.

Here's where I bad-mouth the common way of writing the Born rule in intro quantum material as and the way I'd been using it. By writing the state as and the event as we've made it look like they're both naturally represented as vectors in a Hilbert space. But the natural form of a state is as a density matrix, and the natural form of an event is as an orthogonal projection; I want to focus on events and projections. For mutually exclusive events and with projections and , the event has the corresponding projection .

So where's the adding of amplitudes? Let's pretend I didn't just say states are naturally density matrices and let's take the same state from above and an arbitrary projection corresponding to some event. The Born rule takes the following form:

This is notably not just an contribution plus a contribution; the other terms are the interference terms. Skipping over what a density matrix is, let's say we have a density matrix . The Born rule for density matrices is

Now this one is just a sum of two contributions, with no interference.

This ended up longer and more rambling than I'd originally intended. But I think there's a lot to the finer details of how probabilities and amplitudes behave that are worth emphasizing.

Thanks for taking the time to write this response up! This made some things click together for me.

In quantum mechanics, probabilities of mutually exclusive events still add: P(A∨B)=P(A)+P(B). However, things like “particle goes through slit 1 then hits spot x on screen” and “particle goes through slit 2 then hits spot x on screen” aren’t such mutually exclusive events.

That's a good point; is a strong precise notation of "mutually exclusive" in quantum mechanics. I meant to say that "events whose amplitudes you add" would often naturally be considered mutually exclusive under classical reasoning. ("Slit 1 then spot x" and "slit 2 then spot x" sure sound exclusive). And that if the phases are unknown then the classical reasoning actually works.

But that's kind of vague, and my whole introduction was sloppy. I added it after the fact; maybe should have stuck with just the "three experiments".

The Born rule takes the following form:

Ah! So the first Born rule you give is the only one I saw in my QM class way back when.

The second one I hadn't seen. From the wiki page, it sounds like a density matrix is a way of describing a probability distribution over wavefunctions. Which is what I've spent some time thinking about (though in this post I only wrote about probability distributions over a single amplitude). Except it isn't so simple: many distributions are indistinguishable, so the density matrix can be vastly smaller than a probability distribution over all relevant wavefunctions.

And some distributions ("ensembles") that sound different but are indistinguishable:

The wiki page: Therefore, unpolarized light cannot be described by any pure state, but can be described as a statistical ensemble of pure states in at least two ways (the ensemble of half left and half right circularly polarized, or the ensemble of half vertically and half horizontally linearly polarized). These two ensembles are completely indistinguishable experimentally, and therefore they are considered the same mixed state.

This is really interesting. It's satisfying to see things I was confusedly wondering about answered formally by von-Neumann almost 100 years ago.

That's a good point; is a strong precise notation of "mutually exclusive" in quantum mechanics. (...)

I'd be remiss at this point not to mention Gleason's theorem: once you accept that notion of mutually exclusive events, the Born rule comes (almost) automatically. There's a relatively large camp that accepts Gleason's theorem as a good proof of why the Born rule must be the correct rule, but there's of course another camp that's looking for more solid proofs. Just on a personal note, I really like this paper, but I haven't seen much discussion about it anywhere.

But that's kind of vague, and my whole introduction was sloppy. I added it after the fact; maybe should have stuck with just the "three experiments".

The general idea of adding terms without interference effects when you average over phases is solid. I will have to think about it more in the context of alternative probability rules; I've never thought about any relation before.

From the wiki page, it sounds like a density matrix is a way of describing a probability distribution over wavefunctions. Which is what I've spent some time thinking about (though in this post I only wrote about probability distributions over a single amplitude). Except it isn't so simple: many distributions are indistinguishable, so the density matrix can be vastly smaller than a probability distribution over all relevant wavefunctions.
And some distributions ("ensembles") that sound different but are indistinguishable:
This is really interesting. It's satisfying to see things I was confusedly wondering about answered formally by von-Neumann almost 100 years ago.

Yeah, some of these sorts of things that are really important for getting a good grasp of the general situation don't often get any attention in undergraduate classes. Intro quantum classes often tend to be crunched for time between teaching the required linear algebra, solving the simple, analytically tractable problems, and getting to the stuff that has utility in physics. I happened to get exposed to density matrices relatively early as an undergraduate, but I think there's probably a good number of students who didn't see it until graduate school.

Roughly speaking, there's two big uses for density matrices. One, as you say, is the idea of probability distributions over wavefunctions (or 'pure states') in the minimal way. But the other, arguably more important one, is simply being able to describe subsystems. Only in extraordinary cases (non-entangled systems) is a subsystem of some larger system going to be in a pure state. Important things like the no-communication theorem are naturally expressed in terms of density matrices.

Von Neumann invented/discovered such a huge portion of the relevant linear algebra behind quantum mechanics that it's kind of ridiculous.

Taking on the Born rule is a tall order. It's up there with P vs NP by the amount of effort expended by extremely smart people. It is quite likely that it would emerge from integrating general relativity with quantum mechanics somewhere at the level where irreversible classicality emerges, probably for the objects on the order of the Planck mass (10^19 atoms).

I just find it mighty suspicious that when you add two amplitudes of unknown phase, their Born probabilities add:

E[Born(sa + tb)] = Born(sa) + Born(tb)    when s, t ~ ⨀

But, judging from the lack of object-level comments, no one else finds this suspicious. My conclusion is that I should update my suspicious-o-meter.

The way probabilities are calculated is quite constrained. The Kochen–Specker theorem might be a worthwhile place to start from.

If α is smaller it's less than half-silvered, and if α is bigger it's more than half-silvered.

Just a nit, but isn't this backwards? Less silvering means less reflection and more transmission, but this first diagram labels the transmitted amplitude as α, not the reflected amplitude.

Thanks. It was the diagram that was backwards; I meant for α to be the amplitude of reflection, not of transmission. I updated the diagram.