Dec 16, 2010

189 comments

**Related to: **Infinite Certainty

Suppose the people at FiveThirtyEight have created a model to predict the results of an important election. After crunching poll data, area demographics, and all the usual things one crunches in such a situation, their model returns a greater than 999,999,999 in a billion chance that the incumbent wins the election. Suppose further that the results of this model are your only data and you know nothing else about the election. What is your confidence level that the incumbent wins the election?

Mine would be significantly less than 999,999,999 in a billion.

When an argument gives a probability of 999,999,999 in a billion for an event, then probably the majority of the probability of the event is no longer in "But that still leaves a one in a billion chance, right?". The majority of the probability is in "That argument is flawed". Even if you have no particular reason to believe the argument is flawed, the background chance of an argument being flawed is still greater than one in a billion.

More than one in a billion times a political scientist writes a model, ey will get completely confused and write something with no relation to reality. More than one in a billion times a programmer writes a program to crunch political statistics, there will be a bug that completely invalidates the results. More than one in a billion times a staffer at a website publishes the results of a political calculation online, ey will accidentally switch which candidate goes with which chance of winning.

So one must distinguish between levels of confidence internal and external to a specific model or argument. Here the model's internal level of confidence is 999,999,999/billion. But my external level of confidence should be lower, even if the model is my only evidence, by an amount proportional to my trust in the model.

One might be tempted to respond "But there's an equal chance that the false model is too high, versus that it is too low." Maybe there was a bug in the computer program, but it prevented it from giving the incumbent's real chances of 999,999,999,999 out of a *trillion*.

The prior probability of a candidate winning an election is 50%^{1}. We need information to push us away from this probability in either direction. To push significantly away from this probability, we need strong information. Any weakness in the information weakens its ability to push away from the prior. If there's a flaw in FiveThirtyEight's model, that takes us away from their probability of 999,999,999 in of a billion, and back closer to the prior probability of 50%

We can confirm this with a quick sanity check. Suppose we know nothing about the election (ie we still think it's 50-50) until an insane person reports a hallucination that an angel has declared the incumbent to have a 999,999,999/billion chance. We would not be tempted to accept this figure on the grounds that it is equally likely to be too high as too low.

A second objection covers situations such as a lottery. I would like to say the chance that Bob wins a lottery with one billion players is 1/1 billion. Do I have to adjust this upward to cover the possibility that my model for how lotteries work is somehow flawed? No. Even if I am misunderstanding the lottery, I have not departed from my prior. Here, new information really does have an equal chance of going against Bob as of going in his favor. For example, the lottery may be fixed (meaning my original model of how to determine lottery winners is fatally flawed), but there is no greater reason to believe it is fixed in favor of Bob than anyone else.^{2}**Spotted in the Wild**

The recent Pascal's Mugging thread spawned a discussion of the Large Hadron Collider destroying the universe, which also got continued on an older LHC thread from a few years ago. Everyone involved agreed the chances of the LHC destroying the world were less than one in a million, but several people gave extraordinarily low chances based on cosmic ray collisions. The argument was that since cosmic rays have been performing particle collisions similar to the LHC's zillions of times per year, the chance that the LHC will destroy the world is either literally zero, or else a number related to the probability that there's some chance of a cosmic ray destroying the world so miniscule that it hasn't gotten actualized in zillions of cosmic ray collisions. Of the commenters mentioning this argument, one gave a probability of 1/3*10^22, another suggested 1/10^25, both of which may be good numbers for the internal confidence of this argument.

But the connection between this argument and the general LHC argument flows through statements like "collisions produced by cosmic rays will be exactly like those produced by the LHC", "our understanding of the properties of cosmic rays is largely correct", and "I'm not high on drugs right now, staring at a package of M&Ms and mistaking it for a really intelligent argument that bears on the LHC question", all of which are probably more likely than 1/10^20. So instead of saying "the probability of an LHC apocalypse is now 1/10^20", say "I have an argument that has an internal probability of an LHC apocalypse as 1/10^20, which lowers my probability a bit depending on how much I trust that argument".

In fact, the argument has a potential flaw: according to Giddings and Mangano, the physicists officially tasked with investigating LHC risks, black holes from cosmic rays might have enough momentum to fly through Earth without harming it, and black holes from the LHC might not^{3}. This was predictable: this was a simple argument in a complex area trying to prove a negative, and it would have been presumptous to believe with greater than 99% probability that it was flawless. If you can only give 99% probability to the argument being sound, then it can only reduce your probability in the conclusion by a factor of a hundred, not a factor of 10^20.

But it's hard for me to be properly outraged about this, since the LHC did not destroy the world. A better example might be the following, taken from an online discussion of creationism^{4} and apparently based off of something by Fred Hoyle:

In order for a single cell to live, all of the parts of the cell must be assembled before life starts. This involves 60,000 proteins that are assembled in roughly 100 different combinations. The probability that these complex groupings of proteins could have happened just by chance is extremely small. It is about 1 chance in 10 to the 4,478,296 power. The probability of a living cell being assembled just by chance is so small, that you may as well consider it to be impossible. This means that the probability that the living cell is created by an intelligent creator, that designed it, is extremely large. The probability that God created the living cell is 10 to the 4,478,296 power to 1.

Note that someone just gave a confidence level of 10^4478296 to one and was wrong. This is the sort of thing that should *never ever happen*. This is possibly the *most wrong anyone has ever been*.

It is hard to say in words exactly how wrong this is. Saying "This person would be willing to bet the entire world GDP for a thousand years if evolution were true against a one in one million chance of receiving a single penny if creationism were true" doesn't even begin to cover it: a mere 1/10^25 would suffice there. Saying "This person believes he could make one statement about an issue as difficult as the origin of cellular life per Planck interval, every Planck interval from the Big Bang to the present day, and not be wrong even once" only brings us to 1/10^61 or so. If the chance of getting Ganser's Syndrome, the extraordinarily rare psychiatric condition that manifests in a compulsion to say false statements, is one in a hundred million, and the world's top hundred thousand biologists all agree that evolution is true, then this person should preferentially believe it is more likely that all hundred thousand have simultaneously come down with Ganser's Syndrome than that they are doing good biology^{5}

This creationist's flaw wasn't mathematical; the math probably does return that number. The flaw was confusing the internal probability (that complex life would form completely at random in a way that can be represented with this particular algorithm) with the external probability (that life could form without God). He should have added a term representing the chance that his knockdown argument just didn't apply.

Finally, consider the question of whether you can assign 100% certainty to a mathematical theorem for which a proof exists. Eliezer has already examined this issue and come out against it (citing as an example this story of Peter de Blanc's). In fact, this is just the specific case of differentiating internal versus external probability when internal probability is equal to 100%. Now your probability that the theorem is false is entirely based on the probability that you've made some mistake.

The many mathematical proofs that were later overturned provide practical justification for this mindset.

This is not a fully general argument against giving very high levels of confidence: very complex situations and situations with many exclusive possible outcomes (like the lottery example) may still make it to the 1/10^20 level, albeit probably not the 1/10^4478296. But in other sorts of cases, giving a very high level of confidence requires a check that you're not confusing the probability inside one argument with the probability of the question as a whole.

**Footnotes**

**1.** Although technically we know we're talking about an incumbent, who typically has a much higher chance, around 90% in Congress.**2.** A particularly devious objection might be "What if the lottery commissioner, in a fit of political correctness, decides that "everyone is a winner" and splits the jackpot a billion ways? If this would satisfy your criteria for "winning the lottery", then this mere possibility should indeed move your probability upward. In fact, since there is probably greater than a one in one billion chance of this happening, the majority of your probability for Bob winning the lottery should concentrate here!**3.** Giddings and Mangano then go on to re-prove the original "won't cause an apocalypse" argument using a more complicated method involving white dwarf stars. **4.** While searching creationist websites for the half-remembered argument I was looking for, I found what may be my new favorite quote: "Mathematicians generally agree that, statistically, any odds beyond 1 in 10 to the 50th have a zero probability of ever happening." **5.** I'm a little worried that five years from now I'll see this quoted on some creationist website as an actual argument.