Epistemic status: some thoughts about a complex problem

TL;DR: Goldbach conjecture – that any even number could be presented as a sum of two primes – is more likely to be a random coincidence than a provable rule, because for large numbers there are very many possible pairs.


Here I investigate Goldbach's Conjecture (GC), which states that any even number greater than 2 can be expressed as the sum of two prime numbers. Despite numerous attempts, this conjecture has yet to be proven. I apply logical probability theory to the hypothesis that GC is not a rule, but merely a coincidence. I demonstrate that there are four possible scenarios: 


(1) GC is true up to infinity, but without any underlying rule, due to pure statistical reasons; 

(2) GC is true, and there is a rule, but it is unprovable (in a Godel-style solution);

(3) GC is true and there is a rule that can be proved, but it was not proved yet; 

(4) GC is false for some very large number.


I then demonstrate that (1) has highest logical probability.

First, I look at statistics. There is nothing surprising that a large odd number N could be presented as a sum of two primes because for each large number there are many sums of pairs of primes which produce that number, and thus there is a significant probability that at least one sum includes two primes. 

The probability that there will be at least one sum of two primes producing even number N grows very quickly with N growth. This probability estimation was already made in literature and it was demonstrated that the chances of GC being false is less than 10-4000 for numbers higher than 2,000,000, and it was already tested for numbers lower than this.

Then I look at the theory of verifiers and complexity theory. This makes probabilistic proof stronger than any analytic proof as analytic proofs tend to have significantly higher rate of errors, like 1 in 100, and the chances of errors grow with complexity of proofs. Extensive futile prior search of proof rules out existence of any simple proof. 

Finally, I look at the idea that there is a “rule”, but there is no instance of its application, as all supposed instances are easily statistically explained. I find it being an argument against existing of the rule. The existence of the rule in GC should be visible, first of all, for very small even numbers as some statistical anomaly.


Goldbach’s conjecture (GC) states that any even number could be presented as a sum of two primes. At first, it looks like a very neat theorem that could be formally proved. 

But what if GC is not a theorem that can be proved, but just a coincidence, which is a result of a very large but semi-random process? 

It is not surprising that GС holds for a given large number N, as there are many pairs of prime numbers P+Q that give N, and the number of possible pairs grows very quickly for larger N. 

For example, for N=100 there are 6 pairs of primes: 97+3, 89+11, 83+17, 71+29, 59+41, 53+47. For N=1000 there are 28 pairs and for N=10 000 there are 127 pairs of prime numbers which produce them.

Coincidences sometimes happen in math. For example, in the beginning of e= 2,718281828… the combination of digits (1828) repeats two times and 1828 is also the date of birth of Lev Tolstoy. There is not any additional proof for this and it doesn’t have any meaning, except a useful mnemonic technic. It is just a coincidence.


Sheldon’s estimation of GC’s probability

In the article “A Statistician’s Approach to Goldbach’s Conjecture” Neil Sheldon showed that for the number N = 2 000 000, the chances that none of the members of any pair that produce N in sum are prime is 10-4663. (Note that is not the machine testing of GC, but probabilistic estimate based on the expected number of primes.

Then Sheldon shows that the probability that GC is false for numbers between N= 2 000 000 and N= 20 000 000 is less than 10-4656.

Next Sheldon numerically demonstrated that this probability declines very quickly for subsequent blocks of numbers 10-34100 for (20 mln, 200 mln), and 10-261000 for (200 mln, 2 billions). Then he multiplies all probabilities that GH is true for all blocks, like (1-10-34100 ) (1-10-261000) and shows that only first member of the multiplication matter as all the subsequent members are very close to 1. Based on this he concludes that the probability of GH being false is less than 10-4656, if we know that GC holds up to N=2 000 000 (and we know this, based on computer testing of GC).

There are two main problems with Sheldon’s calculation: first, it is numerical, but it seems to be possible to rewrite it algebraically, as his line of reasoning is clear. Second, it becomes strong only for large numbers, but GC holds even from single digits (e.g. 8=5+3). So, the rule should exist? 

Also, Sheldon’s estimation is based on the assumption of random distribution of primes. But what if it is not random or we “run out of primes” for large N? At least one of primes in sum should be more than N/2, so we need to be sure that very large primes continue to appear (It is known as https://en.wikipedia.org/wiki/Bertrand%27s_postulate).

Sheldon’s statistical proof of GC is based on some intuition about the distribution of primes, but GC is also a claim about distribution of primes, so there is a circularity. He uses Gauss distribution of primes as prior:


N(primes below n) = n / ln n.


If GC turns to be false for very large numbers (no sum of primes for some very large rogue N), all previously observed confirmations of GC will turn out to be mere coincidences. 

Logical probability

Logical probability is a measure of our uncertainty about mathematical facts, like what is the probability that the millionth digit in pi is 9?  Logical probability was explored in the article by MIRI https://arxiv.org/pdf/1609.03543.pdf


Central argument

Here I suggest that there are 4 possible situations with GC, and that (1) situation has the highest logical probability:


  1. GC is true up to infinity but for purely statistical reasons, and no “rule” exists. (“Rule” here means “a principle that governs the behaviour of numbers”).
  2. GC is true, and the rule exists, but the rule is unprovable (Godel-style solution).
  3. GC is true and the rule exists and can be proved, but any proof is very complex and error-prone.
  4. GC is false for some a very large number.


Applying logical probability to the scenarios

The first three situations from our central argument are observationally undistinguishable, and the failure to find a relatively simple proof in the last 250 years may imply that solutions “1” or “2” are more likely – in the sense of logical probability – to be true. 

The proof, if it exists, is very computationally complex, and therefore more likely to be error-prone and false. Complex proofs are more likely to have hidden errors and their verification requires (now) very highly trained and rare humans and is also time-consuming. Machine verifiers also need to be verified themselves (see Yampolsky https://arxiv.org/abs/1609.00331 ), either by machines or humans which creates an eternal loop of verifications or at least very high complexity. And the complexity of  a proof is proportional to its probability to be false (see also Scott Aaronson about complexity arguments https://www.scottaaronson.com/papers/philos.pdf). 

The more time passes, the more evidence we get that (4) is false from computational tests and also the more we should doubt any proof (3) as its complexity grows. Therefore, the logical probability of (1) and (2) is growing over time. 

But is where the way to distinguish (1) and (2)? The rule will show itself if there will be a statistical deviation from the random distribution of the number of possible representations for any even number. Such deviation does actually exist and is known as the “Goldbach comet”.

But: Wiki: “It is interesting to speculate on the possibility of any number E having zero prime pairs, taking these Gaussian forms as probabilities, and assuming it is legitimate to extrapolate to the zero-pair point. If this is done, the probability of zero pairs for any one E, in the range considered here, is of order 10−3700. The integrated probability over all E to infinity, taking into account the narrowing of the peak width, is not much larger. Any search for violation of the Goldbach conjecture may reasonably be expected to have these odds to contend with.” https://en.wikipedia.org/wiki/Goldbach%27s_comet 

This 10−3700 is close to Sheldon’s estimation discussed above.

While the “Goldbach comet” chart (the distribution of sums) has some peculiarities, it looks like a random process.

In some sense, if a “mathematical god” (or mathematical universe) exists, GC is a rule, but if no such god, then GC is just a coincidence. 


GC as a coincidence is more probable than validity of any formal proof

Yampolskiy wrote about the limited provability of proofers. Any proofer, human or machine, can have errors and this creates a fundamental limit on the probability that any given proof P is true. 

Let p(P) be the maximum attainable probability that a randomly taken (from all accepted mathematical proofs) proof is true. I assume that p(P) is around 0.999, but the real number could be smaller and depends on the type of proof and the year it was suggested.

In general, shorter and simpler proofs are more likely to be true. Some proofs are too long to be checked by one human or need years of work from a specially trained person to be checked. Even if a long proof can be machine verified, the verifier itself can have errors, which creates an unresolvable source of uncertainty as was shown by Yampolsky.

The idea that GC is just a coincidence is simple and the proof of this is very simple. It means that it has higher a priory probability to be true than any long and complex proof which will ever appear. And it seems that there are no simple proofs of GC, after more than 200 years of search. 

We could think about a hypothetical prediction market on which different objects with different “logical probabilities” are evaluated, as was suggested in the article about Logical Induction by MIRI. I expect that in such a market, GC-as-coincidence will be rated higher than any single proof, for example, a random proof pulled out from arxiv.org. 

There are several papers which claim that they have proved GC (how many?) As there are many “proofs”, most of them should be false, which gives us a baseline chance that any new proof is true. (I assume here that there is only one correct proof, but actually could be many different valid proofs, but not for such complex and well-researched topics as GC.)

In short, coincidence theory has chances of being false 1 to 10−3700 and any future formal proof has chances to be false around 1 to 100. Thus, I still should bet on the coincidence theory, even if the formal proof is convincing and generally accepted.


How will counterfactual universe look where GC is false?

In the such universe, GC will still hold for non-extremely large numbers, as statistic beats chance here, but the difference could become visible for either small or for extremely large numbers. 

Difference 1: there will be some small number around N=100, for which GC will not hold. (Let’s check if there are real numbers which have a very small number of representations as a sum of pairs). We then will know that it is false in our universe.

Difference 2: There could be also an extremely large number, maybe beyond anything we could write down, for which there is no pair of primes. We don’t know this and can’t know for sure without formal proof. 

Only difference 2 could hold in our observable universe. In it, is just a coincidence that we didn’t yet encounter the exception.


Very improbable event doesn’t need formal proof

While GC looks like a claim which should have formal proof, it could be valid even if no such proof exists, because the event it describes is very unlikely. It is better to reformulate GC: there is no such even integer N, which is not a sum of two primes.

While GC looks like a claim about existence, it is actually a claim about non-existence. And the non-existence of something is more likely to be a coincidence. That is, this is the difference of GC from the Great Fermat Theorem which claims the universal non-existence of n such that xn + yn = zn for n more than 2. Fermat theorem is surprising, but GC is unsurprising from probabilistic reasons alone. 

Even if GC is false, it may be false for so large numbers that we will never observe them in computations. Thus, computational tests of GC give us a very small Bayesian update as it was a priori clear that GC violations are very unlikely even if no such GC-rule exists. 


Counterarguments: GC is a law, not a coincidence

GC holds for small numbers

One may argue that GC is so strong for the numbers below 100 that we should assume that there is a rule. This could be tested empirically by assuming a random distribution of pairs with some property, not primes and then comparing it actual distribution of primes.

Non-random distribution of primes

We could search for non-randomness in prime distribution as evidence against GC-as-coincidence. E.g. twin primes are an example of non-randomness. And if there are too many primes in some place there should be a void.

We are close to real proof, and other similar problems were solved formally

For example, the Great Fermat Theorem was finally proved. I argue here that the statistic nature of GC is different from many other similar-sounding claims.


The idea of logical probability can help us to get a better understanding of how we should interpret the complexity of proofs: the more complex is proof, the less are chances that it is valid, and at some level, these chances fall below background levels (=a priori probability) of what we should expect about the validity of a mathematical claim. 

GC is a good example of the above: GC has a very strong a priori probability to be merely a coincidence + no simple proof exists. Therefore, we should bet that it is just a coincidence even after possible proof appear.

New to LessWrong?

New Comment
15 comments, sorted by Click to highlight new comments since: Today at 8:32 AM

Does your argument fail for https://en.wikipedia.org/wiki/Goldbach%27s_weak_conjecture?

If so, can you explain why?  If not, it seems your argument is no good, as a good proof of this (weaker) claim exists.

Not that you asked my advice, but I would stay away from number theory unless you get a lot of training.

The existence of a simple probability argument for Goldbach's conjecture, actually favors the existence of a simple proof. Primes aren't actually random; if the assumption of randomness already implies the truth of the conjecture with high probability, then a few extra bits of provable structural information about their distribution may be enough to supply a proof. 

Yes. But also what works here is not the randomness of distrubution of primes, but the number of attempts (to get a sum of primes) which is implied by any sufficiently large number (N/20. Only very large gaps in prime distribution will be sufficient to disprove statistical proof. There is a postulate that there are no such gaps exist https://en.wikipedia.org/wiki/Bertrand%27s_postulate

Just to be clear, Bertrand's postulate is actually a theorem (i.e. a known result), not a postulate/hypothesis. It is called a "postulate" for historical reasons.

Interesting!  I wonder if you could find some property of some absurdly large number, then pretend you forgot that this number has this property and then construct a (false) proof that with extremely high probability no number has the property.  

Yes. I thought about finding another example of such pseudo-rule, but didn't find yet. 

We could use such examples to estimate logical probability that Goldbach conjecture is false, like share of eventually disproved conjectures to the number of all conjectures (somehow normalised by their complexity and initial plausibility) .

Normalizing such a thing would be fraught with difficult-to-justify assumptions. Conjectures aren't a random process so most conceivable reference classes would likely suffer from severe biases.

Number theory is a subtle art with a long history of dabblers and crackpots. Personally, I wouldn't update significantly unless such an estimation were done by someone with a strong mathematical track record and it survived some peer review.

-"The more time passes, the more evidence we get that (4) is false from computational tests and also the more we should doubt any proof (3) as its complexity grows. Therefore, the logical probability of (1) and (2) is growing over time."

The fact that the methods we use to check proofs are not perfectly reliable (and I think you greatly overstate the importance of this consideration, computer proof checkers are very reliable) does not at all imply that the probability that a proof exists is decreasing over time. You need to distinguish between the fact of a proof existing (which is what Godel's theorem is about) and the fact of us being able to check a proof.

-"In short, coincidence theory has chances of being false 1 to 10^−3700 and any future formal proof has chances to be false around 1 to 100. Thus, I still should bet on the coincidence theory, even if the formal proof is convincing and generally accepted."

Neither of these numbers are right. The ratio 1 to 10^−3700 is not the chance that coincidence theory is false, but rather the chance that GC is false given that either GC is false or coincidence theory is true. And I don't know where you got the ratio 1 to 100 from, but it is not remotely plausible; out of the many thousands of formal proofs which have gained widespread acceptance since the advent of formality in mathematics, I am not sure if there has been even one which was wrong. (I suppose it may come down to exactly how you define your terms.)

Since this seems to be the core of your argument, I guess I have disproved your thesis?

-"(I assume here that there is only one correct proof, but actually could be many different valid proofs, but not for such complex and well-researched topics as GC.)"

Why would you assume that? There are many theorems with multiple non-equivalent valid proofs, even for "such complex and well-researched topics as GC". This is leaving aside that "non-equivalent" for proofs is an ill-defined concept, and any proof can always be transformed into a new proof via trivial changes.

-"Fermat theorem is surprising, but GC is unsurprising from probabilistic reasons alone."

Fermat's theorem is also unsurprising from probabilistic reasons alone.

-"One may argue that GC is so strong for the numbers below 100 that we should assume that there is a rule."

Actually, the probabilistic arguments already imply that GC is likely (although with less dramatic odds) even for numbers less than 100, especially since 3, 5, and 7 are all prime.

-"E.g. twin primes are an example of non-randomness."

Actually, the standard random model of the primes predicts that there are infinitely many twin primes.

Your post is too long for me to want to think through all of it, but I looked at some parts and they don't seem to make sense to me. For instance for justifying the validity of logical probability, you reference's MIRI's paper on logical inductors. However, logical inductors are based on markets of heuristics, and you don't seem to have used a market of heuristics to derive your probability estimate (and I wouldn't expect you to since the idea is intractable).

If you make an argument, but you bring up irrelevant things that you aren't actually relying on in the argument, then that lengthens the argument, which makes it harder to read. Therefore people usually try to avoid bringing up irrelevant things in their arguments.

A consequence of the fact that people usually try to avoid bringing up irrelevant things in their arguments is that one can generally assume that the things people do bring up are going to be relevant. So this makes it extra confusing when you bring up something irrelevant, because we end up assuming that you intend it to be important in your argument, and try to figure out in what way it is relevant.

I would recommend redoing this argument where you cut out all the things that are not relevant or which you do not use in your argument.

I linked the MIRI paper because it has good introduction in logical probability.

logical inductors are actually defined by the logical induction criterion. The market bit is there to prove that it is possible to fulfill the criterion.