One of the weirdest things in mathematics is how completely unrelated fields tend to connect with one another.
A particularly interesting case is the play between number theory (the study of the natural numbers $\mathbb{N}$) and complex analysis (the study of functions on $\mathbb{C}$). One is discrete and uses modular arithmetic and combinatorics; one is continuous and uses integrals and epsilon-delta proofs. And yet, many papers in modern number theory use complex analysis. Even if not directly, basically all of them rely on some result with a complex-analytic flavor.
What's going on?
This post gives an example of a fundamental result in number theory that uses complex analysis, and tries to explain where that usage is coming from. My writing is semi-technical: I'm not trying to formalize everything; mostly I'm doing this as an exercise to clear up my own confusion (and hopefully to convince others that this mathematical connection is valuable).
Technical details in the proof follow Ang Li’s notes; exposition, intuition and possible mistakes are mine.
We have known since around 300 BC that there are infinitely many prime numbers. Over 2000 years later, Dirichlet proved that there are infinitely many primes even if we restrict ourselves to an arithmetic progression of the form $a, a+d, a+2d, a+3d, \ldots$, as long as $a$ and $d$ have no common factors.
I'll focus on a concrete example: there are infinitely many prime numbers ending in the digit 7, that is, primes of the form $10n + 7$.
This statement can be understood by a curious 7-year-old: she could learn about prime numbers, then write out examples of them - her own age, 17, 37, ... and then be told that, although such numbers ending in 7 become harder and harder to find, they will always be there, beyond whatever limits we imagine.
If the statement is so easy to phrase, why does it require complex analysis?
Do we really need to use a number defined as the square root of -1, as well as computing infinite sums and integrals, just to handle primes ending in 7?
You might say that this is necessary because we're working with an "infinite" object, and that necessarily requires tools from analysis. But that doesn't dissipate the mystery. Here's a simpler statement:
There is a natural number $n > 10^{10^{100}}$ such that $n$ is prime.
This statement has a finite "elementary" proof. In fact, I can guide you on how to construct one: start with $n = 10^{10^{100}} + 1$ and keep adding 1 until you get that $n$ is prime (as verified by your favorite primality test). This search will stop at some point when we get to a definite prime. Done.
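The search template can be sketched in a few lines of Python (the function names and the tiny starting point are mine; the statement above uses a starting point far too large to ever run):

```python
def is_prime(n: int) -> bool:
    """Trial division: slow, but fine for small illustrative inputs."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def next_prime_after(start: int) -> int:
    """Keep adding 1 until we hit a prime; the infinitude of primes
    guarantees this loop terminates."""
    n = start + 1
    while not is_prime(n):
        n += 1
    return n

print(next_prime_after(10**6))
```

The whole point of what follows is that we know this loop terminates even for starting points so large we could never actually run it.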
This template of a proof clearly isn't enough; how do we know for sure that the search will stop? We can't actually run it, because just writing out $10^{10^{100}}$ in decimal requires more space than there are atoms in the known universe.
And yet, we know for sure that this algorithm will stop at a definite prime. And we got this knowledge from abstract properties of addition and multiplication. But why?
First we'll prove the easier result: that there are infinitely many primes, without restricting their last digit.
Euler came up with an interesting proof that shows something stronger than that: the sum $\sum_p \frac{1}{p}$, made by summing over all primes $p$, diverges to infinity.
This implies the infinitude of primes: in order for a sum of finite terms to have an infinite value, it must have infinitely many terms. It's also a nice trick for turning a "combinatorial" problem (about the number of prime numbers) into an "analytic" problem (about the divergence of a certain sum).
How would we go about proving something like that?
Well, we have one classic example of a divergent infinite sum - the harmonic series:

$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots = \sum_{n=1}^{\infty} \frac{1}{n} = \infty$$
We want some way to express this sum in terms of prime numbers.
Primes are the building blocks of the natural numbers: each number can be written as a unique product of prime numbers. So it makes sense to write this sum in terms of products involving prime numbers.
Let's first sum over natural numbers whose only prime factors are 2 and 3. We'll denote this set by $\langle 2, 3 \rangle$. We have:

$$\sum_{n \in \langle 2, 3 \rangle} \frac{1}{n} = \sum_{i, j \geq 0} \frac{1}{2^i 3^j}$$
We can use the distributive law to write this as the product of two distinct sums:

$$\sum_{i, j \geq 0} \frac{1}{2^i 3^j} = \left( \sum_{i \geq 0} \frac{1}{2^i} \right) \cdot \left( \sum_{j \geq 0} \frac{1}{3^j} \right)$$
Each sum on the right-hand side is an infinite geometric series. We know that $\sum_{k \geq 0} x^k = \frac{1}{1-x}$ whenever $|x| < 1$, in which case the sum converges. So we have a simple closed formula:

$$\sum_{n \in \langle 2, 3 \rangle} \frac{1}{n} = \frac{1}{1 - \frac{1}{2}} \cdot \frac{1}{1 - \frac{1}{3}}$$
This is when summing over numbers whose only prime factors are 2 and 3. If we add 5 as a prime factor, by summing over $\langle 2, 3, 5 \rangle$, we get:

$$\sum_{n \in \langle 2, 3, 5 \rangle} \frac{1}{n} = \frac{1}{1 - \frac{1}{2}} \cdot \frac{1}{1 - \frac{1}{3}} \cdot \frac{1}{1 - \frac{1}{5}}$$
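These closed formulas are easy to check numerically. Here is a quick Python sketch (the function names are mine), comparing the truncated sum over the smooth numbers with the product of geometric-series values:

```python
def smooth_harmonic_sum(primes, limit):
    """Sum 1/n over all n <= limit whose prime factors all lie in `primes`."""
    total = 0.0
    for n in range(1, limit + 1):
        m = n
        for p in primes:
            while m % p == 0:
                m //= p
        if m == 1:  # n had no prime factors outside `primes`
            total += 1.0 / n
    return total

def euler_factor_product(primes):
    """The closed form: the product of 1/(1 - 1/p) over the given primes."""
    result = 1.0
    for p in primes:
        result *= 1.0 / (1.0 - 1.0 / p)
    return result

print(smooth_harmonic_sum([2, 3], 10**5))   # close to 3.0
print(euler_factor_product([2, 3]))         # (1/(1-1/2)) * (1/(1-1/3)) = 2 * 3/2 = 3.0
print(euler_factor_product([2, 3, 5]))      # adding the factor for 5 gives 3.75
```

Note that these restricted sums converge; it's only in the limit over all primes that the left-hand side becomes the divergent harmonic series.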
And we carry on, each time adding a new prime factor to our set. This makes the left-hand side sum over more natural numbers, while the right-hand side gets more factors.
Since all natural numbers are products of some prime numbers, in the limit we get the Euler product identity:

$$\sum_{n=1}^{\infty} \frac{1}{n} = \prod_{p} \frac{1}{1 - \frac{1}{p}}$$

where $p$ ranges over all prime numbers. This is the key identity for our purposes.[1]
On its own, the Euler product identity is already enough to show the infinitude of primes; the LHS is infinite, while the RHS is a product of terms, one term for each prime number. This only makes sense if the product has infinitely many terms.
But still, we want to show that $\sum_p \frac{1}{p}$ diverges, which is a stronger requirement. We wish to use the identity above, but it involves products over prime numbers, not sums over prime numbers.
Which mathematical machinery can we use to turn products into sums?
Oh, right; the natural logarithm satisfies $\log(xy) = \log(x) + \log(y)$. This holds for infinite products/sums as well:

$$\log \prod_p \frac{1}{1 - \frac{1}{p}} = \sum_p \log \frac{1}{1 - \frac{1}{p}}$$
The function $\log \frac{1}{1-x}$ has a nice Taylor expansion: $\log \frac{1}{1-x} = x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots$. We only care about asymptotic behavior, so for small $x$, we can approximate $\log \frac{1}{1-x} \approx x$.[2]
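A quick numerical check of this approximation; the error visibly shrinks like $x^2/2$:

```python
import math

# log(1/(1-x)) = x + x^2/2 + x^3/3 + ..., so dropping everything after
# the first term leaves an error of roughly x^2/2 for small x.
for x in [0.5, 0.1, 0.01, 0.001]:
    exact = math.log(1.0 / (1.0 - x))
    print(f"x = {x}:  log(1/(1-x)) = {exact:.7f},  error = {exact - x:.2e}")
```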
Substituting that into the logarithm of the Euler product, we get:

$$\infty = \log \sum_{n=1}^{\infty} \frac{1}{n} = \sum_p \log \frac{1}{1 - \frac{1}{p}} \approx \sum_p \frac{1}{p}$$
Since the left term diverges, the right term also diverges, so $\sum_p \frac{1}{p} = \infty$. Done.
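The divergence is real but glacial: by Mertens' theorem the partial sums grow like $\log \log N$. Here is a sketch (the cutoffs are mine) that watches this growth:

```python
import math

def primes_up_to(limit):
    """Simple sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(limit**0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(range(i * i, limit + 1, i))
    return [p for p, is_p in enumerate(sieve) if is_p]

for limit in (10**3, 10**4, 10**5, 10**6):
    partial = sum(1.0 / p for p in primes_up_to(limit))
    print(f"N = {limit:>7}:  sum 1/p = {partial:.4f},  log log N = {math.log(math.log(limit)):.4f}")
```

The two columns differ by a near-constant offset (the Meissel-Mertens constant, about 0.2615): the sum really does go to infinity, just extremely slowly.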
This was a cool trick; we proved a purely number-theoretic result using tools from analysis, like taking the Taylor series of a logarithm.
Can we do the same thing for proving the infinitude of primes of the form $10n + 7$?
Let $\delta(n)$ denote the function that returns 1 whenever $n \equiv 7 \pmod{10}$, and 0 otherwise. We want to show that infinitely many primes satisfy $\delta(p) = 1$. Dirichlet actually shows the stronger claim, that $\sum_p \frac{\delta(p)}{p} = \infty$ (remember that $p$ ranges over prime numbers while $n$ ranges over all natural numbers). How do we do that?
We can try a similar technique to Euler's proof. The analogue of the Euler product identity would be:

$$\sum_{n=1}^{\infty} \frac{\delta(n)}{n} = \prod_p \left( 1 + \frac{\delta(p)}{p} + \frac{\delta(p)^2}{p^2} + \cdots \right) = \prod_p \frac{1}{1 - \frac{\delta(p)}{p}}$$
Unfortunately, this doesn't hold. The second equality is still valid, but the first one fails, because $\delta$ is not a multiplicative function; we don't have $\delta(mn) = \delta(m)\delta(n)$ in general. (Concretely, $\delta(3 \cdot 9) = \delta(27) = 1$, but $\delta(3) \cdot \delta(9) = 0$.)
But this identity does hold whenever we replace $\delta$ with a multiplicative function $f$, one that satisfies $f(mn) = f(m) f(n)$ for all natural numbers $m, n$. That is, we have for such $f$:

$$\sum_{n=1}^{\infty} \frac{f(n)}{n} = \prod_p \frac{1}{1 - \frac{f(p)}{p}}$$
Euler's original proof used this identity for the constant function $f(n) = 1$.
How do we use this property for analyzing $\delta$?
The main idea is to write $\delta$ as a linear combination of multiplicative functions. If we analyze each one separately, combining them gives us insight into the resulting function.
Well, $\delta$ is a periodic function on the integers with period 10. It's a signal, or a heartbeat: it beats with the value 1 every 10 integers, and is flat elsewhere.
We already have a technique to write a general periodic signal as a sum of well-behaved components - a Fourier sum:

$$f(x) = \sum_{k \in \mathbb{Z}} c_k e^{2\pi i k x}$$
Fourier series and Fourier transforms are examples of a much more general concept. I won't get into the details here, but if you have a group $G$ satisfying some properties, then you can write every (suitably regular) function $f : G \to \mathbb{C}$ as a combination of multiplicative functions $\chi : G \to \mathbb{C}$. Fourier transforms act on $\mathbb{R}$, Fourier sums act on the circle group $\mathbb{R} / \mathbb{Z}$, while our example acts on $(\mathbb{Z} / 10\mathbb{Z})^{\times}$, the group of invertible elements modulo 10. (Don't worry if you are not familiar with groups.)
The main point is that, given a function with period 10, we can always[3] write it as a linear combination of multiplicative functions of the same period. These multiplicative components are called characters.
Allowing the components to take complex values is necessary; just using real-valued functions isn't enough. This is similar to how Fourier sums require both a cosine and a sine component; considering only real-valued components would only give us the cosine part. This is the reason why complex numbers are fundamental to our proof - we need roots of unity in order to have a rich choice of characters (the real numbers only have $\pm 1$ as possible character values).
Finding all characters modulo 10 is rather easy. Here they are:

| | 1 | 3 | 7 | 9 |
|---|---|---|---|---|
| $\chi_0$ | 1 | 1 | 1 | 1 |
| $\chi_1$ | 1 | $+i$ | $-i$ | $-1$ |
| $\chi_2$ | 1 | $-i$ | $+i$ | $-1$ |
| $\chi_3$ | 1 | $-1$ | $-1$ | 1 |
There are 4 such characters in total: $\chi_0, \chi_1, \chi_2, \chi_3$. Since they are 10-periodic, it's enough to write out their values on the set $\{0, 1, \ldots, 9\}$. Also, $\chi(n) = 0$ whenever $n$ and 10 have a common divisor, so we can throw out all multiples of 2 and 5; we're left with just the 4 values $\{1, 3, 7, 9\}$ on which $\chi$ has a non-zero value. These values are described in the table above.
One can check that all characters defined above are multiplicative: $\chi(mn) = \chi(m) \chi(n)$. (This is when working modulo 10, so for example, $\chi_1(3) \cdot \chi_1(9) = i \cdot (-1) = -i = \chi_1(7) = \chi_1(27)$.)
Finally, we can see that the following identity holds (one can check it explicitly for $n \in \{1, 3, 7, 9\}$):

$$\delta(n) = \frac{1}{4} \left( \chi_0(n) + i \chi_1(n) - i \chi_2(n) - \chi_3(n) \right)$$
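Both the multiplicativity of the characters and this decomposition of $\delta$ are finite checks, so we can verify them in a few lines of Python (the dictionary encoding of the table is mine):

```python
# Values of the four characters mod 10 on the units {1, 3, 7, 9};
# every other residue gets the value 0.
CHARS = {
    "chi0": {1: 1, 3: 1, 7: 1, 9: 1},
    "chi1": {1: 1, 3: 1j, 7: -1j, 9: -1},
    "chi2": {1: 1, 3: -1j, 7: 1j, 9: -1},
    "chi3": {1: 1, 3: -1, 7: -1, 9: 1},
}

def chi(name, n):
    return CHARS[name].get(n % 10, 0)

def delta(n):
    """1 if n ends in the digit 7, else 0."""
    return 1 if n % 10 == 7 else 0

# Multiplicativity: chi(mn) = chi(m) * chi(n).
for name in CHARS:
    assert all(chi(name, m * n) == chi(name, m) * chi(name, n)
               for m in range(30) for n in range(30))

# The decomposition delta = (chi0 + i*chi1 - i*chi2 - chi3) / 4.
for n in range(100):
    combo = (chi("chi0", n) + 1j * chi("chi1", n)
             - 1j * chi("chi2", n) - chi("chi3", n)) / 4
    assert abs(combo - delta(n)) < 1e-12

print("checks passed")
```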
Going back, we wanted to show that $\sum_p \frac{\delta(p)}{p}$ diverges. We'll write it as a sum of these 4 components:

$$\sum_p \frac{\delta(p)}{p} = \frac{1}{4} \left( \sum_p \frac{\chi_0(p)}{p} + i \sum_p \frac{\chi_1(p)}{p} - i \sum_p \frac{\chi_2(p)}{p} - \sum_p \frac{\chi_3(p)}{p} \right)$$
We know that the first term diverges: $\sum_p \frac{\chi_0(p)}{p} = \infty$. This is because all primes besides 2 and 5 satisfy $\chi_0(p) = 1$, so this sum is almost identical to $\sum_p \frac{1}{p}$, which was already proven to diverge.
What about the three other components?
They involve sums of the form $\sum_p \frac{\chi(p)}{p}$, where $\chi$ is a non-trivial character: it's not always equal to 1, and it may take complex values as well.
Looking again at the table above, we see that besides $\chi_0$'s row, all other rows sum to 0. That is, a non-trivial $\chi$ sums to 0 when summing over all integers modulo 10:

$$\sum_{n=0}^{9} \chi(n) = 0$$
Fourier series have a similar phenomenon; there is the "constant" component $e^{2\pi i \cdot 0 \cdot x} = 1$, but all other components $e^{2\pi i k x}$ for $k \neq 0$ create a periodic "wave" that cancels out:

$$\int_0^1 e^{2\pi i k x} \, dx = 0$$
This is not a coincidence because nothing is ever a coincidence.
Since $\chi$ has a cancelling-out behavior, we might expect that unlike $\sum_p \frac{1}{p}$, the sum $\sum_p \frac{\chi(p)}{p}$ remains bounded; this is like how $\sum_{n=1}^{\infty} \frac{(-1)^n}{n}$ converges even though $\sum_{n=1}^{\infty} \frac{1}{n} = \infty$. This does turn out to be the case.
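We can see this boundedness empirically. The sketch below (the cutoff and sieve are mine) compares the running sums of $1/p$ and of $\chi_3(p)/p$, with $\chi_3$ the real character from the table:

```python
def primes_up_to(limit):
    """Simple sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(limit**0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(range(i * i, limit + 1, i))
    return [p for p, is_p in enumerate(sieve) if is_p]

# chi3 mod 10: +1 on residues 1 and 9, -1 on 3 and 7, 0 otherwise.
CHI3 = {1: 1, 3: -1, 7: -1, 9: 1}

primes = primes_up_to(10**6)
plain = sum(1.0 / p for p in primes)
signed = sum(CHI3.get(p % 10, 0) / p for p in primes)
print(f"sum of 1/p       up to 10^6: {plain:.4f}  (keeps growing)")
print(f"sum of chi3(p)/p up to 10^6: {signed:.4f}  (stays bounded)")
```

The unsigned sum creeps upward forever, while the signed sum hovers around a fixed value thanks to the cancellation.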
After proving that, we look back at the decomposition above: the right-hand side is a sum of 4 components. The first one is infinite, while the other three are bounded. Intuitively (and this can be formalized), divergent sum + bounded sum = divergent sum. Thus, the entire expression is divergent. Since the left-hand side is a sum over all primes $p \equiv 7 \pmod{10}$ (these are the only primes with $\delta(p) = 1$), the number of such primes must be infinite. Done.
Of course, we're not really done. We have to prove that $\sum_p \frac{\chi(p)}{p}$ is bounded whenever $\chi$ is a non-trivial character modulo 10.
A first step would be to revisit the Euler product formula, which relates sums over all natural numbers to sums over all prime numbers. The same reasoning as above, while using the fact that $\chi$ is multiplicative, gives us:[4]

$$\log \sum_{n=1}^{\infty} \frac{\chi(n)}{n} = \sum_p \log \frac{1}{1 - \frac{\chi(p)}{p}} \approx \sum_p \frac{\chi(p)}{p}$$
This means that to show that $\sum_p \frac{\chi(p)}{p}$ is bounded, it's enough to show that the infinite series $\sum_{n=1}^{\infty} \frac{\chi(n)}{n}$ converges to a non-zero number. (If $\sum_{n=1}^{\infty} \frac{\chi(n)}{n} = 0$, then $\log \sum_{n=1}^{\infty} \frac{\chi(n)}{n}$ diverges, breaking our argument.)
Showing that $\sum_{n=1}^{\infty} \frac{\chi(n)}{n}$ converges to a finite number turns out to be pretty easy; we use summation by parts, a discrete analogue of integration by parts, the same technique which shows that $\sum_{n=1}^{\infty} \frac{(-1)^n}{n}$ converges.
So, the only thing keeping us from showing that there are infinitely many prime numbers ending with 7 is the following claim:
The sum $\sum_{n=1}^{\infty} \frac{\chi(n)}{n}$ satisfies $\sum_{n=1}^{\infty} \frac{\chi(n)}{n} \neq 0$ whenever $\chi$ is a non-trivial character modulo 10.
There are three sums to check: $\sum_{n=1}^{\infty} \frac{\chi_i(n)}{n}$ for $i = 1, 2, 3$ (for $\chi_i$ as defined in the table above). Each one of them can be calculated explicitly to a high-enough degree of accuracy, showing that the final sum escapes 0. So, by doing an additional finite calculation, we are truly done.
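Here is what such a finite calculation might look like numerically (the cutoff is mine; since $10^6$ is a whole number of periods, the truncation error of these conditionally convergent sums is tiny). All three values land comfortably far from 0:

```python
# The three non-trivial characters mod 10, by their values on {1, 3, 7, 9}
# (0 elsewhere):
NONTRIVIAL = {
    "chi1": {1: 1, 3: 1j, 7: -1j, 9: -1},
    "chi2": {1: 1, 3: -1j, 7: 1j, 9: -1},
    "chi3": {1: 1, 3: -1, 7: -1, 9: 1},
}

L_values = {}
for name, table in NONTRIVIAL.items():
    # Summing up to a multiple of the period keeps the partial sum very
    # close to the true limit of the series.
    L_values[name] = sum(table.get(n % 10, 0) / n for n in range(1, 10**6 + 1))

for name, value in L_values.items():
    print(f"L({name}) ~ {value},  |L| ~ {abs(value):.4f}")
```

Note that $\chi_1$ and $\chi_2$ are complex conjugates of each other, so their sums are too; only the magnitudes being non-zero matters for the argument.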
The next step is to prove the general form of Dirichlet's theorem: if $a$ and $d$ are coprime, then the progression $a, a+d, a+2d, \ldots$ contains infinitely many prime numbers. Our proof only worked for $a = 7, d = 10$.
The same techniques work for other $d$, but this time, we can't calculate $\sum_{n=1}^{\infty} \frac{\chi(n)}{n}$ explicitly; there are infinitely many characters to check.
In the general case, proving $\sum_{n=1}^{\infty} \frac{\chi(n)}{n} \neq 0$ is very hard - in fact, this is the main difficulty of the proof!
This is a typical phenomenon in analytic number theory: given some problem involving prime numbers, there is a standard way to encode it in the language of complex numbers (typically using the Riemann zeta function $\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}$ or other character sums like $L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}$), and the hard part is to show that some analytic bound holds.
After studying enough number theory, proofs like the one above start feeling more intuitive to me. I already know how the arguments unfold, what the role of each component is, how the components play with each other, and what the main difficulties are.
At some point, I was even able to give my own new observations and prove them using similar techniques.
Unfortunately, after all that, I notice that I am still confused.
There are fundamental rules of the universe that I don't yet understand. And for some reason, one of them seems to spell out: "thou shall use complex-valued analysis to study the behavior of prime numbers".
This is not always the case; some well-known statements in number theory which were proven with complex-analytic methods were later proved by other means. In fact, Dirichlet's theorem is one of them! But I am still not satisfied.
This is far from being the only mystery in mathematics. When going deep enough into some field, we see more and more surprising connections. Homotopy type theory is one of them; Monstrous moonshine is another. I chose to introduce a specific connection I'm more familiar with. If anyone has insights into these other connections, please share them here.
Maybe studying more math will resolve my confusion. But until then, I remain awestruck by the unreasonable deepness of number theory.
[1] Since the sum $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges, doing algebraic manipulations on this series is tricky; this can easily produce nonsense results. The formal proof involves the series $\sum_{n=1}^{\infty} \frac{1}{n^s}$ instead, for $s > 1$, and studies the asymptotic behavior as $s \to 1^{+}$. I ignore this (important) detail here.
[2] Formally, we say that $\log \frac{1}{1-x} = x + O(x^2)$. Instead of saying things are approximately equal, we show an explicit bound on the error term, and note that the total error is finite.
[3] Assuming $f(n) = 0$ whenever $n$ and 10 have a common divisor.
[4] Defining $\log$ over complex numbers is non-trivial, as $\log$ is a multi-branch function. Li's notes (link above) show how to settle this over the required domain.