Divergence of the sum of the reciprocals of the primes

The sum of the reciprocals of all prime numbers diverges; that is: $$\sum_{p\text{ prime}}\frac1p = \frac12 + \frac13 + \frac15 + \frac17 + \frac1{11} + \frac1{13} + \frac1{17} + \cdots = \infty$$

This was proved by Leonhard Euler in 1737, and strengthens Euclid's 3rd-century-BC result that there are infinitely many prime numbers and Nicole Oresme's 14th-century proof of the divergence of the sum of the reciprocals of the integers (harmonic series).

There are a variety of proofs of Euler's result, including a lower bound for the partial sums stating that $$\sum_{\scriptstyle p\text{ prime}\atop \scriptstyle p\le n}\frac1p \ge \log \log (n+1) - \log\frac{\pi^2}6$$ for all natural numbers $n$. The double natural logarithm ($\log \log$) indicates that the divergence might be very slow, which is indeed the case. See Meissel–Mertens constant.
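As a quick numerical illustration of this bound, the following sketch compares the partial sums with $\log\log(n+1) - \log\frac{\pi^2}6$ for a few values of $n$. The helper names `primes_up_to`, `prime_reciprocal_sum` and `lower_bound` are made up for this example; it is a sanity check under floating-point arithmetic, not a proof.

```python
import math

def primes_up_to(n):
    """Return all primes <= n using the sieve of Eratosthenes."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def prime_reciprocal_sum(n):
    """Partial sum of 1/p over primes p <= n."""
    return sum(1.0 / p for p in primes_up_to(n))

def lower_bound(n):
    """The bound log log(n+1) - log(pi^2/6) quoted in the text."""
    return math.log(math.log(n + 1)) - math.log(math.pi ** 2 / 6)

for n in (10, 1000, 100000):
    print(n, prime_reciprocal_sum(n), lower_bound(n))
```

The printed partial sums stay above the bound while both grow very slowly, which matches the doubly logarithmic rate described above.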

The harmonic series
First, we describe how Euler originally discovered the result. He was considering the harmonic series $$ \sum_{n=1}^\infty \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots = \infty $$

He had already used the following "product formula" to show the existence of infinitely many primes.

$$ \sum_{n=1}^\infty \frac{1}{n} = \prod_{p} \left( 1+\frac{1}{p}+\frac{1}{p^2}+\cdots \right) = \prod_{p} \frac{1}{1-p^{-1}} $$

Here the product is taken over the set of all primes.

Such infinite products are today called Euler products. The product above is a reflection of the fundamental theorem of arithmetic. Euler noted that if there were only a finite number of primes, then the product on the right would clearly converge, contradicting the divergence of the harmonic series.
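A finite version of this product formula can be checked directly. Every integer $i \le n$ factors into primes $p \le n$, so the product of $\frac{1}{1-p^{-1}}$ over the primes $p \le n$ is at least the $n$th harmonic number. The sketch below verifies this with exact rational arithmetic; the function names are illustrative and not taken from Euler's work.

```python
from fractions import Fraction

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def truncated_euler_product(n):
    """Exact product of 1/(1 - 1/p) = p/(p - 1) over primes p <= n."""
    product = Fraction(1)
    for p in range(2, n + 1):
        if is_prime(p):
            product *= Fraction(p, p - 1)
    return product

def harmonic(n):
    """Exact n-th harmonic number 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

for n in (5, 10, 30):
    print(n, float(harmonic(n)), float(truncated_euler_product(n)))
```

Because the harmonic numbers are unbounded, the truncated products must be unbounded too, which is impossible with only finitely many primes.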

Euler's proof
Euler considered the above product formula and proceeded to make a sequence of audacious leaps of logic. First, he took the natural logarithm of each side, then he used the Taylor series expansion for $\log x$ as well as the sum of a converging series:

$$\begin{align} \log \left( \sum_{n=1}^\infty \frac{1}{n}\right) & {} = \log\left( \prod_p \frac{1}{1-p^{-1}}\right) = -\sum_p \log \left( 1-\frac{1}{p}\right) \\[5pt] & = \sum_p \left( \frac{1}{p} + \frac{1}{2p^2} + \frac{1}{3p^3} + \cdots \right) \\[5pt] & = \sum_{p}\frac{1}{p} + \frac{1}{2}\sum_p \frac{1}{p^2} + \frac{1}{3}\sum_p \frac{1}{p^3}  + \frac{1}{4}\sum_p \frac{1}{p^4}+ \cdots  \\[5pt] & =  A  + \frac{1}{2} B+ \frac{1}{3} C+ \frac{1}{4} D  + \cdots  \\[5pt] & = A + K \end{align}$$

for a fixed constant $K < 1$. Then he invoked the relation

$$\sum_{n=1}^\infty\frac{1}{n} = \log\infty,$$

which he explained, for instance in a later 1748 work, by setting $x = 1$ in the Taylor series expansion

$$\log\left(\frac1{1-x}\right)=\sum_{n=1}^\infty\frac{x^{n}}n.$$

This allowed him to conclude that $$A = \frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \frac{1}{11} + \cdots = \log \log \infty.$$

It is almost certain that Euler meant that the sum of the reciprocals of the primes less than $n$ is asymptotic to $\log \log n$ as $n$ approaches infinity. It turns out this is indeed the case, and a more precise version of this fact was rigorously proved by Franz Mertens in 1874. Thus Euler obtained a correct result by questionable means.
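The constant $K$ in Euler's derivation collects the terms $\frac{1}{2p^2} + \frac{1}{3p^3} + \cdots$, that is, $-\log\left(1-\frac1p\right) - \frac1p$ summed over all primes. A rough numerical estimate, truncating at a finite limit (an assumption of this sketch, as are the function names), suggests that $K$ is indeed well below 1.

```python
import math

def primes_up_to(n):
    """Return all primes <= n using the sieve of Eratosthenes."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def euler_K(limit):
    """Sum of -log(1 - 1/p) - 1/p over primes p <= limit.

    Each term equals 1/(2p^2) + 1/(3p^3) + ..., so it is positive and
    the truncated sum is a lower estimate of the full constant K.
    """
    return sum(-math.log1p(-1.0 / p) - 1.0 / p for p in primes_up_to(limit))

print(euler_K(10**5))
```

The omitted tail is small because each term is below $\frac{1}{p(p-1)}$.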

Erdős's proof by upper and lower estimates
The following proof by contradiction comes from Paul Erdős.

Let $p_{i}$ denote the $i$th prime number. Assume that the sum of the reciprocals of the primes converges.

Then there exists a smallest positive integer $k$ such that

$$\sum_{i=k+1}^\infty \frac 1 {p_i} < \frac12 \qquad(1)$$

For a positive integer $x$, let $M_{x}$ denote the set of those $n$ in $\{1, 2, \ldots, x\}$ which are not divisible by any prime greater than $p_{k}$ (or equivalently all $n \le x$ which are a product of powers of primes $p_{i} \le p_{k}$). We will now derive an upper and a lower estimate for $|M_{x}|$, the number of elements in $M_{x}$. For large $x$, these bounds will turn out to be contradictory.


 * Upper estimate:
 * Every $n$ in $M_{x}$ can be written as $n = m^{2}r$ with positive integers $m$ and $r$, where $r$ is square-free. Since only the $k$ primes $p_{1}, \ldots, p_{k}$ can show up (with exponent 1) in the prime factorization of $r$, there are at most $2^{k}$ different possibilities for $r$. Furthermore, since $m^2 \le n \le x$, there are at most $\sqrt{x}$ possible values for $m$. This gives us the upper estimate $$|M_x| \le 2^k\sqrt{x} \qquad(2)$$


 * Lower estimate:
 * The remaining $x - |M_{x}|$ numbers in the set difference $\{1, 2, \ldots, x\} \setminus M_{x}$ are all divisible by a prime greater than $p_{k}$. Let $N_{i,x}$ denote the set of those $n$ in $\{1, 2, \ldots, x\}$ which are divisible by the $i$th prime $p_{i}$. Then $$\{1,2,\ldots,x\}\setminus M_x = \bigcup_{i=k+1}^\infty N_{i,x}$$
 * Since the number of integers in $N_{i,x}$ is at most $x/p_{i}$ (actually zero for $p_{i} > x$), we get $$x-|M_x| \le \sum_{i=k+1}^\infty |N_{i,x}|< \sum_{i=k+1}^\infty \frac x {p_i}$$
 * Using (1), this implies $$\frac x 2 < |M_x| \qquad(3)$$

This produces a contradiction: when $x \ge 2^{2k + 2}$, we have $\sqrt{x} \ge 2^{k+1}$ and therefore $x/2 \ge 2^{k}\sqrt{x}$, so the estimates (2) and (3) cannot both hold.
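The upper estimate (2) does not depend on the convergence assumption, so it can be checked numerically for any choice of $k$. The sketch below counts $|M_x|$ by brute force for an arbitrarily chosen $k$ (in the proof itself, $k$ is dictated by the assumed convergence) and compares it with $2^k\sqrt{x}$ and $x/2$ at the threshold $x = 2^{2k+2}$. The names `FIRST_PRIMES`, `smooth_count` and `upper_estimate` are hypothetical helpers for this illustration.

```python
import math

# The first few primes p_1, ..., p_6; enough for small illustrative k.
FIRST_PRIMES = [2, 3, 5, 7, 11, 13]

def smooth_count(x, k):
    """|M_x|: how many n in {1, ..., x} have no prime factor beyond p_k."""
    if not 0 <= k <= len(FIRST_PRIMES):
        raise ValueError("k must be between 0 and 6 in this sketch")
    primes = FIRST_PRIMES[:k]
    count = 0
    for n in range(1, x + 1):
        m = n
        for p in primes:
            while m % p == 0:
                m //= p
        if m == 1:  # n was built only from the first k primes
            count += 1
    return count

def upper_estimate(x, k):
    """The bound 2^k * sqrt(x) from estimate (2)."""
    return 2 ** k * math.sqrt(x)

k = 3
x = 2 ** (2 * k + 2)  # the threshold at which 2^k * sqrt(x) equals x / 2
print(smooth_count(x, k), upper_estimate(x, k), x / 2)
```

For real primes the count stays below $x/2$ at this threshold, so estimate (3) fails, consistent with the fact that no $k$ satisfying (1) exists.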

Proof that the series exhibits log-log growth
Here is another proof that actually gives a lower estimate for the partial sums; in particular, it shows that these sums grow at least as fast as $\log \log n$. The proof is due to Ivan Niven, adapted from the product expansion idea of Euler. In the following, a sum or product taken over $p$ always represents a sum or product taken over a specified set of primes.

The proof rests upon the following four inequalities:

 * Every positive integer $i$ can be uniquely expressed as the product of a square-free integer and a square, as a consequence of the fundamental theorem of arithmetic. Start with $$i = q_1^{2{\alpha}_1+{\beta}_1} \cdot q_2^{2{\alpha}_2+{\beta}_2} \cdots q_r^{2{\alpha}_r+{\beta}_r},$$ where each $\beta_j$ is 0 (the corresponding power of the prime $q_j$ is even) or 1 (the corresponding power of the prime $q_j$ is odd). Factor out one copy of all the primes whose $\beta$ is 1, leaving a product of primes to even powers, itself a square. Relabeling: $$i = (p_1 p_2 \cdots p_s) \cdot b^2,$$ where the first factor, a product of primes to the first power, is square-free. Inverting all the $i$s gives the inequality $$ \sum_{i=1}^n \frac 1 i \le \left(\prod_{p \le n} \left(1 + \frac 1 p \right)\right) \cdot \left(\sum_{k=1}^n \frac 1 {k^2}\right) = A \cdot B.$$ To see this, note that $$\frac 1 i = \frac 1 {p_1 p_2 \cdots p_s} \cdot \frac 1 {b^2},$$ and $$\begin{align} \left(1 + \frac{1}{p_1}\right)\left(1 + \frac{1}{p_2}\right) \ldots \left(1 + \frac{1}{p_s}\right) &= \left(\frac{1}{p_1}\right)\left(\frac{1}{p_2}\right)\cdots\left(\frac{1}{p_s}\right) + \ldots\\ &= \frac 1 {p_1 p_2 \cdots p_s} + \ldots. \end{align}$$ That is, $1/(p_1p_2 \cdots p_s)$ is one of the summands in the expanded product $A$. And since $1 / b^2$ is one of the summands of $B$, every summand $1/i$ is represented in one of the terms of $AB$ when multiplied out. The inequality follows.
 * The upper estimate for the natural logarithm $$\begin{align} \log(n+1) &= \int_1^{n+1} \frac{dx}x \\ &= \sum_{i=1}^n\underbrace{\int_i^{i+1}\frac{dx}x}_{{} \,<\, \frac1i} \\ &< \sum_{i=1}^n \frac 1 i \end{align}$$
 * The lower estimate $1 + x < \exp(x)$ for the exponential function, which holds for all $x > 0$.
 * Let $n \ge 2$. The upper bound (using a telescoping sum) for the partial sums (convergence is all we really need) $$\begin{align} \sum_{k=1}^n \frac 1 {k^2} &< 1 + \sum_{k=2}^n \underbrace{\left(\frac1{k - \frac{1}{2}} - \frac1{k + \frac{1}{2}}\right)}_{=\, \frac{1}{k^2 - \frac14} \,>\, \frac{1}{k^2}} \\ &= 1 + \frac23 - \frac1{n + \frac{1}{2}} < \frac53 \end{align}$$

Combining all these inequalities, we see that $$\begin{align} \log(n+1) & < \sum_{i=1}^n\frac{1}{i} \\ & \le \prod_{p \le n} \left(1 + \frac{1}{p}\right) \sum_{k=1}^n \frac{1}{k^2} \\ & < \frac53\prod_{p \le n} \exp\left(\frac{1}{p}\right) \\ & = \frac53\exp\left(\sum_{p \le n} \frac{1}{p} \right) \end{align}$$

Dividing through by $\frac53$ and taking the natural logarithm of both sides gives $$\log\log(n + 1) - \log\frac53 < \sum_{p \le n} \frac{1}{p}$$

as desired. Q.E.D.
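The chain of inequalities can be spot-checked numerically. The following sketch evaluates the four quantities $\log(n+1)$, $\sum_{i \le n} \frac1i$, $A \cdot B$ and $\frac53\exp\left(\sum_{p \le n}\frac1p\right)$ in floating point; the function name `niven_chain` and the trial-division helper are assumptions of this example, not part of Niven's argument.

```python
import math

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def niven_chain(n):
    """Return (log(n+1), H_n, A*B, (5/3)*exp(S)) for the given n."""
    primes = [p for p in range(2, n + 1) if is_prime(p)]
    harmonic = sum(1.0 / i for i in range(1, n + 1))
    A = math.prod(1.0 + 1.0 / p for p in primes)      # product of (1 + 1/p)
    B = sum(1.0 / (k * k) for k in range(1, n + 1))   # partial sum of 1/k^2
    S = sum(1.0 / p for p in primes)                  # partial sum of 1/p
    return math.log(n + 1), harmonic, A * B, 5.0 / 3.0 * math.exp(S)

for n in (2, 10, 100, 1000):
    print(n, niven_chain(n))
```

Each printed tuple should be increasing from left to right, mirroring the displayed chain.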

Using

$$\sum_{k=1}^\infty \frac{1}{k^2} = \frac{\pi^2}6$$

(see the Basel problem), the above constant $\log\frac53 = 0.51082\ldots$ can be improved to $\log\frac{\pi^2}6 = 0.4977\ldots$; in fact it turns out that $$ \lim_{n \to \infty } \left( \sum_{p \leq n} \frac{1}{p} - \log \log n \right) = M$$

where $M = 0.261497...$ is the Meissel–Mertens constant (somewhat analogous to the much more famous Euler–Mascheroni constant).
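The limit can be observed numerically, although convergence is slow. The sketch below prints $\sum_{p \le n}\frac1p - \log\log n$ for increasing $n$; the function name `mertens_gap` is illustrative, and floating-point summation limits the attainable accuracy.

```python
import math

def primes_up_to(n):
    """Return all primes <= n using the sieve of Eratosthenes."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def mertens_gap(n):
    """Difference between the prime reciprocal sum up to n and log log n."""
    return sum(1.0 / p for p in primes_up_to(n)) - math.log(math.log(n))

for n in (10**3, 10**4, 10**5, 10**6):
    print(n, mertens_gap(n))
```

The printed values should settle near the quoted value of $M$ as $n$ grows.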

Proof from Dusart's inequality
From Dusart's inequality, we get $$ p_n < n \log n + n \log \log n \quad\mbox{for } n \ge 6$$

Then $$\begin{align} \sum_{n=1}^\infty \frac1{ p_n} &\ge \sum_{n=6}^\infty \frac{1}{ p_n} \\ &\ge \sum_{n=6}^\infty \frac{1}{ n \log n + n \log \log n} \\ &\ge \sum_{n=6}^\infty \frac{1}{2n \log n} = \infty \end{align}$$ by the integral test for convergence. This shows that the series on the left diverges.
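The inequality used here can be checked for small indices. The following sketch generates the first 1000 primes by trial division and lists any index $n \ge 6$ at which $p_n < n \log n + n \log\log n$ fails; the helper names `first_primes` and `dusart_bound` are hypothetical, and a finite check of this kind does not replace Dusart's proof.

```python
import math

def first_primes(count):
    """Return the first `count` primes by trial division against earlier primes."""
    primes = []
    candidate = 2
    while len(primes) < count:
        is_prime = True
        for p in primes:
            if p * p > candidate:
                break
            if candidate % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(candidate)
        candidate += 1
    return primes

def dusart_bound(n):
    """The upper bound n log n + n log log n for the n-th prime (n >= 6)."""
    return n * math.log(n) + n * math.log(math.log(n))

primes = first_primes(1000)
# primes[n - 1] is the n-th prime p_n.
violations = [n for n in range(6, 1001) if primes[n - 1] >= dusart_bound(n)]
print(violations)
```

The restriction $n \ge 6$ matters: at $n = 5$ the bound falls below $p_5 = 11$.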

Geometric and harmonic-series proof
The following proof is modified from James A. Clarkson.

Define the $k$-th tail

$$x_{k} = \sum_{n = k+1} ^{\infty} \frac{1}{p_n}.$$

Then for $i \geq 0$, the expansion of $(x_{k})^{i}$ contains at least one term for each reciprocal of a positive integer with exactly $i$ prime factors (counting multiplicities) only from the set $\{ p_{k+1}, p_{k+2}, \ldots \}$. It follows that the geometric series $\sum_{i = 0}^{\infty} (x_{k})^{i}$ contains at least one term for each reciprocal of a positive integer not divisible by any $p_{n}$ with $n\leq k$. But since $1+j(p_{1}p_{2}\cdots p_{k})$ always satisfies this criterion,

$$\sum_{i=0}^{\infty}(x_{k})^{i}>\sum_{j=1}^{\infty} \frac{1}{1+j(p_{1}p_{2} \cdots p_{k})}>\frac{1}{1+p_{1}p_{2} \cdots p_{k}} \sum_{j=1}^{\infty}\frac{1}{j}=\infty$$

by the divergence of the harmonic series. This shows that $x_{k}\geq 1$ for all $k$, and since the tails of a convergent series must themselves converge to zero, this proves divergence.

Partial sums
While the partial sums of the reciprocals of the primes eventually exceed any integer value, they never equal an integer.

One proof is by induction: The first partial sum is $\frac12$, which has the form $\frac{\text{odd}}{\text{even}}$. If the $n$th partial sum (for $n \ge 1$) has the form $\frac{\text{odd}}{\text{even}}$, then the $(n + 1)$st sum is

$$\frac\text{odd}\text{even} + \frac{1}{p_{n+1}} = \frac{\text{odd} \cdot p_{n+1} + \text{even}}{\text{even} \cdot p_{n+1}} = \frac{\text{odd} + \text{even}}\text{even} = \frac\text{odd}\text{even}$$

as the $(n + 1)$st prime $p_{n + 1}$ is odd; since this sum also has an $\frac{\text{odd}}{\text{even}}$ form, this partial sum cannot be an integer (because 2 divides the denominator but not the numerator), and the induction continues.

Another proof rewrites the expression for the sum of the first $n$ reciprocals of primes (or indeed the sum of the reciprocals of any set of primes) in terms of the least common denominator, which is the product of all these primes. Then each of these primes divides all but one of the numerator terms and hence does not divide the numerator itself; but each prime does divide the denominator. Thus the expression is irreducible and is non-integer.
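Both arguments can be illustrated with exact rational arithmetic. The sketch below builds the first few partial sums as fractions in lowest terms and reports the parity of numerator and denominator; the function names `first_primes` and `partial_sums` are made up for this example.

```python
from fractions import Fraction

def first_primes(count):
    """Return the first `count` primes by trial division against earlier primes."""
    primes = []
    candidate = 2
    while len(primes) < count:
        is_prime = True
        for p in primes:
            if p * p > candidate:
                break
            if candidate % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(candidate)
        candidate += 1
    return primes

def partial_sums(count):
    """Exact partial sums 1/p_1 + ... + 1/p_n for n = 1, ..., count."""
    total = Fraction(0)
    sums = []
    for p in first_primes(count):
        total += Fraction(1, p)
        sums.append(total)
    return sums

for s in partial_sums(5):
    # Fraction keeps values in lowest terms, so parity is meaningful here.
    print(s, "numerator odd:", s.numerator % 2 == 1,
          "denominator even:", s.denominator % 2 == 0)
```

In each case the reduced denominator is the product of the primes used so far, in line with the second proof.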