Natural density

In number theory, natural density, also referred to as asymptotic density or arithmetic density, is one method to measure how "large" a subset of the set of natural numbers is. It relies chiefly on the probability of encountering members of the desired subset when combing through the interval $[1, n]$ as $n$ grows large.

Intuitively, it is thought that there are more positive integers than perfect squares, since every perfect square is already positive, and many other positive integers exist besides. However, the set of positive integers is not in fact larger than the set of perfect squares: both sets are infinite and countable and can therefore be put in one-to-one correspondence. Nevertheless, if one goes through the natural numbers, the squares become increasingly scarce. The notion of natural density makes this intuition precise for many, but not all, subsets of the naturals (see Schnirelmann density, which is similar to natural density but defined for all subsets of $$\mathbb{N}$$).

If an integer is randomly selected from the interval $[1, n]$, then the probability that it belongs to $A$ is the ratio of the number of elements of $A$ in $[1, n]$ to the total number of elements in $[1, n]$. If this probability tends to some limit as $n$ tends to infinity, then this limit is referred to as the asymptotic density of $A$. This notion can be understood as a kind of probability of choosing a number from the set $A$. Indeed, the asymptotic density (as well as some other types of densities) is studied in probabilistic number theory.

Definition
A subset $A$ of positive integers has natural density $α$ if the proportion of elements of $A$ among all natural numbers from 1 to $n$ converges to $α$ as $n$ tends to infinity.

More explicitly, if one defines for any natural number $n$ the counting function $a(n)$ as the number of elements of $A$ less than or equal to $n$, then the natural density of $A$ being $α$ exactly means that $a(n)/n → α$ as $n → ∞$.

It follows from the definition that if a set $A$ has natural density $α$ then $0 ≤ α ≤ 1$.
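
As a rough illustration of the definition, the ratio $a(n)/n$ can be computed directly for growing $n$. The sketch below is not a canonical implementation: the names `density_ratio` and `is_multiple_of_3` are made up for this example, and a finite computation can only suggest the limit, not prove it.

```python
from typing import Callable


def density_ratio(in_A: Callable[[int], bool], n: int) -> float:
    """Return a(n)/n, the proportion of the integers 1..n that lie in A."""
    a_n = sum(1 for k in range(1, n + 1) if in_A(k))
    return a_n / n


def is_multiple_of_3(k: int) -> bool:
    """Membership test for the illustrative set A = {3, 6, 9, ...}."""
    return k % 3 == 0


# The printed ratios should settle near 1/3 as n grows.
for n in (10, 1_000, 100_000):
    print(n, density_ratio(is_multiple_of_3, n))
```

Passing the set as a membership predicate keeps the function independent of how $A$ is represented.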

Upper and lower asymptotic density
Let $$A$$ be a subset of the set of natural numbers $$\mathbb{N}=\{1,2,\ldots\}.$$ For any $$n \in \mathbb{N}$$, define $$A(n)$$ to be the intersection $$A(n)=\{1,2,\ldots,n\} \cap A,$$ and let $$a(n)=|A(n)|$$ be the number of elements of $$A$$ less than or equal to $$n$$.

Define the upper asymptotic density $$\overline{d}(A)$$ of $$A$$ (also called the "upper density") by $$ \overline{d}(A) = \limsup_{n \rightarrow \infty} \frac{a(n)}{n} $$ where lim sup is the limit superior.

Similarly, define the lower asymptotic density $$\underline{d}(A)$$ of $$A$$ (also called the "lower density") by $$ \underline{d}(A) = \liminf_{n \rightarrow \infty} \frac{ a(n) }{n} $$ where lim inf is the limit inferior. One may say $$A$$ has asymptotic density $$d(A)$$ if $$\underline{d}(A)=\overline{d}(A)$$, in which case $$d(A)$$ is equal to this common value.

This definition can be restated in the following way: $$ d(A)=\lim_{n \rightarrow \infty} \frac{a(n)}{n} $$ if this limit exists.

These definitions may equivalently be expressed in the following way. Given a subset $$A$$ of $$\mathbb{N}$$, write it as an increasing sequence indexed by the natural numbers: $$A = \{a_1 < a_2 < \ldots\}.$$ Then $$\underline{d}(A) = \liminf_{n \rightarrow \infty} \frac{n}{a_n},$$ $$\overline{d}(A) = \limsup_{n \rightarrow \infty} \frac{n}{a_n}$$ and $$d(A) = \lim_{n \rightarrow \infty} \frac{n}{a_n}$$ if the limit exists.
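
The sequence form $n/a_n$ can be checked numerically in the same spirit. The following sketch assumes the set is supplied as an increasing generator; `evens`, `squares` and `ratio_at_index` are illustrative names, not part of any library.

```python
from itertools import islice
from typing import Iterable, Iterator


def evens() -> Iterator[int]:
    """Yield 2, 4, 6, ... in increasing order."""
    k = 2
    while True:
        yield k
        k += 2


def squares() -> Iterator[int]:
    """Yield 1, 4, 9, ... in increasing order."""
    k = 1
    while True:
        yield k * k
        k += 1


def ratio_at_index(elements: Iterable[int], n: int) -> float:
    """Return n / a_n, where a_n is the n-th smallest element (1-indexed)."""
    a_n = next(islice(elements, n - 1, None))
    return n / a_n


# For the evens a_n = 2n, so n/a_n is always 1/2; for the squares a_n = n^2,
# so n/a_n = 1/n shrinks towards 0.
for n in (10, 1_000, 100_000):
    print(n, ratio_at_index(evens(), n), ratio_at_index(squares(), n))
```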

A somewhat weaker notion of density is the upper Banach density $$d^*(A)$$ of a set $$A \subseteq \mathbb{N}.$$ This is defined as $$ d^*(A) = \limsup_{N-M \rightarrow \infty} \frac{| A \cap \{M, M+1, \ldots, N\}|}{N-M+1}. $$
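
To see how this can differ from natural density, consider a hypothetical example set that is not taken from the text above: the union of the blocks $\{k!, \ldots, k!+k\}$. Its natural density is 0, yet it contains arbitrarily long runs of consecutive integers, so its upper Banach density is 1. The sketch below only inspects a finite truncation, which hints at the lim sup but does not compute it; `factorial_blocks` and `max_window_ratio` are names invented for the illustration.

```python
def factorial_blocks(bound):
    """Hypothetical example set: blocks {k!, ..., k! + k}, cut off at bound."""
    A = set()
    k, fact = 1, 1
    while fact <= bound:
        A.update(range(fact, fact + k + 1))
        k += 1
        fact *= k
    return {x for x in A if x <= bound}


def max_window_ratio(A, bound, length):
    """Largest |A ∩ {M, ..., M+length-1}| / length over windows inside 1..bound.

    Assumes 1 <= length <= bound.
    """
    indicator = [1 if k in A else 0 for k in range(1, bound + 1)]
    window = sum(indicator[:length])
    best = window
    # Slide the window one step at a time, updating the count incrementally.
    for start in range(1, bound - length + 1):
        window += indicator[start + length - 1] - indicator[start - 1]
        best = max(best, window)
    return best / length


bound = 50_000
A = factorial_blocks(bound)
print(len(A) / bound)                 # a(n)/n at n = bound: very small
print(max_window_ratio(A, bound, 9))  # some window of length 9 lies inside A
```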

Properties and examples

 * For any finite set F of positive integers, d(F) = 0.


 * If d(A) exists for some set A and $$A^c$$ denotes its complement set with respect to $$\N$$, then $$d(A^c) = 1 - d(A)$$.
 * Corollary: If $$F\subset \N $$ is finite (including the case $$F=\emptyset$$), $$d(\N \setminus F)=1.$$

 * If $$d(A), d(B),$$ and $$d(A \cup B)$$ exist, then $$\max\{d(A),d(B)\} \leq d(A\cup B) \leq \min\{d(A)+d(B),1\}.$$


 * If $$A = \{n^2 : n \in \N\}$$ is the set of all squares, then d(A) = 0.


 * If $$A = \{2n : n \in \N\}$$ is the set of all even numbers, then d(A) = 0.5. Similarly, for any arithmetical progression $$A = \{an + b : n \in \N\}$$ we get $$d(A) = \tfrac{1}{a}.$$


 * For the set P of all primes we get from the prime number theorem that d(P) = 0.


 * The set of all square-free integers has density $$\tfrac{6}{\pi^2}.$$ More generally, the set of all $$n$$th-power-free numbers for any natural $$n \geq 2$$ has density $$\tfrac{1}{\zeta(n)},$$ where $$\zeta(n)$$ is the Riemann zeta function.


 * The set of abundant numbers has non-zero density. Marc Deléglise showed in 1998 that the density of the set of abundant numbers is between 0.2474 and 0.2480.


 * The set $$A=\bigcup_{n=0}^\infty \left \{2^{2n},\ldots,2^{2n+1}-1 \right \}$$ of numbers whose binary expansion contains an odd number of digits is an example of a set which does not have an asymptotic density, since the upper density of this set is $$\overline d(A)=\lim_{m \to \infty}\frac{1+2^2+\cdots +2^{2m}}{2^{2m+1}-1}=\lim_{m \to\infty} \frac{2^{2m+2}-1}{3(2^{2m+1}-1)} = \frac 23,$$ whereas its lower density is $$\underline d(A)=\lim_{m \to\infty}\frac{1+2^2+\cdots +2^{2m}}{2^{2m+2}-1}=\lim_{m \to\infty} \frac{2^{2m+2}-1}{3(2^{2m+2}-1)} = \frac 13.$$


 * The set of numbers whose decimal expansion begins with the digit 1 similarly has no natural density: the lower density is 1/9 and the upper density is 5/9. (See Benford's law.)


 * Consider an equidistributed sequence $$\{\alpha_n\}_{n\in\N}$$ in $$[0,1]$$ and define a monotone family $$\{A_x\}_{x\in[0,1]}$$ of sets: $$A_x:=\{n\in\N : \alpha_n<x \}.$$ Then, by definition, $$d(A_x)= x$$ for all $$x$$.


 * If S is a set of positive upper density then Szemerédi's theorem states that S contains arbitrarily large finite arithmetic progressions, and the Furstenberg–Sárközy theorem states that some two members of S differ by a square number.
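
Several of the examples in this list can be explored numerically. The sketch below evaluates $a(n)/n$ at hand-picked values of $n$; all function names are illustrative, and the choice $\alpha_n = n\sqrt{2} \bmod 1$ for the equidistributed sequence is an assumption made for the demonstration (it is a standard equidistributed sequence, but not one named in the list). Finite ratios only hint at the limiting behaviour.

```python
import math


def ratio(in_A, n):
    """Return a(n)/n for the set described by the predicate in_A."""
    return sum(1 for k in range(1, n + 1) if in_A(k)) / n


def odd_binary_length(k):
    """True if the binary expansion of k has an odd number of digits."""
    return k.bit_length() % 2 == 1


def leading_digit_is_1(k):
    """True if the decimal expansion of k begins with the digit 1."""
    return str(k)[0] == "1"


def is_squarefree(k):
    """True if no square d*d with d >= 2 divides k (trial division)."""
    d = 2
    while d * d <= k:
        if k % (d * d) == 0:
            return False
        d += 1
    return True


def in_A_x(x):
    """Predicate for A_x, assuming alpha_n = n * sqrt(2) mod 1."""
    return lambda n: (n * math.sqrt(2)) % 1.0 < x


# Ratios along two subsequences of n: they stay near 2/3 and 1/3 respectively.
print(ratio(odd_binary_length, 2**15 - 1), ratio(odd_binary_length, 2**16 - 1))
# Same phenomenon for the leading digit 1: near 5/9 and 1/9.
print(ratio(leading_digit_is_1, 1999), ratio(leading_digit_is_1, 9999))
# Square-free integers, compared with 6 / pi^2.
print(ratio(is_squarefree, 10_000), 6 / math.pi**2)
# The equidistributed family: the ratio for A_x should be close to x.
print(ratio(in_A_x(0.3), 100_000))
```

The values of $n$ are chosen at the ends of the blocks where the ratio is extremal, which is why the two printed ratios in each of the first two lines differ so sharply.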

Other density functions
Other density functions on subsets of the natural numbers may be defined analogously. For example, the logarithmic density of a set A is defined as the limit (if it exists)


$$\delta(A) = \lim_{x \rightarrow \infty} \frac{1}{\log x} \sum_{n \in A, n \le x} \frac{1}{n}.$$

Upper and lower logarithmic densities are defined analogously as well.
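
The partial quantity inside the limit can be evaluated for a finite $x$, as in the following sketch. The name `log_density_ratio` is illustrative, and the even numbers are used only as a simple test case; convergence is slow, with an error on the order of $1/\log x$, so the printed values approach the limit only gradually.

```python
import math


def log_density_ratio(in_A, x):
    """Return (1 / log x) * sum of 1/n over n in A with n <= x (needs x >= 2)."""
    total = sum(1.0 / n for n in range(1, x + 1) if in_A(n))
    return total / math.log(x)


def is_even(n):
    """Membership test for the even numbers."""
    return n % 2 == 0


# The values creep towards 1/2, matching the natural density of the evens.
for x in (100, 10_000, 100_000):
    print(x, log_density_ratio(is_even, x))
```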

For the set of multiples of an integer sequence, the Davenport–Erdős theorem states that the natural density, when it exists, is equal to the logarithmic density.