Salem–Spencer set

In mathematics, and in particular in arithmetic combinatorics, a Salem–Spencer set is a set of numbers no three of which form an arithmetic progression. Salem–Spencer sets are also called 3-AP-free sequences or progression-free sets. They have also been called non-averaging sets, but this term has also been used to denote a set of integers none of which can be obtained as the average of any subset of the other numbers. Salem–Spencer sets are named after Raphaël Salem and Donald C. Spencer, who showed in 1942 that Salem–Spencer sets can have nearly linear size. However, a later theorem of Klaus Roth shows that the size is always less than linear.

Examples
For $$k=1,2,\dots$$ the smallest values of $$n$$ such that the numbers from $$1$$ to $$n$$ have a $$k$$-element Salem–Spencer set are
 * 1, 2, 4, 5, 9, 11, 13, 14, 20, 24, 26, 30, 32, 36, ...

For instance, among the numbers from 1 to 14, the eight numbers
 * {1, 2, 4, 5, 10, 11, 13, 14}

form the unique largest Salem–Spencer set.
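For small ranges, the example above can be checked by exhaustive search. The following is a minimal brute-force sketch, not an efficient algorithm; the function names `is_progression_free` and `largest_progression_free_subsets` are illustrative choices and do not come from any particular library.

```python
from itertools import combinations

def is_progression_free(values):
    """Return True if no three distinct elements form an arithmetic progression."""
    s = set(values)
    # a < c are the endpoints of a would-be progression; it exists
    # exactly when their midpoint is an integer that belongs to s.
    return not any((a + c) % 2 == 0 and (a + c) // 2 in s
                   for a, c in combinations(sorted(s), 2))

def largest_progression_free_subsets(n):
    """Brute force: all maximum-size progression-free subsets of {1, ..., n}."""
    universe = range(1, n + 1)
    for size in range(n, 0, -1):
        found = [c for c in combinations(universe, size)
                 if is_progression_free(c)]
        if found:
            return found
    return [()]

best = largest_progression_free_subsets(14)
print(len(best[0]), best)
```

The search tries subset sizes from largest to smallest, so it is only practical for small $$n$$.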

This example is obtained by adding one to the first eight elements of an infinite Salem–Spencer set, the Stanley sequence
 * 0, 1, 3, 4, 9, 10, 12, 13, 27, 28, 30, 31, 36, 37, 39, 40, ...

of numbers that, when written as a ternary number, use only the digits 0 and 1. This sequence is the lexicographically first infinite Salem–Spencer set. Another infinite Salem–Spencer set is given by the cubes
 * 0, 1, 8, 27, 64, 125, 216, 343, 512, 729, 1000, ...

It is a theorem of Leonhard Euler that no three cubes are in arithmetic progression.
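The claim that the Stanley sequence is the lexicographically first infinite Salem–Spencer set can be illustrated with a greedy search: starting from 0, each candidate is accepted unless it would complete a progression with two earlier terms. This is a sketch for small prefixes only, and the helper names are hypothetical.

```python
def greedy_progression_free(count):
    """Greedily build the first `count` terms of a progression-free sequence from 0."""
    chosen = []
    members = set()
    candidate = 0
    while len(chosen) < count:
        # candidate exceeds every chosen term, so it can only be the last
        # term of a progression a, b, candidate, where a = 2*b - candidate.
        if not any((2 * b - candidate) in members for b in chosen):
            chosen.append(candidate)
            members.add(candidate)
        candidate += 1
    return chosen

def uses_only_ternary_digits_0_and_1(m):
    """Return True if the ternary expansion of m contains no digit 2."""
    while m:
        if m % 3 == 2:
            return False
        m //= 3
    return True

prefix = greedy_progression_free(16)
print(prefix)
print(all(uses_only_ternary_digits_0_and_1(m) for m in prefix))
```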

Size
In 1942, Salem and Spencer published a proof that the integers in the range from $$1$$ to $$n$$ have large Salem–Spencer sets, of size $$n/e^{O(\log n/\log\log n)}$$. The denominator of this expression uses big O notation, and grows more slowly than any power of $$n$$, so the sets found by Salem and Spencer have a size that is nearly linear. This bound disproved a conjecture of Paul Erdős and Pál Turán that the size of such a set could be at most $$n^{1-\delta}$$ for some $$\delta>0$$. The construction of Salem and Spencer was improved by Felix Behrend in 1946, who found sets of size $$n/e^{O(\sqrt{\log n})}$$.
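To see why such a denominator grows more slowly than any power of $$n$$, one can compare exponents. In the following sketch, $$C>0$$ stands in for the constant hidden by the $$O$$ notation (an assumption made only for illustration), and $$\delta>0$$ is any fixed exponent:

$$\frac{e^{C\log n/\log\log n}}{n^{\delta}}=\exp\left(\log n\left(\frac{C}{\log\log n}-\delta\right)\right)\to 0\quad\text{as }n\to\infty,$$

because $$C/\log\log n$$ eventually falls below $$\delta$$. The same comparison applies to Behrend's denominator, since $$\sqrt{\log n}$$ is also $$o(\log n)$$.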

In 1952, Klaus Roth proved Roth's theorem establishing that the size of a Salem–Spencer set must be $$O(n/\log\log n)$$. Therefore, although the sets constructed by Salem, Spencer, and Behrend have sizes that are nearly linear, it is not possible to improve them and find sets whose size is actually linear. This result became a special case of Szemerédi's theorem on the density of sets of integers that avoid longer arithmetic progressions. To distinguish Roth's bound on Salem–Spencer sets from Roth's theorem on Diophantine approximation of algebraic numbers, this result has been called Roth's theorem on arithmetic progressions. After several additional improvements to Roth's theorem, the size of a Salem–Spencer set has been proven to be $$O\bigl(n(\log\log n)^4/\log n\bigr)$$. An even better bound of $$O\bigl(n/(\log n)^{1+\delta}\bigr)$$ (for some $$\delta>0$$ that has not been explicitly computed) was announced in 2020 but has not yet been refereed and published. In 2023, a new bound of $$2^{-O((\log n)^c)}\cdot n$$, for some constant $$c>0$$, was found, and four days later the result was simplified and slightly improved to $$\exp(-c(\log n)^{1/11})\,n$$. These results have not yet been refereed and published either.

Construction
A simple construction for a Salem–Spencer set (of size considerably smaller than Behrend's bound) is to choose the ternary numbers that use only the digits 0 and 1, not 2. Such a set must be progression-free, because if two of its elements $$x$$ and $$y$$ are the first and second members of an arithmetic progression, the third member must have the digit two at the position of the least significant digit where $$x$$ and $$y$$ differ. The illustration shows a set of this form, for the three-digit ternary numbers (shifted by one to make the smallest element 1 instead of 0).
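The digit argument above can be checked mechanically for the three-digit case. The sketch below enumerates the ternary numbers with digits 0 and 1, and confirms that the third term of any progression starting with two of them has the digit 2 at the lowest position where the first two terms differ. The helper names are illustrative.

```python
from itertools import product

def ternary_01_numbers(num_digits):
    """All numbers whose num_digits-digit ternary expansion uses only 0 and 1."""
    return sorted(sum(d * 3**i for i, d in enumerate(digits))
                  for digits in product((0, 1), repeat=num_digits))

def ternary_digit(m, position):
    """Ternary digit of a non-negative integer m at the given position."""
    return (m // 3**position) % 3

def lowest_differing_position(x, y):
    """Least significant ternary position where x and y differ (x != y)."""
    position = 0
    while ternary_digit(x, position) == ternary_digit(y, position):
        position += 1
    return position

numbers = ternary_01_numbers(3)
for x in numbers:
    for y in numbers:
        third = 2 * y - x  # third term of the progression x, y, third
        if x == y or third < 0:
            continue  # a negative third term is outside the set anyway
        position = lowest_differing_position(x, y)
        assert ternary_digit(third, position) == 2

shifted = [m + 1 for m in numbers]  # smallest element becomes 1
print(shifted)
```

With three ternary digits, the shifted set coincides with the eight-element example given for the numbers from 1 to 14.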

Behrend's construction uses a similar idea, for a larger odd radix $$2d-1$$. His set consists of the numbers whose digits are restricted to the range from $$0$$ to $$d-1$$ (so that addition of these numbers has no carries), with the extra constraint that the sum of the squares of the digits is some chosen value $$k$$. If the digits of each number are thought of as coordinates of a vector, this constraint describes a sphere in the resulting vector space, and by convexity the average of two distinct values on this sphere will be interior to the sphere rather than on it. Therefore, if two elements of Behrend's set are the endpoints of an arithmetic progression, the middle value of the progression (their average) will not be in the set. Thus, the resulting set is progression-free.
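A small instance of this construction can be written out directly. The following is a sketch under stated assumptions: it enumerates all digit vectors by brute force, takes $$k$$ to be the most frequent sum of squares of digits, and uses hypothetical parameter names `d` and `num_digits`. It makes no attempt to choose the parameters that give Behrend's asymptotic bound.

```python
from collections import defaultdict
from itertools import combinations, product

def is_progression_free(values):
    """Return True if no three distinct elements form an arithmetic progression."""
    s = set(values)
    return not any((a + c) % 2 == 0 and (a + c) // 2 in s
                   for a, c in combinations(sorted(s), 2))

def behrend_set(d, num_digits):
    """Numbers in radix 2d - 1 with digits in 0..d-1 and a fixed sum of squared digits."""
    radix = 2 * d - 1
    by_norm = defaultdict(list)
    for digits in product(range(d), repeat=num_digits):
        k = sum(x * x for x in digits)  # squared length of the digit vector
        value = sum(x * radix**i for i, x in enumerate(digits))
        by_norm[k].append(value)
    # Pick the sphere (value of k) that contains the most digit vectors.
    best_k = max(by_norm, key=lambda k: len(by_norm[k]))
    return best_k, sorted(by_norm[best_k])

k, members = behrend_set(3, 4)
print(k, len(members), is_progression_free(members))
```

Because the digits never exceed $$d-1$$, adding two members digit by digit produces no carries, which is what lets the geometric argument about spheres transfer to the integers.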

With a careful choice of $$d$$, and a choice of $$k$$ as the most frequently occurring sum of squares of digits, Behrend achieves his bound. In 1953, Leo Moser proved that there is a single infinite Salem–Spencer sequence achieving the same asymptotic density on every prefix as Behrend's construction. By considering the convex hull of points inside a sphere, rather than the set of points on a sphere, it is possible to improve the construction by a factor of $$\sqrt{\log n}$$. However, this does not affect the size bound in the form stated above.

Generalization
The notion of Salem–Spencer sets (3-AP-free sets) can be generalized to $$k$$-AP-free sets, in which $$k$$ elements form an arithmetic progression if and only if they are all equal. Constructions of large $$k$$-AP-free sets are known.

Computational results
Gasarch, Glenn, and Kruskal have performed a comparison of different computational methods for finding large subsets of $$\{1,\dots,n\}$$ with no three-term arithmetic progression. Using these methods they found the exact size of the largest such set for $$n\le 187$$. Their results include several new bounds for different values of $$n$$, found by branch-and-bound algorithms that use linear programming and problem-specific heuristics to bound the size that can be achieved in any branch of the search tree. One heuristic that they found to be particularly effective was the thirds method, in which two shifted copies of a Salem–Spencer set for $$n$$ are placed in the first and last thirds of a set for $$3n$$.
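One reading of the thirds method can be sketched as follows. The sketch assumes that the two copies are the original set and the same set shifted by $$2n$$; this is an interpretation of the description above rather than the authors' actual implementation, and the function names are hypothetical. No progression can use both copies, because the midpoint of an element of the first third and an element of the last third lies in the empty middle third.

```python
from itertools import combinations

def is_progression_free(values):
    """Return True if no three distinct elements form an arithmetic progression."""
    s = set(values)
    return not any((a + c) % 2 == 0 and (a + c) // 2 in s
                   for a, c in combinations(sorted(s), 2))

def thirds_method(base_set, n):
    """Combine a progression-free subset of {1..n} with its copy shifted by 2n."""
    assert all(1 <= x <= n for x in base_set)
    # The first copy lies in {1..n}, the second in {2n+1..3n}.
    return sorted(set(base_set) | {x + 2 * n for x in base_set})

combined = thirds_method([1, 2, 4, 5], 5)
print(combined, is_progression_free(combined))
```

The result is a starting point for a search over $$\{1,\dots,3n\}$$, not necessarily a largest set for that range.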

Applications
In connection with the Ruzsa–Szemerédi problem, Salem–Spencer sets have been used to construct dense graphs in which each edge belongs to a unique triangle.
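A commonly described way of building such a graph from a Salem–Spencer set $$A\subseteq\{1,\dots,n\}$$ is sketched below. It is a tripartite construction with vertex classes labelled X, Y, and Z, in which each pair $$(x,a)$$ with $$a\in A$$ contributes the triangle $$x$$, $$x+a$$, $$x+2a$$. The labels and function names are assumptions of this sketch. Any other triangle would require $$a+b=2c$$ for three distinct elements of $$A$$, which is a three-term progression.

```python
def ruzsa_szemeredi_graph(n, ap_free):
    """Edges of a tripartite graph built from a progression-free set."""
    edges = set()
    for x in range(1, n + 1):
        for a in ap_free:
            vx, vy, vz = ("X", x), ("Y", x + a), ("Z", x + 2 * a)
            edges |= {frozenset((vx, vy)),
                      frozenset((vy, vz)),
                      frozenset((vx, vz))}
    return edges

def triangles_per_edge(edges):
    """Map each edge to the number of triangles that contain it."""
    adjacency = {}
    for u, v in map(tuple, edges):
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    counts = {}
    for edge in edges:
        u, v = tuple(edge)
        # A common neighbour of both endpoints closes a triangle.
        counts[edge] = len(adjacency[u] & adjacency[v])
    return counts

edges = ruzsa_szemeredi_graph(5, [1, 2, 4, 5])
counts = triangles_per_edge(edges)
print(len(edges), set(counts.values()))
```

Running the same check with a set that contains a progression, such as `[1, 2, 3]`, produces edges that lie in more than one triangle.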

Salem–Spencer sets have also been used in theoretical computer science. They have been used in the design of the Coppersmith–Winograd algorithm for fast matrix multiplication, and in the construction of efficient non-interactive zero-knowledge proofs. Recently, they have been used to show size lower bounds for graph spanners, and to show hardness of the subset sum problem under the strong exponential time hypothesis.

These sets can also be applied in recreational mathematics to a mathematical chess problem of placing as few queens as possible on the main diagonal of an $$n\times n$$ chessboard so that all squares of the board are attacked. The set of diagonal squares that remain unoccupied must form a Salem–Spencer set, in which all values have the same parity (all odd or all even). The smallest possible set of queens is the complement of the largest Salem–Spencer subset of the odd numbers in $$\{1,\dots,n\}$$. This Salem–Spencer subset can be found by doubling and subtracting one from the values in a Salem–Spencer subset of all the numbers in $$\{1,\dots,n/2\}$$.
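The doubling step and the resulting queen placement can be checked by brute force on a small board. This sketch assumes $$n$$ is even, numbers the diagonal squares from 1 to $$n$$, and treats occupied squares as covered; the function names are hypothetical. Blocking by other queens is not modelled, which is harmless here because queens sit only on the main diagonal.

```python
def queens_from_salem_spencer(half_set, n):
    """Diagonal positions of the queens, given a progression-free subset of {1..n/2}."""
    unoccupied = {2 * s - 1 for s in half_set}  # odd positions left empty
    return sorted(set(range(1, n + 1)) - unoccupied)

def all_squares_attacked(queens, n):
    """Check that every square (r, c) is covered by a queen at some (q, q)."""
    def covered(r, c):
        return any(r == q            # same row
                   or c == q         # same column
                   or r == c         # main diagonal, shared with every queen
                   or r + c == 2 * q # anti-diagonal through (q, q)
                   for q in queens)
    return all(covered(r, c)
               for r in range(1, n + 1)
               for c in range(1, n + 1))

queens = queens_from_salem_spencer([1, 2, 4, 5], 10)
print(queens, all_squares_attacked(queens, 10))
```

Removing the queen at position 5 leaves the squares 1, 5, 9 unoccupied, which form a progression, and the square in row 1 and column 9 is then no longer attacked.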