Hypergraph removal lemma

In graph theory, the hypergraph removal lemma states that if a hypergraph contains few copies of a given sub-hypergraph, then all of the copies can be eliminated by removing a small number of hyperedges. It is a generalization of the graph removal lemma. The special case in which the sub-hypergraph is a tetrahedron is known as the tetrahedron removal lemma. The hypergraph removal lemma was first proved by Nagle, Rödl, Schacht and Skokan and, independently, by Gowers.

The hypergraph removal lemma can be used to prove results such as Szemerédi's theorem and the multi-dimensional Szemerédi theorem.

Statement
The hypergraph removal lemma states that for any $$\varepsilon, r, m > 0$$, there exists $$\delta = \delta(\varepsilon, r, m) > 0$$ such that for any $$r$$-uniform hypergraph $$H$$ with $$m$$ vertices the following is true: if $$G$$ is any $$n$$-vertex $$r$$-uniform hypergraph with at most $$\delta n^{v(H)}$$ subgraphs isomorphic to $$H$$, then it is possible to eliminate all copies of $$H$$ from $$G$$ by removing at most $$\varepsilon n^r$$ hyperedges from $$G$$.

An equivalent formulation is that, for any $$n$$-vertex $$r$$-uniform hypergraph $$G$$ with $$o(n^{v(H)})$$ copies of $$H$$, we can eliminate all copies of $$H$$ from $$G$$ by removing $$o(n^r)$$ hyperedges.
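
As an illustration of how the parameters fit together (a standard special case, written out here only as an example), taking $$r = 2$$ and $$H = K_3$$, so that $$m = v(H) = 3$$, recovers the triangle removal lemma:

$$\#\{\text{triangles in } G\} \le \delta n^{3} \implies G \text{ can be made triangle-free by removing at most } \varepsilon n^{2} \text{ edges}.$$

Similarly, taking $$r = 3$$ and $$H$$ to be the tetrahedron, that is, the complete $$3$$-uniform hypergraph on $$m = 4$$ vertices, gives the tetrahedron removal lemma: if $$G$$ has at most $$\delta n^{4}$$ tetrahedra, they can all be destroyed by removing at most $$\varepsilon n^{3}$$ hyperedges.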

Proof idea of the hypergraph removal lemma
The high-level idea of the proof is similar to that of the graph removal lemma. We prove a hypergraph version of Szemerédi's regularity lemma (partition hypergraphs into pseudorandom blocks) and a counting lemma (estimate the number of copies of a given hypergraph in an appropriate pseudorandom block). The key difficulty in the proof is to define the correct notion of hypergraph regularity. There were multiple attempts to define "partition" and "pseudorandom (regular) blocks" in a hypergraph, but none of them gave a strong counting lemma. The first correct formulation of Szemerédi's regularity lemma for general hypergraphs was given by Rödl et al.

In Szemerédi's regularity lemma, the partitions are performed on vertices (1-hyperedges) to regulate edges (2-hyperedges). However, for $$k>2$$, if we simply regulate $$k$$-hyperedges using only 1-hyperedges, we lose information about all $$j$$-hyperedges in the middle, where $$1<j<k$$, and fail to obtain a counting lemma. The correct version has to partition $$(k-1)$$-hyperedges in order to regulate $$k$$-hyperedges. To gain more control of the $$(k-1)$$-hyperedges, we can go a level deeper and partition $$(k-2)$$-hyperedges to regulate them, and so on. In the end, we reach a complex structure of regulating hyperedges.

Proof idea for 3-uniform hypergraphs
For example, we demonstrate an informal 3-hypergraph version of Szemerédi's regularity lemma, first given by Frankl and Rödl. Consider a partition of edges $$E(K_n) = G^{(2)}_1\cup\dots\cup G^{(2)}_l$$ such that for most triples $$(i,j,k),$$ there are many triangles on top of $$\left(G^{(2)}_i,G^{(2)}_j,G^{(2)}_k\right).$$ We say that $$\left(G^{(2)}_i,G^{(2)}_j,G^{(2)}_k\right)$$ is "pseudorandom" in the sense that for all subgraphs $$A^{(2)}_i\subset G^{(2)}_i$$ with not too few triangles on top of $$\left(A^{(2)}_i,A^{(2)}_j,A^{(2)}_k\right),$$ we have



$$\left|d\left(G^{(2)}_i,G^{(2)}_j,G^{(2)}_k\right) - d\left(A^{(2)}_i,A^{(2)}_j,A^{(2)}_k\right)\right|\le\varepsilon,$$
where $$d(X, Y, Z)$$ denotes the proportion of $$3$$-uniform hyperedges of $$G^{(3)}$$ among all triangles on top of $$(X, Y, Z)$$.
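
To make the density $$d(X, Y, Z)$$ concrete, here is a minimal Python sketch that computes it by brute force on a toy example. The function names `pair`, `triangles_on_top` and `relative_density` are hypothetical, and the sketch assumes one particular reading of the informal definition: a triangle lies "on top of" $$(X, Y, Z)$$ when its three sides can be matched one-to-one with the three graphs.

```python
from itertools import combinations, permutations


def pair(u, v):
    """Undirected edge {u, v}, stored as a frozenset."""
    return frozenset((u, v))


def triangles_on_top(vertices, X, Y, Z):
    """Vertex triples whose three sides can be matched one-to-one with X, Y, Z.

    Assumption: this is our reading of "triangle on top of (X, Y, Z)".
    """
    found = []
    for a, b, c in combinations(sorted(vertices), 3):
        sides = (pair(a, b), pair(b, c), pair(a, c))
        if any(p in X and q in Y and s in Z for p, q, s in permutations(sides)):
            found.append(frozenset((a, b, c)))
    return found


def relative_density(vertices, X, Y, Z, G3):
    """Fraction of triangles on top of (X, Y, Z) that are hyperedges of G3.

    Returns None when there are no such triangles.
    """
    tris = triangles_on_top(vertices, X, Y, Z)
    if not tris:
        return None
    return sum(t in G3 for t in tris) / len(tris)


# Toy data: a partition of E(K_4) into three perfect matchings.
V = range(4)
X = {pair(0, 1), pair(2, 3)}
Y = {pair(1, 2), pair(0, 3)}
Z = {pair(0, 2), pair(1, 3)}
# A 3-uniform hypergraph containing two of the four triples.
G3 = {frozenset((0, 1, 2)), frozenset((0, 1, 3))}

density = relative_density(V, X, Y, Z, G3)
print(density)  # 0.5: all 4 triangles of K_4 sit on top of (X, Y, Z), 2 are hyperedges
```

The regularity condition compares this number for the full graphs with the same number computed for subgraphs of them; the sketch only evaluates one such density and does not test regularity.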

We then define a regular partition as one in which the triples of parts that are not regular constitute at most an $$\varepsilon$$ fraction of all triples of parts.

In addition to this, we need to further regularize $$G^{(2)}_1, \dots, G^{(2)}_l$$ via a partition of the vertex set. As a result, we have the total data of hypergraph regularity as follows:


 * 1) a partition of $$E(K_n)$$ into graphs such that $$G^{(3)}$$ sits pseudorandomly on top;
 * 2) a partition of $$V(G)$$ such that the graphs in (1) are extremely pseudorandom (in a fashion resembling Szemerédi's regularity lemma).

After proving the hypergraph regularity lemma, we can prove a hypergraph counting lemma. The rest of the proof proceeds similarly to that of the graph removal lemma.

Proof of Szemerédi's theorem
Let $$r_k(N)$$ be the size of the largest subset of $$\{1, \ldots, N\}$$ that does not contain a length $$k$$ arithmetic progression. Szemerédi's theorem states that $$r_k(N) = o(N)$$ for any constant $$k$$. The high-level idea of the proof is to construct a hypergraph from a subset without any length $$k$$ arithmetic progression, then use the hypergraph removal lemma to show that this hypergraph cannot have too many hyperedges, which in turn shows that the original subset cannot be too big.
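
The quantity $$r_k(N)$$ can be computed directly for very small $$N$$. The following Python sketch does so by exhaustive search; the function names `has_k_term_ap` and `r` are hypothetical, the search assumes $$k \ge 2$$, and its running time is exponential in $$N$$, so it only illustrates the definition and says nothing about asymptotics.

```python
from itertools import combinations


def has_k_term_ap(subset, k):
    """True if `subset` contains a k-term arithmetic progression with nonzero difference."""
    s = set(subset)
    for a in s:
        for b in s:
            d = b - a
            if d > 0 and all(a + t * d in s for t in range(k)):
                return True
    return False


def r(k, N):
    """Size of the largest subset of {1, ..., N} with no k-term progression (brute force)."""
    for size in range(N, 0, -1):
        for candidate in combinations(range(1, N + 1), size):
            if not has_k_term_ap(candidate, k):
                return size
    return 0


# Progression-free sets exist at these sizes, e.g. {1, 2, 4, 5} and {1, 2, 4, 8, 9}.
print(r(3, 5), r(3, 9))  # 4 5
```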

Let $$A \subset \{1, \ldots, N\}$$ be a subset that does not contain any length $$k$$ arithmetic progression. Let $$M = k^2N + 1$$ be a large enough integer. We can think of $$A$$ as a subset of $$\mathbb{Z} / M\mathbb{Z}$$. Because $$M$$ is much larger than $$N$$, a progression inside $$A$$ cannot wrap around modulo $$M$$: if $$A$$ does not contain a length $$k$$ arithmetic progression in $$\mathbb{Z}$$, it also does not contain one in $$\mathbb{Z} / M\mathbb{Z}$$.

We will construct a $$k$$-partite $$(k-1)$$-uniform hypergraph $$G$$ from $$A$$ with parts $$V_1, V_2, \ldots, V_k$$, all of which are $$M$$-element vertex sets indexed by $$\mathbb{Z} / M\mathbb{Z}$$. For each $$1 \le i \le k$$, we add a hyperedge among vertices $$(v_j \in V_j)_{j \in [k] \setminus \{i\}}$$ if and only if $$\sum_{j \ne i} (j-i) v_j \in A.$$ Let $$H$$ be the complete $$k$$-partite $$(k-1)$$-uniform hypergraph with one vertex in each part, that is, all $$k$$ possible hyperedges on $$k$$ vertices. If $$G$$ contains an isomorphic copy of $$H$$ with vertices $$v_1, \ldots, v_k$$, then $$\alpha_i = \sum_{j \ne i} (j-i) v_j \in A$$ for any $$1 \le i \le k$$. However, note that $$\alpha_1, \ldots, \alpha_k$$ is a length $$k$$ arithmetic progression with common difference $$\alpha_{i+1}-\alpha_i = -\sum_{j} v_j$$. Since $$A$$ has no length $$k$$ arithmetic progression, it must be the case that $$\alpha_1 = \cdots = \alpha_k$$, so $$\sum_{j} v_j = 0$$.
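
The common difference can be checked directly from the definition of $$\alpha_i$$; the short computation below is a sketch of that step. Since the $$j = i$$ term has coefficient $$0$$, each sum may be taken over all $$j$$:

$$\alpha_{i+1} - \alpha_i = \sum_{j} (j-i-1)\, v_j - \sum_{j} (j-i)\, v_j = -\sum_{j} v_j .$$

For instance, when $$k = 3$$ the three quantities are $$\alpha_1 = v_2 + 2v_3$$, $$\alpha_2 = -v_1 + v_3$$ and $$\alpha_3 = -2v_1 - v_2$$, and both consecutive differences equal $$-(v_1 + v_2 + v_3)$$.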

Thus, for each hyperedge $$(v_j \in V_j)_{j \in [k] \setminus \{i\}}$$, we can find a unique copy of $$H$$ that this edge lies in by finding $$v_i = -\sum_{j \ne i} v_j$$. The number of copies of $$H$$ in $$G$$ equals $$\frac{1}{k} e(G) = O(N^{k-1}) = o(N^k)$$. Therefore, by the hypergraph removal lemma, we can remove $$o(N^{k-1})$$ edges to eliminate all copies of $$H$$ in $$G$$. Since every hyperedge of $$G$$ is in a unique copy of $$H$$, to eliminate all copies of $$H$$ in $$G$$, we need to remove at least $$e(G) / k$$ edges. Thus, $$e(G) = o(N^{k-1})$$.

The number of hyperedges in $$G$$ is $$kM^{k-2}|A| = o(N^{k-1})$$. Since $$M = k^2N + 1 = \Theta(N)$$, this implies $$|A| = o(N)$$.
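
The counts used in this argument can be checked by brute force on a tiny instance. The Python sketch below builds the hypergraph for $$k = 3$$, $$N = 4$$ and the progression-free set $$A = \{1, 2, 4\}$$ (our own choice of example) and counts hyperedges and copies of $$H$$. The function names `alpha` and `count_edges_and_copies` are hypothetical, and the enumeration takes $$M^k$$ steps, so it is only feasible for very small parameters.

```python
from itertools import product


def alpha(i, v, M):
    """alpha_i = sum over j != i of (j - i) * v_j, reduced mod M (indices are 1-based)."""
    k = len(v)
    return sum((j - i) * v[j - 1] for j in range(1, k + 1) if j != i) % M


def count_edges_and_copies(A, N, k):
    """Return (M, e(G), number of copies of H, number of copies with sum(v) = 0 mod M)."""
    M = k * k * N + 1
    A_mod = {a % M for a in A}

    # Hyperedges: for each omitted part i, choose the other k - 1 vertices.
    edges = 0
    for i in range(1, k + 1):
        for rest in product(range(M), repeat=k - 1):
            v = rest[:i - 1] + (0,) + rest[i - 1:]  # v_i is a dummy; alpha ignores it
            if alpha(i, v, M) in A_mod:
                edges += 1

    # Copies of H: one vertex per part, with all k hyperedges present.
    copies = zero_sum = 0
    for v in product(range(M), repeat=k):
        if all(alpha(i, v, M) in A_mod for i in range(1, k + 1)):
            copies += 1
            zero_sum += (sum(v) % M == 0)
    return M, edges, copies, zero_sum


M, edges, copies, zero_sum = count_edges_and_copies({1, 2, 4}, N=4, k=3)
print(M, edges, copies, zero_sum)  # 37 333 111 111
```

In this instance $$e(G) = kM^{k-2}|A| = 3 \cdot 37 \cdot 3$$, the number of copies of $$H$$ is $$e(G)/k$$, and every copy satisfies $$\sum_j v_j = 0$$, in line with the argument above.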

This method usually does not give a good quantitative bound, since the hidden constants in the hypergraph removal lemma involve the inverse Ackermann function. For a better quantitative bound, Gowers proved that $$|A| \le \frac{N}{(\log \log N)^{c_k}}$$ for some constant $$c_k$$ depending on $$k$$. It is the best bound for $$k \ge 5$$ so far.

Applications

 * The hypergraph removal lemma was used by J. Solymosi to prove the multidimensional Szemerédi theorem. The statement is that for any finite subset $$S$$ of $$\mathbb{Z}^r$$, any $$\delta>0$$ and any $$n$$ large enough, any subset of $$[n]^r$$ of size at least $$\delta n^r$$ contains a subset of the form $$a\cdot S + d$$, that is, a dilated and translated copy of $$S$$. The corners theorem is the special case $$S=\{(0,0),(0,1),(1,0)\}$$.
 * It is also used to prove the polynomial Szemerédi theorem, the finite field Szemerédi theorem and the finite abelian group Szemerédi theorem.
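
To illustrate what a dilated and translated copy $$a\cdot S + d$$ looks like in the multidimensional Szemerédi theorem, the following Python sketch searches a finite point set for one by brute force. The function name `find_copy` is hypothetical, and the sketch assumes that the dilation $$a$$ is a positive integer and that the input points already lie in $$[n]^r$$ (this is not validated).

```python
def find_copy(points, S, n):
    """Search `points` (a subset of [n]^r) for a copy a*S + d with integer a >= 1.

    Returns (a, d) for the first copy found, or None if there is none.
    """
    pts = set(points)
    S = list(S)
    s0 = S[0]
    for a in range(1, n):  # a dilation of n or more cannot fit inside [n]^r
        for p in sorted(pts):
            # Choose d so that the first point of S is mapped onto p.
            d = tuple(pc - a * sc for pc, sc in zip(p, s0))
            if all(tuple(a * sc + dc for sc, dc in zip(s, d)) in pts for s in S):
                return a, d
    return None


corner = [(0, 0), (0, 1), (1, 0)]
print(find_copy({(1, 1), (1, 3), (3, 1), (2, 2)}, corner, 3))  # (2, (1, 1))
print(find_copy({(1, 1), (2, 2), (3, 3)}, corner, 3))          # None
```

In the first call the set contains the corner $$\{(1,1),(1,3),(3,1)\}$$, which is $$2\cdot S + (1,1)$$; the diagonal in the second call contains no corner.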