Blow-up lemma

The blow-up lemma, proved by János Komlós, Gábor N. Sárközy, and Endre Szemerédi in 1997, is an important result in extremal graph theory, particularly within the context of the regularity method. It states that the regular pairs in the statement of Szemerédi's regularity lemma behave like complete bipartite graphs in the context of embedding spanning graphs of bounded degree.

Definitions and Statement
To formally state the blow-up lemma, we first need to define the notion of a super-regular pair.

Super-regular pairs
A pair $$(A,B)$$ of subsets of the vertex set is called $$(\varepsilon, \delta)$$-super-regular if for every $$ X \subset A $$ and $$ Y \subset B $$ satisfying


 * $$ |X| > \varepsilon |A| $$ and $$ |Y| > \varepsilon |B| $$

we have


 * $$ e(X,Y) > \delta |X| |Y| $$

and furthermore,


 * $$ \deg(a) > \delta |B| $$ for all $$ a \in A $$ and $$ \deg(b) > \delta |A| $$ for all $$ b \in B $$.

Here $$ e(X, Y) $$ denotes the number of pairs $$ (x,y) $$ with $$ x \in X $$ and $$ y \in Y $$ such that $$\{x,y\}$$ is an edge.

Statement of the Blow-up Lemma
Given a graph $$R$$ of order $$r$$ and positive parameters $$\delta, \Delta$$, there exists a positive $$\varepsilon = \varepsilon(\delta, \Delta, r)$$ such that the following holds. Let $$n_1, n_2,\dots,n_r$$ be arbitrary positive integers and let us replace the vertices $$v_1, v_2, \dots,v_r$$ of $$R$$ with pairwise disjoint sets $$V_1, V_2, \dots, V_r$$ of sizes $$n_1, n_2, \dots, n_r$$ (blowing up). We construct two graphs on the same vertex set $$V = \bigcup V_i$$. The first graph $$\mathbf R$$ is obtained by replacing each edge $$\{v_i, v_j\}$$ of $$R$$ with the complete bipartite graph between the corresponding vertex sets $$V_i$$ and $$V_j$$. A sparser graph G is constructed by replacing each edge $$\{v_i, v_j\}$$ with an $$(\varepsilon, \delta)$$-super-regular pair between $$V_i$$ and $$V_j$$. If a graph $$H$$ with $$\Delta(H) \le \Delta$$ is embeddable into $$\mathbf R$$ then it is already embeddable into G.

Proof Sketch
The proof of the blow-up lemma is based on using a randomized greedy algorithm (RGA) to embed the vertices of $$H$$ into $$G$$ sequentially. The argument then proceeds by bounding the failure rate of the algorithm such that it is less than 1 (and in fact $$o(1)$$) for an appropriate choice of parameters. This means that there is a non-zero chance for the algorithm to succeed, so an embedding must exist.

Attempting to directly embed all the vertices of $$H$$ in this manner does not work because the algorithm may get stuck when only a small number of vertices are left. Instead, we set aside a small fraction of the vertex set, called buffer vertices, and attempt to embed the rest of the vertices. The buffer vertices are subsequently embedded by using Hall's marriage theorem to find a perfect matching between the buffer vertices and the remaining vertices of $$G$$.

Notation
We borrow all notation introduced in previous sections. Let $$ n = |V(G)| = \sum n_i $$. Since $$H$$ can be embedded into $$\mathbf R$$, we can write $$V(H) = X = \bigcup_{i \le r} X_i$$ with $$ |X_i| = |V_i| $$ for all $$i$$. For a vertex $$ x \in X_i $$, let $$ V_x $$ denote $$ V_i $$. For $$ x \in X_i, y \in X_j $$,


 * $$ d_{xy} = \frac{e(V_i,V_j)}{|V_i| |V_j|}$$

denotes the density of edges between the corresponding vertex sets of $$ G $$. $$\phi:V(G) \to V(H)$$ is the embedding that we wish to construct. $$ T $$ is the final time after which the algorithm concludes.

Phase 0: Initialization

 * 1) Greedily choose the set of buffer vertices $$B$$ from the vertices of $$H$$ as a maximal set of vertices distance at least $$4$$ from each other
 * 2) Order the remaining vertices (those in $$H \setminus B$$) in a list $$L$$, placing the neighbors of $$B$$ first.
 * 3) Declare a queue $$q$$ of presently prioritized vertices, which is initially empty.
 * 4) Declare an array of sets $$F_z$$ indexed by the vertices of $$H$$, representing the set of all "free spots" of $$z$$, that is, the set of unoccupied vertices in $$G$$ the vertex $$z$$ could be mapped to without violating any of the adjacency conditions from the already-embedded neighbors of $$z$$ in $$H$$. $$F_z$$ is initialized to $$V_z$$.

Phase 1: Randomized Greedy Embedding

 * 1) Choose a vertex $$x$$ from the set of remaining vertices as follows:
 * 2) If the queue $$q$$ of prioritized vertices is non-empty, then choose the vertex from $$q$$
 * 3) Otherwise, choose a vertex from the list $$L$$ of remaining vertices
 * 4) Choose the image $$\phi(x)$$ in $$G$$ for the vertex $$x$$ randomly from the set of "good" choices, where a choice is good iff none of the new free-sets $$F_z(t)$$ differ too much in size from the expected value.
 * 5) Update the free sets $$F_z(t)$$, and put vertices whose free sets have become too small with respect to their size in the last update in the set of prioritized vertices $$q$$
 * 6) Abort if the queue $$q$$ contains a sufficiently large fraction of any of the sets $$X_i$$
 * 7) If there are non-buffer vertices left to be embedded in either $$L$$ or $$q$$, update time $$t$$ and go back to step 1; otherwise move on to phase 2.

Phase 2: Kőnig-Hall matching for remaining vertices
Consider the set of vertices left to be embedded, which is precisely $$ B $$, and the set of free spots $$ \bigcup_{b \in B} F_b(T) $$. Form a bipartite graph between these two sets, joining each $$ b \in B $$ to $$ F_b(T) $$, and find a perfect matching in this bipartite graph. Embed according to this matching.

Proof of correctness
The proof of correctness is technical and quite involved, so we omit the details. The core argument proceeds as follows:

Step 1: most vertices are good, and enough vertices are free
Prove simultaneously by induction on $$ t $$ that if $$ x $$ is the vertex embedded at time $$ t $$, then


 * 1) only a small fraction of the choices in $$ F_x(t) $$ are bad
 * 2) all of the free sets $$ F_z(t+1) $$ are fairly large for unembedded vertices $$ z $$

Step 2: the "main lemma"
Consider $$ 1 \le i \le r, Y \subseteq X_i $$, and $$ A \subseteq V_i $$ such that $$|A|$$ is not too small. Consider the event $$ E_{A,Y} $$ where


 * 1) no vertices are embedded in $$ A $$ during the first phase
 * 2) for every $$ y \in Y $$ there is a time $$ t_y $$ such that the fraction of free vertices of $$ y $$ in $$ A $$ at time $$ t $$ was small.

Then, we prove that the probability of $$ E_{A,Y} $$ happening is low.

Step 3: phase 1 succeeds with high probability
The only way that the first phase could fail is if it aborts, since by the first step we know that there is always a sufficient choice of good vertices. The program aborts only when the queue is too long. The argument then proceeds by union-bounding over all modes of failure, noting that for any particular choice of $$ 1 \le i \le r $$, $$ Y \subseteq X_i, |Y| \ge \delta_Q |X_i| $$ and $$ A = V_i $$ with $$ Y $$ representing a subset of the queue that failed, the triple $$(i,Y,A)$$ satisfy the conditions of the "main lemma", and thus have a low probability of occurring.

Step 4: no queue in initial phase
Recall that the list was set up so that neighbors of vertices in the buffer get embedded first. The time until all of these vertices get embedded is called the initial phase. Prove by induction on $$t$$ that no vertices get added to the queue during the initial phase. It follows that all of the neighbors of the buffer vertices get added before the rest of the vertices.

Step 5: buffer vertices have enough free spots
For any $$ x \in B $$ and $$ v \in V_x $$, we can find a sufficiently large lower bound on the probability that $$\phi(N_H(x)) \subseteq N_G(v)$$, conditional on the assumption that $$ v $$ was free before any of the vertices in $$ N_H(x) $$ were embedded.

Step 6: phase 2 succeeds with high probability
By Hall's marriage theorem, phase 2 fails if and only if Hall's condition is violated. For this to happen, there must be some $$ 1 \le i \le r $$ and $$ S \subseteq X_i $$ such that $$ |\bigcup_{z \in S} F_z(T)| < |S| $$. $$ |S| $$ cannot be too small by largeness of free sets (step 1). If $$ |S| $$ is too large, then with high probability $$ \bigcup_{z \in S} F_z(T) = V_i(T) $$, so the probability of failure in such a case would be low. If $$ |S| $$ is neither too small nor too large, then noting that $$ A := V_i(T) \setminus \bigcup_{z \in S} F_z(T)$$ is a large set of unused vertices, we can use the main lemma and union-bound the failure probability.

Applications
The blow-up lemma has a number of applications in embedding dense graphs.

Pósa-Seymour Conjecture
In 1962, Lajos Pósa conjectured that every $$n$$-vertex graph with minimum degree at least $$\frac{2n}3$$ contains the square of a Hamiltonian cycle, generalizing Dirac's theorem. The conjecture was further extended by Paul Seymour in 1974 to the following:
 * Every graph on $$n$$ vertices with minimum degree at least $$\frac{kn}{k+1}$$ contains the $$k$$-th power of a Hamiltonian cycle.

The blow-up lemma was used by Komlós, Sárközy, and Szemerédi to prove the conjecture for all sufficiently large values of $$n$$ (for a fixed $$k$$) in 1998.

Alon-Yuster Conjecture
In 1995, Noga Alon and Raphael Yuster considered the generalization of the well-known Hajnal–Szemerédi theorem to arbitrary $$H$$-factors (instead of just complete graphs), and proved the following statement:
 * For every fixed graph $$ H $$ with $$ h $$ vertices, any graph G with n vertices and with minimum degree $$ d \ge \frac{\chi(H)-1}{\chi(H)}n $$ contains $$ (1-o(1))n/h $$ vertex disjoint copies of H.

They also conjectured that the result holds with only a constant (instead of linear) error:
 * For every integer $$h$$ there exists a constant $$ c(h) $$ such that for every graph $$ H $$ with $$ h $$ vertices, any graph $$ G $$ with $$ n $$ vertices and with minimum degree $$ d \ge \frac{\chi(H)-1}{\chi(H)}n $$ contains at least $$n/h-c(h) $$ vertex disjoint copies of $$ H $$.

This conjecture was proven by Komlós, Sárközy, and Szemerédi in 2001 using the blow-up lemma.

History and Variants
The blow-up lemma, first published in 1997 by Komlós, Sárközy, and Szemerédi, emerged as a refinement of existing proof techniques using the regularity method to embed spanning graphs, as in the proof of the Bollobás conjecture on spanning trees, work on the Pósa-Seymour conjecture about the minimum degree necessary to contain the k-th graph power of a Hamiltonian cycle, and the proof of the Alon-Yuster conjecture on the minimum degree needed for a graph to have a perfect H-factor. The proofs of all of these theorems relied on using a randomized greedy algorithm to embed the majority of vertices, and then using a Kőnig-Hall like argument to find an embedding for the remaining vertices. The first proof of the blow-up lemma also used a similar argument. Later in 1997, however, the same authors published another paper that found an improvement to the randomized algorithm to make it deterministic.

Peter Keevash found a generalization of the blow-up lemma to hypergraphs in 2010.

Stefan Glock and Felix Joos discovered a variant of the blow-up lemma for rainbow graphs in 2018.

In 2019, Peter Allen, Julia Böttcher, Hiep Hàn, Yoshiharu Kohayakawa, and Yury Person, found sparse analogues of the blow-up lemma for embedding bounded degree graphs into random and pseudorandom graphs