Lovász local lemma

In probability theory, if a large number of events are all independent of one another and each has probability less than 1, then there is a positive (possibly small) probability that none of the events will occur. The Lovász local lemma allows one to relax the independence condition slightly: as long as the events are "mostly" independent of one another and are not individually too likely, there will still be a positive probability that none of them occurs. It is most commonly used in the probabilistic method, in particular to give existence proofs.

There are several different versions of the lemma. The simplest and most frequently used is the symmetric version given below. A weaker version was proved in 1975 by László Lovász and Paul Erdős in the article "Problems and results on 3-chromatic hypergraphs and some related questions". In 2020, Robin Moser and Gábor Tardos received the Gödel Prize for their algorithmic version of the Lovász local lemma, which uses entropy compression to provide an efficient randomized algorithm for finding an outcome in which none of the events occurs.

Statements of the lemma (symmetric version)
Let $$A_1, A_2,\dots, A_k$$ be a sequence of events such that each event occurs with probability at most p and such that each event is independent of all the other events except for at most d of them.

Lemma I (Lovász and Erdős 1973; published 1975) If
 * $$4 p d \le 1$$

then there is a nonzero probability that none of the events occurs.

Lemma II (Lovász 1977; published by Joel Spencer) If
 * $$e p (d+1) \le 1,$$

where e = 2.718... is the base of natural logarithms, then there is a nonzero probability that none of the events occurs.

Today, Lemma II is usually referred to as the "Lovász local lemma".

Lemma III (Shearer 1985) If
 * $$\begin{cases} p < \frac{(d-1)^{d-1}}{d^d} & d > 1\\ p < \tfrac{1}{2} & d = 1 \end{cases}$$

then there is a nonzero probability that none of the events occurs.

The threshold in Lemma III is optimal and it implies that the bound
 * $$ epd \le 1$$

is also sufficient.
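
The three conditions can be compared numerically. The following is a minimal sketch rather than part of the original statements; the function names `shearer_threshold` and `symmetric_conditions` are hypothetical and chosen only for illustration.

```python
import math


def shearer_threshold(d: int) -> float:
    """Threshold of Lemma III: (d-1)^(d-1) / d^d for d > 1, and 1/2 for d = 1."""
    if d < 1:
        raise ValueError("d must be a positive integer")
    if d == 1:
        return 0.5
    # Rewritten as ((d-1)/d)^(d-1) / d to avoid huge intermediate powers.
    return ((d - 1) / d) ** (d - 1) / d


def symmetric_conditions(p: float, d: int) -> dict:
    """Report which of the three sufficient conditions hold for the given p and d."""
    return {
        "Lemma I (4pd <= 1)": 4 * p * d <= 1,
        "Lemma II (ep(d+1) <= 1)": math.e * p * (d + 1) <= 1,
        "Lemma III (Shearer)": p < shearer_threshold(d),
    }


if __name__ == "__main__":
    # Illustrative (p, d) pairs; the last one matches the circle example.
    for p, d in [(0.1, 2), (0.12, 3), (1 / 121, 42)]:
        print(p, d, symmetric_conditions(p, d))
```

For p = 0.12 and d = 3, only the condition of Lemma III holds, which illustrates that Shearer's threshold is strictly weaker to satisfy than the other two.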

Asymmetric Lovász local lemma
A statement of the asymmetric version (which allows for events with different probability bounds) is as follows:

Lemma (asymmetric version). Let $$ \mathcal{A} = \{ A_1, \ldots, A_n \}$$ be a finite set of events in the probability space Ω. For $$ A \in \mathcal{A} $$ let $$ \Gamma(A)$$ denote the neighbours of $$A$$ in the dependency graph (in the dependency graph, each event $$A$$ is mutually independent of the set of all events that are not adjacent to it). If there exists an assignment of reals $$x : \mathcal{A} \to [0,1) $$ to the events such that


 * $$ \forall A \in \mathcal{A} : \Pr(A) \leq x(A) \prod_{B \in \Gamma(A)} (1-x(B)) $$

then the probability of avoiding all events in $$ \mathcal{A} $$ is positive, in particular


 * $$ \Pr\left(\overline{A_1} \wedge \cdots \wedge \overline{A_n} \right) \geq \prod_{i\in \{1,\ldots,n\}} (1-x(A_i)). $$
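
The bound can be checked by brute force on a small instance. The sketch below uses a toy setup that is an assumption for illustration only: m independent bits on a cycle, each equal to 1 with probability q, where the event $$A_i$$ is "bits i and i+1 are both 1". Each event shares a bit with exactly two others, so $$|\Gamma(A_i)| = 2$$, and the weight x = 1/3 is assigned to every event.

```python
from itertools import product


def prob_no_bad_event(m: int, q: float) -> float:
    """Exact probability that no two cyclically adjacent bits are both 1."""
    total = 0.0
    for bits in product((0, 1), repeat=m):
        if any(bits[i] and bits[(i + 1) % m] for i in range(m)):
            continue  # some event A_i occurs in this outcome
        ones = sum(bits)
        total += q ** ones * (1 - q) ** (m - ones)
    return total


m, q = 8, 0.3
x = 1 / 3            # weight x(A_i), the same for every event
neighbours = 2       # |Gamma(A_i)| on the cycle (valid for m >= 4)

# Hypothesis of the lemma: Pr(A_i) = q^2 <= x * (1 - x)^2.
hypothesis_holds = q * q <= x * (1 - x) ** neighbours
lower_bound = (1 - x) ** m          # product of (1 - x(A_i)) over all events
exact = prob_no_bad_event(m, q)

print(hypothesis_holds, lower_bound, exact)
```

In this instance the exact probability is far above the guaranteed lower bound, which is typical: the lemma certifies positivity and is not meant to be tight.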

The symmetric version follows immediately from the asymmetric version by setting


 * $$ \forall A \in \mathcal{A} : x(A) = \frac{1}{d+1}$$

to get the sufficient condition


 * $$ p \leq \frac{1}{d+1} \cdot \frac{1}{e} $$

since


 * $$\frac{1}{e} \leq \left (1 - \frac{1}{d+1} \right)^d.$$
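
Spelled out, the substitution works as follows (a worked derivation, assuming $$d \geq 1$$ and using only $$|\Gamma(A)| \leq d$$ together with the standard inequality $$(1+1/d)^d < e$$):

 * $$ x(A) \prod_{B \in \Gamma(A)} (1-x(B)) = \frac{1}{d+1}\left(1-\frac{1}{d+1}\right)^{|\Gamma(A)|} \geq \frac{1}{d+1}\left(\frac{d}{d+1}\right)^{d} = \frac{1}{d+1}\cdot\frac{1}{(1+1/d)^{d}} \geq \frac{1}{d+1}\cdot\frac{1}{e}. $$

Hence $$p \leq \frac{1}{e(d+1)}$$ guarantees $$\Pr(A) \leq p \leq x(A) \prod_{B \in \Gamma(A)} (1-x(B))$$ for every event, which is the condition $$e p (d+1) \le 1$$ of Lemma II.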

Constructive versus non-constructive
Note that, as is often the case with probabilistic arguments, this theorem is non-constructive and gives no method of determining an explicit element of the probability space in which no event occurs. However, algorithmic versions of the local lemma with stronger preconditions are also known (Beck 1991; Czumaj and Scheideler 2000). More recently, a constructive version of the local lemma was given by Robin Moser and Gábor Tardos requiring no stronger preconditions.
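
The idea behind the constructive version can be sketched as a resampling loop: when the events are determined by independent random variables, repeatedly pick an event that currently occurs and redraw only the variables it depends on. The code below is a minimal sketch in that spirit, not a reproduction of the published algorithm; the function `resample_until_good`, its parameters, and the toy instance are all hypothetical.

```python
import random


def resample_until_good(num_vars, events, sample, rng, max_rounds=100_000):
    """Find values of independent variables under which no bad event occurs.

    events: list of (indices, is_bad) pairs; is_bad receives the current values
            of the variables at `indices` and returns True if the event occurs.
    sample: function drawing a fresh value for one variable from its distribution.
    """
    values = [sample(rng) for _ in range(num_vars)]
    for _ in range(max_rounds):
        occurring = [idx for idx, is_bad in events
                     if is_bad([values[i] for i in idx])]
        if not occurring:
            return values
        # Resample only the variables that the chosen event depends on.
        for i in occurring[0]:
            values[i] = sample(rng)
    raise RuntimeError("no good assignment found within max_rounds")


# Toy instance (an assumption for illustration): m biased bits on a cycle,
# where the bad event A_i is "bits i and i+1 are both 1".
m = 12
events = [((i, (i + 1) % m), all) for i in range(m)]
bits = resample_until_good(m, events, lambda r: r.random() < 0.3, random.Random(1))
print(bits)
```

The `max_rounds` cap is a safety net added for this sketch; it is not part of the lemma or of the constructive result.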

Non-constructive proof
We prove the asymmetric version of the lemma, from which the symmetric version can be derived. By using the principle of mathematical induction we prove that for all $$A$$ in $$\mathcal{A}$$ and all subsets $$S$$ of $$\mathcal{A}$$ that do not include $$A$$, $$ \Pr\left(A\mid\bigwedge_{B \in S}\overline{B}\right)\leq x(A)$$. The induction is on the size (cardinality) of the set $$ S $$. For the base case $$S=\emptyset$$ the statement holds since, by hypothesis, $$ \Pr(A) \leq x(A) \prod_{B \in \Gamma(A)} (1-x(B)) \leq x(A) $$. We need to show that the inequality holds for any subset of $$\mathcal{A} $$ of a certain cardinality given that it holds for all subsets of a lower cardinality.

Let $$S_1 = S\cap \Gamma(A), S_2 = S \setminus S_1$$. We have from Bayes' theorem


 * $$\Pr\left(A\mid\bigwedge_{B\in S} \overline{B}\right) = \frac{\Pr\left(A\wedge\bigwedge_{B\in S_{1}} \overline{B}\mid \bigwedge_{B\in S_2} \overline{B}\right)}{\Pr\left(\bigwedge_{B\in S_1}\overline{B}\mid\bigwedge_{B\in S_2} \overline{B} \right)}. $$

We bound the numerator and denominator of the above expression separately. For this, let $$ S_1=\{B_{j1},B_{j2},\ldots,B_{jl}\} $$. First, exploiting the fact that $$A$$ does not depend upon any event in $$ S_2 $$, we have


 * $$ \text{Numerator} \leq \Pr\left(A\mid\bigwedge_{B\in S_2} \overline{B}\right) = \Pr(A) \leq x(A) \prod_{B\in\Gamma(A)}(1-x(B)). \qquad (1) $$

Expanding the denominator by using Bayes' theorem and then using the inductive assumption, we get



 * $$\begin{align} & \text{Denominator} \\ = {} & \Pr\left(\overline{B}_{j1}\mid\bigwedge_{t=2}^l \overline{B}_{jt}\wedge\bigwedge_{B\in S_2} \overline{B} \right)\cdot \Pr\left(\overline{B}_{j2}\mid\bigwedge_{t=3}^l\overline{B}_{jt}\wedge\bigwedge_{B\in S_2} \overline{B} \right)\cdots \Pr\left(\overline{B}_{jl}\mid\bigwedge_{B\in S_2} \overline{B} \right) \\ \geq {} & \prod_{B\in S_1} (1-x(B)). \qquad (2) \end{align}$$

The inductive assumption can be applied here since each event is conditioned on fewer other events, i.e. on a subset of cardinality less than $$|S|$$. From (1) and (2), and since $$S_1 \subseteq \Gamma(A)$$ so that the factors $$1-x(B)$$ with $$B \in S_1$$ cancel, we get


 * $$ \Pr\left(A\mid\bigwedge_{B\in S} \overline{B}\right) \leq x(A)\prod_{B\in \Gamma(A)\setminus S_1}(1-x(B)) \leq x(A) $$

since the value of x is always in $$[0,1)$$. Note that we have essentially proved $$ \Pr\left(\overline{A}\mid\bigwedge_{B\in S} \overline{B}\right) \geq 1-x(A) $$. To get the desired probability, we write it in terms of conditional probabilities, applying Bayes' theorem repeatedly. Hence,



 * $$\begin{align} & \Pr\left(\overline{A_1} \wedge \cdots \wedge \overline{A_n} \right) \\ = {} & \Pr\left(\overline{A_1}\mid\overline{A_{2}}\wedge \cdots \wedge \overline{A_n}\right)\cdot\Pr\left(\overline{A_2}\mid\overline{A_3}\wedge \cdots \wedge \overline{A_n}\right) \cdots \Pr\left(\overline{A_n}\right) \\ \geq {} & \prod_{A\in\mathcal{A}}(1-x(A)), \end{align}$$

which is what we had intended to prove.

Example
Suppose 11n points are placed around a circle and colored with n different colors in such a way that each color is applied to exactly 11 points. In any such coloring, there must be a set of n points containing one point of each color but not containing any pair of adjacent points.

To see this, imagine picking a point of each color randomly, with all points equally likely (i.e., having probability 1/11) to be chosen. The 11n different events we want to avoid correspond to the 11n pairs of adjacent points on the circle. For each pair our chance of picking both points in that pair is at most 1/121 (exactly 1/121 if the two points are of different colors, otherwise 0), so we will take p = 1/121.

Whether a given pair (a, b) of points is chosen depends only on what happens in the colors of a and b, and not at all on whether any other collection of points in the other n − 2 colors is chosen. This implies the event "a and b are both chosen" is dependent only on those pairs of adjacent points which share a color either with a or with b.

There are 11 points on the circle sharing a color with a (including a itself), each of which belongs to 2 pairs. This means there are at most 2 × 11 − 1 = 21 pairs other than (a, b) which include the same color as a, and the same holds true for b. The worst that can happen is that these two sets are disjoint, so we can take d = 42 in the lemma. This gives


 * $$ e p (d+1) \approx 0.966<1.$$

By the local lemma, there is a positive probability that none of the bad events occurs, meaning that our set contains no pair of adjacent points. This implies that a set satisfying our conditions must exist.
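
The counting in this example can be checked mechanically. The sketch below is an illustration under stated assumptions: it generates one random coloring with 11 points per color, counts for every adjacent pair how many other adjacent pairs share a color with it, and evaluates e p (d+1) for p = 1/121 and d = 42. The helper names `random_coloring` and `max_dependencies` are hypothetical.

```python
import math
import random


def random_coloring(n: int, rng: random.Random) -> list:
    """Color 11n points on a circle so that each of n colors is used 11 times."""
    colors = [c for c in range(n) for _ in range(11)]
    rng.shuffle(colors)
    return colors


def max_dependencies(colors: list) -> int:
    """Largest number of other adjacent pairs sharing a color with one pair."""
    n_points = len(colors)
    pair_colors = [{colors[i], colors[(i + 1) % n_points]} for i in range(n_points)]
    worst = 0
    for i, ci in enumerate(pair_colors):
        count = sum(1 for j, cj in enumerate(pair_colors) if j != i and ci & cj)
        worst = max(worst, count)
    return worst


colors = random_coloring(6, random.Random(0))
observed_d = max_dependencies(colors)   # the argument above bounds this by 42
value = math.e * (1 / 121) * (42 + 1)   # e * p * (d + 1) with p = 1/121, d = 42

print(observed_d, round(value, 3))
```

Any particular coloring may give a smaller dependency count than 42; the lemma only needs the upper bound.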