User:Sadeq/Averaging lemma

The averaging argument is a standard argument which helps in proving theorems in complexity theory and cryptography. It usually allows us to convert probabilistic polynomial-time algorithms into non-uniform polynomial-size circuits.

Example
To simplify, let's first consider an example.

Example: If every person likes at least 1/3 of the books in a library, then, there exists a book, which at least 1/3 of people liked it.

Proof: Suppose there are $$N$$ people and B books. Each person likes at least $$B/3$$ of the books. Let people leave a mark on the book the like. Then, there will be at least $$M=(NB)/3$$ marks. The averaging argument claims that there exists a book with at least $$N/3$$ marks on it. Assume, to the contradiction, that no such book exists. Then, every book has less than $$N/3$$ marks. But, since there are $$N$$ books, the total number of marks will be less than $$(NB)/3$$, contradicting the fact that there is at least $$M$$ marks. $$\scriptstyle\blacksquare$$

Formalized Definition of Averaging Argument
Consider two sets: X and Y, a proposition $$p \colon X\times Y \to \mbox{TRUE/FALSE}$$, and a fraction  $$f$$ (where  $$0\le f\le 1$$ ).

If for all $$x\in X$$  and at least a fraction  $$f$$  of  $$y\in Y$$, the proposition  $$p(x,y)$$  holds, then there exists a  $$y\in Y$$ , for which there exists a fraction  $$f$$  of  $$x\in X$$ that the proposition  $$p(x,y)$$  holds.

Another formal (and more complicated) definition is due to Barak :

Let $$f$$ be some function. The averaging argument is the following claim: if we have a circuit $$C$$ such that $$C(x, y) = f(x)$$ with probability at least $$\rho$$, where $$x$$ is chosen at random and $$y$$ is chosen independently from some distribution $$Y$$ over $$\{0, 1\}^m$$ (which might not even be efficiently sampleable) then there exists a single string $$y_0 \in \{0, 1\}^m$$ such that $$Pr_x[C(x, y_0) = f(x)] \ge \rho$$.

Indeed, for every $$y$$ define $$p_y$$ to be $$Pr_x[C(x, y) = f(x)]$$ then

$$Pr_{x,y}[C(x, y) = f(x)] = E_y[p_y]$$

and then this reduces to the claim that for every random variable $$Z$$, if $$E[Z] \ge \rho$$ then $$Pr[Z \ge \rho] > 0$$ (this holds since $$E[Z]$$ is the weighted average of $$Z$$ and clearly if the average of some values is at least $$\rho$$ then one of the values must be at least $$\rho$$).

Application
This argument has wide use in complexity theory (e.g. proving $$\mathcal{BPP}\subsetneq\mathcal{P}/poly$$) and cryptography (e.g. proving that indistinguishable encryption results in semantic security). A plethora of such applications can be found in Goldreich's books.