Randomness merger

In extractor theory, a randomness merger is a function which extracts randomness out of a set of random variables, provided that at least one of them is uniformly random. Its name stems from the fact that it can be seen as a procedure which "merges" all the variables into one, preserving at least some of the entropy contained in the uniformly random variable. Mergers are used in the explicit construction of randomness extractors.

Intuition and definition
Consider a set of $$k$$ random variables, $$X_1,\ldots,X_k$$, each distributed over $$\{0,1\}^n$$, at least one of which is uniformly random, though it is not known which one. Furthermore, the variables may be arbitrarily correlated: they may be functions of one another, they may be constant, and so on. However, since at least one of them is uniform, the set as a whole contains at least $$n$$ bits of entropy.

The job of the merger is to output a new random variable, also distributed over $$\{0,1\}^n$$, that retains as much of that entropy as possible. Ideally, if it were known which of the variables is uniform, it could be used as the output, but that information is not known. The idea behind mergers is that by using a small additional random seed, it is possible to get a good result even without knowing which one is the uniform variable.

A naive idea would be to take the XOR of all the variables. If one of them is uniformly distributed and independent of the others, then the output would indeed be uniform. However, suppose that $$X_1 = X_2$$ and both are uniformly distributed: the two copies cancel in the XOR, so their entropy is lost, and the method fails.
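
The failure of the XOR approach can be demonstrated directly. The sketch below (in Python, treating the variables as byte strings) is illustrative only:

```python
import secrets

def xor_merge(blocks):
    """Naive merger: bitwise XOR of all input blocks (equal-length byte strings)."""
    out = bytes(len(blocks[0]))
    for b in blocks:
        out = bytes(x ^ y for x, y in zip(out, b))
    return out

n_bytes = 16

# Case 1: X1 uniform, X2 a constant (all zeros) -> the output equals X1,
# so it inherits all of X1's randomness.
x1 = secrets.token_bytes(n_bytes)
x2 = bytes(n_bytes)
assert xor_merge([x1, x2]) == x1

# Case 2: X1 = X2, both uniform -> the copies cancel and the output is
# the all-zeros constant: every bit of entropy is destroyed.
x1 = secrets.token_bytes(n_bytes)
x2 = x1
assert xor_merge([x1, x2]) == bytes(n_bytes)
```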

Definition (merger):

A function $$M : (\{0,1\}^n)^k \times \{0,1\}^d \rightarrow \{0,1\}^n$$ is called an $$(m,\varepsilon)$$-merger if for every set of random variables $$(X_1,\ldots,X_k)$$ distributed over $$\{0,1\}^n$$, at least one of which is uniform, the distribution of $$Z = M(X_1,\ldots,X_k, U_d)$$ has smooth min-entropy $$H_\infty^\varepsilon(Z) \geq m$$. The variable $$U_d$$ denotes the uniform distribution over $$d$$ bits, and represents a truly random seed.

In other words, by using a small uniform seed of length $$d$$, the merger returns a string which is $$\varepsilon$$-close to having at least $$m$$ min-entropy; this means that its statistical distance from a string with $$m$$ min-entropy is no larger than $$\varepsilon$$.
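
The statistical distance used in this definition is the total variation distance, which can be computed directly when the distributions are given explicitly. The two distributions below are made-up examples:

```python
def stat_distance(p, q):
    """Total variation distance between two distributions, each given as a
    dict mapping outcomes to probabilities."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

# The uniform distribution on 2-bit strings vs. a slightly biased one.
uniform = {x: 0.25 for x in ['00', '01', '10', '11']}
biased  = {'00': 0.30, '01': 0.25, '10': 0.25, '11': 0.20}
print(stat_distance(uniform, biased))  # ~ 0.05
```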

Reminder: There are several ways to measure the randomness of a distribution; the min-entropy of a random variable $$Z$$ is the largest $$k$$ such that the most probable value of $$Z$$ occurs with probability no more than $$2^{-k}$$, i.e., $$H_\infty(Z) = -\log_2 \max_z \Pr[Z = z]$$. The min-entropy of a string is an upper bound on the amount of randomness that can be extracted from it.
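
As a concrete illustration (the distributions below are invented for the example), min-entropy can be computed directly from a distribution's probability table:

```python
import math

def min_entropy(dist):
    """H_inf(Z) = -log2(max_z Pr[Z = z]), for a dict of probabilities."""
    return -math.log2(max(dist.values()))

# The uniform distribution on 8 outcomes has min-entropy 3.
uniform8 = {i: 1 / 8 for i in range(8)}
print(min_entropy(uniform8))  # 3.0

# A skewed distribution whose most likely value has probability 1/2 has
# min-entropy only 1, even though it has 8 possible outcomes.
skewed = {0: 0.5, **{i: 0.5 / 7 for i in range(1, 8)}}
print(min_entropy(skewed))  # 1.0
```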

Parameters
There are three parameters to optimize when building mergers:


 * 1) The output's min-entropy $$m$$ should be as high as possible, for then more bits can be extracted from it.
 * 2) $$\varepsilon$$ should be as small as possible, for then after applying an extractor to the merger's output, the result will be closer to uniform.
 * 3) The seed length $$d$$ should be as small as possible, for then the merger requires fewer initial truly random bits to work.

Explicit constructions of mergers with relatively good parameters are known. For example, Dvir and Wigderson's construction gives the following: for every $$\alpha > 0$$ and integer $$n$$, if $$k \leq 2^{o(n)}$$, there exists an explicit $$(m,\varepsilon)$$-merger $$M : (\{0,1\}^n)^k \times \{0,1\}^d \rightarrow \{0,1\}^n$$ such that:
 * 1) $$m = (1-\alpha)n,$$
 * 2) $$d = O(\log(n) + \log(k)),$$
 * 3) $$\varepsilon = O\left(\frac{1}{n \cdot k}\right).$$
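
To get a feel for these parameters, the sketch below plugs in sample values. Note that the constants hidden in the $$O(\cdot)$$ bounds are not taken from the construction; they are placeholders set to 1 purely for illustration:

```python
import math

def dw_merger_params(n, k, alpha, c_d=1.0, c_eps=1.0):
    """Illustrative parameter calculator for a Dvir-Wigderson-style merger.
    c_d and c_eps stand in for the unknown big-O constants (NOT from the
    actual construction; both default to 1 for illustration)."""
    m = (1 - alpha) * n                        # output min-entropy
    d = c_d * (math.log2(n) + math.log2(k))    # seed length, up to constants
    eps = c_eps / (n * k)                      # error, up to constants
    return m, d, eps

m, d, eps = dw_merger_params(n=1024, k=16, alpha=0.1)
print(m, d, eps)  # m ~ 921.6, d = 14.0 (up to constants), eps ~ 6.1e-05
```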

The proof is constructive and allows building such a merger in polynomial time in the given parameters.

Usage
It is possible to use mergers to produce randomness extractors with good parameters. Recall that an extractor is a function which takes a random variable with high min-entropy and returns a smaller random variable that is close to uniform. An extractor for sources of arbitrary min-entropy can be obtained using the following merger-based scheme:


 * Given a source of high min-entropy, partition it into blocks. By a known result, at least one of these blocks will also have high min-entropy, so the sequence forms a block-source.
 * Apply a block extractor separately to all the blocks. This is a weaker sort of extractor, and good constructions for it are known. Since at least one of the blocks has high min-entropy, at least one of the outputs is very close to uniform.
 * Use the merger to combine all the previous outputs into one string. If a good merger is used, then the resultant string will have very high min-entropy compared to its length.
 * Use a known extractor that works only for very high min-entropy sources to extract the randomness.

The essence of the scheme above is to use the merger to transform a string of arbitrary min-entropy into a shorter string, without losing much min-entropy in the process. The new string has very high min-entropy relative to its length, so older, known extractors that work only on such strings can then be applied.
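
The data flow of the four steps can be sketched as follows. Every component here is a toy placeholder (the names `block_extract`, `merge`, and `high_entropy_extract` are hypothetical, and none of them carries the guarantees of a real construction); only the structure of the pipeline matches the scheme above:

```python
import secrets

def partition_into_blocks(source: bytes, k: int) -> list:
    """Step 1: split the weak source into k equal-length blocks."""
    step = len(source) // k
    return [source[i * step:(i + 1) * step] for i in range(k)]

def block_extract(block: bytes, seed: bytes) -> bytes:
    """Step 2 placeholder: a 'block extractor' applied to one block.
    Here just an XOR with the seed; a real construction is far stronger."""
    return bytes(b ^ s for b, s in zip(block, seed))

def merge(candidates: list, seed: bytes) -> bytes:
    """Step 3 placeholder: the seed selects one candidate. A real merger
    combines the candidates with provable smooth min-entropy guarantees."""
    return candidates[seed[0] % len(candidates)]

def high_entropy_extract(x: bytes) -> bytes:
    """Step 4 placeholder: an extractor for very-high-min-entropy sources;
    here it merely truncates."""
    return x[: len(x) // 2]

k = 4
source = secrets.token_bytes(64)   # stands in for the weak source
seed = secrets.token_bytes(16)     # short truly random seed

blocks = partition_into_blocks(source, k)              # step 1
candidates = [block_extract(b, seed) for b in blocks]  # step 2
merged = merge(candidates, seed)                       # step 3
output = high_entropy_extract(merged)                  # step 4
print(len(output))  # 8
```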