
=Johnson-Lindenstrauss lemma=

The Johnson-Lindenstrauss lemma asserts that any set of $$n$$ points in a high-dimensional Euclidean space can be mapped down into an $$O(\frac{\log n}{\epsilon^{2}})$$-dimensional Euclidean space such that the distance between any two points changes by at most a factor of $$(1+\epsilon)$$, for any $$0<\epsilon<1$$.

==Introduction==
Johnson and Lindenstrauss {cite} proved a fundamental mathematical result: any $$n$$-point set in any Euclidean space can be embedded into $$O(\log n / \epsilon^2)$$ dimensions without distorting the distance between any pair of points by more than a factor of $$(1+\epsilon)$$, for any $$0<\epsilon<1$$. The original proof of Johnson and Lindenstrauss was considerably simplified by Frankl and Maehara {cite}, using geometric insights and refined approximation techniques.

==Proof==
Suppose we have a set of $$n$$ $$d$$-dimensional points $$p_1, p_2, \ldots, p_n$$ and we map them down to $$k = C \frac{\log n}{\epsilon^2} \ll d$$ dimensions, for an appropriate constant $$C > 1$$. Define $$T(\cdot)$$ as the linear map: if $$x \in R^d$$, then $$T(x) \in R^k$$. For example, $$T$$ could be a $$k \times d$$ matrix.
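One common concrete choice for such a $$T$$ (an illustrative sketch, not the only construction) is a $$k \times d$$ matrix with i.i.d. Gaussian entries of variance $$1/k$$, which makes $$E[||T(x)||^2] = ||x||^2$$ for every $$x$$:

```python
import numpy as np

def random_projection(d, k, seed=None):
    """Return a random k x d linear map T with i.i.d. N(0, 1/k) entries.

    With this scaling, E[||T x||^2] = ||x||^2 for every x in R^d,
    so ||T x|| concentrates around ||x||.
    """
    rng = np.random.default_rng(seed)
    return rng.normal(loc=0.0, scale=1.0 / np.sqrt(k), size=(k, d))

# Project a single d-dimensional vector down to k dimensions;
# the two norms should be close (k = 500 here is illustrative).
T = random_projection(d=10_000, k=500, seed=0)
x = np.ones(10_000)
print(np.linalg.norm(x), np.linalg.norm(T @ x))
```

The $$1/\sqrt{k}$$ scaling is what makes the projection norm-preserving in expectation; without it, $$||T(x)||$$ would be larger than $$||x||$$ by a factor of roughly $$\sqrt{k}$$.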

==The general proof framework==

All known proofs of the Johnson-Lindenstrauss lemma proceed according to the following scheme: for given $$n$$ and an appropriate $$k$$, one defines a suitable probability distribution $$F$$ on the set of all linear maps $$T: R^d \to R^k$$. Then one proves the following statement:

Statement: If $$T: R^d \to R^k$$ is a random linear map drawn from the distribution $$F$$, then for every vector $$x \in R^d$$ we have

$$ Prob[(1-\epsilon)||x|| \leq ||T(x)|| \leq (1+\epsilon) ||x||] \geq 1 - \frac{1}{n^2} $$

Having established this statement for the distribution $$F$$ under consideration, the JL result follows easily: we choose $$T$$ at random according to $$F$$. Then for every $$i < j$$, using the linearity of $$T$$ and the Statement above with $$x = p_i - p_j$$, we get that $$T$$ fails to satisfy $$(1-\epsilon) ||p_i - p_j|| \leq ||T(p_i) - T(p_j)|| \leq (1+\epsilon)||p_i - p_j||$$ with probability at most $$\frac{1}{n^2}$$. Consequently, by the union bound, the probability that any of the $$\binom{n}{2}$$ pairwise distances is distorted by $$T$$ by more than a factor of $$(1+\epsilon)$$ is at most $$\binom{n}{2}/n^2 < \frac{1}{2}$$. Therefore, a random $$T$$ works with probability at least $$\frac{1}{2}$$.
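The union-bound argument can be checked empirically: draw $$n$$ random points, project them with a Gaussian $$T$$ as above, and count how many of the $$\binom{n}{2}$$ pairwise distances fall outside the $$(1 \pm \epsilon)$$ window. The constant $$C = 8$$ below is an illustrative assumption, not a tight bound:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
n, d, eps = 50, 2_000, 0.5
k = int(8 * np.log(n) / eps**2)   # k = C log n / eps^2 with illustrative C = 8
T = rng.normal(0.0, 1.0 / np.sqrt(k), size=(k, d))

points = rng.normal(size=(n, d))  # n random points in R^d
projected = points @ T.T          # apply the linear map T to every point

# Count pairs whose distance is distorted by more than a (1 +/- eps) factor.
distorted = 0
for i, j in combinations(range(n), 2):
    orig = np.linalg.norm(points[i] - points[j])
    proj = np.linalg.norm(projected[i] - projected[j])
    if not (1 - eps) * orig <= proj <= (1 + eps) * orig:
        distorted += 1
print(distorted)  # with high probability, 0: no pair is badly distorted
```

Note that because $$T$$ is linear, a single random matrix handles all $$\binom{n}{2}$$ pairs at once; this is exactly where the union bound in the proof is applied.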