Rokhlin lemma

In mathematics, the Rokhlin lemma, or Kakutani–Rokhlin lemma, is an important result in ergodic theory. It states that an aperiodic measure-preserving dynamical system can be decomposed into an arbitrarily high tower of measurable sets and a remainder of arbitrarily small measure. It was proven by Vladimir Abramovich Rokhlin and independently by Shizuo Kakutani. The lemma is used extensively in ergodic theory, for example in Ornstein theory, and has many generalizations.

The Rokhlin lemma belongs to the group of mathematical statements, such as Zorn's lemma in set theory and the Schwarz lemma in complex analysis, which are traditionally called lemmas despite the fact that their roles in their respective fields are fundamental.

Terminology
A Lebesgue space is a measure space $$(X, \mathcal B, \mu)$$ composed of two parts: an atomic part with finitely or countably many atoms, and a continuous part isomorphic to an interval in $\R$.

We consider only measure-preserving maps. As is typical in measure theory, we can freely discard countably many sets of measure zero.

An ergodic map is a map $$T$$ such that if $$T^{-1}(A) = A$$ (except on a measure-zero set) then $$A$$ or $$X-A$$ has measure zero.

An aperiodic map is a map such that the set of periodic points has measure zero:$$ \mu( \cup_{n \geq 1}\{x = T^n x\}) = 0 $$

A Rokhlin tower is a family of sets $S, TS, \dots, T^{N-1}S$ that are pairwise disjoint. $S$ is called the base of the tower, and each $T^nS$ is a rung or level of the tower. $N$ is the height of the tower. The tower itself is $$R := (S \cup TS \cup \dots \cup T^{N-1}S)$$. The set outside the tower, $X - R$, is the error set.

There are several Rokhlin lemmas. Each states that, under some assumptions, we can construct Rokhlin towers that are arbitrarily high with arbitrarily small error sets.

Applications
The Rokhlin lemma can be used to prove some theorems. For example, (Section 2.5 )

(Section 4.6 )

Ornstein isomorphism theorem (Chapter 6 ).

Topological Rokhlin lemmas
Let $$\textstyle (X,T)$$ be a topological dynamical system consisting of a compact metric space $$\textstyle X$$ and a homeomorphism $$\textstyle T:X\rightarrow X$$. The topological dynamical system $$\textstyle (X,T)$$ is called minimal if it has no proper non-empty closed $$\textstyle T$$-invariant subsets. It is called (topologically) aperiodic if it has no periodic points ($$T^{k}x=x$$ for some $$x\in X$$ and $$k\in\mathbb{Z}$$ implies $$k=0$$). A topological dynamical system $$\textstyle (Y,S)$$ is called a factor of $$\textstyle (X,T)$$ if there exists a continuous surjective mapping $$\textstyle \varphi:X\rightarrow Y$$ which is equivariant, i.e., $$\textstyle \varphi(Tx)=S\varphi(x)$$ for all $$\textstyle x\in X$$.

Elon Lindenstrauss proved the following theorem:

Theorem: Let $$\textstyle (X,T)$$ be a topological dynamical system which has an aperiodic minimal factor. Then for every integer $$\textstyle n\in\N$$ there is a continuous function $$\textstyle f\colon X\rightarrow\R$$ such that the set $$\textstyle E=\{x\in X\mid f(Tx)\neq f(x)+1\}$$ satisfies that $$\textstyle E,TE,\ldots,T^{n-1}E$$ are pairwise disjoint.

Gutman proved the following theorem:

Theorem: Let $$(X,T)$$ be a topological dynamical system which has an aperiodic factor with the small boundary property. Then for every $$\varepsilon>0$$, there exists a continuous function $$f\colon X\rightarrow\R$$ such that the set $$\textstyle E=\{x\in X \mid f(Tx)\neq f(x)+1\}$$ satisfies $$\operatorname{ocap}(\textstyle E)<\varepsilon$$, where $$\operatorname{ocap}$$ denotes orbit capacity.

Other generalizations

 * There are versions for non-invertible measure-preserving transformations.
 * Donald Ornstein and Benjamin Weiss proved a version for free actions by countable discrete amenable groups.
 * Carl Linderholm proved a version for periodic non-singular transformations.

Proofs
Proofs taken from.

Useful results
Proposition. An ergodic map on an atomless Lebesgue space is aperiodic.

Proof. If the map is not aperiodic, then there exists a number $n$, such that the set of periodic points of period $n$ has positive measure. Call the set $S$. Since measure is preserved, points outside of $S$ do not map into it, nor the other way. Since the space is atomless, we can divide $S$ into two halves, and $T$  maps each into itself, so $T$  is not ergodic.

Proposition. If there is an aperiodic map on a Lebesgue space of measure 1, then the space is atomless.

Proof. If there are atoms, then by measure-preservation, each atom can only map into another atom of greater or equal measure. If it maps into an atom of greater measure, it would drain out measure from the lighter atoms, so each atom maps to another atom of equal measure. Since the space has finite total measure, there are only finitely many atoms of a certain measure, and they must cycle back to the start eventually.

Proposition. If $T$ is ergodic, then any set $A$ with $\mu(A) > 0$ satisfies (up to a null set)$$X = \cup_{k \geq 0} T^k A = \cup_{k \leq 0} T^k A$$

Proof. $T^{-1}(\cup_{k \leq 0} T^k A)$ is a subset of $\cup_{k \leq 0} T^k A$, so by measure-preservation they are equal up to a null set. Thus $\cup_{k \leq 0} T^k A$ is an invariant set of $T$, and since it contains $A$, which has positive measure, ergodicity implies that it is all of $X$.

Similarly, $T(\cup_{k \geq 0} T^k A)$ is a subset of $\cup_{k \geq 0} T^k A$, so by measure-preservation they are equal, etc.

Ergodic case
Let $A$ be a set of positive measure $< \epsilon$. Since $T$ is ergodic, $X = \cup_{k \leq 0} T^k A$ up to a null set, so almost every point sooner or later falls into $A$. So we define a “time till arrival” function: $$ f(x) := \min\{n \geq 0: T^n x \in A\} $$ with $f(x) := +\infty$ if $x$ never falls into $A$. The set $\{f(x) = +\infty\}$ is null.

Now let $S = \{x: f(x) \in \{N, 2N, 3N, \dots\}\}$.
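The construction of $S$ from the arrival time can be tried out on a finite toy model. The sketch below is only an illustration under assumptions that are not in the text: the space is $\Z/p$ with counting measure, the map is the shift $x \mapsto x + \text{step} \bmod p$ (which is periodic, so this is not a genuine instance of the lemma), and $A$ is the first few residues. The function name `rokhlin_tower` and its parameters are hypothetical. In this model every uncovered point lies in $A \cup TA \cup \dots \cup T^{N-1}A$, so the error set has at most $N|A|$ points.

```python
def rokhlin_tower(p, step, a_size, height):
    """Toy model (an assumption, not from the text): X = Z/p with counting
    measure, T(x) = x + step mod p, and A = {0, ..., a_size - 1}."""
    T = lambda x: (x + step) % p
    A = set(range(a_size))

    def arrival(x):
        # f(x) = min{n >= 0 : T^n x in A}; finite when step is coprime to p
        n = 0
        while x not in A:
            x, n = T(x), n + 1
        return n

    f = {x: arrival(x) for x in range(p)}
    # Base S = {x : f(x) in {N, 2N, 3N, ...}} with N = height
    base = {x for x in range(p) if f[x] > 0 and f[x] % height == 0}
    # Levels S, TS, ..., T^{N-1} S
    levels = [base]
    for _ in range(height - 1):
        levels.append({T(x) for x in levels[-1]})
    return levels


levels = rokhlin_tower(p=1009, step=382, a_size=5, height=10)
tower = set().union(*levels)
error = 1009 - len(tower)  # number of points outside the tower
```

Here `levels` are pairwise disjoint and `error` is bounded by `height * a_size`, mirroring the statement that a small set $A$ yields a tall tower with a small error set.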

Simplify
By a previous proposition, $$X$$ is atomless, so we can map it to the unit interval $(0, 1)$.

If we can pick a near-zero set with near-full coverage, namely some $A$ with $\mu(A) = O(\epsilon)$ such that $\mu(X - \cup_{k \in \Z} T^k A) = O(\epsilon)$, then there exists some $n$ such that $\mu(X - \cup_{k \leq n} T^k A) = O(\epsilon)$, and since $$T^{-i}(T^n A) \supset T^{n-i}A $$ for each $$i = 0, 1, 2, \dots$$, we have$$\mu(X - \cup_{k \leq 0} T^k (T^nA)) =  O(\epsilon)$$

Now, repeating the previous construction with $$T^n A$$, we obtain a Rokhlin tower of height $N$ and coverage $1-O(\epsilon)$.

Thus, our task reduces to picking a near-zero set with near-full coverage.

Constructing A
Pick $M > 1/\epsilon$. Let $S$ be the family of sets $A$ such that $A, T^{-1}A, \dots, T^{-M}A$ are disjoint. Since $T$ preserves measure, these $M+1$ disjoint sets all have measure $\mu(A)$, so $(M+1)\mu(A) \leq 1$ and any $A \in S$ has measure $< \epsilon$.

The set $S$ is nonempty, because $\emptyset \in S$. It is preordered by $A \leq B$ iff $\mu(A-B) = 0$, that is, iff $A$ is contained in $B$ up to a null set. Any totally ordered chain has an upper bound. So by a simple Zorn-lemma-like argument, there exists a maximal element $A$ in it. This is the desired set.

We prove by contradiction that $X = \cup_{k\in \Z}T^k A$ up to a null set. Assume not; then we will construct a set $I\cap E$ of positive measure, disjoint from $A$, such that $A \cup (I \cap E) \in S$, which makes $A$ no longer a maximal element, a contradiction.

Constructing E
Since we assumed $\mu(X - \cup_{k\in \Z}T^k A) = \epsilon' > 0$, with probability $\epsilon'$, $$ x \not\in \cup_{k\in \Z}T^k A$$.

Since $T$ is aperiodic, with probability 1,$$ (x \neq Tx) \wedge(x \neq T^2x) \wedge \dots \wedge (x \neq T^Mx) $$

And so, for a small enough $\delta$, with probability $> 1- \epsilon'/2$,$$ (|x - Tx| > \delta) \wedge(|x - T^2 x| > \delta) \wedge \dots \wedge (|x - T^M x| > \delta) $$

And so, for this $\delta$, with probability $> \epsilon'/2$, these two events occur simultaneously. Let the joint event be $E$.

Simplify
It suffices to prove the case where only the base of the tower is probabilistically independent of the partition. Once that case is proved, we can apply the base case to the partition $P \vee T^{-1} P \vee \dots \vee T^{-N+1}P$.

Since events with zero probability can be ignored, we only consider partitions where each event $P_k$ has positive probability.

The goal is to construct a Rokhlin tower $R'$ with base $S'$, such that $$\mu(S' \cap P_i ) =\frac{1-\epsilon}{N} \mu(P_i)$$ for each $i \in 0:K-1$.

Symbolic dynamics
Given a partition $P$ and a map $T$, we can trace out the orbit of every point $x$ as a string of symbols $a_0(x), a_1(x), a_2(x), \dots$, such that each $T^i x \in P_{a_i(x)}$. That is, we follow $x$ to $T^ix$, then check which part of the partition it has ended up in, and write that part’s name as $a_i(x)$.

Given any Rokhlin tower of height $N$, we can take its base $S$ , and divide it into $K^N$ equivalence classes. The equivalence is defined thus: two elements are equivalent iff their names have the same first-$N$ symbols.

Let $E \subset S$ be one such equivalence class, then we call $E, TE, \dots, T^{N-1}E$  a column of the Rokhlin tower.

For each word $a_{0:N-1}\in (0:K-1)^N$, let the corresponding equivalence class be $E_{a}$.

Since $T$ is invertible, the columns partition the tower. One can imagine the tower as made of string cheese: cut up the base of the tower into the $K^N$ equivalence classes, then pull the tower apart into $K^N$ columns.
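As a small illustration of names and columns, the following sketch groups the points of a tower base by the first $N$ symbols of their names. It is a toy under stated assumptions: the space is $\Z/12$ with the shift $x \mapsto x+1$, the partition has two parts ($x < 6$ and $x \geq 6$), and the helper names `name` and `columns` are hypothetical.

```python
from collections import defaultdict


def name(x, T, part, n):
    """First n symbols a_0(x), ..., a_{n-1}(x) of the name of x."""
    word = []
    for _ in range(n):
        word.append(part(x))  # index of the part containing the current point
        x = T(x)
    return tuple(word)


def columns(base, T, part, n):
    """Split the base into equivalence classes E_a, keyed by the word a."""
    cols = defaultdict(set)
    for x in base:
        cols[name(x, T, part, n)].add(x)
    return dict(cols)


# Toy data (assumptions): a tower of height 4 over base {0, 4, 8} in Z/12.
T = lambda x: (x + 1) % 12
part = lambda x: 0 if x < 6 else 1
cols = columns({0, 4, 8}, T, part, 4)
```

Each key of `cols` is a word $a$, and each value is the class $E_a$; pushing $E_a$ forward by $T$ gives the levels of the corresponding column, each of which lies in the single part $P_{a_n}$.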

First Rokhlin tower R
Let $\delta \ll \epsilon$ be very small, and let $M \gg N$  be very large. Construct a Rokhlin tower with $M$ levels and error set of size $\delta$. Let its base be $S$. The tower $R = S\cup TS \cup\dots\cup T^{M-1}S$ has mass $1-\delta$.

Divide its base into $K^M$ equivalence classes, as previously described, now using the first $M$ symbols of each name since the tower has $M$ levels. This divides it into $K^M$ columns $\{E_a\}_{a}$ where $a$ ranges over the possible words $(0:K-1)^M$.

Because of how we defined the equivalence classes, each level in each column $T^nE_a$ falls entirely within one of the parts $P_0, \dots, P_{K-1}$, namely $P_{a_n}$. Therefore, the column levels $\{T^nE_a\}_{a, n}$ almost make up a refinement of the partition $P$, except for an error set of size $\delta$.

That is,$$\mu(R\cap P_i) = \sum_{a \in (0:K-1)^M,\; n \in 0:M-1,\; a_n = i}\mu(T^nE_a) = \mu(P_i) + O(\delta)$$

The critical idea: If we partition each $T^nE_a$ equally into $N$ parts, and put one into a new Rokhlin tower base $S'$, we will have$$\mu(S'\cap P_i) = \frac{1}{N}\mu(P_i) + O(\delta)$$

Second Rokhlin tower R'
Now we construct a new base $S'$ as follows. Divide each $E_a$ into $N$ parts $E_{a, 0}, \dots, E_{a, N-1}$ of equal measure. For each column based on $E_a$, add to $S'$, in a staircase pattern, the sets$$E_{a, 0}, TE_{a, 1}, \dots, T^{N-1}E_{a, N-1}$$then wrap back to the start: $$T^NE_{a, 0}, T^{N+1}E_{a, 1}, \dots, T^{2N-1}E_{a, N-1}$$and so on, until the column is exhausted.

The new Rokhlin tower base $S'$ is almost correct, but needs to be trimmed slightly into another set $$S$$, which would satisfy $$\mu(S\cap P_i)  = \frac{1-\epsilon}{N}\mu(P_i)$$ for each $i \in 0:K-1$, finishing the construction. (Only now do we use the assumption that the partition has only finitely many parts. If there are countably infinitely many parts, then the trimming cannot be done.)

The new Rokhlin tower $S', TS', \dots, T^{N-1}S'$, contains almost as much mass as the original Rokhlin tower. The only lost mass is due to a small corner on the top right and bottom left of each column, which takes up $\leq \frac{2N^2}{MN}$ proportion of the whole column’s mass. If we set $M \gg N/\delta$, this lost mass is still $O(\delta)$. Thus, the new Rokhlin tower still has a very small error set.
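The corner count can be checked on a grid model of a single column. This is a sketch under assumptions not stated in the text: the column is cut into $N$ vertical strips and $M$ levels, giving $MN$ cells of equal mass, and a cell $(j, n)$ stands for $T^n E_{a,j}$. The function name `staircase` is hypothetical.

```python
def staircase(M, N):
    """Grid model of one column: strip j in 0..N-1, level n in 0..M-1."""
    # Base cells: strip j contributes levels j, j+N, j+2N, ... as long as
    # the N levels above the cell still fit inside the column.
    base = [(j, n) for j in range(N) for n in range(j, M - N + 1, N)]
    # Cells covered by the new tower S', TS', ..., T^{N-1}S'.
    covered = {(j, n + i) for (j, n) in base for i in range(N)}
    lost = M * N - len(covered)  # cells in the two cut-off corners
    return base, covered, lost


base, covered, lost = staircase(M=100, N=5)
```

In this model the lost cells number at most $2N^2$ out of $MN$, matching the bound above, and every level away from the two corners contains exactly one base cell, which is why $S'$ picks up a $1/N$ share of each such level.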

Even after accounting for the mass lost from cutting off the column corners, we still have$$\begin{aligned} \mu(S'\cap P_i) &= \frac{1}{N}\mu(P_i) + O(\delta) + O(\delta) \\ &= \frac{1}{N}\mu(P_i) + O(\delta) \\ &= \frac{1}{N}\mu(P_i)\times (1 + O(N\delta/\mu(P_i)))\quad\forall i = 0, 1, \dots, K-1 \end{aligned}$$

Since the partition has only finitely many parts, $\min_i \mu(P_i)$ is positive, and we can set $\delta = o(\frac{\epsilon \min_i \mu(P_i)}{N})$, so that $N\delta/\mu(P_i) = o(\epsilon)$ for every $i$. We then have$$ \mu(S'\cap P_i) = \frac{1}{N}\mu(P_i)\times (1 + o(1) \epsilon) $$In other words, we have real numbers $c_0, c_1, \dots, c_{K-1} = o(1)$ such that $$ \mu(S'\cap P_i) = \frac{1-c_i \epsilon}{N}\mu(P_i) $$.

Now for each $i = 0, 1, \dots, K-1$, trim away a part of $S'\cap P_i$ to obtain $S\cap P_i$, so that $$\mu(S\cap P_i)  = \frac{1-\epsilon}{N}\mu(P_i)$$. This is possible because $c_i < 1$ once $\delta$ is small enough, so $S'\cap P_i$ has slightly more than the target measure. This finishes the construction.