User:Jean Raimbault/sandbox/Random walk

Informal discussion
Perhaps the simplest example of a random walk is the random walk on the integer number line $$\mathbb Z$$ which starts at 0 and at each step moves +1 or −1 with equal probability. This walk can be illustrated as follows. A marker is placed at zero on the number line and a fair coin is flipped. If it lands on heads, the marker is moved one unit to the right. If it lands on tails, the marker is moved one unit to the left. After the first step the marker is at 1 or −1, each with probability 0.5. After the second step it can be at 0 with probability 0.5, since the sequences heads/tails and tails/heads both lead there (two outcomes out of four), or at 2 (two heads) or −2 (two tails), each with probability 0.25. See the figure below for an illustration of the possible outcomes up to 5 flips.
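The probabilities above can be verified by brute-force enumeration of the flip sequences; a short Python sketch (the function name is illustrative, not standard):

```python
from itertools import product
from fractions import Fraction

def position_distribution(n):
    """Exact distribution of the marker's position after n fair coin
    flips, by enumerating all 2**n equally likely flip sequences."""
    counts = {}
    for flips in product((+1, -1), repeat=n):
        s = sum(flips)
        counts[s] = counts.get(s, 0) + 1
    return {pos: Fraction(c, 2 ** n) for pos, c in sorted(counts.items())}

# After two flips: heads/tails and tails/heads both return the marker to 0,
# so the positions -2, 0, 2 have probabilities 1/4, 1/2, 1/4.
print(position_distribution(2))
```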



The walk can also be represented by the graph of its position with respect to time, as in the following figure.



Definition
To define this walk formally, take independent random variables $$Z_1, Z_2,\dots$$, where each variable is either 1 or −1 with probability 1/2 each, and set


 * $$S_0 = 0,\quad S_n =\sum_{j=1}^{n} Z_j$$ for $$n \ge 1$$.

The sequence of random variables $$(S_n)_{n\ge 0}$$ is called the simple random walk on $$\mathbb Z$$; the random variable $$S_n$$ is the $$n$$th position of the walk, that is, its position after $$n$$ steps.

Asymptotic size
The expectation $$E(S_n)$$ of $$S_n$$ is zero for every $$n$$: the walk shows no systematic drift to the right or to the left, since each step has mean zero. This follows from the finite additivity property of expectation:
 * $$E(S_n)=\sum_{j=1}^n E(Z_j)=0.$$

The next question is: how large is the typical state at time $$n$$? A rough answer is given by the expected values of the random variables $$S_n^2$$ and $$|S_n|$$. A calculation similar to the one above, using the independence of the random variables $$Z_n$$ and the facts that $$E(Z_n^2)=1$$ and $$E(Z_i Z_j)=0$$ for $$i \neq j$$, shows that:
 * $$E(S_n^2)=\sum_{i=1}^n \sum_{j=1}^n E(Z_j Z_i)=n.$$

This implies, using the Cauchy-Schwarz inequality, that $$E(|S_n|)\,\!$$, the expected translation distance after n steps, is at most of the order of $$\sqrt n$$. In fact, it follows from the central limit theorem that:
 * $$\lim_{n\to\infty} \frac{E(|S_n|)}{\sqrt n}= \sqrt{\frac {2}{\pi}}.$$
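This limit can be checked numerically from the exact binomial distribution of $$S_n$$; a short sketch (the function name is ours):

```python
import math

def expected_abs_position(n):
    """E(|S_n|) computed exactly from the binomial distribution:
    S_n = k requires (n+k)/2 steps equal to +1, which happens with
    probability C(n, (n+k)/2) / 2**n."""
    return sum(abs(k) * math.comb(n, (n + k) // 2) / 2 ** n
               for k in range(-n, n + 1, 2))  # S_n has the parity of n

n = 2000
print(expected_abs_position(n) / math.sqrt(n))  # close to sqrt(2/pi) ≈ 0.7979
```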

The law of the iterated logarithm describes the amplitude of the oscillations of the random walk: it states that for any $$\varepsilon > 0$$, almost surely $$|S_n| < (\sqrt 2 + \varepsilon)\sqrt{n \log\log n}$$ for all but finitely many $$n$$, while $$|S_n| > (\sqrt 2 - \varepsilon)\sqrt{n \log\log n}$$ for infinitely many $$n$$.

Combinatorial computation of the distributions
Some of the results mentioned above can be derived by giving exact formulas for the probability of hitting a given integer at time $$n$$, in terms of the binomial coefficients in Pascal's triangle, by a simple computation. The total number of possible trajectories up to time $$n$$ is $$2^n$$. If $$|k| \le n$$ and $$k$$ has the same parity as $$n$$ (note that $$S_n$$ and $$n$$ always have the same parity), then the trajectories which result in the state $$S_n = k$$ at time $$n$$ are exactly those which have taken $$(n+k)/2$$ steps equal to +1 and $$(n-k)/2$$ steps equal to −1. For the simple random walk all of these trajectories are equally likely, and the number of trajectories satisfying this condition is the binomial coefficient counting the choices of $$(n-k)/2$$ (or equivalently $$(n+k)/2$$) elements among $$n$$; therefore:
 * $$ \mathbb P(S_n = k) = \frac 1 {2^n} \binom{n}{(n-k)/2} = \frac 1 {2^n} \binom{n}{(n+k)/2}$$

Using the expression $$\binom{n}{m} = \frac{n!}{m!\,(n-m)!}$$ with $$m = (n+k)/2$$ for the binomial coefficients, and the asymptotic equivalent for the factorials given by Stirling's formula, one can obtain good estimates for these probabilities for large values of $$n$$.

Here is a table listing the probabilities up to n=5.
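The formula can also be evaluated exactly; the following sketch tabulates $$\mathbb P(S_n = k)$$ for $$n \le 5$$ (the function name is ours):

```python
from fractions import Fraction
from math import comb

def prob(n, k):
    """P(S_n = k) from the binomial formula; zero when |k| > n or when
    k does not have the same parity as n."""
    if abs(k) > n or (n - k) % 2 != 0:
        return Fraction(0)
    return Fraction(comb(n, (n - k) // 2), 2 ** n)

# Rows of the table of probabilities up to n = 5.
for n in range(1, 6):
    print(n, {k: prob(n, k) for k in range(-n, n + 1, 2)})
```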

Recurrence
Another question is the recurrence of the walk: does it almost surely return only finitely many times, or infinitely many times, to a given state? (In the first case the walk is said to be transient, in the second recurrent.) An equivalent phrasing is: how many times will a random walk pass through a given integer if permitted to continue walking forever? A simple random walk on $$\mathbb Z$$ will cross every point an infinite number of times; that is, the simple random walk on the integers is recurrent. This result has many names: the level-crossing phenomenon, recurrence, or the gambler's ruin. The reason for the last name is as follows: a gambler with a finite amount of money will eventually lose when playing a fair game against a bank with an infinite amount of money. The gambler's money will perform a random walk, it will reach zero at some point, and the game will be over.

The proof of recurrence follows immediately from the estimate of the return probability given by the local central limit theorem, together with the Borel–Cantelli lemma.

Hitting times
If $$a$$ and $$b$$ are positive integers, then the expected number of steps until a one-dimensional simple random walk starting at 0 first hits $$b$$ or $$-a$$ is $$ab$$. The probability that this walk will hit $$b$$ before $$-a$$ is $$a/(a+b)$$, which can be derived from the fact that the simple random walk is a martingale.
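These two values can be recovered numerically by solving the harmonic equations satisfied by the hitting probability and the expected hitting time; a sketch using simple fixed-point sweeps (the function name and iteration count are illustrative):

```python
def gambler_quantities(a, b, sweeps=5000):
    """Probability of hitting b before -a, and expected hitting time, for
    the simple random walk started at 0 and absorbed at -a and b.
    Solves p(x) = (p(x-1) + p(x+1))/2 and e(x) = 1 + (e(x-1) + e(x+1))/2
    on the interior states by repeated sweeps (Gauss-Seidel iteration)."""
    interior = range(-a + 1, b)
    p = {x: 0.0 for x in interior}; p[-a] = 0.0; p[b] = 1.0
    e = {x: 0.0 for x in interior}; e[-a] = 0.0; e[b] = 0.0
    for _ in range(sweeps):
        for x in interior:
            p[x] = 0.5 * (p[x - 1] + p[x + 1])
            e[x] = 1.0 + 0.5 * (e[x - 1] + e[x + 1])
    return p[0], e[0]

hit_b, steps = gambler_quantities(3, 4)
print(hit_b, steps)  # converges to a/(a+b) = 3/7 and ab = 12
```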

As a Markov chain
A more sophisticated but equivalent definition of a symmetric random walk on $$\mathbb Z$$ is as a symmetric Markov chain with state space $$\mathbb Z$$. For example, the simple random walk is the Markov chain whose transition probabilities are $$p(k, k+1) = p(k, k-1) = 1/2$$.

Lattice random walk
A popular random walk model is that of a random walk on a regular lattice, where at each step the location jumps to another site according to some probability distribution. In a simple random walk, the location can only jump to neighbouring sites of the lattice, forming a lattice path. In a simple symmetric random walk on a locally finite lattice, the probabilities of the location jumping to each one of its immediate neighbours are the same. The best studied example is that of the random walk on the $$d$$-dimensional integer lattice (sometimes called the hypercubic lattice) $$\mathbb Z^d$$.

Higher dimensions


In higher dimensions, the set of randomly walked points has interesting geometric properties. In fact, one gets a discrete fractal, that is, a set which exhibits stochastic self-similarity on large scales. On small scales, one can observe "jaggedness" resulting from the grid on which the walk is performed. Two books of Lawler referenced below are a good source on this topic. The trajectory of a random walk is the collection of points visited, considered as a set with disregard to when the walk arrived at the point. In one dimension, the trajectory is simply all points between the minimum height and the maximum height the walk achieved (both are, on average, on the order of $$\sqrt n$$).

To visualize the two dimensional case, one can imagine a person walking randomly around a city. The city is effectively infinite and arranged in a square grid of sidewalks. At every intersection, the person randomly chooses one of the four possible routes (including the one originally traveled from). Formally, this is a random walk on the set of all points in the plane with integer coordinates.

Will the person ever get back to the original starting point of the walk? This is the 2-dimensional equivalent of the level crossing problem discussed above. It turns out that the person almost surely will in a 2-dimensional random walk, but for 3 dimensions or higher, the probability of returning to the origin decreases as the number of dimensions increases. In 3 dimensions, the probability is roughly 34%.

As a direct generalization, one can consider random walks on crystal lattices (infinite-fold abelian covering graphs over finite graphs). In this setting it is possible to establish the central limit theorem and the large deviation theorem.

Setting
Let $$\mathcal G = (V, E)$$ be an undirected graph: the set $$V$$ is the set of vertices of $$\mathcal G$$, and the set $$E$$ of edges is a symmetric subset of $$V \times V$$ (that is $$(x, y) \in E$$ if and only if $$(y, x) \in E$$); two vertices $$x, y \in V$$ are adjacent or neighbours whenever they are joined by an edge, meaning that $$(x, y) \in E$$. The degree of a vertex is the number of incident edges: we use the notation $$d_x$$ for the degree of $$x$$, which is equal to the number of edges $$(x, y) \in E$$ (plus one if $$(x, x) \in E$$: a loop is considered to be twice incident to its base vertex).

A Markov chain on $$V$$ is a random process which can be described in the following way: for any two vertices $$x,y$$ there is a transition probability $$p(x,y)$$ so that these satisfy $$\Sigma_y \, p(x,y) = 1$$, and an initial state $$m$$ (a probability measure on $$V$$). The random walk is a sequence $$(X_t)_{t \ge 1}$$ of random variables with values in $$V$$, such that the law of $$X_1$$ is $$m$$ and the law of $$X_{t+1}$$ is specified by the conditional probabilities $$\mathbb P(X_{t+1} = y | X_t = x) = p(x, y)$$. Such a chain is called reversible (or time-symmetric) if there exists a function $$w: V \to \mathbb R$$ such that $$w(x)p(x,y) = w(y)p(y,x)$$ for all $$x, y \in V$$. Suppose that the Markov chain is reversible and in addition that $$p(x,y) = 0$$ whenever $$(x, y) \not\in E$$ (in this case it is called a nearest neighbour chain). Then there is a nice geometric interpretation for the process: label each edge $$e \in E$$ between vertices $$x,y$$ with the number $$c(e) = w(x)p(x,y)$$ (by the reversibility condition this does not depend on the order of $$x,y$$); if $$X_t = x$$ the probability that $$X_{t+1} = y$$, for a vertex $$y$$ adjacent to $$x$$ via an edge $$e$$, is equal to $$c(e)/w(x)$$. The term random walk is often reserved for these processes, though the nearest neighbour condition is often dropped as it is not important for the class of processes defined. The triple $$(V,E,c)$$ is called a network as it can be used to model an electrical network.

A graph is called locally finite if every vertex has finite degree. On such a graph there is a natural random walk, the simple random walk, defined by taking $$c(e) = 1$$ for all edges, or equivalently as the Markov chain with $$p(x,y) = 1/d_x$$. Informally, this is the random walk which at each step moves to a neighbour chosen uniformly at random.
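A minimal sketch of this definition (the helper name is ours, and the graph is assumed to have no multiple edges):

```python
def simple_rw_kernel(edges):
    """Transition probabilities p(x, y) = 1/d_x of the simple random walk
    on an undirected graph given by a list of edges."""
    nbrs = {}
    for x, y in edges:
        nbrs.setdefault(x, []).append(y)
        nbrs.setdefault(y, []).append(x)
    return {x: {y: 1 / len(ys) for y in ys} for x, ys in nbrs.items()}

# A path a - b - c: the middle vertex jumps to either neighbour with prob 1/2.
P = simple_rw_kernel([("a", "b"), ("b", "c")])
print(P["b"])  # {'a': 0.5, 'c': 0.5}
```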

Properties
If $$X_t$$ is a random walk on a graph $$\mathcal G = (V, E)$$ and $$x, y \in V$$ are vertices, then the probability $$p_n(x, y)$$ for $$n \in \mathbb N$$ is the conditional probability that $$X_n = y$$ given that $$X_0 = x$$. So $$p_1(x, y) = p(x, y)$$, and in general
 * $$ p_n(x,y) = \sum_{(x_0=x,\, x_1, \ldots, x_n=y)} p(x, x_1) p(x_1, x_2) \cdots p(x_{n-1}, y)$$

(the sum is over all paths from $$x$$ to $$y$$ in $$\mathcal G$$, that is, the summation indices satisfy $$(x_i, x_{i+1}) \in E$$ for all $$i=0, \ldots, n-1$$).

The most basic property of a Markov chain is irreducibility: informally this means that any given state is accessible in finite time from any other state; for a random walk it means that the probability that at some time the walk passes through any given vertex $$y$$ is positive (independently of the initial distribution). With the terminology above it can be stated as $$\forall x, y \in V \,\exists n \ge 0: p_n(x, y) > 0$$. If the walk is not irreducible its state space can always be decomposed into irreducible components and the walk studied independently on each of those. A simple way to see whether a random walk on a graph is irreducible is to delete from the graph all edges $$(x, y)$$ with $$p(x, y) = 0$$; if the resulting graph is connected then the walk is irreducible. A simple random walk on a connected graph is always irreducible.

An irreducible random walk is said to be recurrent if it returns infinitely many times to any given vertex (for a simple walk this does not depend on the chosen vertex); this is equivalent to $$\Sigma_{n \ge 0}\, p_n(x,x) = +\infty$$. Otherwise the random walk is said to be transient; equivalently, it almost surely eventually leaves any given finite subset. A graph is said to be transient or recurrent according to whether the simple random walk has the corresponding property. The random walk on a finite graph is always recurrent. The simple random walk on a $$d$$-dimensional lattice is recurrent if and only if $$d=1, 2$$ (Pólya's theorem). The simple random walk on a regular (infinite) tree is always transient. The spectral radius $$\rho$$ of an irreducible random walk is defined by:
 * $$\rho = \limsup_{n \to +\infty} p_n(x, x)^{1/n} $$

which does not depend on the initial distribution or the vertex $$x$$. If $$\rho < 1$$ then the walk is transient: the converse is not true, as illustrated by the random walk on the 3-dimensional lattice.

Random walks are better-behaved than more general Markov chains. Random walks on graphs enjoy a property called time symmetry or reversibility. Roughly speaking, this property, also called the principle of detailed balance, means that the probabilities to traverse a given path in one direction or in the other have a very simple connection between them (if the graph is regular they are in fact equal). This property has important consequences.

The basic paradigm of the study of random walks on graphs and groups is to relate graph-theoretical properties (or algebraic properties in the group case) to the behaviour of the random walk. Particularly simple examples are the detection of irreducibility by connectedness and of transience by the spectral radius given above, but there are many subtler relations.

Random walk on a finite graph
If the graph $$\mathcal G=(V, E)$$ is finite, that is its vertex set $$V$$ is finite, then a random walk $$(X_t)$$ on $$\mathcal G$$ is a Markov chain with finite state space, and as such it is represented by a stochastic matrix $$P$$ with entries indexed by $$V \times V$$: the entry corresponding to a pair $$(x, y)$$ of vertices is the transition probability $$p(y,x)$$. In the case of a simple random walk on a $$d$$-regular graph the matrix is equal to $$1/d$$ times the adjacency matrix of the graph. If the distribution of $$X_1$$ is represented by a vector $$v$$ with entries indexed by $$V$$ summing to 1, then the distribution of $$X_{t+1}$$ is described by the vector $$P^t v$$.

There are two problems which are studied specifically in the context of the random walk on finite graphs: the study of convergence to the equilibrium state of the walk (the stationary measure) and the expected time it takes for the walk to visit all vertices (the cover time). The latter notion is intuitively obvious and can be easily formalised. The former deserves a more detailed commentary. A measure $$\nu$$ on $$V$$ is said to be stationary for the random walk if for all $$x \in V$$, $$\Sigma_y p(y,x)\nu(y) = \nu(x)$$; in other words $$\nu$$ is a fixed vector of the matrix $$P$$ (an eigenvector with eigenvalue 1), and informally, if the initial distribution is $$\nu$$ then it remains so for all time. On any connected finite graph there is a unique stationary probability measure, which gives the vertex $$x$$ the mass $$d_x/M$$ where $$M = \Sigma_y d_y$$ is the total degree: on a regular graph this is the uniform distribution. If the graph $$\mathcal G$$ is not bipartite then the distribution $$P^t v$$ converges to the stationary measure for any initial distribution $$v$$; on a bipartite graph the random walk may oscillate between the two colour classes of vertices. Once the limiting distribution $$\nu$$ is known it is natural to ask how fast the convergence is. This is done by studying the mixing rate of the random walk, which is defined (for a non-bipartite graph) by:
 * $$ \mu = \limsup_{t \to +\infty} \max_{x, y \in V} \left( p_t(x, y) - \nu(y) \right)^{\frac 1 t}$$

(there is a similar, albeit more involved definition for the bipartite case which measures the convergence rate to the limiting regime of alternating between two states).
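The stationary measure $$\nu(x) = d_x/M$$ and the convergence $$P^t v \to \nu$$ on a non-bipartite graph can be checked numerically; a sketch on a small example graph chosen here for illustration (a triangle with one pendant vertex):

```python
def evolve(nbrs, v, steps):
    """Push a distribution v through the simple random walk kernel
    p(x, y) = 1/d_x for the given number of steps."""
    for _ in range(steps):
        w = {x: 0.0 for x in nbrs}
        for x, ys in nbrs.items():
            for y in ys:
                w[y] += v[x] / len(ys)
        v = w
    return v

# Triangle a, b, c with a pendant vertex d attached to c (non-bipartite).
nbrs = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
v = evolve(nbrs, {"a": 1.0, "b": 0.0, "c": 0.0, "d": 0.0}, 500)
M = sum(len(ys) for ys in nbrs.values())       # total degree = 8
print({x: round(v[x], 4) for x in nbrs})       # close to d_x / M
```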

The problem is to provide upper and lower bounds for both the cover time and the mixing rate. There are general results in this direction, and also sharper results for certain classes of finite graphs. The mixing rate is related to the spectral gap of the matrix $$P$$. Suppose, to simplify, that the matrix $$P$$ is symmetric (which amounts to the graph $$\mathcal G$$ being regular) and that $$\mathcal G$$ is not bipartite. Let $$n$$ be the number of vertices of $$\mathcal G$$; then $$P$$ has $$n$$ eigenvalues $$\lambda_1 = 1 > \lambda_2 \ge \ldots \ge \lambda_n > -1$$ (the last inequality being strict is a consequence of $$\mathcal G$$ not being bipartite), and the mixing rate is equal to $$\max(|\lambda_2|, |\lambda_n|)$$. Thus graphs with large spectral gaps lead to faster convergence towards equilibrium: such graphs are commonly called expander graphs. Some general results on the cover time are:
 * There are constants $$C, c > 0$$ such that the cover time of any graph on n vertices is at most $$Cn^3$$ and at least $$cn\log(n)$$ (for any $$\varepsilon > 0$$, for large enough $$n$$ one can take $$C = 4/27 + \varepsilon$$ and $$c = 1 + \varepsilon$$).
 * The cover time of any regular graph on n vertices is at most $$2n^2$$.

The fact that the mixing rate $$\mu = \max(|\lambda_2|, |\lambda_n|)$$ is smaller than 1 indicates that a random walk on a (non-bipartite) graph always converges exponentially to the stationary measure. For applications it is important to know the speed at which this convergence takes place. An often-used way of quantifying this is by using the total variation distance between the distribution at time $$t$$ and the limit distribution $$\nu$$. The total variation of a function $$f: V \to \mathbb R$$ is defined by
 * $$ \| f \|_{TV} = \max_{A \subset V} \left|\sum_{a\in A} f(a)\right| $$

and the problem is then to estimate precisely how fast $$\max_{x \in V} \|p_t(x, \cdot) - \nu \|_{TV}$$ decreases with $$t$$. This decay is eventually exponential with rate $$\mu$$, but the fast decay may take some time to kick in. This is measured by the mixing time, which may be defined by:
 * $$T(\mathcal G, p, \varepsilon) = \min \left( t \ge 1: \max_{x \in V} \| p_t(x, \cdot) - \nu \|_{TV} \le \varepsilon \right)$$.

The mixing time is significant only for $$\varepsilon < 1/2$$. It is often defined for a specific value in this range, for example $$1/4$$ or $$1/(2e)$$. The most general bounds for $$T$$ in the significant range are given by:
 * $$D/2 \le T(\mathcal G, p, \varepsilon) \le 1 + \frac{\log\left(1/(2\varepsilon \nu_\min)\right)}{1 - \mu} $$

where $$D$$ is the diameter of $$\mathcal G$$ (the maximal number of edges of a shortest path between two vertices) and $$\nu_\min = \min_{x\in V} \nu(x)$$. The lower bound is far from sharp in general.

Random walk on an infinite graph
On infinite connected graphs one can study the asymptotic behaviour of trajectories. The most basic question is to establish whether the random walk is transient or recurrent. For this there is a basic criterion which can be formulated as follows: let $$(V, E, c)$$ be a network, fix any vertex $$x_0 \in V$$ and let $$u$$ be an antisymmetric function on $$E$$ (that is, $$u(y, x) = -u(x, y)$$) such that $$\Sigma_y u(x,y) = 0$$ for all $$x \neq x_0$$ and $$\Sigma_y u(x_0, y) = 1$$ (such a function is called a flow from $$x_0$$ to infinity). Then the random walk is transient if and only if there exists such a function which also has finite energy in the sense that $$\Sigma_e c(e)^{-1} u(e)^2 < +\infty $$; this criterion does not depend on the base vertex $$x_0$$.

Applications of this include proving Pólya's theorem without computing the return probabilities, and proving that the simple random walk on an infinite tree is transient. The last result is true more generally for (non-elementary) hyperbolic graphs, a vast class including for example the skeletons of tessellations of the hyperbolic plane.

Once the random walk is known to be transient more precise questions can be asked. First, how fast does it escape to infinity? The roughest estimate of this escape rate is its linear speed given by:
 * $$ \liminf_{t \to +\infty} \left( \frac 1 t \mathbb E\left(d(x_0, X_t)\right) \right) \in [0, 1] $$.

We say that the walk has a linear escape rate if this is positive, sublinear otherwise. The spectral radius detects graphs with a linear escape rate:
 * A random walk has a linear escape rate if $$\rho < 1$$. 

This implies that simple random walks on trees and hyperbolic graphs have a linear rate of escape.

Secondly, transience means that in the one-point compactification of the metric realisation of $$\mathcal G$$ the random walk almost surely converges to the point at infinity. The natural question is whether there exist finer compactifications in which the random walk converges almost surely to a point of the boundary. The notion of Poisson boundary gives a measure-theoretic context in which to work on this problem. In many cases where the spectral radius is 1 (for example Pólya's walk) the Poisson boundary is trivial and brings no new information. In other cases it is nontrivial and can be realised as a topological compactification, the Martin compactification. For example:
 * In a regular tree, the simple random walk almost surely converges to an end;
 * In a regular hyperbolic graph, the simple random walk almost surely converges to a point in the Gromov boundary.

Random walk on random graphs
If the transition kernel $$p(x,y)$$ is itself random (based on an environment $$\omega$$) then the random walk is called a "random walk in random environment". When the law of the random walk includes the randomness of $$\omega$$, the law is called the annealed law; on the other hand, if $$\omega$$ is seen as fixed, the law is called a quenched law. See the book of Hughes, the book of Révész, or the lecture notes of Zeitouni.

In the context of random graphs, particularly that of the Erdős–Rényi model, analytical results for some properties of random walkers have been obtained. These include the distribution of first and last hitting times of the walker, where the first hitting time is given by the first time the walker steps into a previously visited site of the graph, and the last hitting time corresponds to the first time the walker cannot perform an additional move without revisiting a previously visited site.

Setting
Let $$G$$ be a group and $$q$$ a probability measure on $$G$$. Suppose that $$q$$ is symmetric, that is $$q(g) = q(g^{-1})$$ for all $$g \in G$$. Then a random walk on $$G$$ with step distribution $$q$$ is a Markov chain on $$G$$ with transition probabilities $$p(x, y) = q(x^{-1}y)$$. In other words the probability of jumping from $$x$$ to $$xg$$ is equal to $$q(g)$$. Such a process is always reversible; in fact it is symmetric since
 * $$p(y, x) = q(y^{-1}x) = q\left( (x^{-1}y)^{-1} \right) = q(x^{-1}y) = p(x, y)$$.

It is irreducible if and only if the support of $$q$$ generates $$G$$. An important property is that the transition kernel is $$G$$-invariant, that is $$p(gx, gy) = p(x, y)$$; this implies for example that the behaviour of the walk is exactly the same for any starting point.
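The symmetry computation above can be illustrated on a concrete group; the sketch below (with a step distribution chosen here for illustration: the uniform measure on $$\{\pm 1\}$$ in the cyclic group $$\mathbb Z/7\mathbb Z$$, written additively) checks that the kernel $$p(x, y) = q(x^{-1}y)$$ is symmetric:

```python
def group_walk_kernel(n, q):
    """Transition kernel p(x, y) = q(x^{-1} y) on the cyclic group Z/nZ,
    written additively, so x^{-1} y is (y - x) mod n."""
    return {(x, y): q.get((y - x) % n, 0.0) for x in range(n) for y in range(n)}

n = 7
q = {1: 0.5, n - 1: 0.5}          # symmetric: q(g) = q(g^{-1})
p = group_walk_kernel(n, q)
# The kernel is symmetric in x and y, hence reversible with w == 1.
print(all(abs(p[x, y] - p[y, x]) < 1e-12 for x in range(n) for y in range(n)))
```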

Still another interpretation is as a random walk on the Cayley graph of $$G$$ with respect to the support of $$q$$, with probability $$q(g)$$ associated to the edge $$(x, xg)$$. In case $$q$$ is the uniform probability on a symmetric generating set $$S$$ for $$G$$, the walk is the simple random walk on the Cayley graph, and is called the simple random walk on $$(G, S)$$.

Random walk on a finite group
Because of transitivity, random walks on finite groups are better behaved than random walks on more general graphs. In addition the group structure can be used to prove better results than in the general transitive case, or used to great effect in special cases. An important phenomenon for large finite groups, discovered by Diaconis and Bayer, is the cutoff phenomenon: in some families of groups with cardinality going to infinity, well-chosen simple random walks exhibit an abrupt behaviour, meaning that until a certain time the distance to the stationary measure remains essentially constant, after which it starts decreasing fast. Conjecturally this should be a generic phenomenon; there are various special cases where it is known to take place, with explicit estimates for the decay after the cutoff time. Further examples include:
 * Cayley graphs of simple groups are very often (conjecturally always) good expanders.
 * Spectral estimates for convergence to the stationary measure are more precise than in general.

Random walks on discrete groups
As in the finite case, it is possible to prove general results for simple random walks on finitely generated groups which are unattainable in a more general setting; the theory of the Poisson boundary, for instance, is particularly developed for discrete groups. For example:
 * The simple random walk on an infinite finitely generated group is transient if and only if the group is not commensurable to $$\mathbb Z$$ or $$\mathbb Z^2$$.
 * (Kesten's theorem) The spectral radius $$\rho$$ is equal to 1 if and only if the group is amenable.

Random products of matrices
Suppose that $$\mu$$ is a measure on the linear group $$\mathrm{GL}_d(\mathbb R)$$ and let $$(X_t)$$ be the random walk with step distribution $$\mu$$ and initial distribution supported at the identity matrix. Then $$X_t$$ is just a product of $$t$$ matrices chosen independently at random according to the law $$\mu$$. The additional linear structure makes it possible to consider the norm of a matrix (for example the operator norm), or the matrix coefficients (linear functionals of the columns), and it is natural to ask the following questions:
 * What is the asymptotic behaviour of the norm $$\| X_t \|$$?
 * What is the asymptotic behaviour of the matrix coefficients?
Under reasonable hypotheses there are very precise answers to these two questions, in particular analogues of the law of large numbers, the central limit theorem, the large deviation principle and the law of the iterated logarithm.
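As an illustration of the first question, one can estimate the exponential growth rate of $$\|X_t\|$$ for a toy step distribution; the measure chosen below (uniform on two standard generators of $$\mathrm{SL}_2(\mathbb Z)$$) is our own illustrative choice, and the Frobenius norm stands in for the operator norm:

```python
import random, math

def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def growth_rate(t, seed=0):
    """Estimate (1/t) log ||X_t|| for X_t a product of t matrices drawn
    uniformly from two SL_2 generators, renormalizing at each step to
    avoid overflow while accumulating the log of the norm."""
    random.seed(seed)
    gens = [[[1.0, 1.0], [0.0, 1.0]], [[1.0, 0.0], [1.0, 1.0]]]
    X = [[1.0, 0.0], [0.0, 1.0]]
    log_norm = 0.0
    for _ in range(t):
        X = mat_mul(X, random.choice(gens))
        n = math.sqrt(sum(X[i][j] ** 2 for i in range(2) for j in range(2)))
        X = [[X[i][j] / n for j in range(2)] for i in range(2)]
        log_norm += math.log(n)
    return log_norm / t

print(growth_rate(20000))  # positive: the norm grows exponentially
```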