Potts model

In statistical mechanics, the Potts model, a generalization of the Ising model, is a model of interacting spins on a crystalline lattice. By studying the Potts model, one may gain insight into the behaviour of ferromagnets and certain other phenomena of solid-state physics. The strength of the Potts model is not so much that it models these physical systems well; it is rather that the one-dimensional case is exactly solvable, and that it has a rich mathematical formulation that has been studied extensively.

The model is named after Renfrey Potts, who described the model near the end of his 1951 Ph.D. thesis. The model was related to the "planar Potts" or "clock model", which was suggested to him by his advisor, Cyril Domb. The four-state Potts model is sometimes known as the Ashkin–Teller model, after Julius Ashkin and Edward Teller, who considered an equivalent model in 1943.

The Potts model is related to, and generalized by, several other models, including the XY model, the Heisenberg model and the N-vector model. The infinite-range Potts model is known as the Kac model. When the spins are taken to interact in a non-Abelian manner, the model is related to the flux tube model, which is used to discuss confinement in quantum chromodynamics. Generalizations of the Potts model have also been used to model grain growth in metals, coarsening in foams, and statistical properties of proteins. A further generalization of these methods by James Glazier and Francois Graner, known as the cellular Potts model, has been used to simulate static and kinetic phenomena in foam and biological morphogenesis.

Vector Potts model
The Potts model consists of spins that are placed on a lattice; the lattice is usually taken to be a two-dimensional rectangular Euclidean lattice, but is often generalized to other dimensions and lattice structures.

Originally, Domb suggested that the spin takes one of $$q$$ possible values, distributed uniformly about the circle, at angles
 * $$\theta_s = \frac{2\pi s}{q},$$

where $$s = 0, 1, ..., q-1$$ and that the interaction Hamiltonian is given by
 * $$H_c = J_c\sum_{\langle i, j \rangle} \cos \left( \theta_{s_i} - \theta_{s_j} \right)$$

with the sum running over the nearest neighbor pairs $$\langle i,j \rangle$$ over all lattice sites, and $$J_c$$ is a coupling constant, determining the interaction strength. This model is now known as the vector Potts model or the clock model. Potts provided the location in two dimensions of the phase transition for $$q = 3,4$$. In the limit $$q \to \infty$$, this becomes the XY model.

Standard Potts model
What is now known as the standard Potts model was suggested by Potts in the course of his study of the model above and is defined by a simpler Hamiltonian:
 * $$H_p = -J_p \sum_{(i,j)}\delta(s_i,s_j) \,$$

where $$\delta(s_i, s_j)$$ is the Kronecker delta, which equals one whenever $$s_i = s_j$$ and zero otherwise.

The $$q=2$$ standard Potts model is equivalent to the Ising model and the 2-state vector Potts model, with $$J_p = -2J_c$$. The $$q=3$$ standard Potts model is equivalent to the three-state vector Potts model, with $$J_p = -\frac{3}{2}J_c$$.

Generalized Potts model
A generalization of the Potts model is often used in statistical inference and biophysics, particularly for modelling proteins through direct coupling analysis. This generalized Potts model consists of 'spins' that each may take on $$q$$ states: $$s_i \in \{1,\dots,q\}$$ (with no particular ordering). The Hamiltonian is,

H = \sum_{i < j} J_{ij}(s_i,s_j) + \sum_i h_i(s_i), $$ where $$J_{ij}(k,k')$$ is the energetic cost of spin $$i$$ being in state $$k$$ while spin $$j$$ is in state $$k'$$, and $$h_i(k)$$ is the energetic cost of spin $$i$$ being in state $$k$$. Note: $$J_{ij}(k,k') = J_{ji}(k',k)$$. This model resembles the Sherrington-Kirkpatrick model in that couplings can be heterogeneous and non-local. There is no explicit lattice structure in this model.

Phase transitions
Despite its simplicity as a model of a physical system, the Potts model is useful as a model system for the study of phase transitions. For example, for the standard ferromagnetic Potts model in $$2d$$, a phase transition exists for all real values $$q \geq 1$$, with the critical point at $$\beta J = \log(1 + \sqrt{q})$$. The phase transition is continuous (second order) for $$1 \leq q \leq 4$$ and discontinuous (first order) for $$q > 4$$.

For the clock model, there is evidence that the corresponding phase transitions are infinite order BKT transitions, and a continuous phase transition is observed when $$q \leq 4$$. Further use is found through the model's relation to percolation problems and the Tutte and chromatic polynomials found in combinatorics. For integer values of $$q \geq 3$$, the model displays the phenomenon of 'interfacial adsorption' with intriguing critical wetting properties when fixing opposite boundaries in two different states.

Relation with the random cluster model
The Potts model has a close relation to the Fortuin-Kasteleyn random cluster model, another model in statistical mechanics. Understanding this relationship has helped develop efficient Markov chain Monte Carlo methods for numerical exploration of the model at small $$q$$, and led to the rigorous proof of the critical temperature of the model.

At the level of the partition function $$Z_p = \sum_{\{s_i\}} e^{-H_p}$$, the relation amounts to transforming the sum over spin configurations $$\{s_i\}$$ into a sum over edge configurations $$\omega=\Big\{(i,j)\Big|s_i=s_j\Big\}$$ i.e. sets of nearest neighbor pairs of the same color. The transformation is done using the identity

e^{J_p\delta(s_i,s_j)} = 1 + v \delta(s_i,s_j) \qquad \text{ with } \qquad v = e^{J_p}-1 \. $$ This leads to rewriting the partition function as

Z_p = \sum_\omega v^{\#\text{edges}(\omega)} q^{\#\text{clusters}(\omega)} $$ where the  FK clusters are the connected components of the union of closed segments $$\cup_{(i,j)\in\omega}[i,j]$$. This is proportional to the partition function of the random cluster model with the open edge probability $$p=\frac{v}{1+v}=1-e^{-J_p}$$. An advantage of the random cluster formulation is that $$q$$ can be an arbitrary complex number, rather than a natural integer.

Alternatively, instead of FK clusters, the model can be formulated in terms of spin clusters, using the identity

e^{J_p\delta(s_i,s_j)} = (1 - \delta(s_i,s_j)) + e^{J_p} \delta(s_i,s_j)\. $$ A spin cluster is the union of neighbouring FK clusters with the same color: two neighbouring spin clusters have different colors, while two neighbouring FK clusters are colored independently.

Measure-theoretic description
The one dimensional Potts model may be expressed in terms of a subshift of finite type, and thus gains access to all of the mathematical techniques associated with this formalism. In particular, it can be solved exactly using the techniques of transfer operators. (However, Ernst Ising used combinatorial methods to solve the Ising model, which is the "ancestor" of the Potts model, in his 1924 PhD thesis). This section develops the mathematical formalism, based on measure theory, behind this solution.

While the example below is developed for the one-dimensional case, many of the arguments, and almost all of the notation, generalizes easily to any number of dimensions. Some of the formalism is also broad enough to handle related models, such as the XY model, the Heisenberg model and the N-vector model.

Topology of the space of states
Let Q = {1, ..., q} be a finite set of symbols, and let
 * $$Q^\mathbf{Z}=\{ s=(\ldots,s_{-1},s_0,s_1,\ldots) : s_k \in Q \; \forall k \in \mathbf{Z} \}$$

be the set of all bi-infinite strings of values from the set Q. This set is called a full shift. For defining the Potts model, either this whole space, or a certain subset of it, a subshift of finite type, may be used. Shifts get this name because there exists a natural operator on this space, the shift operator τ : QZ → QZ, acting as
 * $$\tau (s)_k = s_{k+1}$$

This set has a natural product topology; the base for this topology are the cylinder sets
 * $$C_m[\xi_0, \ldots, \xi_k]= \{s \in Q^\mathbf{Z} : s_m = \xi_0, \ldots ,s_{m+k} = \xi_k \}$$

that is, the set of all possible strings where k+1 spins match up exactly to a given, specific set of values ξ0, ..., ξk. Explicit representations for the cylinder sets can be gotten by noting that the string of values corresponds to a q-adic number, however the natural topology of the q-adic numbers is finer than the above product topology.

Interaction energy
The interaction between the spins is then given by a continuous function V : QZ → R on this topology. Any continuous function will do; for example
 * $$V(s) = -J\delta(s_0,s_1)$$

will be seen to describe the interaction between nearest neighbors. Of course, different functions give different interactions; so a function of s0, s1 and s2 will describe a next-nearest neighbor interaction. A function V gives interaction energy between a set of spins; it is not the Hamiltonian, but is used to build it. The argument to the function V is an element s ∈ QZ, that is, an infinite string of spins. In the above example, the function V just picked out two spins out of the infinite string: the values s0 and s1. In general, the function V may depend on some or all of the spins; currently, only those that depend on a finite number are exactly solvable.

Define the function Hn : QZ → R as
 * $$H_n(s)= \sum_{k=0}^n V(\tau^k s)$$

This function can be seen to consist of two parts: the self-energy of a configuration [s0, s1, ..., sn] of spins, plus the interaction energy of this set and all the other spins in the lattice. The n → ∞ limit of this function is the Hamiltonian of the system; for finite n, these are sometimes called the finite state Hamiltonians.

Partition function and measure
The corresponding finite-state partition function is given by
 * $$Z_n(V) = \sum_{s_0,\ldots,s_n \in Q} \exp(-\beta H_n(C_0[s_0,s_1,\ldots,s_n]))$$

with C0 being the cylinder sets defined above. Here, β = 1/kT, where k is the Boltzmann constant, and T is the temperature. It is very common in mathematical treatments to set β = 1, as it is easily regained by rescaling the interaction energy. This partition function is written as a function of the interaction V to emphasize that it is only a function of the interaction, and not of any specific configuration of spins. The partition function, together with the Hamiltonian, are used to define a measure on the Borel σ-algebra in the following way: The measure of a cylinder set, i.e. an element of the base, is given by
 * $$\mu (C_k[s_0,s_1,\ldots,s_n]) = \frac{1}{Z_n(V)}  \exp(-\beta H_n (C_k[s_0,s_1,\ldots,s_n]))$$

One can then extend by countable additivity to the full σ-algebra. This measure is a probability measure; it gives the likelihood of a given configuration occurring in the configuration space QZ. By endowing the configuration space with a probability measure built from a Hamiltonian in this way, the configuration space turns into a canonical ensemble.

Most thermodynamic properties can be expressed directly in terms of the partition function. Thus, for example, the Helmholtz free energy is given by
 * $$A_n(V)=-kT \log Z_n(V)$$

Another important related quantity is the topological pressure, defined as
 * $$P(V) = \lim_{n\to\infty} \frac{1}{n} \log Z_n(V)$$

which will show up as the logarithm of the leading eigenvalue of the transfer operator of the solution.

Free field solution
The simplest model is the model where there is no interaction at all, and so V = c and Hn = c (with c constant and independent of any spin configuration). The partition function becomes
 * $$Z_n(c) = e^{-c\beta} \sum_{s_0,\ldots,s_n \in Q} 1$$

If all states are allowed, that is, the underlying set of states is given by a full shift, then the sum may be trivially evaluated as
 * $$Z_n(c) = e^{-c\beta} q^{n+1}$$

If neighboring spins are only allowed in certain specific configurations, then the state space is given by a subshift of finite type. The partition function may then be written as
 * $$Z_n(c) = e^{-c\beta} |\mbox{Fix}\, \tau^n| = e^{-c\beta} \mbox{Tr} A^n$$

where card is the cardinality or count of a set, and Fix is the set of fixed points of the iterated shift function:
 * $$\mbox{Fix}\, \tau^n = \{ s \in Q^\mathbf{Z} : \tau^n s = s \}$$

The q × q matrix A is the adjacency matrix specifying which neighboring spin values are allowed.

Interacting model
The simplest case of the interacting model is the Ising model, where the spin can only take on one of two values, sn ∈ $\{−1, 1\}$ and only nearest neighbor spins interact. The interaction potential is given by
 * $$V(\sigma) = -J_p s_0 s_1\,$$

This potential can be captured in a 2 × 2 matrix with matrix elements
 * $$M_{\sigma \sigma'} = \exp \left( \beta J_p \sigma \sigma' \right)$$

with the index σ, σ′ ∈ {−1, 1}. The partition function is then given by
 * $$Z_n(V) = \mbox{Tr}\, M^n$$

The general solution for an arbitrary number of spins, and an arbitrary finite-range interaction, is given by the same general form. In this case, the precise expression for the matrix M is a bit more complex.

The goal of solving a model such as the Potts model is to give an exact closed-form expression for the partition function and an expression for the Gibbs states or equilibrium states in the limit of n → ∞, the thermodynamic limit.

Signal and image processing
The Potts model has applications in signal reconstruction. Assume that we are given noisy observation of a piecewise constant signal g in Rn. To recover g from the noisy observation vector f in Rn, one seeks a minimizer of the corresponding inverse problem, the Lp-Potts functional Pγ(u), which is defined by
 * $$ P_\gamma(u) = \gamma \| \nabla u \|_0 + \| u-f\|_p^p = \gamma \# \{ i : u_i \neq u_{i+1} \} + \sum_{i=1}^n |u_i - f_i|^p$$

The jump penalty $$\| \nabla u \|_0$$ forces piecewise constant solutions and the data term $$\| u-f\|_p^p$$ couples the minimizing candidate u to the data f. The parameter γ > 0 controls the tradeoff between regularity and data fidelity. There are fast algorithms for the exact minimization of the L1 and the L2-Potts functional.

In image processing, the Potts functional is related to the segmentation problem. However, in two dimensions the problem is NP-hard.