Relativistic wave equations

In physics, specifically relativistic quantum mechanics (RQM) and its applications to particle physics, relativistic wave equations predict the behavior of particles at high energies and velocities comparable to the speed of light. In the context of quantum field theory (QFT), the equations determine the dynamics of quantum fields. The solutions to the equations, universally denoted as $ψ$ or $Ψ$ (Greek psi), are referred to as "wave functions" in the context of RQM, and "fields" in the context of QFT. The equations themselves are called "wave equations" or "field equations", because they have the mathematical form of a wave equation or are generated from a Lagrangian density and the field-theoretic Euler–Lagrange equations (see classical field theory for background).

In the Schrödinger picture, the wave function or field is the solution to the Schrödinger equation, $$ i\hbar\frac{\partial}{\partial t}\psi = \hat{H} \psi,$$ which is one of the postulates of quantum mechanics. All relativistic wave equations can be constructed by specifying various forms of the Hamiltonian operator Ĥ describing the quantum system. Alternatively, Feynman's path integral formulation uses a Lagrangian rather than a Hamiltonian operator.

More generally, the modern formalism behind relativistic wave equations is Lorentz group theory, wherein the spin of the particle corresponds to a representation of the Lorentz group.

Early 1920s: Classical and quantum mechanics
The failure of classical mechanics applied to molecular, atomic, and nuclear systems and smaller induced the need for a new mechanics: quantum mechanics. The mathematical formulation was led by de Broglie, Bohr, Schrödinger, Pauli, Heisenberg, and others, around the mid-1920s, and at that time was analogous to that of classical mechanics. The Schrödinger equation and the Heisenberg picture resemble the classical equations of motion in the limit of large quantum numbers and as the reduced Planck constant $ħ$, the quantum of action, tends to zero. This is the correspondence principle. At this point, special relativity was not fully combined with quantum mechanics, so the Schrödinger and Heisenberg formulations, as originally proposed, could not be used in situations where the particles travel near the speed of light, or when the number of each type of particle changes (as happens in real particle interactions: the numerous forms of particle decay, annihilation, matter creation, pair production, and so on).

Late 1920s: Relativistic quantum mechanics of spin-0 and spin-1/2 particles
A description of quantum mechanical systems which could account for relativistic effects was sought by many theoretical physicists from the late 1920s to the mid-1940s. The first basis for relativistic quantum mechanics, i.e. special relativity combined with quantum mechanics, was found by all those who discovered what is frequently called the Klein–Gordon equation:

$$-\hbar^2\frac{\partial^2 \psi}{\partial t^2} + (\hbar c)^2\nabla^2\psi = (mc^2)^2\psi$$

by inserting the energy operator and momentum operator into the relativistic energy–momentum relation:

$$E^2 - (pc)^2 = (mc^2)^2$$

The solutions to the Klein–Gordon equation are scalar fields. The equation is undesirable due to its prediction of negative energies and probabilities, a result of the quadratic nature of the energy–momentum relation, which is inevitable in a relativistic theory. This equation was initially proposed by Schrödinger, and he discarded it for such reasons, only to realize a few months later that its non-relativistic limit (what is now called the Schrödinger equation) was still of importance. Nevertheless, the Klein–Gordon equation is applicable to spin-0 bosons.
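As a rough numerical illustration of the statements above (not part of the historical derivation), the sketch below checks by finite differences that a plane wave solves the Klein–Gordon equation exactly when its energy obeys the energy–momentum relation, for either sign of the energy. It assumes one spatial dimension and natural units ($ħ = c = 1$); the function names, the sample point, and the step size `h` are illustrative choices.

```python
import cmath
import math


def plane_wave(t, x, energy, momentum):
    """Plane wave exp(i(p x - E t)) in natural units (hbar = c = 1)."""
    return cmath.exp(1j * (momentum * x - energy * t))


def klein_gordon_residual(t, x, energy, momentum, mass, h=1e-3):
    """Central-difference estimate of -d2psi/dt2 + d2psi/dx2 - m^2 psi."""

    def psi(tt, xx):
        return plane_wave(tt, xx, energy, momentum)

    d2t = (psi(t + h, x) - 2 * psi(t, x) + psi(t - h, x)) / h**2
    d2x = (psi(t, x + h) - 2 * psi(t, x) + psi(t, x - h)) / h**2
    return -d2t + d2x - mass**2 * psi(t, x)


mass, momentum = 1.0, 0.3
on_shell = math.sqrt(momentum**2 + mass**2)  # E from E^2 = p^2 + m^2

# Positive root, negative root, and a deliberately wrong energy.
res_on = abs(klein_gordon_residual(0.7, -0.2, on_shell, momentum, mass))
res_neg = abs(klein_gordon_residual(0.7, -0.2, -on_shell, momentum, mass))
res_off = abs(klein_gordon_residual(0.7, -0.2, 2 * on_shell, momentum, mass))
```

With these inputs `res_on` and `res_neg` sit at the level of discretization error, while `res_off` is of order one. The negative-energy root passes the same check as the positive one, which is the feature described above as problematic.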

Neither the non-relativistic nor the relativistic equation found by Schrödinger could predict the fine structure in the hydrogen spectral series. The mysterious underlying property was spin. The first two-dimensional spin matrices (better known as the Pauli matrices) were introduced by Pauli in the Pauli equation: the Schrödinger equation with a non-relativistic Hamiltonian including an extra term for particles in magnetic fields, but this was phenomenological. Weyl found a relativistic equation in terms of the Pauli matrices, the Weyl equation, for massless spin-1/2 fermions. The problem was resolved by Dirac in the late 1920s, when he furthered the application of the energy–momentum relation to the electron; by various manipulations he factorized the equation into the form:

$$\left(\frac{E}{c} - \boldsymbol{\alpha}\cdot\mathbf{p} - \beta mc\right)\left(\frac{E}{c} + \boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc\right)\psi = 0$$

and one of these factors is the Dirac equation (see below), upon inserting the energy and momentum operators. For the first time, this introduced new four-dimensional spin matrices $α$ and $β$ in a relativistic wave equation, and explained the fine structure of hydrogen. The solutions to the Dirac equation are multi-component spinor fields, and each component satisfies the Klein–Gordon equation. A remarkable result of spinor solutions is that half of the components describe a particle while the other half describe an antiparticle; in this case the electron and positron. The Dirac equation is now known to apply to all massive spin-1/2 fermions. In the non-relativistic limit, the Pauli equation is recovered, while the massless case results in the Weyl equation.
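The factorization works because the matrices $α$ and $β$ anticommute with one another and square to the identity, so that $(\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc)^2 = p^2 + (mc)^2$. The following is a minimal sketch of that check, assuming the standard Dirac representation built from the Pauli matrices (the text does not fix a representation) and units with $c = 1$; the helper names and the sample momentum are hypothetical.

```python
def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def linear_combination(coeffs, mats):
    """Return sum_k coeffs[k] * mats[k] for equally sized square matrices."""
    n = len(mats[0])
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]


def off_diagonal(s):
    """4x4 block matrix [[0, s], [s, 0]] built from a 2x2 matrix s."""
    return [[0, 0] + s[0], [0, 0] + s[1], s[0] + [0, 0], s[1] + [0, 0]]


SIGMA = [
    [[0, 1], [1, 0]],        # sigma_x
    [[0, -1j], [1j, 0]],     # sigma_y
    [[1, 0], [0, -1]],       # sigma_z
]
ALPHA = [off_diagonal(s) for s in SIGMA]  # assumed Dirac representation
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def dirac_square_deviation(p, m):
    """Largest entry of (alpha.p + beta m)^2 - (p^2 + m^2) I, with c = 1."""
    h = linear_combination(list(p) + [m], ALPHA + [BETA])
    h2 = matmul(h, h)
    target = sum(x * x for x in p) + m * m
    return max(abs(h2[i][j] - (target if i == j else 0))
               for i in range(4) for j in range(4))


deviation = dirac_square_deviation((0.3, -1.2, 0.5), 2.0)
```

A `deviation` at the level of floating-point rounding indicates that the product of the two linear factors reproduces the quadratic energy–momentum relation for this sample momentum.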

Although a landmark in quantum theory, the Dirac equation is only true for spin-1/2 fermions, and still predicts negative energy solutions, which caused controversy at the time (in particular, not all physicists were comfortable with the "Dirac sea" of negative energy states).

1930s–1960s: Relativistic quantum mechanics of higher-spin particles
The natural problem became clear: to generalize the Dirac equation to particles with any spin; both fermions and bosons, and in the same equations their antiparticles (possible because of the spinor formalism introduced by Dirac in his equation, and then-recent developments in spinor calculus by van der Waerden in 1929), and ideally with positive energy solutions.

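This was introduced and solved by Majorana in 1932, by an approach that deviated from Dirac's. Majorana considered one "root" of the factorized equation:

$$\left(\frac{E}{c} + \boldsymbol{\alpha}\cdot\mathbf{p} - \beta mc\right)\psi = 0$$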

where $ψ$ is a spinor field now with infinitely many components, irreducible to a finite number of tensors or spinors, to remove the indeterminacy in sign. The matrices $α$ and $β$ are infinite-dimensional matrices, related to infinitesimal Lorentz transformations. He did not demand that each component of $ψ$ satisfy the Klein–Gordon equation; instead he regenerated the equation using a Lorentz-invariant action, via the principle of least action, and application of Lorentz group theory.

Majorana produced other important contributions that were unpublished, including wave equations of various dimensions (5, 6, and 16). They were anticipated later (in a more involved way) by de Broglie (1934), and by Duffin, Kemmer, and Petiau (around 1938–1939); see Duffin–Kemmer–Petiau algebra. The Dirac–Fierz–Pauli formalism was more sophisticated than Majorana's, as spinors were new mathematical tools in the early twentieth century, although Majorana's paper of 1932 was itself difficult to understand fully; it took Pauli and Wigner some time to understand it, around 1940.

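Dirac in 1936, and Fierz and Pauli in 1939, built equations from irreducible spinors $A$ and $B$, symmetric in all indices, for a massive particle of spin $n + 1/2$ for integer $n$ (see Van der Waerden notation for the meaning of the dotted indices):

$$p_{\gamma\dot{\alpha}} A_{\epsilon_1\epsilon_2\cdots\epsilon_n}^{\dot{\alpha}\dot{\beta}_1\dot{\beta}_2\cdots\dot{\beta}_n} = mc\, B_{\gamma\epsilon_1\epsilon_2\cdots\epsilon_n}^{\dot{\beta}_1\dot{\beta}_2\cdots\dot{\beta}_n}$$

$$p^{\gamma\dot{\alpha}} B_{\gamma\epsilon_1\epsilon_2\cdots\epsilon_n}^{\dot{\beta}_1\dot{\beta}_2\cdots\dot{\beta}_n} = mc\, A_{\epsilon_1\epsilon_2\cdots\epsilon_n}^{\dot{\alpha}\dot{\beta}_1\dot{\beta}_2\cdots\dot{\beta}_n}$$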

where $p$ is the momentum as a covariant spinor operator. For $n = 0$, the equations reduce to the coupled Dirac equations, and $A$ and $B$ together transform as the original Dirac spinor. Eliminating either $A$ or $B$ shows that $A$ and $B$ each fulfill the Klein–Gordon equation. A direct derivation of the Dirac–Pauli–Fierz equations using the Bargmann–Wigner operators has also been given.

In 1941, Rarita and Schwinger focused on spin-3/2 particles and derived the Rarita–Schwinger equation, including a Lagrangian to generate it, and later generalized the equations analogously to spin $n + 1/2$ for integer $n$. In 1945, Pauli suggested Majorana's 1932 paper to Bhabha, who returned to the general ideas introduced by Majorana. Bhabha and Lubanski proposed a completely general set of equations by replacing the mass terms in the Dirac and Majorana equations by an arbitrary constant, subject to a set of conditions which the wave functions must obey.

Finally, in the year 1948 (the same year as Feynman's path integral formulation was cast), Bargmann and Wigner formulated the general equation for massive particles which could have any spin, by considering the Dirac equation with a totally symmetric finite-component spinor, and using Lorentz group theory (as Majorana did): the Bargmann–Wigner equations. In the early 1960s, a reformulation of the Bargmann–Wigner equations was made by H. Joos and Steven Weinberg, the Joos–Weinberg equation. Various theorists at this time did further research in relativistic Hamiltonians for higher spin particles.

1960s–present
The relativistic description of particles with spin has been a difficult problem in quantum theory. It is still an area of present-day research because the problem is only partially solved; including interactions in the equations is problematic, and paradoxical predictions (even from the Dirac equation) are still present.

Linear equations
The following equations have solutions which satisfy the superposition principle, that is, the wave functions are additive.

Throughout, the standard conventions of tensor index notation and Feynman slash notation are used, including Greek indices which take the values 1, 2, 3 for the spatial components and 0 for the timelike component of the indexed quantities. The wave functions are denoted $ψ$, and $∂_{μ}$ are the components of the four-gradient operator.

In matrix equations, the Pauli matrices are denoted by $σ^{μ}$ in which $μ = 0, 1, 2, 3$, where $σ^{0}$ is the $2 × 2$ identity matrix: $$\sigma^0 = \begin{pmatrix} 1&0 \\ 0&1 \\ \end{pmatrix} $$ and the other matrices have their usual representations. The expression $$\sigma^\mu \partial_\mu \equiv \sigma^0 \partial_0 + \sigma^1 \partial_1 + \sigma^2 \partial_2 + \sigma^3 \partial_3 $$ is a $2 × 2$ matrix operator which acts on 2-component spinor fields.

The gamma matrices are denoted by $γ^{μ}$, in which again $μ = 0, 1, 2, 3$, and there are a number of representations to select from. The matrix $γ^{0}$ is not necessarily the $4 × 4$ identity matrix. The expression $$i\hbar \gamma^\mu \partial_\mu + mc \equiv i\hbar(\gamma^0 \partial_0 + \gamma^1 \partial_1 + \gamma^2 \partial_2 + \gamma^3 \partial_3) + mc \begin{pmatrix}1&0&0&0\\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{pmatrix} $$ is a $4 × 4$ matrix operator which acts on 4-component spinor fields.

Note that terms such as "$mc$" scalar multiply an identity matrix of the relevant dimension (the common sizes are $2 × 2$ or $4 × 4$); the identity matrix is conventionally not written, for simplicity.
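The text leaves the representation of the gamma matrices open. As a sketch under the assumption of the Dirac representation (one common choice), the following checks the defining anticommutation relation $γ^{μ}γ^{ν} + γ^{ν}γ^{μ} = 2η^{μν}$ times the $4 × 4$ identity, with metric signature $(+, −, −, −)$. The names `GAMMA`, `METRIC`, and the helper functions are hypothetical.

```python
def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Gamma matrices in the Dirac representation (an assumed choice).
GAMMA = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],      # gamma^0
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],      # gamma^1
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],  # gamma^2
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],      # gamma^3
]
METRIC = [1, -1, -1, -1]  # diagonal of eta^{mu nu}


def anticommutator(a, b):
    """Return a b + b a."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]


def clifford_deviation():
    """Largest entry of {gamma^mu, gamma^nu} - 2 eta^{mu nu} I over all mu, nu."""
    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            anti = anticommutator(GAMMA[mu], GAMMA[nu])
            eta = METRIC[mu] if mu == nu else 0
            for i in range(4):
                for j in range(4):
                    target = 2 * eta if i == j else 0
                    worst = max(worst, abs(anti[i][j] - target))
    return worst


max_deviation = clifford_deviation()
```

In this representation $γ^{0}$ is diagonal with entries $(1, 1, −1, −1)$, a concrete instance of the remark that $γ^{0}$ need not be the identity matrix.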

Linear gauge fields
The Duffin–Kemmer–Petiau equation is an alternative equation for spin-0 and spin-1 particles: $$(i \hbar \beta^{a} \partial_a - m c) \psi = 0$$

Using 4-vectors and the energy–momentum relation
Start with the standard special relativity (SR) 4-vectors
 * 4-position $$X^\mu = \mathbf{X} = (ct,\vec{\mathbf{x}})$$
 * 4-velocity $$U^\mu = \mathbf{U} = \gamma(c,\vec{\mathbf{u}})$$
 * 4-momentum $$P^\mu = \mathbf{P} = \left(\frac{E}{c},\vec{\mathbf{p}}\right)$$
 * 4-wavevector $$K^\mu = \mathbf{K} = \left(\frac{\omega}{c},\vec{\mathbf{k}}\right)$$
 * 4-gradient $$\partial^\mu = \mathbf{\partial} = \left(\frac{\partial_t}{c},-\vec{\mathbf{\nabla}}\right)$$

Note that each 4-vector is related to another by a Lorentz scalar:
 * $$\mathbf{U} = \frac{d}{d\tau} \mathbf{X}$$, where $$\tau$$ is the proper time
 * $$\mathbf{P} = m_0 \mathbf{U}$$, where $$m_0$$ is the rest mass
 * $$\mathbf{K} = (1/\hbar) \mathbf{P}$$, which is the 4-vector version of the Planck–Einstein relation & the de Broglie matter wave relation
 * $$\mathbf{\partial} = -i \mathbf{K}$$, which is the 4-gradient version of complex-valued plane waves

Now apply the standard Lorentz scalar product rule to each one; the last of the resulting equations is a fundamental quantum relation:
 * $$\mathbf{U} \cdot \mathbf{U} = (c)^2$$
 * $$\mathbf{P} \cdot \mathbf{P} = (m_0 c)^2$$
 * $$\mathbf{K} \cdot \mathbf{K} = \left(\frac{m_0 c}{\hbar}\right)^2$$
 * $$\mathbf{\partial} \cdot \mathbf{\partial} = \left(\frac{-i m_0 c}{\hbar}\right)^2 = -\left(\frac{m_0 c}{\hbar}\right)^2$$
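The first three invariants can be checked numerically. The sketch below assumes the metric signature $(+, −, −, −)$ implied by the 4-gradient above, and uses an electron moving at $0.6c$ along one axis purely as an example; the function names are illustrative.

```python
import math

C = 299792458.0          # speed of light in m/s
HBAR = 1.054571817e-34   # reduced Planck constant in J s


def minkowski_dot(a, b):
    """Lorentz scalar product with signature (+, -, -, -)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


def four_velocity(u):
    """4-velocity gamma * (c, u) for a 3-velocity u given in m/s."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in u) / C**2)
    return (gamma * C,) + tuple(gamma * x for x in u)


def scale(factor, vec):
    """Multiply every component of a 4-vector by a Lorentz scalar."""
    return tuple(factor * x for x in vec)


m0 = 9.1093837015e-31                    # electron rest mass in kg (example)
U = four_velocity((0.6 * C, 0.0, 0.0))   # 4-velocity
P = scale(m0, U)                         # 4-momentum  P = m0 U
K = scale(1.0 / HBAR, P)                 # 4-wavevector K = P / hbar

u_dot_u = minkowski_dot(U, U)  # compare with c^2
p_dot_p = minkowski_dot(P, P)  # compare with (m0 c)^2
k_dot_k = minkowski_dot(K, K)  # compare with (m0 c / hbar)^2
```

Changing the 3-velocity passed to `four_velocity` leaves all three products unchanged up to rounding, which is what it means for them to be Lorentz scalars.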

When applied to a Lorentz scalar field $$\psi$$, one gets the Klein–Gordon equation, the most basic of the quantum relativistic wave equations.
 * $$\left[\mathbf{\partial} \cdot \mathbf{\partial} + \left(\frac{m_0 c}{\hbar}\right)^2\right]\psi = 0$$: in 4-vector format
 * $$\left[\partial_\mu \partial^\mu + \left(\frac{m_0 c}{\hbar}\right)^2\right]\psi = 0$$: in tensor format
 * $$\left[(\hbar \partial_{\mu} + i m_0 c)(\hbar \partial^{\mu} -i m_0 c)\right]\psi = 0$$: in factored tensor format

The Schrödinger equation is the low-velocity limiting case ($v ≪ c$) of the Klein–Gordon equation.
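One way to see the limit at the level of the dispersion relation: once the rest energy $m_0 c^2$ is subtracted, the relativistic energy $\sqrt{(pc)^2 + (m_0 c^2)^2}$ reduces to the kinetic term $p^2 / 2m_0$ of the free Schrödinger Hamiltonian when $p ≪ m_0 c$. The sketch below compares the two, assuming units with $m_0 = c = 1$; the function names and sample momenta are illustrative.

```python
import math


def kinetic_relativistic(p, m=1.0, c=1.0):
    """Relativistic energy minus rest energy: sqrt((pc)^2 + (mc^2)^2) - mc^2."""
    return math.sqrt((p * c) ** 2 + (m * c**2) ** 2) - m * c**2


def kinetic_newtonian(p, m=1.0):
    """Non-relativistic kinetic energy p^2 / 2m."""
    return p**2 / (2.0 * m)


def relative_deviation(p, m=1.0, c=1.0):
    """Relative difference between the Newtonian and relativistic values."""
    exact = kinetic_relativistic(p, m, c)
    return abs(kinetic_newtonian(p, m) - exact) / exact


slow = relative_deviation(0.01)  # p much smaller than m c
fast = relative_deviation(0.5)   # p comparable to m c
```

The deviation grows roughly as $(p / 2 m_0 c)^2$, so it is negligible for `slow` and already several percent for `fast`.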

When the relation is applied to a four-vector field $$A^\mu$$ instead of a Lorentz scalar field $$\psi$$, then one gets the Proca equation (in Lorenz gauge): $$\left[\mathbf{\partial} \cdot \mathbf{\partial} + \left(\frac{m_0 c}{\hbar}\right)^2\right]A^\mu = 0$$

If the rest mass term is set to zero (light-like particles), then this gives the free Maxwell equation (in Lorenz gauge) $$[\mathbf{\partial} \cdot \mathbf{\partial}]A^\mu = 0$$

Representations of the Lorentz group
Under a proper orthochronous Lorentz transformation $x → Λx$ in Minkowski space, all one-particle quantum states $ψ^{j}_{σ}$ of spin $j$ with spin z-component $σ$ locally transform under some representation $D$ of the Lorentz group: $$\psi(x) \rightarrow D(\Lambda) \psi(\Lambda^{-1}x) $$ where $D(Λ)$ is some finite-dimensional representation, i.e. a matrix. Here $ψ$ is thought of as a column vector containing components with the allowed values of $σ$. The quantum numbers $j$ and $σ$ as well as other labels, continuous or discrete, representing other quantum numbers are suppressed. One value of $σ$ may occur more than once depending on the representation. Representations with several possible values for $j$ are considered below.

The irreducible representations are labeled by a pair of half-integers or integers $(A, B)$. From these all other representations can be built up using a variety of standard methods, like taking tensor products and direct sums. In particular, space-time itself constitutes a 4-vector representation $(1⁄2, 1⁄2)$ so that $Λ ∈ D^{(1/2, 1/2)}$. To put this into context, Dirac spinors transform under the $(1⁄2, 0) ⊕ (0, 1⁄2)$ representation. In general, the $(A, B)$ representation space has subspaces that under the subgroup of spatial rotations, SO(3), transform irreducibly like objects of spin j, where each allowed value: $$j = A + B, A + B - 1, \dots, |A - B|,$$ occurs exactly once. In general, tensor products of irreducible representations are reducible; they decompose as direct sums of irreducible representations.
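The spin content rule can be made concrete with a short enumeration. The sketch below lists the allowed values of $j$ for given labels $(A, B)$ and compares the dimension $(2A + 1)(2B + 1)$ of the representation with the sum of the dimensions $2j + 1$ of its rotation subspaces. The function names are hypothetical, and the labels are passed as exact fractions to avoid rounding.

```python
from fractions import Fraction


def spin_content(a, b):
    """Spins j = A+B, A+B-1, ..., |A-B| contained in the (A, B) representation."""
    a, b = Fraction(a), Fraction(b)
    for label in (a, b):
        # Each label must be a non-negative integer or half-integer.
        if label < 0 or (2 * label).denominator != 1:
            raise ValueError("labels must be non-negative integers or half-integers")
    spins = []
    j = a + b
    while j >= abs(a - b):
        spins.append(j)
        j -= 1
    return spins


def dimension(a, b):
    """Dimension (2A + 1)(2B + 1) of the (A, B) representation."""
    return int((2 * Fraction(a) + 1) * (2 * Fraction(b) + 1))


def rotation_dimension(a, b):
    """Sum of 2j + 1 over the spins contained in (A, B)."""
    return int(sum(2 * j + 1 for j in spin_content(a, b)))


half = Fraction(1, 2)
vector_spins = spin_content(half, half)  # 4-vector representation
weyl_spins = spin_content(half, 0)       # one half of a Dirac spinor
```

For the 4-vector representation this gives spin 1 (the spatial components) and spin 0 (the timelike component), while $(1⁄2, 0)$ contains spin 1/2 only.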

The representations $D^{(j, 0)}$ and $D^{(0, j)}$ can each separately represent particles of spin $j$. A state or quantum field in such a representation would satisfy no field equation except the Klein–Gordon equation.

Non-linear equations
There are equations which have solutions that do not satisfy the superposition principle.

Nonlinear gauge fields

 * Yang–Mills equation: describes a non-abelian gauge field
 * Yang–Mills–Higgs equations: describes a non-abelian gauge field coupled with a massive spin-0 particle

Spin 2

 * Einstein field equations: describe interaction of matter with the gravitational field (massless spin-2 field): $$R_{\mu \nu} - \frac{1}{2} g_{\mu \nu}\,R + g_{\mu \nu} \Lambda = \frac{8 \pi G}{c^4} T_{\mu \nu}$$ The solution is a metric tensor field, rather than a wave function.