Gauge covariant derivative

In physics, the gauge covariant derivative is a means of expressing how fields vary from place to place, in a way that respects how the coordinate systems used to describe a physical phenomenon can themselves change from place to place. The gauge covariant derivative is used in many areas of physics, including quantum field theory, fluid dynamics and, in a very special way, general relativity.

If a physical theory is independent of the choice of local frames, the group of local frame changes (the gauge transformations) acts on the fields in the theory while leaving the physical content of the theory unchanged. Ordinary differentiation of field components is not invariant under such gauge transformations, because the components depend on the local frame. However, when gauge transformations act on fields and the gauge covariant derivative simultaneously, they preserve properties of theories that do not depend on frame choice and hence are valid descriptions of physics. Like the covariant derivative used in general relativity (which is a special case), the gauge covariant derivative is an expression for a connection in local coordinates after choosing a frame for the fields involved, often in the form of index notation.

Overview
There are many ways to understand the gauge covariant derivative. The approach taken in this article is based on the historically traditional notation used in many physics textbooks. Another approach is to understand the gauge covariant derivative as a kind of connection, and more specifically, an affine connection. The affine connection is interesting because it does not require any concept of a metric tensor to be defined; the curvature of an affine connection can be understood as the field strength of the gauge potential. When a metric is available, then one can go in a different direction, and define a connection on a frame bundle. This path leads directly to general relativity; however, it requires a metric, which particle physics gauge theories do not have.

Rather than being generalizations of one another, affine and metric geometry go off in different directions: the gauge group of (pseudo-)Riemannian geometry must be the indefinite orthogonal group O(s,r) in general, or the Lorentz group O(3,1) for space-time. This is because the fibers of the frame bundle must necessarily, by definition, connect the tangent and cotangent spaces of space-time. In contrast, the gauge groups employed in particle physics could in principle be any Lie group at all, although in practice the Standard Model only uses U(1), SU(2) and SU(3). Note that Lie groups do not come equipped with a metric.

A yet more complicated, yet more accurate and geometrically enlightening, approach is to understand that the gauge covariant derivative is (exactly) the same thing as the exterior covariant derivative on a section of an associated bundle for the principal fiber bundle of the gauge theory; and, for the case of spinors, the associated bundle would be a spin bundle of the spin structure. Although conceptually the same, this approach uses a very different set of notation, and requires a far more advanced background in multiple areas of differential geometry.

The final step in the geometrization of gauge invariance is to recognize that, in quantum theory, one needs only to compare neighboring fibers of the principal fiber bundle, and that the fibers themselves provide a superfluous extra description. This leads to the idea of modding out the gauge group to obtain the gauge groupoid as the closest description of the gauge connection in quantum field theory.

For ordinary Lie algebras, the gauge covariant derivative on the space symmetries (those of the pseudo-Riemannian manifold and general relativity) cannot be intertwined with the internal gauge symmetries; that is, metric geometry and affine geometry are necessarily distinct mathematical subjects: this is the content of the Coleman–Mandula theorem. However, a premise of this theorem is violated by the Lie superalgebras (which are not Lie algebras!) thus offering hope that a single unified symmetry can describe both spatial and internal symmetries: this is the foundation of supersymmetry.

The more mathematical approach uses an index-free notation, emphasizing the geometric and algebraic structure of the gauge theory and its relationship to Lie algebras and Riemannian manifolds; for example, treating gauge covariance as equivariance on fibers of a fiber bundle. The index notation used in physics makes it far more convenient for practical calculations, although it makes the overall geometric structure of the theory more opaque. The physics approach also has a pedagogical advantage: the general structure of a gauge theory can be exposed after a minimal background in multivariate calculus, whereas the geometric approach requires a large investment of time in the general theory of differential geometry, Riemannian manifolds, Lie algebras, representations of Lie algebras and principal bundles before a general understanding can be developed. In more advanced discussions, both notations are commonly intermixed.

This article attempts to follow the notation and language commonly employed in physics curricula, touching only briefly on the more abstract connections.

Motivation of the covariant derivative through gauge covariance requirement
Consider a generic (possibly non-Abelian) gauge transformation acting on an $$n$$-component field $$\phi = (\phi_a)_{a = 1..n}$$. The main examples in field theory have a compact gauge group and we write the symmetry operator as $$U(x)= e^{i\alpha(x)}$$ where $$\alpha(x)$$ is an element of the Lie algebra associated with the Lie group of symmetry transformations, and can be expressed in terms of the hermitian generators of the Lie algebra (i.e. up to a factor $$i$$, the infinitesimal generators of the gauge group), $$\{t_K\}_{K \in \mathcal{K}}$$, as $$\alpha(x) = \alpha^K(x) t_K$$.

It acts on the field $$\phi(x)$$ as
 * $$ \phi(x) \rightarrow \phi'(x) = U(x) \phi(x) \equiv e^{i\alpha(x)} \phi(x), $$
 * $$ \phi^\dagger(x) \rightarrow \phi{'}^{\dagger} \equiv \phi^\dagger(x) U^\dagger (x) = \phi^\dagger(x) e^{-i\alpha(x)}, \qquad U^\dagger = U^{-1}. $$

Now the partial derivative $$\partial_\mu$$ transforms, accordingly, as
 * $$ \partial_\mu \phi(x) \rightarrow \partial_\mu \phi'(x) = U(x) \partial_\mu \phi(x) + (\partial_\mu U) \phi(x) \equiv e^{i\alpha(x)} \partial_\mu \phi(x) + i (\partial_\mu \alpha) e^{i\alpha(x)} \phi(x). $$

Therefore, a kinetic term of the form $$ \phi^\dagger \partial_\mu \phi $$ in a Lagrangian is not invariant under gauge transformations.
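The extra term can be checked numerically in the simplest setting. The following is a minimal sketch for a one-component field in one dimension with a U(1) phase; the functions `phi` and `alpha` and the sample point `x0` are arbitrary choices made for illustration and are not part of any particular theory.

```python
import cmath


def phi(x):
    # Hypothetical one-component test field
    return complex(1.0 + x * x, 0.5 * x)


def alpha(x):
    # Hypothetical position-dependent gauge parameter
    return 0.3 * x * x


def U(x):
    # Local U(1) gauge transformation U(x) = exp(i alpha(x))
    return cmath.exp(1j * alpha(x))


def phi_prime(x):
    # Transformed field phi'(x) = U(x) phi(x)
    return U(x) * phi(x)


def ddx(f, x, h=1e-5):
    # Central finite-difference approximation of df/dx
    return (f(x + h) - f(x - h)) / (2 * h)


x0 = 0.7
lhs = ddx(phi_prime, x0)                       # derivative of the transformed field
covariant_part = U(x0) * ddx(phi, x0)          # what a covariant rule would give
extra_term = 1j * ddx(alpha, x0) * U(x0) * phi(x0)

mismatch = abs(lhs - covariant_part)               # clearly nonzero
residual = abs(lhs - covariant_part - extra_term)  # zero up to discretisation error
print(mismatch, residual)
```

The mismatch is proportional to the derivative of `alpha`, so it disappears for a global (constant) phase, as expected.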

Definition of the gauge covariant derivative
The root cause of the lack of gauge invariance is that, in writing the field $$\phi = (\phi_1, \ldots, \phi_n)$$ as a row vector or in index notation $$\phi_a$$, we have implicitly made a choice of basis frame field, i.e. a set of fields $$\varphi^1(x),\ldots, \varphi^n(x)$$ such that every field can be uniquely expressed as $$\phi = \phi_a\varphi^a$$ for functions $$\phi_a(x)$$ (using Einstein summation), and assumed the frame fields $$\varphi^a$$ to be constant. Local (i.e. $$x$$-dependent) gauge invariance can be considered as invariance under the choice of frame. However, if one basis frame is as good as any other gauge-equivalent one, we cannot assume the frame fields to be constant without breaking local gauge symmetry.

We can introduce the gauge covariant derivative $$D_\mu$$ as a generalisation of the partial derivative $$\partial_\mu$$ that acts directly on the field $$\phi$$ rather than its components $$\phi_a$$ with respect to a choice of frame. A gauge covariant derivative is defined as an operator satisfying a product rule
 * $$ D_\mu (f \phi) = (\partial_\mu f)\phi + f (D_\mu \phi) $$

for every smooth function $$f$$ (this is the defining property of a connection).

To go back to index notation we use the product rule
 * $$ D_\mu \phi = D_\mu(\phi_a\varphi^a) = (\partial_\mu \phi_a)\varphi^a + \phi_a (D_\mu \varphi^a).$$

For a fixed $$a$$, $$D_\mu \varphi^a$$ is a field, so it can be expanded with respect to the frame field. Hence a gauge covariant derivative and a frame field together define a (possibly non-Abelian) gauge potential
 * $$D_\mu \varphi^a = -igA^a_{\mu b} \varphi^b$$

(the factor $$-ig$$ is conventional for compact gauge groups and is interpreted as a coupling constant). Conversely given the frame $$\varphi^1, \ldots \varphi^n$$ and a gauge potential $$A^a_{\mu b}$$, this uniquely defines the gauge covariant derivative. We then get
 * $$D_\mu\phi = (D_\mu\phi)_a\varphi^a = (\partial_\mu \phi_a -igA^b_{\mu a}\phi_b)\varphi^a$$,

and with suppressed frame fields this gives in index notation
 * $$ (D_\mu \phi)_a = \partial_\mu \phi_a - ig A^b_{\mu a}\phi_b, $$

which by abuse of notation is often written as
 * $$ D_\mu \phi_a = \partial_\mu \phi_a - ig A^b_{\mu a}\phi_b $$.

This is the definition of the gauge covariant derivative as usually presented in physics.

The gauge covariant derivative is often assumed to satisfy additional conditions making additional structure "constant" in the sense that the covariant derivative vanishes. For example, if we have a Hermitian product $$ h $$ on the fields (e.g. the Dirac conjugate inner product $$\bar \phi \psi$$ for spinors) reducing the gauge group to a unitary group, we can impose the further condition
 * $$ \partial_\mu h(\phi, \psi) = h(D_\mu \phi, \psi) + h(\phi, D_\mu \psi)$$

making the Hermitian product "constant". Writing this out with respect to a local $$h$$-orthonormal frame field gives
 * $$ \partial_\mu \sum_a \phi_a^* \psi_a = \sum_a \left( (D_\mu \phi)_a^* \psi_a + \phi_a^* (D_\mu \psi)_a \right) $$,

and using the above we see that $$A_\mu$$ must be Hermitian i.e. $$A^b_{\mu a} = {A^a_{\mu b}}^*$$ (motivating the extra factor $$i$$). The Hermitian matrices are (up to the factor $$i$$) the generators of the unitary group. More generally if the gauge covariant derivative preserves a gauge group $$G$$ acting with representation $$\rho$$, the gauge covariant connection can be written as
 * $$ (D_\mu \phi)_a = \partial_\mu \phi_a - ig A_\mu^K\rho'(t_K)^b_a\phi_b $$

where $$\rho'$$ is the representation of the Lie algebra associated to the group representation $$\rho$$ (loc. cit.).

Note that once the gauge covariant derivative (or its gauge potential) is included as a physical field, "a field with zero gauge covariant derivative along the tangent of a curve $$\gamma$$",
 * $$D_{\dot \gamma}\phi = (\frac{d}{dt} \gamma^\mu) D_\mu \phi = 0 $$

is a physically meaningful definition of a field $$\phi$$ constant along a (smooth) curve. Hence the gauge covariant derivative defines (and is defined by) parallel transport.
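As an illustration of parallel transport, the sketch below integrates $$D_{\dot \gamma}\phi = 0$$ for a one-component (U(1)) field around the unit circle in the plane. With $$D_\mu = \partial_\mu - igA_\mu$$ the condition becomes the ordinary differential equation $$\dot\phi = ig A_\mu \dot\gamma^\mu \phi$$. The potential, the coupling value `G`, the constant `B` and the choice of curve are all assumptions made for the example. For this particular potential the transported field returns multiplied by the phase $$e^{ig\pi B}$$, i.e. the coupling times the flux of the constant field strength through the unit disc.

```python
import cmath
import math

G = 1.0  # coupling g (illustrative value)
B = 0.8  # hypothetical constant field strength F_xy


def A(x, y):
    # Gauge potential (A_x, A_y) with d_x A_y - d_y A_x = B
    return (-0.5 * B * y, 0.5 * B * x)


def gamma(t):
    # Unit circle, traversed once for t in [0, 2 pi]
    return (math.cos(t), math.sin(t))


def gamma_dot(t):
    return (-math.sin(t), math.cos(t))


def rhs(t, phi):
    # D_gammadot phi = 0  <=>  dphi/dt = i g A_mu gammadot^mu phi
    ax, ay = A(*gamma(t))
    vx, vy = gamma_dot(t)
    return 1j * G * (ax * vx + ay * vy) * phi


def transport(phi0, t_end, steps=2000):
    # Classical fourth-order Runge-Kutta integration along the curve
    h = t_end / steps
    phi, t = phi0, 0.0
    for _ in range(steps):
        k1 = rhs(t, phi)
        k2 = rhs(t + h / 2, phi + h / 2 * k1)
        k3 = rhs(t + h / 2, phi + h / 2 * k2)
        k4 = rhs(t + h, phi + h * k3)
        phi += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return phi


phi_end = transport(1.0 + 0.0j, 2 * math.pi)
expected = cmath.exp(1j * G * B * math.pi)  # phase from the enclosed flux pi * B
print(abs(phi_end - expected), abs(phi_end))
```

Because the potential is real (Hermitian in the one-component case), the modulus of the field is preserved along the curve and only its phase changes.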

Gauge field strength
Unlike the partial derivatives, the gauge covariant derivatives do not commute. However they almost do in the sense that the commutator is not an operator of order 2 but of order 0, i.e. is linear over functions:
 * $$ [D_\mu, D_\nu] (f \phi) = (\partial_\mu \partial_\nu f) \phi + \partial_\nu f D_\mu \phi + \partial_\mu f D_\nu \phi + f D_\mu D_\nu \phi - (\mu \leftrightarrow \nu) = f [D_\mu, D_\nu] \phi$$.

The linear map
 * $$ F_{\mu\nu} = -1/(ig) [D_\mu, D_\nu]$$

is called the gauge field strength (loc. cit). In index notation, using the gauge potential
 * $$ F_{\mu\nu\,b}^{\ a} = \partial_\mu A^a_{\nu b} - \partial_\nu A^a_{\mu b} - ig(A^a_{\mu c}A^c_{\nu b} - A^a_{\nu c}A^c_{\mu b})$$.

If $$D_\mu$$ is a G covariant derivative, one can interpret the latter term as a commutator in the Lie algebra of G and $$F_{\mu\nu}$$ as Lie algebra valued (loc. cit).
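The commutator term can be made concrete with a small SU(2) example. The sketch below assumes constant gauge potentials $$A_x = \sigma_1/2$$ and $$A_y = \sigma_2/2$$ (so that the derivative terms in $$F_{\mu\nu}$$ vanish) and an illustrative coupling `G`; these values are chosen only for the example. Since $$[\sigma_1/2, \sigma_2/2] = i\sigma_3/2$$, the expected result is $$F_{xy} = g\,\sigma_3/2$$.

```python
def matmul(a, b):
    # Product of two square matrices given as nested lists
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, a):
    return [[c * entry for entry in row] for row in a]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


G = 2.0  # coupling g (illustrative value)

# Pauli matrices
sigma1 = [[0, 1], [1, 0]]
sigma2 = [[0, -1j], [1j, 0]]
sigma3 = [[1, 0], [0, -1]]

# Hypothetical constant, non-commuting gauge potentials
A_x = scale(0.5, sigma1)
A_y = scale(0.5, sigma2)


def field_strength_constant(a_mu, a_nu, g):
    # F_mu_nu = d_mu A_nu - d_nu A_mu - i g [A_mu, A_nu];
    # the derivative terms are zero for constant potentials
    commutator = sub(matmul(a_mu, a_nu), matmul(a_nu, a_mu))
    return scale(-1j * g, commutator)


F_xy = field_strength_constant(A_x, A_y, G)
print(F_xy)
```

Even though both potentials are constant, the field strength does not vanish; this is the characteristic non-Abelian contribution that is absent in the U(1) case.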

Invariance under gauge transformations
The gauge covariant derivative transforms covariantly under gauge transformations, i.e. for all $$\phi$$
 * $$ D_\mu \phi(x) \rightarrow D'_\mu \phi'(x) = D'_\mu U(x) \phi(x) = U(x) D_\mu \phi(x), $$

which in operator form takes the form
 * $$ D'_\mu U(x) = U(x) D_\mu$$

or
 * $$ D'_\mu = U(x) D_\mu U^{-1}(x).$$

In particular (suppressing dependence on $$x$$)
 * $$ -ig F'_{\mu\nu} = [D'_\mu, D'_\nu] = [UD_\mu U^{-1}, UD_\nu U^{-1}] = U[D_\mu, D_\nu]U^{-1} = -ig U F_{\mu\nu} U^{-1}$$.

Further, (suppressing indices and replacing them by matrix multiplication) if $$D_\mu = \partial_\mu - ig A_\mu$$ is of the form above, $$D'_\mu$$ is of the form
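 * $$ D'_\mu = \partial_\mu + U (\partial_\mu U^{-1}) - igU A_\mu U^{-1} $$

or, using $$U(x) = e^{i \alpha(x)}$$ in the case where $$\alpha$$ commutes with $$\partial_\mu \alpha$$ (in particular for an Abelian gauge group),
 * $$ D'_\mu = \partial_\mu - i\partial_\mu \alpha -ig U A_\mu U^{-1} $$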

which is also of this form.

In the Hermitian case with a unitary gauge group, $$ U^{-1} = U^\dagger$$, and we have found a first-order differential operator $$D_\mu$$ with $$\partial_\mu$$ as its first-order term such that
 * $$ \phi^\dagger D_\mu \phi \rightarrow \phi'^\dagger D'_\mu \phi' = \phi^\dagger D_\mu \phi.$$

Gauge theory
In gauge theory, which studies a particular class of fields that are of importance in quantum field theory, different fields are used in Lagrangians that are invariant under local gauge transformations. Kinetic terms involve derivatives of the fields, which by the above arguments must be gauge covariant derivatives.

Abelian gauge theory
The gauge covariant derivative $$D_\mu$$ on a complex scalar field $$\phi = \phi_1 \varphi^1$$ (i.e. $$n = 1$$) of charge $$q$$ is a $$U(1)$$ connection. The gauge potential $$A_\mu$$ is a $$1 \times 1$$ matrix, i.e. a scalar.
 * $$ (D_\mu \phi)_1 = (\partial_\mu \phi_1 - iq A_\mu \phi_1) $$

The gauge field strength is
 * $$ F_{\mu\nu} = \partial_\mu A_\nu -\partial_\nu A_\mu$$

The gauge potential can be interpreted as electromagnetic four-potential and the gauge field strength as the electromagnetic field tensor. Since this only involves the charge of the field and not higher multipoles like the magnetic moment (and in a loose and non unique way, because it replaces $$\partial_\mu$$ by $$D_\mu$$ ) this is called minimal coupling.
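The Abelian field strength is unchanged when the potential is shifted by a gradient, $$A_\mu \rightarrow A_\mu + \tfrac{1}{q}\partial_\mu \Lambda$$, because mixed partial derivatives commute. The sketch below checks this numerically for the $$xy$$ component in two dimensions. The potential `A`, the function $$\Lambda(x, y) = \sin(x)\,y^2$$ and the value of `Q` are assumptions chosen purely for illustration.

```python
import math

Q = 1.5  # hypothetical charge q


def A(x, y):
    # Hypothetical potential (A_x, A_y); analytically F_xy = 2x - x = x
    return (x * y, x * x - y)


def grad_Lambda(x, y):
    # Analytic gradient of Lambda(x, y) = sin(x) * y**2
    return (math.cos(x) * y * y, 2.0 * math.sin(x) * y)


def A_gauged(x, y):
    # Gauge-transformed potential A_mu + (1/q) d_mu Lambda
    ax, ay = A(x, y)
    lx, ly = grad_Lambda(x, y)
    return (ax + lx / Q, ay + ly / Q)


def F_xy(potential, x, y, h=1e-5):
    # F_xy = d_x A_y - d_y A_x via central finite differences
    d_x_Ay = (potential(x + h, y)[1] - potential(x - h, y)[1]) / (2 * h)
    d_y_Ax = (potential(x, y + h)[0] - potential(x, y - h)[0]) / (2 * h)
    return d_x_Ay - d_y_Ax


x0, y0 = 0.4, -1.2
f_before = F_xy(A, x0, y0)
f_after = F_xy(A_gauged, x0, y0)
print(f_before, f_after)
```

Both values agree up to discretisation error, although the two potentials differ at every point.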

For a Dirac spinor field $$ \psi $$ of charge $$q$$ the covariant derivative is also a $$U(1)$$ connection (because it has to commute with the gamma matrices) and is defined as
 * $$ (D_\mu \psi)_\alpha := (\partial_\mu - i q A_\mu) \psi_\alpha$$

where again $$A_\mu$$ is interpreted as the electromagnetic four-potential and $$F_{\mu\nu}$$ as the electromagnetic field tensor. (The minus sign is a convention valid for a Minkowski metric signature (−, +, +, +), which is common in general relativity and used below. For the particle physics convention (+, −, −, −), it is $$ D_\mu := \partial_\mu + i q A_\mu $$. The electron's charge is defined negative as $$q_e=-|e|$$, while the Dirac field is defined to transform positively as $$\psi(x) \rightarrow e^{iq\alpha(x)} \psi(x).$$)

Quantum electrodynamics
If a gauge transformation is given by
 * $$ \psi \mapsto e^{i\Lambda} \psi $$

and for the gauge potential
 * $$ A_\mu \mapsto A_\mu + {1 \over e} (\partial_\mu \Lambda) $$

then $$ D_\mu $$ transforms as
 * $$ D_\mu \mapsto \partial_\mu - i e A_\mu - i (\partial_\mu \Lambda) $$,

and $$ D_\mu \psi $$ transforms as
 * $$ D_\mu \psi \mapsto e^{i \Lambda} D_\mu \psi $$

and $$ \bar \psi := \psi^\dagger \gamma^0 $$ transforms as
 * $$ \bar \psi \mapsto \bar \psi e^{-i \Lambda} $$

so that
 * $$ \bar \psi D_\mu \psi \mapsto \bar \psi D_\mu \psi $$

and $$ \bar \psi D_\mu \psi $$ in the QED Lagrangian is therefore gauge invariant, and the gauge covariant derivative is thus named aptly.

On the other hand, the non-covariant derivative $$ \partial_\mu $$ would not preserve the Lagrangian's gauge symmetry, since
 * $$ \bar \psi \partial_\mu \psi \mapsto \bar \psi \partial_\mu \psi + i \bar \psi (\partial_\mu \Lambda) \psi $$.
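These transformation rules can be verified numerically. The sketch below uses a single complex component in one dimension as a stand-in for the spinor, so the spinor structure and $$\gamma^0$$ are ignored and $$\bar\psi$$ reduces to the complex conjugate. The functions `psi`, `Lam` and `A` and the value of `E` are illustrative assumptions.

```python
import cmath
import math

E = 0.5  # illustrative value of the coupling e


def ddx(f, x, h=1e-5):
    # Central finite-difference approximation of df/dx
    return (f(x + h) - f(x - h)) / (2 * h)


def psi(x):
    # Hypothetical one-component stand-in for the Dirac field
    return complex(math.cos(x), x)


def Lam(x):
    # Hypothetical gauge function Lambda(x)
    return math.sin(2.0 * x)


def A(x):
    # Hypothetical gauge potential
    return 1.0 + x


def psi_t(x):
    # psi -> exp(i Lambda) psi
    return cmath.exp(1j * Lam(x)) * psi(x)


def A_t(x):
    # A -> A + (1/e) dLambda/dx, with the derivative taken analytically
    return A(x) + 2.0 * math.cos(2.0 * x) / E


def D(field, potential, x):
    # Gauge covariant derivative (d/dx - i e A) applied to the field
    return ddx(field, x) - 1j * E * potential(x) * field(x)


x0 = 0.3
phase = cmath.exp(1j * Lam(x0))
covariance_error = abs(D(psi_t, A_t, x0) - phase * D(psi, A, x0))

bilinear_before = psi(x0).conjugate() * D(psi, A, x0)
bilinear_after = psi_t(x0).conjugate() * D(psi_t, A_t, x0)

partial_before = psi(x0).conjugate() * ddx(psi, x0)
partial_after = psi_t(x0).conjugate() * ddx(psi_t, x0)

print(covariance_error)
print(abs(bilinear_after - bilinear_before))  # invariant up to numerical error
print(abs(partial_after - partial_before))    # not invariant
```

The last difference equals $$|\partial_\mu \Lambda|\,|\psi|^2$$ in this one-component setting, which is the extra term appearing in the transformation of $$\bar \psi \partial_\mu \psi$$.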

Quantum chromodynamics
In quantum chromodynamics, the gauge covariant derivative is
 * $$ D_\mu := \partial_\mu - i g_s \, G_\mu^\alpha \, \lambda_\alpha /2 $$

where $$g_s$$ is the coupling constant of the strong interaction, $$G_\mu^\alpha$$ is the gluon gauge field for the eight different gluons $$\alpha=1 \dots 8$$, and $$\lambda_\alpha$$ is one of the eight Gell-Mann matrices. The Gell-Mann matrices give a representation of the color symmetry group SU(3). For quarks, the representation is the fundamental representation; for gluons, it is the adjoint representation.
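For concreteness, the sketch below writes out the eight Gell-Mann matrices in their standard form and assembles the matrix $$G_\mu^\alpha \lambda_\alpha / 2$$ that appears in $$D_\mu$$ when acting on a quark colour triplet. The gluon field values in `G_sample` are made up for the example. The checks confirm that the resulting matrix is Hermitian and traceless, and that the matrices obey the usual normalisation $$\operatorname{tr}(\lambda_\alpha \lambda_\beta) = 2\delta_{\alpha\beta}$$.

```python
import math

s = 1.0 / math.sqrt(3.0)

# The eight Gell-Mann matrices lambda_1 ... lambda_8
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[s, 0, 0], [0, s, 0], [0, 0, -2 * s]],
]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def gluon_matrix(G_mu):
    # G_mu^alpha lambda_alpha / 2 for eight real field components
    return [[sum(G_mu[k] * GELL_MANN[k][i][j] for k in range(8)) / 2
             for j in range(3)] for i in range(3)]


# Hypothetical values of the eight gluon components at one point, fixed mu
G_sample = [0.3, -1.1, 0.7, 0.0, 2.0, -0.4, 0.9, 1.5]
M = gluon_matrix(G_sample)

is_hermitian = all(abs(M[i][j] - complex(M[j][i]).conjugate()) < 1e-12
                   for i in range(3) for j in range(3))
is_traceless = abs(trace(M)) < 1e-12
is_normalised = all(
    abs(trace(matmul(GELL_MANN[a], GELL_MANN[b])) - (2 if a == b else 0)) < 1e-12
    for a in range(8) for b in range(8))
print(is_hermitian, is_traceless, is_normalised)
```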

Standard Model
The covariant derivative in the Standard Model combines the electromagnetic, the weak and the strong interactions. It can be expressed in the following form:
 * $$ D_\mu := \partial_\mu - i \frac{g'}{2} Y \, B_\mu - i \frac{g}{2} \sigma_j \, W_\mu^j - i \frac{g_s}{2} \lambda_\alpha \, G_\mu^\alpha $$

The generators $$Y$$, $$\sigma_j$$ and $$\lambda_\alpha$$ are written here in the fundamental representations of the electroweak Lie group $$U(1)\times SU(2)$$ times the color symmetry Lie group SU(3). The coupling constant $$g'$$ provides the coupling of the hypercharge $$Y$$ to the $$B$$ boson and $$g$$ the coupling via the three vector bosons $$W^j$$ $$(j = 1,2,3)$$ to the weak isospin, whose components are written here as the Pauli matrices $$\sigma_j$$. Via the Higgs mechanism, these boson fields combine into the massless electromagnetic field $$A_\mu$$ and the fields for the three massive vector bosons $$W^\pm$$ and $$Z$$.

General relativity
The covariant derivative in general relativity is a special example of the gauge covariant derivative. It corresponds to the Levi-Civita connection (a special Riemannian connection) on the tangent bundle (or the frame bundle), i.e. it acts on tangent vector fields or, more generally, tensors. It is usually written as $$\nabla$$ instead of $$D$$. In this special case, a choice of (local) coordinates $$x^1,\ldots, x^d$$ not only gives partial derivatives $$\partial_\mu$$, but these also double as a frame of tangent vectors $$\partial_1, \ldots, \partial_d$$ in which a vector field $$v$$ can be uniquely expressed as $$v = v^\mu\partial_\mu$$ (this uses the definition of a vector field as an operator on smooth functions that satisfies a product rule, i.e. a derivation). Therefore, in this case "the internal indices are also space-time indices". Up to slightly different normalisation (and notation), the gauge potential $$A^\lambda_{ \mu\nu}$$ is the Christoffel symbol defined by
 * $$ \nabla_\mu \partial_\nu = \Gamma^\lambda_{\mu \nu} \partial_\lambda$$.

It gives the covariant derivative
 * $$ (\nabla_\mu v)^\nu = (\nabla_\mu (v^\lambda\partial_\lambda))^\nu = ((\partial_\mu v^\lambda) \partial_\lambda + v^\lambda (\nabla_\mu \partial_\lambda))^\nu = \partial_\mu v^\nu + \Gamma^\nu_{\mu \lambda}v^\lambda $$.
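A simple case where this formula can be checked by hand is the flat plane in polar coordinates $$(r, \theta)$$, whose only non-zero Christoffel symbols are $$\Gamma^r_{\theta\theta} = -r$$ and $$\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$$. The sketch below, which is an illustrative choice of example rather than anything specific to general relativity, takes the constant Cartesian vector field $$\partial_x$$, whose polar components $$(\cos\theta, -\sin\theta / r)$$ vary from point to point, and confirms that its covariant derivative nevertheless vanishes.

```python
import math


def christoffel(nu, mu, lam, r, theta):
    # Gamma^nu_{mu lam} for the flat plane in polar coordinates
    # (index 0 = r, index 1 = theta)
    if nu == 0 and mu == 1 and lam == 1:
        return -r
    if nu == 1 and {mu, lam} == {0, 1}:
        return 1.0 / r
    return 0.0


def v(r, theta):
    # Polar components (v^r, v^theta) of the constant Cartesian field d/dx
    return (math.cos(theta), -math.sin(theta) / r)


def partial(mu, nu, r, theta, h=1e-6):
    # d_mu v^nu via central finite differences
    if mu == 0:
        return (v(r + h, theta)[nu] - v(r - h, theta)[nu]) / (2 * h)
    return (v(r, theta + h)[nu] - v(r, theta - h)[nu]) / (2 * h)


def cov(mu, nu, r, theta):
    # (nabla_mu v)^nu = d_mu v^nu + Gamma^nu_{mu lam} v^lam
    comps = v(r, theta)
    connection_term = sum(christoffel(nu, mu, lam, r, theta) * comps[lam]
                          for lam in range(2))
    return partial(mu, nu, r, theta) + connection_term


r0, th0 = 2.0, 0.6
cov_all = [cov(mu, nu, r0, th0) for mu in range(2) for nu in range(2)]
partial_all = [partial(mu, nu, r0, th0) for mu in range(2) for nu in range(2)]
print(cov_all)
print(partial_all)
```

The partial derivatives of the components are non-zero only because the coordinate frame rotates and rescales; the Christoffel terms cancel exactly this frame dependence.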

The formal similarity with the gauge covariant derivative is clearer when the choice of coordinates is decoupled from the choice of frame of vector fields $$e_1 = e_1^\mu\partial_\mu, \ldots, e_d = e_d^\mu\partial_\mu$$. Especially when the frame is orthonormal, such a frame is usually called a d-Bein. Then
 * $$ (\nabla_\mu v)^n = (\nabla_\mu (v^\ell e_\ell))^n = ((\partial_\mu v^\ell) e_\ell + v^\ell (\nabla_\mu e_\ell))^n = \partial_\mu v^n + \Gamma^n_{\mu \ell} v^\ell $$

where $$\nabla_\mu e_m = \Gamma^\ell_{\mu m} e_\ell$$. The direct analogue of the "gauge freedom" of the gauge covariant derivative is the arbitrariness of the choice of an orthonormal d-Bein at each point in space-time: local Lorentz invariance. However, in this case the more general independence of the choice of coordinates for the definition of the Levi-Civita connection gives diffeomorphism or general coordinate invariance.

Fluid dynamics
In fluid dynamics, the gauge covariant derivative of a fluid may be defined as
 * $$ \nabla_t \mathbf{v}:= \partial_t \mathbf{v} + (\mathbf{v} \cdot \nabla) \mathbf{v}$$

where $$\mathbf{v}$$ is a velocity vector field of a fluid.
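This expression is commonly known as the material derivative of the velocity. As a minimal one-dimensional sketch, consider the hypothetical velocity field $$v(x, t) = x / (1 + t)$$, chosen here only because the result can be verified by hand: $$\partial_t v = -x/(1+t)^2$$ and $$v\,\partial_x v = x/(1+t)^2$$, so the two terms cancel and fluid particles move without acceleration even though the field changes in time at every fixed point.

```python
def v(x, t):
    # Hypothetical one-dimensional velocity field v(x, t) = x / (1 + t)
    return x / (1.0 + t)


def local_rate(x, t, h=1e-5):
    # Partial time derivative at a fixed position
    return (v(x, t + h) - v(x, t - h)) / (2 * h)


def material_derivative(x, t, h=1e-5):
    # d_t v + v d_x v, the one-dimensional form of d_t v + (v . nabla) v
    dv_dx = (v(x + h, t) - v(x - h, t)) / (2 * h)
    return local_rate(x, t, h) + v(x, t) * dv_dx


x0, t0 = 1.7, 0.4
rate_at_fixed_point = local_rate(x0, t0)
rate_following_fluid = material_derivative(x0, t0)
print(rate_at_fixed_point, rate_following_fluid)
```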