Baker–Campbell–Hausdorff formula

In mathematics, the Baker–Campbell–Hausdorff formula gives the value of $$Z$$ that solves the equation $$e^X e^Y = e^Z$$ for possibly noncommutative $X$ and $Y$ in the Lie algebra of a Lie group. There are various ways of writing the formula, but all ultimately yield an expression for $$Z$$ in Lie algebraic terms, that is, as a formal series (not necessarily convergent) in $$X$$ and $$Y$$ and iterated commutators thereof. The first few terms of this series are: $$Z = X + Y + \frac{1}{2} [X,Y] + \frac{1}{12} [X,[X,Y]] - \frac{1}{12} [Y,[X,Y]] + \cdots\,,$$ where "$$\cdots$$" indicates terms involving higher commutators of $$X$$ and $$Y$$. If $$X$$ and $$Y$$ are sufficiently small elements of the Lie algebra $$\mathfrak g$$ of a Lie group $$G$$, the series is convergent. Meanwhile, every element $$g$$ sufficiently close to the identity in $$G$$ can be expressed as $$g = e^X$$ for a small $$X$$ in $$\mathfrak g$$. Thus, we can say that near the identity the group multiplication in $$G$$—written as $$e^X e^Y = e^Z$$—can be expressed in purely Lie algebraic terms. The Baker–Campbell–Hausdorff formula can be used to give comparatively simple proofs of deep results in the Lie group–Lie algebra correspondence.

If $$X$$ and $$Y$$ are sufficiently small $$n \times n$$ matrices, then $$Z$$ can be computed as the logarithm of $$e^X e^Y$$, where the exponentials and the logarithm can be computed as power series. The point of the Baker–Campbell–Hausdorff formula is then the highly nonobvious claim that $$Z := \log \left(e^X e^Y\right)$$ can be expressed as a series in repeated commutators of $$X$$ and $$Y$$.
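This claim can be probed directly in a small numerical sketch (Python with NumPy and SciPy; the matrices below are arbitrary small illustrative values):

```python
import numpy as np
from scipy.linalg import expm, logm

# Two small non-commuting matrices (arbitrary illustrative values).
X = np.array([[0.0, 0.10], [0.05, 0.0]])
Y = np.array([[0.0, 0.0], [0.15, 0.05]])

# Z = log(e^X e^Y), computed via the matrix power series.
Z = logm(expm(X) @ expm(Y))

# e^Z reproduces the product by construction ...
assert np.allclose(expm(Z), expm(X) @ expm(Y))
# ... but Z is not simply X + Y, since X and Y do not commute:
assert not np.allclose(Z, X + Y)
# The leading correction is the commutator term (1/2)[X, Y]:
C = X @ Y - Y @ X
assert np.allclose(Z, X + Y + 0.5 * C, atol=1e-3)
```

The residual discrepancy in the last comparison is accounted for by the higher commutator terms of the series.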

Modern expositions of the formula can be found in, among other places, the books of Rossmann and Hall.

History
The formula is named after Henry Frederick Baker, John Edward Campbell, and Felix Hausdorff, who stated its qualitative form, i.e. that only commutators and commutators of commutators, ad infinitum, are needed to express the solution. An earlier statement of the form was adumbrated by Friedrich Schur in 1890, where a convergent power series is given, with terms recursively defined. This qualitative form is what is used in the most important applications, such as the relatively accessible proofs of the Lie correspondence and in quantum field theory. Following Schur, it was noted in print by Campbell (1897); elaborated by Henri Poincaré (1899) and Baker (1902); and systematized geometrically, and linked to the Jacobi identity by Hausdorff (1906). The first actual explicit formula, with all numerical coefficients, is due to Eugene Dynkin (1947). The history of the formula is described in detail in the article of Achilles and Bonfiglioli and in the book of Bonfiglioli and Fulci.

Explicit forms
For many purposes, it is only necessary to know that an expansion for $$Z$$ in terms of iterated commutators of $$X$$ and $$Y$$ exists; the exact coefficients are often irrelevant. (See, for example, the discussion of the relationship between Lie group and Lie algebra homomorphisms in Section 5.2 of Hall's book, where the precise coefficients play no role in the argument.) A remarkably direct existence proof was given by Martin Eichler; see also the "Existence results" section below.

In other cases, one may need detailed information about $$Z$$ and it is therefore desirable to compute $$Z$$ as explicitly as possible. Numerous formulas exist; we will describe two of the main ones (Dynkin's formula and the integral formula of Poincaré) in this section.

Dynkin's formula
Let G be a Lie group with Lie algebra $$\mathfrak g$$. Let $$\exp : \mathfrak g \to G $$ be the exponential map. The following general combinatorial formula was introduced by Eugene Dynkin (1947), $$\log(\exp X\exp Y) = \sum_{n = 1}^\infty\frac {(-1)^{n-1}}{n} \sum_{\begin{smallmatrix} r_1 + s_1 > 0 \\ \vdots \\ r_n + s_n > 0 \end{smallmatrix}} \frac{[ X^{r_1} Y^{s_1} X^{r_2} Y^{s_2} \dotsm X^{r_n} Y^{s_n} ]}{\left(\sum_{j = 1}^n (r_j + s_j)\right) \cdot \prod_{i = 1}^n r_i! s_i!}, $$ where the sum is performed over all nonnegative values of $$s_i$$ and $$r_i$$, and the following notation has been used: $$ [ X^{r_1} Y^{s_1} \dotsm X^{r_n} Y^{s_n} ] = [ \underbrace{X,[X,\dotsm[X}_{r_1} ,[ \underbrace{Y,[Y,\dotsm[Y}_{s_1} ,\,\dotsm\, [ \underbrace{X,[X,\dotsm[X}_{r_n} ,[ \underbrace{Y,[Y,\dotsm Y}_{s_n} ]]\dotsm]]$$ with the understanding that $[X] := X$.

The series is not convergent in general; it is convergent (and the stated formula is valid) for all sufficiently small $$X$$ and $$Y$$. Since $[A, A] = 0$, the term is zero if $$s_n > 1$$ or if $$s_n = 0$$ and $$r_n > 1$$.
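For small matrices the double sum can be truncated and evaluated directly. The sketch below (Python with NumPy/SciPy; matrix entries are arbitrary small illustrative values) implements the nested-commutator notation and compares the truncation at total degree 4 with $\log(e^X e^Y)$ computed by `scipy.linalg.logm`:

```python
import itertools
import math
import numpy as np
from scipy.linalg import expm, logm

def nested(word):
    """Right-nested commutator [w1,[w2,[...,[w_{m-1}, w_m]...]]] of a list of matrices."""
    c = word[-1]
    for w in reversed(word[:-1]):
        c = w @ c - c @ w
    return c

def dynkin(X, Y, maxdeg=4):
    """Dynkin's series for log(e^X e^Y), truncated at total degree <= maxdeg."""
    Z = np.zeros_like(X)
    # All (r, s) pairs allowed in one slot; terms with [A, A] = 0 vanish automatically.
    slots = [(r, s) for r in range(maxdeg + 1) for s in range(maxdeg + 1)
             if 1 <= r + s <= maxdeg]
    for n in range(1, maxdeg + 1):
        for combo in itertools.product(slots, repeat=n):
            deg = sum(r + s for r, s in combo)
            if deg > maxdeg:
                continue
            word = []
            for r, s in combo:
                word += [X] * r + [Y] * s
            coeff = (-1) ** (n - 1) / n
            denom = deg * math.prod(math.factorial(r) * math.factorial(s)
                                    for r, s in combo)
            Z = Z + coeff * nested(word) / denom
    return Z

X = np.array([[0.0, 0.05], [0.03, 0.0]])
Y = np.array([[0.02, 0.0], [0.04, -0.02]])
Z_dynkin = dynkin(X, Y, maxdeg=4)
Z_exact = logm(expm(X) @ expm(Y))
assert np.allclose(Z_dynkin, Z_exact, atol=1e-6)
```

The truncation at degree 4 reproduces $Z$ up to the omitted fifth- and higher-order terms, which are far below the tolerance for matrices of this size.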

The first few terms are well-known, with all higher-order terms involving $[X,Y]$ and commutator nestings thereof (thus in the Lie algebra): $$Z(X,Y) = X + Y + \frac{1}{2}[X,Y] + \frac{1}{12}\left([X,[X,Y]] + [Y,[Y,X]]\right) - \frac{1}{24}[Y,[X,[X,Y]]] + \cdots\,.$$

The above lists all summands of order 4 or lower (i.e. those containing 4 or fewer $X$'s and $Y$'s); the higher orders continue the same pattern of nested commutators. The $X \leftrightarrow Y$ (anti-)symmetry in alternating orders of the expansion follows from $Z(Y, X) = -Z(-X, -Y)$. A complete elementary proof of this formula can be found in the article on the derivative of the exponential map.

An integral formula
There are numerous other expressions for $$Z$$, many of which are used in the physics literature. A popular integral formula is $$\log\left(e^X e^Y\right) = X + \left ( \int_0^1 \psi \left ( e^{\operatorname{ad} _X} ~ e^{t \operatorname{ad} _ Y}\right ) dt \right) Y, $$ involving the generating function for the Bernoulli numbers, $$ \psi(x) ~\stackrel{\text{def}}{=} ~ \frac{x \log x}{x-1}= 1- \sum^\infty_{n=1} {(1-x)^n \over n (n+1)} ~, $$ utilized by Poincaré and Hausdorff.
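The integral formula is exact (for convergent cases) and can be checked numerically. In the sketch below (Python with NumPy/SciPy; small arbitrary matrices), $\operatorname{ad}_X$ is represented as a matrix acting on row-major vectorized $2\times 2$ matrices, and $\psi$ is evaluated through its series, since $e^{\operatorname{ad}_X} e^{t\operatorname{ad}_Y}$ has eigenvalue 1 (the $\operatorname{ad}$ operators are singular) and the closed form $x\log x/(x-1)$ cannot be applied via a matrix inverse:

```python
import numpy as np
from scipy.linalg import expm, logm

def ad(A):
    """Matrix of ad_A acting on row-major-vectorized n x n matrices."""
    n = A.shape[0]
    I = np.eye(n)
    return np.kron(A, I) - np.kron(I, A.T)

def psi(A, terms=40):
    """psi(A) = I - sum_{n>=1} (I - A)^n / (n(n+1)), valid for ||I - A|| < 1."""
    I = np.eye(A.shape[0])
    B = I - A
    out = I.copy()
    P = I
    for n in range(1, terms + 1):
        P = P @ B
        out -= P / (n * (n + 1))
    return out

def bch_integral(X, Y, nodes=20):
    """Z = X + (int_0^1 psi(e^{ad_X} e^{t ad_Y}) dt) Y via Gauss-Legendre quadrature."""
    adX, adY = ad(X), ad(Y)
    t, w = np.polynomial.legendre.leggauss(nodes)
    t, w = 0.5 * (t + 1.0), 0.5 * w        # map nodes from [-1, 1] to [0, 1]
    acc = sum(wi * psi(expm(adX) @ expm(ti * adY)) for ti, wi in zip(t, w))
    n = X.shape[0]
    return X + (acc @ Y.flatten()).reshape(n, n)

X = np.array([[0.0, 0.05], [0.02, -0.01]])
Y = np.array([[0.03, 0.0], [0.04, -0.03]])
assert np.allclose(bch_integral(X, Y), logm(expm(X) @ expm(Y)), atol=1e-8)
```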

Matrix Lie group illustration
For a matrix Lie group $$G \subseteq \operatorname{GL}(n,\mathbb{R})$$ the Lie algebra is the tangent space at the identity $I$, and the commutator is simply $[X, Y] = XY - YX$; the exponential map is the standard exponential map of matrices, $$\exp X = e^X = \sum_{n=0}^\infty {\frac{X^n}{n!}}.$$

When one solves for Z in $$e^Z = e^X e^Y,$$ using the series expansions for $\exp$ and $\log$ one obtains a simpler formula: $$ Z = \sum_{n>0} \frac{(-1)^{n-1}}{n} \sum_{\stackrel{r_i+s_i > 0}{1 \le i \le n}} \frac{X^{r_1}Y^{s_1} \cdots X^{r_n}Y^{s_n}}{r_1!s_1!\cdots r_n!s_n!}, \quad \|X\| + \|Y\| < \log 2, \|Z\| < \log 2.$$ The first, second, third, and fourth order terms are:
 * $$z_1 = X + Y$$
 * $$z_2 = \frac{1}{2} (XY - YX)$$
 * $$z_3 = \frac{1}{12} \left(X^2Y + XY^2 - 2XYX + Y^2X + YX^2 - 2YXY\right)$$
 * $$z_4 = \frac{1}{24} \left(X^2Y^2 - 2XYXY - Y^2X^2 + 2YXYX \right).$$

The formulas for the various $$z_j$$'s are not the Baker–Campbell–Hausdorff formula. Rather, the Baker–Campbell–Hausdorff formula is one of various expressions for the $$z_j$$'s in terms of repeated commutators of $$X$$ and $$Y$$. The point is that it is far from obvious that it is possible to express each $$z_j$$ in terms of commutators. (The reader is invited, for example, to verify by direct computation that $$z_3$$ is expressible as a linear combination of the two nontrivial third-order commutators of $$X$$ and $$Y$$, namely $$[X,[X,Y]]$$ and $$[Y,[X,Y]]$$.) The general result that each $$z_j$$ is expressible as a combination of commutators was shown in an elegant, recursive way by Eichler.
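The invited verification can be carried out symbolically. A short sketch with SymPy's noncommutative symbols (an illustrative tool choice) confirms that $z_3$ equals $\frac{1}{12}[X,[X,Y]] + \frac{1}{12}[Y,[Y,X]]$, in agreement with the series quoted in the introduction:

```python
import sympy as sp

# Noncommuting symbols model the matrix (or free-algebra) setting.
X, Y = sp.symbols('X Y', commutative=False)

def comm(A, B):
    return A * B - B * A

# z3 as read off from the power-series expansion of log(e^X e^Y):
z3 = sp.Rational(1, 12) * (X**2*Y + X*Y**2 - 2*X*Y*X + Y**2*X + Y*X**2 - 2*Y*X*Y)

# The commutator combination claimed by the BCH formula:
bch3 = sp.Rational(1, 12) * (comm(X, comm(X, Y)) + comm(Y, comm(Y, X)))

assert sp.expand(z3 - bch3) == 0
```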

A consequence of the Baker–Campbell–Hausdorff formula is the following result about the trace: $$\operatorname{tr} \log \left(e^X e^Y \right) = \operatorname{tr} X + \operatorname{tr} Y.  $$ That is to say, since each $$z_j$$ with $$j\geq 2$$ is expressible as a linear combination of commutators, the trace of each such term is zero.
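This trace identity is easy to observe numerically (Python with NumPy/SciPy; random small matrices as an illustration):

```python
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(0)
X = 0.1 * rng.standard_normal((4, 4))
Y = 0.1 * rng.standard_normal((4, 4))

# tr log(e^X e^Y) = tr X + tr Y, since every z_j with j >= 2 is traceless.
Z = logm(expm(X) @ expm(Y))
assert np.isclose(np.trace(Z), np.trace(X) + np.trace(Y))
```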

Questions of convergence
Suppose $$X$$ and $$Y$$ are the following matrices in the Lie algebra $$\mathfrak{sl}(2;\mathbb C)$$ (the space of $$2\times 2$$ matrices with trace zero): $$X=\begin{pmatrix}0&i\pi\\ i\pi&0\end{pmatrix};\quad Y=\begin{pmatrix}0&1\\ 0&0\end{pmatrix}.$$ Then $$e^X e^Y = \begin{pmatrix}-1&0\\ 0&-1\end{pmatrix}\begin{pmatrix}1&1\\ 0&1\end{pmatrix}=\begin{pmatrix}-1&-1\\ 0&-1\end{pmatrix}.$$ It is then not hard to show that there does not exist a matrix $$Z$$ in $$\mathfrak{sl}(2;\mathbb C)$$ with $$e^X e^Y = e^Z$$. (Similar examples may be found in the article of Wei.)
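The matrix computation in this example can be reproduced directly (Python with NumPy/SciPy):

```python
import numpy as np
from scipy.linalg import expm

# X = i*pi*sigma_x and the nilpotent Y from the example above.
X = np.array([[0, 1j * np.pi], [1j * np.pi, 0]])
Y = np.array([[0.0, 1.0], [0.0, 0.0]])

# sigma_x squares to the identity, so e^X = cos(pi) I + i sin(pi) sigma_x = -I:
assert np.allclose(expm(X), -np.eye(2))

# Hence e^X e^Y is exactly the matrix computed in the text:
assert np.allclose(expm(X) @ expm(Y), [[-1, -1], [0, -1]])
```

The obstruction itself is algebraic rather than numerical: any traceless $2\times 2$ matrix $Z$ whose exponential has both eigenvalues equal to $-1$ satisfies $e^Z = -I$, which is not the matrix above.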

This simple example illustrates that the various versions of the Baker–Campbell–Hausdorff formula, which give expressions for $Z = \log\left(e^X e^Y\right)$ in terms of iterated Lie brackets of $X$ and $Y$, describe formal power series whose convergence is not guaranteed. Thus, if one wants $Z$ to be an actual element of the Lie algebra containing $X$ and $Y$ (as opposed to a formal power series), one has to assume that $X$ and $Y$ are small. Thus, the conclusion that the product operation on a Lie group is determined by the Lie algebra is only a local statement. Indeed, the result cannot be global, because globally one can have nonisomorphic Lie groups with isomorphic Lie algebras.

Concretely, if working with a matrix Lie algebra and $$\|\cdot\|$$ is a given submultiplicative matrix norm, convergence is guaranteed if $$\|X\| + \|Y\| < \frac{\ln 2} 2.$$

Special cases
If $$X$$ and $$Y$$ commute, that is $$[X, Y]=0$$, the Baker–Campbell–Hausdorff formula reduces to $$e^X e^Y = e^{X+Y}$$.

Another case assumes that $$[X,Y]$$ commutes with both $$X$$ and $$Y$$, as for the nilpotent Heisenberg group. Then the formula reduces to its first three terms.

Theorem: If $$X$$ and $$Y$$ commute with their commutator, $$[X,[X,Y]] = [Y,[X,Y]] = 0$$, then $$e^X e^Y = e^{X+Y+\frac{1}{2}[X,Y]}$$.

This is the degenerate case used routinely in quantum mechanics, as illustrated below, and is sometimes known as the disentangling theorem. In this case, there are no smallness restrictions on $$X$$ and $$Y$$. This result is behind the "exponentiated commutation relations" that enter into the Stone–von Neumann theorem. A simple proof of this identity is given below.
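The theorem can be verified exactly for the Heisenberg algebra of strictly upper-triangular $3\times 3$ matrices, with deliberately large entries to illustrate the absence of smallness restrictions (a Python/NumPy/SciPy sketch with arbitrary values):

```python
import numpy as np
from scipy.linalg import expm

# Strictly upper-triangular generators: [X, Y] lands in the center,
# so [X, [X, Y]] = [Y, [X, Y]] = 0.
X = np.array([[0, 2.0, 0], [0, 0, 0], [0, 0, 0]])
Y = np.array([[0, 0, 0], [0, 0, 3.0], [0, 0, 0]])

C = X @ Y - Y @ X
assert np.allclose(X @ C - C @ X, 0) and np.allclose(Y @ C - C @ Y, 0)

# e^X e^Y = e^{X + Y + [X,Y]/2}, with no smallness assumption:
assert np.allclose(expm(X) @ expm(Y), expm(X + Y + 0.5 * C))
```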

Another useful form of the general formula emphasizes expansion in terms of Y and uses the adjoint mapping notation $$\operatorname{ad}_X(Y)=[X,Y]$$: $$\log(\exp X\exp Y) = X + \frac{\operatorname{ad}_X}{1-e^{-\operatorname{ad}_X}} ~ Y + O\left(Y^2\right) = X + \operatorname{ad}_{X/2} (1 + \coth \operatorname{ad}_{X/2}) ~ Y + O\left(Y^2\right) ,$$ which is evident from the integral formula above. (The coefficients of the nested commutators with a single $$Y$$ are normalized Bernoulli numbers.)

Now assume that the commutator is a multiple of $$Y$$, so that $$[X,Y] = sY$$. Then all iterated commutators will be multiples of $$Y$$, and no quadratic or higher terms in $$Y$$ appear. Thus, the $$O\left(Y^2\right)$$ term above vanishes and we obtain:

Theorem: If $$[X,Y] = sY$$, where $$s$$ is a complex number with $$s \neq 2\pi i n$$ for all nonzero integers $$n$$, then we have $$e^X e^Y = \exp\left( X+\frac{s}{1-e^{-s}}Y\right).$$

Again, in this case there are no smallness restrictions on $$X$$ and $$Y$$. The restriction on $$s$$ guarantees that the expression on the right side makes sense. (When $$s = 0$$ we may interpret $\lim_{s\to 0} s/(1-e^{-s}) = 1$ .) We also obtain a simple "braiding identity": $$e^{X} e^{Y} = e^{\exp (s) Y} e^{X} ,$$ which may be written as an adjoint dilation: $$e^{X} e^{Y} e^{-X} = e^{\exp (s) \, Y} .$$
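Both identities can be checked on a concrete pair with $[X,Y] = sY$ (a Python/NumPy/SciPy sketch; the value of $s$ and the matrices are arbitrary illustrations):

```python
import numpy as np
from scipy.linalg import expm

s = 0.7
X = np.array([[s, 0.0], [0.0, 0.0]])
Y = np.array([[0.0, 1.0], [0.0, 0.0]])

# [X, Y] = s Y for this pair:
assert np.allclose(X @ Y - Y @ X, s * Y)

# e^X e^Y = exp(X + s/(1 - e^{-s}) Y):
assert np.allclose(expm(X) @ expm(Y), expm(X + s / (1 - np.exp(-s)) * Y))

# Braiding identity: e^X e^Y e^{-X} = e^{exp(s) Y}:
assert np.allclose(expm(X) @ expm(Y) @ expm(-X), expm(np.exp(s) * Y))
```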

Existence results
If $$X$$ and $$Y$$ are matrices, one can compute $$Z := \log \left(e^X e^Y\right)$$ using the power series for the exponential and logarithm, with convergence of the series if $$X$$ and $$Y$$ are sufficiently small. It is natural to collect together all terms where the total degree in $$X$$ and $$Y$$ equals a fixed number $$k$$, giving an expression $$z_k$$. (See the section "Matrix Lie group illustration" above for formulas for the first several $$z_k$$'s.) A remarkably direct and concise, recursive proof that each $$z_k$$ is expressible in terms of repeated commutators of $$X$$ and $$Y$$ was given by Martin Eichler.

Alternatively, we can give an existence argument as follows. The Baker–Campbell–Hausdorff formula implies that if $X$ and $Y$ are in some Lie algebra $$\mathfrak g,$$ defined over any field of characteristic 0 like $$\Reals$$ or $$\Complex$$, then $$Z = \log(\exp(X) \exp(Y)),$$ can formally be written as an infinite sum of elements of $$\mathfrak g$$. [This infinite series may or may not converge, so it need not define an actual element $Z$ in $$\mathfrak g$$.] For many applications, the mere assurance of the existence of this formal expression is sufficient, and an explicit expression for this infinite sum is not needed. This is for instance the case in the Lorentzian construction of a Lie group representation from a Lie algebra representation. Existence can be seen as follows.

We consider the ring $$S = \mathbb{R}\langle\langle X,Y\rangle\rangle$$ of all non-commuting formal power series with real coefficients in the non-commuting variables $X$ and $Y$. There is a ring homomorphism from $S$ to the tensor product of $S$ with $S$ over $\mathbb{R}$, $$\Delta \colon S \to S \otimes S,$$ called the coproduct, such that $$\Delta(X) = X \otimes 1 + 1 \otimes X$$ and $$\Delta(Y) = Y \otimes 1 + 1 \otimes Y.$$ (The definition of Δ is extended to the other elements of S by requiring R-linearity, multiplicativity and infinite additivity.)

One can then verify the following properties:
 * The map $\exp$, defined by its standard Taylor series, is a bijection between the set of elements of $S$ with constant term 0 and the set of elements of $S$ with constant term 1; the inverse of $\exp$ is $\log$.
 * $$r = \exp(s)$$ is grouplike (this means $$\Delta(r) = r\otimes r$$) if and only if s is primitive (this means $$\Delta(s) = s\otimes 1 + 1\otimes s$$).
 * The grouplike elements form a group under multiplication.
 * The primitive elements are exactly the formal infinite sums of elements of the Lie algebra generated by X and Y, where the Lie bracket is given by the commutator $$[U,V] = UV - VU$$. (Friedrichs' theorem)

The existence of the Campbell–Baker–Hausdorff formula can now be seen as follows: The elements X and Y are primitive, so $$\exp(X)$$ and $$\exp(Y)$$ are grouplike; so their product $$\exp(X)\exp(Y)$$ is also grouplike; so its logarithm $$\log(\exp(X)\exp(Y))$$ is primitive; and hence can be written as an infinite sum of elements of the Lie algebra generated by $X$ and $Y$.

The universal enveloping algebra of the free Lie algebra generated by $X$ and $Y$ is isomorphic to the algebra of all non-commuting polynomials in $X$ and $Y$. In common with all universal enveloping algebras, it has a natural structure of a Hopf algebra, with a coproduct $\Delta$. The ring $S$ used above is just a completion of this Hopf algebra.

Zassenhaus formula
A related combinatoric expansion that is useful in dual applications is $$e^{t(X+Y)} = e^{tX}~ e^{tY} ~e^{-\frac{t^2}{2} [X,Y]} ~ e^{\frac{t^3}{6}(2[Y,[X,Y]]+ [X,[X,Y]] )} ~ e^{\frac{-t^4}{24}([[[X,Y],X],X] + 3[[[X,Y],X],Y] + 3[[[X,Y],Y],Y]) } \cdots$$ where the exponents of higher order in $t$ are likewise nested commutators, i.e., homogeneous Lie polynomials. These exponents, $C_n$ in $\exp(-tX)\exp(t(X+Y)) = \prod_n \exp(t^n C_n)$, follow recursively by application of the above BCH expansion.
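For generators whose commutator is central, every Zassenhaus exponent beyond the $t^2$ factor involves a bracket with $[X,Y]$ and vanishes, so the expansion terminates and can be checked exactly (a Python/NumPy/SciPy sketch with illustrative Heisenberg-type matrices):

```python
import numpy as np
from scipy.linalg import expm

# Heisenberg-type generators: [X, Y] is central, so the t^3, t^4, ...
# Zassenhaus exponents all vanish and the product terminates.
X = np.array([[0, 1.0, 0], [0, 0, 0], [0, 0, 0]])
Y = np.array([[0, 0, 0], [0, 0, 1.0], [0, 0, 0]])
C = X @ Y - Y @ X

t = 0.9
lhs = expm(t * (X + Y))
rhs = expm(t * X) @ expm(t * Y) @ expm(-(t**2) / 2 * C)
assert np.allclose(lhs, rhs)
```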

As a corollary of this, the Suzuki–Trotter decomposition follows.

The identity (Campbell 1897)
Let $G$ be a matrix Lie group and $\mathfrak g$ its corresponding Lie algebra. Let $\operatorname{ad}_X$ be the linear operator on $\mathfrak g$ defined by $\operatorname{ad}_X Y = [X,Y] = XY - YX$ for some fixed $X \in \mathfrak g$. (The adjoint endomorphism encountered above.) Denote with $\operatorname{Ad}_A$ for fixed $A \in G$ the linear transformation of $\mathfrak g$ given by $\operatorname{Ad}_A Y = AYA^{-1}$.

A standard combinatorial lemma which is utilized in producing the above explicit expansions is given by $$ \operatorname{Ad}_{e^X} = e^{\operatorname{ad}_X}, $$ so, explicitly, $$ \operatorname{Ad}_{e^X}Y = e^{X}Y e^{-X} = e^{\operatorname{ad} _X} Y =Y+\left[X,Y\right]+\frac{1}{2!}[X,[X,Y]]+\frac{1}{3!}[X,[X,[X,Y]]]+\cdots.$$ This is a particularly useful formula which is commonly used to conduct unitary transforms in quantum mechanics. By defining the iterated commutator, $$[(X)^n,Y] \equiv \underbrace{[X,\dotsb[X,[X}_{n \text { times }}, Y]] \dotsb],\quad [(X)^0,Y] \equiv Y,$$ we can write this formula more compactly as, $$e^X Y e^{-X} = \sum_{n=0}^{\infty} \frac{[(X)^n,Y]}{n!}.$$
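The lemma is easy to verify numerically by summing the iterated commutators (Python with NumPy/SciPy; random matrices as an illustration):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
X = 0.3 * rng.standard_normal((3, 3))
Y = 0.3 * rng.standard_normal((3, 3))

# Left-hand side: Ad_{e^X} Y = e^X Y e^{-X}.
lhs = expm(X) @ Y @ expm(-X)

# Right-hand side: e^{ad_X} Y, accumulating term = [(X)^n, Y] / n!.
rhs = np.zeros_like(Y)
term = Y.copy()
for n in range(1, 31):
    rhs += term
    term = (X @ term - term @ X) / n

assert np.allclose(lhs, rhs)
```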

This formula can be proved by evaluation of the derivative with respect to $s$ of $f(s)Y \equiv e^{sX} Y e^{-sX}$, solution of the resulting differential equation and evaluation at $s = 1$, $$\frac{d}{ds}f(s)Y = \frac{d}{ds} \left (e^{sX}Ye^{-sX} \right ) = X e^{sX} Y e^{-sX} - e^{sX}Y e^{-sX}X = \operatorname{ad}_X (e^{sX}Ye^{-sX})$$ or $$ f'(s) = \operatorname{ad}_Xf(s), \qquad f(0) = 1 \qquad \Longrightarrow \qquad f(s) = e^{s \operatorname{ad}_X}.$$

An application of the identity
For $[X, Y]$ central, i.e., commuting with both $X$ and $Y$, $$e^{sX} Y e^{-sX} = Y + s [ X, Y ] ~. $$ Consequently, for $g(s) \equiv e^{sX} e^{sY}$, it follows that $$\frac{dg}{ds} = \Bigl( X+ e^{sX} Y e^{-sX}\Bigr) g(s) = (X + Y + s [ X, Y ]) ~g(s) ~, $$ whose solution is $$g(s)= e^{s(X+Y) +\frac{s^2}{2} [ X, Y ] } ~. $$ Taking $$s=1$$ gives one of the special cases of the Baker–Campbell–Hausdorff formula described above: $$e^X e^Y= e^{X+Y +\frac{1}{2} [ X, Y] } ~. $$

More generally, for non-central $[X, Y]$, the following braiding identity further follows readily, $$e^{X} e^{Y} = e^{(Y+\left[X,Y\right]+\frac{1}{2!}[X,[X,Y]]+\frac{1}{3!}[X,[X,[X,Y]]]+\cdots)} ~e^X.$$

Infinitesimal case
A particularly useful variant of the above is the infinitesimal form. This is commonly written as $$e^{-X} de^X= dX-\frac{1}{2!}\left[X,dX\right]+\frac{1}{3!}[X,[X,dX]]-\frac{1}{4!}[X,[X,[X,dX]]]+\cdots$$ This variation is commonly used to write coordinates and vielbeins as pullbacks of the metric on a Lie group.

For example, writing $$X=X^ie_i$$ for some functions $$X^i$$ and a basis $$e_i$$ for the Lie algebra, one readily computes that $$e^{-X}d e^X= dX^i e_i-\frac{1}{2!} X^i dX^j {f_{ij}}^k e_k + \frac{1}{3!} X^iX^j dX^k {f_{jk}}^l {f_{il}}^m e_m - \cdots ,$$ for $$[e_i,e_j] = {f_{ij}}^k e_k$$ the structure constants of the Lie algebra.

The series can be written more compactly (cf. main article) as $$e^{-X}d e^X = e_i{W^i}_j dX^j,$$ with the infinite series $$W = \sum_{n=0}^\infty \frac{(-1)^n M^n}{(n+1)!} = (I-e^{-M}) M^{-1}.$$ Here, $M$ is a matrix whose matrix elements are $${M_j}^k = X^i {f_{ij}}^k$$.

The usefulness of this expression comes from the fact that the matrix $W$ is a vielbein. Thus, given some map $$N \to G$$ from some manifold $N$ to some manifold $G$, the metric tensor on the manifold $N$ can be written as the pullback of the metric tensor $$B_{mn}$$ on the Lie group $G$, $$g_{ij} = {W_i}^m {W_j}^n B_{mn}.$$ The metric tensor $$B_{mn}$$ on the Lie group is the Cartan metric, the Killing form. For $N$ a (pseudo-)Riemannian manifold, the metric is a (pseudo-)Riemannian metric.

Application in quantum mechanics
A special case of the Baker–Campbell–Hausdorff formula is useful in quantum mechanics and especially quantum optics, where $X$ and $Y$ are Hilbert space operators generating the Heisenberg Lie algebra. Specifically, the position and momentum operators in quantum mechanics, usually denoted $$X$$ and $$P$$, satisfy the canonical commutation relation: $$[X,P] = i\hbar I$$ where $$I$$ is the identity operator. It follows that $$X$$ and $$P$$ commute with their commutator. Thus, if we formally applied a special case of the Baker–Campbell–Hausdorff formula (even though $$X$$ and $$P$$ are unbounded operators and not matrices), we would conclude that $$e^{iaX} e^{ibP} = e^{i \left(aX + bP - \frac{ab\hbar}{2}\right)}.$$ This "exponentiated commutation relation" does indeed hold, and forms the basis of the Stone–von Neumann theorem.

A related application is the annihilation and creation operators, $\hat a$ and $\hat a^\dagger$. Their commutator $[\hat a,\hat a^\dagger] = I$ is central, that is, it commutes with both $\hat a$ and $\hat a^\dagger$. As indicated above, the expansion then collapses to the semi-trivial degenerate form: $$ e^{v\hat{a}^\dagger - v^*\hat{a}} = e^{v\hat{a}^\dagger} e^{-v^*\hat{a}} e^{-|v|^{2}/2} ,$$ where $v$ is just a complex number.

This example illustrates the resolution of the displacement operator, $e^{v\hat{a}^\dagger - v^*\hat{a}}$, into exponentials of annihilation and creation operators and scalars.

This degenerate Baker–Campbell–Hausdorff formula then displays the product of two displacement operators as another displacement operator (up to a phase factor), with the resultant displacement equal to the sum of the two displacements, $$ e^{v\hat{a}^\dagger - v^*\hat{a}} e^{u\hat{a}^\dagger - u^*\hat{a}} = e^{(v+u)\hat{a}^\dagger -(v^*+u^*)\hat{a}} e^{(vu^*-uv^*)/2},$$ since the Heisenberg group they provide a representation of is nilpotent. The degenerate Baker–Campbell–Hausdorff formula is frequently used in quantum field theory as well.
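The composition law can be checked numerically in a truncated Fock space; for small displacements the truncation error acting on the vacuum is negligible (a Python/NumPy/SciPy sketch; the dimension and amplitudes are arbitrary illustrative choices):

```python
import numpy as np
from scipy.linalg import expm

N = 40  # truncated Fock space; adequate for small displacement amplitudes
a = np.diag(np.sqrt(np.arange(1, N)), 1)   # annihilation operator
adag = a.conj().T                          # creation operator

def D(v):
    """Displacement operator exp(v a† - v* a) on the truncated space."""
    return expm(v * adag - np.conj(v) * a)

v, u = 0.3 + 0.1j, -0.2 + 0.25j
vac = np.zeros(N)
vac[0] = 1.0

# D(v) D(u) = e^{(v u* - u v*)/2} D(v + u), tested on the vacuum state:
lhs = D(v) @ D(u) @ vac
rhs = np.exp((v * np.conj(u) - u * np.conj(v)) / 2) * (D(v + u) @ vac)
assert np.allclose(lhs, rhs, atol=1e-8)
```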