Photon polarization

Photon polarization is the quantum mechanical description of the classical polarized sinusoidal plane electromagnetic wave. An individual photon can be described as having right or left circular polarization, or a superposition of the two. Equivalently, a photon can be described as having horizontal or vertical linear polarization, or a superposition of the two.

The description of photon polarization contains many of the physical concepts and much of the mathematical machinery of more involved quantum descriptions, such as the quantum mechanics of an electron in a potential well. Polarization is an example of a qubit degree of freedom, which forms a fundamental basis for an understanding of more complicated quantum phenomena. Much of the mathematical machinery of quantum mechanics, such as state vectors, probability amplitudes, unitary operators, and Hermitian operators, emerges naturally from the classical Maxwell's equations in the description. The quantum polarization state vector for the photon, for instance, is identical with the Jones vector, usually used to describe the polarization of a classical wave. Unitary operators emerge from the classical requirement of the conservation of energy of a classical wave propagating through lossless media that alter the polarization state of the wave. Hermitian operators then follow for infinitesimal transformations of a classical polarization state.

Many of the implications of the mathematical machinery are easily verified experimentally. In fact, many of the experiments can be performed with polaroid sunglass lenses.

The connection with quantum mechanics is made through the identification of a minimum packet size, called a photon, for energy in the electromagnetic field. The identification is based on the theories of Planck and the interpretation of those theories by Einstein. The correspondence principle then allows the identification of momentum and angular momentum (called spin), as well as energy, with the photon.

Linear polarization


The wave is linearly polarized (or plane polarized) when the phase angles $$ \alpha_x \, ,\; \alpha_y $$ are equal, $$ \alpha_x = \alpha_y \ \stackrel{\mathrm{def}}{=}\ \alpha. $$

This represents a wave with phase $$\alpha$$ polarized at an angle $$ \theta $$ with respect to the x axis. In this case the Jones vector $$ |\psi\rangle = \begin{pmatrix} \cos\theta \exp \left ( i \alpha_x \right ) \\ \sin\theta \exp \left ( i \alpha_y \right ) \end{pmatrix} $$ can be written with a single phase: $$ |\psi\rangle = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} \exp \left ( i \alpha \right ) .$$

The state vectors for linear polarization in x or y are special cases of this state vector.

If unit vectors are defined such that $$ |x\rangle \ \stackrel{\mathrm{def}}{=}\ \begin{pmatrix} 1 \\ 0 \end{pmatrix} $$ and $$ |y\rangle \ \stackrel{\mathrm{def}}{=}\ \begin{pmatrix} 0 \\ 1 \end{pmatrix} $$ then the linearly polarized polarization state can be written in the "x–y basis" as $$ |\psi\rangle = \cos\theta \exp \left ( i \alpha \right ) |x\rangle + \sin\theta \exp \left ( i \alpha \right ) |y\rangle = \psi_x |x\rangle + \psi_y |y\rangle. $$
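As a quick numerical check of the x–y decomposition above, the following minimal Python sketch builds the linearly polarized Jones vector and confirms that it is normalized. The names `linear_jones` and `norm_sq` and the chosen angle and phase are illustrative assumptions, not part of any library or of the source.

```python
import cmath
import math

def linear_jones(theta, alpha=0.0):
    """Return (psi_x, psi_y) for linear polarization at angle theta with common phase alpha."""
    phase = cmath.exp(1j * alpha)
    return (math.cos(theta) * phase, math.sin(theta) * phase)

def norm_sq(psi):
    """Return <psi|psi> = |psi_x|^2 + |psi_y|^2."""
    return sum(abs(c) ** 2 for c in psi)

psi = linear_jones(math.pi / 6, alpha=0.4)   # assumed example: 30 degrees from the x axis
x_state = linear_jones(0.0)                  # special case |x>
y_state = linear_jones(math.pi / 2)          # special case |y>, up to floating-point rounding

print(norm_sq(psi))
print(x_state, y_state)
```

The ratio `psi[1] / psi[0]` equals tan θ regardless of `alpha`, which reflects the fact that the common phase does not change the direction of polarization.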

Circular polarization
If the phase angles $$\alpha_x$$ and $$\alpha_y$$ differ by exactly $$\pi / 2$$ and the x amplitude equals the y amplitude, the wave is circularly polarized. The Jones vector then becomes $$ |\psi\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ \pm i \end{pmatrix} \exp \left ( i \alpha_x \right ) $$ where, in the convention of the unit vectors $$|\mathrm{R}\rangle$$ and $$|\mathrm{L}\rangle$$ defined below, the plus sign indicates right circular polarization and the minus sign indicates left circular polarization. In the case of circular polarization, the electric field vector of constant magnitude rotates in the x–y plane.

If unit vectors are defined such that $$ |\mathrm{R}\rangle \ \stackrel{\mathrm{def}}{=}\ {1 \over \sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix} $$ and $$ |\mathrm{L}\rangle \ \stackrel{\mathrm{def}}{=}\ {1 \over \sqrt{2}} \begin{pmatrix} 1 \\ -i \end{pmatrix} $$ then an arbitrary polarization state can be written in the "R–L basis" as $$ |\psi\rangle = \psi_{\rm R} |\mathrm{R}\rangle + \psi_{\rm L} |\mathrm{L}\rangle $$ where $$\psi_{\rm R} = \langle \mathrm{R}|\psi\rangle = \frac{1}{\sqrt{2}} \left(\cos\theta \exp(i\alpha_x) - i\sin\theta \exp(i\alpha_y)\right)$$ and $$\psi_{\rm L} = \langle \mathrm{L}|\psi\rangle = \frac{1}{\sqrt{2}} \left(\cos\theta\exp(i\alpha_x) + i\sin\theta\exp(i\alpha_y)\right).$$

We can see that $$ 1 = |\psi_{\rm R}|^2 + |\psi_{\rm L}|^2. $$
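The change of basis can be checked numerically. The sketch below, which assumes arbitrary example values for θ, α_x and α_y, projects a general Jones vector onto $$|\mathrm{R}\rangle$$ and $$|\mathrm{L}\rangle$$ and confirms that the two squared magnitudes sum to one. The helper names `inner` and `jones` are illustrative.

```python
import cmath
import math

S2 = 1 / math.sqrt(2)
R = (complex(S2), 1j * S2)    # |R> = (1, i) / sqrt(2)
L = (complex(S2), -1j * S2)   # |L> = (1, -i) / sqrt(2)

def inner(a, b):
    """Return <a|b>; the first argument is complex conjugated."""
    return sum(ai.conjugate() * bi for ai, bi in zip(a, b))

def jones(theta, alpha_x, alpha_y):
    """General (elliptical) Jones vector in the x-y basis."""
    return (math.cos(theta) * cmath.exp(1j * alpha_x),
            math.sin(theta) * cmath.exp(1j * alpha_y))

psi = jones(0.7, 0.2, 1.1)    # assumed example values
psi_R = inner(R, psi)
psi_L = inner(L, psi)
total = abs(psi_R) ** 2 + abs(psi_L) ** 2

# Rebuild the x-y components from the R-L amplitudes
rebuilt = tuple(psi_R * r + psi_L * l for r, l in zip(R, L))

print(psi_R, psi_L, total)
```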

Elliptical polarization
The general case in which the electric field rotates in the x–y plane and has variable magnitude is called elliptical polarization. The state vector is given by $$ |\psi\rangle \ \stackrel{\mathrm{def}}{=}\ \begin{pmatrix} \psi_x \\ \psi_y \end{pmatrix} = \begin{pmatrix} \cos\theta \exp \left ( i \alpha_x \right ) \\ \sin\theta \exp \left ( i \alpha_y \right ) \end{pmatrix}. $$

Geometric visualization of an arbitrary polarization state
To get an understanding of what a polarization state looks like, one can observe the orbit that is traced out if the polarization state is multiplied by a phase factor of $$e^{i\omega t}$$ and the real parts of its components are then interpreted as x and y coordinates respectively. That is: $$\begin{pmatrix}x(t)\\y(t)\end{pmatrix} = \begin{pmatrix}\Re(e^{i\omega t}\psi_x)\\ \Re(e^{i\omega t}\psi_y)\end{pmatrix} = \Re\left[e^{i\omega t}\begin{pmatrix}\psi_x\\ \psi_y\end{pmatrix}\right] = \Re\left(e^{i\omega t}|\psi\rangle\right).$$

If only the traced-out shape and the direction of rotation of $$(x(t), y(t))$$ are considered when interpreting the polarization state, i.e. only $$M(|\psi\rangle) = \left.\left\{\Big( x(t),\,y(t) \Big)\,\right|\,\forall\,t \right\}$$ (where $$x(t)$$ and $$y(t)$$ are defined as above) and whether it is overall more right circularly or left circularly polarized (i.e. whether $$|\psi_{\rm R}| > |\psi_{\rm L}|$$ or vice versa), it can be seen that the physical interpretation will be the same even if the state is multiplied by an arbitrary phase factor, since $$M(e^{i\alpha}|\psi\rangle) = M(|\psi\rangle),\ \alpha\in\mathbb{R}$$ and the direction of rotation will remain the same. In other words, there is no physical difference between two polarization states $$|\psi\rangle$$ and $$e^{i\alpha}|\psi\rangle$$ that differ only by a phase factor.

It can be seen that for a linearly polarized state, M will be a line in the xy plane of length 2, centered at the origin, with slope $$\tan\theta$$. For a circularly polarized state, M will be a circle of radius $$1/\sqrt{2}$$ centered at the origin.
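The two special cases can be reproduced by sampling the orbit over one period. This is a minimal sketch with an assumed sample count and an assumed example angle; the function name `orbit` is illustrative.

```python
import cmath
import math

def orbit(psi, samples=360):
    """Sample (x(t), y(t)) = Re(exp(i*omega*t) * psi) over one full period."""
    points = []
    for k in range(samples):
        rot = cmath.exp(1j * 2 * math.pi * k / samples)
        points.append(((rot * psi[0]).real, (rot * psi[1]).real))
    return points

theta = math.pi / 6                                               # assumed example angle
linear = (complex(math.cos(theta)), complex(math.sin(theta)))
circular = (complex(1 / math.sqrt(2)), 1j / math.sqrt(2))

linear_pts = orbit(linear)
circular_pts = orbit(circular)

# Farthest point from the origin on the line; the full length is twice this
line_half_length = max(math.hypot(x, y) for x, y in linear_pts)
# Distance from the origin of every sampled point on the circle
radii = [math.hypot(x, y) for x, y in circular_pts]

print(2 * line_half_length, min(radii), max(radii))
```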

Energy in a plane wave
The energy per unit volume in classical electromagnetic fields is, in cgs units (and also in Planck units), $$ \mathcal{E}_c = \frac{1}{8\pi} \left [ \mathbf{E}^2( \mathbf{r}, t ) + \mathbf{B}^2( \mathbf{r} , t ) \right ] .$$

For a plane wave, this becomes: $$ \mathcal{E}_c = \frac{\mid \mathbf{E} \mid^2}{8\pi} $$ where the energy has been averaged over a wavelength of the wave.

Fraction of energy in each component
The fraction of energy in the x component of the plane wave is $$ f_x = \frac{ | \mathbf{E} |^2 \cos^2\theta }{ \vert \mathbf{E} \vert^2 } = \psi_x^*\psi_x = \cos^2 \theta $$ with a similar expression for the y component resulting in $$f_y=\sin^2\theta$$.

The fraction in both components is $$ \psi_x^*\psi_x + \psi_y^*\psi_y = \langle \psi | \psi\rangle = 1. $$

Momentum density of classical electromagnetic waves
The momentum density is given by the Poynting vector $$ \boldsymbol { \mathcal{P}} = {1 \over 4\pi c } \mathbf{E}( \mathbf{r}, t ) \times \mathbf{B}( \mathbf{r}, t ). $$

For a sinusoidal plane wave traveling in the z direction, the momentum is in the z direction and is related to the energy density: $$ \mathcal{P}_z c = \mathcal{E}_c. $$

The momentum density has been averaged over a wavelength.

Angular momentum density of classical electromagnetic waves
Electromagnetic waves can have both orbital and spin angular momentum. The total angular momentum density is $$ \boldsymbol { \mathcal{L} } = \mathbf{r} \times \boldsymbol { \mathcal{P} } = {1 \over 4\pi c } \mathbf{r} \times \left [ \mathbf{E}( \mathbf{r}, t ) \times \mathbf{B}( \mathbf{r}, t ) \right ]. $$

For a sinusoidal plane wave propagating along the $$z$$ axis, the orbital angular momentum density vanishes. The spin angular momentum density is in the $$z$$ direction and is given by $$ \mathcal{L} = { {\vert \mathbf{E} \vert^2} \over {8\pi\omega} } \left ( \left\vert \langle \mathrm{R} | \psi\rangle \right\vert^2 - \left\vert \langle \mathrm{L} | \psi\rangle \right\vert^2 \right ) = \frac{ 1 }{ \omega } \mathcal{E}_c \left ( \vert \psi_{\rm R} \vert^2 - \vert \psi_{\rm L} \vert^2 \right ) $$ where again the density is averaged over a wavelength.

Passage of a classical wave through a polaroid filter


A linear filter transmits one component of a plane wave and absorbs the perpendicular component. In that case, if the filter is polarized in the x direction, the fraction of energy passing through the filter is $$ f_x = \psi_x^*\psi_x = \cos^2\theta.\, $$
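The transmitted fraction is easy to tabulate. The sketch below uses an x-oriented filter by default, as in the text; the optional `filter_angle` argument is an assumed generalization (project onto the filter axis, then square), and the function name is illustrative.

```python
import math

def transmitted_fraction(theta, filter_angle=0.0):
    """Fraction of energy of a wave linearly polarized at theta that passes an ideal linear filter.

    filter_angle = 0 is the x-oriented filter of the text; other angles are an
    assumed generalization obtained by projecting onto the filter axis.
    """
    psi = (math.cos(theta), math.sin(theta))
    axis = (math.cos(filter_angle), math.sin(filter_angle))
    amplitude = axis[0] * psi[0] + axis[1] * psi[1]
    return amplitude ** 2

for degrees in (0, 30, 45, 60, 90):
    print(degrees, transmitted_fraction(math.radians(degrees)))
```

Adding the fractions for an x-oriented and a y-oriented filter gives one for any θ, which is the statement that the two components together carry all of the energy.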

Example of energy conservation: Passage of a classical wave through a birefringent crystal
An ideal birefringent crystal transforms the polarization state of an electromagnetic wave without loss of wave energy. Birefringent crystals therefore provide an ideal test bed for examining the conservative transformation of polarization states. Even though this treatment is still purely classical, standard quantum tools such as unitary and Hermitian operators that evolve the state in time naturally emerge.

Initial and final states
A birefringent crystal is a material with an optic axis such that light polarized parallel to the axis has a different index of refraction than light polarized perpendicular to the axis. Light polarized parallel to the axis is said to consist of "extraordinary rays" or "extraordinary photons", while light polarized perpendicular to the axis is said to consist of "ordinary rays" or "ordinary photons". If a linearly polarized wave impinges on the crystal, the extraordinary component of the wave will emerge from the crystal with a different phase than the ordinary component. In mathematical language, if the incident wave is linearly polarized at an angle $$ \theta $$ with respect to the optic axis, the incident state vector can be written $$ |\psi\rangle = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} $$ and the state vector for the emerging wave can be written $$ |\psi '\rangle = \begin{pmatrix} \cos\theta \exp \left ( i \alpha_x \right ) \\ \sin\theta \exp \left ( i \alpha_y \right ) \end{pmatrix} = \begin{pmatrix} \exp \left ( i \alpha_x \right ) & 0 \\ 0 & \exp \left ( i \alpha_y \right ) \end{pmatrix} \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} \ \stackrel{\mathrm{def}}{=}\ \hat{U} |\psi\rangle. $$

While the initial state was linearly polarized, the final state is elliptically polarized. The birefringent crystal alters the character of the polarization.

Dual of the final state


The initial polarization state is transformed into the final state with the operator U. The dual of the final state is given by $$ \langle \psi '| = \langle \psi | \hat{U}^{\dagger} $$ where $$ U^{\dagger} $$ is the adjoint of U, the complex conjugate transpose of the matrix.

Unitary operators and energy conservation
The fraction of energy that emerges from the crystal is $$\langle\psi '| \psi '\rangle = \langle\psi |\hat{U}^{\dagger}\hat{U}|\psi\rangle = \langle \psi|\psi\rangle = 1.$$

In this ideal case, all the energy impinging on the crystal emerges from the crystal. An operator $$\hat{U}$$ with the property that $$\hat{U}^{\dagger}\hat{U} = I,$$ where I is the identity operator, is called a unitary operator. The unitary property is necessary to ensure energy conservation in state transformations.
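The unitarity condition can be verified directly for the diagonal crystal operator. The following sketch assumes a quarter-wave phase difference and a 45 degree input angle as example values; the helper names are illustrative. With these values the linearly polarized input emerges as the state $$(1, i)/\sqrt{2}$$.

```python
import cmath
import math

def crystal_operator(alpha_x, alpha_y):
    """Diagonal operator adding phase alpha_x to the x component and alpha_y to the y component."""
    return [[cmath.exp(1j * alpha_x), 0j], [0j, cmath.exp(1j * alpha_y)]]

def dagger(m):
    """Adjoint: complex conjugate transpose of a 2x2 matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def apply(m, v):
    """Apply a 2x2 matrix to a 2-component state."""
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]

theta = math.pi / 4                                            # assumed: 45 degrees to the optic axis
psi = [complex(math.cos(theta)), complex(math.sin(theta))]
U = crystal_operator(0.0, math.pi / 2)                         # assumed quarter-wave phase difference
psi_out = apply(U, psi)

identity_check = matmul(dagger(U), U)                          # should be the identity matrix
energy_out = sum(abs(c) ** 2 for c in psi_out)                 # <psi'|psi'>

print(identity_check, energy_out, psi_out)
```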

Hermitian operators and energy conservation


If the crystal is very thin, the final state will be only slightly different from the initial state. The unitary operator will be close to the identity operator. We can define the operator H by $$ \hat{U} \approx I + i\hat{H} $$ and the adjoint by $$ \hat{U}^{\dagger} \approx I - i\hat{H}^{\dagger}. $$

Energy conservation then requires

$$ I = \hat{U}^{\dagger} \hat{U} \approx \left ( I - i\hat{H}^{\dagger} \right ) \left ( I + i\hat{H} \right ) \approx I - i\hat{H}^{\dagger} + i\hat{H}. $$

This requires that $$ \hat{H} = \hat{H}^{\dagger}. $$

Operators like this that are equal to their adjoints are called Hermitian or self-adjoint.

The infinitesimal transition of the polarization state is $$ |\psi ' \rangle - |\psi\rangle = i\hat{H} |\psi\rangle. $$

Thus, energy conservation requires that infinitesimal transformations of a polarization state occur through the action of a Hermitian operator.
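For the diagonal crystal operator, the generator can be extracted numerically. The sketch below assumes two small example phase shifts, solves $$\hat{U} = I + i\hat{H}$$ for $$\hat{H}$$, and measures how far the result is from being Hermitian. The deviation is second order in the phases, consistent with the first-order argument above.

```python
import cmath

eps_x, eps_y = 1.0e-3, 2.5e-3   # assumed small phase shifts for a very thin crystal

U = [[cmath.exp(1j * eps_x), 0j], [0j, cmath.exp(1j * eps_y)]]
I = [[1 + 0j, 0j], [0j, 1 + 0j]]

# Solve U = I + iH for H, i.e. H = (U - I) / i
H = [[(U[i][j] - I[i][j]) / 1j for j in range(2)] for i in range(2)]
H_dagger = [[H[j][i].conjugate() for j in range(2)] for i in range(2)]

# Largest entry of H - H^dagger; of order eps**2, not eps
hermiticity_error = max(abs(H[i][j] - H_dagger[i][j]) for i in range(2) for j in range(2))

print(H[0][0], H[1][1], hermiticity_error)
```

To first order the diagonal entries of `H` are just the phase shifts themselves, so in this example the generator is the real diagonal matrix with entries `eps_x` and `eps_y`.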

Energy
The treatment to this point has been classical. It is a testament, however, to the generality of Maxwell's equations for electrodynamics that the treatment can be made quantum mechanical with only a reinterpretation of classical quantities. The reinterpretation is based on the theories of Max Planck and the interpretation by Albert Einstein of those theories and of other experiments.

Einstein's conclusion from early experiments on the photoelectric effect was that electromagnetic radiation is composed of irreducible packets of energy, known as photons. The energy of each packet is related to the angular frequency of the wave by the relation $$ \epsilon = \hbar \omega $$ where $$ \hbar $$ is an experimentally determined quantity known as the reduced Planck constant. If there are $$ N $$ photons in a box of volume $$ V $$, the energy in the electromagnetic field is $$ N \hbar \omega $$ and the energy density is $$ {N \hbar \omega \over V}. $$

The photon energy can be related to classical fields through the correspondence principle that states that for a large number of photons, the quantum and classical treatments must agree. Thus, for very large $$ N $$, the quantum energy density must be the same as the classical energy density $$ {N \hbar \omega \over V} = \mathcal{E}_c = \frac{\vert \mathbf{E} \vert^2}{8\pi}. $$

The number of photons in the box is then $$ N = \frac{V }{8\pi \hbar \omega}\vert \mathbf{E} \vert^2. $$

Momentum
The correspondence principle also determines the momentum and angular momentum of the photon. For momentum $$ \mathcal{P}_z = {N \hbar \omega \over cV} = {N \hbar k_z \over V} $$ where $$k_z$$ is the wave number. This implies that the momentum of a photon is $$ p_z = \hbar k_z .\, $$
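The photon number and momentum relations can be illustrated with numbers. The sketch below works in cgs units; the wavelength, field amplitude and box volume are assumed example inputs that do not come from the text.

```python
import math

HBAR = 1.054571817e-27   # reduced Planck constant in erg s (cgs)
C = 2.99792458e10        # speed of light in cm/s

# Assumed illustrative inputs
wavelength = 5.0e-5      # cm (500 nm)
field_amplitude = 1.0    # |E| in statvolt/cm
volume = 1.0             # cm^3

omega = 2 * math.pi * C / wavelength   # angular frequency
k_z = omega / C                        # wave number

classical_energy_density = field_amplitude ** 2 / (8 * math.pi)
photon_count = volume * field_amplitude ** 2 / (8 * math.pi * HBAR * omega)

photon_energy = HBAR * omega
photon_momentum = HBAR * k_z
momentum_density = photon_count * photon_momentum / volume

print(photon_count, photon_energy, photon_momentum)
```

Multiplying `momentum_density` by `C` recovers `classical_energy_density`, which is the classical relation between momentum density and energy density restated per photon.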

Angular momentum and spin
Similarly for the spin angular momentum $$ \mathcal{L} = \frac{ 1 }{ \omega } \mathcal{E}_c \left ( \vert \psi_{\rm R} \vert^2 - \vert \psi_{\rm L} \vert^2 \right ) = \frac{ N\hbar }{ V } \left ( \vert \psi_{\rm R} \vert^2 - \vert \psi_{\rm L} \vert^2 \right )$$ where $$\mathcal{E}_c$$ is the classical energy density. This implies that the spin angular momentum of the photon is $$ l_z = \hbar \left ( \vert \psi_{\rm R} \vert^2 - \vert \psi_{\rm L} \vert^2 \right ). $$ The quantum interpretation of this expression is that the photon has a probability of $$ \mid \psi_{\rm R} \mid^2 $$ of having a spin angular momentum of $$ \hbar $$ and a probability of $$ \mid \psi_{\rm L} \mid^2 $$ of having a spin angular momentum of $$ -\hbar $$. We can therefore think of the spin angular momentum of the photon as being quantized, as well as the energy. The angular momentum of classical light has been verified. A photon that is linearly polarized (plane polarized) is in a superposition of equal amounts of the left-handed and right-handed states.

Spin operator
The spin of the photon is defined as the coefficient of $$ \hbar $$ in the spin angular momentum calculation. A photon has spin 1 if it is in the $$ | R \rangle $$ state and −1 if it is in the $$ | L \rangle $$ state. The spin operator is defined as the outer product $$ \hat{S} \ \stackrel{\mathrm{def}}{=}\ |\mathrm{R}\rangle \langle \mathrm{R} | - |\mathrm{L}\rangle \langle \mathrm{L} | = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}. $$

The eigenvectors of the spin operator are $$ |\mathrm{R}\rangle $$ and $$ |\mathrm{L}\rangle $$ with eigenvalues 1 and −1, respectively.

The expected value of a spin measurement on a photon is then $$ \langle \psi |\hat{S} |\psi\rangle = \vert \psi_{\rm R} \vert^2 - \vert \psi_{\rm L} \vert^2. $$

An operator S has been associated with an observable quantity, the spin angular momentum. The eigenvalues of the operator are the allowed observable values. This has been demonstrated for spin angular momentum, but it is in general true for any observable quantity.
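The properties of the spin operator stated above can be checked with a few lines of Python. In this sketch the helper names and the example state are assumptions chosen for illustration.

```python
import cmath
import math

S2 = 1 / math.sqrt(2)
R = [complex(S2), 1j * S2]    # |R>
L = [complex(S2), -1j * S2]   # |L>

def outer(a, b):
    """|a><b| as a 2x2 matrix."""
    return [[a[i] * b[j].conjugate() for j in range(2)] for i in range(2)]

def apply(m, v):
    """Apply a 2x2 matrix to a 2-component state."""
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]

def inner(a, b):
    """<a|b>; the first argument is complex conjugated."""
    return sum(ai.conjugate() * bi for ai, bi in zip(a, b))

RR, LL = outer(R, R), outer(L, L)
S = [[RR[i][j] - LL[i][j] for j in range(2)] for i in range(2)]   # |R><R| - |L><L|

def expectation(op, psi):
    """<psi|op|psi> for a Hermitian operator (real by construction)."""
    return inner(psi, apply(op, psi)).real

x_state = [1 + 0j, 0j]                                             # linearly polarized along x
psi = [complex(math.cos(0.7)), math.sin(0.7) * cmath.exp(0.9j)]    # assumed example state

print(S)
print(expectation(S, x_state), expectation(S, psi))
```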

Spin states
We can write the circularly polarized states as $$ |s\rangle $$ where s = 1 for $$ |\mathrm{R}\rangle $$ and s = −1 for $$ |\mathrm{L}\rangle$$. An arbitrary state can be written $$ |\psi\rangle = \sum_{s=-1,1} a_s \exp \left ( i \alpha_x -i s \theta \right ) |s\rangle $$ where $$\alpha_x$$ is a phase angle, θ is the angle by which the frame of reference is rotated, and $$ \sum_{s=-1,1} \vert a_s \vert^2 = 1. $$

Spin and angular momentum operators in differential form
When the state is written in spin notation, the spin operator can be written $$ \hat{S}_d \rightarrow i { \partial \over \partial \theta} $$ $$ \hat{S}_d^{\dagger} \rightarrow -i { \partial \over \partial \theta}. $$

The eigenvectors of the differential spin operator are $$ \exp \left ( i \alpha_x -i s \theta \right ) |s\rangle. $$

To see this note $$ \hat{S}_d \exp \left ( i \alpha_x -i s \theta \right ) |s\rangle \rightarrow i { \partial \over \partial \theta} \exp \left ( i \alpha_x -i s \theta \right ) |s\rangle = s \left [ \exp \left ( i \alpha_x -i s \theta \right ) |s\rangle \right ]. $$

The spin angular momentum operator is $$ \hat{l}_z = \hbar \hat{S}_d. $$

Probability for a single photon
There are two ways in which probability can be applied to the behavior of photons: probability can be used to calculate the probable number of photons in a particular state, or probability can be used to calculate the likelihood of a single photon to be in a particular state. The former interpretation violates energy conservation. The latter interpretation is the viable, if nonintuitive, option. Dirac explains this in the context of the double-slit experiment: "Some time before the discovery of quantum mechanics people realized that the connection between light waves and photons must be of a statistical character. What they did not clearly realize, however, was that the wave function gives information about the probability of one photon being in a particular place and not the probable number of photons in that place. The importance of the distinction can be made clear in the following way. Suppose we have a beam of light consisting of a large number of photons split up into two components of equal intensity. On the assumption that the beam is connected with the probable number of photons in it, we should have half the total number going into each component. If the two components are now made to interfere, we should require a photon in one component to be able to interfere with one in the other. Sometimes these two photons would have to annihilate one another and other times they would have to produce four photons. This would contradict the conservation of energy. The new theory, which connects the wave function with probabilities for one photon gets over the difficulty by making each photon go partly into each of the two components. Each photon then interferes only with itself. Interference between two different photons never occurs." (Paul Dirac, The Principles of Quantum Mechanics, 1930, Chapter 1)

Probability amplitudes
The probability for a photon to be in a particular polarization state depends on the fields as calculated by the classical Maxwell's equations. The polarization state of the photon is proportional to the field. The probability itself is quadratic in the fields and consequently is also quadratic in the quantum state of polarization. In quantum mechanics, therefore, the state or probability amplitude contains the basic probability information. In general, the rules for combining probability amplitudes look very much like the classical rules for composition of probabilities: [The following quote is from Baym, Chapter 1]


 * 1) The probability amplitude for two successive possibilities is the product of amplitudes for the individual possibilities. For example, the amplitude for the x polarized photon to be right circularly polarized and for the right circularly polarized photon to pass through the y-polaroid is $$\langle R|x\rangle\langle y|R\rangle,$$ the product of the individual amplitudes.
 * 2) The amplitude for a process that can take place in one of several indistinguishable ways is the sum of amplitudes for each of the individual ways. For example, the total amplitude for the x polarized photon to pass through the y-polaroid is the sum of the amplitudes for it to pass as a right circularly polarized photon, $$\langle y|R\rangle\langle R|x\rangle,$$ plus the amplitude for it to pass as a left circularly polarized photon, $$\langle y|L\rangle\langle L|x\rangle\dots$$
 * 3) The total probability for the process to occur is the absolute value squared of the total amplitude calculated by 1 and 2.
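The three rules can be worked through for the example in the list, an x polarized photon meeting a y-polaroid. The sketch below uses the state vectors defined earlier in the article; the variable names are illustrative.

```python
import math

S2 = 1 / math.sqrt(2)
x = [1 + 0j, 0j]
y = [0j, 1 + 0j]
R = [complex(S2), 1j * S2]
L = [complex(S2), -1j * S2]

def inner(a, b):
    """<a|b>; the first argument is complex conjugated."""
    return sum(ai.conjugate() * bi for ai, bi in zip(a, b))

# Rule 1: the amplitude for successive steps is the product of the step amplitudes
via_R = inner(y, R) * inner(R, x)
via_L = inner(y, L) * inner(L, x)

# Rule 2: indistinguishable alternatives add at the amplitude level
total_amplitude = via_R + via_L

# Rule 3: the probability is the squared magnitude of the total amplitude
total_probability = abs(total_amplitude) ** 2

# For contrast: the probability if only the R path were available
single_path_probability = abs(via_R) ** 2

print(via_R, via_L, total_probability, single_path_probability)
```

The two path amplitudes are equal in magnitude and opposite in sign, so they cancel: an x polarized photon never passes a y-polaroid, even though each path taken alone would give a nonzero probability.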

Mathematical preparation
For any legal operators the following inequality, a consequence of the Cauchy–Schwarz inequality, is true. $$ \frac{1}{4} \left|\langle (\hat{A} \hat{B} - \hat{B} \hat{A} )x | x \rangle\right|^2\leq \left\| \hat{A} x \right\|^2 \left\| \hat{B} x \right\|^2.$$

If B A ψ and A B ψ are defined, then by subtracting the means and re-inserting in the above formula, we deduce $$ \Delta_{\psi} \hat{A} \, \Delta_{\psi} \hat{B} \ge \frac{1}{2} \left|\left\langle\left[{\hat{A}},{\hat{B}}\right]\right\rangle_\psi\right| $$ where $$\left\langle \hat{X} \right\rangle_\psi = \left\langle \psi \right| \hat{X} \left| \psi \right\rangle$$ is the operator mean of observable X in the system state ψ and $$\Delta_{\psi} \hat{X} = \sqrt{\langle {\hat{X}}^2\rangle_\psi - \langle {\hat{X}}\rangle_\psi ^2}.$$

Here $$ \left[{\hat{A}},{\hat{B}}\right] \ \stackrel{\mathrm{def}}{=}\ \hat{A} \hat{B} - \hat{B} \hat{A} $$ is called the commutator of A and B.

This is a purely mathematical result. No reference has been made to any physical quantity or principle. It simply states that the uncertainty of one operator times the uncertainty of another operator has a lower bound.
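Because the inequality is purely mathematical, it can be tested on any pair of Hermitian matrices. The sketch below is an illustration under assumptions: it takes the spin operator from earlier in the article as A, and as B an assumed second observable $$|x\rangle\langle x| - |y\rangle\langle y|$$ that does not appear in the text. It then checks the bound for a few example states.

```python
import cmath
import math

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]

def inner(a, b):
    return sum(ai.conjugate() * bi for ai, bi in zip(a, b))

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mean(op, psi):
    """Operator mean <psi|op|psi>."""
    return inner(psi, apply(op, psi))

def spread(op, psi):
    """Delta_psi(op) = sqrt(<op^2> - <op>^2) for a Hermitian op."""
    variance = mean(matmul(op, op), psi).real - mean(op, psi).real ** 2
    return math.sqrt(max(variance, 0.0))   # clamp tiny negative rounding errors

A = [[0j, -1j], [1j, 0j]]            # spin operator |R><R| - |L><L|
B = [[1 + 0j, 0j], [0j, -1 + 0j]]    # assumed observable |x><x| - |y><y|

AB, BA = matmul(A, B), matmul(B, A)
commutator = [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

def bound_holds(psi, tol=1e-12):
    """Check Delta A * Delta B >= |<[A, B]>| / 2, allowing for rounding."""
    lhs = spread(A, psi) * spread(B, psi)
    rhs = 0.5 * abs(mean(commutator, psi))
    return lhs + tol >= rhs

states = [
    [1 + 0j, 0j],                                                 # |x>
    [complex(math.cos(0.3)), math.sin(0.3) * cmath.exp(0.8j)],    # assumed elliptical state
    [complex(1 / math.sqrt(2)), 1j / math.sqrt(2)],               # |R>
]
results = [bound_holds(psi) for psi in states]
print(results)
```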

Application to angular momentum
The connection to physics can be made if we identify the operators with physical operators such as the angular momentum and the polarization angle. We have then $$ \Delta_{\psi} \hat{L}_z \, \Delta_{\psi} {\theta} \ge \frac{\hbar}{2}, $$ which means that angular momentum and the polarization angle cannot be measured simultaneously with infinite accuracy. (The polarization angle can be measured by checking whether the photon can pass through a polarizing filter oriented at a particular angle, or a polarizing beam splitter. This results in a yes/no answer that, if the photon was plane-polarized at some other angle, depends on the difference between the two angles.)

States, probability amplitudes, unitary and Hermitian operators, and eigenvectors
Much of the mathematical apparatus of quantum mechanics appears in the classical description of a polarized sinusoidal electromagnetic wave. The Jones vector for a classical wave, for instance, is identical with the quantum polarization state vector for a photon. The right and left circular components of the Jones vector can be interpreted as probability amplitudes of spin states of the photon. Energy conservation requires that the states be transformed with a unitary operation. This implies that infinitesimal transformations are generated by a Hermitian operator. These conclusions are a natural consequence of the structure of Maxwell's equations for classical waves.

Quantum mechanics enters the picture when observed quantities are measured and found to be discrete rather than continuous. The allowed observable values are determined by the eigenvalues of the operators associated with the observable. In the case of angular momentum, for instance, the allowed observable values are the eigenvalues of the spin operator.

These concepts have emerged naturally from Maxwell's equations and Planck's and Einstein's theories. They have been found to be true for many other physical systems. In fact, the typical program is to assume the concepts of this section and then to infer the unknown dynamics of a physical system. This was done, for instance, with the dynamics of electrons. In that case, working back from the principles in this section, the quantum dynamics of particles were inferred, leading to Schrödinger's equation, a departure from Newtonian mechanics. The solution of this equation for atoms led to the explanation of the Balmer series for atomic spectra and consequently formed a basis for all of atomic physics and chemistry.

This is not the only occasion in which Maxwell's equations have forced a restructuring of Newtonian mechanics. Maxwell's equations are relativistically consistent. Special relativity resulted from attempts to make classical mechanics consistent with Maxwell's equations (see, for example, Moving magnet and conductor problem).