Gravitational lensing formalism

In general relativity, a point mass deflects a light ray with impact parameter $$b~$$ by an angle approximately equal to


 * $$\hat{\alpha} = \frac{4GM}{c^2b}$$

where $$G$$ is the gravitational constant, $$M$$ the mass of the deflecting object and $$c$$ the speed of light. This approximation is good when $$4GM/c^2b$$ is small. A naive application of Newtonian gravity, in which the light ray is treated as a massive particle scattered by the gravitational potential well, yields exactly half this value.
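As a quick numerical illustration of the point-mass formula, the sketch below evaluates the deflection for a ray grazing the Sun. The constants are rounded reference values chosen for this example, and the function name `point_mass_deflection` is hypothetical. The result should come out near the familiar 1.75 arcseconds, with the Newtonian estimate at half of that.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8          # speed of light, m/s (rounded)
M_SUN = 1.989e30     # solar mass, kg (rounded)
R_SUN = 6.957e8      # solar radius, m, used here as the impact parameter b

def point_mass_deflection(mass_kg, impact_parameter_m):
    """Deflection angle in radians, 4GM / (c^2 b)."""
    return 4.0 * G * mass_kg / (C**2 * impact_parameter_m)

alpha_rad = point_mass_deflection(M_SUN, R_SUN)
alpha_arcsec = math.degrees(alpha_rad) * 3600.0
newtonian_arcsec = alpha_arcsec / 2.0   # the naive Newtonian value is half the GR one
print(f"GR: {alpha_arcsec:.2f} arcsec, Newtonian: {newtonian_arcsec:.2f} arcsec")
```

The small size of the angle (of order $$10^{-5}$$ radians) is what justifies the small-deflection approximation in this case.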

In situations where general relativity can be approximated by linearized gravity, the deflection due to a spatially extended mass can be written simply as a vector sum over point masses. In the continuum limit, this becomes an integral over the density $$\rho~$$, and if the deflection is small we can approximate the gravitational potential along the deflected trajectory by the potential along the undeflected trajectory, as in the Born approximation in quantum mechanics. The deflection is then


 * $$\vec{\hat{\alpha}}(\vec{\xi})=\frac{4 G}{c^2} \int d^2\xi^{\prime} \int dz \, \rho(\vec{\xi}^{\prime},z) \frac{\vec{b}}{|\vec{b}|^2}, ~ \vec{b} \equiv \vec{\xi}  - \vec{\xi}^{\prime}$$

where $$z$$ is the line-of-sight coordinate, and $$ \vec{b} $$ is the vector impact parameter of the actual ray path from the infinitesimal mass $$ d^2\xi^{\prime}  dz\rho(\vec{\xi}^{\prime},z) $$ located at the coordinates $$(\vec{\xi}^{\prime}, z)$$.

Thin lens approximation
In the limit of a "thin lens", where the distances between the source, lens, and observer are much larger than the size of the lens (this is almost always true for astronomical objects), we can define the projected mass density


 * $$\Sigma(\vec{\xi}^\prime)=\int \rho(\vec{\xi}^\prime,z) dz $$

where $$\vec{\xi}^\prime$$ is a vector in the plane of the sky. The deflection angle is then


 * $$\vec{\hat{\alpha}}(\vec{\xi})= \frac{4 G}{c^2} \int \frac{(\vec{\xi}-\vec{\xi}^{\prime})\Sigma(\vec{\xi}^{\prime})}{|\vec{\xi}-\vec{\xi}^{\prime}|^2}d^2 \xi^{\prime}$$
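For a lens made of discrete point masses, the integral over the projected density reduces to a vector sum. The following is a minimal sketch of that sum, assuming masses in kilograms and lens-plane positions in metres; the function name `deflection` and the input layout are illustrative choices, not part of the formalism.

```python
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8     # speed of light, m/s (rounded)

def deflection(xi, masses):
    """Deflection vector (radians) for a ray crossing the lens plane at xi = (x, y).

    masses is a list of (mass_kg, (x_m, y_m)) tuples, each treated as a
    point mass projected onto the thin-lens plane.
    """
    ax = ay = 0.0
    for m, (x, y) in masses:
        bx, by = xi[0] - x, xi[1] - y      # vector impact parameter for this mass
        b2 = bx * bx + by * by
        ax += 4.0 * G * m / C**2 * bx / b2
        ay += 4.0 * G * m / C**2 * by / b2
    return ax, ay

# A single mass reproduces 4GM/(c^2 b); two equal masses placed symmetrically
# about the ray produce deflections that cancel.
single = deflection((1.0e9, 0.0), [(2.0e30, (0.0, 0.0))])
pair = deflection((0.0, 0.0), [(2.0e30, (1.0e9, 0.0)), (2.0e30, (-1.0e9, 0.0))])
```

The sum diverges if the ray passes exactly through a point mass, so a practical implementation would need to exclude or soften that case.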



As shown in the diagram on the right, the difference between the unlensed angular position $$\vec{\beta}$$ and the observed position $$\vec{\theta}$$ is this deflection angle, reduced by a ratio of distances, described as the lens equation


 * $$\vec{\beta}=\vec{\theta}-\vec{\alpha}(\vec{\theta}) = \vec{\theta} - \frac{D_{ds}}{D_s} \vec{\hat{\alpha}}(D_d\vec{\theta})$$

where $$D_{ds}~$$ is the distance from the lens to the source, $$D_s~$$ is the distance from the observer to the source, and $$D_d~$$ is the distance from the observer to the lens. For extragalactic lenses, these must be angular diameter distances.

In strong gravitational lensing, this equation can have multiple solutions, because a single source at $$\vec{\beta}$$ can be lensed into multiple images.

Convergence and deflection potential
The reduced deflection angle $$\vec{\alpha}(\vec{\theta})$$ can be written as



 * $$\vec{\alpha}(\vec{\theta}) = \frac{1}{\pi}\int d^2 \theta^{\prime} \frac{(\vec{\theta}-\vec{\theta}^{\prime})\kappa(\vec{\theta}^{\prime})}{|\vec{\theta}-\vec{\theta}^{\prime}|^2} $$

where we define the convergence



 * $$\kappa(\vec{\theta}) = \frac{\Sigma(\vec{\theta})}{\Sigma_{cr}} $$

and the critical surface density (not to be confused with the critical density of the universe)



 * $$\Sigma_{cr} = \frac{c^2 D_s}{4\pi G D_{ds}D_d} $$
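To give a feel for the scale of the critical surface density, the sketch below evaluates it for one purely hypothetical configuration (lens at 1 Gpc, source at 2 Gpc, lens-source distance 1.2 Gpc, all taken as angular diameter distances). The distances, constants and function names are assumptions made for illustration only.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m/s (rounded)
GPC = 3.0857e25    # metres per gigaparsec (rounded)

def critical_surface_density(d_d, d_s, d_ds):
    """Sigma_cr in kg/m^2 for angular diameter distances given in metres."""
    return C**2 * d_s / (4.0 * math.pi * G * d_ds * d_d)

def convergence(sigma, d_d, d_s, d_ds):
    """Dimensionless convergence kappa = Sigma / Sigma_cr."""
    return sigma / critical_surface_density(d_d, d_s, d_ds)

# Hypothetical configuration, for illustration only.
sigma_cr = critical_surface_density(1.0 * GPC, 2.0 * GPC, 1.2 * GPC)
print(f"Sigma_cr = {sigma_cr:.2f} kg/m^2")
```

For this configuration the value is a few kilograms per square metre, which illustrates how dilute a projected mass distribution can be and still have a convergence of order unity.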

We can also define the deflection potential



 * $$\psi(\vec{\theta}) = \frac{1}{\pi}\int d^2 \theta^{\prime} \kappa(\vec{\theta}^{\prime}) \ln |\vec{\theta}-\vec{\theta}^{\prime}| $$

such that the scaled deflection angle is just the gradient of the potential and the convergence is half the Laplacian of the potential:



 * $$\vec{\theta}-\vec{\beta} =\vec{\alpha}(\vec{\theta}) = \vec{\nabla} \psi(\vec{\theta}) $$


 * $$\kappa(\vec{\theta}) = \frac{1}{2} \nabla^2 \psi(\vec{\theta}) $$

The deflection potential can also be written as a scaled projection of the Newtonian gravitational potential $$\Phi~$$ of the lens



 * $$\psi(\vec{\theta}) = \frac{2 D_{ds}}{D_d D_s c^2} \int \Phi(D_d\vec{\theta},z) dz $$

Lensing Jacobian
The Jacobian between the unlensed and lensed coordinate systems is


 * $$A_{ij}=\frac{\partial \beta_i}{\partial \theta_j}=\delta_{ij} - \frac{\partial \alpha_i}{\partial \theta_j} = \delta_{ij} - \frac{\partial^2 \psi}{\partial \theta_i \partial \theta_j}$$

where $$\delta_{ij}~$$ is the Kronecker delta. Because the matrix of second derivatives must be symmetric, the Jacobian can be decomposed into a diagonal term involving the convergence and a trace-free term involving the shear $$\gamma~$$


 * $$A=(1-\kappa)\left[\begin{array}{ c c } 1 & 0 \\ 0 & 1 \end{array}\right]-\gamma\left[\begin{array}{ c c } \cos 2\phi & \sin 2\phi \\ \sin 2\phi & -\cos 2\phi \end{array}\right]$$

where $$\phi~$$ is the angle between $$\vec{\alpha}$$ and the x-axis. The term involving the convergence magnifies the image by increasing its size while conserving surface brightness. The term involving the shear stretches the image tangentially around the lens, as discussed in weak lensing observables.

The shear defined here is not equivalent to the shear traditionally defined in mathematics, though both stretch an image non-uniformly.
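The decomposition of the Jacobian into convergence and shear can be checked numerically. The sketch below builds the matrix from given $$\kappa$$, $$\gamma$$ and $$\phi$$ and compares its determinant with $$(1-\kappa)^2-\gamma^2$$; the helper names and the input values are illustrative assumptions.

```python
import math

def lensing_jacobian(kappa, gamma, phi):
    """A = (1 - kappa) I - gamma [[cos 2phi, sin 2phi], [sin 2phi, -cos 2phi]]."""
    c2, s2 = math.cos(2.0 * phi), math.sin(2.0 * phi)
    return [[1.0 - kappa - gamma * c2, -gamma * s2],
            [-gamma * s2, 1.0 - kappa + gamma * c2]]

def det2(a):
    """Determinant of a 2x2 matrix stored as nested lists."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

kappa, gamma, phi = 0.1, 0.05, math.radians(30.0)   # arbitrary example values
A = lensing_jacobian(kappa, gamma, phi)
det_A = det2(A)
det_expected = (1.0 - kappa)**2 - gamma**2
```

The determinant does not depend on $$\phi$$, reflecting the fact that the shear term only sets the orientation of the stretching, while the eigenvalues of the matrix are $$1-\kappa\mp\gamma$$.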



Fermat surface
There is an alternative way of deriving the lens equation, starting from the photon arrival time (Fermat surface)



 * $$t = \int_{0}^{z_s}   { n \, dz \over c \cos \alpha(z) } $$

where $$ dz/c $$ is the time to travel an infinitesimal line element along the source-observer straight line in vacuum, which is then corrected by the factor



 * $$1/\cos(\alpha(z)) \approx 1 + {\alpha(z)^2 \over 2} $$

to get the line element along the bent path $$ dl = {dz \over \cos \alpha(z) } $$ with a varying small pitch angle $$ \alpha(z) $$, and by the refraction index $$n$$ of the "aether", i.e., the gravitational field. The refraction index can be obtained from the fact that a photon travels on a null geodesic of a weakly perturbed static Minkowski universe



 * $$ds^2 = 0 = c^2 dt^2 \left(1 + {2 \Phi \over c^2} \right) -  \left(1 + {2 \Phi \over c^2} \right)^{-1} dl^2 $$

where the uneven gravitational potential $$ |\Phi| \ll c^2 $$ drives a change in the speed of light


 * $$c' = {dl/dt} = \left(1 + {2 \Phi \over c^2} \right) c. $$

So the refraction index is


 * $$n \equiv {c \over c'} \approx \left(1 -  {2 \Phi \over c^2} \right). $$

The refraction index is greater than unity because the gravitational potential $$ \Phi $$ is negative.

Putting these together and keeping the leading terms, we obtain the time arrival surface


 * $$t \approx \int_0^{z_s} {dz \over c} +  \int_0^{z_s} {dz \over c} {\alpha(z)^2 \over 2} -  \int_0^{z_s} {dz \over c} {2 \Phi \over c^2}. $$

The first term is the straight path travel time, the second term is the extra geometric path, and the third is the gravitational delay. Make the triangle approximation that $$ \alpha(z) = \theta - \beta $$ for the path between the observer and the lens, and $$ \alpha(z) \approx (\theta - \beta) {D_d \over D_{ds}} $$ for the path between the lens and the source. The geometric delay term becomes



 * $${D_d \over c}  { (\vec{\theta} - \vec{\beta})^2 \over 2} + {D_{ds} \over c} { \left[ (\vec{\theta} - \vec{\beta} )  {D_d \over D_{ds}} \right]^2 \over 2} = {D_d D_s \over D_{ds}c }   { ( \vec{\theta} - \vec{\beta} )^2 \over 2}. $$

The last equality uses $$ D_s = D_d + D_{ds} $$, so that $$ D_d + D_d^2/D_{ds} = D_d D_s/D_{ds} $$. This holds for Euclidean distances; angular diameter distances do not add in this simple way in general, so for cosmological lenses this step is only heuristic. The Fermat surface then becomes



 * $$t = \mathrm{constant} + {D_d D_s \over D_{ds} c} \tau, ~ \tau \equiv \left[ { (\vec{\theta}-\vec{\beta})^2 \over 2} -  \psi \right] $$

where $$ \tau $$ is the so-called dimensionless time delay, and the 2D lensing potential is



 * $$\psi(\vec{\theta}) = \frac{2 D_{ds}}{D_d D_s c^2} \int \Phi(D_d\vec{\theta},z) dz. $$

The images lie at the extrema of this surface, so the gradient of $$ \tau $$ with respect to $$ \vec{\theta} $$ is zero,


 * $$0 = \nabla_{\vec{\theta}} \tau =  \vec{\theta} - \vec{\beta}  - \nabla_{\vec{\theta}} \psi(\vec{\theta}) $$

which is the lens equation. Take the solution of Poisson's equation for the 3D potential


 * $$\Phi(\vec{\xi}) = - G \int  \frac{d^3\xi^{\prime} \rho(\vec{\xi}^{\prime})}{|\vec{\xi}-\vec{\xi}^{\prime}|} $$

and we find the 2D lensing potential


 * $$\psi(\vec{\theta})  = - \frac{2 G D_{ds}}{D_d D_s c^2}   \int dz  \int  \frac{d^3\xi^{\prime} \rho(\vec{\xi}^{\prime})}{|\vec{\xi}-\vec{\xi}^{\prime}|} = -  \sum_i \frac{2 G M_i D_{is} }{D_s D_i c^2}   \left\{ \left[  \sinh^{-1}  { |z -D_i| \over D_i |\vec{\theta}-\vec{\theta}_i |  }  \right ]_{z=D_i}^{z=D_s}  + \left[  \sinh^{-1}  { |z -D_i| \over D_i |\vec{\theta}-\vec{\theta}_i |  }  \right ]_{z=D_i}^{z=0} \right\}. $$

Here we assumed the lens is a collection of point masses $$ M_i $$ at angular coordinates $$ \vec{\theta}_i $$ and distances $$ z=D_i $$. Using $$ \sinh^{-1} (1/x) = \ln(1/x + \sqrt{1/x^2+1}) \approx -\ln(x/2) $$ for very small $$x$$, we find



 * $$\psi(\vec{\theta})  \approx   \sum_i  \frac{4 GM_i D_{is} }{D_s D_i c^2}   \left[   \ln\left( { |\vec{\theta}-\vec{\theta}_i | \over 2}  { D_i \over D_{is} } \right)     \right]. $$

One can compute the convergence by applying the 2D Laplacian of the 2D lensing potential



 * $$\kappa(\vec{\theta}) = \frac{1}{2} \nabla_{\vec{\theta}}^2 \psi(\vec{\theta}) = \frac{4\pi G D_{ds}D_d} {c^2 D_s} \int dz \rho( D_d \vec{\theta},z) = {\Sigma \over \Sigma_{cr} } = \sum_i { 4\pi G M_i D_{is} \over c^2 D_i D_s} \delta(\vec{\theta}-\vec{\theta}_i) $$

in agreement with the earlier definition $$ \kappa(\vec{\theta}) = {\Sigma \over \Sigma_{cr} }$$ as the ratio of the projected density to the critical density. Here we used $$ \nabla^2 (1/r) = - 4 \pi \delta(r) $$ and $$ \nabla_{\vec{\theta}} = D_d \nabla. $$

We can also confirm the previously defined reduced deflection angle



 * $$\vec{\theta} -\vec{\beta}  =  \nabla_{\vec{\theta}} \psi(\vec{\theta}) =  \sum_i   \theta_{Ei}^2 { \vec{\theta}-\vec{\theta}_i \over |\vec{\theta}-\vec{\theta}_i |^2}, ~ \pi \theta_{Ei}^2 \equiv {4 \pi GM_i D_{is}  \over c^2 D_s  D_i } $$

where $$ \theta_{Ei} $$ is the so-called Einstein angular radius of a point lens $$ M_i $$. For a single point lens at the origin we recover the standard result that there are two images, at the two solutions of the essentially quadratic equation


 * $$ \vec{\theta} -\vec{\beta}  =   \theta_{E}^2 { \vec{\theta} \over |\vec{\theta} |^2}.  $$
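For a single point lens the two images can be found in closed form and checked against the Fermat principle. The sketch below works in one dimension along the lens-source axis, with $$\theta$$ a signed coordinate and angles in units of the Einstein radius. It solves the quadratic and then confirms by finite differences that both images are stationary points of the dimensionless time delay $$\tau$$, using the point-lens potential $$\psi = \theta_E^2 \ln|\theta|$$ with additive constants dropped. The function names and sample values are hypothetical.

```python
import math

def point_lens_images(beta, theta_e):
    """Signed image positions solving theta - beta = theta_e^2 / theta."""
    root = math.sqrt(beta**2 + 4.0 * theta_e**2)
    return (beta + root) / 2.0, (beta - root) / 2.0

def tau(theta, beta, theta_e):
    """Dimensionless time delay for a point lens, up to an additive constant."""
    return 0.5 * (theta - beta)**2 - theta_e**2 * math.log(abs(theta))

def dtau(theta, beta, theta_e, h=1e-6):
    """Central finite-difference derivative of tau with respect to theta."""
    return (tau(theta + h, beta, theta_e) - tau(theta - h, beta, theta_e)) / (2.0 * h)

beta, theta_e = 0.5, 1.0
theta_plus, theta_minus = point_lens_images(beta, theta_e)
slopes = (dtau(theta_plus, beta, theta_e), dtau(theta_minus, beta, theta_e))
```

One image lies outside the Einstein radius on the same side as the source and the other lies inside it on the opposite side; the product of the two signed positions is $$-\theta_E^2$$.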

The amplification matrix can be obtained by double derivatives of the dimensionless time delay



 * $$A_{ij} = {\partial \beta_j \over \partial \theta_i} = {\partial^2 \tau \over \partial \theta_i \partial \theta_j } = \delta_{ij} - {\partial^2 \psi \over \partial \theta_i \partial \theta_j } = \left[\begin{array}{ c c } 1-\kappa -\gamma_1 & -\gamma_2 \\ -\gamma_2 & 1-\kappa +\gamma_1 \end{array}\right] $$

where we have defined the derivatives


 * $$ \kappa = {\partial^2 \psi \over 2 \partial \theta_1 \partial \theta_1 } + {\partial^2 \psi \over 2\partial \theta_2 \partial \theta_2 } , ~ \gamma_1 \equiv {\partial^2 \psi \over 2 \partial \theta_1 \partial \theta_1 } -  {\partial^2 \psi \over 2\partial \theta_2 \partial \theta_2 } , ~ \gamma_2 \equiv {\partial^2 \psi \over \partial \theta_1 \partial \theta_2 }    $$

which take the meaning of convergence and shear. The amplification is the inverse of the determinant of the Jacobian


 * $$  A = 1/\det(A_{ij}) = {1 \over (1-\kappa)^2 -\gamma_1^2 -\gamma_2^2}  $$

where a positive $$ A $$ means either a maximum or a minimum, and a negative $$ A $$ means a saddle point in the arrival surface.

For a single point lens, one can show (albeit through a lengthy calculation) that


 * $$ \kappa =0, ~ \gamma = \sqrt{\gamma_1^2 + \gamma_2^2} = {\theta_E^2 \over |\theta|^2}, ~ \theta_E^2= {4GM D_{ds} \over c^2 D_dD_s}. $$
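The point-lens result can be checked without the lengthy algebra by differentiating the potential numerically. The sketch below assumes the point-lens potential $$\psi = \theta_E^2 \ln|\vec{\theta}|$$ (additive constants dropped), applies central finite differences at an arbitrary position away from the lens, and forms $$\kappa$$, $$\gamma_1$$ and $$\gamma_2$$ from the second derivatives. The step size and the evaluation point are arbitrary choices.

```python
import math

THETA_E = 1.0   # Einstein radius, used as the unit of angle

def psi(x, y):
    """Point-lens deflection potential theta_E^2 ln|theta|, constants dropped."""
    return THETA_E**2 * 0.5 * math.log(x * x + y * y)

def second_derivatives(f, x, y, h=1e-4):
    """Central finite-difference estimates of f_xx, f_yy and f_xy."""
    fxx = (f(x + h, y) - 2.0 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2.0 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h**2)
    return fxx, fyy, fxy

def kappa_gamma(x, y):
    """Convergence and shear magnitude from the second derivatives of psi."""
    fxx, fyy, fxy = second_derivatives(psi, x, y)
    kappa = 0.5 * (fxx + fyy)
    gamma1 = 0.5 * (fxx - fyy)
    gamma2 = fxy
    return kappa, math.hypot(gamma1, gamma2)

x, y = 1.2, 0.7
kappa, gamma = kappa_gamma(x, y)
gamma_expected = THETA_E**2 / (x * x + y * y)
```

Away from the lens position the convergence vanishes to within the finite-difference error, and the shear magnitude falls off as the inverse square of the angular distance.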

So the amplification of a point lens is given by



 * $$A = \left( 1 - {\theta_E^4 \over \theta^4} \right)^{-1}. $$

Note that $$A$$ diverges for images at the Einstein radius $$ \theta_E. $$
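As an illustration, the sketch below evaluates this amplification at both image positions of a point lens and adds the absolute values, which is the quantity relevant when the two images are unresolved. It assumes a source at angular offset $$\beta > 0$$ from the lens and the quadratic solutions for the image positions; the function names are hypothetical.

```python
import math

def image_amplification(theta, theta_e=1.0):
    """Signed amplification of a point-lens image at signed position theta."""
    return 1.0 / (1.0 - (theta_e / theta)**4)

def total_amplification(beta, theta_e=1.0):
    """Sum of |A| over the two images of a source at angular offset beta > 0."""
    root = math.sqrt(beta**2 + 4.0 * theta_e**2)
    images = ((beta + root) / 2.0, (beta - root) / 2.0)
    return sum(abs(image_amplification(t, theta_e)) for t in images)

total_at_einstein_radius = total_amplification(1.0)   # source offset equal to theta_E
```

The image outside the Einstein radius has positive amplification and the one inside has negative amplification, consistent with a minimum and a saddle point of the arrival surface. For a source offset equal to $$\theta_E$$ the total comes to $$3/\sqrt{5} \approx 1.34$$, and it grows without bound as $$\beta \to 0$$.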

When there are multiple point lenses plus a smooth background of (dark) particles of surface density $$\Sigma_{\rm cr} \kappa_{\rm smooth}$$, the lensing potential entering the time arrival surface is


 * $$\psi(\vec{\theta})  \approx   {1 \over 2} \kappa_{\rm smooth} |\theta|^2 +  \sum_i  \theta_E^2  \left[   \ln\left( { |\vec{\theta}-\vec{\theta}_i | \over 2}  { D_d \over D_{ds} } \right)     \right]. $$

To compute the amplification, e.g., at the origin (0,0), due to identical point masses distributed at $$ (\theta_{xi},\theta_{yi} ) $$, we have to add up the total shear and include the convergence of the smooth background,


 * $$A = \left[ (1 - \kappa_{\rm smooth})^2 - \left( \sum_i { (\theta_{xi}^2 - \theta_{yi}^2 ) \theta_E^2 \over (\theta_{xi}^2 + \theta_{yi}^2)^2 }\right) ^2 - \left( \sum_i { (2 \theta_{xi} \theta_{yi}) \theta_E^2 \over (\theta_{xi}^2 + \theta_{yi}^2)^2 }  \right)^2   \right] ^{-1} $$

This generally creates a network of critical curves, lines connecting image points of infinite amplification.
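The amplification at the origin from several identical point masses plus a smooth sheet can be evaluated directly from the expression above. The following is a minimal sketch with a hypothetical function name, taking lens positions in the same angular units as $$\theta_E$$.

```python
def amplification_at_origin(lenses, theta_e, kappa_smooth=0.0):
    """Amplification at (0, 0) from identical point masses plus a smooth sheet.

    lenses is a list of (theta_x, theta_y) positions of the point masses.
    """
    gamma1 = gamma2 = 0.0
    for tx, ty in lenses:
        r4 = (tx * tx + ty * ty)**2
        gamma1 += (tx * tx - ty * ty) * theta_e**2 / r4   # first shear sum
        gamma2 += 2.0 * tx * ty * theta_e**2 / r4         # second shear sum
    return 1.0 / ((1.0 - kappa_smooth)**2 - gamma1**2 - gamma2**2)

one_lens = amplification_at_origin([(2.0, 0.0)], theta_e=1.0)
crossed = amplification_at_origin([(2.0, 0.0), (0.0, 2.0)], theta_e=1.0,
                                  kappa_smooth=0.1)
```

A single lens at distance $$\theta$$ reproduces the point-lens result $$(1-\theta_E^4/\theta^4)^{-1}$$. Two equal masses at the same distance along perpendicular directions produce shears that cancel at the origin, leaving only the contribution of the smooth convergence. Scanning the evaluation point over a grid and locating sign changes of the bracket is one way to trace the critical curves.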

General weak lensing
In weak lensing by large-scale structure, the thin-lens approximation may break down, and low-density extended structures may not be well approximated by multiple thin-lens planes. In this case, the deflection can be derived by instead assuming that the gravitational potential is slowly varying everywhere (for this reason, this approximation is not valid for strong lensing). This approach assumes the universe is well described by a Newtonian-perturbed FRW metric, but it makes no other assumptions about the distribution of the lensing mass.

As in the thin-lens case, the effect can be written as a mapping from the unlensed angular position $$\vec{\beta}$$ to the lensed position $$\vec{\theta}$$. The Jacobian of the transform can be written as an integral over the gravitational potential $$\Phi~$$ along the line of sight



 * $$\frac{\partial \beta_i}{\partial \theta_j} = \delta_{ij} + \int_0^{r_\infty} dr  g(r) \frac{\partial^2  \Phi(\vec{x}(r))}{\partial x^i \partial x^j} $$

where $$r~$$ is the comoving distance, $$x^i~$$ are the transverse distances, and



 * $$g(r) = 2 r \int^{r_\infty}_r dr' \left(1-\frac{r^\prime}{r}\right)W(r^\prime) $$

is the lensing kernel, which defines the efficiency of lensing for a distribution of sources $$W(r)~$$.

The Jacobian $$A_{ij}~$$ can be decomposed into convergence and shear terms just as with the thin-lens case, and in the limit of a lens that is both thin and weak, their physical interpretations are the same.

Weak lensing observables
In weak gravitational lensing, the Jacobian is mapped out by observing the effect of the shear on the ellipticities of background galaxies. This effect is purely statistical; the shape of any galaxy will be dominated by its random, unlensed shape, but lensing will produce a spatially coherent distortion of these shapes.

Measures of ellipticity
In most fields of astronomy, the ellipticity is defined as $$1-q~$$, where $$q=\frac{b}{a}$$ is the axis ratio of the ellipse. In weak gravitational lensing, two different definitions are commonly used, and both are complex quantities which specify both the axis ratio and the position angle $$\phi~$$:



 * $$\chi = \frac{1-q^2}{1+q^2}e^{2i\phi} = \frac{a^2-b^2}{a^2+b^2}e^{2i\phi} $$


 * $$\epsilon = \frac{1-q}{1+q}e^{2i\phi} = \frac{a-b}{a+b}e^{2i\phi} $$

Like the traditional ellipticity, the magnitudes of both of these quantities range from 0 (circular) to 1 (a line segment). The position angle is encoded in the complex phase, but because of the factor of 2 in the trigonometric arguments, ellipticity is invariant under a rotation of 180 degrees. This is to be expected; an ellipse is unchanged by a 180° rotation. Taken as imaginary and real parts, the real part of the complex ellipticity describes the elongation along the coordinate axes, while the imaginary part describes the elongation at 45° from the axes.
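The two complex ellipticities are easy to compute with complex arithmetic. The sketch below, with hypothetical function names, evaluates both definitions from an axis ratio and position angle and illustrates the invariance under a 180° rotation.

```python
import cmath
import math

def chi(q, phi):
    """Complex ellipticity chi for axis ratio q = b/a and position angle phi (radians)."""
    return (1.0 - q * q) / (1.0 + q * q) * cmath.exp(2j * phi)

def epsilon(q, phi):
    """Complex ellipticity epsilon for axis ratio q = b/a and position angle phi (radians)."""
    return (1.0 - q) / (1.0 + q) * cmath.exp(2j * phi)

aligned = chi(0.5, 0.0)                       # elongated along the x-axis
diagonal = chi(0.5, math.radians(45.0))       # elongated at 45 degrees
rotated = chi(0.5, 0.3 + math.pi)             # same ellipse as phi = 0.3
```

For an axis ratio of 0.5 the magnitudes are 0.6 for $$\chi$$ and 1/3 for $$\epsilon$$. An ellipse aligned with the x-axis has a purely real ellipticity, and one at 45° has a purely imaginary ellipticity.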

The ellipticity is often written as a two-component vector instead of a complex number, though it is not a true vector with regard to transforms:



 * $$\chi = \{\left|\chi\right|\cos 2\phi, \left|\chi\right|\sin 2\phi\} $$


 * $$\epsilon = \{\left|\epsilon\right|\cos 2\phi, \left|\epsilon\right| \sin 2\phi\} $$

Real astronomical background sources are not perfect ellipses. Their ellipticities can be measured by finding a best-fit elliptical model to the data, or by measuring the second moments of the image about some centroid $$(\bar{x},\bar{y})$$



 * $$q_{xx} = \frac{\sum (x-\bar{x})^2 I(x,y)}{\sum I(x,y)} $$


 * $$q_{yy} = \frac{\sum (y-\bar{y})^2 I(x,y)}{\sum I(x,y)} $$


 * $$q_{xy} = \frac{\sum (x-\bar{x})(y-\bar{y}) I(x,y)}{\sum I(x,y)} $$

The complex ellipticities are then



 * $$\chi = \frac{q_{xx}-q_{yy} + 2 i q_{xy}}{q_{xx}+q_{yy}} $$


 * $$\epsilon = \frac{q_{xx}-q_{yy} + 2 i q_{xy}}{q_{xx}+q_{yy} + 2\sqrt{q_{xx}q_{yy}-q_{xy}^2}} $$

This can be used to relate the second moments to traditional ellipse parameters:



 * $$q_{xx} = a^2 \cos^2 \theta + b^2 \sin^2 \theta\, $$


 * $$q_{yy} = a^2 \sin^2 \theta + b^2 \cos^2 \theta\, $$


 * $$q_{xy} = (a^2-b^2)\sin \theta \cos \theta\, $$

and in reverse:



 * $$a^2 = \frac{q_{xx}+q_{yy} + \sqrt{(q_{xx}-q_{yy})^2 + 4q_{xy}^2}}{2} $$


 * $$b^2 = \frac{q_{xx}+q_{yy} - \sqrt{(q_{xx}-q_{yy})^2 + 4q_{xy}^2}}{2} $$


 * $$\tan 2\theta = \frac{2q_{xy}}{q_{xx}-q_{yy}} $$
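The relations between second moments, ellipse parameters and complex ellipticities can be verified with a round trip. The sketch below starts from arbitrary example values of $$a$$, $$b$$ and $$\theta$$, computes the moments, forms $$\chi$$ and $$\epsilon$$, and then recovers the ellipse parameters. The function names are hypothetical, and `atan2` is used so that the recovered angle falls in the correct quadrant.

```python
import math

def moments_from_ellipse(a, b, theta):
    """Second moments of an ellipse with semi-axes a >= b at position angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    qxx = a * a * c * c + b * b * s * s
    qyy = a * a * s * s + b * b * c * c
    qxy = (a * a - b * b) * s * c
    return qxx, qyy, qxy

def ellipticities(qxx, qyy, qxy):
    """Complex ellipticities chi and epsilon from second moments."""
    num = complex(qxx - qyy, 2.0 * qxy)
    chi = num / (qxx + qyy)
    eps = num / (qxx + qyy + 2.0 * math.sqrt(qxx * qyy - qxy * qxy))
    return chi, eps

def ellipse_from_moments(qxx, qyy, qxy):
    """Recover (a, b, theta) from second moments."""
    root = math.sqrt((qxx - qyy)**2 + 4.0 * qxy * qxy)
    a = math.sqrt((qxx + qyy + root) / 2.0)
    b = math.sqrt((qxx + qyy - root) / 2.0)
    theta = 0.5 * math.atan2(2.0 * qxy, qxx - qyy)
    return a, b, theta

q = moments_from_ellipse(2.0, 1.0, 0.4)
chi, eps = ellipticities(*q)
a_rec, b_rec, theta_rec = ellipse_from_moments(*q)
```

For $$a=2$$, $$b=1$$ the magnitudes come out as $$|\chi| = 3/5$$ and $$|\epsilon| = 1/3$$, matching the definitions in terms of the axis ratio, and the complex phase of both is $$2\theta$$.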

The unweighted second moments above are problematic in the presence of noise, neighboring objects, or extended galaxy profiles, so it is typical to use apodized moments instead:



 * $$q_{xx} = \frac{\sum (x-\bar{x})^2 w(x-\bar{x},y-\bar{y}) I(x,y)}{\sum w(x-\bar{x},y-\bar{y}) I(x,y)} $$


 * $$q_{yy} = \frac{\sum (y-\bar{y})^2 w(x-\bar{x},y-\bar{y}) I(x,y)}{\sum w(x-\bar{x},y-\bar{y}) I(x,y)} $$


 * $$q_{xy} = \frac{\sum (x-\bar{x})(y-\bar{y}) w(x-\bar{x},y-\bar{y}) I(x,y)}{\sum w(x-\bar{x},y-\bar{y}) I(x,y)} $$

Here $$w(x,y)~$$ is a weight function that typically goes to zero or quickly approaches zero at some finite radius.
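A minimal sketch of apodized moments on a pixel grid is given below. The circular Gaussian weight, its width `sigma_w`, the synthetic elliptical Gaussian image and the assumption that the centroid is already known are all illustrative choices; real pipelines typically iterate on the centroid and may adapt the weight to the object.

```python
import math

def weighted_moments(image, xbar, ybar, sigma_w):
    """Apodized second moments of image[y][x] using a circular Gaussian weight.

    sigma_w is the weight width in pixels (an assumed, illustrative form of w).
    """
    norm = sxx = syy = sxy = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            dx, dy = x - xbar, y - ybar
            w = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma_w**2))
            norm += w * value
            sxx += dx * dx * w * value
            syy += dy * dy * w * value
            sxy += dx * dy * w * value
    return sxx / norm, syy / norm, sxy / norm

# Synthetic elliptical Gaussian "galaxy", elongated along x, centred on the grid.
N, CENTRE = 41, 20.0
image = [[math.exp(-((x - CENTRE)**2 / (2 * 4.0**2) + (y - CENTRE)**2 / (2 * 2.0**2)))
          for x in range(N)] for y in range(N)]
qxx, qyy, qxy = weighted_moments(image, CENTRE, CENTRE, sigma_w=6.0)
```

Because the weight multiplies the image, the measured moments are smaller than the unweighted ones and the resulting ellipticity is biased toward the shape of the weight function, which is one reason a correction step is needed.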

Image moments cannot generally be used to measure the ellipticity of galaxies without correcting for observational effects, particularly the point spread function.

Shear and reduced shear
Recall that the lensing Jacobian can be decomposed into shear $$\gamma~$$ and convergence $$\kappa~$$. Acting on a circular background source with radius $$R~$$, lensing generates an ellipse with major and minor axes


 * $$a = \frac{R}{1-\kappa-\gamma}$$


 * $$b = \frac{R}{1-\kappa+\gamma}$$

as long as the shear and convergence do not change appreciably over the size of the source (otherwise, the lensed image is not an ellipse). Galaxies are not intrinsically circular, however, so it is necessary to quantify the effect of lensing on a non-zero ellipticity.

We can define the complex shear in analogy to the complex ellipticities defined above



 * $$\gamma = \left|\gamma\right| e^{2i\phi} $$

as well as the reduced shear



 * $$g \equiv \frac{\gamma}{1-\kappa} $$
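The reduced shear has a direct geometric meaning for a circular source: the ellipticity $$(a-b)/(a+b)$$ of the lensed image equals $$|g|$$. The sketch below checks this for arbitrary example values of $$\kappa$$ and $$\gamma$$, with hypothetical function names.

```python
def lensed_axes(radius, kappa, gamma):
    """Major and minor axes of the image of a circular source of the given radius."""
    a = radius / (1.0 - kappa - gamma)
    b = radius / (1.0 - kappa + gamma)
    return a, b

def reduced_shear(kappa, gamma):
    """Magnitude of the reduced shear, gamma / (1 - kappa)."""
    return gamma / (1.0 - kappa)

kappa, gamma = 0.1, 0.05          # arbitrary example values
a, b = lensed_axes(1.0, kappa, gamma)
image_epsilon = (a - b) / (a + b)
g = reduced_shear(kappa, gamma)
```

The source radius cancels in the ratio, so the image shape depends on $$\kappa$$ and $$\gamma$$ only through $$g$$.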

The lensing Jacobian can now be written as



 * $$A=\left[\begin{array}{ c c } 1 - \kappa - \mathrm{Re}[\gamma] & -\mathrm{Im}[\gamma] \\ -\mathrm{Im}[\gamma] & 1 -\kappa + \mathrm{Re}[\gamma]\end{array}\right] =(1-\kappa)\left[\begin{array}{ c c } 1-\mathrm{Re}[g] & -\mathrm{Im}[g] \\ -\mathrm{Im}[g] & 1+ \mathrm{Re}[g]\end{array}\right] $$

For a reduced shear $$g~$$ and unlensed complex ellipticities $$\chi_s~$$ and $$\epsilon_s~$$, the lensed ellipticities are



 * $$\chi = \frac{\chi_s+2g+g^2\chi_s^*}{1+|g|^2 + 2\mathrm{Re}(g\chi_s^*)} $$


 * $$\epsilon = \frac{\epsilon_s+g}{1+g^*\epsilon_s} $$

In the weak lensing limit, $$\gamma \ll 1$$ and $$\kappa \ll 1$$, so



 * $$\chi \approx \chi_s+2g \approx \chi_s+2\gamma $$


 * $$\epsilon \approx \epsilon_s+g \approx \epsilon_s+\gamma $$

If we can assume that the sources are randomly oriented, their complex ellipticities average to zero, so
 * $$ \langle \chi \rangle = 2\langle \gamma \rangle $$ and $$\langle \epsilon \rangle = \langle \gamma \rangle$$.

This is the principal equation of weak lensing: the average ellipticity of background galaxies is a direct measure of the shear induced by foreground mass.
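The averaging argument can be illustrated with a toy population. The sketch below takes sources of a single intrinsic ellipticity magnitude with position angles spread evenly over $$[0,\pi)$$, applies the full transformation for $$\epsilon$$, and averages. The reduced shear value, the intrinsic magnitude and the number of sources are arbitrary assumptions; a real sample has a distribution of intrinsic shapes and finite shape noise.

```python
import cmath
import math

def lens_epsilon(eps_s, g):
    """Lensed epsilon for intrinsic ellipticity eps_s and reduced shear g (|g| < 1)."""
    return (eps_s + g) / (1.0 + g.conjugate() * eps_s)

def lens_chi(chi_s, g):
    """Lensed chi for intrinsic ellipticity chi_s and reduced shear g."""
    num = chi_s + 2.0 * g + g * g * chi_s.conjugate()
    den = 1.0 + abs(g)**2 + 2.0 * (g * chi_s.conjugate()).real
    return num / den

g = complex(0.05, 0.02)
n = 360
# Position angles phi_k = pi k / n, so the complex phase 2 phi_k covers the full circle.
sources = [0.3 * cmath.exp(2j * math.pi * k / n) for k in range(n)]
mean_eps = sum(lens_epsilon(e, g) for e in sources) / n
```

In this idealised setup the mean of $$\epsilon$$ recovers $$g$$ itself, not just its weak-lensing approximation, whereas the mean of $$\chi$$ equals $$2g$$ only to first order.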

Magnification
While gravitational lensing preserves surface brightness, as dictated by Liouville's theorem, lensing does change the apparent solid angle of a source. The amount of magnification is given by the ratio of the image area to the source area. For a circularly symmetric lens, the magnification factor μ is given by



 * $$\mu = \frac{\theta}{\beta} \frac{d\theta}{d\beta} $$

In terms of convergence and shear



 * $$\mu = \frac{1}{\det A} = \frac{1}{(1-\kappa)^2-\gamma^2} $$

For this reason, the Jacobian $$A~$$ is also known as the "inverse magnification matrix".

The reduced shear is invariant under scaling of the Jacobian $$A~$$ by a scalar $$\lambda~$$, which is equivalent to the transformations $$1-\kappa^{\prime} = \lambda(1-\kappa)$$ and $$\gamma^{\prime} = \lambda \gamma$$.

Thus, $$\kappa$$ can only be determined up to a transformation $$\kappa \rightarrow \lambda \kappa+(1-\lambda)$$, which is known as the "mass sheet degeneracy." In principle, this degeneracy can be broken if an independent measurement of the magnification is available because the magnification is not invariant under the aforementioned degeneracy transformation. Specifically, $$\mu~$$ scales with $$\lambda~$$ as $$\mu \propto \lambda^{-2}$$.
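The degeneracy and its effect on the magnification can be checked numerically. The sketch below applies the transformation to arbitrary example values of $$\kappa$$, $$\gamma$$ and $$\lambda$$ and compares the reduced shear and magnification before and after; the function names are hypothetical.

```python
def magnification(kappa, gamma):
    """Magnification mu = 1 / [(1 - kappa)^2 - gamma^2]."""
    return 1.0 / ((1.0 - kappa)**2 - gamma**2)

def mass_sheet_transform(kappa, gamma, lam):
    """Apply kappa -> lam * kappa + (1 - lam) and gamma -> lam * gamma."""
    return lam * kappa + (1.0 - lam), lam * gamma

kappa, gamma, lam = 0.2, 0.1, 0.8     # arbitrary example values
kappa_p, gamma_p = mass_sheet_transform(kappa, gamma, lam)
g = gamma / (1.0 - kappa)
g_p = gamma_p / (1.0 - kappa_p)
mu_ratio = magnification(kappa_p, gamma_p) / magnification(kappa, gamma)
```

The reduced shear is unchanged while the magnification changes by a factor $$\lambda^{-2}$$, which is why a magnification measurement can distinguish between mass models that shape measurements alone cannot.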