Light field microscopy

Light field microscopy (LFM) is a scanning-free 3-dimensional (3D) microscopic imaging method based on the theory of light field. This technique allows sub-second (~10 Hz) large volumetric imaging ([~0.1 to 1 mm]³) with ~1 μm spatial resolution under conditions of weak scattering and semi-transparency, a combination that is difficult to achieve with other methods. Just as in traditional light field rendering, LFM imaging involves two steps: light field capture and processing. In most setups, a microlens array is used to capture the light field. Processing can be based on either of two representations of light propagation: the ray optics picture or the wave optics picture. The Stanford University Computer Graphics Laboratory published its first prototype LFM in 2006 and has been working at the cutting edge since then.

Light field generation
A light field is the collection of all rays flowing through some free space, where each ray can be parameterized with four variables. In many cases, two 2D coordinates, denoted $$(s,t)$$ and $$(u,v)$$, on two parallel planes that the rays intersect are used for parameterization. Accordingly, the intensity of the 4D light field can be described as a scalar function $L_f(s,t,u,v)$, where $$f$$ is the distance between the two planes.

LFM can be built upon the traditional setup of a wide-field fluorescence microscope and a standard CCD camera or sCMOS. A light field is generated by placing a microlens array at the intermediate image plane of the objective (or the rear focal plane of an optional relay lens) and is further captured by placing the camera sensor at the rear focal plane of the microlenses. As a result, the coordinates of the microlenses $$(s,t)$$ conjugate with those on the object plane (if additional relay lenses are added, then on the front focal plane of the objective) $$(s',t')$$; the coordinates of the pixels behind each microlens $$(u,v)$$ conjugate with those on the objective plane $$(u',v')$$. For uniformity and convenience, we shall call the plane $$(s',t')$$ the original focus plane in this article. Correspondingly, $$f$$ is the focal length of the microlenses (i.e., the distance between microlens array plane and the sensor plane).

In addition, the apertures and focal lengths of the lenses and the dimensions of the sensor and microlens array should all be chosen so that the subimages behind adjacent microlenses neither overlap nor leave empty areas between them.

Realization from the ray optics picture
This section mainly introduces the work of Levoy et al., 2006.

Perspective views from varied angles
Owing to the conjugate relationships mentioned above, any given pixel $$(u_j,v_j)$$ behind a given microlens $$(s_i,t_i)$$ corresponds to the ray passing through the point $$(s_i',t_i')$$ in the direction $$(u_j',v_j')$$. Therefore, extracting the pixel $$(u_j,v_j)$$ from all subimages and stitching the extracted pixels together yields a perspective view from that angle: $L_f(:,:,u_j,v_j)$. In this scenario, spatial resolution is determined by the number of microlenses; angular resolution is determined by the number of pixels behind each microlens.
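The extraction of a perspective view can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the raw sensor image is already rectified so that every subimage is an axis-aligned block of n_u × n_v pixels; the function name perspective_view and the toy 6 × 6 image are hypothetical and not part of any published LFM code.

```python
import numpy as np

def perspective_view(raw_image, n_u, n_v, u_j, v_j):
    """Return L_f(:, :, u_j, v_j) from a rectified lenslet image.

    Assumes (for illustration) that each microlens covers an
    axis-aligned block of n_u x n_v sensor pixels.
    """
    height, width = raw_image.shape
    n_s, n_t = height // n_u, width // n_v      # number of microlenses per axis
    # Split rows into (s, u) and columns into (t, v), then reorder to L[s, t, u, v].
    lf = raw_image[:n_s * n_u, :n_t * n_v].reshape(n_s, n_u, n_t, n_v)
    lf = lf.transpose(0, 2, 1, 3)
    # One pixel per subimage, stitched together.
    return lf[:, :, u_j, v_j]

# Toy example: 2 x 2 microlenses, 3 x 3 pixels behind each.
raw = np.arange(36, dtype=float).reshape(6, 6)
center_view = perspective_view(raw, n_u=3, n_v=3, u_j=1, v_j=1)
```

The result has one sample per microlens, which is why the number of microlenses sets the spatial resolution of each view.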

Step 1: Digital refocusing
Synthetic focusing uses the captured light field to compute a photograph focused at any arbitrary section. By simply summing all the pixels in each subimage behind the microlens (equivalent to collecting all radiation coming from different angles that falls on the same position), the image is focused exactly on the plane that conjugates with the microlens array plane:

$$E_f(s,t)={1 \over f^2}\iint L_f(s,t,u,v)\cos^4\phi~dudv$$,

where $$\phi$$ is the angle between the ray and the normal of the sensor plane, and $\phi=\arctan(\sqrt{u^2+v^2}/f)$ if the origin of the coordinate system of each subimage is located on the principal optic axis of the corresponding microlens. Now, a new function can be defined to absorb the effective projection factor $$\cos^4\phi$$ into the light field intensity $$L_{f}$$ and obtain the actual radiance collected by each pixel: $$\bar{L}_f=L_f\cos^4\phi$$.

In order to focus on some plane other than the front focal plane of the objective, say, the plane whose conjugate plane is $$f'=\alpha f$$ away from the sensor plane, the conjugate plane can be moved from $$f$$ to $$\alpha f$$ and its light field reparameterized back to the original one at $$f$$:

$$\bar{L}_{\alpha f}(s,t,u,v)=\bar{L}_f(u+(s-u)/\alpha,v+(t-v)/\alpha,u,v)$$.

Thereby, the refocused photograph can be computed with the following formula:

$$E_{\alpha f}(s,t)={1 \over \alpha^2 f^2}\iint \bar{L}_f(u(1-1/\alpha)+s/\alpha,v(1-1/\alpha)+t/\alpha,u,v)~dudv$$.

Consequently, a focal stack can be generated that recapitulates an instantaneous 3D image of the object space. Furthermore, tilted or even curved focal planes are also synthetically possible. In addition, any reconstructed 2D image focused at an arbitrary depth corresponds to a 2D slice of the 4D light field in the Fourier domain, where the algorithm complexity can be reduced from $$O(n^4)$$ to $$O(n^2\log n)$$.
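A discrete version of the refocusing formula for $$E_{\alpha f}(s,t)$$ can be sketched as follows. This is only an illustration under stated assumptions: the light field is stored as an array lf[s, t, u, v] of $$\bar{L}_f$$ samples on uniform grids, all four coordinates share the same units, resampling uses nearest neighbours, and samples falling outside the recorded aperture are treated as zero. The function name refocus and the random test data are hypothetical.

```python
import numpy as np

def refocus(lf, s, t, u, v, alpha, f=1.0):
    """Approximate E_{alpha f}(s, t) by a sum over (u, v).

    lf[s, t, u, v] holds samples of L-bar_f; s, t, u, v are uniform
    1D coordinate arrays (assumed to share the same units).
    """
    ds, dt = s[1] - s[0], t[1] - t[0]
    du, dv = u[1] - u[0], v[1] - v[0]
    image = np.zeros((len(s), len(t)))
    for j, u_j in enumerate(u):
        for k, v_k in enumerate(v):
            # Coordinates at which L-bar_f must be sampled for this (u, v).
            s_src = u_j * (1 - 1 / alpha) + s / alpha
            t_src = v_k * (1 - 1 / alpha) + t / alpha
            # Nearest-neighbour lookup; out-of-range samples contribute zero.
            i_s = np.rint((s_src - s[0]) / ds).astype(int)
            i_t = np.rint((t_src - t[0]) / dt).astype(int)
            ok_s = (i_s >= 0) & (i_s < len(s))
            ok_t = (i_t >= 0) & (i_t < len(t))
            view = lf[:, :, j, k]
            sampled = view[np.clip(i_s, 0, len(s) - 1)][:, np.clip(i_t, 0, len(t) - 1)]
            image += sampled * ok_s[:, None] * ok_t[None, :] * du * dv
    return image / (alpha ** 2 * f ** 2)

rng = np.random.default_rng(0)
s = t = np.linspace(-1.0, 1.0, 9)
u = v = np.linspace(-0.2, 0.2, 3)
lf = rng.random((9, 9, 3, 3))
native = refocus(lf, s, t, u, v, alpha=1.0)
focal_stack = np.stack([refocus(lf, s, t, u, v, a) for a in (0.8, 1.0, 1.25)])
```

For $$\alpha=1$$ the sketch reduces to summing the pixels behind each microlens, as in the expression for $$E_f(s,t)$$. A practical implementation would typically use sub-pixel interpolation rather than nearest neighbours.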

Step 2: Point spread function measurement
Due to diffraction and defocus, however, the focal stack $$FS$$ differs from the actual intensity distribution of voxels $$V$$, which is the quantity actually desired. Instead, $$FS$$ is a convolution of $$V$$ and a point spread function (PSF): $$V*PSF=FS.$$ Thus, the 3D shape of the PSF has to be measured in order to remove its effect and obtain the voxels' net intensity. This measurement can be done by placing a fluorescent bead at the center of the original focus plane and recording its light field, from which the PSF's 3D shape is ascertained by synthetically focusing at various depths. Given that the PSF is acquired with the same LFM setup and digital refocusing procedure as the focal stack, this measurement correctly reflects the angular range of rays captured by the objective (including any falloff in intensity); therefore, this synthetic PSF is actually free of noise and aberrations. The shape of the PSF can be considered identical everywhere within the desired field of view (FOV); hence, multiple measurements can be avoided.

Step 3: 3D deconvolution
In the Fourier domain, the actual intensity of voxels has a very simple relation with the focal stack and the PSF:

$$\mathcal{F}(V)=\frac{\mathcal{F}(FS)}{\mathcal{F}(PSF)}$$,

where $$\mathcal{F}$$ is the Fourier transform operator. However, it may not be possible to solve the equation above directly, because the aperture is of limited size, which makes the PSF bandlimited (i.e., its Fourier transform has zeros). Instead, an iterative algorithm called constrained iterative deconvolution in the spatial domain is much more practical here. The idea is based on constrained gradient descent: the estimate of $$V$$ is improved iteratively by calculating the difference between the actual focal stack $$FS$$ and the estimated focal stack $V*PSF$ and correcting $$V$$ with the current difference ($$V$$ is constrained to be non-negative):
 * 1) $$\bigtriangleup^{(k+1)}~\leftarrow~FS-V^{(k)}*PSF$$;
 * 2) $$V^{(k+1)}~\leftarrow~\max(V^{(k)}+\bigtriangleup^{(k+1)},0)$$.
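The two update steps above can be sketched with NumPy as follows. This is a toy illustration, not the authors' implementation: it assumes periodic (circular) convolution computed with FFTs, a synthetic Gaussian PSF centred on the array origin in place of a measured one, and a noise-free focal stack. All names are hypothetical.

```python
import numpy as np

def circular_convolve(volume, psf):
    """3D circular convolution via FFT (psf centred at index 0)."""
    return np.real(np.fft.ifftn(np.fft.fftn(volume) * np.fft.fftn(psf)))

def constrained_iterative_deconvolution(focal_stack, psf, n_iter=100):
    v = np.zeros_like(focal_stack, dtype=float)
    for _ in range(n_iter):
        delta = focal_stack - circular_convolve(v, psf)   # step 1
        v = np.maximum(v + delta, 0.0)                    # step 2 (non-negativity)
    return v

# Synthetic Gaussian PSF (an assumption standing in for a measured PSF).
shape = (8, 16, 16)
axes = [np.fft.fftfreq(n) * n for n in shape]   # signed index distances
z, y, x = np.meshgrid(*axes, indexing="ij")
psf = np.exp(-(x ** 2 + y ** 2 + z ** 2) / 2.0)
psf /= psf.sum()

# Two point emitters, blurred into a focal stack and then deconvolved.
v_true = np.zeros(shape)
v_true[2, 4, 5] = 1.0
v_true[5, 10, 12] = 0.5
focal_stack = circular_convolve(v_true, psf)
v_est = constrained_iterative_deconvolution(focal_stack, psf, n_iter=200)
```

With a unit step the update is a projected gradient step, so a PSF normalized to unit sum keeps the iteration stable in this toy setting. Measured data contain noise, and the number of iterations then acts as a regularization parameter.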

Fourier Slice Photography
The formula for $$E_{\alpha f}(s,t)$$ can be rewritten by adopting the concept of the Fourier Projection-Slice Theorem. Because the photography operator $$\mathcal{P}_\alpha[\bar{L_f}](s,t)=E_{\alpha f}(s,t)$$ can be viewed as a shear followed by a projection, the result should be proportional to a dilated 2D slice of the 4D Fourier transform of a light field. More precisely, a refocused image can be generated from the 4D Fourier spectrum of a light field by extracting a 2D slice, applying an inverse 2D transform, and scaling. Before the proof, we first introduce some operators:


 * 1) Integral Projection Operator:  $$\mathcal{I}^N_M[f](x_1,...,x_M)=\int{f(x_1,...,x_N)dx_{M+1}...dx_{N}}$$
 * 2) Slicing operator: $$\mathcal{S}^N_M[f](x_1,...,x_M)=f(x_1,...,x_M,0,...,0)$$
 * 3) Photography Change of Basis:  Let $$\mathcal{B}_\alpha$$ denote an operator for a change of basis of a 4-dimensional function so that $$\mathcal{B}[f](\mathrm{x})=f(\mathcal{B}^{-1}\mathrm{x})$$, with $$\mathcal{B}_\alpha=\begin{bmatrix} \alpha & 0 & 1-\alpha& 0 \\ 0 & \alpha & 0 & 1-\alpha\\ 0&0&1&0\\0&0&0&1\end{bmatrix}$$.
 * 4) Fourier Transform Operator:  Let $$\mathcal{F}^N$$ denote the N-dimensional Fourier transform operator.

By these definitions, we can rewrite $$\mathcal{P}_\alpha[\bar{L}_f](s,t)={1 \over \alpha^2 f^2}\iint \bar{L}_f(u(1-1/\alpha)+s/\alpha,v(1-1/\alpha)+t/\alpha,u,v)~dudv\equiv\frac{1}{\alpha^2f^2}\mathcal{I}^4_2\circ\mathcal{B}_\alpha[\bar{L}_f]$$.

According to the generalized Fourier-slice theorem, we have

$$\mathcal{F}^M\circ\mathcal{I}^N_M\circ\mathcal{B}\equiv\mathcal{S}^N_M\circ\frac{\mathcal{B}^{-T}}{|\mathcal{B}^{-T}|}\circ\mathcal{F}^N$$,

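and hence, because $$|\mathcal{B}_\alpha^{-T}|=1/\alpha^2$$ cancels the factor $$1/\alpha^2$$ in front of the integral, the photography operator has the form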

$$\mathcal{P}_\alpha[\bar{L_f}](s,t)\equiv\frac{1}{f^2}\mathcal{F}^{-2}\circ\mathcal{S}^4_2\circ\mathcal{B}_\alpha^{-T}\circ\mathcal{F}^4$$.

According to the formula, we know a photograph is the inverse 2D Fourier transform of a dilated 2D slice in the 4D Fourier transform of the light field.
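The projection-slice relation behind this result can be checked numerically in a reduced setting. The sketch below uses a 2D function in place of the 4D light field ($$N=2$$, $$M=1$$) and the identity in place of $$\mathcal{B}$$; these simplifications and the variable names are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
func = rng.random((16, 16))            # samples of f(x1, x2)

# Left-hand side: integrate out x2, then take a 1D Fourier transform.
projection = func.sum(axis=1)          # I^2_1[f](x1)
lhs = np.fft.fft(projection)           # F^1 applied to the projection

# Right-hand side: take the 2D Fourier transform, then slice at zero
# frequency along the second axis.
rhs = np.fft.fft2(func)[:, 0]          # S^2_1 applied to F^2[f]

slice_theorem_holds = np.allclose(lhs, rhs)
```

In the 4D case the same identity lets a refocused photograph be obtained from a 2D slice of a precomputed 4D spectrum, which is where the reduction in complexity comes from.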

Discrete Fourier Slice Photography
If only samples of the light field are available, then instead of the Fourier slice theorem for continuous signals described above, we adopt the discrete Fourier slice theorem, which is a generalization of the discrete Radon transform, to compute the refocused image.

Assume that a light field $$\bar{L}_f$$ is periodic with periods $$T_s, T_t, T_u, T_v$$ and is defined on the hypercube $$H=[-T_s/2,T_s/2]\times[-T_t/2,T_t/2]\times[-T_u/2,T_u/2]\times[-T_v/2,T_v/2]$$. Also, assume that there are $$N_s\times N_t\times N_u\times N_v$$ known samples of the light field at $$(\hat{s}\Delta s, \hat{t}\Delta t, \hat{u}\Delta u, \hat{v}\Delta v)$$, where $$\hat{s},\hat{t},\hat{u},\hat{v}\in \mathbb{Z}$$ and $$\Delta s,\Delta t,\Delta u,\Delta v=\frac{T_s}{N_s},\frac{T_t}{N_t},\frac{T_u}{N_u},\frac{T_v}{N_v}$$, respectively. Then, we can define $$\bar{L}^d_f$$ using trigonometric interpolation with these sample points:

$$\bar{L}^d_f(\hat{s},\hat{t},\hat{u},\hat{v})=\sum_{\omega_{\hat{s}}}\sum_{\omega_{\hat{t}}}\sum_{\omega_{\hat{u}}}\sum_{\omega_{\hat{v}}}\mathcal{\bar{L}}^d_f(\omega_{\hat{s}},\omega_{\hat{t}},\omega_{\hat{u}},\omega_{\hat{v}})e^{2\pi i(\hat{s}\omega_{\hat{s}}+\hat{t}\omega_{\hat{t}}+\hat{u}\omega_{\hat{u}}+\hat{v}\omega_{\hat{v}})}$$,

where

$$\mathcal{\bar{L}}^d_f(\omega_{\hat{s}},\omega_{\hat{t}},\omega_{\hat{u}},\omega_{\hat{v}})=\sum_{\hat{s}} \sum_{\hat{t}} \sum_{\hat{u}} \sum_{\hat{v}} \bar{L}^d_f(\hat{s},\hat{t},\hat{u},\hat{v})e^{-2\pi i(\hat{s}\omega_{\hat{s}}+\hat{t}\omega_{\hat{t}}+\hat{u}\omega_{\hat{u}}+\hat{v}\omega_{\hat{v}})}$$.

Note that the constant factors are dropped for simplicity.

To compute its refocused photograph, we replace the infinite integral in the formula for $$\mathcal{P}_\alpha$$ with a summation whose bounds are $$[-T_u/2,T_u/2]$$ and $$[-T_v/2,T_v/2]$$. That is,

$$\mathcal{P}^d_\alpha[\bar{L}^d_f](\hat{s},\hat{t})={1 \over \alpha^2 f^2}\sum^{N_u/2}_{\hat{u}=-N_u/2}\sum^{N_v/2}_{\hat{v}=-N_v/2}\bar{L}^d_f(\hat{u}\Delta u(1-1/\alpha)+\hat{s}\Delta s/\alpha,\hat{v}\Delta v(1-1/\alpha)+\hat{t}\Delta t/\alpha,\hat{u}\Delta u,\hat{v}\Delta v)$$.

Then, as the discrete Fourier slice theorem indicates, we can represent the photograph using a Fourier slice:

$$\mathcal{P}^d_\alpha[\bar{L}^d_f](\hat{s},\hat{t})=\sum_{\omega_{\hat{s}}}\sum_{\omega_{\hat{t}}} e^{2\pi i(\hat{s}\omega_{\hat{s}}+\hat{t}\omega_{\hat{t}})}\mathcal{\bar{L}}^d_f(\alpha\omega_{\hat{s}},\alpha\omega_{\hat{t}},(1-\alpha)\omega_{\hat{s}},(1-\alpha)\omega_{\hat{t}})$$.

Realization from the wave optics picture
Although ray-optics-based plenoptic cameras have demonstrated favorable performance in the macroscopic world, diffraction places a limit on LFM reconstruction when staying with the ray-optics description. Hence, it may be much more convenient to switch to wave optics. (This section mainly introduces the work of Broxton et al., 2013.)

Discretization of the space
The FOV of interest is segmented into $$N_v$$ voxels, each with a label $i$. Thus, the whole FOV can be discretely represented by a vector $$\mathbf{g}$$ of dimension $$N_v\times1$$. Similarly, an $$N_p\times1$$ vector $$\mathbf{f}$$ represents the sensor plane, where each element $$\mathrm{f}_j$$ denotes one sensor pixel. Under the condition of incoherent propagation among different voxels, the light field transmission from the object space to the sensor can be linearly linked by an $$N_p\times N_v$$ measurement matrix, in which the information of the PSF is incorporated: $$\mathbf{f}=\mathrm{H}~\mathbf{g}.$$ In the ray-optics scenario, a focal stack is generated via synthetic focusing of rays, and then deconvolution with a synthesized PSF is applied to diminish the blurring caused by the wave nature of light. In the wave optics picture, on the other hand, the measurement matrix $$\mathrm{H}$$, which describes light field transmission, is directly calculated based on the propagation of waves. Unlike traditional optical microscopes, whose PSF shape is invariant (e.g., an Airy pattern) with respect to the position of the emitter, an emitter in each voxel generates a unique pattern on the sensor of an LFM. In other words, each column in $$\mathrm{H}$$ is distinct. In the following sections, the calculation of the whole measurement matrix is discussed in detail.

Optical impulse response
The optical impulse response $h(\mathbf{x},\mathbf{p})$ is the complex amplitude of the electric field at a 2D position $$\mathbf{x}\in\mathbb{R^2}$$ on the sensor plane when an isotropic point source of unit amplitude is placed at some 3D position $$\mathbf{p}\in\mathbb{R^3}$$ in the FOV. The electric field propagates in three steps: traveling from the point source to the native image plane (i.e., the microlens array plane), passing through the microlens array, and propagating onto the sensor plane.

Step 1: Propagation across the objective
For an objective with a circular aperture, the wavefront at the native image plane $$\mathbf{x}=(x_1,x_2)$$ initiated from an emitter at $\mathbf{p}=(p_1,p_2,p_3)$ can be computed using the scalar Debye theory:

$$U_i(\mathbf{x},\mathbf{p})=\frac{\mathrm{M}}{f_{obj}^2\lambda^2}\exp\biggl(-\frac{iu}{4\sin^2(\alpha/2)}\biggr)\int_{0}^{\alpha}d\theta~P(\theta)\exp\biggl(-\frac{iu\sin^2(\theta/2)}{2\sin^2(\alpha/2)}\biggr)J_0\biggl(\frac{\sin\theta}{\sin\alpha}\nu\biggr)\sin\theta$$,

where $$f_{obj}$$ is the focal length of the objective; $$\mathrm{M}$$ is its magnification. $$\lambda$$ is the wavelength. $$\alpha=\arcsin(\mathrm{NA}/n)$$ is the half-angle of the numerical aperture ($$n$$ is the index of refraction of the sample). $$P(\theta)$$ is the apodization function of the microscope ($$P(\theta)=\sqrt{\cos\theta}$$ for Abbe-sine corrected objectives). $$J_0(\cdot)$$ is the zeroth order Bessel function of the first kind. $$\nu$$ and $$u$$ are the normalized radial and axial optical coordinates, respectively:

$$\nu\thickapprox k\sqrt{(x_1-p_1)^2+(x_2-p_2)^2}\sin\alpha$$

$$u\thickapprox 4 k p_3\sin^2(\alpha/2)$$,

where $$k=2\pi n/\lambda$$ is the wave number.
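The Debye integral above has no closed form in general, so it is usually evaluated numerically. The sketch below is one possible quadrature using NumPy only; it is an illustration under stated assumptions rather than a reference implementation. It uses the midpoint rule, evaluates $$J_0$$ from its integral representation $$J_0(x)=\frac{1}{\pi}\int_0^\pi\cos(x\sin\tau)\,d\tau$$, takes the coordinates exactly as written in the formulas above, and uses illustrative parameter values (lengths in micrometres). All function and variable names are hypothetical.

```python
import numpy as np

def bessel_j0(x, n_tau=256):
    """J0(x) from (1/pi) * integral_0^pi cos(x sin(tau)) d tau (midpoint rule)."""
    tau = (np.arange(n_tau) + 0.5) * np.pi / n_tau
    return np.cos(np.asarray(x)[..., None] * np.sin(tau)).mean(axis=-1)

def debye_wavefront(x1, x2, p, na, n_medium, wavelength, magnification, f_obj,
                    n_theta=512):
    """Numerically evaluate U_i(x, p) with P(theta) = sqrt(cos(theta))."""
    alpha = np.arcsin(na / n_medium)
    k = 2 * np.pi * n_medium / wavelength
    nu = k * np.hypot(x1 - p[0], x2 - p[1]) * np.sin(alpha)
    u = 4 * k * p[2] * np.sin(alpha / 2) ** 2
    # Midpoint quadrature nodes on [0, alpha].
    theta = (np.arange(n_theta) + 0.5) * alpha / n_theta
    d_theta = alpha / n_theta
    apodization = np.sqrt(np.cos(theta))
    defocus = np.exp(-1j * u * np.sin(theta / 2) ** 2 / (2 * np.sin(alpha / 2) ** 2))
    bessel = bessel_j0(np.asarray(nu)[..., None] * np.sin(theta) / np.sin(alpha))
    integral = (apodization * defocus * bessel * np.sin(theta)).sum(axis=-1) * d_theta
    prefactor = (magnification / (f_obj ** 2 * wavelength ** 2)
                 * np.exp(-1j * u / (4 * np.sin(alpha / 2) ** 2)))
    return prefactor * integral

# In-focus emitter on the optical axis, evaluated on-axis and 0.5 um off-axis.
x1 = np.array([0.0, 0.5])
x2 = np.zeros(2)
amplitude = np.abs(debye_wavefront(x1, x2, p=(0.0, 0.0, 0.0), na=0.8,
                                   n_medium=1.33, wavelength=0.52,
                                   magnification=40.0, f_obj=5000.0))
```

For an in-focus emitter the magnitude is largest at $$\nu=0$$ and falls off with lateral distance.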

Step 2: Focusing through the microlens array
Each microlens can be regarded as a phase mask:

$$\phi(\mathbf x)=\exp\biggl(\frac{-ik}{2f}\|\Delta\mathbf x\|_2^2\biggr)$$,

where $f$ is the focal length of the microlenses and $$\Delta\mathbf x=\mathbf x-\mathbf x_{\mu lens}$$ is the vector pointing from the center of the microlens to a point $$\mathbf x$$ on the microlens. It is worth noting that $\phi(\mathbf x)$ is non-zero only when $$\mathbf x$$ is located within the effective transmission area of a microlens.

Thereby, the transmission function of the overall microlens array can be represented as $$\phi(\mathbf x)$$ convolved with a 2D comb function:

$$\Phi(\mathbf x)=\phi (\mathbf x)*\mathrm{comb}(\mathbf x/d)$$,

where $$d$$ is the pitch (i.e., the lateral size) of the microlenses.
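On a sampled grid, $$\Phi(\mathbf x)$$ can be built without an explicit convolution. The sketch below assumes square microlens apertures with a 100% fill factor, so that convolving $$\phi$$ with the comb amounts to measuring $$\Delta\mathbf x$$ from the nearest lenslet center. That assumption, the function name, and the parameter values (lengths in micrometres) are all illustrative.

```python
import numpy as np

def microlens_array_transmission(n_pix, dx, pitch, f_ml, k):
    """Sampled transmission Phi(x) of a square, fully filled microlens array."""
    coords = (np.arange(n_pix) - n_pix / 2 + 0.5) * dx
    x1, x2 = np.meshgrid(coords, coords, indexing="ij")
    # Offset of every grid point from the centre of its own microlens
    # (lenslet centres sit at integer multiples of the pitch).
    d1 = (x1 + pitch / 2) % pitch - pitch / 2
    d2 = (x2 + pitch / 2) % pitch - pitch / 2
    return np.exp(-1j * k / (2 * f_ml) * (d1 ** 2 + d2 ** 2))

wavelength = 0.52
phi_array = microlens_array_transmission(n_pix=40, dx=1.0, pitch=10.0,
                                         f_ml=100.0, k=2 * np.pi / wavelength)
```

The mask is a pure phase factor that repeats with the pitch. Circular apertures or gaps between lenslets would require setting the transmission to zero outside each lens.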

Step 3: Near-field propagation to the sensor
The propagation of the wavefront over the distance $$f$$ from the native image plane to the sensor plane can be computed with a Fresnel diffraction integral:

$$ E(\mathbf{x})|_{z=f} = \frac{e^{ikf}}{i \lambda f} \iint E(\mathbf{x}')|_{z=0} \exp\biggl(\frac{ik}{2f}\|\mathbf{x}-\mathbf{x}'\|_2^2\biggr)d\mathbf{x}' $$,

where $ E(\mathbf{x}')|_{z=0}=U_i(\mathbf{x}',\mathbf{p})\Phi(\mathbf{x}') $ is the wavefront immediately after passing the native image plane.

Therefore, the whole optical impulse response can be expressed in terms of a convolution:

$$ h(\mathbf{x},\mathbf{p})=\biggl(U_i(\mathbf{x},\mathbf{p})\Phi(\mathbf{x})\biggr) * \biggl(\frac{e^{ikf}}{i \lambda f}e^{\frac{ik}{2f}\|\mathbf x\|_2^2}\biggr) $$.
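Because the last step is a convolution, it is commonly evaluated with FFTs. The sketch below multiplies the spectrum of the input field by the Fourier transform of the Fresnel kernel, $$e^{ikf}e^{-i\pi\lambda f(\nu_1^2+\nu_2^2)}$$. It assumes $$k=2\pi/\lambda$$ with $$\lambda$$ the wavelength in the medium between the microlenses and the sensor, a square sampling grid, and periodic boundary conditions. The function name and test field are hypothetical.

```python
import numpy as np

def fresnel_propagate(field, dx, wavelength, distance):
    """Propagate a sampled complex field by `distance` (transfer-function method)."""
    n1, n2 = field.shape
    nu1 = np.fft.fftfreq(n1, d=dx)                 # spatial frequencies
    nu2 = np.fft.fftfreq(n2, d=dx)
    nu1, nu2 = np.meshgrid(nu1, nu2, indexing="ij")
    k = 2 * np.pi / wavelength
    # Fourier transform of exp(ikf)/(i lambda f) * exp(ik|x|^2 / (2f)).
    transfer = (np.exp(1j * k * distance)
                * np.exp(-1j * np.pi * wavelength * distance * (nu1 ** 2 + nu2 ** 2)))
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

# Toy input: a Gaussian spot with a random phase (lengths in micrometres).
rng = np.random.default_rng(2)
coords = (np.arange(64) - 32) * 1.0
x1, x2 = np.meshgrid(coords, coords, indexing="ij")
field_in = np.exp(-(x1 ** 2 + x2 ** 2) / 50.0) * np.exp(1j * rng.random((64, 64)))
field_out = fresnel_propagate(field_in, dx=1.0, wavelength=0.52, distance=100.0)
```

In this picture, $$h(\mathbf{x},\mathbf{p})$$ would be obtained by passing the sampled product $$U_i\Phi$$ to such a routine with the distance set to the microlens focal length. The grid must be fine enough to sample the quadratic phase without aliasing.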

Computing the measurement matrix
Having acquired the optical impulse response, any element $$ h_{ij} $$ in the measurement matrix $$ \mathrm{H} $$ can be calculated as:

$$ h_{ij}=\int_{\alpha_j} \int_{\beta_i} w_i(\mathbf{p})|h(\mathbf{x},\mathbf{p})|^2 d\mathbf{p}d\mathbf{x} $$,

where $$ \alpha_j $$ is the area for pixel $$ j $$ and $$ \beta_i $$ is the volume for voxel $$ i $$. The weight filter $ w_i(\mathbf{p}) $ is added to match the fact that a PSF contributes more at the center of a voxel than at the edges. The linear superposition integral is based on the assumption that fluorophores in each infinitesimal volume $ d\mathbf{p} $ experience an incoherent, stochastic emission process, considering their rapid, random fluctuations.

The noisy nature of the measurements
Again, due to the limited bandwidth, the photon shot noise, and the huge matrix dimension, it is impractical to solve the inverse problem directly as $\mathbf{g}=\mathrm{H}^{-1}\mathbf{f}$. Instead, the relation between the discrete light field and the FOV is better described stochastically:

$$ \hat{\mathbf{f}}\sim\mathrm{Pois}(\mathrm{H}~\mathbf{g}+\mathbf{b}) $$,

where $$ \mathbf{b} $$ is the background fluorescence measured prior to imaging and $$ \mathrm{Pois}(\cdot) $$ denotes the Poisson distribution that models the noise. Therefore, $ \hat{\mathbf {f}} $ now becomes a random vector with Poisson-distributed values in units of photoelectrons ($e^-$).

Maximum likelihood estimation
Based on the idea of maximizing the likelihood of the measured light field $  \hat{\mathbf {f}} $  given a particular FOV $   \mathbf {g} $  and background $   \mathbf {b} $, the Richardson-Lucy iteration scheme provides an effective 3D deconvolution algorithm here:

$$\mathbf{g}^{(k+1)}=\mathrm{diag}(\mathrm{H}^T\mathbf{1})^{-1}\mathrm{diag}(\mathrm{H}^T\mathrm{diag}(\mathrm{H}\mathbf{g}^{(k)}+\mathbf{b})^{-1}\mathbf{f})\mathbf{g}^{(k)}$$,

where the operator $$\mathrm{diag}(\cdot)$$ returns a matrix with its vector argument on the diagonal and zeros elsewhere.
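Since every $$\mathrm{diag}(\cdot)$$ in the update acts on a vector, the iteration reduces to element-wise multiplications and divisions. The sketch below illustrates this with a small dense random matrix in place of a real measurement matrix, which in practice is very large and is applied implicitly. The matrix, the simulated data, and all names are assumptions for illustration only.

```python
import numpy as np

def richardson_lucy(H, f_meas, b, n_iter=100, g0=None):
    """Richardson-Lucy iteration for f ~ Pois(H g + b), written element-wise."""
    n_p, n_v = H.shape
    g = np.ones(n_v) if g0 is None else np.array(g0, dtype=float)
    normalization = H.T @ np.ones(n_p)            # H^T 1
    for _ in range(n_iter):
        predicted = H @ g + b                     # H g^(k) + b
        g = g * (H.T @ (f_meas / predicted)) / normalization
    return g

def poisson_log_likelihood(H, g, b, f_meas):
    """Log-likelihood up to a constant that does not depend on g."""
    predicted = H @ g + b
    return float(np.sum(f_meas * np.log(predicted) - predicted))

# Simulated toy problem: 40 sensor pixels, 10 voxels.
rng = np.random.default_rng(3)
H = rng.random((40, 10))
g_true = 50.0 * rng.random(10)
b = np.full(40, 2.0)
f_meas = rng.poisson(H @ g_true + b).astype(float)

g_est = richardson_lucy(H, f_meas, b, n_iter=200)
```

The multiplicative form keeps a non-negative initial estimate non-negative, and a strictly positive background $$\mathbf{b}$$ prevents division by zero.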

Light Field Microscopy for functional neural imaging
Starting with initial work at Stanford University applying light field microscopy to calcium imaging in larval zebrafish (Danio rerio), a number of articles have now applied light field microscopy to functional neural imaging, including measuring dynamic neuronal activity across the whole brain of C. elegans, whole-brain imaging in larval zebrafish, imaging calcium and voltage activity sensors across the brain of fruit flies (Drosophila) at up to 200 Hz, and fast imaging of 1 mm × 1 mm × 0.75 mm volumes in the hippocampus of mice navigating a virtual environment. This application area is developing rapidly at the intersection of computational optics and neuroscience.