Bayesian model of computational anatomy

Computational anatomy (CA) is a discipline within medical imaging focusing on the study of anatomical shape and form at the visible or gross anatomical scale of morphology. The field is broadly defined and includes foundations in anatomy, applied mathematics and pure mathematics, including medical imaging, neuroscience, physics, probability, and statistics. It focuses on the anatomical structures being imaged, rather than the medical imaging devices. The central focus of the sub-field of computational anatomy within medical imaging is mapping information across anatomical coordinate systems, most often dense information measured within a magnetic resonance image (MRI). The introduction of flows into CA, which are akin to the equations of motion used in fluid dynamics, exploits the notion that dense coordinates in image analysis follow the Lagrangian and Eulerian equations of motion. In models based on Lagrangian and Eulerian flows of diffeomorphisms, the constraint is associated to topological properties: open sets are preserved, coordinates do not cross (implying uniqueness and existence of the inverse mapping), and connected sets remain connected. The use of diffeomorphic methods grew quickly to dominate the field of mapping methods post Christensen's original paper, with fast and symmetric methods becoming available.

The main statistical model
The central statistical model of Computational Anatomy in the context of medical imaging has been the source-channel model of Shannon theory; the source is the deformable template of images $$ I \in \mathcal {I} $$, the channel outputs are the imaging sensors with observables $$ I^D \in {\mathcal I}^{\mathcal D} $$ (see Figure). The importance of the source-channel model is that the variations in the anatomical configurations are modelled separately from the sensor variations of the medical imagery. Bayes theory dictates that the model is characterized by the prior on the source, $$ \pi_{\mathcal{I}} (\cdot) $$ on $$I \in \mathcal{I} $$, and the conditional density on the observable


 * $$ p(\cdot \mid I) \text{ on } I^D \in {\mathcal I}^{\mathcal D} $$

conditioned on $$ I \in \mathcal{I} $$.

In deformable template theory, the images are linked to the templates, with the deformations a group which acts on the template; see group action in computational anatomy. For image action $$I(g) \doteq g \cdot I_\mathrm{temp}, g \in \mathcal{G}$$, the prior on the group $$\pi_{\mathcal{G}} (\cdot)$$ induces the prior on images $$\pi_{\mathcal{I}} (\cdot)$$; written as densities, the log-posterior takes the form



$$ \log p(I(g)\mid I^D) \simeq \log p(I^D \mid I(g))+ \log \pi_{\mathcal{G}}(g). $$

The random orbit model which follows specifies how to generate the group elements and therefore the random spray of objects which form the prior distribution.
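The Bayes decomposition above can be illustrated with a deliberately tiny toy: a minimal sketch in which the "group" is the set of circular shifts acting on a 1-D template, the sensor is additive Gaussian noise, and the MAP shift maximizes log-likelihood plus log-prior. All numbers and the shift group are assumptions of this illustration, not part of the computational anatomy machinery.

```python
import numpy as np

# Toy source-channel model: a discrete "group" of circular shifts g acting
# on a 1-D template, a Gaussian sensor, and a prior favouring small shifts.
# The log-posterior over g splits as log-likelihood + log-prior.
rng = np.random.default_rng(0)
template = np.array([0., 0., 1., 2., 1., 0., 0., 0.])
n = len(template)
shifts = np.arange(n)                                  # candidate group elements g
log_prior = -0.5 * np.minimum(shifts, n - shifts) ** 2  # prior pi_G(g)

true_shift = 2
sigma = 0.05
observed = np.roll(template, true_shift) + sigma * rng.standard_normal(n)

def log_likelihood(s):
    # Gaussian sensor model: -||I^D - g . I_temp||^2 / (2 sigma^2)
    return -np.sum((observed - np.roll(template, s)) ** 2) / (2 * sigma ** 2)

log_post = np.array([log_likelihood(s) for s in shifts]) + log_prior
g_map = int(shifts[np.argmax(log_post)])
print(g_map)  # recovers the true shift
```

With low sensor noise the likelihood dominates the prior and the MAP estimate recovers the generating group element.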

The random orbit model of computational anatomy


The random orbit model of Computational Anatomy first appeared in modelling the change in coordinates associated to the randomness of the group acting on the templates, which induces the randomness on the source of images in the anatomical orbit of shapes and forms and resulting observations through the medical imaging devices. Such a random orbit model, in which randomness on the group induces randomness on the images, was examined for object recognition, with the group element $$ g \in \mathcal{G}$$ an element of the special Euclidean group.

For the study of deformable shape in CA, the high-dimensional diffeomorphism groups used in computational anatomy are generated via smooth flows $$ \varphi_t, t \in [0,1] $$ which satisfy the Lagrangian and Eulerian specification of the flow fields via the ordinary differential equation:

 * $$ \frac{d}{dt} \varphi_t = v_t (\varphi_t),\ \varphi_0 = \operatorname{id}, $$
with $$ v \doteq (v_1,v_2,v_3) $$ the vector fields on $$ {\mathbb R}^3 $$ termed the Eulerian velocity of the particles at position $$\varphi$$ of the flow. The vector fields are functions in a function space, modelled as a smooth Hilbert space of vector fields with at least one continuous derivative. For $$v_t = \dot \varphi_t \circ \varphi_t^{-1}, t \in [0,1]$$, the inverse of the flow is given by

 * $$ \frac{d}{dt} \varphi_t^{-1} = - (D \varphi_t^{-1}) v_t ,\ \varphi_0^{-1} = \operatorname{id}, $$
and the $$3 \times 3$$ Jacobian matrix for flows in $$\mathbb{R}^3$$ is given as $$ D\varphi \doteq \left(\frac{\partial \varphi_i}{\partial x_j}\right). $$
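The Lagrangian flow equation and its inverse can be sketched numerically. The following is a minimal illustration, assuming a steady rotational velocity field on $$\mathbb{R}^2$$ and forward-Euler time stepping (both assumptions of this sketch, not part of the theory): particles are pushed forward along $$\dot \varphi_t = v(\varphi_t)$$, and invertibility is checked by flowing back along $$-v$$.

```python
import numpy as np

# Sketch: integrate the particle ODE d/dt phi_t(x) = v(phi_t(x)), phi_0 = id,
# for a small cloud of points, then recover the inverse by flowing with -v.
def v(p):
    # rigid rotation field v(x, y) = (-y, x); smooth, hence a diffeomorphic flow
    return np.stack([-p[:, 1], p[:, 0]], axis=1)

def flow(points, vel, T=1.0, steps=1000):
    p, dt = points.copy(), T / steps
    for _ in range(steps):
        p = p + dt * vel(p)          # forward-Euler step of the flow ODE
    return p

x0 = np.array([[1.0, 0.0], [0.0, 0.5], [0.3, 0.3]])
x1 = flow(x0, v)                          # phi_1(x0)
x0_back = flow(x1, lambda p: -v(p))       # approximate phi_1^{-1}(x1)
err = np.max(np.abs(x0_back - x0))
print(err)                                # small: the flow is invertible
```

The residual is the discretization error of the Euler scheme; coordinates never cross, consistent with the topological constraints described above.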

To ensure smooth flows of diffeomorphisms with inverse, the vector fields on $$ {\mathbb R}^3 $$ must be at least 1-time continuously differentiable in space. They are modelled as elements of the Hilbert space $$(V, \| \cdot \|_V )$$, using the Sobolev embedding theorems so that each element $$v_i \in H_0^3, i=1,2,3,$$ has 3-square-integrable derivatives; thus $$(V, \| \cdot \|_V )$$ embeds smoothly in the 1-time continuously differentiable functions. The diffeomorphism group consists of flows with vector fields absolutely integrable in the Sobolev norm:

 * $$ \operatorname{Diff}_V \doteq \left\{ \varphi = \varphi_1 : \dot \varphi_t = v_t ( \varphi_t ),\ \varphi_0 = \operatorname{id},\ \int_0^1 \| v_t \|_V \, dt < \infty \right\}, $$
where $$\| v_t \|_V^2 \doteq \int_X Av_t \cdot v_t \, dx $$ with $$A$$ a linear operator $$A: V \to V^*$$ defining the norm of the RKHS. The integral is calculated by integration by parts when $$Av$$ is a generalized function in the dual space $$ V^*$$.
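The norm $$\int_X Av \cdot v \, dx$$ can be computed concretely once an operator is chosen. A minimal sketch, assuming a periodic 1-D domain and the common choice $$A = (\operatorname{id} - \alpha \Delta)^2$$ (an assumption of this illustration), evaluates the norm two equivalent ways: through the Fourier symbol of $$A$$, and by applying $$A$$ on the grid and integrating $$Av \cdot v$$ directly.

```python
import numpy as np

# Sketch: ||v||_V^2 = \int A v . v dx with A = (id - alpha * Laplacian)^2
# on a periodic 1-D grid, computed via the Fourier symbol and directly.
n, L, alpha = 256, 2 * np.pi, 0.1
dx = L / n
x = np.linspace(0, L, n, endpoint=False)
v = np.sin(x) + 0.3 * np.cos(3 * x)

k = 2 * np.pi * np.fft.fftfreq(n, d=dx)          # angular wavenumbers
v_hat = np.fft.fft(v)
# Fourier symbol of A is (1 + alpha k^2)^2; Parseval gives \int Av . v dx
norm_fourier = np.sum((1 + alpha * k**2) ** 2 * np.abs(v_hat) ** 2) * dx / n

Av = np.real(np.fft.ifft((1 + alpha * k**2) ** 2 * v_hat))  # Av on the grid
norm_direct = np.sum(Av * v) * dx                            # \int Av . v dx
print(norm_fourier, norm_direct)                             # the two agree
```

For this test field the exact value is $$(1.21 + 0.09 \cdot 3.61)\pi = 1.5349\pi$$, which both computations reproduce; higher powers of the Laplacian in $$A$$ penalize rough fields more strongly, which is what forces the flows to be smooth.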

Riemannian exponential
In the random orbit model of computational anatomy, the entire flow is reduced to the initial condition which forms the coordinates encoding the diffeomorphism. From the initial condition $$ v_0 $$, geodesic positioning with respect to the Riemannian metric of computational anatomy solves for the flow of the Euler–Lagrange equation. Solving the geodesic from the initial condition $$ v_0 $$ is termed the Riemannian exponential, a mapping $$ \operatorname{Exp}_\mathrm{id}(\cdot): V \to \operatorname{Diff}_V $$ from the tangent space at the identity to the group.

The Riemannian exponential satisfies $$ \operatorname{Exp}_\mathrm{id} (v_0)= \varphi_1 $$ for initial condition $$\dot \varphi_0 = v_0$$ and vector field dynamics $$\dot \varphi_t = v_t \circ \varphi_t, t \in [0,1] $$:

 * for the classical equation, with the diffeomorphic shape momentum $$ Av_t $$ a classical function, $$ Av \in V $$, then
 * $$ \frac{d}{dt} Av_t + (Dv_t)^T Av_t +(DAv_t)v_t + ( \nabla \cdot v_t) Av_t =0 \ ; $$
 * for the generalized equation, with $$ Av \in V^* $$ paired against test functions $$ w \in V $$, then
 * $$ \int_X \frac{d}{dt} Av_t \cdot w \,dx + \int_X Av_t \cdot ((Dv_t)w-(Dw)v_t)\,dx =0 . $$

It is extended to the entire group, $$ \varphi= \operatorname{Exp}_\varphi(v_0\circ \varphi) \doteq \operatorname{Exp}_\mathrm{id} (v_0) \circ \varphi. $$ The accompanying figure depicts the random orbits around each exemplar, $$m_0 \in \mathcal{M}$$, generated by randomizing the flow: the initial tangent-space vector field at the identity $$v_0 \in V$$ is generated randomly, and then the random object $$n \doteq \operatorname{Exp}_\mathrm{id}(v_0) \cdot m_0 \in \mathcal{M}$$ is generated. Shown in the figure on the right, the cartoon orbit, is a random spray of the subcortical manifolds generated by randomizing the vector fields $$v_0$$ supported over the submanifolds.
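The Riemannian exponential can be sketched in the finite-dimensional landmark reduction, where geodesics become Hamiltonian dynamics on landmark positions and momenta with a reproducing kernel. The 1-D setting, the Gaussian kernel, and the specific landmarks and momenta below are all assumptions of this illustration; it is not the dense-image case discussed above.

```python
import numpy as np

# Sketch of Exp_id(v_0) by geodesic shooting in the landmark reduction:
# q' = K(q, q) p,  p' = -dH/dq,  with H(q, p) = p^T K(q, q) p / 2.
sigma = 1.0

def K(qa, qb):
    # Gaussian reproducing kernel on the real line
    return np.exp(-(qa[:, None] - qb[None, :]) ** 2 / (2 * sigma ** 2))

def exp_id(q0, p0, steps=2000):
    q, p = q0.copy(), p0.copy()
    dt = 1.0 / steps
    for _ in range(steps):
        G = K(q, q)
        dq = G @ p                                     # q' = K(q,q) p
        # Hamilton's equation p' = -dH/dq for the Gaussian kernel
        dp = np.sum(p[:, None] * p[None, :]
                    * (q[:, None] - q[None, :]) / sigma ** 2 * G, axis=1)
        q, p = q + dt * dq, p + dt * dp
    return q, p

q0 = np.array([-1.0, 0.0, 1.0])    # landmark positions (illustrative)
p0 = np.array([0.5, -0.2, 0.1])    # initial momenta, the "log-coordinates"
q1, p1 = exp_id(q0, p0)

# The kinetic energy H is conserved along the geodesic -- a sanity check
# that the flow is geodesic positioning rather than arbitrary motion.
H = lambda q, p: 0.5 * p @ K(q, q) @ p
print(H(q0, p0), H(q1, p1))
```

Conservation of the Hamiltonian reflects the reduction to the initial condition: the whole flow, and hence the deformed configuration, is encoded by $$(q_0, p_0)$$ alone.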

MAP estimation in the multiple-atlas orbit model
The random orbit model induces the prior on shapes and images $$I \in \mathcal{I}$$ conditioned on a particular atlas $$ I_a \in \mathcal{I}$$. For this the generative model generates the mean field $$ I $$ as a random change in coordinates of the template according to $$I \doteq \varphi \cdot I_a $$, where the diffeomorphic change in coordinates is generated randomly via the geodesic flows. The prior on random transformations $$\pi_\mathrm{Diff} (d\varphi) $$ on $$\operatorname{Diff}_V$$ is induced by the flow $$ \operatorname{Exp}_\mathrm{id}(v)$$, with $$ v \in V $$ constructed as a Gaussian random field prior $$ \pi_V(dv) $$. The density on the random observables at the output of the sensor $$I^D \in \mathcal{I}^D$$ is given by



$$ p(I^D\mid I_a) = \int_V p(I^D \mid \operatorname{Exp}_\mathrm{id}(v) \cdot I_a ) \, \pi_V (dv) . $$

Maximum a posteriori (MAP) estimation is central to modern statistical theory. Parameters of interest $$ \theta \in \Theta $$ take many forms, including (i) disease type, such as neurodegenerative or neurodevelopmental diseases, (ii) structure type, such as cortical or subcortical structures in problems associated to segmentation of images, and (iii) template reconstruction from populations. Given the observed image $$ I^D $$, MAP estimation maximizes the posterior:



$$ \hat \theta \doteq \arg \max_{\theta \in \Theta} \log p(\theta \mid I^D). $$

This requires computation of the conditional probabilities $$p(\theta\mid I^D) = \frac{p(I^D,\theta)}{p(I^D)}$$. The multiple-atlas orbit model randomizes over the denumerable set of atlases $$\{ I_a, a \in \mathcal{A} \}$$. The model on images in the orbit takes the form of a multi-modal mixture distribution


 * $$p(I^D, \theta) = \sum_{a \in \mathcal{A}} p(I^D,\theta\mid  I_a) \pi_{\mathcal A}(a) \ .$$

The conditional Gaussian model has been examined heavily for inexact matching in dense images and for landmark matching.

Dense image matching
Model $$ I^D(x), x \in X $$ as a conditionally Gaussian random field with mean field $$\varphi_1 \cdot I \doteq I(\varphi_1^{-1}), \varphi_1 \in \operatorname{Diff}_V$$. For uniform variance the endpoint error term plays the role of the log-conditional (only a function of the mean field), giving the endpoint term:
 * $$ -\log p(I^D \mid I(g)) \simeq \operatorname{E}(\varphi_1) \doteq \frac{1}{2 \sigma^2} \| I^D - I \circ \varphi_1^{-1} \|^2. $$

Landmark matching
Model $$Y = \{ y_1,y_2,\dots \}$$ as conditionally Gaussian with mean field $$\varphi_1 (x_i), i=1,2,\dots, \varphi_1 \in \operatorname{Diff}_V$$, with constant noise variance independent of the landmarks. The log-conditional (only a function of the mean field) can be viewed as the endpoint term:
 * $$ -\log p(I^D \mid I(g))\simeq \operatorname{E}(\varphi_1) \doteq \frac{1}{2 \sigma^2}\sum_i \| y_i -\varphi_1(x_i) \|^2. $$
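The landmark endpoint term is a direct sum of squared residuals between observed and mapped landmarks. A minimal sketch, in which $$\sigma$$, the landmark coordinates, and the stand-in identity map for $$\varphi_1$$ are all illustrative assumptions:

```python
import numpy as np

# Endpoint term E(phi_1) = (1 / 2 sigma^2) * sum_i ||y_i - phi_1(x_i)||^2
sigma = 0.1
x = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # template landmarks x_i
y = np.array([[0.1, 0.0], [1.0, 0.1], [0.0, 0.9]])   # observed landmarks y_i

def phi1(pts):
    # stand-in for the endpoint diffeomorphism phi_1 (identity here)
    return pts

E = np.sum((y - phi1(x)) ** 2) / (2 * sigma ** 2)
print(E)  # 0.03 / 0.02 = 1.5
```

In matching, $$\varphi_1$$ is optimized so that this endpoint term, plus the running kinetic-energy cost of the flow, is minimized.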

MAP segmentation based on multiple atlases
The random orbit model for multiple atlases models the orbit of shapes as the union over multiple anatomical orbits generated from the group action of diffeomorphisms, $$ \mathcal{I} = \textstyle \bigcup_{a \in \mathcal{A}}\displaystyle \operatorname{Diff}_V \cdot I_a $$, with each atlas having a template and a predefined segmentation field $$ (I_a,W_a),a = a_1,a_2,\ldots $$, incorporating the parcellation of the MRI coordinates into anatomical structures. The pairs are indexed over the voxel lattice $$I_a(x_i),W_a(x_i), x_i \in X \subset {\mathbb R}^3$$, with an MRI image and a dense labelling of every voxel coordinate. The anatomical labellings of parcellated structures are manual delineations by neuroanatomists.

The Bayes segmentation problem: given the measurement $$I^D$$ with mean field and parcellation $$(I,W)$$, the anatomical labelling $$\theta \doteq W$$ must be estimated for the measured MRI image. The mean field of the observable image $$ I^D $$ is modelled as a random deformation from one of the templates, $$ I \doteq \varphi \cdot I_a $$, which is also randomly selected, $$ A = a $$. The optimal diffeomorphism $$ \varphi \in \mathcal{G} $$ is hidden and acts on the background space of coordinates of the randomly selected template image $$ I_a $$. Given a single atlas $$ a $$, the likelihood model for inference is determined by the joint probability $$ p(I^D,W\mid A = a) $$; with multiple atlases, the fusion of the likelihood functions yields the multi-modal mixture model, with the prior averaging over models.

The MAP estimator of the segmentation $$ W $$ is the maximizer of $$ \log p(W \mid I^D) $$ given $$ I^D $$, which involves the mixture over all atlases:
 * $$ \hat W \doteq \arg \textstyle \max_W \displaystyle \log p(I^D,W) \text{ with } p(I^D,W) =\textstyle \sum_{a \in \mathcal{A}} \displaystyle p(I^D,W\mid A = a)\pi _A(a). $$

The quantity $$ p(I^D,W) $$ is computed via a fusion of likelihoods from multiple deformable atlases, with $$ \pi _A(a) $$ being the prior probability that the observed image evolves from the specific template image $$ I_a $$.
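The fusion step itself is a mixture sum over atlases. A minimal numerical sketch, in which the per-atlas log-likelihoods and the prior $$\pi_A$$ are made-up numbers purely for illustration:

```python
import numpy as np

# Fusion of per-atlas likelihoods into the multi-modal mixture
# p(I^D, W) = sum_a p(I^D, W | A = a) pi_A(a), done in log-space.
log_lik = {"atlas1": -10.2, "atlas2": -9.1, "atlas3": -12.7}  # log p(I^D,W|A=a)
pi_A = {"atlas1": 0.5, "atlas2": 0.3, "atlas3": 0.2}          # prior pi_A(a)

# Numerically stable log-sum-exp over the atlases.
terms = np.array([log_lik[a] + np.log(pi_A[a]) for a in log_lik])
m = terms.max()
log_p = m + np.log(np.sum(np.exp(terms - m)))
print(log_p)   # fused log p(I^D, W) over the deformable atlases
```

The log-sum-exp form matters in practice because dense-image log-likelihoods are large negative numbers and direct exponentiation underflows.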

The MAP segmentation can be iteratively solved via the expectation–maximization algorithm:


 * $$ W^\text{new} \doteq \arg \max_W \int \log p(W,I^D,A, \varphi ) \, dp(A,\varphi \mid W^\text{old},I^D). $$

MAP estimation of volume templates from populations and the EM algorithm
Generating templates empirically from populations is a fundamental operation, ubiquitous to the discipline. Several methods based on Bayesian statistics have emerged for submanifolds and dense image volumes. For the dense image volume case, given the observables $$ I^{D_1}, I^{D_2}, \dots $$, the problem is to estimate the template in the orbit of dense images $$ I \in \mathcal{I} $$. Ma's procedure takes an initial hypertemplate $$ I_0 \in \mathcal{I} $$ as the starting point, and models the template in the orbit under the unknown, to-be-estimated diffeomorphism, $$ I \doteq \varphi_0 \cdot I_0 $$, with the parameters to be estimated the log-coordinates $$\theta \doteq v_0$$ determining the geodesic mapping of the hyper-template, $$\operatorname{Exp}_\mathrm{id}(v_0) \cdot I_0 = I \in \mathcal{I}$$.

In the Bayesian random orbit model of computational anatomy the observed MRI images $$I^{D_i}$$ are modelled as a conditionally Gaussian random field with mean field $$\varphi_i \cdot I$$, with $$\varphi_i$$ a random unknown transformation of the template. The MAP estimation problem is to estimate the unknown template $$ I \in \mathcal{I}$$ given the observed MRI images.

The observables are modelled as conditional random fields, $$ I^{D_i} $$ a conditionally Gaussian random field with mean field $$ \varphi_i \cdot I \doteq \varphi_i \cdot \varphi_0 \cdot I_0 $$. The unknown variable to be estimated explicitly by MAP is the mapping of the hyper-template, $$ \varphi_0$$, with the other mappings considered as nuisance or hidden variables which are integrated out via the Bayes procedure. This is accomplished using the expectation–maximization algorithm.

The orbit model is exploited by associating the unknown, to-be-estimated flows to their log-coordinates $$v_i,i=1,\dots$$, via the Riemannian geodesic log and exponential for computational anatomy, the initial vector fields in the tangent space at the identity, so that $$ \operatorname{Exp}_\mathrm{id}(v_{i}) \doteq \varphi_i $$, with $$ \operatorname{Exp}_\mathrm{id}(v_{0}) $$ the mapping of the hyper-template. The MAP estimation problem becomes



$$ \max_{v_0} p(I^D, \theta = v_0) = \int p(I^D, \theta= v_0\mid v_1,v_2, \dots ) \, \pi(v_1,v_2, \dots ) \, dv . $$

The EM algorithm takes as complete data the vector-field coordinates parameterizing the mapping, $$v_i,i=1,\dots$$, and computes iteratively the conditional expectation

 * $$ \begin{cases} Q(\theta=v_0; \theta^\text{old}=v_0^\text{old}) & = \operatorname{E} ( \log p(I^D, \theta=v_0 \mid v_1,v_2,\dots)\mid I^D, \theta^\text{old}) \\ & = - \| ( \bar I^\text{old} - I_0 \circ \operatorname{Exp}_\mathrm{id}(v_0)^{-1} ) \sqrt{\beta^\text{old}} \|^2 - \|v_0\|_V^2 \end{cases} $$
 * Compute new template maximizing Q-function, setting
 * $$\theta^\text{new}\doteq v_0^\text{new} = \arg \max_{\theta = v_0} Q(\theta ; \theta^\text{old}=v_0^\text{old}) = \arg \max_{v_0} \left( - \left\| ( \bar I^\text{old} - I_0 \circ \operatorname{Exp}_\mathrm{id}(v_0)^{-1} ) \sqrt{\beta^\text{old}} \right\|^2 - \| v_0\|_V^2 \right) $$
 * Compute the mode-approximation for the expectation updating the expected-values for the mode values:
 * $$v_i^\text{new} = \arg \max_{v: \dot \varphi = v \circ \varphi } \left( - \int_0^1 \| v_t \|_V^2 \, dt - \| I^{D_i} - I_0 \circ \operatorname{Exp}_\mathrm{id}(v_0^\text{old})^{-1} \circ \operatorname{Exp}_\mathrm{id}(v)^{-1} \|^2 \right), \ i=1,2,\dots$$
 * $$\beta^\text{new} (x) = \sum_{i=1}^n | D \operatorname{Exp}_\mathrm{id}(v_i^\text{new})(x) |, \text{ with } \bar I^\text{new} (x) = \frac{ \sum_{i=1}^n I^{D_i} \circ \operatorname{Exp}_\mathrm{id}(v_i^\text{new})(x) \, | D \operatorname{Exp}_\mathrm{id}(v_i^\text{new})(x) | }{ \beta^\text{new}(x)}$$
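The template-update step above is a Jacobian-weighted average of the back-deformed observations. A minimal 1-D sketch, in which fixed sinusoidal warps stand in for the estimated geodesic maps $$\operatorname{Exp}_\mathrm{id}(v_i^\text{new})$$ and the observations are noiseless (both assumptions of this illustration):

```python
import numpy as np

# Sketch of the template update: each observation I^{D_i} is pulled back
# through its warp phi_i and weighted by the Jacobian |D phi_i|(x), then
# normalized by beta(x) = sum_i |D phi_i|(x).
n = 200
x = np.linspace(0, 1, n)
template = np.exp(-((x - 0.5) ** 2) / 0.01)            # the unknown truth

# Hypothetical monotone warps standing in for Exp_id(v_i^new), fixing 0 and 1
warps = [x + 0.02 * np.sin(2 * np.pi * x),
         x - 0.015 * np.sin(2 * np.pi * x),
         x + 0.01 * np.sin(4 * np.pi * x)]

observations = []
for w in warps:
    w_inv = np.interp(x, w, x)                         # numerical inverse warp
    observations.append(np.interp(w_inv, x, template)) # I^{D_i} = I o phi_i^{-1}

jacobians = [np.gradient(w, x) for w in warps]         # |D phi_i|(x) in 1-D
beta = np.sum(jacobians, axis=0)
numer = sum(np.interp(w, x, obs) * J                   # I^{D_i} o phi_i, weighted
            for w, obs, J in zip(warps, observations, jacobians))
template_new = numer / beta
err = np.max(np.abs(template_new - template))
print(err)  # small: the weighted average recovers the template
```

With noiseless observations the update reproduces the template up to interpolation error; with noise, the Jacobian weighting keeps the average a proper density-corrected mean over the deformed coordinate systems.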