Bayesian estimation of templates in computational anatomy

Statistical shape analysis and statistical shape theory in computational anatomy (CA) are performed relative to templates; CA is therefore a local theory of statistics on shape. Template estimation from populations of observations is a fundamental operation ubiquitous to the discipline. Several methods for template estimation based on Bayesian probability and statistics in the random orbit model of CA have emerged for submanifolds and dense image volumes.

The deformable template model of shapes and forms via diffeomorphic group actions
Linear algebra is one of the central tools of modern engineering. Central to linear algebra is the notion of an orbit of vectors, with the matrices forming groups (matrices with inverses and identity) which act on the vectors. In linear algebra the equations describing the orbit elements, the vectors, are linear in the vectors being acted upon by the matrices. In computational anatomy the space of all shapes and forms is modeled as an orbit similar to the vectors in linear algebra; however, the groups do not act linearly as the matrices do, and the shapes and forms are not additive. In computational anatomy addition is essentially replaced by the law of composition.

The central group acting in CA, defined on volumes in $${\mathbb R}^3$$, is the group of diffeomorphisms $$\mathcal{G} \doteq Diff$$, which are mappings with three components $$\phi(\cdot) = (\phi_1(\cdot),\phi_2 (\cdot),\phi_3 (\cdot))$$, law of composition of functions $$ \phi \circ \phi^\prime (\cdot)\doteq \phi (\phi^\prime(\cdot)) $$, and inverse $$ \phi \circ \phi^{-1}(\cdot) =\phi ( \phi^{-1}(\cdot))= id$$.

Groups and group actions are familiar to the engineering community through the universal popularization and standardization of linear algebra as a basic model.

A popular group action is on scalar images, $$I(x),x \in {\mathbb R}^3$$, with action on the right via the inverse.



$$ \phi \cdot I(x) = I \circ \phi^{-1} (x), x \in {\mathbb R}^3. $$

For sub-manifolds $$ X \subset {\mathbb R}^3 \in \mathcal{M} $$, parametrized by a chart or immersion $$ m(u), u \in U $$, the diffeomorphic action is the flow of the position



$$ \phi \cdot m(u) \doteq \phi\circ m(u), u \in U. $$

Several group actions in computational anatomy have been defined.
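The two actions above can be checked numerically on a toy case. The following Python sketch works on the real line rather than $${\mathbb R}^3$$, and the maps `phi`, `psi` and the Gaussian "image" are illustrative assumptions, not objects from the text. It shows that acting on an image via the inverse, and on points via the forward map, both respect the law of composition.

```python
import math

def compose(f, g):
    """Law of composition: (f o g)(x) = f(g(x))."""
    return lambda x: f(g(x))

# Two hypothetical diffeomorphisms of the real line with closed-form inverses.
phi = lambda x: 2.0 * x + 1.0
phi_inv = lambda y: (y - 1.0) / 2.0
psi = math.sinh
psi_inv = math.asinh

def act_on_image(f_inv, image):
    """Action on a scalar image: f . I = I o f^{-1}."""
    return lambda x: image(f_inv(x))

def act_on_points(f, points):
    """Action on sampled positions m(u): f . m = f o m."""
    return [f(p) for p in points]

image = lambda x: math.exp(-x * x)   # toy scalar image (assumption)
points = [-1.0, 0.0, 0.5]            # toy sampled submanifold (assumption)

# (phi o psi) . I, using (phi o psi)^{-1} = psi^{-1} o phi^{-1}
lhs = act_on_image(compose(psi_inv, phi_inv), image)
# phi . (psi . I)
rhs = act_on_image(phi_inv, act_on_image(psi_inv, image))

moved_once = act_on_points(compose(phi, psi), points)
moved_twice = act_on_points(phi, act_on_points(psi, points))
```

Acting with the composition in one step and acting with each map in turn give the same result, which is the defining property of a group action.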

Geodesic positioning via the Riemannian exponential
For the study of deformable shape in CA, a more general diffeomorphism group has been the group of choice, which is the infinite-dimensional analogue of the matrix groups of linear algebra. The high-dimensional diffeomorphism groups used in computational anatomy are generated via smooth flows $$ \phi_t, t \in [0,1] $$ which satisfy the Lagrangian and Eulerian specification of the flow fields through the ordinary differential equation:

$$ \frac{d}{dt} \phi_t = v_t \circ \phi_t, \ \phi_0 = id, $$

with $$ v \doteq (v_1,v_2,v_3) $$ the vector fields on $$ {\mathbb R}^3 $$ termed the Eulerian velocity of the particles at position $$\phi$$ of the flow. The vector fields are functions in a function space, modelled as a smooth Hilbert space, with each vector field having one continuous derivative. Equivalently, $$v_t = \dot \phi_t \circ \phi_t^{-1}, t \in [0,1]$$, with the inverse for the flow given by

$$ \frac{d}{dt} \phi_t^{-1} = -(D \phi_t^{-1}) v_t, \ \phi_0^{-1} = id, $$

and the $$3 \times 3$$ Jacobian matrix for flows in $$\mathbb{R}^3$$ given as $$ D\phi \doteq \left(\frac{\partial \phi_i}{\partial x_j}\right). $$
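As an illustration of how a velocity field generates a flow, the following Python sketch integrates $$\dot \phi_t = v_t \circ \phi_t$$ on the real line with a classical Runge-Kutta scheme. The velocity field `velocity`, the step count and the use of one dimension are assumptions made for the example. The inverse map is approximated by integrating the same field backwards in time from $$t=1$$ to $$t=0$$.

```python
import math

def velocity(t, x):
    """Hypothetical smooth, time-dependent Eulerian velocity v_t(x)."""
    return 0.5 * (1.0 + t) * math.sin(x)

def rk4_step(f, t, x, h):
    """One classical fourth-order Runge-Kutta step for dx/dt = f(t, x)."""
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h / 2 * k1)
    k3 = f(t + h / 2, x + h / 2 * k2)
    k4 = f(t + h, x + h * k3)
    return x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def flow(x, t0, t1, steps=200):
    """Transport a particle from time t0 to time t1 along the velocity field."""
    h = (t1 - t0) / steps   # negative when integrating backwards
    t = t0
    for _ in range(steps):
        x = rk4_step(velocity, t, x, h)
        t += h
    return x

def flow_forward(x):
    """phi_1(x): endpoint of the flow started at x."""
    return flow(x, 0.0, 1.0)

def flow_inverse(y):
    """phi_1^{-1}(y): run the same field backwards from t=1 to t=0."""
    return flow(y, 1.0, 0.0)

x0 = 0.3
round_trip = flow_inverse(flow_forward(x0))  # close to x0
```

Because the velocity field is smooth, distinct particles never cross, so the endpoint map is monotone in one dimension and the round trip recovers the starting position up to integration error.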

Flows were first introduced for large deformations in image matching; $$\dot \phi_t(x)$$ is the instantaneous velocity of particle $$x$$ at time $$t$$. The modelling approach used in CA enforces a continuous differentiability condition on the vector fields by modelling the space of vector fields $$(V, \| \cdot \|_V )$$ as a reproducing kernel Hilbert space (RKHS), with the norm defined by a one-to-one differential operator $$ A: V \rightarrow V^* $$ with Green's inverse $$K = A^{-1}$$. The norm is defined according to $$ \| v\|_V^2 \doteq \int_X Av \cdot v \, dx, v \in V, $$ where for $$ \sigma(v) \doteq Av \in V^* $$ a generalized function or distribution, the pairing is $$ (\sigma\mid w)\doteq \int_{{\mathbb R}^3} \sum_i w_i(x) \sigma_i(dx) $$. Since $$ A $$ is a differential operator, finiteness of the norm-square $$ \int_X Av \cdot v \, dx < \infty $$ includes derivatives from the differential operator, implying smoothness of the vector fields.
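A concrete special case may help. When $$Av$$ is a finite sum of point masses $$\sum_i \alpha_i \delta_{x_i}$$, the vector field is $$v(x) = \sum_i K(x,x_i)\alpha_i$$ and the norm-square reduces to $$\sum_{i,j} \alpha_i \cdot K(x_i,x_j)\alpha_j$$, which is also the pairing $$(Av \mid v)$$. The Python sketch below assumes a scalar Gaussian kernel, which is an illustrative choice and not the Green's kernel of any particular operator $$A$$ from the text; the point locations and weights are hypothetical.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Assumed scalar kernel K(x, y); stands in for the Green's kernel of A."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def field(points, momenta, x, sigma=1.0):
    """Evaluate v(x) = sum_i K(x, x_i) alpha_i in R^3."""
    v = [0.0, 0.0, 0.0]
    for p, a in zip(points, momenta):
        k = gaussian_kernel(x, p, sigma)
        for c in range(3):
            v[c] += k * a[c]
    return v

def rkhs_norm_sq(points, momenta, sigma=1.0):
    """||v||_V^2 = sum_{i,j} alpha_i . K(x_i, x_j) alpha_j."""
    total = 0.0
    for p, a in zip(points, momenta):
        for q, b in zip(points, momenta):
            total += gaussian_kernel(p, q, sigma) * sum(ac * bc for ac, bc in zip(a, b))
    return total

def pairing(points, momenta, sigma=1.0):
    """(Av | v) = sum_i alpha_i . v(x_i) when Av is a sum of point masses."""
    return sum(
        sum(ac * vc for ac, vc in zip(a, field(points, momenta, p, sigma)))
        for p, a in zip(points, momenta)
    )

points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]    # hypothetical locations
momenta = [(1.0, 0.0, 0.0), (2.0, 1.0, 0.0)]   # hypothetical weights alpha_i
norm_sq = rkhs_norm_sq(points, momenta)
```

The two routines compute the same number in two ways, which mirrors the identity $$\| v\|_V^2 = (Av \mid v)$$.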

To ensure smooth flows of diffeomorphisms with inverse, the vector fields on $$ {\mathbb R}^3 $$ must be at least once continuously differentiable in space. They are modelled as elements of the Hilbert space $$(V, \| \cdot \|_V )$$ using the Sobolev embedding theorems, so that each element $$v_i \in H_0^3, i=1,2,3,$$ has three square-integrable derivatives. Thus $$(V, \| \cdot \|_V )$$ embeds smoothly in the once continuously differentiable functions. The diffeomorphism group consists of flows with vector fields absolutely integrable in Sobolev norm:

$$ \operatorname{Diff}_V \doteq \left\{ \varphi = \phi_1 : \dot \phi_t = v_t \circ \phi_t, \ \phi_0 = id, \ \int_0^1 \| v_t \|_V \, dt < \infty \right\}. $$

The Bayes model of computational anatomy
The central statistical model of computational anatomy in the context of medical imaging is the source-channel model of Shannon theory: the source is the deformable template of images $$ I \in \mathcal {I} $$, and the channel outputs are the imaging sensors with observables $$ I^D \in {\mathcal I}^{\mathcal D} $$. The variation in the anatomical configurations is modelled separately from the medical imaging modalities, such as computed axial tomography, MRI and PET machines. The Bayes theory models the prior on the source of images $$ \pi_{\mathcal{I}} (\cdot) $$ on $$I \in \mathcal{I} $$, and the conditional density on the observable imagery $$ p(\cdot |I) \ \text{on} \ I^D \in {\mathcal I}^{\mathcal D} $$, conditioned on $$ I \in \mathcal{I} $$. For images with diffeomorphism group action $$I \doteq \phi \cdot I_\mathrm{temp}, \phi \in Diff_V$$, the prior on the group $$\pi_{Diff_V} (\cdot)$$ induces the prior on images $$\pi_{\mathcal{I}} (\cdot)$$; written as densities, the log-posterior takes the form



$$ \log p(\phi \cdot I\mid I^D) \simeq \log p(I^D \mid \phi \cdot I)+ \log \pi_{\operatorname{Diff}_V}(\phi) . $$

Maximum a posteriori (MAP) estimation is central to modern statistical theory. Parameters of interest $$ \theta \in \Theta $$ take many forms including (i) disease type such as neurodegenerative or neurodevelopmental diseases, (ii) structure type such as cortical or subcortical structures in problems associated with segmentation of images, and (iii) template reconstruction from populations. Given the observed image $$ I^D $$, MAP estimation maximizes the posterior:

$$ \hat \theta \doteq \arg \max_{\theta \in \Theta} \log p(\theta \mid I^D). $$

This requires computation of the conditional probabilities $$p(\theta\mid I^D) = \frac{p(I^D,\theta)}{p(I^D)}$$. The multiple atlas orbit model randomizes over the denumerable set of atlases $$\{ I_a, a \in \mathcal{A} \}$$. The model on images in the orbit takes the form of a multi-modal mixture distribution
 * $$p(I^D, \theta) = \textstyle \sum_{a \in \mathcal{A}}  p(I^D,\theta\mid  I_a) \pi_{\mathcal A}(a) \ .$$
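The mixture and the MAP rule can be traced on a small discrete example. In the Python sketch below every number is made up for illustration: two hypothetical atlases, two hypothetical structure labels, and a table of values $$p(I^D,\theta\mid I_a)$$ for one fixed observed image. It forms the joint by summing over atlases, normalizes by $$p(I^D)$$, and picks the label with the largest log-posterior.

```python
import math

# Hypothetical prior over atlases, pi_A(a).
prior = {"atlas_1": 0.5, "atlas_2": 0.5}

# Hypothetical values of p(I^D, theta | I_a) for one fixed observed image I^D.
joint_given_atlas = {
    "atlas_1": {"cortical": 0.10, "subcortical": 0.30},
    "atlas_2": {"cortical": 0.40, "subcortical": 0.05},
}

labels = ["cortical", "subcortical"]

def joint(theta):
    """p(I^D, theta) = sum_a p(I^D, theta | I_a) pi_A(a)."""
    return sum(joint_given_atlas[a][theta] * prior[a] for a in prior)

# p(I^D) is the joint summed over all parameter values.
evidence = sum(joint(theta) for theta in labels)

# p(theta | I^D) = p(I^D, theta) / p(I^D)
posterior = {theta: joint(theta) / evidence for theta in labels}

# MAP estimate: maximize the log-posterior over theta.
theta_map = max(labels, key=lambda theta: math.log(posterior[theta]))
```

Since $$p(I^D)$$ does not depend on $$\theta$$, maximizing the joint $$p(I^D,\theta)$$ gives the same estimate as maximizing the posterior.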

Surface templates for computational neuroanatomy and subcortical structures
The study of sub-cortical neuroanatomy has been the focus of many studies. Since the original publications by Csernansky and colleagues of hippocampal change in schizophrenia, Alzheimer's disease, and depression, many neuroanatomical shape statistical studies have now been completed using templates built from all of the subcortical structures for depression, Alzheimer's disease, bipolar disorder, ADHD, autism, and Huntington's disease. Templates were generated using Bayesian template estimation dating back to Ma, Younes and Miller.

Shown in the accompanying Figure is an example of subcortical structure templates generated from T1-weighted magnetic resonance imagery by Tang et al. for the study of Alzheimer's disease in the ADNI population of subjects.

Surface estimation in cardiac computational anatomy
Numerous studies have now been done on cardiac hypertrophy and the role of the structural integrities in the functional mechanics of the heart. Siamak Ardekani has been working on populations of cardiac anatomies, reconstructing atlas coordinate systems from populations. The figure on the right shows the computational cardiac anatomy method being used to identify regional differences in radial thickness at end-systolic cardiac phase between patients with hypertrophic cardiomyopathy (left) and hypertensive heart disease (right). The color map placed on a common surface template (gray mesh) represents the region (basilar septal and anterior epicardial wall) that has on average significantly larger radial thickness in patients with hypertrophic cardiomyopathy than in patients with hypertensive heart disease (reference below).

MAP Estimation of volume templates from populations and the EM algorithm
Generating templates empirically from populations is a fundamental operation ubiquitous to the discipline. Several methods based on Bayesian statistics have emerged for submanifolds and dense image volumes. For the dense image volume case, given the observables $$ I^{D_1}, I^{D_2}, \dots $$ the problem is to estimate the template in the orbit of dense images $$ I \in \mathcal{I} $$. Ma's procedure takes an initial hypertemplate $$ I_0 \in \mathcal{I} $$ as the starting point, and models the template in the orbit under an unknown diffeomorphism to be estimated, $$ I \doteq \phi_0 \cdot I_0 $$. The parameters to be estimated are the log-coordinates $$\theta \doteq v_0$$ determining the geodesic mapping of the hyper-template $$\mathrm{Exp}_\mathrm{id}(v_0) \cdot I_0 = I \in \mathcal{I}$$.

In the Bayesian random orbit model of computational anatomy each observed MRI image $$I^{D_i}$$ is modelled as a conditionally Gaussian random field with mean field $$\phi_i \cdot I$$, with $$\phi_i$$ a random unknown transformation of the template. The MAP estimation problem is to estimate the unknown template $$ I \in \mathcal{I}$$ given the observed MRI images.

Ma's procedure for dense imagery takes an initial hypertemplate $$ I_0 \in \mathcal{I} $$ as the starting point, and models the template in the orbit under an unknown diffeomorphism to be estimated, $$ I \doteq \phi_0 \cdot I_0 $$. The observables are modelled as conditional random fields, with each $$ I^{D_i} $$ a conditionally Gaussian random field with mean field $$ \phi_i \cdot I \doteq \phi_i \cdot \phi_0 \cdot I_0 $$. The unknown variable to be estimated explicitly by MAP is the mapping of the hyper-template $$ \phi_0$$, with the other mappings considered as nuisance or hidden variables which are integrated out via the Bayes procedure. This is accomplished using the expectation–maximization (EM) algorithm.

The orbit model is exploited by associating the unknown flows to be estimated with their log-coordinates $$v_i,i=1,\dots$$ via the Riemannian geodesic logarithm and exponential for computational anatomy. The log-coordinates are the initial vector fields in the tangent space at the identity, so that $$ \mathrm{Exp}_\mathrm{id}(v_{i}) \doteq \phi_i $$, with $$ \mathrm{Exp}_\mathrm{id}(v_{0}) $$ the mapping of the hyper-template. The MAP estimation problem becomes



$$ \max_{v_0} p(I^D, \theta = v_0) = \int p(I^D, \theta= v_0\mid v_1,v_2, \dots ) \pi(v_1,v_2, \dots ) \, dv $$

The EM algorithm takes as complete data the vector-field coordinates parameterizing the mappings, $$v_i,i=1,\dots$$, and iteratively computes the conditional expectation

$$ \begin{matrix} Q(\theta=v_0; \theta^\text{old}=v_0^\text{old}) & = - E ( \log p(I^D, \theta=v_0 \mid v_1,v_2,\dots) \mid I^D, \theta^\text{old}) \\ & = - \| ( \bar I^\text{old} - I_0 \circ \mathrm{Exp}_\mathrm{id}(v_0)^{-1} ) \sqrt{\beta^\text{old}} \|^2 - \|v_0\|_V^2 \end{matrix} $$
 * Compute the new template maximizing the Q-function, setting $$\theta^\text{new}\doteq v_0^\text{new} = \arg \max_{\theta = v_0} Q(\theta ; \theta^\text{old}=v_0^\text{old}) = \arg \max_{v_0} \left( - \| ( \bar I^\text{old} - I_0 \circ \mathrm{Exp}_\mathrm{id}(v_0)^{-1} ) \sqrt{\beta^\text{old}}\|^2 - \| v_0\|_V^2 \right)$$
 * Compute the mode approximation for the expectation, replacing the expected values with the mode values:
 * $$v_i^\text{new} = \arg \max_{v: \dot \phi = v \circ \phi } - \int_0^1 \| v_t \|_V^2 \, dt - \| I^{D_i} - I_0 \circ \mathrm{Exp}_\mathrm{id}(v_0^\text{old})^{-1} \circ \mathrm{Exp}_\mathrm{id}(v)^{-1} \|^2, \ i=1,2,\dots$$
 * $$\beta^\text{new} (x) = \sum_{i=1}^n | D \mathrm{Exp}_\mathrm{id}(v_i^\text{new})(x) |, \text{ with } \bar I^\text{new} (x) = \frac{ \sum_{i=1}^n I^{D_i} \circ \mathrm{Exp}_\mathrm{id}(v_i^\text{new})(x) \, | D \mathrm{Exp}_\mathrm{id}(v_i^\text{new})(x) | }{ \beta^\text{new}(x)}$$
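The alternation in the steps above can be illustrated on a toy problem. The following Python sketch is not Ma's procedure. It replaces diffeomorphisms by integer translations on a periodic one-dimensional grid, so every Jacobian determinant equals 1 and $$\beta$$ reduces to the number of observations. It also sets the template directly to the weighted average $$\bar I$$ instead of solving for a geodesic mapping of the hypertemplate, and uses a hypothetical penalty `lam * s * s` in place of $$\|v\|_V^2$$. All names and data are illustrative.

```python
import math

N = 32  # samples on a periodic 1-D grid

def roll(signal, s):
    """signal o phi^{-1} for the translation phi(x) = x + s (periodic)."""
    return [signal[(k - s) % N] for k in range(N)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def register(template, observed, lam=1e-3):
    """Mode step: shift s maximizing -lam*s^2 - ||observed - template o phi_s^{-1}||^2."""
    shifts = range(-N // 2, N // 2)
    return min(shifts, key=lambda s: lam * s * s + sq_dist(observed, roll(template, s)))

def estimate_template(observations, iterations=3):
    template = list(observations[0])  # hypertemplate: first observation (assumption)
    shifts = [0] * len(observations)
    for _ in range(iterations):
        # Mode approximation: best transformation of the template per observation.
        shifts = [register(template, obs) for obs in observations]
        jac = [1.0] * len(observations)  # |D phi_i| = 1 for translations
        beta = sum(jac)
        # I_bar(x) = sum_i I^{D_i}(phi_i(x)) |D phi_i(x)| / beta(x)
        pulled = [roll(obs, -s) for obs, s in zip(observations, shifts)]
        template = [sum(j * p[k] for j, p in zip(jac, pulled)) / beta for k in range(N)]
    return template, shifts

# Noise-free toy data: one Gaussian bump seen under four translations.
bump = [math.exp(-((k - 16) ** 2) / 8.0) for k in range(N)]
observations = [roll(bump, s) for s in (0, 3, -4, 5)]
template, shifts = estimate_template(observations)
```

With noise-free data the pulled-back observations coincide, so the average reproduces the bump and the recovered shifts are the ones used to generate the data. With general diffeomorphisms the Jacobian weights are no longer constant and the averaging step becomes the weighted mean given above.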