Kushner equation

In filtering theory the Kushner equation (after Harold Kushner) is an equation for the conditional probability density of the state of a stochastic non-linear dynamical system, given noisy measurements of the state. It therefore provides the solution of the nonlinear filtering problem in estimation theory. The equation is sometimes referred to as the Stratonovich–Kushner (or Kushner–Stratonovich) equation.

Overview
Assume the state of the system evolves according to


 * $$dx = f(x,t) \, dt + \sigma\, dw$$

and a noisy measurement of the system state is available:


 * $$dz = h(x,t) \, dt + \eta\, dv$$

where w and v are independent Wiener processes. Then the conditional probability density p(x, t) of the state at time t, given the measurements up to time t, satisfies the Kushner equation:


 * $$dp(x,t) = L[p(x,t)] \, dt + p(x,t) \big(h(x,t)-E_t h(x,t) \big)^\top \eta^{-\top}\eta^{-1} \big(dz-E_t h(x,t) \, dt\big),$$

where
 * $$L[p] := -\sum \frac{\partial (f_i p)}{\partial x_i} + \frac{1}{2} \sum (\sigma \sigma^\top)_{i,j} \frac{\partial^2 p}{\partial x_i \partial x_j}$$

is the Kolmogorov forward operator and
 * $$dp(x,t) = p(x,t + dt) - p(x,t)$$

is the variation of the conditional probability.

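The term $$dz - E_t h(x,t) dt$$ is the innovation, i.e. the difference between the measurement increment and its conditional expected value. Here $$E_t h(x,t) = \int h(x,t) \, p(x,t) \, dx$$ denotes the expectation with respect to the conditional density $$p(x,t)$$.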
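As an illustration of how the terms of the equation fit together, the following is a minimal numerical sketch, not a recommended filter implementation. It applies a naive explicit Euler step of the scalar Kushner equation on a uniform grid. The drift f(x) = -x, the observation function h(x) = x, the grid, the step size, and the names `kushner_step`, `moments` and `run_filter` are all assumptions chosen for the example. Clipping and renormalising the density after each step is a pragmatic safeguard added here, not part of the equation.

```python
import numpy as np


def kushner_step(p, x, dz, dt, f, h, sigma, eta):
    """One explicit Euler step of the scalar Kushner equation on a uniform grid.

    p is the current density sampled on the grid x, dz is the measurement
    increment over [t, t + dt]. All names here are illustrative assumptions.
    """
    dx = x[1] - x[0]
    fp = f(x) * p
    # Kolmogorov forward operator L[p] = -d(f p)/dx + (sigma^2 / 2) d^2 p / dx^2,
    # central differences in the interior; the two boundary cells get no
    # drift/diffusion update (the density is assumed negligible there).
    Lp = np.zeros_like(p)
    Lp[1:-1] = (-(fp[2:] - fp[:-2]) / (2.0 * dx)
                + 0.5 * sigma**2 * (p[2:] - 2.0 * p[1:-1] + p[:-2]) / dx**2)
    hx = h(x)
    h_hat = np.sum(hx * p) * dx            # E_t h, by a Riemann sum
    innovation = dz - h_hat * dt           # dz - E_t h dt
    p_new = p + Lp * dt + p * (hx - h_hat) * innovation / eta**2
    # Safeguard (assumption): remove small negative values, then renormalise.
    p_new = np.clip(p_new, 0.0, None)
    return p_new / (np.sum(p_new) * dx)


def moments(p, x):
    """Mean and variance of a density sampled on a uniform grid."""
    dx = x[1] - x[0]
    mean = np.sum(x * p) * dx
    var = np.sum((x - mean) ** 2 * p) * dx
    return mean, var


def run_filter(n_steps=3000, dt=1e-3, sigma=1.0, eta=1.0, seed=0):
    """Simulate state and measurement by Euler-Maruyama and filter them."""
    rng = np.random.default_rng(seed)
    f = lambda s: -s                       # assumed drift
    h = lambda s: s                        # assumed observation function
    x = np.linspace(-5.0, 5.0, 201)
    dx = x[1] - x[0]
    p = np.exp(-0.5 * x**2)
    p /= np.sum(p) * dx                    # standard normal prior on the grid
    state = rng.normal()                   # true state drawn from the prior
    for _ in range(n_steps):
        dw, dv = rng.normal(scale=np.sqrt(dt), size=2)
        dz = h(state) * dt + eta * dv      # measurement increment
        state = state + f(state) * dt + sigma * dw
        p = kushner_step(p, x, dz, dt, f, h, sigma, eta)
    return x, p, state
```

Calling `x, p, state = run_filter()` and then `moments(p, x)` gives the conditional mean and variance at the final time. The explicit scheme is only stable for a sufficiently small `dt` relative to the grid spacing, which is why the defaults use a time step well below `dx**2 / sigma**2`.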

Kalman–Bucy filter
One can use the Kushner equation to derive the Kalman–Bucy filter for a linear diffusion process. Suppose we have $$ f(x,t) = A x$$ and $$ h(x,t) = C x $$. The Kushner equation will be given by

 * $$dp(x,t) = L[p(x,t)] dt + p(x,t) \big( C x- C \mu(t) \big)^\top \eta^{-\top}\eta^{-1} \big(dz-C \mu(t) dt\big),$$

where $$\mu(t)$$ is the mean of the conditional probability at time $$t$$. Multiplying by $$x$$ and integrating over it, we obtain the variation of the mean

 * $$d\mu(t) = A \mu(t) dt + \Sigma(t) C^\top \eta^{-\top}\eta^{-1} \big(dz - C\mu(t) dt\big).$$

Likewise, the variation of the conditional covariance $$\Sigma(t)$$ is given by

 * $$\tfrac{d}{dt}\Sigma(t) = A\Sigma(t) + \Sigma(t) A^\top + \sigma \sigma^\top-\Sigma(t) C^\top\eta^{-\top} \eta^{-1} C \,\Sigma(t).$$

The conditional probability is then given at every instant by a normal distribution $$\mathcal{N}(\mu(t),\Sigma(t))$$.
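The mean and covariance equations of this section can be integrated numerically. The following is a minimal sketch using a plain Euler discretisation, with the process-noise covariance taken as $$\sigma \sigma^\top$$ as in the Kolmogorov forward operator. The function name `kalman_bucy_step`, the step size and the scalar example values are assumptions made for illustration.

```python
import numpy as np


def kalman_bucy_step(mu, Sigma, dz, dt, A, C, sigma, eta):
    """One Euler step of the Kalman-Bucy mean and covariance equations.

    mu: conditional mean (n,), Sigma: conditional covariance (n, n),
    dz: measurement increment (m,), A: (n, n), C: (m, n),
    sigma: process-noise matrix, eta: invertible measurement-noise matrix (m, m).
    """
    # eta^{-T} eta^{-1} equals the inverse of eta eta^T.
    R_inv = np.linalg.inv(eta @ eta.T)
    gain = Sigma @ C.T @ R_inv
    # d mu = A mu dt + Sigma C^T eta^{-T} eta^{-1} (dz - C mu dt)
    mu_new = mu + A @ mu * dt + gain @ (dz - C @ mu * dt)
    # d Sigma/dt = A Sigma + Sigma A^T + sigma sigma^T
    #              - Sigma C^T eta^{-T} eta^{-1} C Sigma
    dSigma = A @ Sigma + Sigma @ A.T + sigma @ sigma.T - gain @ C @ Sigma
    return mu_new, Sigma + dSigma * dt


# Scalar example (assumed values): A = -1, C = 1, sigma = 1, eta = 1.
A = np.array([[-1.0]])
C = np.array([[1.0]])
sigma = np.array([[1.0]])
eta = np.array([[1.0]])
mu, Sigma = np.zeros(1), np.eye(1)
for _ in range(10000):
    # The covariance equation does not depend on the data, so a zero
    # measurement increment is enough to observe its convergence.
    mu, Sigma = kalman_bucy_step(mu, Sigma, np.zeros(1), 1e-3, A, C, sigma, eta)
print(Sigma[0, 0])
```

For these scalar values the stationary covariance solves $$\Sigma^2 + 2\Sigma - 1 = 0$$, so the printed value should approach $$\sqrt{2} - 1 \approx 0.414$$.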