Velocity Moments

In computer vision, velocity moments are weighted averages of the pixel intensities in a sequence of images. They are similar to image moments, but in addition to describing an object's shape they also describe its motion through the sequence. Velocity moments can aid automated identification of a shape when its motion is a significant part of its description. There are currently two established versions of velocity moments: Cartesian and Zernike.

Cartesian moments for single images
A Cartesian moment of a single image is calculated by


 * $$ m_{pq} = \sum_{x=1}^M \sum_{y=1}^N x^p y^q P_{xy} $$

where $$M$$ and $$N$$ are the dimensions of the image, $$P_{xy}$$ is the intensity of the pixel at the point $$(x,y)$$ in the image, and $$x^p y^q$$ is the basis function.
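
The sum above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function name `cartesian_moment` is hypothetical, and the convention that array axis 0 is $$x$$ and axis 1 is $$y$$ (with 1-based pixel coordinates, as in the formula) is an assumption.

```python
import numpy as np

def cartesian_moment(image, p, q):
    """Return m_pq = sum_x sum_y x^p y^q P_xy using 1-based pixel coordinates."""
    image = np.asarray(image, dtype=float)
    M, N = image.shape  # assumption: axis 0 is x, axis 1 is y
    x = np.arange(1, M + 1, dtype=float)[:, None]  # column vector of x values
    y = np.arange(1, N + 1, dtype=float)[None, :]  # row vector of y values
    return float(np.sum((x ** p) * (y ** q) * image))

# A 2x2 block of unit intensity covering x in {2, 3} and y in {3, 4}.
image = np.zeros((4, 5))
image[1:3, 2:4] = 1.0

m00 = cartesian_moment(image, 0, 0)  # total intensity (area for a binary image)
m10 = cartesian_moment(image, 1, 0)
m01 = cartesian_moment(image, 0, 1)
print(m00, m10 / m00, m01 / m00)  # prints 4.0 2.5 3.5
```

The ratios $$m_{10}/m_{00}$$ and $$m_{01}/m_{00}$$ give the centre of mass, which is the quantity the velocity moments track from image to image.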

Cartesian velocity moments for sequences of images
Cartesian velocity moments are based on these Cartesian moments. A Cartesian velocity moment $$vm_{pq\mu\gamma}$$ is defined by


 * $$ vm_{pq\mu\gamma} = \sum_{i=2}^{images} \sum_{x=1}^M \sum_{y=1}^N U(i,\mu,\gamma) C(i,p,q) P_{i_{xy}}$$

where $$M$$ and $$N$$ are again the dimensions of the image, $$images$$ is the number of images in the sequence, and $$P_{i_{xy}}$$ is the intensity of the pixel at the point $$(x,y)$$ in image $$i$$.

$$C(i,p,q)$$ is taken from central moments and makes the equation translation invariant. It is defined as


 * $$ C(i,p,q) = (x-\overline{x_i})^p (y-\overline{y_i})^q $$

where $$\overline{x_i}$$ is the $$x$$ coordinate of the centre of mass for image $$i$$, and similarly for $$y$$.

$$ U(i,\mu,\gamma)$$ introduces velocity into the equation as


 * $$ U(i,\mu,\gamma) = (\overline{x_i}-\overline{x_{i-1}})^\mu (\overline{y_i}-\overline{y_{i-1}})^\gamma $$

where $$\overline{x_{i-1}}$$ is the $$x$$ coordinate of the centre of mass for the previous image, $$i-1$$, and again similarly for $$y$$.
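
Putting the definitions of $$C$$ and $$U$$ together, a Cartesian velocity moment might be computed as follows. This is a sketch under stated assumptions: the names `centre_of_mass` and `cartesian_velocity_moment` are hypothetical, all images are assumed to have the same dimensions, and axis 0 is taken as $$x$$ with 1-based pixel coordinates.

```python
import numpy as np

def centre_of_mass(image):
    """Intensity-weighted mean (x, y) of one image, in 1-based pixel coordinates."""
    image = np.asarray(image, dtype=float)
    M, N = image.shape  # assumption: axis 0 is x, axis 1 is y
    x = np.arange(1, M + 1, dtype=float)[:, None]
    y = np.arange(1, N + 1, dtype=float)[None, :]
    total = image.sum()
    return (x * image).sum() / total, (y * image).sum() / total

def cartesian_velocity_moment(frames, p, q, mu, gamma):
    """Sum U(i, mu, gamma) * C(i, p, q) * P_i over every image after the first."""
    frames = [np.asarray(f, dtype=float) for f in frames]
    centres = [centre_of_mass(f) for f in frames]
    M, N = frames[0].shape
    x = np.arange(1, M + 1, dtype=float)[:, None]
    y = np.arange(1, N + 1, dtype=float)[None, :]
    vm = 0.0
    for i in range(1, len(frames)):  # 0-based index 1 is image i = 2 in the formula
        xc, yc = centres[i]
        xp, yp = centres[i - 1]
        U = (xc - xp) ** mu * (yc - yp) ** gamma   # velocity term
        C = (x - xc) ** p * (y - yc) ** q          # centralised shape term
        vm += U * np.sum(C * frames[i])
    return float(vm)

# Three frames of a 2x2 block moving one pixel per frame in the x direction.
frames = []
for k in range(3):
    f = np.zeros((6, 6))
    f[k:k + 2, 2:4] = 1.0
    frames.append(f)

print(cartesian_velocity_moment(frames, 0, 0, 1, 0))  # x velocity weighted by area
print(cartesian_velocity_moment(frames, 0, 0, 0, 1))  # no motion in y
```

In this toy sequence the centre of mass moves by exactly one pixel in $$x$$ between consecutive images, so $$vm_{0010}$$ is the block's total intensity summed over the two contributing images, while $$vm_{0001}$$ is zero because there is no motion in $$y$$.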

After the Cartesian velocity moment is calculated, it can be normalised by


 * $$ \overline{vm_{pq\mu\gamma}} = \frac {vm_{pq\mu\gamma}}{A * I} $$

where $$A$$ is the average area of the object, in pixels, and $$I$$ is the number of images. The normalised value is then unaffected by the number of images in the sequence or the size of the object.
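
The normalisation step can be illustrated as below. The sketch assumes binary silhouette images, so that the object's area in each image is its count of non-zero pixels, and it takes $$I$$ to be the length of the whole sequence; both choices, and the name `normalise_velocity_moment`, are assumptions made for illustration.

```python
import numpy as np

def normalise_velocity_moment(vm, frames):
    """Divide a velocity moment by (average object area) * (number of images)."""
    frames = [np.asarray(f) for f in frames]
    I = len(frames)  # assumption: every image in the sequence is counted
    A = np.mean([np.count_nonzero(f) for f in frames])  # average area in pixels
    return vm / (A * I)

# Three binary frames, each containing a 2x2 object (area 4 pixels).
frames = []
for k in range(3):
    f = np.zeros((6, 6))
    f[k:k + 2, 2:4] = 1.0
    frames.append(f)

vm = 8.0  # an example unnormalised velocity moment for this sequence
print(normalise_velocity_moment(vm, frames))  # 8 / (4 * 3)
```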

Because Cartesian moments are non-orthogonal, Cartesian velocity moments are non-orthogonal too, so different moments can be closely correlated. They do, however, provide translation and scale invariance (unless the scale changes within the sequence of images).

Zernike moments for single images
A Zernike moment of a single image is calculated by


 * $$ A_{mn} = \frac {m + 1}\pi \sum_x \sum_y [V_{mn}(r,\theta)]^* P_{xy}$$

where $$^*$$ denotes the complex conjugate, $$m$$ is a non-negative integer, and $$n$$ is an integer such that $$m - |n|$$ is even and $$|n| \le m$$. To calculate Zernike moments, the image, or the section of it that is of interest, is mapped to the unit disc; $$P_{xy}$$ is then the intensity of the pixel at the point $$(x,y)$$ on the disc, with $$x$$ and $$y$$ restricted by $$x^2 + y^2 \le 1$$. The coordinates are then converted to polar form, so that $$r$$ and $$\theta$$ are the polar coordinates of the point $$(x,y)$$ on the unit disc.

$$V_{mn}(r,\theta)$$ is derived from Zernike polynomials and is defined by


 * $$ V_{mn}(r,\theta) = R_{mn}(r)e^{jn\theta} $$


 * $$ R_{mn}(r) = \sum_{s=0}^{\frac {m-|n|}2} (-1)^s F(m,n,s,r) $$


 * $$ F(m,n,s,r) = \frac {(m-s)!} {s! (\frac {m+|n|}2-s)! (\frac {m-|n|}2-s)!} r^{m-2s} $$
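
The radial polynomial and the single-image Zernike moment can be sketched as follows. The function names are hypothetical, and the mapping of the whole pixel grid onto the square $$[-1,1] \times [-1,1]$$ (keeping only pixels with $$x^2 + y^2 \le 1$$) is one simple assumption about how the image is mapped to the unit disc. As in the formula above, no pixel-area factor is applied.

```python
import numpy as np
from math import factorial

def radial_polynomial(m, n, r):
    """R_mn(r) as the finite sum of (-1)^s F(m, n, s, r)."""
    k = abs(n)
    if k > m or (m - k) % 2 != 0:
        raise ValueError("require |n| <= m and m - |n| even")
    r = np.asarray(r, dtype=float)
    R = np.zeros_like(r)
    for s in range((m - k) // 2 + 1):
        coeff = factorial(m - s) / (
            factorial(s)
            * factorial((m + k) // 2 - s)
            * factorial((m - k) // 2 - s)
        )
        R = R + (-1) ** s * coeff * r ** (m - 2 * s)
    return R

def zernike_moment(image, m, n):
    """A_mn = (m + 1) / pi * sum over the unit disc of conj(V_mn) * P_xy."""
    image = np.asarray(image, dtype=float)
    M, N = image.shape
    # Assumption: pixel centres are spread evenly over [-1, 1] on each axis.
    x = np.linspace(-1.0, 1.0, M)[:, None]
    y = np.linspace(-1.0, 1.0, N)[None, :]
    r = np.sqrt(x ** 2 + y ** 2)
    theta = np.arctan2(y, x)
    inside = r <= 1.0  # pixels outside the unit disc are ignored
    V = radial_polynomial(m, n, r) * np.exp(1j * n * theta)
    return (m + 1) / np.pi * np.sum(np.conj(V)[inside] * image[inside])

# R_20(r) = 2 r^2 - 1, so it is -1 at the centre and 1 on the rim.
print(radial_polynomial(2, 0, np.array([0.0, 1.0])))

image = np.ones((5, 5))
print(zernike_moment(image, 0, 0))       # proportional to the intensity inside the disc
print(abs(zernike_moment(image, 1, 1)))  # close to zero for a symmetric image
```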

Zernike velocity moments for sequences of images
Zernike velocity moments are based on these Zernike moments. A Zernike velocity moment $$A_{mn\mu\gamma}$$ is defined by


 * $$ A_{mn\mu\gamma} = \frac {m + 1}\pi \sum_{i=2}^{images} \sum_x \sum_y U(i,\mu,\gamma) [V_{mn}(r,\theta)]^* P_{i_{xy}}$$

where $$images$$ is again the number of images in the sequence, and $$P_{i_{xy}}$$ is the intensity of the pixel at the point $$(x,y)$$ on the unit disc mapped from image $$i$$.

$$ U(i,\mu,\gamma)$$ introduces velocity into the equation in the same way as in the Cartesian velocity moments, and $$[V_{mn}(r,\theta)]^*$$ is the same term as in the Zernike moments equation above.
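
A Zernike velocity moment combines the two ingredients already described: the velocity term $$U$$ and the conjugated Zernike basis. The sketch below is illustrative only. It assumes that each whole image is mapped onto the unit disc in the same way, that all images share the same dimensions, and that the centres of mass used in $$U$$ are measured in 1-based pixel coordinates; the function names are hypothetical.

```python
import numpy as np
from math import factorial

def zernike_basis(m, n, shape):
    """Sample V_mn on a grid mapped to [-1, 1]^2; also return the unit-disc mask."""
    M, N = shape
    x = np.linspace(-1.0, 1.0, M)[:, None]
    y = np.linspace(-1.0, 1.0, N)[None, :]
    r = np.sqrt(x ** 2 + y ** 2)
    theta = np.arctan2(y, x)
    k = abs(n)
    R = np.zeros_like(r)
    for s in range((m - k) // 2 + 1):
        coeff = factorial(m - s) / (
            factorial(s) * factorial((m + k) // 2 - s) * factorial((m - k) // 2 - s)
        )
        R = R + (-1) ** s * coeff * r ** (m - 2 * s)
    return R * np.exp(1j * n * theta), r <= 1.0

def centre_of_mass(image):
    """Intensity-weighted mean (x, y) in 1-based pixel coordinates."""
    M, N = image.shape
    x = np.arange(1, M + 1, dtype=float)[:, None]
    y = np.arange(1, N + 1, dtype=float)[None, :]
    total = image.sum()
    return (x * image).sum() / total, (y * image).sum() / total

def zernike_velocity_moment(frames, m, n, mu, gamma):
    """(m + 1) / pi * sum over images i >= 2 of U(i, mu, gamma) * conj(V_mn) * P_i."""
    frames = [np.asarray(f, dtype=float) for f in frames]
    V, inside = zernike_basis(m, n, frames[0].shape)
    centres = [centre_of_mass(f) for f in frames]
    total = 0j
    for i in range(1, len(frames)):  # 0-based index 1 is image i = 2 in the formula
        dx = centres[i][0] - centres[i - 1][0]
        dy = centres[i][1] - centres[i - 1][1]
        U = dx ** mu * dy ** gamma
        total += U * np.sum(np.conj(V)[inside] * frames[i][inside])
    return (m + 1) / np.pi * total

# A single bright pixel moving one pixel per frame in the x direction.
frames = []
for k in range(3):
    f = np.zeros((5, 5))
    f[k + 1, 2] = 1.0
    frames.append(f)

print(zernike_velocity_moment(frames, 0, 0, 1, 0))
print(zernike_velocity_moment(frames, 0, 0, 0, 1))  # zero: no motion in y
```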

Like the Cartesian velocity moments, Zernike velocity moments can be normalised by


 * $$ \overline{A_{mn\mu\gamma}} = \frac {A_{mn\mu\gamma}}{A * I} $$

where $$A$$ is the average area of the object, in pixels (not to be confused with the moment $$A_{mn\mu\gamma}$$), and $$I$$ is the number of images.

As Zernike velocity moments are based on the orthogonal Zernike moments, they produce less correlated and more compact descriptions than Cartesian velocity moments. Zernike velocity moments also provide translation and scale invariance (even when the scale changes within the sequence).