Multi-fractional order estimator

In target tracking, the multi-fractional order estimator (MFOE) is an alternative to the Kalman filter (KF). The MFOE focuses on simple, pragmatic fundamentals and on the integrity of the mathematical modeling. Like the KF, the MFOE is based on the least squares method (LSM) invented by Gauss and on the orthogonality principle at the center of Kalman's derivation. Optimized, the MFOE yields better accuracy than the KF and subsequent algorithms such as the extended KF and the interacting multiple model (IMM). The MFOE is an expanded form of the LSM that effectively includes the KF and ordinary least squares (OLS) as subsets (special cases). OLS has also been reworked within this framework for application in econometrics. The MFOE also intersects with signal processing, estimation theory, economics, finance, statistics, and the method of moments. The MFOE offers two major advances: (1) minimizing the mean squared error (MSE) with fractions of estimated coefficients (useful in target tracking) and (2) describing the effect of deterministic OLS processing of statistical inputs (of value in econometrics).

Description
Consider equally time spaced noisy measurement samples of a target trajectory described by

$$y_n = \sum_{j=1} ^J c_j n^{j-1} + \eta_n= x_n + \eta_n $$

where n is both the sample index and the sample time (in units of the sample period); the polynomial describing the trajectory is of degree J-1; and $$\eta_n $$ is zero mean, stationary, white noise (not necessarily Gaussian) with variance $$\sigma_\eta ^2$$.

Estimating x(t) at time $$\tau$$ with the MFOE is described by

$$\hat x (\tau) = \sum_{n=1}^{N} y_n w_n (\tau)$$

where the hat (^) denotes an estimate, N is the number of samples in the data window, $$\tau$$ is the time of the desired estimate, and the data weights are

$$w_n (\tau) = \sum _m U_{mn} T_m (\tau) f_m $$

The $$U_{mn}$$ are orthogonal polynomial coefficient estimators. $$T_{m}(\tau)$$ projects the estimate of the polynomial coefficient $$c_m$$ to the desired estimation time $$\tau$$. The MFOE parameter 0≤$$f_m$$≤1 applies a fraction of the projected coefficient estimate.

The combined terms $$U_{mn}T_m$$ effectively constitute a novel set of expansion functions with coefficients $$f_m$$. The MFOE can be optimized at time $$\tau$$ as a function of the $$f_m$$ for given measurement noise, target dynamics, and non-recursive sliding data window size, N. With all $$f_m=1$$, however, the MFOE reduces to the standard polynomial LSM and is equivalent to the KF in the absence of process noise.
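The weight construction can be sketched numerically. The following is a minimal illustration, not the source's implementation: it assumes that the combined terms $$U_{mn}T_m(\tau)$$ can be realized as $$p_m(n)p_m(\tau)$$, where the $$p_m$$ are polynomials made orthonormal over the sample indices n = 1, ..., N by Gram-Schmidt. The function names are hypothetical.

```python
def poly_eval(coeffs, t):
    """Evaluate a polynomial given ascending-power coefficients."""
    return sum(c * t**k for k, c in enumerate(coeffs))


def orthonormal_basis(N, J):
    """Gram-Schmidt on 1, t, t^2, ... over the nodes n = 1..N (assumed construction)."""
    nodes = range(1, N + 1)
    basis = []
    for j in range(J):
        p = [0.0] * j + [1.0]  # start from the monomial t^j
        for q in basis:
            proj = sum(poly_eval(p, n) * poly_eval(q, n) for n in nodes)
            q_padded = q + [0.0] * (len(p) - len(q))
            p = [pc - proj * qc for pc, qc in zip(p, q_padded)]
        norm = sum(poly_eval(p, n) ** 2 for n in nodes) ** 0.5
        basis.append([c / norm for c in p])
    return basis


def mfoe_weights(N, tau, f):
    """Data weights w_n(tau) = sum_m f_m * p_m(n) * p_m(tau), for n = 1..N."""
    basis = orthonormal_basis(N, len(f))
    return [
        sum(fm * poly_eval(p, n) * poly_eval(p, tau) for fm, p in zip(f, basis))
        for n in range(1, N + 1)
    ]


def estimate(samples, weights):
    """Estimate x(tau) as the inner product of the data window and the weights."""
    return sum(y * w for y, w in zip(samples, weights))


# With all f_m = 1 the weights are ordinary polynomial least squares:
# a noise-free quadratic is extrapolated one step ahead without error.
w = mfoe_weights(5, 6, [1, 1, 1])
one_step = estimate([n * n for n in range(1, 6)], w)
```

Under this assumption, setting a fraction such as `f = [1, 1, 0.4]` scales only the contribution of the 3rd coefficient, which is the mechanism the text describes.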

As in the case of coefficients in conventional series expansions, the $$f_m$$ typically decrease monotonically as higher order terms are included to match complex target trajectories. For example, in one reported application the $$f_m$$ decreased monotonically from $$f_1=1$$ to $$f_5 \gtrsim 0$$, where $$f_m = 0 $$ for m ≥ 6. The MFOE in that application consisted of five-point, 5th order processing of composite real (but altered for declassification) cruise missile data. A window of only 5 data points provided excellent maneuver following, whereas 5th order processing included fractions of higher order terms to better approximate the complex maneuvering target trajectory. Terms higher than 3rd order were rejected long ago because, taken at full value (i.e., all $$f_m=1$$), estimator variances increase exponentially with linear order increases; the MFOE overcomes this by applying only fractions of those terms. (This is elucidated below in the section "Application of the FOE".)

Fractional order estimator
The MFOE can be written more compactly as $$\hat x = \langle\psi,\omega_m\rangle $$ where the estimator weights $$w_n(\tau)$$ of order m are components of the estimating vector $$\omega_m (\tau)$$. By definition $$\hat x \doteq \hat x(\tau)$$ and $$\omega_m\doteq \omega_m(\tau)$$. The angle brackets and comma $$\langle\,,\rangle$$ denote the inner product, and the data vector $$\psi$$ comprises the noisy measurement samples $$y_n$$.

Perhaps the most useful MFOE tracking estimator is the simple fractional order estimator (FOE), where $$f_1=f_2=1$$ and $$f_m=0$$ for all m > 3, leaving only $$0\le f_3 \le 1 $$. This is effectively an FOE of fractional order $$2+f_3$$, which linearly interpolates between the 2nd and 3rd order estimators as

$$w_{2+f_{3}}=(1-f_3) \omega_2 +f_3 \omega_3 = \omega_2 + f_3 (\omega_3-\omega_2)=\omega_2+f_3\nu_3$$

where the scalar fraction $$f_3$$ is the linear interpolation factor, the vector $$\nu_3=\omega_3-\omega_2= \upsilon_3T_3$$, and $$\upsilon_3$$ (which comprises the components $$U_{3n}$$) is the vector estimator of the 3rd polynomial coefficient $$c_3\equiv\tfrac{a\Delta^2}{2} $$ (a is acceleration and Δ is the sample period). The vector $$\nu_3$$ is the acceleration estimator from $$\omega_3$$.

The mean-square error (MSE) from the FOE applied to an accelerating target is

$$MSE=\sigma_\eta ^2(|\omega_2|^2+f_3^2|\nu_3|^2)+[c_3T_3(1-f_3)]^2$$, where for any vector $$\theta$$, $$| \theta |^2 \doteq<\theta,\theta> $$.

The first term on the right of the equal sign is the FOE target location estimator variance $$\sigma_\eta ^2(|\omega_2|^2+f_3^2|\nu_3|^2)$$ composed of the 2nd order location estimator variance and part of the variance from the 3rd order acceleration estimator as determined by the interpolation factor squared $$f_3^2$$. The second term is the bias squared $$[c_3T_3(1-f_3)]^2$$ from the 2nd order target location estimator as a function of acceleration in $$c_3$$.

Setting the derivative of the MSE with respect to $$f_3$$ equal to zero and solving yields the optimal $$f_3$$:

$$f_{3,opt} \doteq f_{3,opt}(\tau)= \frac {(c_3 T_3)^2} {(c_3 T_3)^2 + \sigma_\eta ^2 |\nu_3|^2} = \frac {c_3^2} {c_3^2 + \sigma_\eta^2|\upsilon_3|^2}= \frac {\rho_3^2} {\rho_3^2+|\upsilon_3|^2}$$

where $$\rho_3\equiv \frac {c_3}{\sigma_\eta}=\frac {a\Delta^2} {2\sigma_\eta}$$, as defined in.

The optimal FOE is then very simply

$$w_{2+f_{3,opt}}=\omega_2+f_{3,opt}\nu_3=\omega_2+\upsilon_3 T_3f_{3,opt}= \omega_2+\upsilon_3T_3\frac {\rho_3^2} {\rho_3^2+|\upsilon_3|^2}$$

Substituting the optimal FOE into the MSE yields the minimum MSE:

$$MSE_{min}=\sigma_\eta ^2(|\omega_2|^2+f_{3,opt}|\nu_3|^2)$$

Although not obvious, the $$MSE_{min}$$ includes the bias squared. The variance in the FOE MSE is the quadratic interpolation between the 2nd and the 3rd order location estimator variances as a function of $$f_{3,opt}^2$$, whereas the $$MSE_{min}$$ is the linear interpolation between the same 2nd and 3rd order location estimator variances as a function of $$f_{3,opt}$$. The bias squared accounts for the difference.
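A short derivation shows where the bias goes. The shorthand $$V \doteq \sigma_\eta^2|\nu_3|^2$$ and $$B \doteq (c_3T_3)^2$$ is introduced here only for illustration and does not come from the source. With this shorthand,

$$f_{3,opt}=\frac{B}{B+V}, \qquad 1-f_{3,opt}=\frac{V}{B+V}$$

so the part of the MSE that depends on $$f_3$$ becomes, at the optimum,

$$f_{3,opt}^2V+(1-f_{3,opt})^2B=\frac{B^2V+V^2B}{(B+V)^2}=\frac{BV}{B+V}=f_{3,opt}V$$

The variance contribution $$f_{3,opt}^2V$$ and the bias squared $$(1-f_{3,opt})^2B$$ therefore sum to $$f_{3,opt}\,\sigma_\eta^2|\nu_3|^2$$, which is the second term of $$MSE_{min}$$.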

Application of the FOE
Since a target's future location is generally of more interest than where it is or has been, consider one-step prediction. Normalized with respect to measurement noise variance, the MSE for equally spaced samples reduces for the predicted position to

$$MSE = \frac {1} {N}+\frac {3(N+1)} {N(N-1)}+ f_3^2\frac {5(N+1)(N+2)}{N(N-1)(N-2)}+\rho_3^2 \left [\frac {(N+1)(N+2)}{6}\right ] ^2 (1-f_3)^2$$

where N is the number of samples in the non-recursive sliding data window. Note that the first term on the right of the equal sign is the variance from estimating the first coefficient (position); the second term is the variance from estimating the 2nd coefficient (velocity); and the 3rd term with $$f_3 = 1$$ is the variance from estimating the 3rd coefficient (which includes acceleration). This pattern continues for higher order terms. Furthermore, the sum of the variances from estimating the first two coefficients is $$\frac {4N+2}{N(N-1)}$$. Adding the variance from estimating the 3rd coefficient yields $$\frac {9N^2+9N+6}{N(N-1)(N-2)}$$.
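The term-by-term reading of the normalized one-step prediction MSE can be checked with a few lines of code. This is a small sketch that simply evaluates the four terms of the formula above; the function name is hypothetical.

```python
def prediction_mse_terms(N, rho3, f3):
    """Return the four terms of the normalized one-step prediction MSE.

    Order: position variance, velocity variance, fractional 3rd-coefficient
    variance, and bias squared. Requires N > 2 so every denominator is nonzero.
    """
    if N <= 2:
        raise ValueError("N must exceed 2 for the 3rd-coefficient term")
    var_c1 = 1 / N
    var_c2 = 3 * (N + 1) / (N * (N - 1))
    var_c3 = f3**2 * 5 * (N + 1) * (N + 2) / (N * (N - 1) * (N - 2))
    bias_sq = (rho3 * (N + 1) * (N + 2) / 6) ** 2 * (1 - f3) ** 2
    return var_c1, var_c2, var_c3, bias_sq


# Five-point window, full 3rd order (f3 = 1), so the bias term vanishes.
terms = prediction_mse_terms(5, 0.4, 1.0)
second_order_variance = terms[0] + terms[1]  # equals (4N+2)/(N(N-1))
third_order_variance = sum(terms[:3])        # equals (9N^2+9N+6)/(N(N-1)(N-2))
```

For N = 5 the variance rises from 1.1 at 2nd order to 4.6 at 3rd order, which illustrates how quickly the variance grows when a higher order term is taken at full value.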

Estimator variances obviously increase exponentially with unit order increases. In the absence of process noise, the KF yields variances equivalent to these. (A derivation of the variance from a 1st degree polynomial corresponding to $$f_3=\rho_3=0$$ for the generalized case of arbitrary estimation time and sample times is available in the literature. In addition, establishing a multi-dimensional tracking gate at the predicted position can be aided with a simple approximation of the error function.)

Kalman filter tuning
Tuning the KF consists of a trade-off between measurement noise and process noise to minimize the estimation error. The KF process noise serves two roles: First, its covariance is sized to account for the maximum expected target acceleration. Second, process noise covariance establishes an effective recursive data window (analogous to the non-recursive sliding data window), described by Brookner as the Kalman filter memory.

Contrary to process noise covariance as a single independent parameter in the KF serving two roles, the FOE has the advantage of two separate independent parameters: one for acceleration and the other for sizing the sliding data window. Therefore, as opposed to being limited to just two tuning parameters (process and measurement noises) as is the KF, the FOE includes three independent tuning parameters: measurement noise variance, the assumed maximum deterministic target acceleration (for simplicity both target acceleration and measurement noise are included in the ratio of the single parameter $$\rho_3$$), and the number of samples in the data window.

Consider tuning a 2nd order predictor in a simple and practical tracking example to minimize the MSE when the target acceleration is $$20\ \mathrm{m/s^2}$$; the zero mean, stationary, and white measurement noise has $$\sigma_\eta = 25\ \mathrm{m}$$; and $$\Delta $$ = 1 second. Thus,

$$\rho_3=\frac {a\Delta^2} {2\sigma_\eta}=20/2/25=0.4$$

Setting $$f_3=0$$ in the normalized prediction MSE yields for the 2nd order predictor applied to an accelerating target,

$$MSE = \frac {4N+2} {N(N-1)}+ \rho_3^2 \left [\frac {(N+1)(N+2)}{6}\right ]^2 $$

where the first term on the right of the equal sign is the normalized 2nd order one-step prediction variance and the second term is the normalized bias squared from acceleration. This MSE is plotted as a function of N in Figure 1 along with both the variance and bias squared. Clearly, only integer values of N are possible in a non-recursive estimator. However, for use in approximating the tuned 2nd order KF, this MSE plot is stepped in tenths of a unit to show more precisely where the minimum occurs. The minimum MSE of 4.09 occurs at N = 2.9. The tuned KF can be approximated by sizing the process noise covariance in the KF such that the effective recursive data window (i.e., the Kalman filter memory) matches N = 2.9 in Figure 1 (i.e., $$\alpha \approx 0.85$$ and $$\beta \approx 0.53$$), where $$\alpha = \frac {4N-2} {N(N+1)}$$ and $$\beta = \frac {6} {N(N+1)}$$. This hints at the fallacy of using a 2nd order estimator on accelerating targets. Comparing this with the filtered position case demonstrates that the minimum MSE is a function of the time $$\tau$$ of the desired estimate.
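The tuning sweep can be reproduced with a short script. This sketch evaluates the normalized 2nd order prediction MSE on a grid of N stepped in tenths and reads off the minimum along with the corresponding $$\alpha$$ and $$\beta$$. The grid limits (2.0 to 6.0) and the function name are assumptions made for illustration.

```python
def second_order_prediction_mse(N, rho3):
    """Normalized one-step prediction MSE of the 2nd order estimator (f3 = 0)."""
    variance = (4 * N + 2) / (N * (N - 1))
    bias_sq = (rho3 * (N + 1) * (N + 2) / 6) ** 2
    return variance + bias_sq


# Scenario from the text: a = 20 m/s^2, sigma_eta = 25 m, Delta = 1 s.
accel, sigma_eta, delta = 20.0, 25.0, 1.0
rho3 = accel * delta**2 / (2 * sigma_eta)

# Step N in tenths, as in Figure 1 (grid limits are an assumption).
candidates = [k / 10 for k in range(20, 61)]
best_N = min(candidates, key=lambda N: second_order_prediction_mse(N, rho3))
best_mse = second_order_prediction_mse(best_N, rho3)

# Gains used in the text to approximate the tuned 2nd order KF.
alpha = (4 * best_N - 2) / (best_N * (best_N + 1))
beta = 6 / (best_N * (best_N + 1))

print(f"N = {best_N}, MSE = {best_mse:.2f}, alpha = {alpha:.2f}, beta = {beta:.2f}")
```

On this grid the minimum falls at N = 2.9 with an MSE of about 4.09, and the gains round to 0.85 and 0.53, in agreement with the values quoted above.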

FOE as a multiple-model estimator
The FOE can be viewed as a non-recursive multiple-model (MM) estimator composed of 2nd and 3rd order estimator models with the fraction $$0\le f_3 \le 1 $$ as the interpolation factor. Since the filtered position is generally used for comparisons in the literature, consider now the normalized MSE for the position estimate:

$$MSE = \frac {1} {N}+\frac {3(N-1)} {N(N+1)}+ f_3^2\frac {5(N-1)(N-2)}{N(N+1)(N+2)}+\rho_3^2 \left [\frac {(N-1)(N-2)}{6}\right ] ^2 (1-f_3)^2$$

Note that this differs from the one-step prediction MSE in that the signs within the parentheses containing N are reversed. The higher order pattern continues here also. Normalized with respect to the measurement noise variance, the minimum position MSE reduces for equally spaced samples to

$$MSE_{min} = \frac {4N-2} {N(N+1)}+ f_{3,opt}\frac {5(N-1)(N-2)}{N(N+1)(N+2)}$$

where $$f_{3,opt}= \frac {\rho_3^2} {\rho_3^2+|\upsilon_3|^2}$$ with $$|\upsilon_3|^2=\frac {180}{N(N^2-1)(N^2-4)}$$.

A plot of the position $$MSE_{min}$$ as a function of N for various values of $$\rho_3$$ is shown in Figure 2, where there are several points of interest. First, the 2nd and 3rd order MSEs track each other very closely and bound all the $$MSE_{min}$$ (interpolated) curves. Second, the curves drop rapidly to a knee. Third, the $$MSE_{min}$$ curves flatten out beyond the knee, yielding virtually no increase in accuracy until they begin to approach the 3rd order MSE (variance). This suggests that choosing a window at the knee of the curve is advantageous, as demonstrated below.

Consider again the same tracking scenario, in this case as the target maneuvers. After traveling at a constant velocity, the target accelerates at $$20\ \mathrm{m/s^2}$$ for 20 seconds and then continues again at a constant velocity. At worst case acceleration, $$\rho_3= 0.4$$. The $$MSE_{min}$$ is plotted in Figure 3 as a function of N. Also shown are the 2nd order MSE as well as the 2nd and 3rd order MSEs (variances only since the bias is zero in each case) similar to those in Figure 2. There is a fifth curve not previously addressed: the variance portion of the optimal MSE. The variance also levels off for several increments of N like the $$MSE_{min}$$. Both the variance and $$MSE_{min}$$ approach the 3rd order variance as $$N \to \infty$$.
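The closed forms for $$f_{3,opt}$$ and the position $$MSE_{min}$$ are easy to evaluate numerically. The sketch below tabulates them against N for $$\rho_3 = 0.4$$, as one way to locate the knee without the figures. The function names are hypothetical, and the formulas are those given in this section.

```python
def upsilon3_sq(N):
    """Squared norm of the 3rd-coefficient estimator for equally spaced samples."""
    return 180 / (N * (N * N - 1) * (N * N - 4))


def f3_opt(N, rho3):
    """Optimal interpolation fraction between the 2nd and 3rd order estimators."""
    return rho3**2 / (rho3**2 + upsilon3_sq(N))


def position_mse(N, rho3, f3):
    """Normalized MSE of the filtered position for an arbitrary fraction f3."""
    var2 = (4 * N - 2) / (N * (N + 1))
    nu3_sq = 5 * (N - 1) * (N - 2) / (N * (N + 1) * (N + 2))
    bias_sq = (rho3 * (N - 1) * (N - 2) / 6) ** 2 * (1 - f3) ** 2
    return var2 + f3**2 * nu3_sq + bias_sq


def position_mse_min(N, rho3):
    """Normalized minimum MSE: linear interpolation with f3_opt."""
    var2 = (4 * N - 2) / (N * (N + 1))
    nu3_sq = 5 * (N - 1) * (N - 2) / (N * (N + 1) * (N + 2))
    return var2 + f3_opt(N, rho3) * nu3_sq


# Tabulate the curve for the worst-case acceleration of the example.
for N in range(3, 13):
    print(N, round(f3_opt(N, 0.4), 3), round(position_mse_min(N, 0.4), 3))
```

Substituting `f3_opt` into `position_mse` gives the same value as `position_mse_min`, which is a quick consistency check on the two formulas.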

As the acceleration varies from zero to maximum, the MSE is automatically adjusted (no external tinkering or adaptivity) between the variance at $$\rho_3 = 0$$ and maximum $$MSE_{min}$$ at $$\rho_3 = 0.4$$. In other words, the MSE rides up and down the quadratic curve of the variance plus bias squared as a function of changes in acceleration $$\rho_3$$ for any given value of N in the position estimate:

$$MSE = \frac {4N-2} {N(N+1)}+ \rho_3^2 \left [\frac {(N-1)(N-2)}{6}\right ]^2 $$

Choosing N = 4 at the knee of the $$MSE_{min}$$ curve in Figure 3 yields the RMSE (square root of the MSE, which is more often used for comparison in the literature) shown in Figure 4. On the other hand, choosing N = 8 yields the second curve in Figure 4. As shown in Figure 3, the optimal 8-point FOE is essentially a 3rd order non-recursive estimator, which yields less than 4% RMSE improvement over the optimal 4-point FOE in the case of no acceleration. However, in the case of maximum acceleration the optimal 8-point MSE is markedly volatile and has large error spikes that can confuse a tracker. One spike exceeds the optimal 4-point MSE for worst case acceleration by more than the optimal 4-point MSE exceeds the optimal 8-point MSE in the absence of acceleration. Obviously, higher values of N produce larger error spikes.
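The size of the no-acceleration RMSE gap between the two windows can be estimated from the normalized position MSE. The sketch below rests on one assumption not stated explicitly in the text: $$f_3$$ is held at the value that is optimal for the worst case $$\rho_3 = 0.4$$ while the actual acceleration is zero. The helper names are hypothetical.

```python
import math


def worst_case_f3(N, rho3_max):
    """f3 chosen to be optimal at the maximum expected acceleration (assumption)."""
    upsilon3_sq = 180 / (N * (N * N - 1) * (N * N - 4))
    return rho3_max**2 / (rho3_max**2 + upsilon3_sq)


def position_rmse(N, rho3, f3):
    """Normalized RMSE of the filtered position for actual rho3 and fixed f3."""
    var2 = (4 * N - 2) / (N * (N + 1))
    nu3_sq = 5 * (N - 1) * (N - 2) / (N * (N + 1) * (N + 2))
    bias_sq = (rho3 * (N - 1) * (N - 2) / 6) ** 2 * (1 - f3) ** 2
    return math.sqrt(var2 + f3**2 * nu3_sq + bias_sq)


rho3_max = 0.4
rmse_4 = position_rmse(4, 0.0, worst_case_f3(4, rho3_max))  # no acceleration
rmse_8 = position_rmse(8, 0.0, worst_case_f3(8, rho3_max))  # no acceleration
improvement = 1 - rmse_8 / rmse_4

print(f"relative RMSE improvement of N=8 over N=4: {improvement:.1%}")
```

Under that assumption the relative improvement comes out between 3% and 4%, in line with the figure quoted above. The error spikes during maneuver transitions are a transient effect that this steady-state formula does not capture.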

Since trackers encounter the greatest difficulties and often lose track during target maneuvers at maximum acceleration, the much smoother $$MSE_{min}$$ transition of the optimal 4-point FOE has a major advantage over larger data windows.

IMM compared with the optimal FOE
The 4-point FOE in Figure 4 yields much smoother MSE transitions than the IMM (as well as the KF) in the parallel 1 Hz case. It produces no error spikes or volatility as do the 8-point FOE and the IMM. In this example only 4 multiplies, 3 adds, and a window shift are required to implement the 4-point FOE, significantly fewer operations than required by the IMM or KF. Similar comparisons of several additional MMs with the optimal FOE have been made in the literature.
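The operation count can be made concrete with a sliding-window sketch of a 4-point position FOE. The two weight vectors below were worked out by hand for N = 4 and an estimate at the latest sample, assuming orthonormal discrete polynomials over n = 1, ..., 4. They are illustrative values rather than numbers taken from the source, and `make_foe4` is a hypothetical name.

```python
from collections import deque

# Assumed hand-derived vectors for N = 4, estimate at the newest sample:
# OMEGA2 is the 2nd order (straight-line) position estimator and
# NU3 is the 3rd order correction; both are listed oldest sample first.
OMEGA2 = (-0.2, 0.1, 0.4, 0.7)
NU3 = (0.25, -0.25, -0.25, 0.25)


def make_foe4(f3):
    """Return an update function implementing weights OMEGA2 + f3 * NU3."""
    weights = tuple(w + f3 * v for w, v in zip(OMEGA2, NU3))  # computed once
    window = deque(maxlen=4)  # appending to a full deque drops the oldest sample

    def update(measurement):
        window.append(measurement)  # the window shift
        if len(window) < 4:
            return None  # not enough data yet
        # Per update: four multiplies and the additions that combine them.
        return sum(w * y for w, y in zip(weights, window))

    return update


# Noise-free straight-line track: the estimate equals the latest sample
# for any f3, because NU3 is orthogonal to constants and ramps.
foe = make_foe4(0.39)
outputs = [foe(3 + 2 * n) for n in range(1, 7)]
```

Because the weights are fixed once $$f_3$$ and N are chosen, each update needs no matrix operations, which is the source of the low operation count described above.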

Of the KF based MMs, the interacting MM (IMM) is generally considered the state-of-the-art tracking model and usually the method of choice. Since two model IMMs are most often used, consider the following two models: 2nd and 3rd order KFs. The estimated IMM state equation is the sum of the 2nd order KF times the model probability $$\mu_1(k)$$ plus the 3rd order KF times the model probability $$\mu_2(k)$$:

$$\hat X(k|k) = \hat X_1(k|k)\mu_1(k)+\hat X_2 (k|k)\mu_2(k)$$

where $$\hat X_1(k|k)$$ represents the 2nd order KF, $$\hat X_2 (k|k)$$ represents the 3rd order KF, and k represents the time increment. Since the model probabilities sum to one, i.e., $$\mu_1(k)+\mu_2(k)=1$$, this is actually linear interpolation, where $$\mu_1(k)$$ is analogous to $$(1 - f_3)$$ in the FOE and $$\mu_2(k)$$ is analogous to $$f_3$$. Therefore, this two model IMM is analogous to the optimal FOE in that it also interpolates between 2nd and 3rd order estimators. Two model IMM interpolation is formed during each recursive cycle from the interactively produced model probabilities.

As in the case of the FOE, this suggests a more descriptive estimate equal to the sum of the 2nd order KF plus the difference between the 3rd and 2nd order KFs times $$\mu_2(k)$$ :

$$\hat X(k|k) = \hat X_1(k|k)+[\hat X_2 (k|k)-\hat X_1(k|k)]\mu_2(k)$$

In this formulation the difference between the 3rd and 2nd order KFs effectively augments the 2nd order KF with a fraction of the estimated target acceleration as a function of $$\mu_2(k)$$—as does $$f_3$$ in the FOE.

One major difference between the IMM and optimal FOE is that the IMM is not optimum. The IMM model probabilities and interpolation are based on likelihoods and ad hoc transition probabilities with no mechanism for minimizing the MSE. Of course, not being optimum at any time increment k, the IMM cannot achieve the optimal FOE accuracy shown in Figure 2.

Moreover, the IMM $$\mu_2(k)$$ fails to meet the boundary condition of zero needed to implement the 2nd order estimator in the absence of acceleration, which the FOE $$f_{3,opt}$$ does meet. This results from the fact that the likelihoods do not sum to unity even though the model probabilities do. This causes an IMM bias toward a non-existent acceleration and unnecessarily increases the MSE above the 2nd order variance. Another major difference between the IMM and FOE is that the IMM is adaptive whereas the FOE is not.

To make a reasonable comparison of the IMM with the FOE, a non-recursive IMM analogy (IMMA) has been constructed. It includes a $$\mu_2(k)$$ that does go to zero, allowing the 2nd order estimator to be implemented. Since the FOE is based on the actual acceleration and not a noisy estimate, the acceleration estimate for the IMMA is assumed to be the expected value of the estimate, i.e., the actual acceleration. This is described here as the ideal for the purpose of illustration. These two modifications make the IMMA compatible for comparison with the FOE.

The $$\mu_2(k)$$ based on the expected value or actual acceleration (described here as the ideal $$\mu_2$$, where the k is dropped) then varies between zero and one in an S-shaped curve as a function of $$\rho_3$$, as does $$f_{3,opt}$$. This is shown in Figure 5, where a 4-point data window is assumed. Two significant points of interest stand out, as shown by the vertical lines. First, the largest deviation of the ideal $$\mu_2$$ from $$f_{3,opt}$$ occurs near $$\rho_3 = 0.7$$. Second, the two curves cross near $$\rho_3 = 1.4$$. A comparison of the one-step predictor IMMA MSE as a function of ideal $$\mu_2$$ with the FOE $$MSE_{min}$$ is given in Figure 6. For the IMMA, the linear interpolation factor $$f_3$$ in the normalized FOE MSE is replaced by the ideal $$\mu_2$$ for plotting the ideal IMMA MSE.

Included in Figure 6 for reference are a curve of the 3rd order variance, 2nd order variance, and the 2nd order MSE. The large deviation of $$\mu_2$$ from $$f_{3,opt}$$ in Figure 5 has a profound effect on the ideal IMMA MSE as shown in Figure 6. The ideal IMMA MSE exceeds the FOE MSE most near $$\rho_3 = 0.7$$, about where the $$\mu_2$$ differs most from $$f_{3,opt}$$ in Figure 5. In addition, the ideal IMMA MSE exceeds the 3rd order variance most near $$\rho_3 = 0.85$$, even though the specific purpose of interpolation in the IMM is to produce an MSE smaller than the 3rd order variance. Nevertheless, as expected, the two MSE curves do osculate near $$\rho_3 = 1.4$$, where $$\mu_2$$ and $$f_{3,opt}$$ cross in Figure 5.

Furthermore, the MSE is exacerbated in the non-ideal IMMA by adaptivity, as shown in Figure 7, where the IMMA MSE from noisy $$\mu_2$$ is superimposed on the curves in Figure 6 (although there is a slight change in scale to accommodate the larger noisy IMMA MSE). Clearly, since Figure 6 includes the ideal $$\mu_2$$ based on the expected value of acceleration, i.e., the actual acceleration, an estimate that includes measurement noise can only degrade the accuracy, as shown in Figure 7.

Indeed, not only is the noisy IMMA MSE larger than the 3rd order variance (by nearly a factor of two at the worst point), once the noisy IMMA MSE exceeds the 3rd order variance, it does not drop below as does the ideal IMMA. In contrast, the optimal FOE MSE (i.e., $$MSE_{min}$$) always remains less than the 3rd order variance.

This analysis compellingly suggests that adaptivity significantly degrades IMM accuracy rather than improving it. Of course, this should not come as a surprise since for $$\rho_3<0.5$$, the acceleration is buried in the noise; i.e., $$(a\Delta^2)/\sigma_\eta <1$$ (a signal-to-noise ratio likeness of less than 0 dB).

These analyses reveal a disconcerting lack of tracking literature that addresses fundamentals (e.g., optimal IMM interpolation, $$\mu_2$$ boundary conditions, and acceleration-to-noise ratio) and comparisons with standard benchmarks (e.g., 2nd order, 3rd order, or other optimal estimators).

Deficiencies and oversights in the Kalman filter
Comparisons of the KF with the derivation, analysis, design, and implementation of the MFOE have uncovered a number of deficiencies and oversights in the KF that are overcome by the MFOE. They are reported and discussed in the MFOE literature.