
The Kalman filter

The simplest error signal that we can generate is just the difference between what we expect to see, $x_t$, and what we are getting out of the filter, $\hat{x}_t$,
\begin{displaymath}
e_t = x_t - \hat{x}_t
\end{displaymath} (1)

For many problems an overshoot is just as bad as an undershoot, so we can use the mean square error as a cost function. There are many other cost functions that we could use, but this one is particularly convenient mathematically. The earliest adaptive filter derived from this error and cost function is the Wiener filter. Unfortunately, it is in FIR form and has coefficients that extend infinitely far back in time; except for certain periodic systems, this filter is not very practical. The filter can be rederived in IIR form; the result is known as the Kalman filter. The discrete-time, linear form of this filter looks like,
\begin{displaymath}
\hat{x}_t = b_t \hat{x}_{t-1} + K_t z_t
\end{displaymath} (2)

$b_t$ is a model of how the system goes from one time interval to the next. It is our best understanding of how the ideal system goes from its value at time $t-1$ to its value at time $t$.

$K_t$ is the Kalman gain. It is not controlled by the measurements directly; instead it is determined by how good you believe the model $b$ to be compared with the quality of the observations.

$z_t$ is called the innovation. It is an estimate of what you expect the error to be at time $t$, given a measurement at time $t$, $y_t$, and a prediction of $x$ at time $t$ obtained by extrapolating the best estimate of $x$ at $t-1$ forward to time $t$ with our model function, $b$. The simplest example of how to calculate the innovation is,

\begin{displaymath}
z_t = y_t - H b_{t-1} \hat{x}_{t-1}
\end{displaymath} (3)

where $H$ is a function that may be necessary to convert the components of $x$ into the components of $y$. (An example is the meteorological case of estimating the humidity (the $x$) from wet-bulb and dry-bulb temperature measurements (the $y$); the $H$ function would have one component that converts humidity to wet-bulb temperature and one for the conversion to dry-bulb.) In an application, the innovation is a known function, like the one above, and the function $b$ is known. What is not known is the gain, $K$; this must be calculated in parallel with the state estimate. The time-varying gain is where the adaptive nature of the Kalman filter expresses itself.
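As a concrete illustration, here is a minimal Python sketch of the scalar innovation of equation (3); the names (y_t, x_prev, b, H) are placeholders of my own choosing rather than quantities from any particular application.

def innovation(y_t, x_prev, b, H):
    """Scalar innovation, equation (3): the new measurement minus the
    model's prediction mapped into measurement space."""
    x_pred = b * x_prev      # extrapolate the last estimate to time t with the model
    return y_t - H * x_pred  # compare against the actual observation

# e.g. innovation(y_t=2.3, x_prev=2.0, b=1.0, H=1.0) gives 0.3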

In order to determine the equation that gives us the gain function, we have to spend some time with optimal estimation theory. I will not spend the time on this here, but will just show the result. In the scalar case, the gain function is:


\begin{displaymath}
K_t = \frac{ H \left[ b^2 p_{t-1} + \sigma^2_g \right] }
{ \sigma^2_\nu + H^2 \sigma^2_g + H^2 b^2 p_{t-1} }
\end{displaymath} (4)

Two of the new quantities, $\sigma^2_g$ and $\sigma^2_\nu$, are the noise or error variances for the model, $b$, and the measurements, respectively. The first is a statement about how good you believe the model of the system is. The second quantifies how good you think your measurements are. Both of these quantities are presumed to be known. The third quantity, $p_{t-1}$, is the error covariance of the filter; it effectively gives the error bars on the current output of the filter. This can be calculated given the gain,


\begin{displaymath}
p_t = \frac{1}{H} \sigma^2_\nu K_t
\end{displaymath} (5)
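Two limiting cases help build intuition for the gain. If the measurements are perfect, $\sigma^2_\nu = 0$, then (4) reduces to $K_t = 1/H$, the update (2) becomes $\hat{x}_t = y_t / H$, and (5) gives $p_t = 0$: the filter trusts the measurement completely. If instead $\sigma^2_\nu$ is very large compared with the model terms, then $K_t \rightarrow 0$, the filter simply propagates the model forward, and (5) gives $p_t \rightarrow b^2 p_{t-1} + \sigma^2_g$, the error variance of the pure model prediction.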

To use the filter, each time a new observation ($y_t$) becomes available we calculate (3) and (4), and then use that information in (2) and (5).
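A minimal scalar implementation of that recipe might look like the following Python sketch. It assumes a constant model coefficient b, a constant measurement function H, and known noise variances sig2_g (model) and sig2_nu (measurement); all of these names, and the initial values x0 and p0, are placeholders of my own choosing.

def kalman_scalar(measurements, b, H, sig2_g, sig2_nu, x0=0.0, p0=1.0):
    """Scalar Kalman filter following equations (2) through (5).

    measurements -- sequence of observations y_t
    b, H         -- model and measurement coefficients (assumed constant here)
    sig2_g       -- model (process) noise variance
    sig2_nu      -- measurement noise variance
    x0, p0       -- initial state estimate and its error covariance
    """
    x_hat, p = x0, p0
    estimates = []
    for y in measurements:
        # Innovation, equation (3): measurement minus the predicted measurement
        z = y - H * b * x_hat
        # Kalman gain, equation (4)
        p_prior = b * b * p + sig2_g               # model-predicted error variance
        K = H * p_prior / (sig2_nu + H * H * p_prior)
        # State update, equation (2)
        x_hat = b * x_hat + K * z
        # Error covariance update, equation (5)
        p = sig2_nu * K / H
        estimates.append(x_hat)
    return estimates

# Example: smoothing noisy readings of a slowly varying signal (b = 1, H = 1)
smoothed = kalman_scalar([1.2, 0.9, 1.1, 1.4, 1.0], b=1.0, H=1.0,
                         sig2_g=0.01, sig2_nu=0.25)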

The Kalman filter is frequently applied to systems where $x$ and $y$ are multi-channel or vector systems. In this case the equations (2) through (5) are rewritten as matrix equations.
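As a sketch of what that matrix form typically looks like, here is one step of the vector filter using NumPy. The predict/update structure follows the standard matrix generalization of equations (2) through (5); the names B, Hm, Q, and R are my own placeholders for the transition matrix, measurement matrix, and the model and measurement noise covariances.

import numpy as np

def kalman_step(x_hat, P, y, B, Hm, Q, R):
    """One step of the vector Kalman filter (matrix form of equations (2)-(5)).

    x_hat -- current state estimate, shape (n,)
    P     -- current error covariance, shape (n, n)
    y     -- new measurement vector, shape (m,)
    B     -- state transition matrix (the matrix analogue of b), shape (n, n)
    Hm    -- measurement matrix (the matrix analogue of H), shape (m, n)
    Q, R  -- model and measurement noise covariance matrices
    """
    # Predict the state and its error covariance with the model
    x_pred = B @ x_hat
    P_pred = B @ P @ B.T + Q
    # Innovation and its covariance
    z = y - Hm @ x_pred
    S = Hm @ P_pred @ Hm.T + R
    # Kalman gain
    K = P_pred @ Hm.T @ np.linalg.inv(S)
    # Updated state estimate and error covariance
    x_new = x_pred + K @ z
    P_new = (np.eye(P.shape[0]) - K @ Hm) @ P_pred
    return x_new, P_new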

