Wednesday, March 30, 2011

Beginners Kalman Filtering Theory, Part 3

Examples
As noted above, the hard part is adapting your problem to the filter's form. To do that requires practice, so let's work through a few examples.

Let's just measure something, like the voltage across a resistor, with an analog-to-digital converter. We'll ignore the discrete nature of an ADC for now and say that our ADC converts a voltage into an unlimited-precision real number. However, like all real devices, it is imperfect, and while it gives unlimited precision, it only has finite accuracy.

We presume that the voltage is constant.

So, let's reiterate our model, and see how this problem matches up with it.

<x_i>=[A]<x_{i-1}>+<w_i>
<z_i>=[H]<x_i>+<v_i>

The state <x_i> is the actual voltage. It is a one-dimensional vector, or in other words just a scalar. This means that n=1.

The state transition [A] describes how the state changes with time. Since the state is one-dimensional, the state transition is a scalar also. Since the voltage is constant, the state transition is 1.0 exactly.

The process noise theoretically represents uncertainty in how the state propagates, but in practice it is mostly used for tuning. For now, we will say that its covariance (scalar) [Q] is zero.

The measurement <z_i> is supposed to just be the voltage, so the observation (scalar) [H] is also 1.0 exactly.

We will say that the measurement uncertainty is normal with a 0.1 standard deviation. This means the measurement noise has a covariance (scalar) [R]=0.01.
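
Collecting all of that in one place: the whole model for this problem fits in a few assignments. Here's a sketch in IDL (the variable names are my own choice, nothing standard):

 ;Model parameters for the constant-voltage problem
 A=1.0d   ;state transition: the voltage never changes
 H=1.0d   ;observation: we measure the voltage directly
 Q=0.0d   ;process noise covariance: zero by assumption
 R=0.01d  ;measurement noise covariance, (0.1)^2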

Now we reiterate the filter equations, just so you don't have to skip around so much to read things.

  1. <x^_i->=[A]<x^_{i-1}>
  2. [P_i-]=[A][P_{i-1}][A']+[Q]
  3. [K_i]=[P_i-][H']([H][P_i-][H']+[R])^-1
  4. <x^_i>=<x^_i->+[K_i](<z_i>-[H]<x^_i->)
  5. [P_i]=([1]-[K_i][H])[P_i-]

So, let's plug all that into our Kalman filter form, turn the crank, have a cancelfest, and see what we get:

  1. x^_i-=x^_{i-1}
  2. P_i-=P_{i-1}
  3. K_i=P_i-/(P_i-+R)
  4. x^_i=x^_i-+K_i(z_i-x^_i-)
  5. P_i=(1-K_i)P_i-
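
If you want to play with the simplified equations directly, one iteration fits in a few lines of IDL. This is just a sketch of the five steps above (the procedure name and argument order are my own invention):

 pro kalman_constant_step,xh,P,z,R
   ;One iteration of the simplified scalar filter.
   ;xh and P are the previous estimate and covariance,
   ;and are updated in place; z is the new measurement
   ;and R the measurement noise covariance. Steps 1 and 2
   ;are identities here, since A=1 and Q=0.
   K=P/(P+R)       ;step 3: Kalman gain
   xh=xh+K*(z-xh)  ;step 4: update the estimate
   P=(1-K)*P       ;step 5: update the covariance
 end

Call it once per measurement, carrying xh and P from one call to the next.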

Well, that cleans up quite nicely. Let's have a closer look at it. First off, our projected state estimate is just our old state estimate, since we don't expect it to change. Our projected estimate covariance likewise doesn't change.

Since the measurement uncertainty and the estimate covariance are always positive, we see that the denominator of the Kalman gain term is always larger than the numerator, and thus that the gain is always between zero and one. It can only reach those extremes if either the measurement uncertainty or estimate uncertainty is zero, and neither is.

If the Kalman gain were exactly one, the fourth equation says that our new estimate would be exactly our measurement. If it were zero, we would ignore the measurement.

We notice that the estimate uncertainty does not depend on the measurement, only the measurement uncertainty. Likewise with the Kalman gain. So, knowing the ''a priori'' estimate uncertainty and measurement uncertainty, we can predict their future history without regard to the measurements themselves. We see from the last equation that the next uncertainty is always smaller than the current one, and therefore the Kalman gain will continually decrease. Sounds like an exponential decay to me.
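
That hunch is easy to check, precisely because P and K don't depend on the measurements. Here's a quick sketch (my own code, not anything from the derivation above) that generates both histories in advance:

 pro kalman_gain_history,P0,R,n,P,K
   ;Generate the covariance and gain histories ahead of
   ;time, using no measurements at all.
   P=dblarr(n)
   K=dblarr(n)
   P[0]=P0
   for i=1,n-1 do begin
     K[i]=P[i-1]/(P[i-1]+R)
     P[i]=(1-K[i])*P[i-1]
   endfor
 end

Plot P on a log scale: if the decay were exponential it would come out as a straight line, and it doesn't quite. We'll see exactly what it is in a moment.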

By combining equations 3 and 5, we get

P=(RP-)/(R+P-)

If we look for convergence, the P on the left side is the same as the P on the right. Taking R as a constant, we get

P(R+P)=RP
P^2+RP=RP
P^2=0
P=0

meaning that the only fixed point is at zero. Combining this with the knowledge that P continually decreases, we can predict with certainty that P collapses to zero. This actually makes sense: intuitively, we know that the uncertainty of the average of a large number of measurements is proportional to 1/sqrt(N), so in the limit of an infinite number of measurements the uncertainty collapses to zero.
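
In fact, the recurrence has a closed form. Taking the reciprocal of both sides of the combined equation turns it into a linear recurrence in 1/P (a short derivation of my own, written in LaTeX notation):

 \frac{1}{P_i} = \frac{R+P_{i-1}}{R\,P_{i-1}} = \frac{1}{P_{i-1}} + \frac{1}{R}
 \qquad\Rightarrow\qquad
 \frac{1}{P_i} = \frac{1}{P_0} + \frac{i}{R}
 \qquad\Rightarrow\qquad
 P_i = \frac{P_0 R}{R + i P_0}

So P decays harmonically, like R/i, rather than exponentially, and the standard deviation sqrt(P_i) falls like 0.1/sqrt(i): exactly the 1/sqrt(N) behavior of an averager. With P_0=1 and R=0.01, sqrt(P_999) is about 0.0031638, which matches the final uncertainty reported below.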

Once P collapses like this, K will also, since P is in the numerator. When K collapses, the measurements are ignored, and each new estimate just matches the previous estimate.

It's pretty easy to code all that up in IDL. Let's say that the true value of the state is 0.5V, and the ''a priori'' estimate is zero with a variance of 1. Then we can do this:


 pro kalman_constant
   ;True value of state
   val=0.5
   ;Measurement noise, sigma=0.1
   seed=0L
   noise=randomn(seed,1000,/double)*0.1
   ;State at all measurement times
   x=noise*0+val
   ;All measurements
   z=x+noise
   R=0.01d ;Measurement noise covariance, (0.1)^2

   ;Record of estimate over time
   xh=dblarr(n_elements(z))
   P=dblarr(n_elements(z))
   K=dblarr(n_elements(z))
   P[0]=1d
   xh[0]=0d
   for i=1,n_elements(z)-1 do begin
 
     xhm=xh[i-1]
     Pm=P[i-1]
 
     K[i]=Pm/(Pm+R)
     xh[i]=xhm+K[i]*(z[i]-xhm)
     P[i]=(1-K[i])*Pm
   endfor
   ;Plot measurements, true state, and estimate history
   window,0
   device,decomposed=0
   loadct,39
   plot,z,psym=1
   oplot,x,color=254
   oplot,xh,color=192
   ;Plot the estimate covariance history on a log scale
   window,1
   plot,P,/ylog
   ;Plot the Kalman gain history on a log scale
   window,2
   plot,K,/ylog
 end


By running this, we get a plot in which the black crosses are the 1000 measurements, the red line is the true state value, and the blue line is the Kalman estimate at each measurement, each point considering only the measurements to its left.

We see that even though the filter started with a very inaccurate state estimate, and the measurements themselves are very inaccurate, the filter converged to near the correct answer very quickly, for appropriate values of "very".

With these particular measurements, we get as our last estimate x=0.50241482V±0.0031638442V. This uncertainty is very near that predicted by the 1/sqrt(N) model. Looking at a plot of the inverse of the Kalman gain (effectively the number of measurements averaged so far), it is just linear. So, this is just a fancy averager.
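
We can make the averager claim precise. Unwinding the covariance recurrence shows that 1/K_i = i + R/P_0 exactly; with P_0=1 and R=0.01 that is i+0.01, which is why the inverse-gain plot is a straight line, and why K_i is almost exactly the 1/i gain of a running average. Here's a sketch (my own code, not part of the program above) that runs the filter and a plain running mean side by side:

 pro kalman_vs_mean,z,R,P0,xh,xbar
   ;Run the constant-state Kalman filter and a plain
   ;running mean side by side on the same measurements.
   ;Like the program above, the filter never uses z[0].
   n=n_elements(z)
   xh=dblarr(n)
   xbar=dblarr(n)
   xh[0]=0d   ;a priori estimate
   P=P0
   xbar[0]=z[0]
   for i=1,n-1 do begin
     K=P/(P+R)        ;Kalman gain, 1/K = i + R/P0
     xh[i]=xh[i-1]+K*(z[i]-xh[i-1])
     P=(1-K)*P
     ;Running mean: fixed "gain" of 1/(i+1) instead of K
     xbar[i]=xbar[i-1]+(z[i]-xbar[i-1])/(i+1d)
   endfor
   print,'final Kalman estimate: ',xh[n-1]
   print,'final running mean:    ',xbar[n-1]
 end

With the numbers used above, the two histories coincide almost immediately.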

Disclaimer
It is a great disservice to Kalman, and to all those who developed the filter into what it is today, to call it a simple averager. Yes, it reduces to an averager in this simple case, but the filter also works in complicated cases. It has all the statistical properties you want, and if you tell the truth about all the filter inputs, the filter output really is optimum. I am just trying to understand the filter, and running it on simple cases helps.

1 comment:

  1. I've really enjoyed reading your series on Kalman filters. Your mention of the fact that measurement vectors need not be the same size as the state vector finally made a light bulb go off for the sensor fusion I've been trying to get to grips with.

    One small suggestion / request. I'd love to see a simple example with more than one kind of measurement arriving e.g. rotational accelerations and linear accelerations from an accelerometer and gyroscope. I've seen many varieties of the "estimating a constant" example, but after much searching have yet to see a good example where different measurements are combined.

    Really good stuff. Thanks a lot.
