Friday, April 15, 2011

Major and Minor Loops

...and why I don't use them.

Back in the days of Apollo, they used a Kalman filter and control loop similar in principle to what I am doing. However, due to the g l a c i a l  s l o w n e s s of the computers of the era, they had to structure their code quite differently. It took about 1 second to update the Kalman filter and calculate all the control coefficients, and it took a bit less than 50% of the available time to run the control loop. This is with an IMU sensor which automatically mechanically handled orientation and integration of acceleration into velocity.

So what they did was split the code into a major and minor loop. The concepts were different, but we can imagine it like this. In the background, a loop running in non-interrupt priority repeatedly ran the Kalman filter. This is the major loop. It took about 1 second to do an update, and was required to finish and start on the next update every two seconds. This sounds generous, but in the mean time there is a timer interrupt going off something like 20Hz which runs control loop code. This is the minor loop. If the interrupt fires every 50ms, maybe it takes 20ms to run the control code. There's also lots of other interrupts coming in from various sensors. In one notorious case, the encoder on the rendezvous radar mechanism was jiggling quickly between two bits and sending an interrupt each time, at something like 6400Hz. This used up all the remaining time, and then some, which caused the background loop to be late.

The way their code was structured, they fired off a main loop every 2 seconds, expecting the previous one to be done by the time the next one started. With all this extra stuff going on, the background loop never finished, and effectively got left on a stack when another run of the main loop started. Eventually the stack overflowed, the system detected this, reset itself, and cleared all the old main loop iterations off the stack. It's a pretty good example of exception handling, but it generated the infamous 1201 alarm.

The root cause is that the Kalman filter loop had to run in 2 seconds or less, and this is because several constants, such as delta-t, the time between updates, was hard-coded. There was a plan to remove this limitation, so that a new iteration was kicked off when the old one finished, instead of every two seconds. This new design was implemented, but never flew.

Returning to 2011, we are using the Logomatic, which runs at 703 times the frequency and perhaps 1406 times the instruction rate, since the Apollo computers took a minimum of two cycles to execute, and I suspect most insructions in an ARM7TDMI run at around one per cycle, or maybe better.

Because this system is so much faster, we have the luxury of a simpler design. The system has a main loop, executed continuously (it's inside a for(;;) loop). There is a timer interrupt running at 100Hz, but all it does is set a flag. When the main loop comes back around, it checks if the flag is set, and if so, reads the sensors, runs the Kalman filter, calculates the control values, and sets the motor throttles. All this is done inside of one iteration of the loop. It may happen that all this stuff takes longer than 10ms, in which case the interrupt fires during the main calculation part. But, all this does is set that flag, so that the main loop knows to read the sensor the next time it comes around. All the Kalman code knows how to deal with variable-time loops, so it doesn't matter if the loop takes 10,20, or 50ms. Of course the system degrades and becomes more susceptible to noise as it slows, but this is a gradual degradation. There is no hard limit like in Apollo, and no split between major and minor loops, either.

No comments:

Post a Comment