
Multivariate Kalman Filter


Overview

System Design

At a high level, multivariate Kalman filters work similarly to one-dimensional Kalman filters. Both require an established system model, and then run through these steps:

We begin with a position and velocity. These are propagated forward with the model of our system in a predict step, which incorporates a command velocity (u) published by the robot. This creates a 'prior': a prediction of where we are according to our model of the system. The prior is then updated with a measurement (z) published by the robot. The measurement is weighted against our prior prediction to create a posterior estimate of our position and velocity. The posterior is then fed back into the predict step as the starting state for the next iteration.

Conceptually, this is pretty straightforward. Unfortunately, the math gets a little more complicated. Now that we're working with multiple variables, we're straying away from the more direct "multiplying Gaussians" approach to Kalman filter math and instead discretizing the system using linear algebra. Everything is matrices, many of which we have to design ourselves according to what we want to put into and get out of the system. We'll go through the steps we took to set up our specific multivariate Kalman filter, but for more general information you should, as always, read this book. I specifically recommend chapters 5-7, assuming you have a good grasp of one-dimensional Kalman filters.

Math

The entire system is governed by these four equations (Note: these are split into the predict and update steps; within each step, the state and covariance terms change while the other matrices are constants):
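In standard notation, with bars marking the priors produced by the predict step:

$$\bar{x} = Fx + Bu \qquad \bar{P} = FPF^T + Q$$

$$x = \bar{x} + Ky \qquad P = (I - KH)\bar{P}$$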

Predict

In the first equation, x_bar and x are vectors of the state variables of the system, F is the state transition function contained in a matrix, and B and u are the control functions. Together, this means that you update your previous state (x) with a model of your movement (F) since the last time step, plus whatever velocity you have added through a command (Bu).

In the second equation, P_bar and P are covariance matrices of the system, F is the same matrix as before, and Q is the process noise covariance matrix. P contains the expected variances of the state variables and their covariances (how the variables vary together), which tells us how accurate we can expect our model to be. This equation propagates it over the past time step using F, and uses Q to account for uncertainty between the model and the world.

Together, these equations serve as the model for our system. From them, we obtain our priors of where we believe we are, and with how much certainty. These will be compared to and updated with measurements in the (you guessed it) update step.

Update

The new variables here are K, y, I, and H. In short, K is the Kalman gain, y is the residual (or difference) between the prediction and measurement, H is the measurement function (which defines the conversion from state variables to measurement variables), and I is an identity matrix.

Overall, the first equation represents weighing the measurement and prior against each other in order to obtain a better prediction of our position. The second equation represents an updated covariance matrix with the new measurement information. The new x vector and P matrix are our posteriors, which represent our final estimate of where we are in this iteration. They will be fed back into the predict step for the next moment in time.

Designing the Math

Designing the Prediction Equations

Here's where we unpack what the heck these equations mean:

As a reminder, we want the results of these equations to be a new x position and velocity from the last movement, carried in a vector (x_bar), and a new covariance matrix that represents how much we trust our model compared to the real world. These will be our priors.

Choosing state variables (x and x dot)

We decided to expand on our previous Kalman filter by adding velocity as a state variable, while still working with one-dimensional movement. This means we now have two state variables, position and velocity, contained in the x vector below:
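$$x = \begin{bmatrix} x \\ \dot{x} \end{bmatrix}$$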

This is the same form x_bar will have after transformation.

Position Propagation

The equation of motion for our one-dimensional system looks like this:
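$$x_{new} = x + \dot{x}\,\Delta t$$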

Quite simply, this adds the distance traveled over the last time step to our previous position.

We want to discretize this equation using matrices in order to create our state transition function. This ends up looking like this:
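Given the design choices described next, one consistent form is:

$$\bar{x} = \underbrace{\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}}_{F} \begin{bmatrix} x \\ \dot{x} \end{bmatrix} + \underbrace{\begin{bmatrix} \Delta t \\ 1 \end{bmatrix}}_{B} u = \begin{bmatrix} x + u\,\Delta t \\ u \end{bmatrix}$$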

Because we receive the command velocity from the robot, we can discard the old one. Thus, we designed our state transition function (F) to keep only the previous position. The control function (Bu) then adds the difference in position over the last time step to x, and replaces the command velocity with our new one. This means that x_bar ultimately carries the updated position and the new command velocity.

Covariance and Process Noise Matrices (P and Q)

The equation governing covariance for our system looks like:
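$$\bar{P} = FPF^T + Q$$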

Let's break this down:

P is our covariance matrix, which contains the variances of our position and velocity on the diagonal. Initially, this is all it contains. Over time, the off-diagonal entries fill in with the covariance between position and velocity.
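In full, P has the form:

$$P = \begin{bmatrix} \sigma_x^2 & \sigma_{x\dot{x}} \\ \sigma_{\dot{x}x} & \sigma_{\dot{x}}^2 \end{bmatrix}$$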

F is the same as before, and propagates the covariance matrix (P) according to our model of the system. In this case, because our F keeps only the previous position, the F P F^T term reduces the covariance matrix to only the variance in x, wiping out the velocity terms.

The process noise matrix, Q, then adds back in the variance in velocity. This follows from our design: since the old velocity is discarded and replaced with the new command velocity, all of our uncertainty about velocity comes from how imperfectly the robot executes that command, which is exactly what Q represents. Q also keeps the filter from growing overconfident in the model over time.

Designing the Update Equations

As a reminder, here are the equations that determine the update step:
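$$x = \bar{x} + Ky \qquad P = (I - KH)\bar{P}$$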

Converting between the prior and measurement space with a measurement function (H)

Before we can do anything, we need to convert between the prior and measurement space. This is because we have two state variables in our prior (position and velocity) but we are only measuring for one (position). In mathematical terms, our prior (x_bar) is a 2x1 vector, while our measurement (z) is a 1x1 matrix. We need a measurement function to convert x_bar to the same form as z. It looks like this:
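$$H = \begin{bmatrix} 1 & 0 \end{bmatrix}$$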

Here, you can see how it converts x_bar to a 1x1 matrix:
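$$H\bar{x} = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ \dot{x} \end{bmatrix} = \begin{bmatrix} x \end{bmatrix}$$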

We will use this heavily in the following equations.

System uncertainty (S)

Our next goal is to calculate the Kalman Gain, but in order to do so we must first calculate our new system uncertainty.

We need to convert the prior covariance matrix to a covariance matrix in the measurement space, and then add our new covariance measurement. We do this using this equation:
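$$S = H\bar{P}H^T + R$$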

where S is system uncertainty, H is the measurement function from before, P_bar is the prior covariance calculated in the predict step, and R is the measurement noise covariance of the system.

Inserting the measurement function (H), our covariance prior (P_bar), and our measurement noise (R), we get:
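Writing the prior variances from P_bar as before, and the measurement noise variance as $\sigma_z^2$:

$$S = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} \sigma_x^2 & \sigma_{x\dot{x}} \\ \sigma_{\dot{x}x} & \sigma_{\dot{x}}^2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \sigma_z^2 = \sigma_x^2 + \sigma_z^2$$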

You can see it reduces our prior covariance matrix down to a 1x1 matrix of the variance in x. There is no longer any covariance, because there isn't a measured variance in velocity to relate to. It then adds the measured variance, so we get the total system uncertainty with new information.

Kalman Gain (K)

Now we can finally calculate Kalman Gain using the system uncertainty. We will use this to weight the residual later. The equation to calculate Kalman Gain looks like this:
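$$K = \bar{P}H^T S^{-1}$$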

This equation converts our prior covariance (P_bar) into the measurement space with the transpose of our measurement function, and then does the linear algebra equivalent of dividing by our system uncertainty. This gives us a ratio between the measurement and prior covariance. Using our inputs, it looks like this:
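With the same variance notation as above:

$$K = \begin{bmatrix} \sigma_x^2 & \sigma_{x\dot{x}} \\ \sigma_{\dot{x}x} & \sigma_{\dot{x}}^2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} \frac{1}{\sigma_x^2 + \sigma_z^2} = \begin{bmatrix} \sigma_x^2 \,/\, (\sigma_x^2 + \sigma_z^2) \\ \sigma_{\dot{x}x} \,/\, (\sigma_x^2 + \sigma_z^2) \end{bmatrix}$$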

Calculate the residual (y)

The last thing we need to calculate is our residual, the difference between our measured and prior positions. Once again, we need to convert the prior into the measurement space so they have the same form, which we will do with our measurement function. We then want to subtract our prior from our measurement to get the residual. Thus, the calculation looks like this:
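$$y = z - H\bar{x}$$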

where y is the residual, z is the measurement, x_bar is our prior, and H is the measurement function. With our inputs, it looks like this:
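$$y = z - \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ \dot{x} \end{bmatrix} = z - x$$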

Now we have the residual between the measured (z) and prior (x_bar) positions!

Putting it all together

Now we can finally circle back to these equations:
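$$x = \bar{x} + Ky \qquad P = (I - KH)\bar{P}$$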

x_bar update

We'll begin by updating our x position and velocity:

where x is our posterior, x_bar is our prior, K is our Kalman Gain, and y is the residual. In plain words, this is adding the residual scaled by the Kalman gain to the prior (x_bar) to get our new estimate of our location. All together, it looks like this:
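Substituting in K and y:

$$x = \bar{x} + \bar{P}H^T \left( H\bar{P}H^T + R \right)^{-1} \left( z - H\bar{x} \right)$$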

Looks messy, right? We're very glad to have numpy's linear algebra functions.

Covariance Update

Next, we have the covariance update:
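$$P = (I - KH)\bar{P}$$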

where I is the identity matrix, which is how we represent '1' as a matrix. K is the Kalman Gain we calculated earlier, H is the measurement function, and P_bar is the prior covariance matrix.

This equation gives us a proportion of the prior covariance, meaning the updated covariance is always smaller. This means that we become more confident in our position with each update.

All together, we get:
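Writing the two components of K as $K_0$ and $K_1$:

$$P = \left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} K_0 \\ K_1 \end{bmatrix} \begin{bmatrix} 1 & 0 \end{bmatrix} \right) \bar{P} = \begin{bmatrix} 1 - K_0 & 0 \\ -K_1 & 1 \end{bmatrix} \bar{P}$$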

Note that the Kalman Gain controls how much P shrinks: a large K (we trust the measurement more than the model) removes most of the prior covariance, while a small K (we trust the model) leaves P nearly unchanged.

One Iteration of the Multivariate Kalman Filter
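In code, one full iteration looks something like this minimal numpy sketch. The specific matrix values, noise levels, and variable names here are illustrative assumptions, not the exact ones from our robot:

```python
import numpy as np

dt = 0.1  # time step (s); an assumed value for illustration

# State transition: keep the old position, discard the old velocity
F = np.array([[1.0, 0.0],
              [0.0, 0.0]])
# Control: the command velocity moves us dt*u and becomes the new velocity
B = np.array([[dt],
              [1.0]])
# Measurement function: we only measure position
H = np.array([[1.0, 0.0]])

Q = np.diag([0.01, 0.1])   # process noise (model vs. world uncertainty), assumed values
R = np.array([[0.5]])      # measurement noise variance, assumed value

x = np.array([[0.0],       # position
              [0.0]])      # velocity
P = np.diag([1.0, 1.0])    # initial covariance, assumed values

def predict(x, P, u):
    x_bar = F @ x + B @ u                # prior state
    P_bar = F @ P @ F.T + Q              # prior covariance
    return x_bar, P_bar

def update(x_bar, P_bar, z):
    S = H @ P_bar @ H.T + R              # system uncertainty
    K = P_bar @ H.T @ np.linalg.inv(S)   # Kalman gain
    y = z - H @ x_bar                    # residual
    x = x_bar + K @ y                    # posterior state
    P = (np.eye(2) - K @ H) @ P_bar      # posterior covariance
    return x, P

# One iteration: a command velocity u, then a position measurement z
x_bar, P_bar = predict(x, P, u=np.array([[1.0]]))
x, P = update(x_bar, P_bar, z=np.array([[0.12]]))
```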