Multivariate Kalman Filter
At a high level, multivariate Kalman filters work similarly to one-dimensional Kalman filters. Both require an established system model, and then run through these steps:
We begin with a position and velocity. These are propagated forward with the model of our system in a predict step, which incorporates a command velocity (u) published by the robot. This creates a 'prior,' a prediction of where we are according to our model of the system. The prior is then updated with a measurement (z) published by the robot. This measurement is weighted against our prior prediction to create a posterior of our position and velocity. The posterior is then fed back into the predict step as the starting state for the next iteration.
Conceptually, this is pretty straightforward. Unfortunately, the math gets a little more complicated. Now that we're working with multiple variables, we're straying away from the more direct "multiplying Gaussians" approach to Kalman Filter math and discretizing the system using linear algebra. Everything is matrices, many of which we have to design ourselves according to what we want to put in and get out of the system. We'll go through the steps we took to set up our specific multivariate Kalman filter, but for more general information you should, as always, read this book. I specifically recommend chapters 5-7, assuming you have a good grasp on one dimensional Kalman Filters.
The entire system is governed by these four equations, split into the predict and update steps. In the notation below, barred variables are the priors produced by the predict step, and the remaining letters are constants within each step:
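Predict:

$$\bar{x} = Fx + Bu$$

$$\bar{P} = FPF^T + Q$$

Update:

$$x = \bar{x} + Ky$$

$$P = (I - KH)\bar{P}$$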
In the first equation, x_bar and x are vectors of the state variables of the system, F is the state transition function contained in a matrix, and B and u together form the control function. This means that you update your previous state (x) with a model of your movement (F) since the last time step, plus whatever velocity you have added through a command (Bu).
In the second equation, P_bar and P are covariance matrices of the system, F is the same matrix as before, and Q is the process noise covariance matrix. P contains the expected variances of the state variables and their covariances (how the variables vary together), which tells us how accurate we can expect our model to be. This equation propagates P over the past time step using F, and uses Q to account for uncertainty between the model and the world.
Together, these equations serve as the model for our system. From them, we obtain our priors of where we believe we are, and with how much certainty. These will be compared to and updated with measurements in the (you guessed it) update step.
The new variables here are K, y, I, and H. In short, K is the Kalman gain, y is the residual (the difference between the measurement and the prediction), H is the measurement function (which defines the conversion from state variables to measurement variables), and I is an identity matrix.
Overall, the first equation represents weighing the measurement and prior against each other in order to obtain a better prediction of our position. The second equation represents an updated covariance matrix with the new measurement information. The new x vector and P matrix are our posteriors, which represent our final estimate of where we are in this iteration. They will be fed back into the predict step for the next moment in time.
Here's where we unpack what the heck these equations mean:
As a reminder, we want the results of these equations to be a new x position and velocity from the last movement, carried in a vector (x_bar), and a new covariance matrix that represents how much we trust our model compared to the real world. These will be our priors.
We decided to expand on our previous Kalman filter by adding velocity as a state variable, while still working with one-dimensional movement. This means we now have two state variables, position and velocity, contained in the x vector below:
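$$x = \begin{bmatrix} x \\ \dot{x} \end{bmatrix}$$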
This is the same form x_bar will have after transformation.
The equation of motion for our one-dimensional system looks like this:
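$$x_{t+1} = x_t + \dot{x}\,\Delta t$$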
Quite simply, this adds the distance traveled over the last time step to our previous position.
We want to discretize this equation using matrices in order to create our state transition function. This ends up looking like this:
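Based on the design described below, the discretized predict equation works out to:

$$\bar{x} = Fx + Bu = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} x \\ \dot{x} \end{bmatrix} + \begin{bmatrix} \Delta t \\ 1 \end{bmatrix}u = \begin{bmatrix} x + u\,\Delta t \\ u \end{bmatrix}$$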
Because we receive the command velocity from the robot, we can discard the old one. Thus, we designed our state transition function (F) to keep only the previous position. The control function (Bu) then adds the difference in position over the last time step to x, and replaces the command velocity with our new one. This means that x_bar ultimately carries the updated position and the new command velocity.
The equation governing covariance for our system looks like:
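$$\bar{P} = FPF^T + Q$$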
Let's break this down:
P is our covariance matrix, which contains the variances of position and velocity along its diagonal. Initially, this is all it contains. Over time, the off-diagonal terms fill in with the covariance between position and velocity.
F is the same as before, and propagates the covariance matrix (P) according to our model of the system. In this case, it reduces the covariance matrix to only account for the variance in x, as the multiplication below shows.
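Multiplying it out with our F (the σ terms are the entries of P):

$$FPF^T = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} \sigma_x^2 & \sigma_{x\dot{x}} \\ \sigma_{x\dot{x}} & \sigma_{\dot{x}}^2 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} \sigma_x^2 & 0 \\ 0 & 0 \end{bmatrix}$$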
The process noise matrix, Q, then adds uncertainty back in, including the variance in velocity that F just zeroed out. This is because our model trusts the command velocity completely, but the robot never executes a command perfectly. Q represents that mismatch between the model and the real world, and keeps the filter from becoming overconfident in its prediction.
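To make the predict step concrete, here is a minimal numpy sketch. The structure of F and B follows the design described above; the values of dt and Q are illustrative placeholders, not numbers from our robot.

```python
import numpy as np

dt = 0.1                    # time step in seconds (placeholder value)

F = np.array([[1.0, 0.0],   # keep the previous position...
              [0.0, 0.0]])  # ...and discard the previous velocity
B = np.array([[dt],         # the command moves us u*dt...
              [1.0]])       # ...and becomes the new velocity state
Q = np.array([[0.01, 0.0],  # process noise covariance (placeholder values)
              [0.0, 0.01]])

def predict(x, P, u):
    """Propagate the state and covariance forward one time step (the prior)."""
    x_bar = F @ x + B * u       # prior state: [x + u*dt, u]
    P_bar = F @ P @ F.T + Q     # prior covariance
    return x_bar, P_bar
```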
As a reminder, here are the equations that determine the update step:
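$$x = \bar{x} + Ky$$

$$P = (I - KH)\bar{P}$$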
Before we can do anything, we need to convert between the prior and measurement space. This is because we have two state variables in our prior (position and velocity) but we are only measuring one (position). In mathematical terms, our prior (x_bar) is a 2x1 vector, while our measurement (z) is a 1x1 matrix. We need a measurement function to convert x_bar to the same form as z. It looks like this:
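$$H = \begin{bmatrix} 1 & 0 \end{bmatrix}$$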
Here, you can see how it converts x_bar to a 1x1 matrix:
https://github.com/olinrobotics/learning-resources/blob/kalman/kalman_filter/images/nd_Hx.png
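Written out, the conversion is:

$$H\bar{x} = \begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} x \\ \dot{x} \end{bmatrix} = \begin{bmatrix} x \end{bmatrix}$$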
We will use this heavily in the following equations.
Our next goal is to calculate the Kalman Gain, but in order to do so we must first calculate our new system uncertainty.
We need to convert the prior covariance matrix to a covariance matrix in the measurement space, and then add our new measurement of covariance. We do this using this equation:
$$S = H\bar{P}H^T + R$$
where S is the system uncertainty, H is the measurement function from before, P_bar is the prior covariance calculated in the predict step, and R is the covariance of the measurement noise.
We'll begin with the first equation:
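$$x = \bar{x} + Ky$$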
In plain words, this is adding the residual (the difference between the prior and measurement) scaled by the Kalman gain (how much we trust each of those) to the prior (x_bar) to get our new estimate of our location.
The first step in designing this equation is to figure out how to get our residual, the difference between our measurement and our prior from the predict step. We need both the measurement and the prior to have the same form in order to do this. However, the prior, x_bar, is a vector that contains both position and velocity, so we need to extract only the position from it. We do this with a measurement function, H, making the residual calculation look like this:
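$$y = z - H\bar{x}$$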
where y is the residual, z is the measurement, x_bar is our prior, and H is the measurement function. So if H converts x_bar into a 1x1 matrix, what should it look like?
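$$H = \begin{bmatrix} 1 & 0 \end{bmatrix}, \qquad y = z - \begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} x \\ \dot{x} \end{bmatrix} = z - x$$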
As you can see, it is a very simple matrix, but it does its job well! We get a residual between the measurement and prior position at the end.
Now, only one piece is missing before we can update the state and covariance of the system: the Kalman Gain.
We then use the system uncertainty to calculate the Kalman Gain, in order to weight the residual we calculated earlier. The equation to calculate Kalman Gain looks like this:
$$K = \bar{P}H^TS^{-1}$$
This equation converts our prior covariance (P_bar) into the measurement space with the transpose of our measurement function, and then does the linear algebra equivalent of dividing by our system uncertainty (the sum of the prior and measurement covariances in the measurement space). This gives us a ratio between the prior covariance and the total uncertainty: the less we trust our prior relative to the measurement, the larger K becomes.
We can now come full circle with our state and covariance updates.
$$P = (I - KH)\bar{P}$$
For background, I is the identity matrix, which is how we represent 1 as a matrix. K is the Kalman Gain we calculated earlier, H is the measurement function, and P_bar is the prior covariance matrix.
This equation gives you a proportion of the prior covariance, meaning the updated covariance is always smaller than the prior. In other words, we become more confident in our position with each update.
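To tie the update step together, here is a minimal numpy sketch continuing the predict sketch above. H matches the measurement function we designed; the value in R is an illustrative placeholder.

```python
H = np.array([[1.0, 0.0]])  # measurement function: pull position out of the state
R = np.array([[0.04]])      # measurement noise variance (placeholder value)

def update(x_bar, P_bar, z):
    """Weigh a position measurement z against the prior to get the posterior."""
    y = z - H @ x_bar                   # residual: measurement minus predicted position
    S = H @ P_bar @ H.T + R             # system uncertainty in measurement space
    K = P_bar @ H.T @ np.linalg.inv(S)  # Kalman gain: how heavily to weight y
    x = x_bar + K @ y                   # posterior state
    P = (np.eye(2) - K @ H) @ P_bar     # posterior covariance, always shrinks
    return x, P

# One full iteration of the filter:
x, P = np.array([[0.0], [0.0]]), np.eye(2)  # initial state and covariance
x, P = predict(x, P, u=0.5)                 # command velocity from the robot
x, P = update(x, P, z=np.array([[0.06]]))   # position measurement from the robot
```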