Covariance matrices are a way of describing the relationships among a collection of variables, while a single covariance value describes the relationship between two variables. Covariance matrices are a tool for estimating the possible error in a numerical value and for predicting a numerical value. One of their several applications is in robotics sensor fusion with “regular” and “extended” Kalman filters. In this article I’ll describe how to interpret a covariance matrix and provide a practical example. I’ll leave the formal mathematical and general definition to someone better at that than me.
Let’s begin with the concept of the “variance” of a numerical value. Variance measures how much that value can be expected to vary around its average. For example, if we were to measure the outdoor air temperature with a digital electronic sensor, we may want to know how much error to expect in that measurement. The variance describes that spread as a single value: the average squared deviation from the mean. It is not a maximum error, and because it is squared, its units are the square of the measurement’s units. Its square root, the standard deviation, is in the same units as the measurement. Variance is never negative, and it is zero only when the value never changes. For a more in-depth description of variance, please see http://en.wikipedia.org/wiki/Variance.
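To make that concrete, here’s a minimal sketch of a variance calculation in Python. The temperature readings are made up for illustration, and the function name sample_variance is my own. It uses the common “sample” form of the calculation, which divides by one less than the number of readings.

```python
from statistics import mean


def sample_variance(readings):
    """Sample variance: the sum of squared deviations from the mean,
    divided by (number of readings - 1)."""
    if len(readings) < 2:
        raise ValueError("need at least two readings")
    m = mean(readings)
    return sum((r - m) ** 2 for r in readings) / (len(readings) - 1)


# Hypothetical outdoor temperature readings, in degrees Celsius.
temperatures = [21.0, 21.5, 20.5, 21.0, 22.0]

variance = sample_variance(temperatures)  # units: degrees Celsius squared
std_dev = variance ** 0.5                 # units: degrees Celsius

print("variance:", variance)
print("standard deviation:", std_dev)
```

The standard deviation is usually the easier number to reason about, since it’s in the same units as the temperature itself.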
For the rest of this article I’ll use the terms “value” and “variable” interchangeably. I suppose we could think of a “value” as the current value of a particular “variable”.
Now imagine that there are several properties or conditions or states being measured at the same time and that we’d like to know if there is any relationship between those values. If we could predict in advance how each variable changes, relative to every other variable, that would give us two useful things. First, it would allow us to better identify (and eliminate) outlier values, where one particular value has changed so much that it’s probably not a good measurement. And second, if at one time a measured value was missed, it might be possible to predict what the value should be, based on how all of the other values to which it’s related have changed.
To proceed from a single variance to a collection of covariances contained in a matrix, we first need an understanding of covariance itself. Instead of expressing the expected spread of one variable, a covariance expresses how two variables vary together: whether a change in one variable tends to accompany a change in the other, and in which direction. For a much more in-depth explanation, see http://en.wikipedia.org/wiki/Covariance.
To illustrate, we’ll need a more complicated example. Let’s assume we have a mobile robot which can measure both its current position and its orientation. Since this robot can’t levitate or swim we’ll simplify the position and use only the two dimensional X-Y plane. That means the robot’s current position can be adequately described as a position along the X axis and a position along the Y axis. In other words, the robot’s position can be described as (x, y).
The position describes where the robot is located on the surface, which may be a parking lot or the living room floor or the soccer pitch, but it doesn’t describe in which direction it’s pointed. That information about the current state of the robot is called the orientation and will require one dimension. We’ll call this orientation dimension the yaw. The yaw describes in which direction it’s pointing. It’s worth repeating that this is a simplified way of representing the robot’s position and orientation. A full description would require three position values (x, y and z) and also three orientation values (roll, pitch and yaw). The concepts about to be described will still work with a six-dimensional representation of the robot state (position and orientation). Also, yaw is sometimes identified by the lower case Greek letter theta.
Now that we can describe both the position and the orientation of the robot at any point in time and assume that we can update those descriptions at a reasonably useful and meaningful frequency, we can proceed with the description of a covariance matrix. At each point in time, we’ll be measuring a total of three values: the x and y position and the yaw orientation. We could think of this collection of measurements as a vector with three elements.
We’ll start with two sets of measurements, each of which contains three values. Assume the first measurements were taken at 11:02:03 today and we’ll call that time t1. The second set were taken at 11:02:04 and we’ll call that time t2. We’ll also assume that our measurements are taken once per second. The measurement frequency isn’t as important as consistency in the frequency. Covariance itself doesn’t depend upon time, but the timing will become useful further on in this example.
Covariance is a description of how two variables tend to change together: what to expect of one variable when some other variable changes in a particular direction. Using the position and orientation example we’ve started, we’d like to know what to expect of the yaw measurement from time t2 when the change in the y measurement between time t1 and t2 was large in the positive direction. A positive covariance tells us to expect a positive change in yaw when y becomes more positive. A negative covariance predicts that yaw would become more negative when y became more positive. Lastly, a covariance near zero states that there doesn’t appear to be any predictable correlation between a change in yaw and a change in y.
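Those three outcomes can be sketched in a few lines of Python. The numbers below are invented per-second changes in y and yaw, chosen only to produce a positive, a negative and a near-zero covariance. The covariance function is my own helper, using the usual sample formula.

```python
def covariance(a, b):
    """Sample covariance of two equally long sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    products = ((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    return sum(products) / (len(a) - 1)


# Hypothetical changes in y between consecutive measurements.
dy = [0.1, 0.4, 0.2, 0.5, 0.3]

# Three hypothetical sets of yaw changes over the same intervals.
dyaw_same = [0.02, 0.08, 0.04, 0.10, 0.06]       # rises when dy rises
dyaw_opposite = [-v for v in dyaw_same]          # falls when dy rises
dyaw_unrelated = [0.06, 0.04, 0.04, 0.06, 0.05]  # no pattern relative to dy

cov_same = covariance(dy, dyaw_same)
cov_opposite = covariance(dy, dyaw_opposite)
cov_unrelated = covariance(dy, dyaw_unrelated)

print("same direction:    ", cov_same)
print("opposite direction:", cov_opposite)
print("unrelated:         ", cov_unrelated)
```

Only the sign and relative size of the results matter here; the absolute numbers depend on the units of y and yaw.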
Just in case, let’s try a possibly more intuitive example of a correlation. Our initial example measures the position of the robot, with a corresponding x and y value, every second. Since we have regular position updates, since we know the amount of time between the updates (one second), and since we can calculate the distance between the position at time t1 and the position at time t2, we can now calculate the velocity at time t2. We’ll actually get the velocity component along the x axis and the velocity component along the y axis, which together give us both the speed and the direction of travel.
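As a rough sketch, the velocity calculation looks like this in Python. The two positions are made-up values in meters, and the function name velocity is my own.

```python
import math


def velocity(p1, p2, dt):
    """Velocity components (vx, vy) from two (x, y) positions
    measured dt seconds apart."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    vx = (p2[0] - p1[0]) / dt
    vy = (p2[1] - p1[1]) / dt
    return vx, vy


# Hypothetical positions in meters at times t1 and t2, one second apart.
position_t1 = (2.0, 1.0)
position_t2 = (2.5, 1.0)

vx, vy = velocity(position_t1, position_t2, dt=1.0)
speed = math.hypot(vx, vy)          # magnitude of the velocity
direction = math.atan2(vy, vx)      # direction of travel, in radians

print("vx, vy:", vx, vy)
print("speed:", speed)
print("direction of travel:", direction)
```

With these numbers the robot is moving only along the x axis, so the direction of travel comes out as zero radians.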
Assume the robot is pointed along the x axis in the positive direction and it’s moving. The regular measurements of the position should show a steadily increasing x value and, at least in a perfect world, an unchanging y value. What would you expect the yaw measurement to be – unchanging or changing? Since the robot is not changing its direction the yaw should not be changing. Put in terms of covariance, a change in the x value with no change in the y value is NOT correlated with a change in the yaw value. Conversely, if we measured a change in yaw with no directional change in the velocity, we would have to suspect that at least one of those measurements, the yaw or the velocity, is incorrect.
From this basic idea of covariance we can better describe the covariance matrix. The matrix is a convenient way of representing all of the covariance values together. From our robotic example, where we have three values at every time t, we want to be able to state the correlation between one of the three values and all three of the values. You may have expected to compare one value to the other two, so please keep reading.
At time t2, we have a value for x, y and yaw. We want to know how the value of x at time t2 is correlated with the change in x from time t1 to time t2. We then also want to know how the value of x at time t2 is related to the values of y and yaw at time t2. If we repeat this comparison for y and for yaw, we’ll have a total of 9 covariances, which means we’ll have a 3×3 covariance matrix associated with a three-element vector. More generally, an n-element vector will have an n×n covariance matrix. Each of the covariance values in the matrix will represent the covariance between two values in the vector.
The first part of the matrix which we’ll examine more closely is the diagonal values, from (1, 1) to (n, n). Those are the covariances of: x to a change in x; y to a change in y; and yaw to a change in yaw. Because each of these pairs a value with itself, the diagonal elements are simply the variances of x, y and yaw. The rest of the elements of the covariance matrix describe the correlation between a change in one value, x for example, and a different value, y for example. To enumerate all of the elements of the covariance matrix for our example, we’ll use the following:
Vector elements at time t:
1st: x value
2nd: y value
3rd: yaw value
Covariance matrix elements:
1,1 1,2 1,3
2,1 2,2 2,3
3,1 3,2 3,3
where the elements correspond to:
1,1 x to x change covariance
1,2 x to y change covariance
1,3 x to yaw change covariance
2,1 y to x change covariance
2,2 y to y change covariance
2,3 y to yaw change covariance
3,1 yaw to x change covariance
3,2 yaw to y change covariance
3,3 yaw to yaw change covariance
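To show how those nine elements could be filled in, here’s a minimal sketch in plain Python. It assumes we’ve collected several (x, y, yaw) measurement vectors. The samples are invented (a robot driving along the x axis with a little noise in y and yaw), and covariance_matrix is my own helper name. In practice a library routine such as NumPy’s numpy.cov does the same job.

```python
def covariance_matrix(samples):
    """Sample covariance matrix for a list of equal-length vectors.

    Element [i][j] is the covariance between vector element i and
    vector element j across all of the samples.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    dim = len(samples[0])
    means = [sum(s[i] for s in samples) / n for i in range(dim)]
    return [
        [
            sum((s[i] - means[i]) * (s[j] - means[j]) for s in samples) / (n - 1)
            for j in range(dim)
        ]
        for i in range(dim)
    ]


# Hypothetical (x, y, yaw) measurements taken once per second.
samples = [
    (0.0, 0.00, 0.01),
    (1.0, 0.02, -0.01),
    (2.0, -0.01, 0.00),
    (3.0, 0.01, 0.02),
    (4.0, -0.02, -0.02),
]

matrix = covariance_matrix(samples)

labels = ["x", "y", "yaw"]
for label, row in zip(labels, matrix):
    print(label.ljust(4), ["%.5f" % value for value in row])
```

Row 1 of the output corresponds to elements 1,1 through 1,3 above, row 2 to elements 2,1 through 2,3, and so on.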
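Hopefully, at this point, it’s becoming clearer what the elements of a covariance matrix describe. It may also be apparent that there can be certain elements where no correlation is expected to exist. Note too that the matrix is symmetric: the covariance of x with y (element 1,2) is the same as the covariance of y with x (element 2,1), so only six of the nine values are distinct.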
It’s important to remember that certain covariance values are meaningful and others don’t provide any directly useful information. A large, positive covariance implies that a large change in the first value, in one direction, will usually correspond with a similarly large change, in the same direction, in the related value. A large negative covariance implies a corresponding large change but in the opposite direction. Smaller covariance values can imply either that there is no correlation between the changes in the two values or that the correlation exists but results in a small change. Keep in mind that the size of a covariance depends on the units of both values, so “large” and “small” only make sense relative to the variances of the two values involved.
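One common way to tell those two “small covariance” cases apart, which goes a little beyond what I’ve described so far, is to divide the covariance by the square root of the product of the two variances. The result is the correlation coefficient, which always falls between -1 and 1 regardless of units. Here’s a minimal sketch. The input numbers are made up, and the function name is my own.

```python
import math


def correlation_from_covariance(cov_ab, var_a, var_b):
    """Correlation coefficient from a covariance and the two variances."""
    if var_a <= 0 or var_b <= 0:
        raise ValueError("both variances must be positive")
    return cov_ab / math.sqrt(var_a * var_b)


# Hypothetical case 1: a small covariance, but the two values move
# together perfectly; the changes themselves are just small.
small_but_related = correlation_from_covariance(0.005, 0.025, 0.001)

# Hypothetical case 2: a covariance of zero, so no linear relationship.
unrelated = correlation_from_covariance(0.0, 0.025, 0.0001)

print("small but related:", small_but_related)
print("unrelated:", unrelated)
```

A result near 1 or -1 means the values are strongly related even if the covariance itself looks tiny, while a result near 0 means there’s little or no linear relationship.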