Interacting Multiple Model (IMM) for Prediction¶
For a self-driving vehicle, it is important to reliably predict the movement of the traffic agents around the ego car, such as vehicles, cyclists, and pedestrians.
We have many neural network models to predict obstacles that are on a lane, but for obstacles that are off lane, our current method performs poorly.
Current predictor for obstacles not on lane¶
If an obstacle (vehicle/bicycle/pedestrian) is not on a lane, we use a FreeMovePredictor to predict its trajectory. FreeMovePredictor assumes that the obstacle always moves with constant acceleration, so the state is:
$$x = \begin{bmatrix} x & y & \dot{x} & \dot{y} & \ddot{x} & \ddot{y} \end{bmatrix}^T$$
and the transition matrix is:
$$A = \begin{bmatrix} 1 & 0 & T & 0 & \frac{T^2}{2} & 0 \\ 0 & 1 & 0 & T & 0 & \frac{T^2}{2} \\ 0 & 0 & 1 & 0 & T & 0 \\ 0 & 0 & 0 & 1 & 0 & T \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$
where \(T\) is the prediction time step.
The disadvantages are:
- it uses the newest position and velocity from the perception module, but the resulting prediction is not very accurate;
- it does not perform well, especially for vehicles.

To solve these problems and improve the off-lane prediction accuracy, we use:
- a constant velocity Kalman filter to predict pedestrians;
- an interacting multiple model (IMM) filter over constant velocity (CV), constant acceleration (CA), and constant turn rate (CT) models to predict vehicles and bicycles.
Kalman filter¶
In 1960, R.E. Kalman published his famous paper describing a recursive solution to the discrete-data linear filtering problem. Since that time, due in large part to advances in digital computing, the Kalman filter has been the subject of extensive research and application, particularly in the area of autonomous or assisted navigation.
The Kalman filter is a set of mathematical equations that provides an efficient computational (recursive) means to estimate the state of a process, in a way that minimizes the mean of the squared error. The filter is very powerful in several aspects: it supports estimations of past, present, and even future states, and it can do so even when the precise nature of the modeled system is unknown.
The process to be estimated¶
The Kalman filter addresses the general problem of trying to estimate the state \(x \in \Re^n\) of a discrete-time controlled process that is governed by the linear stochastic difference equation:
$$x_k = A x_{k-1} + B u_{k-1} + w_{k-1} \qquad (1)$$
with a measurement \(z \in \Re^m\) that is:
$$z_k = H x_k + v_k \qquad (2)$$
- The \(n \times n\) matrix \(A\) is the *transition matrix*, which relates the state at the previous time step \(k-1\) to the state at the current step \(k\), in the absence of either a driving function or process noise. Note that in practice \(A\) might change with each time step, but here we assume it is constant.
- The \(n \times l\) matrix \(B\) is the *control matrix*, which relates the optional control input \(u \in \Re^l\) to the state \(x\).
- The \(m \times n\) matrix \(H\) is the *measurement matrix*, which relates the state to the measurement \(z_k\). In practice \(H\) might change with each time step or measurement, but we assume it is constant.
The random variables \(w_k\) and \(v_k\) represent the process and measurement noise. They are assumed to be independent (of each other), white, and with normal probability distributions:
$$p(w) \sim N(0, Q) \qquad (3)$$
$$p(v) \sim N(0, R) \qquad (4)$$
where \(Q\) is the process noise covariance and \(R\) is the measurement noise covariance. They might change with each time step or measurement, but we assume that they are constant.
The computational origin of the filter¶
We define \(\hat{x}_k^- \in \Re^n\) to be our *a priori* state estimate at step \(k\) given knowledge of the process prior to step \(k\), and \(\hat{x}_k \in \Re^n\) to be our *a posteriori* state estimate at step \(k\) given measurement \(z_k\). We can then define *a priori* and *a posteriori* estimate errors as:
$$e_k^- \equiv x_k - \hat{x}_k^- \qquad (5)$$
$$e_k \equiv x_k - \hat{x}_k \qquad (6)$$
The *a priori* estimate error covariance is then:
$$P_k^- = E[e_k^- {e_k^-}^T] \qquad (7)$$
and the *a posteriori* estimate error covariance is:
$$P_k = E[e_k e_k^T] \qquad (8)$$
In deriving the equations for the Kalman filter, we begin with the goal of finding an equation that computes an *a posteriori* state estimate \(\hat{x}_k\) as a linear combination of the *a priori* estimate \(\hat{x}_k^-\) and a weighted difference between an actual measurement \(z_k\) and a measurement prediction \(H\hat{x}_k^-\), as shown below:
$$\hat{x}_k = \hat{x}_k^- + K(z_k - H\hat{x}_k^-) \qquad (9)$$
The difference \((z_k - H\hat{x}_k^-)\) is called the measurement *innovation*, or *residual*. The residual reflects the discrepancy between the predicted measurement \(H\hat{x}_k^-\) and the actual measurement \(z_k\). A residual of zero means that the two are in complete agreement.
The \(n \times m\) matrix \(K\) is chosen to be the *gain*, or *blending factor*, that minimizes the *a posteriori* error covariance (8). This minimization can be accomplished by:
- substituting (9) into the definition (6) and substituting that into (8);
- performing the indicated expectations;
- taking the derivative of the trace of the result with respect to \(K\);
- setting the result equal to \(0\) and then solving for \(K\).

One form of the resulting \(K\) that minimizes (8) is:
$$K_k = P_k^- H^T (H P_k^- H^T + R)^{-1} = \frac{P_k^- H^T}{H P_k^- H^T + R} \qquad (10)$$
Looking at (10) we see that as the measurement error covariance \(R \to 0\), the gain \(K\) weights the residual more heavily. Specifically,
$$\lim_{R \to 0} K_k = H^{-1}$$
On the other hand, as the *a priori* estimate error covariance \(P_k^- \to 0\), the gain \(K\) weights the residual less heavily. Specifically,
$$\lim_{P_k^- \to 0} K_k = 0$$
Another way of thinking about the weighting by \(K\) is that as the measurement error covariance \(R \to 0\), the actual measurement \(z_k\) is trusted more and more, while the predicted measurement \(H\hat{x}_k^-\) is trusted less and less. On the other hand, as the *a priori* estimate error covariance \(P_k^- \to 0\), the actual measurement \(z_k\) is trusted less and less, while the predicted measurement \(H\hat{x}_k^-\) is trusted more and more.
The discrete Kalman filter algorithm¶
The Kalman filter estimates a process by using a form of feedback control: the filter estimates the process state at some time and then obtains feedback in the form of (noisy) measurements. As such, the equations for the Kalman filter fall into two groups:
- *time update* (predict) equations;
- *measurement update* (correct) equations.

The *time update* equations are responsible for projecting forward (in time) the current state and error covariance estimates to obtain the *a priori* estimates for the next time step.
The *measurement update* equations are responsible for the feedback, incorporating a new measurement into the *a priori* estimate to obtain an improved *a posteriori* estimate.
The final estimation algorithm resembles that of a *predictor-corrector* algorithm for solving numerical problems.
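The original listing is not preserved here; a minimal sketch of the predict-correct cycle (with hypothetical `predict`/`correct` helpers, in the spirit of the simulation code later in this document) is:

```python
# Minimal sketch of the predict-correct cycle (hypothetical helper names).
x, P = x0, P0                     # initial state estimate and covariance
for z in measurements:
    x, P = predict(x, P)          # time update: project state and covariance forward
    x, P = correct(x, P, z)       # measurement update: blend in measurement z
```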
The specific equations for the *time update* are:
$$\hat{x}_k^- = A \hat{x}_{k-1} + B u_{k-1}$$
$$P_k^- = A P_{k-1} A^T + Q$$
where:
- \(\hat{x}_{k-1}\) is the *a posteriori* state from time step \(k-1\);
- \(u_{k-1}\) is the control from time step \(k-1\);
- \(\hat{x}_k^-\) is the *a priori* state for time step \(k\);
- \(P_{k-1}\) is the *a posteriori* estimate error covariance from time step \(k-1\);
- \(P_k^-\) is the *a priori* estimate error covariance for time step \(k\).
The specific equations for the *measurement update* are:
$$K_k = P_k^- H^T (H P_k^- H^T + R)^{-1}$$
$$\hat{x}_k = \hat{x}_k^- + K_k (z_k - H \hat{x}_k^-)$$
$$P_k = (I - K_k H) P_k^-$$
where:
- \(K_k\) is the gain for time step \(k\);
- \(z_k\) is the measurement from time step \(k\);
- \(\hat{x}_k\) is the *a posteriori* state for time step \(k\);
- \(P_k\) is the *a posteriori* estimate error covariance for time step \(k\).
Filter parameters and tuning¶
In the actual implementation of the filter, the measurement noise covariance \(R\) is usually measured prior to operation of the filter. Measuring the measurement error covariance \(R\) is generally practical (possible) because we need to be able to measure the process anyway (while operating the filter) so we should generally be able to take some off-line sample measurements in order to determine the variance of the measurement noise.
The determination of the process noise covariance \(Q\) is generally more difficult as we typically do not have the ability to directly observe the process we are estimating. Sometimes a relatively simple (poor) process model can produce acceptable results if one “injects” enough uncertainty into the process via the selection of \(Q\). Certainly in this case one would hope that the process measurements are reliable.
In either case, whether or not we have a rational basis for choosing the parameters, often superior filter performance (statistically speaking) can be obtained by *tuning* the filter parameters \(Q\) and \(R\). The tuning is usually performed off-line, frequently with the help of another (distinct) Kalman filter, in a process generally referred to as *system identification*.
Dynamic model¶
The motion of a target object (pedestrian or vehicle) can be modeled as:
- moving in a straight line with constant velocity (CV);
- moving in a straight line with constant acceleration (CA);
- moving with a constant turn rate (CT).
CV model¶
For this model, the states under consideration are:
$$x = \begin{bmatrix} x & y & \dot{x} & \dot{y} \end{bmatrix}^T$$
where:
- \(x\) is the position in the longitudinal component;
- \(y\) is the position in the lateral component;
- \(\dot{x}\) is the velocity in the x-direction;
- \(\dot{y}\) is the velocity in the y-direction.

For this model, the state transition matrix is:
$$A = \begin{bmatrix} 1 & 0 & T & 0 \\ 0 & 1 & 0 & T \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
where \(T\) is the sampling period.
CA model¶
For this model, the states under consideration are:
$$x = \begin{bmatrix} x & y & \dot{x} & \dot{y} & \ddot{x} & \ddot{y} \end{bmatrix}^T$$
where:
- \(x\) is the position in the longitudinal component;
- \(y\) is the position in the lateral component;
- \(\dot{x}\) is the velocity in the x-direction;
- \(\dot{y}\) is the velocity in the y-direction;
- \(\ddot{x}\) is the acceleration in the x-direction;
- \(\ddot{y}\) is the acceleration in the y-direction.

For this model, the state transition matrix is:
$$A = \begin{bmatrix} 1 & 0 & T & 0 & \frac{T^2}{2} & 0 \\ 0 & 1 & 0 & T & 0 & \frac{T^2}{2} \\ 0 & 0 & 1 & 0 & T & 0 \\ 0 & 0 & 0 & 1 & 0 & T \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$
CT model¶
For this model, the states under consideration are:
$$x = \begin{bmatrix} x & y & \dot{x} & \dot{y} & \dot{\theta} \end{bmatrix}^T$$
where:
- \(x\) is the position in the longitudinal component;
- \(y\) is the position in the lateral component;
- \(\dot{x}\) is the velocity in the x-direction;
- \(\dot{y}\) is the velocity in the y-direction;
- \(\dot{\theta}\) is the yaw rate of the obstacle.

For this model, writing \(\omega = \dot{\theta}\), the state transition matrix is:
$$A = \begin{bmatrix} 1 & 0 & \frac{\sin(\omega T)}{\omega} & -\frac{1-\cos(\omega T)}{\omega} & 0 \\ 0 & 1 & \frac{1-\cos(\omega T)}{\omega} & \frac{\sin(\omega T)}{\omega} & 0 \\ 0 & 0 & \cos(\omega T) & -\sin(\omega T) & 0 \\ 0 & 0 & \sin(\omega T) & \cos(\omega T) & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$
Because \(A\) depends on the yaw rate in the state itself, this model is nonlinear; in practice \(A\) is rebuilt at every step from the latest yaw-rate estimate.
Simulation for kalman filter¶
To check that the algorithm is correct, we implement the Kalman filter equations in Python. For details, visit imm.
Kalman filter¶
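The original listing was lost in formatting; what follows is a minimal NumPy sketch of the filter described by the time and measurement update equations above (the class and method names are assumptions):

```python
import numpy as np

class KalmanFilter:
    """Linear Kalman filter: time update (predict) + measurement update (correct)."""

    def __init__(self, A, H, Q, R, x0, P0, B=None):
        self.A, self.H, self.Q, self.R = A, H, Q, R
        self.B = B
        self.x, self.P = x0, P0

    def predict(self, u=None):
        # Time update: x_k^- = A x_{k-1} (+ B u_{k-1}), P_k^- = A P A^T + Q
        self.x = self.A @ self.x
        if self.B is not None and u is not None:
            self.x = self.x + self.B @ u
        self.P = self.A @ self.P @ self.A.T + self.Q
        return self.x, self.P

    def correct(self, z):
        # Measurement update: K_k = P^- H^T (H P^- H^T + R)^{-1}
        S = self.H @ self.P @ self.H.T + self.R          # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)         # Kalman gain
        v = z - self.H @ self.x                          # residual / innovation
        self.x = self.x + K @ v                          # x_k = x^- + K v
        self.P = (np.eye(self.P.shape[0]) - K @ self.H) @ self.P
        return self.x, self.P
```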
Constant velocity model¶
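The CV listing was also lost; a sketch of how the CV matrices could be built (the helper name is an assumption; the measurement is position and velocity, matching the perception output described earlier):

```python
import numpy as np

def cv_model(dt):
    """Constant velocity model: state [x, y, vx, vy]."""
    A = np.array([[1, 0, dt,  0],
                  [0, 1,  0, dt],
                  [0, 0,  1,  0],
                  [0, 0,  0,  1]], dtype=float)
    H = np.eye(4)    # position and velocity are observed directly
    return A, H
```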
Constant acceleration model¶
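A corresponding sketch for the CA matrices (same assumptions as the CV sketch; acceleration is in the state but not in the measurement):

```python
import numpy as np

def ca_model(dt):
    """Constant acceleration model: state [x, y, vx, vy, ax, ay]."""
    h = 0.5 * dt * dt
    A = np.array([[1, 0, dt,  0,  h,  0],
                  [0, 1,  0, dt,  0,  h],
                  [0, 0,  1,  0, dt,  0],
                  [0, 0,  0,  1,  0, dt],
                  [0, 0,  0,  0,  1,  0],
                  [0, 0,  0,  0,  0,  1]], dtype=float)
    H = np.hstack([np.eye(4), np.zeros((4, 2))])  # observe position and velocity only
    return A, H
```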
Constant turn rate model¶
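And a sketch for the CT matrices. Since the transition matrix depends on the yaw-rate estimate, it must be rebuilt at every step; the fallback for a near-zero turn rate is an assumption of this sketch:

```python
import numpy as np

def ct_model(dt, omega):
    """Constant turn rate model: state [x, y, vx, vy, yaw_rate]."""
    if abs(omega) < 1e-6:
        s, c = dt, 0.0                          # degenerates to CV as omega -> 0
    else:
        s = np.sin(omega * dt) / omega
        c = (1.0 - np.cos(omega * dt)) / omega
    cos_wt, sin_wt = np.cos(omega * dt), np.sin(omega * dt)
    A = np.array([[1, 0,      s,      -c, 0],
                  [0, 1,      c,       s, 0],
                  [0, 0, cos_wt, -sin_wt, 0],
                  [0, 0, sin_wt,  cos_wt, 0],
                  [0, 0,      0,       0, 1]], dtype=float)
    H = np.hstack([np.eye(4), np.zeros((4, 1))])  # the yaw rate is not observed
    return A, H
```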
The simulation result:
Interacting multiple model¶
The IMM estimator was originally proposed by Blom in "An efficient filter for abruptly changing systems". It is one of the most cost-effective classes of estimators for a single maneuvering target. The IMM has been receiving special attention in recent years, due to its capability of being combined with other algorithms to resolve the multiple target tracking problem.
The main idea of IMM is the identification of, and transition between, different models: at every tracking step, a weight (model probability) is maintained for each filter, and the current optimal state estimate is obtained as the probability-weighted combination of the filters' estimates.
Assume that we have \(r\) models. Each model's state equation is:
$$X_k^j = A_j X_{k-1}^j + w_{k-1}^j$$
where:
- \(j \in [1, r]\), and \(X\) is the state vector;
- \(A_j\) is the transition matrix of model \(j\);
- \(w\) is noise with variance \(Q\).

The measurement equation is:
$$Z_k = H X_k + v_k$$
where:
- \(Z\) is the measurement vector;
- \(H\) is the measurement matrix;
- \(v\) is noise with variance \(R\).

The transition probability matrix between models is:
$$P = \begin{bmatrix} p_{11} & \cdots & p_{1r} \\ \vdots & \ddots & \vdots \\ p_{r1} & \cdots & p_{rr} \end{bmatrix}$$
and the probability vector of the models is:
$$\mu = \begin{bmatrix} \mu^1 & \mu^2 & \cdots & \mu^r \end{bmatrix}$$
Step1: Input mix¶
$$\hat{X}_{k-1|k-1}^{0j} = \sum_{i=1}^{r} \mu_{k-1}^{i|j}\, \hat{X}_{k-1|k-1}^{i}$$
$$P_{k-1|k-1}^{0j} = \sum_{i=1}^{r} \mu_{k-1}^{i|j} \left[ P_{k-1|k-1}^{i} + \left(\hat{X}_{k-1|k-1}^{i} - \hat{X}_{k-1|k-1}^{0j}\right)\left(\hat{X}_{k-1|k-1}^{i} - \hat{X}_{k-1|k-1}^{0j}\right)^T \right]$$
where
- \(\hat{X}_{k-1|k-1}^{j}\) is the optimal state estimate of model \(j\),
- \(P_{k-1|k-1}^{j}\) is the corresponding estimate error covariance;

and
$$\mu_{k-1}^{i|j} = \frac{p_{ij}\, \mu_{k-1}^{i}}{\bar{c}_j}, \qquad \bar{c}_j = \sum_{i=1}^{r} p_{ij}\, \mu_{k-1}^{i}$$
where
- \(i, j = 1, \cdots, r\);
- \(\mu_{k-1}^{j}\) is the probability of model \(j\) at time \(k-1\);
- \(p_{ij}\) is the probability of a transition from model \(i\) to model \(j\).
Step2: Model estimate¶
Each model is updated with a standard Kalman filter, exactly as described above, using the mixed state \(\hat{X}_{k-1|k-1}^{0j}\) and covariance \(P_{k-1|k-1}^{0j}\) from Step 1 as its initial conditions.
Step3: Probability update¶
With the use of the latest measurement \(z_k\), the likelihood function value of model \(j\) at time \(k\) is given by:
$$\Lambda_k^j = \frac{1}{\sqrt{(2\pi)^{n_z} \left|S_k^j\right|}} \exp\left( -\frac{1}{2} \left(v_k^j\right)^T \left(S_k^j\right)^{-1} v_k^j \right)$$
where:
- \(v_k^j = z_k - z_{k|k-1}^j\) denotes the filter residual;
- \(S_k^j\) denotes the innovation covariance;
- \(n_z\) denotes the dimension of the measurement vector.

The model probability \(\mu_{k|k}^j\) at time \(k\) is computed by the following equation:
$$\mu_{k|k}^j = \frac{\Lambda_k^j\, \bar{c}_j}{c}$$
where:
$$c = \sum_{j=1}^{r} \Lambda_k^j\, \bar{c}_j$$
is the normalization constant.
Step4: Output Integration¶
Finally, the state estimate \(\hat{x}_{k|k}\) and corresponding covariance \(P_{k|k}\) are obtained from the model-conditional estimates and covariances of the different models:
$$\hat{x}_{k|k} = \sum_{j=1}^{r} \mu_{k|k}^j\, \hat{x}_{k|k}^j$$
$$P_{k|k} = \sum_{j=1}^{r} \mu_{k|k}^j \left[ P_{k|k}^j + \left(\hat{x}_{k|k}^j - \hat{x}_{k|k}\right)\left(\hat{x}_{k|k}^j - \hat{x}_{k|k}\right)^T \right]$$
Simulation for imm¶
To validate the performance of the proposed algorithm, we run a simulation in Python.
Imm algorithm¶
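The original listing was lost in conversion; below is a compact sketch of one IMM cycle over the four steps above, reusing the `KalmanFilter` sketch from the previous section. The function name and structure are assumptions, and all models are assumed to share a common (padded) state dimension so they can be mixed directly:

```python
import numpy as np

def imm_step(filters, mu, p_trans, z):
    """One IMM cycle: input mix, model filtering, probability update, output.

    filters: list of KalmanFilter objects with a common, padded state dimension
    mu:      model probabilities from the previous step, shape (r,)
    p_trans: model transition matrix, p_trans[i, j] = p(model i -> model j)
    z:       current measurement vector
    """
    r = len(filters)

    # Step 1: input mix -- mixing probabilities and mixed initial conditions.
    c_bar = p_trans.T @ mu                            # c_bar[j] = sum_i p_ij mu_i
    mu_mix = p_trans * mu[:, None] / c_bar[None, :]   # mu_mix[i, j] = p_ij mu_i / c_bar_j
    xs, Ps = [f.x for f in filters], [f.P for f in filters]
    for j, f in enumerate(filters):
        x0 = sum(mu_mix[i, j] * xs[i] for i in range(r))
        P0 = sum(mu_mix[i, j] * (Ps[i] + np.outer(xs[i] - x0, xs[i] - x0))
                 for i in range(r))
        f.x, f.P = x0, P0

    # Step 2 + 3: run each Kalman filter and evaluate its likelihood.
    likelihood = np.empty(r)
    for j, f in enumerate(filters):
        f.predict()
        v = z - f.H @ f.x                             # residual
        S = f.H @ f.P @ f.H.T + f.R                   # innovation covariance
        n_z = len(z)
        likelihood[j] = (np.exp(-0.5 * v @ np.linalg.solve(S, v))
                         / np.sqrt((2 * np.pi) ** n_z * np.linalg.det(S)))
        f.correct(z)

    # Step 3: model probability update.
    mu = likelihood * c_bar
    mu = mu / mu.sum()

    # Step 4: output integration.
    x_out = sum(mu[j] * filters[j].x for j in range(r))
    P_out = sum(mu[j] * (filters[j].P +
                         np.outer(filters[j].x - x_out, filters[j].x - x_out))
                for j in range(r))
    return x_out, P_out, mu
```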
Input¶
To test the algorithm's performance, the following curves are used:
- a generated constant velocity curve + constant turn rate curve + constant acceleration curve;
- vehicle pose data from a real autonomous car;
- vehicle pose data from the argoverse dataset.

The \(cv + ct + ca\) curve is generated as follows.
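The original generator script is not preserved here; this is a simplified sketch that stitches the three segments together (the segment lengths, speed, turn rate, and acceleration values are arbitrary assumptions):

```python
import numpy as np

def generate_cv_ct_ca_curve(dt=0.1):
    """Generate a test path: constant velocity, then constant turn rate,
    then constant acceleration. Returns an array of [x, y, vx, vy] rows."""
    states = []
    x, y = 0.0, 0.0
    v, heading = 5.0, 0.0

    # CV segment: straight line at 5 m/s for 5 s.
    for _ in range(int(5.0 / dt)):
        x += v * np.cos(heading) * dt
        y += v * np.sin(heading) * dt
        states.append([x, y, v * np.cos(heading), v * np.sin(heading)])

    # CT segment: constant turn rate of 0.2 rad/s for 5 s.
    omega = 0.2
    for _ in range(int(5.0 / dt)):
        heading += omega * dt
        x += v * np.cos(heading) * dt
        y += v * np.sin(heading) * dt
        states.append([x, y, v * np.cos(heading), v * np.sin(heading)])

    # CA segment: accelerate at 1 m/s^2 along the final heading for 5 s.
    a = 1.0
    for _ in range(int(5.0 / dt)):
        v += a * dt
        x += v * np.cos(heading) * dt
        y += v * np.sin(heading) * dt
        states.append([x, y, v * np.cos(heading), v * np.sin(heading)])

    return np.asarray(states)
```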
Simulation result¶
CV + CT + CA¶
From the figure we can see that:
- the algorithm can identify the three models correctly;
- it is harder to identify the \(CA\) model, because the acceleration is not in the observation vector;
- it is not easy to distinguish the \(CV\) and \(CA\) models.
Real vehicle data¶
The result is not very good, because the vehicle's motion does not fit any single model well.
Argoverse data¶
We choose a turning vehicle's path and add its velocity information. We can see that:
- it takes about 2.5 s to identify the right model;
- testing shows that if we set the initial model probabilities more precisely, the right model can be identified more quickly (within 0.5 s).
Generate prediction trajectory¶
With the model probabilities in hand, we can predict the obstacle's trajectory over a longer horizon. We use the probabilities and the three models to generate the trajectory.
The red line is the real trajectory of the vehicle, and the green lines are the trajectories predicted at each time step.
We can see that:
- at the beginning the filter cannot identify the right model, so it mixes the models to generate the trajectory;
- after the CT model is identified, the predicted trajectory gets closer to the real trajectory.
Cpp class diagram¶
After testing the correctness of the algorithm, we designed the class diagram of the code:
LRT class¶
We create a filter (Kalman filter or IMM) for every obstacle, so it is important to construct and destruct the filters dynamically.
We already have an LRU (Least Recently Used) class, which destroys the oldest node when its capacity is reached. But it has the following problems:
- if the capacity is too big and the number of filters in use is small, LRU wastes a lot of memory;
- if the capacity is too small and the number of filters in use is big, LRU destroys filters that are still in use, which hurts the prediction module.

So we add a time limit to the LRU structure and name it LRT: if a node is not used for a given time, it is destroyed.
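The real LRT class is not shown in this document; a minimal C++ sketch of the idea (the interface and member names below are assumptions) might look like:

```cpp
#include <chrono>
#include <memory>
#include <unordered_map>

// LRT: an LRU-style container with a time limit. A node that has not
// been used for longer than `timeout` is destroyed on the next insertion.
template <typename Key, typename Value>
class LRT {
 public:
  explicit LRT(std::chrono::seconds timeout) : timeout_(timeout) {}

  void Put(const Key& key, std::unique_ptr<Value> value) {
    nodes_[key] = Node{std::move(value), Clock::now()};
    Evict();
  }

  Value* Get(const Key& key) {
    auto it = nodes_.find(key);
    if (it == nodes_.end()) return nullptr;
    it->second.last_used = Clock::now();   // refresh the node on access
    return it->second.value.get();
  }

 private:
  using Clock = std::chrono::steady_clock;
  struct Node {
    std::unique_ptr<Value> value;
    Clock::time_point last_used;
  };

  // Destroy every node that has been idle for longer than timeout_.
  void Evict() {
    const auto now = Clock::now();
    for (auto it = nodes_.begin(); it != nodes_.end();) {
      if (now - it->second.last_used > timeout_) it = nodes_.erase(it);
      else ++it;
    }
  }

  std::chrono::seconds timeout_;
  std::unordered_map<Key, Node> nodes_;
};
```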
KalmanFilter¶
To work with different filter parameters (state and measurement dimensions), we use a class template to generate instances.
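The original listing is unavailable; a sketch of what such a template might look like with Eigen, with the dimensions as template parameters (names are assumptions):

```cpp
#include <Eigen/Dense>

// Kalman filter templated on the state and measurement dimensions, so one
// class generates instances for the CV (4), CA (6), and CT (5) models.
template <int StateDim, int MeasDim>
class KalmanFilter {
 public:
  void Predict() {                      // time update
    x_ = A_ * x_;
    P_ = A_ * P_ * A_.transpose() + Q_;
  }

  void Correct(const Eigen::Matrix<double, MeasDim, 1>& z) {
    const Eigen::Matrix<double, MeasDim, MeasDim> S =
        H_ * P_ * H_.transpose() + R_;                 // innovation covariance
    const Eigen::Matrix<double, StateDim, MeasDim> K =
        P_ * H_.transpose() * S.inverse();             // Kalman gain
    x_ += K * (z - H_ * x_);                           // state update
    P_ = P_ - K * H_ * P_;                             // covariance update
  }

  // Model matrices, public here for the sake of a short sketch.
  Eigen::Matrix<double, StateDim, StateDim> A_, P_, Q_;
  Eigen::Matrix<double, MeasDim, StateDim> H_;
  Eigen::Matrix<double, MeasDim, MeasDim> R_;
  Eigen::Matrix<double, StateDim, 1> x_;
};
```

With a four-dimensional measurement (position and velocity), the three models could then be instantiated as, e.g., `KalmanFilter<4, 4>`, `KalmanFilter<6, 4>`, and `KalmanFilter<5, 4>`.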
ImmKf¶
To work with different Kalman filters, we use a parameter pack to pass in all the filters.
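Again, the original listing is unavailable; a minimal C++17 sketch of the parameter-pack idea (names are assumptions, and the IMM mixing and probability-update steps are omitted for brevity):

```cpp
#include <tuple>
#include <utility>

// IMM shell that stores an arbitrary set of Kalman filters through a
// parameter pack, so filters with different state dimensions can coexist.
template <typename... Filters>
class ImmKf {
 public:
  explicit ImmKf(Filters... filters) : filters_(std::move(filters)...) {}

  // Run the time update of every filter in the pack.
  void PredictAll() {
    std::apply([](auto&... f) { (f.Predict(), ...); }, filters_);
  }

  // Run the measurement update of every filter in the pack.
  template <typename Meas>
  void CorrectAll(const Meas& z) {
    std::apply([&z](auto&... f) { (f.Correct(z), ...); }, filters_);
  }

 private:
  std::tuple<Filters...> filters_;
};

// Hypothetical usage: one IMM over CV, CA, and CT filters.
// ImmKf<KalmanFilter<4, 4>, KalmanFilter<6, 4>, KalmanFilter<5, 4>> imm{...};
```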