**Foundations**


stationary objects. The only way to avoid this quadratically increasing computational requirement is to develop suboptimal and approximate techniques. Recently, this problem has been the subject of intense research. Approaches using approximate inference, using exact inference on tractable approximations of the true model, and using approximate inference on an approximate model have been proposed. These approaches include:

• Thin junction tree filters (Paskin, 2003).

• Sparse extended information filters (Thrun et al., 2002; Thrun and Liu, 2003).

• Submap-based approaches: the Atlas framework (Bosse et al., 2003), the compressed filter (Guivant and Nebot, 2001) and Decoupled Stochastic Mapping (Leonard and Feder, 1999).

• Rao-Blackwellised particle filters (Montemerlo, 2003).

This topic is beyond the scope of this dissertation; Paskin (2003) includes an excellent comparison of these techniques.

**Perception Modelling and Data Association**

Besides the computational complexity issue, the problems of perception modelling and data association have to be solved in order to accomplish city-sized SLAM. For instance, the described feature-based formulas may not be feasible because extracting features robustly is very difficult in outdoor, urban environments. Data association is difficult in practice because of featureless areas, occlusion, etc. We will address perception modelling in Chapter 3 and data association in Chapter 5.

2.3 MOVING OBJECT TRACKING

**Formulation of Moving Object Tracking**

The robot (sensor platform) is assumed to be stationary for the sake of simplicity. The general moving object tracking problem can be formalized in probabilistic form as:

p(o_k, s_k | Z_k)   (2.30)

where o_k is the true state of the moving object at time k, s_k is the true motion mode of the moving object at time k, and Z_k is the perception measurement set leading up to time k.

Using Bayes’ rule, Equation 2.30 can be rewritten as:

p(o_k, s_k | Z_k) = p(o_k | s_k, Z_k) p(s_k | Z_k)   (2.31)

which indicates that the whole moving object tracking problem can be solved in two stages: the first stage is the mode learning stage, p(s_k | Z_k), and the second stage is the state inference stage, p(o_k | s_k, Z_k).

**Mode Learning and State Inference**

Without a priori information, online mode learning of time-series data is a daunting task. In the control literature, specific data collection procedures are designed for identification of the structural parameters of a system. However, the data collected online are often not sufficient for online identification of the structural parameters in moving object tracking applications.

Fortunately, the motion mode of a moving object can be approximately composed of several motion models such as the constant velocity model, the constant acceleration model and the turning model. Therefore the mode learning problem can be simplified to a model selection problem. It remains difficult, though, because the motion mode of a moving object can be time-varying. In this section, practical multiple model approaches are briefly reviewed, such as the generalized pseudo-Bayesian (GPB) approaches and the interacting multiple model (IMM) approach. Because the IMM algorithm is integrated into our whole algorithm, its derivation will be described in detail. The multiple model approaches described in this section are identical to those used in (Bar-Shalom and Li, 1988, 1995).

The same problems are solved with switching dynamic models in the machine learning literature (Ueda and Ghahramani, 2002; Pavlovic et al., 1999; Ghahramani and Hinton, 1998). In the case that the models in the model set are linear, such systems are called jump-linear systems or switching linear dynamic models. However, most of these methods are batch algorithms and thus are not suitable for our online applications.

**Fixed Structure Multiple Model Approach for Switching Modes.** In the fixed structure multiple model approach, it is assumed that the mode of the system obeys one of a finite number of models, in which the system has both continuous nodes and discrete nodes. Figure 2.14 shows a Dynamic Bayesian Network (DBN) representing three time steps of an example multiple model approach for solving the moving object tracking problem.

**Figure 2.14.** A DBN for multiple model based moving object tracking. Clear circles denote hidden continuous nodes, clear squares denote hidden discrete nodes, and shaded circles denote observed continuous nodes.

The mode of the moving object is assumed to be one of r possible models, which is described by:

s_k ∈ {M^j}_{j=1}^r   (2.32)

where {M^j}_{j=1}^r is the model set.

In practice the system does not always stay in one mode. Because mode jumps or mode switches do occur, the mode history of the system should be estimated. The mode history through time k is denoted as S_k:

S_k = {s_1, s_2, . . . , s_k}   (2.33)
Given r possible models, the number of possible mode histories M_k^l at time k is r^k, which increases exponentially with time. Let l be the index of the mode history:

l = 1, 2, . . . , r^k   (2.34)

The lth mode history, or sequence of modes, through time k is denoted as:

M_k^l = {M_1^{l_1}, M_2^{l_2}, . . . , M_k^{l_k}}
      = {M_{k-1}^l, M_k^{l_k}}   (2.35)


where l_i is the model index at time i from the history l, and

1 ≤ l_i ≤ r,   i = 1, . . . , k   (2.36)

Using Bayes' rule, the conditional probability of the lth mode history M_k^l can be obtained as:

μ_k^l ≜ p(M_k^l | Z_k)
      = p(M_k^l | Z_{k-1}, z_k)
      = p(z_k | M_k^l, Z_{k-1}) p(M_k^l | Z_{k-1}) / p(z_k | Z_{k-1})
      = η · p(z_k | M_k^l, Z_{k-1}) p(M_k^l | Z_{k-1})
      = η · p(z_k | M_k^l, Z_{k-1}) p(M_k^{l_k}, M_{k-1}^l | Z_{k-1})
      = η · p(z_k | M_k^l, Z_{k-1}) p(M_k^{l_k} | M_{k-1}^l, Z_{k-1}) μ_{k-1}^l   (2.37)
It is assumed that the mode jump process is a Markov process in which the current mode depends only on the previous one:

p(M_k^{l_k} | M_{k-1}^l, Z_{k-1}) = p(M_k^{l_k} | M_{k-1}^l)
                                  = p(M_k^{l_k} | M_{k-1}^{l_{k-1}})   (2.38)
Equation 2.37 can be rewritten as:

μ_k^l = η p(z_k | M_k^l, Z_{k-1}) p(M_k^{l_k} | M_{k-1}^{l_{k-1}}) μ_{k-1}^l   (2.39)

in which conditioning on the entire past history is still needed, even under the assumption that the mode jump process is a Markov process.

Using the total probability theorem, Equation 2.31 can be obtained by:

p(o_k | Z_k) = ∑_{l=1}^{r^k} p(o_k | M_k^l, Z_k) p(M_k^l | Z_k)
             = ∑_{l=1}^{r^k} p(o_k | M_k^l, Z_k) μ_k^l   (2.40)

This method is not practical because an exponentially increasing number of filters is needed to estimate the state; even if the modes are Markov, conditioning on the entire past history is still required. As with the computational complexity of the SLAM problem, the only way to avoid the exponentially increasing number of histories is to use approximate and suboptimal approaches which merge or reduce the number of mode history hypotheses in order to make computation tractable.
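To make the growth concrete, the hypothesis count r^k can be illustrated with a short sketch; the function name is illustrative and not from the dissertation:

```python
from itertools import product

# Each mode history through time k is a sequence (l_1, ..., l_k),
# with each l_i drawn from the r models, so there are r^k histories.
def count_mode_histories(r, k):
    return sum(1 for _ in product(range(1, r + 1), repeat=k))

# With only r = 3 models the number of hypotheses explodes quickly:
assert count_mode_histories(3, 1) == 3
assert count_mode_histories(3, 5) == 243     # 3^5
assert count_mode_histories(3, 10) == 59049  # 3^10
```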

**The Generalized Pseudo-Bayesian Approaches.** The generalized pseudo-Bayesian
(GPB) approaches (Tugnait, 1982) apply a simple suboptimal technique which keeps the
histories of the largest probabilities, discards the rest, and renormalizes the probabilities.

In the generalized pseudo-Bayesian approaches of the first order (GPB1), the state
*estimate at time k is computed under each possible current model. At the end of each cycle,*
*the r hypotheses are merged into a single hypothesis. Equation 2.40 is simplified as:*

p(o_k | Z_k) = ∑_{j=1}^r p(o_k | M^j, Z_k) p(M^j | Z_k)
             = ∑_{j=1}^r p(o_k | M^j, z_k, Z_{k-1}) μ_k^j
             ≈ ∑_{j=1}^r p(o_k | M^j, z_k, ô_{k-1}, Σ_{o_{k-1}}) μ_k^j   (2.41)

where Z_{k-1} is approximately summarized by ô_{k-1} and Σ_{o_{k-1}}. The GPB1 approach uses r filters to produce one state estimate. Figure 2.15 describes the GPB1 algorithm.
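The GPB1 merge step, which collapses the r model-conditioned Gaussians into a single Gaussian by moment matching (the same combination rule used later in Equations 2.52 and 2.53), can be sketched as follows; the helper name and NumPy formulation are illustrative, not taken from the dissertation:

```python
import numpy as np

def merge_hypotheses(means, covs, weights):
    """Moment-match a Gaussian mixture (one component per model)
    into a single Gaussian, as in the GPB1 merge step."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # renormalize model probabilities
    means = np.asarray(means, dtype=float)
    x = np.einsum('j,jd->d', w, means)   # combined mean
    P = np.zeros_like(covs[0], dtype=float)
    for wj, mj, Pj in zip(w, means, covs):
        d = (mj - x).reshape(-1, 1)
        P += wj * (Pj + d @ d.T)         # covariance plus spread-of-means term
    return x, P

# Two 1-D hypotheses with equal weights, means ±1, unit variance:
# the merged mean is 0 and the merged variance is 1 + 1 (spread term) = 2
x, P = merge_hypotheses([[1.0], [-1.0]], [np.eye(1), np.eye(1)], [0.5, 0.5])
```

The spread-of-means term is what keeps the merged covariance consistent: discarding it would understate the uncertainty introduced by disagreeing hypotheses.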

**Figure 2.15.** The GPB1 algorithm of one cycle for 2 switching models.

In the generalized pseudo-Bayesian approach of second order (GPB2), the state estimate is computed under each possible model at the current time k and the previous time k − 1:

p(o_k | Z_k) = ∑_{j=1}^r ∑_{i=1}^r p(o_k | M_k^j, M_{k-1}^i, Z_k) p(M_{k-1}^i | M_k^j, Z_k) p(M_k^j | Z_k)   (2.42)
In the GPB2 approach, there are r estimates and covariances at time k − 1. Each is predicted to time k and updated at time k under r hypotheses. After the update stage, the r^2 hypotheses are merged into r at the end of each estimation cycle. The GPB2 approach uses r^2 filters to produce r state estimates. Figure 2.16 describes the GPB2 algorithm, which does not show the state estimate and covariance combination stage. For output only, the latest state estimate and covariance can be combined from the r state estimates and covariances.

**The Interacting Multiple Model Algorithm.** In the interacting multiple model
*(IMM) approach (Blom and Bar-Shalom, 1988), the state estimate at time k is computed*
*under each possible current model using r filters and each filter uses a suitable mixing of*
the previous model-conditioned estimate as the initial condition. It has been shown that the


**Figure 2.16.** The GPB2 algorithm of one cycle for 2 switching models

IMM approach performs significantly better than the GPB1 algorithm and almost as well as the GPB2 algorithm in practice. Instead of using r^2 filters to produce r state estimates as in GPB2, the IMM uses only r filters to produce r state estimates. Figure 2.17 describes the IMM algorithm, which does not show the state estimate and covariance combination stage.

The derivation of the IMM algorithm is as follows:

**Figure 2.17.** The IMM algorithm of one cycle for 2 switching models

Similar to Equation 2.40, the Bayesian formula of the IMM-based tracking problem is described as:

p(o_k | Z_k)
  = ∑_{j=1}^r p(o_k | M_k^j, Z_k) p(M_k^j | Z_k)   (total probability)
  = ∑_{j=1}^r [ p(z_k | o_k, M_k^j, Z_{k-1}) p(o_k | M_k^j, Z_{k-1}) / p(z_k | M_k^j, Z_{k-1}) ] p(M_k^j | Z_k)   (Bayes)
  = η ∑_{j=1}^r p(z_k | o_k, M_k^j, Z_{k-1}) p(o_k | M_k^j, Z_{k-1}) p(M_k^j | Z_k)
  = η ∑_{j=1}^r p(z_k | o_k, M_k^j) · p(o_k | M_k^j, Z_{k-1}) · p(M_k^j | Z_k)   (Markov)   (2.43)
                    (update)              (prediction)              (weighting)

where p(M_k^j | Z_k) is the model probability and can be treated as the weighting of the estimate from model M_k^j, p(o_k | M_k^j, Z_{k-1}) is the prediction stage, and p(z_k | o_k, M_k^j) is the update stage. The final estimate is the combination of the estimates from all models.

The model probability, p(M_k^j | Z_k), can be calculated recursively as follows:

μ_k^j ≜ p(M_k^j | Z_k)
      = η p(z_k | M_k^j, Z_{k-1}) p(M_k^j | Z_{k-1})   (Bayes)
      = η p(z_k | M_k^j, Z_{k-1}) · ∑_{i=1}^r p(M_k^j | M_{k-1}^i, Z_{k-1}) · p(M_{k-1}^i | Z_{k-1})   (total probability)   (2.44)
            (mode match)                 (mode transition)              (= μ_{k-1}^i)

The last term on the right-hand side is the model probability of model M^i at time k − 1. The second term is the mode transition probability. Here it is assumed that the mode jump process is a Markov process with known mode transition probabilities.

Therefore,

P_{ij} ≜ p(M_k^j | M_{k-1}^i, Z_{k-1})
       = p(M_k^j | M_{k-1}^i)   (2.45)

The first term on the right-hand side can be treated as mode-matched filtering, which is computed by:

Λ_k^j ≜ p(z_k | M_k^j, Z_{k-1})
      = p(z_k | M_k^j, ô_{k-1}, Σ_{o_{k-1}})   (2.46)
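Under the usual linear-Gaussian assumptions, Λ_k^j is evaluated as a Gaussian density of the innovation; the text leaves the density implicit, so the following is a minimal sketch with an illustrative function name and arguments:

```python
import numpy as np

def mode_match_likelihood(z, z_pred, S):
    """Lambda_k^j = N(z; z_pred^j, S^j): Gaussian likelihood of the
    measurement under model j's predicted measurement z_pred and
    innovation covariance S (a standard linear-Gaussian choice)."""
    v = np.atleast_1d(np.asarray(z, float) - np.asarray(z_pred, float))  # innovation
    S = np.atleast_2d(S)
    norm = np.sqrt((2 * np.pi) ** len(v) * np.linalg.det(S))
    return float(np.exp(-0.5 * v @ np.linalg.solve(S, v)) / norm)

# 1-D sanity check: a standard normal evaluated at its mean
lam = mode_match_likelihood(0.0, 0.0, 1.0)  # 1/sqrt(2*pi) ≈ 0.3989
```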
To summarize, the recursive formula for computing the model probability is:

μ_k^j = η Λ_k^j ∑_{i=1}^r P_{ij} μ_{k-1}^i   (2.47)

where η is the normalization constant.
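One step of this recursion can be sketched directly; the function name is illustrative, and the transition matrix is stored with P_trans[i, j] = P_ij:

```python
import numpy as np

def update_model_probabilities(mu_prev, P_trans, likelihoods):
    """Equation 2.47: mu_k^j = eta * Lambda_k^j * sum_i P_ij * mu_{k-1}^i."""
    mu_prev = np.asarray(mu_prev, dtype=float)
    lam = np.asarray(likelihoods, dtype=float)
    mu = lam * (mu_prev @ np.asarray(P_trans, dtype=float))  # transition, then weight
    return mu / mu.sum()                                     # eta normalizes

# Two models with sticky transitions; the measurement likelihoods
# strongly favor model 2, so its probability rises from 0.5 to 0.9.
mu = update_model_probabilities([0.5, 0.5],
                                [[0.9, 0.1],
                                 [0.1, 0.9]],
                                [0.1, 0.9])
```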

The prediction stage of Equation 2.43 can be computed as follows:

p(o_k | M_k^j, Z_{k-1})
  = ∑_{i=1}^r p(o_k | M_k^j, M_{k-1}^i, Z_{k-1}) p(M_{k-1}^i | M_k^j, Z_{k-1})   (total probability)
  ≈ ∑_{i=1}^r ∫ p(o_k | M_k^j, M_{k-1}^i, {ô_{k-1}^l}_{l=1}^r) do_{k-1} · μ^{i|j}
  ≈ ∑_{i=1}^r ∫ p(o_k | M_k^j, M_{k-1}^i, ô_{k-1}^i) dô_{k-1} · μ^{i|j}   (interaction)   (2.48)

The second line of the above equation shows that Z_{k-1} is summarized by r model-conditioned estimates and covariances, which is what the GPB2 algorithm uses. The third line shows the key idea of the IMM algorithm, which uses a mixed estimate ô_{k-1} as the


input of the filter instead of {ô_{k-1}^l}_{l=1}^r. The last term on the right-hand side, the mixing probability, can be obtained by:

μ^{i|j} ≜ p(M_{k-1}^i | M_k^j, Z_{k-1})
        = η p(M_k^j | M_{k-1}^i, Z_{k-1}) p(M_{k-1}^i | Z_{k-1})
        = η P_{ij} μ_{k-1}^i   (2.49)

where η is the normalization constant. Assuming that each model-conditioned estimate is Gaussian and approximating the resulting mixture by a single Gaussian via moment matching, the mixed initial condition can be computed by:

ô_{k-1}^{0j} = ∑_{i=1}^r ô_{k-1}^i μ^{i|j}   (2.50)

and the corresponding covariance is:

Σ_{o_{k-1}}^{0j} = ∑_{i=1}^r { Σ_{o_{k-1}}^i + (ô_{k-1}^{0j} − ô_{k-1}^i)(ô_{k-1}^{0j} − ô_{k-1}^i)^T } μ^{i|j}   (2.51)

With the mixed initial conditions, the prediction and update stages can be done with each model using Kalman filtering. Let the estimate and the corresponding covariance from each model be denoted by ô_k^j and Σ_{o_k}^j respectively. For output purposes, the state estimate and covariance can be combined according to the mixture equations:

ô_k = ∑_{j=1}^r ô_k^j μ_k^j   (2.52)

Σ_{o_k} = ∑_{j=1}^r { Σ_{o_k}^j + (ô_k^j − ô_k)(ô_k^j − ô_k)^T } μ_k^j   (2.53)
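Putting the pieces together, one IMM cycle (mixing, model-conditioned Kalman filtering, model probability update, and output combination) can be sketched as follows. This is an illustrative linear-Gaussian implementation under assumed interfaces (each model is an (F, Q) pair sharing one linear measurement model H, R), not the dissertation's code:

```python
import numpy as np

def imm_cycle(mu_prev, x_prev, P_prev, models, P_trans, z, H, R):
    """One IMM cycle (Equations 2.49-2.51, Kalman predict/update,
    2.47, and 2.52-2.53). P_trans[i, j] = p(M_k^j | M_{k-1}^i)."""
    r = len(models)
    mu_prev = np.asarray(mu_prev, float)
    P_trans = np.asarray(P_trans, float)

    # Mixing probabilities mu^{i|j} (Eq 2.49), mixed initial conditions (2.50, 2.51)
    mix = P_trans * mu_prev[:, None]            # P_ij * mu_{k-1}^i
    mix /= mix.sum(axis=0, keepdims=True)       # eta normalizes each column j
    x0, P0 = [], []
    for j in range(r):
        xj = sum(mix[i, j] * x_prev[i] for i in range(r))
        Pj = np.zeros_like(P_prev[0])
        for i in range(r):
            d = (x_prev[i] - xj).reshape(-1, 1)
            Pj += mix[i, j] * (P_prev[i] + d @ d.T)
        x0.append(xj); P0.append(Pj)

    # Model-conditioned Kalman predict/update and likelihoods Lambda^j (Eq 2.46)
    x_new, P_new, lam = [], [], np.zeros(r)
    for j, (F, Q) in enumerate(models):
        xp, Pp = F @ x0[j], F @ P0[j] @ F.T + Q           # prediction
        v = z - H @ xp                                     # innovation
        S = H @ Pp @ H.T + R
        K = Pp @ H.T @ np.linalg.inv(S)
        x_new.append(xp + K @ v)
        P_new.append((np.eye(len(xp)) - K @ H) @ Pp)
        lam[j] = np.exp(-0.5 * v @ np.linalg.solve(S, v)) / \
                 np.sqrt((2 * np.pi) ** len(v) * np.linalg.det(S))

    # Model probability update (Eq 2.47), output combination (2.52, 2.53)
    mu = lam * (mu_prev @ P_trans)
    mu /= mu.sum()
    x = sum(mu[j] * x_new[j] for j in range(r))
    P = np.zeros_like(P_new[0])
    for j in range(r):
        d = (x_new[j] - x).reshape(-1, 1)
        P += mu[j] * (P_new[j] + d @ d.T)
    return mu, x_new, P_new, x, P

# Two scalar models: a near-stationary model and a diffusive random walk
F = np.eye(1); H = np.eye(1); R = 0.1 * np.eye(1)
models = [(F, 0.001 * np.eye(1)), (F, 1.0 * np.eye(1))]
mu, xs, Ps, x, P = imm_cycle([0.5, 0.5],
                             [np.zeros(1), np.zeros(1)],
                             [np.eye(1), np.eye(1)],
                             models, [[0.95, 0.05], [0.05, 0.95]],
                             np.array([0.2]), H, R)
```

Note that the r filters run in parallel and interact only through the mixing step, which is what keeps the cost at r filters rather than the r^2 of GPB2.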

**Calculation Procedures of the IMM Algorithm**

From the Bayesian formula of the moving object tracking problem, one cycle of the calculation procedure consists of the initialization, prediction, data association and update stages.

**Stage 1: Initialization.** Figure 2.18 shows the initialization stage of moving object tracking and Figure 2.19 shows the corresponding DBN. In this stage, it is assumed that there are r possible models in the model set, and that the prior model probabilities and the mode transition probabilities are given. The mixing probabilities are computed by Equation 2.49 and the mixed initial conditions are computed by Equation 2.50 and Equation 2.51.

**Figure 2.18.** The initialization stage of moving object tracking.

**Figure 2.19.** A DBN representing the initialization stage of moving object tracking.

**Stage 2: Prediction.** Figure 2.20 shows the prediction stage of moving object tracking and Figure 2.21 shows the corresponding DBN. With the mixed initial conditions, each filter uses its corresponding motion model to perform prediction individually in the IMM algorithm.

**Figure 2.20.** The prediction stage of moving object tracking.

**Figure 2.21.** A DBN representing the prediction stage of moving object tracking.

**Stage 3: Data Association.** Figure 2.22 shows the data association stage of moving object tracking and Figure 2.23 shows the corresponding DBN. In this stage, the sensor returns a new measurement z_k and each filter uses its own prediction to perform data association.

**Stage 4: Update.** Figure 2.24 shows the update stage of moving object tracking and
Figure 2.25 shows the corresponding DBN. In this stage, each filter is updated with the
associated measurement and then the mode-matched filtering is done by Equation 2.46.


**Figure 2.22.** The data association stage of moving object tracking.

**Figure 2.23.** A DBN representing the data association stage of moving object tracking.

The model probabilities are updated by Equation 2.47. For output purposes, the state and covariance can be computed by Equation 2.52 and Equation 2.53.

**Figure 2.24.** The update stage of moving object tracking.

**Figure 2.25.** A DBN representing the update stage of moving object tracking.

**Motion Modelling**

In the described formulation of moving object tracking, it is assumed that a model set is given or selected in advance, and tracking is performed based on model averaging over this model set. Theoretically and practically, the performance of moving object tracking strongly depends on the selected motion models. Figure 2.26 illustrates the different performances using different motion models. Given the same data set, the tracking results differ according to the selected motion models. Figure 2.27 illustrates the effects of model set completeness. If a model set does not contain a stationary motion model, move-stop-move object tracking may not be performed well. We will address the motion modelling related issues in Chapter 4.


**Figure 2.26.** Model selection. On the left is the result of tracking using a complicated motion model. On the right is the same data using a simple motion model.


**Figure 2.27.** Move-stop-move object tracking. On the left is the result of tracking
using only moving motion models. On the right is the result of tracking using
moving motion models and a stationary motion model.
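The motion models discussed above can be written as discrete-time state transition matrices. The following sketch uses illustrative state layouts (not the dissertation's exact parameterization) for a constant velocity model, a constant acceleration model, and the stationary model needed for move-stop-move tracking:

```python
import numpy as np

def constant_velocity_F(dt):
    """State [x, vx]: x' = x + vx*dt, vx' = vx."""
    return np.array([[1.0, dt],
                     [0.0, 1.0]])

def constant_acceleration_F(dt):
    """State [x, vx, ax]: position and velocity integrate the acceleration."""
    return np.array([[1.0, dt, 0.5 * dt**2],
                     [0.0, 1.0, dt],
                     [0.0, 0.0, 1.0]])

def stationary_F():
    """Stationary model for move-stop-move tracking: the state is frozen."""
    return np.eye(2)

# One-second propagation from x = 0, vx = 2 under constant velocity
x = constant_velocity_F(1.0) @ np.array([0.0, 2.0])     # → [2.0, 2.0]

# One-second propagation from x = 0, vx = 1, ax = 2 under constant acceleration
xa = constant_acceleration_F(1.0) @ np.array([0.0, 1.0, 2.0])  # → [2.0, 3.0, 2.0]
```

In an IMM setting, each of these matrices (with its own process noise) would define one model in the model set; the stationary model differs from the others only in that it injects no motion and little process noise.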

**Perception Modelling and Data Association**

Regarding perception modelling, it is assumed in the described formulation that objects can be represented by point features. In practice this may not be appropriate because of the wide variety of moving objects in urban and suburban areas. In Chapter 3, the hierarchical object based representation for moving object tracking will be described in detail. With regard to data association, using not only kinematic information from motion modelling but also geometric information from perception modelling will be addressed in Chapter 5.