**Foundations**

**2.5. SLAM with Detection and Tracking of Moving Objects**

Because of the computational intractability of SLAM with generic objects, SLAM with Detection and Tracking of Moving Objects, or SLAM with DATMO, decomposes the estimation problem into two separate estimators. Although the derivation of SLAM with DATMO is not as simple as that of SLAM with generic objects, its computation is considerably simplified, making it possible to update both the SLAM filter and the DATMO filter in real time. By maintaining separate posteriors for the stationary objects and the moving objects, the resulting estimation problems are of much lower dimension than the joint estimation problem. In this section, we address the formulation and calculation procedures of SLAM with DATMO.

**Formulation of SLAM with DATMO**

The derivation described in this section is identical to that in (Wang et al., 2003b).

**Assumptions.** Before introducing the derivation, the assumptions of SLAM with
DATMO are addressed. The first assumption is that measurements can be decomposed
into measurements of static and moving objects:

*z_k = z_k^o + z_k^m*, and hence *Z_k = Z_k^o + Z_k^m* (2.56)

Here the sensor measurements belonging to stationary objects are denoted by the variable *z_k^m*, and the sensor measurements belonging to moving objects are denoted by the variable *z_k^o*. In particular, this implies the following conditional independence:

*p(z_k | O_k, M, x_k) = p(z_k^o | O_k, M, x_k) p(z_k^m | O_k, M, x_k)*
*= p(z_k^o | O_k, x_k) p(z_k^m | M, x_k)* (2.57)

where the variable *x_k* denotes the true pose of the robot at time *k*, the variable *M = {m_k^1, m_k^2, . . . , m_k^l}* denotes the true locations of the stationary objects, of which there are *l* in the world at time *k*, and the variable *O_k = {o_k^1, o_k^2, . . . , o_k^n}* denotes the true states of the moving objects, of which there are *n* in the world at time *k*.
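The measurement decomposition of Equation 2.56 can be sketched as a simple partition of one scan, assuming a hypothetical detector has already flagged each return as moving or static (the text does not prescribe this data structure):

```python
# Sketch of the measurement decomposition z_k = z_k^o + z_k^m in
# Equation 2.56. The `is_moving` flag is a hypothetical detector output;
# only the partition itself is taken from the text.

def partition_scan(scan):
    """Split one scan z_k into z_k^o (moving) and z_k^m (static)."""
    z_o = [z for z in scan if z["is_moving"]]
    z_m = [z for z in scan if not z["is_moving"]]
    return z_o, z_m

scan = [
    {"range": 4.2, "bearing": 0.10, "is_moving": False},   # wall point
    {"range": 7.9, "bearing": -0.31, "is_moving": True},   # pedestrian
    {"range": 2.5, "bearing": 1.04, "is_moving": False},   # wall point
]
z_o, z_m = partition_scan(scan)   # one moving return, two static returns
```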

The second assumption is that when estimating the posterior over the map and the robot pose, the measurements of moving objects carry no information, nor do their positions *O_k*:

*p(M, x_k | O_k, Z_k, U_k) = p(M, x_k | Z_k^m, U_k)* (2.58)

where the variable *U_k = {u_1, u_2, . . . , u_k}* denotes the motion measurements up to time *k*.

This assumption is exact if we have no information whatsoever about the speed at which objects move. In general it is an approximation, but one that reduces the complexity of SLAM with moving features enormously.

**Derivation.** We begin by factoring out the most recent measurement:

*p(O_k, M, x_k | Z_k, U_k) ∝ p(z_k | O_k, M, x_k, Z_{k−1}, U_k) p(O_k, M, x_k | Z_{k−1}, U_k)* (2.59)

Observing the standard Markov assumption, we note that *p(z_k | O_k, M, x_k, Z_{k−1}, U_k)* does not depend on *Z_{k−1}* and *U_k*, hence we have

*p(O_k, M, x_k | Z_k, U_k) ∝ p(z_k | O_k, M, x_k) p(O_k, M, x_k | Z_{k−1}, U_k)* (2.60)

Furthermore, we can now partition the measurement *z_k = z_k^o + z_k^m* into moving and static parts, and obtain by exploiting the first assumption and Equation 2.57:

*p(O_k, M, x_k | Z_k, U_k) ∝ p(z_k^o | O_k, x_k) p(z_k^m | M, x_k) p(O_k, M, x_k | Z_{k−1}, U_k)* (2.61)

The rightmost term *p(O_k, M, x_k | Z_{k−1}, U_k)* can now be further developed, exploiting the second assumption:

*p(O_k, M, x_k | Z_{k−1}, U_k) = p(O_k | Z_{k−1}, U_k) p(M, x_k | O_k, Z_{k−1}, U_k)*
*= p(O_k | Z_{k−1}, U_k) p(M, x_k | Z_{k−1}^m, U_k)* (2.62)

Hence we get for our desired posterior:

*p(O_k, M, x_k | Z_k, U_k)*
*∝ p(z_k^o | O_k, x_k) p(z_k^m | M, x_k) p(O_k | Z_{k−1}, U_k) p(M, x_k | Z_{k−1}^m, U_k)*
*∝ [ p(z_k^o | O_k, x_k) p(O_k | Z_{k−1}, U_k) ] [ p(z_k^m | M, x_k) p(M, x_k | Z_{k−1}^m, U_k) ]* (2.63)

where the first bracketed factor is the DATMO term and the second is the SLAM term.

The term *p(O_k | Z_{k−1}, U_k)* resolves to the following prediction:

*p(O_k | Z_{k−1}, U_k) = ∫ p(O_k | Z_{k−1}, U_k, O_{k−1}) p(O_{k−1} | Z_{k−1}, U_k) dO_{k−1}*
*= ∫ p(O_k | O_{k−1}) p(O_{k−1} | Z_{k−1}, U_{k−1}) dO_{k−1}* (2.64)
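Under a linear-Gaussian motion model, the integral in Equation 2.64 has a closed form: it is the standard Kalman prediction. A minimal sketch, where the constant-velocity model and the matrices F and Q are illustrative assumptions:

```python
import numpy as np

# Sketch: the DATMO prediction of Equation 2.64 under a linear-Gaussian
# motion model o_k = F o_{k-1} + w, w ~ N(0, Q). With a Gaussian prior,
# the integral reduces to the standard Kalman prediction. The
# constant-velocity model below (dt = 1) is a hypothetical choice.

dt = 1.0
F = np.array([[1, 0, dt, 0],          # state layout: [x, y, vx, vy]
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
Q = 0.1 * np.eye(4)                   # assumed process noise

def datmo_predict(mean, cov):
    """p(O_k | Z_{k-1}, U_k) = N(F mean, F cov F^T + Q)."""
    return F @ mean, F @ cov @ F.T + Q

mean = np.array([0.0, 0.0, 1.0, 0.5])   # at origin, moving at (1, 0.5)
cov = np.eye(4)
m_pred, P_pred = datmo_predict(mean, cov)
```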
Finally, the term *p(M, x_k | Z_{k−1}^m, U_k)* in Equation 2.63 is obtained by the following step:

*p(M, x_k | Z_{k−1}^m, U_k) = p(x_k | Z_{k−1}^m, U_k, M) p(M | Z_{k−1}^m, U_k)*
*= ∫ p(x_k | Z_{k−1}^m, U_k, M, x_{k−1}) p(x_{k−1} | Z_{k−1}^m, U_k, M) p(M | Z_{k−1}^m, U_k) dx_{k−1}*
*= ∫ p(x_k | u_k, x_{k−1}) p(x_{k−1}, M | Z_{k−1}^m, U_{k−1}) dx_{k−1}* (2.65)
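For a Gaussian joint posterior and a linear motion model, Equation 2.65 likewise reduces to a closed-form prediction in which only the robot-pose block of the joint state changes, since the map is static. A sketch under those assumptions, with a hypothetical 2-D pose and a single 2-D landmark:

```python
import numpy as np

# Sketch of the SLAM prediction step in Equation 2.65 for a Gaussian
# joint over (pose, map) with a linear motion model x_k = x_{k-1} + u_k
# + noise, noise ~ N(0, R). Only the pose block is propagated; the
# 2-D pose / single 2-D landmark layout is an illustrative assumption.

def slam_predict(mean, cov, u, R):
    """Propagate p(x_{k-1}, M | ...) through p(x_k | u_k, x_{k-1})."""
    mean = mean.copy()
    cov = cov.copy()
    mean[:2] += u            # move the robot; the map stays put
    cov[:2, :2] += R         # only the pose block gains uncertainty
    return mean, cov

mean = np.array([0.0, 0.0, 5.0, 2.0])   # [robot x, robot y, lm x, lm y]
cov = np.eye(4)
mean_p, cov_p = slam_predict(mean, cov, u=np.array([1.0, 0.0]),
                             R=0.2 * np.eye(2))
```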
which is the familiar SLAM prediction step. Putting everything back into Equation 2.63, we now obtain the final filter equation:

*p(O_k, M, x_k | Z_k, U_k)*
*∝ p(z_k^o | O_k, x_k) ∫ p(O_k | O_{k−1}) p(O_{k−1} | Z_{k−1}, U_{k−1}) dO_{k−1}*
*× p(z_k^m | M, x_k) ∫ p(x_k | u_k, x_{k−1}) p(x_{k−1}, M | Z_{k−1}^m, U_{k−1}) dx_{k−1}* (2.66)

The measurement terms *p(z_k^o | O_k, x_k)* and *p(z_k^m | M, x_k)* are the update steps, and the two integrals are the prediction steps of the DATMO and SLAM parts, respectively.
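Under linear-Gaussian assumptions, each update term in Equation 2.66 reduces to a standard Kalman update applied to the corresponding predicted posterior. A minimal sketch; the measurement matrix `H_m` and all numbers are illustrative assumptions:

```python
import numpy as np

# Sketch of the two update terms in Equation 2.66: with linear-Gaussian
# models, conditioning a predicted posterior on z_k^m (SLAM factor) or
# z_k^o (DATMO factor) is the same Kalman update. The 2-D states and
# the measurement model below are hypothetical.

def kalman_update(mean, cov, z, H, R):
    """Condition N(mean, cov) on z = H s + v, v ~ N(0, R)."""
    S = H @ cov @ H.T + R                 # innovation covariance
    K = cov @ H.T @ np.linalg.inv(S)      # Kalman gain
    mean = mean + K @ (z - H @ mean)
    cov = (np.eye(len(mean)) - K @ H) @ cov
    return mean, cov

H_m = np.array([[1.0, 0.0]])              # observe the first component

# SLAM factor: update (pose, map) with a static-object measurement z_k^m.
mean_slam, cov_slam = kalman_update(
    np.array([0.0, 5.0]), np.eye(2), np.array([0.4]), H_m, np.array([[1.0]]))

# DATMO factor: update the moving-object state with z_k^o.
mean_datmo, cov_datmo = kalman_update(
    np.array([2.0, 1.0]), np.eye(2), np.array([2.6]), H_m, np.array([[1.0]]))
```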

**Solving the SLAM with DATMO problem.** From Equation 2.66, the inputs to this SLAM with DATMO filter are two separate posteriors: one of the conventional SLAM form, *p(x_{k−1}, M | Z_{k−1}^m, U_{k−1})*, and a separate one for DATMO, *p(O_{k−1} | Z_{k−1}, U_{k−1})*.

The remaining question is now how to recover those posteriors at time *k*. For the SLAM part, the recovery is simple:

*p(x_k, M | Z_k^m, U_k) = ∫ p(O_k, M, x_k | Z_k, U_k) dO_k*
*∝ p(z_k^m | M, x_k) ∫ p(x_k | u_k, x_{k−1}) p(x_{k−1}, M | Z_{k−1}^m, U_{k−1}) dx_{k−1}* (2.67)


For DATMO, we get:

*p(O_k | Z_k, U_k) = ∫∫ p(O_k, M, x_k | Z_k, U_k) dM dx_k*
*∝ ∫ [ p(z_k^o | O_k, x_k) ∫ p(O_k | O_{k−1}) p(O_{k−1} | Z_{k−1}, U_{k−1}) dO_{k−1} ] p(x_k | Z_k^m, U_k) dx_k* (2.68)

where the posterior over the pose *p(x_k | Z_k^m, U_k)* is simply the marginal of the joint calculated in Equation 2.67:

*p(x_k | Z_k^m, U_k) = ∫ p(x_k, M | Z_k^m, U_k) dM* (2.69)
For Gaussians, all these integrations are easily carried out in closed form. Equation 2.68 shows that DATMO should take account of the uncertainty in the robot's pose estimate, because the perception measurements are taken relative to the robot.
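The Gaussian case makes Equation 2.69 especially simple: marginalizing a joint Gaussian over the map amounts to selecting the pose block of the mean and covariance. A sketch with an assumed 2-D pose / 2-D landmark layout:

```python
import numpy as np

# Sketch: for a Gaussian joint p(x_k, M | Z_k^m, U_k) = N(mu, Sigma),
# the marginal over the pose (Equation 2.69) is obtained in closed form
# by selecting the pose block. The state layout (2-D pose first, then
# one 2-D landmark) is an illustrative assumption.

def pose_marginal(mu, Sigma, pose_dim=2):
    """Return mean/covariance of p(x_k | Z_k^m, U_k) from the joint."""
    return mu[:pose_dim], Sigma[:pose_dim, :pose_dim]

mu = np.array([1.0, 0.5, 5.0, 2.0])
Sigma = np.array([[1.2, 0.0, 0.3, 0.0],
                  [0.0, 1.2, 0.0, 0.3],
                  [0.3, 0.0, 1.0, 0.0],
                  [0.0, 0.3, 0.0, 1.0]])
m_x, S_x = pose_marginal(mu, Sigma)   # pose estimate with its covariance
```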

**Calculation Procedures**

Figure 2.29 shows a DBN representing three time steps of an example SLAM with DATMO problem with one moving object and one stationary object. The calculation procedures for solving the SLAM with DATMO problem are the same as for the SLAM problem and the moving object tracking problem, and consist of the initialization, prediction, data association and update stages. In this section, we explain the procedures to compute Equation 2.66 with the aid of DBN visualizations.

**Stage 1: Initialization.** Figure 2.30 shows the initialization stage. In this example,
two stationary objects and one moving object are initialized. Figure 2.31 shows a DBN
representing this example. It is assumed that the measurements can be classified into the
measurements of stationary objects and moving objects.

**Stage 2: Prediction.** Figure 2.32 shows the prediction stage in which the robot gets a
new motion measurement. Only the robot and the moving object are predicted. The robot
motion prediction is done with the robot motion model and the new motion measurement.

However, there is no motion measurement associated with the moving object, and its motion model is unknown. In this dissertation, the IMM algorithm is applied: the moving object motion prediction is done with the mixed initial conditions from the selected motion models. Figure 2.33 shows a DBN representing the prediction stage.
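A minimal sketch of the IMM-style mixed prediction described above: the moving-object prior is predicted under each candidate motion model and the results are combined by moment matching. The two models (constant velocity, constant position) and the weights are illustrative assumptions, not the dissertation's model set:

```python
import numpy as np

# Sketch of an IMM-style prediction: predict under each candidate
# motion model, then mix the per-model Gaussians by model probability
# (moment matching). Models, noise levels and weights are hypothetical.

def imm_predict(mean, cov, models, weights):
    """Moment-matched mixture of per-model Kalman predictions."""
    preds = [(F @ mean, F @ cov @ F.T + Q) for F, Q in models]
    mixed_mean = sum(w * m for w, (m, _) in zip(weights, preds))
    mixed_cov = sum(
        w * (P + np.outer(m - mixed_mean, m - mixed_mean))
        for w, (m, P) in zip(weights, preds)
    )
    return mixed_mean, mixed_cov

dt = 1.0
F_cv = np.array([[1, dt], [0, 1]], dtype=float)   # constant velocity
F_cp = np.eye(2)                                   # constant position
models = [(F_cv, 0.1 * np.eye(2)), (F_cp, 0.1 * np.eye(2))]

mean = np.array([0.0, 1.0])   # state: [position, velocity]
cov = np.eye(2)
m, P = imm_predict(mean, cov, models, weights=[0.7, 0.3])
```

The mixed covariance includes a spread-of-means term, so model disagreement shows up as extra uncertainty in the prediction.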

**Figure 2.29.** A DBN of the SLAM with DATMO problem of duration three with one
moving object and one stationary object.

**Figure 2.30.** The initialization
stage of SLAM with DATMO

**Figure 2.31.** A DBN representing the initialization stage of SLAM with DATMO

**Stage 3: Data Association.** Figure 2.34 shows the data association stage in which
the robot gets a new perception measurement at the new location. The new perception
measurement is associated with the stationary objects and the moving object. Figure 2.35
shows a DBN representing this stage.

**Stage 4: Update of the SLAM part.** Figure 2.36 shows the update stage of the SLAM
part of the whole problem. Only measurements associated with stationary objects are used
to update the robot pose and the stationary object map. After this update, the map and the
robot pose are more accurate. Figure 2.37 shows a DBN representing this stage.


**Figure 2.32.** The prediction stage of SLAM with DATMO

**Figure 2.33.** A DBN representing the prediction stage of SLAM with DATMO

**Figure 2.34.** The data association stage of SLAM with DATMO

**Figure 2.35.** A DBN representing the data association stage of SLAM with DATMO

**Figure 2.36.** The update stage of
the SLAM part of SLAM with
DATMO

**Figure 2.37.** A DBN representing the update stage of the SLAM part of SLAM with DATMO

**Stage 5: Update of the DATMO part.** Figure 2.38 shows the update stage of the DATMO part of the whole problem. Because the robot pose estimate is more accurate after the update of the SLAM part, the measurement associated with the moving object is more accurate as well. In the update stage of the DATMO part, this more accurate measurement is used to update the moving object pose as well as its motion models.

**Figure 2.38.** The update stage
of the DATMO part of SLAM
with DATMO

**Figure 2.39.** A DBN representing the update stage of the DATMO part of SLAM with DATMO