Foundations
2.2. Simultaneous Localization and Mapping
Figure 2.4. The tail-to-tail relationship.
The Tail-to-Tail Relationship
For local navigation or obstacle avoidance, it is more straightforward to use the lo-cations of moving objects in the robot frame than the lolo-cations with respect to the world coordinate system. In the example of Figure 2.4, given the locations of the robot xij and a moving object xikin the world frame, we want to know the location of this moving object, xjk, and its distribution, Σxjk, in the robot frame, which can be calculated recursively by:
xjk
4= ⊕(ª(xij), xik) = ⊕(xji, xik) (2.8) This relationship is called the tail-to-tail relationship in (Smith et al., 1990). The first-order estimate of the mean of this tail-to-tail operation is:
µxjk≈ ⊕(ª(µxij), µxik) (2.9)
and the first-order covariance estimate can be computed in a similar way:
Σxjk ≈ ∇⊕
· Σxji Σxjixjk
Σxjkxji Σxjk
¸
∇T⊕≈ ∇⊕
· ∇ªΣxij∇Tª Σxijxjk∇Tª
∇ªΣxjkxij Σxjk
¸
∇T⊕ (2.10) Note that this tail-to-tail operation is often used in data association and moving object tracking.
Unscented Transform
As addressed above, these spatial uncertain relationships are non-linear functions and are approximated by their first-order Taylor expansion for estimating the means and the covariances of their outputs. In the cases that the function is not approximately linear in the likely region of its inputs or the Jacobian of the function is unavailable, the unscented transform (Julier, 1999) can be used to improve the estimate accuracy. (Wan and van der Merwe, 2000) shows an example of using the unscented transform technique.
2.2. Simultaneous Localization and Mapping
In this section, we address the formulation, calculation procedures, computational complexity and practical issues of the SLAM problem.
Formulation of SLAM
The general formula for the SLAM problem can be formalized in the probabilistic form as:
p(xk, M | u1, u2, . . . uk, z0, z1, . . . zk) (2.11) where xkis the true pose of the robot at time k, ukis the measurement from motion sensors such as odomtrey and inertial sensors at time k, zk is the measurement from perception sensors such as laser scanner and camera at time k, and M is stochastic stationary object map which contains l landmarks, m1, m2, . . . ml. In addition, we define the following set to refer data leading up to time k:
Zk 4
= {z0, z1, . . . , zk} (2.12)
Uk
= {u4 1, u2, . . . , uk} (2.13)
Therefore, equation (2.11) can be rewritten as:
p(xk, M | Uk, Zk) (2.14)
Using Bayes’ rule and assumptions that the vehicle motion model is Markov and the environment is static, the general recursive Bayesian formula for SLAM can be derived and expressed as: (See (Thrun, 2002; Majumder et al., 2002) for more details.)
p(xk, M | Zk, Uk)
| {z }
Posterior at k
∝ p(zk | xk, M )
| {z }
Update Z
p(xk | xk−1, uk) p(xk−1, M | Zk−1, Uk−1)
| {z }
Posterior at k − 1
dxk−1
| {z }
Prediction
(2.15)
where p(xk−1, M | Zk−1, Uk−1) is the posterior probability at time k −1, p(xk, M | Zk, Uk) is the posterior probability at time k, p(xk | xk−1, uk) is the motion model, and p(zk | xk, M ) is the update stage which can be inferred as the perception model.
Calculation Procedures
Equation 2.15 only explains the computation procedures in each time step but does not address the dependency structure of the SLAM problem. Figure 2.5 shows a Dynamic Bayesian Network of the SLAM problem of duration three, which can be used to visualize the dependencies between the robot and stationary objects in the SLAM problem. In this section, we describe the Kalman filter-based solution of Equation 2.15 with visualization aid from Dynamic Bayesian Networks (Paskin, 2003). The EKF-based framework described
2.2 SIMULTANEOUS LOCALIZATION AND MAPPING
in this section is identical to that used in (Smith and Cheeseman, 1986; Smith et al., 1990;
Leonard and Durrant-Whyte, 1991).
Figure 2.5. A Dynamic Bayesian Network (DBN) of the SLAM problem of duration three. It shows the dependencies among the motion measurements, the robot, the perception measurements and the stationary objects. In this example, there are two stationary objects, m1and m2. Clear circles denote hidden continuous nodes and shaded circles denote observed continuous nodes. The edges from stationary objects to measurements are determined by data association. We will walk through this in the next pages.
Stage 1: Initialization. Figure 2.6 shows the initialization stage, or adding new stationary objects stage. Although the distributions are shown by ellipses in these figures, the Bayesian formula does not assume that the estimations are Gaussian distributions. In this example, two new stationary objects are detected and added to the map. The state xSk of the whole system now is:
xSk =
xk
m1 m2
(2.16)
Let the perception model, p(zk| xk, M ), be described as:
zk = h(xk, M ) + wk (2.17)
where h is the vector-valued perception model and wk ∼ N (0, Rk) is the perception error, an uncorrelated zero-mean Gaussian noise sequence with covariance, Rk. Because the zk
are the locations of the stationary objects M with respect to the robot coordinate system, the perception model h is simply the tail-to-tail relationship of the robot and the map. Let the perception sensor return the mean location, ˆz01, and variance, R10, of the stationary object m1and ˆz02and R20of m2. To add these measurements to the map, these measurements are compounded with the robot state estimate and its distribution because these measurements
Figure 2.6. The initialization stage of SLAM. Solid squares denote stationary objects and black solid circle denotes the robot. Distributions are shown by ellipses.
Figure 2.7. A DBN represent-ing the initialization stage of SLAM. After this stage, the undirected graphical model is produced in which two station-ary objects and the robot state are directly dependent.
are with respect to the robot coordinate system. Therefore, the mean and covariance of the whole system can be computed as in:
µxS
0 =
µx0
⊕(µx0, ˆz01)
⊕(µx0, ˆz02)
(2.18)
ΣxS
0 =
Σx0x0 Σx0m1 Σx0m2
ΣTx
0m1 Σm1m1 Σm1m2
ΣTx0m2 ΣTm1m2 Σm2m2
=
Σx0x0 Σx0x0∇T1⊕ Σx0x0∇T1⊕
∇1⊕Σx0x0 ∇1⊕Σx0x0∇T1⊕+ ∇2⊕R10∇T2⊕ 0
∇1⊕Σx0x0 0 ∇1⊕Σx0x0∇T1⊕+ ∇2⊕R20∇T2⊕
(2.19)
This stage is shown as p(xk−1, M | Zk−1, Uk−1) in equation (2.15). Figure 2.7 shows a DBN representing the initialization stage, or the adding new stationary objects stage, in which the undirected graphical model is produced by moralizing2 the directed graphical model. The observed nodes are eliminated to produce the final graphical model which shows that two stationary objects and the robot state are directly dependent.
Stage 2: Predication. In Figure 2.8, the robot moves and gets a motion measurement u1 from odometry or inertial sensors. Let the robot motion model, p(xk | xk−1, uk), be
2In the Graphical Model literature, moralizing means adding links between unmarried parents who share a common child.
2.2 SIMULTANEOUS LOCALIZATION AND MAPPING
described as:
xk = f (xk−1, uk) + vk (2.20)
where f (.) is the vector of non-linear state transition functions and vk is the motion noise, an uncorrelated zero-mean Gaussian noise sequence with covariance, Qk. Assuming that the relative motion in the robot frame is given by uk, clearly the new location of the robot is the compounding relationship of the robot pose xk−1 and uk. Because only the robot moves, only the elements of the mean and the covariance matrix that corresponding to xkmust be computed. In this example, the mean and the covariance matrix of the whole system can be computed as:
µxS 1 =
⊕(µx0, u1) µm1 µm2
(2.21)
and
ΣxS
1 =
Σx1x1 Σx1m1 Σx1m2
ΣTx1m1 Σm1m1 Σm1m2
ΣTx1m2 ΣTm1m2 Σm2m2
=
∇1⊕Σx0x0∇T1⊕+ ∇2⊕Q1∇T2⊕ ∇1⊕Σx0m1 ∇1⊕Σx0m2 ΣTx
0m1∇T1⊕ Σm1m1 Σm1m2
ΣTx
0m2∇T1⊕ ΣTm1m2 Σm2m2
(2.22)
Figure 2.8. The prediction stage of SLAM.
Figure 2.9. A DBN representing the prediction stage of SLAM.
This is the prediction stage of the SLAM problem which is shown asR
p(xk| xk−1, uk) p(xk−1, M | Zk−1, Uk−1)dxk−1in equation (2.15). Figure 2.9 shows a DBN representing the prediction stage of the SLAM problem. The new nodes, x1and u1, are added to the graph-ical model from the initialization stage. After moralizing the directed graphgraph-ical model, eliminating the odometry node u1and eliminating the node x0, the resulting undirected
graphical model is produced in which two stationary objects and the robot state are still directly dependent.
Stage 3: Data Association. Figure 2.10 shows that the robot gets new measure-ments, z11and z21, at the new location x1and associates z11and z21with the stationary object map. This is the data association stage of the SLAM problem. Gating is one of the data association techniques for determining whether a measurement z originates from some landmark m. More details about data association will be addressed in Chapter 5.
Figure 2.10. The data associa-tion stage of SLAM. Irregular stars denote new measure-ments.
Figure 2.11. A DBN represent-ing the data association stage of SLAM.
Figure 2.11 shows a DBN representing the data association stage. The new perception measurement nodes, z11and z12, are added to the graphical model from the prediction stage.
After data association, two directed edges are added to connect new measurements with the stationary object map.
Stage 4: Update. Figure 2.12 shows the update stage of the SLAM problem. Let the perception sensor return the mean location, ˆz11, and variance, R11, of the stationary object m1 and ˆz12 and R21 of m2. These constraints are used to update the estimate and the corresponding distribution of the whole system with Kalman filtering or other filtering techniques.
An innovation and its corresponding innovation covariance matrix are calculated by:
ν1= z1− ˆz1 (2.23)
Σν1= ∇hΣxS
1∇Th + ΣR1 (2.24)
2.2 SIMULTANEOUS LOCALIZATION AND MAPPING
Figure 2.12. The update stage of SLAM Figure 2.13. A DBN represent-ing the update stage of SLAM
where z1and ˆz1are computed by the compounding operation:
z1=
· ⊕(ª(µx1), µm1)
⊕(ª(µx1), µm2)
¸
(2.25)
ˆz1=
· zˆ11 ˆ z12
¸
(2.26) and ∇his the Jacobian of h taken at µx1. Then the state estimate and its corresponding state estimate covariance are updated according to:
xS1 = xS1 + K1ν1 (2.27)
ΣxS
1 = ΣxS
1 − K1∇hΣxS
1 (2.28)
where the gain matrix is given by:
K1= ΣxS
1∇ThΣ−1ν1 (2.29)
This is the update stage of the SLAM problem which is shown as p(zk| xk, M ) in equa-tion (2.15). Figure 2.13 shows a DBN representing the update stage of the SLAM problem.
After the update stage, the robot and two stationary objects are fully correlated.
Computational Complexity
The Kalman filter solution of the SLAM problem is elegant, but a key bottleneck is its computational complexity. Because it explicitly represents correlations of all pairs among the robot and stationary objects, the size of the covariance matrix of the whole system grows as O(l2), given that the number of stationary objects is l. The time complexity of the standard EKF operation in the update stage is also O(l2). This computational burden restricts applications to those in which the map can have no more than a few hundred
stationary objects. The only way to avoid this quadratically increasing computational re-quirement is to develop suboptimal and approximate techniques. Recently, this problem has been subject to intense research. Approaches using approximate inference, using exact in-ference on tractable approximations of the true model, and using approximate inin-ference on an approximate model have been proposed. These approaches include:
• Thin junction tree filters (Paskin, 2003).
• Sparse extended information filters (Thrun et al., 2002; Thrun and Liu, 2003).
• Submap-based approaches: the Atlas framework (Bosse et al., 2003), compressed filter (Guivant and Nebot, 2001) and Decoupled Stochastic Mapping (Leonard and Feder, 1999).
• Rao-Blackwellised particle filters (Montemerlo, 2003).
This topic is beyond the scope intended by this dissertation. (Paskin, 2003) includes an excellent comparison of these techniques.
Perception Modelling and Data Association
Besides the computational complexity issue, the problems of perception modelling and data association have to be solved in order to accomplish city-sized SLAM. For in-stance, the described feature-based formulas may not be feasible because extracting fea-tures robustly is very difficult in outdoor, urban environments. Data association is difficult in practice because of featureless areas, occlusion, etc. We will address perception mod-elling in Chapter 3 and data association in Chapter 5.