Sampling- and Correlation-based Range Image Matching

Perception Modelling

3.3. Sampling- and Correlation-based Range Image Matching

and not as dense. Sparse data causes problems of data association in the small, or correspondence finding, which directly affect the accuracy of direct methods. In the computer vision and indoor SLAM literature, the assumption that corresponding points represent the same physical point is valid because the data is dense. If a point-to-point metric is used in the ICP algorithm, one-to-one correspondence is not guaranteed with sparse data, which reduces the accuracy of the transformation estimate and slows convergence.

Research on the ICP algorithm suggests that minimizing distances between points and tangent planes can converge faster. However, because of sparse data and irregular surfaces in outdoor environments, secondary information derived from raw data, such as surface normals, can be unreliable and too sensitive. A sampling-based approach for dealing with this issue is addressed in the next section.


frame can be computed by:

$$
T = \oplus(\ominus(x_{wA}),\, x_{wB}) = \oplus(x_{Aw},\, x_{wB}) = x_{AB} \tag{3.4}
$$

where $\oplus$ is the compounding operation and $\ominus$ is the reverse operation defined in Chapter 2.1.
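As an illustration of Equation 3.4, the following is a minimal sketch of the compounding and reverse (inverse) operations for 2D poses written as (x, y, θ). It assumes the standard planar definitions of these operations; the function names `compound`, `inverse`, `wrap` and `relative_pose` are hypothetical and are not taken from Chapter 2.1.

```python
import math


def wrap(angle):
    """Wrap an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def compound(x_ij, x_jk):
    """Compounding: pose of frame k in frame i, given i->j and j->k."""
    x1, y1, t1 = x_ij
    x2, y2, t2 = x_jk
    return (
        x1 + x2 * math.cos(t1) - y2 * math.sin(t1),
        y1 + x2 * math.sin(t1) + y2 * math.cos(t1),
        wrap(t1 + t2),
    )


def inverse(x_ij):
    """Reverse operation: pose of frame i expressed in frame j."""
    x, y, t = x_ij
    return (
        -x * math.cos(t) - y * math.sin(t),
        x * math.sin(t) - y * math.cos(t),
        -t,
    )


def relative_pose(x_wA, x_wB):
    """Equation 3.4: relative transformation from the two world-frame poses."""
    return compound(inverse(x_wA), x_wB)


# Frame A sits at (1, 2) facing +y; frame B is 1 m further along +y.
print(relative_pose((1.0, 2.0, math.pi / 2), (1.0, 3.0, math.pi / 2)))
```

In this example the second pose lies one meter ahead of the first along its heading, so the relative pose is approximately (1, 0, 0).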

With a reasonable guess of the relative transformation $T^0$, the goal of range image registration is to find the optimal estimate $\hat{T}$ of $T$ that aligns the two point sets with minimal disparity. The ICP algorithm is a widely used direct method and has become the dominant method for aligning 2D and 3D range images because of its simplicity. It can be summarized as follows. Using a reasonably good initial guess of the relative transformation $T^0$, a set of points is chosen from A, and the corresponding closest points are found from B. A better estimate of the relative transformation $\hat{T}$ is then computed using a least squares method. This procedure is iterated until the change in the estimated relative transformation becomes very small. Let the $n$ corresponding point pairs be denoted by $\{(a^i, b^i)\}_{i=1}^{n}$. A distance metric can be defined as:

$$
E = \sum_{i=1}^{n} \left\| \oplus(T^0, b^i) - a^i \right\|^2 \tag{3.5}
$$

By minimizing E, a closed-form solution can be obtained as (Lu and Milios, 1994):

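$$
\begin{aligned}
\hat{T}_\theta &= \arctan\frac{\Sigma_{b_x a_y} - \Sigma_{b_y a_x}}{\Sigma_{b_x a_x} + \Sigma_{b_y a_y}} \\
\hat{T}_x &= \bar{a}_x - (\bar{b}_x \cos\hat{T}_\theta - \bar{b}_y \sin\hat{T}_\theta) \\
\hat{T}_y &= \bar{a}_y - (\bar{b}_x \sin\hat{T}_\theta + \bar{b}_y \cos\hat{T}_\theta)
\end{aligned} \tag{3.6}
$$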

where

$$
\begin{aligned}
\bar{a}_x &= \frac{1}{n}\sum_{i=1}^{n} a^i_x, &
\bar{a}_y &= \frac{1}{n}\sum_{i=1}^{n} a^i_y, &
\bar{b}_x &= \frac{1}{n}\sum_{i=1}^{n} b^i_x, &
\bar{b}_y &= \frac{1}{n}\sum_{i=1}^{n} b^i_y \\
\Sigma_{b_x a_x} &= \sum_{i=1}^{n} (b^i_x - \bar{b}_x)(a^i_x - \bar{a}_x), &
\Sigma_{b_y a_y} &= \sum_{i=1}^{n} (b^i_y - \bar{b}_y)(a^i_y - \bar{a}_y) \\
\Sigma_{b_x a_y} &= \sum_{i=1}^{n} (b^i_x - \bar{b}_x)(a^i_y - \bar{a}_y), &
\Sigma_{b_y a_x} &= \sum_{i=1}^{n} (b^i_y - \bar{b}_y)(a^i_x - \bar{a}_x)
\end{aligned} \tag{3.7}
$$

For localization, mapping and tracking, both the pose estimate and its corresponding distribution are important. In (Lu and Milios, 1997), Equation 3.5 is linearized and the analytical solution of the covariance matrix is derived using the theory of linear regression. In (Bengtsson and Baerveldt, 2001), a Hessian-matrix-based method to compute the covariance matrix was proposed. Because of the heuristic way the ICP algorithm finds corresponding points, neither method reliably estimates the uncertainty from correspondence errors. The method of Lu and Milios tends to underestimate the uncertainty, and the method of Bengtsson and Baerveldt tends to overestimate it. Additionally, these methods do not take measurement noise into account.
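The closed-form least squares step of Equations 3.6 and 3.7 can be sketched as follows. This is a minimal illustration, not the original implementation: the function name `closed_form_alignment` is hypothetical, `numpy.arctan2` is used in place of a plain arctangent so that the rotation falls in the correct quadrant, and the translation is taken as the mean of the points a minus the rotated mean of the points b, which is what minimizing Equation 3.5 yields.

```python
import numpy as np


def closed_form_alignment(a, b):
    """Return (T_x, T_y, T_theta) minimizing sum_i ||R b_i + t - a_i||^2.

    a, b: arrays of shape (n, 2) holding already-paired points,
    so that a[i] corresponds to b[i].
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_mean = a.mean(axis=0)
    b_mean = b.mean(axis=0)
    da = a - a_mean
    db = b - b_mean

    # Cross-covariance sums of Equation 3.7.
    s_bxax = np.sum(db[:, 0] * da[:, 0])
    s_byay = np.sum(db[:, 1] * da[:, 1])
    s_bxay = np.sum(db[:, 0] * da[:, 1])
    s_byax = np.sum(db[:, 1] * da[:, 0])

    # Rotation: arctan2 keeps the quadrant information (a choice of this sketch).
    theta = np.arctan2(s_bxay - s_byax, s_bxax + s_byay)
    c, s = np.cos(theta), np.sin(theta)

    # Translation: mean of a minus the rotated mean of b.
    t_x = a_mean[0] - (b_mean[0] * c - b_mean[1] * s)
    t_y = a_mean[1] - (b_mean[0] * s + b_mean[1] * c)
    return t_x, t_y, theta


# Usage: recover a known transformation from perfectly paired points.
b_pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
true_theta = 0.3
R = np.array([[np.cos(true_theta), -np.sin(true_theta)],
              [np.sin(true_theta), np.cos(true_theta)]])
a_pts = b_pts @ R.T + np.array([1.0, -2.0])
print(closed_form_alignment(a_pts, b_pts))
```

With exact correspondences the step recovers the transformation in one shot; inside ICP the pairs are only closest-point guesses, which is why the step has to be iterated.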

Because the heuristic way of finding corresponding points causes local minimum problems, a reasonably good initial guess of the relative transformation is essential for the successful use of the ICP algorithm. The saliency of the range images is also critical: even without a reasonable guess of the relative transformation, the ICP algorithm can still find the global minimum as long as the sensed scene has enough salient features, that is, a high saliency score. The following figures illustrate the object saliency effect. Figure 3.9 shows two scans from a static environment and the scan segmentation results.

Figure 3.9. Left: Scan A. Right: Scan B. The solid box denotes the robot (2 m × 5 m). Segmentation results are shown with segment numbers.

In this example, we assume that the motion measurement is unavailable and the initial guess of the relative transformation is zero. Figure 3.10 shows this initial guess of the relative transformation.

To illustrate the object saliency effect, range images A and B are aligned using the same initial guess of the relative transformation but different scan segments: one registration uses only segment 1 of scan A and segment 1 of scan B; the other uses the whole scans of A and B. Figure 3.11 shows the registration results. The ICP algorithm appears to provide satisfactory results in both cases, and it is hard to quantify which result is better. However, when the results are compared using the whole scans in Figure 3.12, it becomes clear that registration using only segment 1 of A and B yields a local minimum instead of the global one.


Figure 3.10. An initial guess of the relative transformation. Measurement points of scan A are denoted by "·"; measurement points of scan B are denoted by "×".

Figure 3.11. Results of segment 1 registration. Left: registration using only segment 1 of scan A and segment 1 of scan B. Right: registration using the whole scans of A and B.

Correspondence Finding Ambiguity

Because of sparse and featureless data, precisely estimating the relative transformation and its corresponding distribution is difficult, and ambiguity is hard to avoid in practice. However, as long as the ambiguity is modelled correctly, it can be reduced properly when more information or constraints become available. If the distribution does not describe the situation properly, data fusion cannot be done correctly even if the incoming measurements contain rich information or constraints to disambiguate the estimates. Therefore, although more computational power is needed, a sampling-based approach is applied to deal with the issues of correspondence finding ambiguity, or data association in the small.

Figure 3.12. Registration results of Figure 3.11 shown with the whole scans. Left: registration using segment 1 of scan A and segment 1 of scan B. Right: registration using the whole scans of A and B.

Instead of using only one initial guess of the relative transformation, the registration process is run N times with randomly generated initial relative transformations. Figure 3.13 shows the sampling-based registration of scan segment 1 in the previous example. The 100 randomly generated initial relative transformation samples are shown in the left figure, and the corresponding registration results are shown in the right figure. Figure 3.13 shows that one axis of translation is more uncertain than the other translation axis and the rotation axis. Figure 3.14 shows the corresponding sample means and covariances using different numbers of samples. The covariance estimates from the sampling-based approach describe the distribution correctly.

Figure 3.13. Sampling-based uncertainty estimation. Left: the randomly generated initial transformation samples. Right: the transformation estimates after applying the registration algorithm. Axes in both panels: meter, meter, degree.


Figure 3.14. The corresponding sample means and covariances using different numbers of samples (10, 100 and 1000 samples). Covariances are shown by 2σ ellipses (95% confidence). ¤ is the pose estimate using the whole scans, which can be treated as the ground truth. The mean estimates from 10, 100 and 1000 samples are labelled with a pentagram, a circle and a star, respectively.
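The sampling-based procedure described above can be sketched as follows: a small point-to-point ICP is run from N random initial guesses, and the sample mean and covariance of the resulting pose estimates are computed. This is only a sketch under stated assumptions: the initial guesses are drawn uniformly (the text only says they are randomly generated), closest points are found by brute force, the rotation mean ignores angle wrap-around, and all function names and parameters (`icp`, `sample_registration`, `trans_range`, `rot_range`) are hypothetical.

```python
import numpy as np


def apply_transform(T, pts):
    """Apply the 2D pose T = (t_x, t_y, theta) to an (n, 2) point array."""
    c, s = np.cos(T[2]), np.sin(T[2])
    R = np.array([[c, -s], [s, c]])
    return pts @ R.T + np.array([T[0], T[1]])


def align(a, b):
    """Closed-form least squares pose mapping paired points b onto a."""
    a_mean, b_mean = a.mean(axis=0), b.mean(axis=0)
    da, db = a - a_mean, b - b_mean
    num = np.sum(db[:, 0] * da[:, 1]) - np.sum(db[:, 1] * da[:, 0])
    den = np.sum(db[:, 0] * da[:, 0]) + np.sum(db[:, 1] * da[:, 1])
    theta = np.arctan2(num, den)
    c, s = np.cos(theta), np.sin(theta)
    t_x = a_mean[0] - (b_mean[0] * c - b_mean[1] * s)
    t_y = a_mean[1] - (b_mean[0] * s + b_mean[1] * c)
    return np.array([t_x, t_y, theta])


def icp(A, B, T0, max_iters=30, tol=1e-6):
    """Point-to-point ICP started from the initial guess T0."""
    T = np.asarray(T0, dtype=float)
    for _ in range(max_iters):
        B_moved = apply_transform(T, B)
        # For every point of A, pick the closest transformed point of B.
        dists = np.linalg.norm(A[:, None, :] - B_moved[None, :, :], axis=2)
        nearest = dists.argmin(axis=1)
        T_new = align(A, B[nearest])
        converged = np.linalg.norm(T_new - T) < tol
        T = T_new
        if converged:
            break
    return T


def sample_registration(A, B, n_samples=100, trans_range=1.0,
                        rot_range=np.radians(20.0), seed=0):
    """Run ICP from random initial guesses; return estimates, mean, covariance."""
    rng = np.random.default_rng(seed)
    T0s = np.column_stack([
        rng.uniform(-trans_range, trans_range, n_samples),
        rng.uniform(-trans_range, trans_range, n_samples),
        rng.uniform(-rot_range, rot_range, n_samples),
    ])
    estimates = np.array([icp(A, B, T0) for T0 in T0s])
    mean = estimates.mean(axis=0)           # naive mean, no angle wrapping
    cov = np.cov(estimates, rowvar=False)   # 3x3 sample covariance
    return estimates, mean, cov
```

A large eigenvalue of the returned covariance along one translation direction would correspond to the kind of elongated uncertainty that Figure 3.13 shows for segment 1.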

Measurement Noises

Because the sampling-based approach does not handle the measurement noise issues, the grid-based method (Elfes, 1988, 1990) and the correlation-based method (Konolige and Chou, 1999) are applied and integrated for taking measurement noise into account.

First, measurement points and their corresponding distributions are transformed into occupancy grids using the perception model described in Section 3.2. Let $g_a$ be an object grid built using the measurement A, and let $g_a^{xy}$ be the occupancy of the grid cell at $\langle x, y \rangle$.

The grid-based approach decomposes the problem of estimating the posterior probability p(g | A) into a collection of one-dimensional estimation problems, p(g^xy | A). A common approach is to represent the posterior probability using log-odds ratios:

$$
l_a^{xy} = \log\frac{p(g_a^{xy} \mid A)}{1 - p(g_a^{xy} \mid A)} \tag{3.8}
$$
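The log-odds representation of Equation 3.8 and its inverse can be sketched in a few lines. The function names and the clipping constant `eps` are assumptions of this sketch; clipping is added only to avoid taking the logarithm of zero for cells with probability exactly 0 or 1.

```python
import numpy as np


def prob_to_log_odds(p, eps=1e-6):
    """Equation 3.8 applied cell by cell to an occupancy probability grid."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return np.log(p / (1.0 - p))


def log_odds_to_prob(l):
    """Inverse mapping from log-odds back to occupancy probability."""
    return 1.0 / (1.0 + np.exp(-np.asarray(l, dtype=float)))


grid = np.array([[0.5, 0.9], [0.2, 0.5]])
print(prob_to_log_odds(grid))
```

An unknown cell with probability 0.5 maps to a log-odds value of zero, occupied cells map to positive values and free cells to negative values.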

Figure 3.15 and Figure 3.16 show the corresponding occupancy grids of segment 1 of scan A and of scan B.

After the grid maps $l_a$ and $l_b$ are built, the correlation of $l_a$ and $l_b$ is used to evaluate how strongly the grid maps are related. The correlation is computed as:

$$
p(A^{xy})\, p(B^{xy}) \tag{3.9}
$$

Because the posterior probability is represented using log-odds ratios, multiplication of probabilities can be done using additions.
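A correlation response between two grids can be sketched as below. This sketch assumes that the per-cell products of Equation 3.9 are summed over all cells to give a single response, and that both grids are stored as log-odds and cover the same cells; the function names are hypothetical.

```python
import numpy as np


def log_odds_to_prob(l):
    """Convert a log-odds grid back to occupancy probabilities."""
    return 1.0 / (1.0 + np.exp(-np.asarray(l, dtype=float)))


def grid_correlation(l_a, l_b):
    """Sum over cells of p(A^xy) * p(B^xy) for two aligned log-odds grids.

    Summing over all cells is an assumption of this sketch.
    """
    p_a = log_odds_to_prob(l_a)
    p_b = log_odds_to_prob(l_b)
    return float(np.sum(p_a * p_b))


# One confidently occupied cell in an otherwise free 3x3 grid.
l = np.full((3, 3), -5.0)
l[0, 0] = 5.0
shifted = np.roll(l, 1, axis=1)  # same object, displaced by one cell

print(grid_correlation(l, l), grid_correlation(l, shifted))
```

The aligned pair gives a much larger response than the displaced pair, which is the property used to weight the registration samples.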

Figure 3.15. Occupancy grids. Left: the measurement points and their corresponding distributions of Segment 1 of A. Right: the corresponding occupancy grids of Segment 1 of A. The whiter the grid map, the more certain the occupancy probability. Axes are in meters.

Figure 3.16. Occupancy grids. Left: the measurement points and their corresponding distributions of Segment 1 of B. Right: the corresponding occupancy grids of Segment 1 of B. Axes are in meters.

In the previous section, the sampling-based approach treated the samples equally. Now the samples are weighted with their normalized correlation responses. Figure 3.17 shows the normalized correlation responses.

Figure 3.17. The normalized correlations of the samples. Left: 3D view (X and Y in meters, vertical axis: weighting). Right: 2D view (X and Y in meters). The 2σ ellipse denotes the unweighted sample covariance. Samples whose correlation is higher than the correlation median are labelled by ◦. The other samples are labelled by ×.

The samples with low correlation responses can be filtered out to obtain a more accurate sample mean and covariance. Further, by properly clustering the samples, the distribution can be described more precisely by several Gaussian distributions instead of one Gaussian distribution. Figure 3.18 shows that the samples with high correlation values are grouped into three clusters, and the distribution of the pose estimate can now be represented by three Gaussian distributions.

Figure 3.18. Mean and covariance estimation by clustering. Left (a): mean and covariance estimates using the samples with high correlation values. Right (b): the mean and the distribution are described by three Gaussian distributions.

Based on this observation, instead of using a particle filter with hundreds or thousands of particles to deal with non-Gaussian distributions, this data-driven approach can use an appropriate number of samples (particles) to describe non-Gaussian distributions correctly without losing accuracy. This is left for future work.
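The weighting and filtering of samples described in this section can be sketched as follows. The sketch assumes that each registration sample is a pose (x, y, θ) with one non-negative correlation response, normalizes the responses to sum to one, and uses the median as the filtering threshold, as in Figure 3.17. The function names are hypothetical, and the clustering into several Gaussians is not covered here.

```python
import numpy as np


def weighted_mean_cov(samples, responses):
    """Mean and covariance of pose samples weighted by normalized responses."""
    samples = np.asarray(samples, dtype=float)
    w = np.asarray(responses, dtype=float)
    w = w / w.sum()                      # normalized correlation responses
    mean = w @ samples
    d = samples - mean
    cov = (w[:, None] * d).T @ d         # weighted (biased) covariance
    return mean, cov


def filter_by_median(samples, responses):
    """Keep only the samples whose response exceeds the median response."""
    samples = np.asarray(samples, dtype=float)
    responses = np.asarray(responses, dtype=float)
    keep = responses > np.median(responses)
    return samples[keep], responses[keep]


poses = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0],
                  [0.1, 0.0, 0.0], [1.9, 0.1, 0.0]])
resp = np.array([1.0, 4.0, 2.0, 3.0])
kept_poses, kept_resp = filter_by_median(poses, resp)
print(weighted_mean_cov(kept_poses, kept_resp))
```

Filtering first and weighting second is one possible ordering; the source does not prescribe one.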

Object Saliency Score

Stoddart et al. (1996) defined a parameter called the registration index, which provides a simple means of quantifying the registration error when aligning a particular shape. Similarly, we define a parameter called the object saliency score using the trace of the autocovariance estimate from the sampling- and correlation-based approach. The autocovariance is computed as:

$$
p(A \mid T^{0}_{[i]}, A) \tag{3.10}
$$

where $T^{0}_{[i]}$ is the $i$th randomly generated perturbation.

The object saliency score is defined and computed as:

$$
S = \frac{1}{\operatorname{trace}(\Sigma_A)} \tag{3.11}
$$

where $S$ is the object saliency score and $\Sigma_A$ is the autocovariance matrix of the object A from the sampling- and correlation-based approach. The larger the object saliency score, the more certain the pose estimate from the registration process. Table 3.2 shows the object saliency scores of the different objects shown in Figure 3.9.

Table 3.2. Object saliency scores of the different objects shown in Figure 3.9.

Object                        Object Saliency Score
Segment 1 (Bush Area)          2.1640
Segment 16 (Parked Car)       15.6495
Segment 17 (Building Wall)     9.0009
Whole scan                    15.8228

According to the object saliency scores, the pose estimates of the bush area and the wall object are more uncertain than the parked car and the whole sensed area. Regardless of the initial relative transformation guess, this is intuitively correct because the whole scan and the parked car contain salient features but the bush area and the wall do not.

Assuming that the environment is static, the whole scan should be used in the registration process, because it is more likely to contain salient features than individual segments and therefore yields more certain pose estimates.
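The object saliency score of Equation 3.11 can be sketched as the reciprocal of the trace of a sample covariance. This sketch assumes that the pose estimates obtained by registering an object against itself from random perturbations are already available as rows (x, y, θ), and it uses the unweighted sample covariance; the units used for translation and rotation affect the value of the trace, and the source does not state them. The function name is hypothetical.

```python
import numpy as np


def saliency_score(pose_samples):
    """S = 1 / trace(Sigma_A) for an (N, 3) array of self-registration poses."""
    samples = np.asarray(pose_samples, dtype=float)
    cov = np.cov(samples, rowvar=False)  # 3x3 sample covariance
    return 1.0 / np.trace(cov)


# Tightly grouped estimates (salient object) versus widely spread estimates.
tight = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0],
                  [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]])
loose = 10.0 * tight
print(saliency_score(tight), saliency_score(loose))
```

Scaling the spread of the estimates by 10 scales the covariance by 100, so the score drops by a factor of 100, matching the reading that a larger score means a more certain pose estimate.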
