Proposed Technique for Deriving Camera Poses

Chapter 5 Binocular Omni-vision Systems with an Automatic Adaptation

5.4 Proposed Technique for Deriving Camera Poses


Fig. 5.6 Experimental result of the proposed adaptation method for detecting the included angle φ. (a) and (b) Left and right omni-images with the detected space lines superimposed. (c) Accumulation result with the maximum at φ = 23°.


The world coordinate system X-Y-Z is defined as depicted in Fig. 5.7. The X-axis passes through the two camera centers O1 and O2; the Y-axis is taken to be parallel to the Y-axes of both CCSs; the Z-axis is defined to be perpendicular to the XY-plane; and the origin is defined to be the origin O1 of CCS 1. Note that, since the two omni-cameras are affixed firmly on the omni-camera stands and adjusted to an identical height as described previously, the axes X, Z, X1, Z1, X2, and Z2 all lie on the same plane, as illustrated in Fig. 5.7. Since the two omni-cameras may be placed at any location with any orientation, the baseline D and the orientation angles θ1 and θ2 (as defined in Fig. 5.7) must be found in advance in order to calculate the 3D data of space points. The proposed methods for deriving the orientation angles and the baseline are described in the following.

Let the camera coordinates of CCS 1 be denoted as (X1, Y1, Z1), and those of CCS 2 as (X2, Y2, Z2), as shown in Fig. 5.7. As mentioned previously, the two CCSs X1-Y1-Z1 and X2-Y2-Z2 are allowed to be oriented arbitrarily (with Y1 and Y2 parallel to each other), and the only knowledge acquired by the proposed system is the angle φ between the two optical axes Z1 and Z2, which is derived using the detected space lines as described in Section 5.3.

To derive the angles θ1 and θ2, the user is asked to stand in the middle region in front of the two omni-cameras so that a feature point Puser on the user’s body may be utilized to draw a mid-perpendicular plane of the line segment O1O2, as shown in Fig. 5.7. Let (X1, Y1, Z1) be the coordinates of Puser in CCS 1, and (u1, v1) be the image coordinates of the corresponding pixel in the left omni-image. From (14) and (16), we have the equality:

Fig. 5.7 A top view of the coordinate systems. The baseline D, the orientation angles θ1 and θ2, and a point Puser on the user’s body are also drawn.

and θ2 then follows directly from θ1 and the included angle φ. This completes the derivation of the orientation angles θ1 and θ2 of the two cameras.

To compute the baseline D, we make use of a fact about triangulation in binocular computer vision: 3D data can be determined up to a scale factor without knowing the value of the baseline D [54]. Specifically, in the omni-images taken of the user standing in front of the two cameras as mentioned previously, we extract two points, one on the head and one on the feet of the user. Let Phead and Pfoot denote their real 3D data, respectively. As stated above, we can compute the 3D data of the two points up to a scale factor, denoted as P′head and P′foot, respectively, using triangulation calculations [54] with the baseline D set to one unit. Then, the relations between the data Phead, Pfoot, P′head, and P′foot can be expressed as

Phead = D·P′head, and Pfoot = D·P′foot, (32)

where D is the actual baseline value. Let H′ be the Euclidean distance between P′head and P′foot, and let H be the real distance between Phead and Pfoot, which is just the known height of the user. Then, the baseline D can finally be computed as D = H/H′.
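The scale argument above can be checked numerically. The following is a minimal sketch, not the implementation used in this study: it assumes that each camera supplies a viewing ray toward the feature point, already expressed in the world coordinate system, and it uses a generic midpoint triangulation in place of the triangulation of [54]. The names `triangulate`, `baseline_from_height`, and `unit_reconstruction`, as well as all numeric values, are hypothetical. Because Euclidean distances scale linearly with the baseline, H = D·H′, so D = H/H′.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate(o1, d1, o2, d2):
    """Midpoint of the shortest segment between rays o1 + t1*d1 and o2 + t2*d2."""
    w = [q - p for p, q in zip(o1, o2)]
    # Normal equations of min |(o1 + t1*d1) - (o2 + t2*d2)|^2 in t1, t2.
    a11, a12 = dot(d1, d1), -dot(d1, d2)
    a21, a22 = dot(d1, d2), -dot(d2, d2)
    b1, b2 = dot(w, d1), dot(w, d2)
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("rays are parallel; no unique closest point")
    t1 = (b1 * a22 - a12 * b2) / det
    t2 = (a11 * b2 - a21 * b1) / det
    return [0.5 * ((p + t1 * u) + (q + t2 * v))
            for p, u, q, v in zip(o1, d1, o2, d2)]

def baseline_from_height(p_head_unit, p_foot_unit, user_height):
    """D = H / H', with H' measured in the unit-baseline reconstruction."""
    return user_height / math.dist(p_head_unit, p_foot_unit)

# Hypothetical ground truth (cm), used only to simulate the viewing rays.
TRUE_D = 180.0
O1, O2_TRUE, O2_UNIT = [0.0, 0.0, 0.0], [TRUE_D, 0.0, 0.0], [1.0, 0.0, 0.0]
HEAD, FOOT = [90.0, 60.0, 250.0], [90.0, -110.0, 250.0]  # 170 cm apart

def unit_reconstruction(p):
    """Triangulate p from its true viewing rays, pretending the baseline is 1."""
    ray1 = [a - b for a, b in zip(p, O1)]
    ray2 = [a - b for a, b in zip(p, O2_TRUE)]
    return triangulate(O1, ray1, O2_UNIT, ray2)

head_unit = unit_reconstruction(HEAD)
foot_unit = unit_reconstruction(FOOT)
D = baseline_from_height(head_unit, foot_unit, user_height=170.0)
head_scaled = [D * c for c in head_unit]  # Phead = D * P'head, as in (32)
print(D, head_scaled)
```

In this sketch the ray directions are the only measurements taken from the cameras; the true baseline enters solely through the simulated rays, and it is recovered from the known user height alone.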

After the baseline D is found, all the system parameters have been adapted. To sum up, the three steps of the proposed adaptation method are as follows.

First, the included angle φ between the two optical axes is determined using space line features as discussed in Section 5.3. Then, with the user standing in the middle region in front of the two omni-cameras, the orientation angles θ1 and θ2 of the two cameras are calculated as described in this section. Finally, the baseline D is calculated using the height H of the user, also as described in this section.

5.5 Experimental Results

Some experimental results are given here to show the adaptation ability under different cameras and environments. Two types of cameras were used, namely perspective cameras and catadioptric omni-cameras, and three different environments were considered, namely a corridor, a hall, and a room, as shown in Figs. 5.8(a) through 5.8(c).

Four different experiments were conducted: Experiment 1 in the corridor with omni-cameras; Experiment 2 in the hall with omni-cameras; Experiment 3 in the room with omni-cameras; and Experiment 4 also in the room but with perspective cameras. In each experiment, the two cameras were oriented at different angles (i.e., −30°, −15°, 0°, 15°, and 30°). Fifty space line features were first extracted as proposed in Section 5.2. Then, the angle φ was automatically calculated using these lines as proposed in Section 5.3. The results are shown in Fig. 5.8(d), in which the horizontal axis specifies the ground truth of the angle φ, and the vertical axis specifies the absolute error of the calculated angle φ.

In Experiments 1 and 2, since the lines in the corridor and the hall are relatively simple and obvious, the adaptation results are accurate, with errors of about 2°, as shown by the green and purple curves in Fig. 5.8(d). Also, since omni-cameras were used in these experiments, the lines can still be captured even when the two cameras are oriented at a large angle. Thus, the adaptation result remains accurate when the angle φ is large. In Experiment 3, since the space lines in the room are more complicated, the adaptation becomes more difficult. However, since the omni-cameras capture a large field of view of the environment, plenty of space lines can still be captured. Therefore, the adaptation result is accurate as well, with errors of about 4°, as shown by the red curve in Fig. 5.8(d). In contrast, the adaptation errors are about 10° when perspective cameras were used, as shown by the blue curve in Fig. 5.8(d), and they become unacceptable (larger than 20°) when the included angle φ is large. These experimental results show the feasibility of the proposed adaptation method, as well as the power of omni-cameras in the automatic adaptation process.

Another series of experiments was conducted to test the adaptation ability and the 3D acquisition precision in the room environment. In each experiment in this series, the two cameras were placed about 180 cm apart, and both were oriented randomly within the range of ±40°. After the cameras were set up, two omni-images of the environment were captured, as shown for example in Figs. 5.9(a) and 5.9(d), and used to calculate the included angle φ according to Step 2 of Algorithm 5.1. Next, a user was asked to stand in the middle region in front of the two cameras, as shown in Figs. 5.9(b) and 5.9(e), to calculate the orientation angles θ1 and θ2 and the baseline D according to Step 3 of Algorithm 5.1. After these adaptation tasks were done, a board with 60 landmarks was held by the user, as shown in Figs. 5.9(c) and 5.9(f), to test the precision of the resulting 3D computation.

Fig. 5.8 Experimental results under different cameras and environments. (a) A corridor. (b) A hall. (c) A room. (d) Adaptation results of the angle φ.

Fig. 5.9 Sample omni-images of an experiment. (a)(d) Taking a shot of the environment to calculate φ. (b)(e) A user standing in the middle region in front of the cameras to calculate the baseline D and the orientation angles θ1 and θ2. (c)(f) A board held by the user to test the 3D computation precision.

In these experiments, three different degrees of adaptation were implemented and the corresponding results compared: (1) no adaptation was conducted, with the camera orientations and the baseline set to θ1 = θ2 = 0° and D = 180 cm (the ground-truth value of D); (2) the left omni-camera was assumed to face forward, with θ1 = 0°, D = 180 cm, and θ2 adapted from the detected included angle φ; and (3) all the parameters θ1, θ2, and D were adapted according to the proposed method. Denoting (Xi, Yi, Zi) as the ground-truth location of a landmark point and (Xi′, Yi′, Zi′) as the calculated location, we define the 3D error E of each landmark point as

$$E = \frac{\sqrt{(X_i - X_i')^2 + (Y_i - Y_i')^2 + (Z_i - Z_i')^2}}{\sqrt{X_i^2 + Y_i^2 + Z_i^2}}. \tag{33}$$

The comparison results are shown in Fig. 5.10, in which the vertical axis specifies the average of the 3D errors, and the horizontal axis specifies the system orientation angle, which is defined as the maximum of the two orientation angles θ1 and θ2.

Fig. 5.10 Experimental results of three different degrees of adaptation (no adaptation; adapting θ2 only; adapting θ1, θ2, and D). (a) The 3D errors. (b) The standard deviations of the 3D errors. The proposed adaptation method yields the best results, as shown by the purple curves.

As can be seen from Figs. 5.10(a) and 5.10(b), when no parameter is adapted, with the results shown by the blue curves, the 3D errors become larger as the orientation angle becomes larger, showing the necessity of an automatic system adaptation process. When only the orientation θ2 of the right omni-camera is adapted, with the results shown by the red curves, the 3D errors are sometimes lower but vary widely. This results from the fact that the left omni-camera is assumed to face forward in this case. Thus, if the left omni-camera is actually placed to face forward in the experiment, the error is lowered; otherwise, the error is large, as expected. Finally, when all the parameters θ1, θ2, and D are adapted, with the results shown by the purple curves, the 3D errors are lower than 8% even when the system orientation angle is large. This shows the feasibility, reliability, and validity of the proposed system adaptation method.

Chapter 6 Optimal Design and Placement of

From the document: A Study on Precise and Adaptive Omni-Vision Techniques and Applications (pp. 53-61)