
Three-dimensional ego-motion estimation from motion fields observed with multiple cameras

Yong-Sheng Chen, Lin-Gwo Liou, Yi-Ping Hung*, Chiou-Shann Fuh

Institute of Information Science, Academia Sinica, 128, Sec 2, Academia Road, Nankang, Taipei 11529, Taiwan
Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan

Received 15 May 2000; accepted 15 May 2000

* Corresponding author. Tel.: +886-2-27883799, ext. 1718; fax: +886-2-27824814.
E-mail address: hung@iis.sinica.edu.tw (Y.-P. Hung).

Abstract

In this paper, we present a robust method to estimate the three-dimensional ego-motion of an observer moving in a static environment. This method combines the optical flow fields observed with multiple cameras to avoid the ambiguity of 3-D motion recovery due to a small field of view and small depth variation in the field of view. Two residual functions are proposed to estimate the ego-motion for different situations. In the non-degenerate case, both the direction and the scale of the three-dimensional rotation and translation can be obtained. In the degenerate case, rotation can still be obtained but translation can only be obtained up to a scale factor. Both the number of cameras and the camera placement affect the accuracy of the estimated ego-motion. We compare different camera configurations through simulation. Some results of real-world experiments are also given to demonstrate the benefits of our method. © 2001 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.

Keywords: Ego-motion estimation; Multiple sensors; Optical flow

1. Introduction

Motion analysis is concerned with the estimation of the relative motion between an observer and objects. The relative motion is derived from the movement of the observer, the objects, or both. Usually, there are two stages in estimating the motion: first, finding the point correspondences or computing the optical flow field; second, interpreting the motion from the point correspondences [1–6] or the optical flow field [7–11]. Instead of calculating the point correspondences or the optical flow field as an intermediate result, some other methods estimate the motion directly from the spatial and temporal gradients [12,13]. In this paper, we concentrate on the estimation of the so-called ego-motion of an observer moving in a static environment by using the optical flow fields.

Ego-motion provides useful information for human computer interaction and vehicle navigation [14–18]. In the literature, Burger and Bhanu [14] computed the 2-D region of the focus of expansion (FOE) as the heading direction of a land vehicle from displacement vectors. Irani et al. [19] removed the effects of rotation by registering 2-D regions. Then they computed the camera translation from the epipolar field. Pei and Liou [18] estimated the vehicle-type motion by using image point and line features. These works estimated the motion according to the camera center, that is, rotation around the axis through the camera center followed by translation. In the application of human computer interaction and navigation, it is more desirable to compute the motion according to the observer's center [15].

One of the major problems in motion recovery is the ambiguity problem. Multiple kinds of motion induce similar optical flow fields and it is difficult to determine the motion from the observed optical flow field. Horn [20] and Brodsky et al. [21] stated that the motion fields and their directions are hardly ever ambiguous, but the ambiguity problem arises if the camera's field of view is



Fig. 1. An arbitrary configuration of K cameras.

small and the variation of the relative depth in the field of view is also small [5,22,23]. In the application of vehicle navigation, for example, consider an airplane with a camera looking down at the land or a car with a camera looking far away. If the field of view of the camera is not large enough, the depth map in the view is almost constant. Moreover, it is unsuitable to avoid this problem by using a camera with a large field of view because lens distortion and low resolution may seriously decrease the accuracy of the estimated optical flow.

Another problem in motion recovery is the scaling factor problem concerned with the depth and the translational motion. With only one camera, only the direction of the translation and the relative depth according to the camera center can be estimated [1,8]. The inverse depth and the translation are multiplied together and they can be determined only up to a scale factor.

In this work, we propose a robust method to estimate the three-dimensional ego-motion according to the specified observer center. Several cameras are mounted on the observer and are calibrated [24,25] according to the specified observer center. We use the optical flow fields observed with these cameras to avoid the ambiguity problem. Both the direction and the scale of the rotation and translation motion can be obtained by minimizing the proposed residual function of the non-degenerate case. In some special cases (degenerate cases), for example, when the cameras are not placed well or the observer is undergoing pure translation motion, another residual function can be used to determine the direction and the scale of rotation and the direction of translation.

In the following, we present the proposed method of ego-motion estimation for the non-degenerate and degenerate cases in Sections 2.1 and 2.2, respectively. Then we explain why the ambiguity problem can be avoided by using multiple cameras in Section 3. The number of cameras and their placement dramatically affect the accuracy of the estimated motion. We compare the performance of different camera configurations through simulation in Section 4. The results of real-world experiments shown in Section 5 demonstrate the benefits of our method. Finally, conclusions are stated in Section 6.

2. Ego-motion estimation

Consider an arbitrary configuration of $K$ cameras shown in Fig. 1. Without loss of generality, the focal length $f_k$ of the $k$th camera is set to 1. We want to estimate the ego-motion according to the global coordinate system, $C_g$, attached to the moving observer, where $C_g = \{O, I = [e_1\, e_2\, e_3]\}$, $O$ is the origin, $e_1 = [1, 0, 0]^T$, $e_2 = [0, 1, 0]^T$, and $e_3 = [0, 0, 1]^T$. The $k$th camera coordinate system, $C_k$, in $C_g$ can be expressed as $C_k = \{b_k, R_k = [u_{k1}\, u_{k2}\, u_{k3}]\}$. The $3 \times 1$ vector $b_k$ denotes the position of $O_k$ and $R_k$ is a $3 \times 3$ orthonormal matrix. These extrinsic camera parameters $b_k$ and $R_k$ of each camera are calibrated beforehand [24,25]. At any time instance, we compute the optical flow fields from the images captured from all the cameras. Let $N_k$ denote the number of image points where the optical flow vectors are calculated in the $k$th image. Our goal is to compute the 3-D ego-motion according to the global coordinate system from the $K$ optical flow fields.
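To make this setup concrete, the following minimal sketch (ours, not from the paper; Python with NumPy, and names such as `Camera` and `project` are hypothetical) shows one way to store the calibrated extrinsic parameters $\{b_k, R_k\}$ and to express image measurements in the normalized coordinates ($f_k = 1$) assumed throughout this section.

```python
# Minimal sketch (assumed helper code, not the authors' implementation):
# each camera k is stored as its calibrated extrinsics (R_k, b_k) relative to
# the observer-centered global coordinate system C_g, and image measurements
# are given in normalized coordinates, i.e. unit focal length.
from typing import NamedTuple
import numpy as np

class Camera(NamedTuple):
    R: np.ndarray  # 3x3 orthonormal matrix; columns are the camera axes expressed in C_g
    b: np.ndarray  # 3-vector; position of the camera origin O_k in C_g

def project(P_cam: np.ndarray) -> np.ndarray:
    """Project 3-D points given in a camera frame (rows of an Nx3 array)
    onto the normalized image plane at unit focal length."""
    return P_cam / P_cam[:, 2:3]
```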

The 3-D coordinates of a point $P$ in the coordinate systems $C_g$ and $C_k$ are $P = [P_x, P_y, P_z]^T$ and $P_k = [P_{xk}, P_{yk}, P_{zk}]^T$, respectively. These two coordinate vectors satisfy

    $P = R_k P_k + b_k$.    (1)

Relative to the global coordinate system $C_g$, the instantaneous 3-D ego-motion of the point $P$ in the static environment is

    $\dot{P} = -\omega \times P - t$,    (2)

where $\omega$ and $t$ denote the 3-D angular velocity and the translational velocity of the undergoing ego-motion [26]. This 3-D relative motion can also be expressed in the $k$th camera coordinate system as

    $\dot{P}_k = -\omega_k \times P_k - t_k$,    (3)

where

    $\omega_k = R_k^T \omega$ and $t_k = R_k^T [(\omega \times b_k) + t]$.    (4)

According to the perspective projection camera model, the 3-D point $P_k$ is projected onto the $k$th image plane at $p_k$, where

    $p_k \triangleq \begin{bmatrix} p_{xk} \\ p_{yk} \\ 1 \end{bmatrix} = \dfrac{1}{P_{zk}} P_k$.    (5)

After temporally differentiating both sides of Eq. (5) and substituting Eq. (3) into $\dot{P}_k$, we have

    $v_k \triangleq \dot{p}_k = -(\omega_k \times p_k) - \dfrac{\dot{P}_{zk}}{P_{zk}} p_k - \dfrac{1}{P_{zk}} t_k$,    (6)

where $v_k$ is the optical flow vector at the image point $p_k$. Applying the cross product with $p_k$ to Eq. (6), we obtain

    $p_k \times [v_k + (\omega_k \times p_k)] = -\dfrac{1}{P_{zk}} (p_k \times t_k)$.    (7)

Then we further apply the inner product with $t_k$ to both sides of Eq. (7) and derive the following fundamental equation, which does not contain the unknown depth $P_{zk}$:

    $\{p_k \times [v_k + (\omega_k \times p_k)]\} \cdot t_k = 0$.    (8)

The above equation is essentially the infinitesimal version of the epipolar constraint equation [27–29]. Since we want to estimate the 3-D ego-motion $\omega$ and $t$ according to the specified coordinate system, $C_g$, by using the observations from all the $K$ cameras, the above fundamental equation is re-expressed in terms of $\omega$ and $t$ by using Eq. (4):

    $\{R_k p_k \times [v_k + (R_k^T \omega \times p_k)]\} \cdot (\omega \times b_k + t) = 0$.    (9)

Once the camera parameters $R_k$ and $b_k$ of the $k$th camera are calibrated and the optical flow vector $v_k$ at the image position $p_k$ is estimated, Eq. (9) can be used to determine the 3-D motion parameters, $\omega$ and $t$, without recovering the depth of the image point.

Suppose there are $N_k$ flow vectors associated with the $k$th camera. We use $p_{ki}$ and $v_{ki}$ to represent the $i$th point and its optical flow associated with the $k$th camera. Eq. (9) can be rewritten as

    $m_{ki}^T (h_k + t) = 0$,    (10)

where

    $m_{ki} \triangleq R_k \{ p_{ki} \times [v_{ki} + (R_k^T \omega \times p_{ki})] \}$,    (11)

    $h_k \triangleq \omega \times b_k$.    (12)
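As a concrete reading of Eqs. (10)–(12), the short sketch below (our own helper functions with hypothetical names, written in Python/NumPy; not the authors' code) evaluates $m_{ki}$ and $h_k$ for a single measurement, so that the residual of Eq. (10) can be accumulated over all cameras and image points.

```python
import numpy as np

def m_vector(R_k, p_ki, v_ki, omega):
    """Eq. (11): m_ki = R_k { p_ki x [ v_ki + (R_k^T omega) x p_ki ] }."""
    return R_k @ np.cross(p_ki, v_ki + np.cross(R_k.T @ omega, p_ki))

def h_vector(b_k, omega):
    """Eq. (12): h_k = omega x b_k."""
    return np.cross(omega, b_k)

# For the true motion (omega, t), Eq. (10) states that the scalar
#     m_vector(R_k, p_ki, v_ki, omega) @ (h_vector(b_k, omega) + t)
# vanishes for every camera k and every image point i.
```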

2.1. Non-degenerate case

According to Eq. (10), we define a residual function $J_1$ which depends on the unknowns $\omega$ and $t$:

    $J_1(\omega, t) \triangleq \sum_{k=1}^{K} \sum_{i=1}^{N_k} [m_{ki}^T (h_k + t)]^2$.    (13)

Based on the least-squares criterion, the optimal estimates of $\omega$ and $t$ can be obtained by minimizing $J_1$. By letting $\partial J_1 / \partial t = 0$, we have

    $t = M^{-1} c$,    (14)

where

    $M \triangleq \sum_{k=1}^{K} \sum_{i=1}^{N_k} m_{ki} m_{ki}^T$ and $c \triangleq -\sum_{k=1}^{K} \sum_{i=1}^{N_k} m_{ki} m_{ki}^T h_k$.    (15)

In the following, the matrix $M$ is sometimes written as $M(\omega)$ to emphasize that $M$ is a function of $\omega$.

By substituting Eq. (14) into Eq. (13), we have a new residual function $J_2$ which only depends on the unknown angular velocity:

    $J_2(\omega) = -c^T M^{-1} c + \sum_{k=1}^{K} \sum_{i=1}^{N_k} (m_{ki}^T h_k)^2$.    (16)

Therefore, the optimal estimate of $\omega$ (denoted by $\hat{\omega}$) based on the least-squares criterion is the one that minimizes the residual function $J_2(\omega)$. Once we have $\hat{\omega}$, the estimate of $t$, denoted $\hat{t}$, can be easily obtained by using Eqs. (14) and (15).
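This two-step estimate maps directly onto code. The sketch below is our own illustration (Python with NumPy/SciPy; the function names and the choice of a generic Nelder–Mead minimizer are assumptions, not the authors' implementation): it builds $M(\omega)$ and $c(\omega)$ of Eq. (15), evaluates $J_2(\omega)$ of Eq. (16), and finally recovers $t$ from Eq. (14).

```python
import numpy as np
from scipy.optimize import minimize

def accumulate(omega, cams, obs):
    """Build M(omega) and c(omega) of Eq. (15) plus the constant term of Eq. (16).

    cams: list of (R_k, b_k); obs: list of (points N_k x 3, flows N_k x 3)."""
    M, c, const = np.zeros((3, 3)), np.zeros(3), 0.0
    for (R, b), (pts, flows) in zip(cams, obs):
        h = np.cross(omega, b)                                  # Eq. (12)
        for p, v in zip(pts, flows):
            m = R @ np.cross(p, v + np.cross(R.T @ omega, p))   # Eq. (11)
            M += np.outer(m, m)
            c -= np.outer(m, m) @ h
            const += (m @ h) ** 2
    return M, c, const

def J2(omega, cams, obs):
    """Eq. (16): J2(omega) = -c^T M^{-1} c + sum_k sum_i (m_ki^T h_k)^2."""
    M, c, const = accumulate(omega, cams, obs)
    return const - c @ np.linalg.solve(M, c)

def estimate_nondegenerate(cams, obs, omega0=np.zeros(3)):
    """Minimize J2 over omega (Nelder-Mead chosen only for illustration),
    then recover the full translation from Eq. (14): t = M^{-1} c."""
    result = minimize(J2, omega0, args=(cams, obs), method="Nelder-Mead")
    M, c, _ = accumulate(result.x, cams, obs)
    return result.x, np.linalg.solve(M, c)   # (omega_hat, t_hat)
```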

2.2. Degenerate case

In some situations, we cannot obtain $\omega$ and $t$ by solving Eqs. (14) and (16).

(1) $K = 1$: Only one camera is used in this case. Eq. (15) becomes $M = \sum_{i=1}^{N_1} m_i m_i^T$ and $c = -\sum_{i=1}^{N_1} m_i m_i^T h = -M h$. Eq. (16) then becomes $J_2(\omega) = -h^T M^T M^{-1} M h + h^T M h = 0$. Therefore, we cannot use this residual function, $J_2(\omega)$, when there is only one camera.

(2) $\forall k$, $h_k = 0$: Eq. (16) becomes $J_2(\omega) = 0$ and is useless. Three situations will suffer $h_k = 0$ for each $k$. First, $\omega = 0$, that is, there is no rotational motion (pure translational motion). Second, $b_k = 0$ for each $k$. This means that $O$ and every $O_k$ coincide at the same point. Third, $b_k \parallel \omega$ for each $k$.

(3) $\forall k$, $h_k = c_k t$: When $h_k$ is parallel to $t$, Eq. (10) becomes $(c_k + 1)\, m_{ki}^T t = 0$. In this case, only the direction of $t$ can be obtained.

We have to define a new residual function for the degenerate case to deal with the above-mentioned situations. Only situation (3) needs to be considered, because situation (1) is a special case of situation (2) by letting $O = O_1$, thus $b_1 = 0$, and situation (2) is a special case of situation (3) by letting $c_k = 0$. When $h_k = c_k t$, Eq. (10) can be reduced to the following form:

    $m_{ki}^T t = 0$ or $m_{ki}^T \bar{t} = 0$.    (17)

The second form of Eq. (17) indicates that only the translational direction is recoverable in these degenerate cases.

Similarly, we can define a residual function $J_3$ as

    $J_3(\omega, \bar{t}) \triangleq \sum_{k=1}^{K} \sum_{i=1}^{N_k} (m_{ki}^T \bar{t})^2$,    (18)


Fig. 2. Two cameras are mounted on the left and right side of the moving vehicle. Two types of motion, pure translation along the X-axis of GCS and pure rotation around the Z-axis of GCS, are under consideration.

where $\bar{t}$ is defined as the unit vector along the direction of the translation $t$. Expanding Eq. (18), we have

    $J_3(\omega, \bar{t}) = \bar{t}^T \left( \sum_{k=1}^{K} \sum_{i=1}^{N_k} m_{ki} m_{ki}^T \right) \bar{t} = \bar{t}^T M \bar{t}$,    (19)

where the Hermitian matrix $M$ is defined in Eq. (15). When $\bar{t} \neq 0$, the Rayleigh quotient, $\rho(\bar{t}) = \bar{t}^T M \bar{t} / \bar{t}^T \bar{t} = \bar{t}^T M \bar{t}$, is never smaller than the smallest eigenvalue, $\lambda_{\min}$, of the Hermitian matrix $M$. That is, the minimum value of the residual function $J_3(\omega, \bar{t})$ is the smallest eigenvalue of $M(\omega)$ [29,30].

Given an estimate of $\omega$, the best estimate of $\bar{t}$ should be the eigenvector of $M(\omega)$ corresponding to the smallest eigenvalue. We define a new residual function $J_4$ which only depends on the unknown $\omega$ as

    $J_4(\omega) \triangleq$ the smallest eigenvalue of $M(\omega)$.    (20)

Therefore, the optimal estimate of $\omega$, denoted by $\hat{\omega}$, is the one which minimizes the error function $J_4(\omega)$. The optimal estimate of $\bar{t}$ (denoted by $\hat{\bar{t}}$) is the eigenvector of $M(\hat{\omega})$ corresponding to the smallest eigenvalue.
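In code, the degenerate-case estimate follows the same pattern as before: build $M(\omega)$, take its smallest eigenvalue as $J_4(\omega)$, and read the translation direction off the corresponding eigenvector. The sketch below is our illustration (Python/NumPy/SciPy with hypothetical names), not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def M_matrix(omega, cams, obs):
    """M(omega) of Eq. (15): sum over cameras k and points i of m_ki m_ki^T."""
    M = np.zeros((3, 3))
    for (R, b), (pts, flows) in zip(cams, obs):
        for p, v in zip(pts, flows):
            m = R @ np.cross(p, v + np.cross(R.T @ omega, p))   # Eq. (11)
            M += np.outer(m, m)
    return M

def J4(omega, cams, obs):
    """Eq. (20): the smallest eigenvalue of the symmetric matrix M(omega)."""
    return np.linalg.eigvalsh(M_matrix(omega, cams, obs))[0]

def estimate_degenerate(cams, obs, omega0=np.zeros(3)):
    """Minimize J4 over omega; the unit translation direction is then the
    eigenvector of M(omega_hat) associated with the smallest eigenvalue."""
    result = minimize(J4, omega0, args=(cams, obs), method="Nelder-Mead")
    eigvals, eigvecs = np.linalg.eigh(M_matrix(result.x, cams, obs))
    return result.x, eigvecs[:, 0]   # (omega_hat, t_bar_hat)
```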

3. Motion field ambiguity

In this section, we explain through simulation why the ambiguity problem can be avoided by combining the optical flow fields observed with multiple cameras. Consider a moving vehicle with two cameras mounted on the left and right sides and looking outward, as Fig. 2 shows. Two types of motion are under consideration: one is the pure translation motion toward the front direction and the other is the pure rotation motion around the vertical axis of the vehicle. First, let us consider only the left camera (camera 1). The optical flow fields generated by the pure translation and the pure rotation are very similar, as shown in Figs. 3(a) and (b), if the field of view of the camera is not large enough (30° in this example) and the depth variation in the field of view is very small. From this single flow field, it is difficult to determine whether the motion is pure translation or pure rotation.

The ambiguity of motion recovery from the optical flow field is illustrated by the following simulation. The vehicle in Fig. 2 is moving straight forward with a velocity of 10 mm/s. The depth in the field of view is constant (2 m in this simulation). The optical flow field of camera 1 is used to recover the ego-motion of the vehicle by minimizing $J_4(\omega)$ (the degenerate case, because pure translation motion is considered). Gaussian noise with three different percentages of the length of the optical flow is applied to the optical flow field. The residual error of the function $J_4(\omega)$ is calculated from $\omega_z = -0.5$ to $0.5^\circ$/s ($\omega_x = \omega_y = 0$) and is plotted in Fig. 4. There are two local minima, that is, the two candidate motions are ambiguous. The first one is located near the true motion, $\omega_z = 0$. The residual error at $\omega_z = 0$ increases when a larger noise level is applied. The recovered direction of translation (the eigenvector of $M$) is $[1, 0, 0]^T$ when $\omega_z = 0$ and the noise level is 0. The second local minimum is located near the mistaken motion, $\omega_z = -0.28^\circ$/s. As the noise level increases, the residual error there remains low. The reason is that the noise of the flow can be interpreted as the result of a recovered translation vector, $[0, 1, 0]^T$, according to the global coordinate system of the vehicle. To sum up, when the field of view is small and the depth in the field of view is constant, pure translation motion is ambiguous with rotation motion. If the noise of the optical flow is not negligible and we search for the global minimum as the recovered motion, pure translation motion might be interpreted as rotation motion.

Next, let us consider the left and right cameras together on this moving vehicle. If there is only translation, the optical flows observed with the two cameras will be the same in scale but opposite in direction. If there is only rotation, the optical flows will be the same in both scale and direction, as shown in Fig. 3. Therefore, if we can combine the information contained in the two flow fields appropriately, a more precise and unique motion can be obtained.

The motion fields of the two cameras are used in another simulation and the residual error of $J_4(\omega)$ is plotted in Fig. 5. The single local minimum (near $\omega_z = 0$) means that there is no ambiguity and the accurate motion can be obtained.
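For readers who want to reproduce the two simulations above, the sketch below is entirely our own construction (Python/NumPy): the camera axes, the 30° sampling grid, the constant depth, and the noise handling are assumptions chosen to mimic the setup of Fig. 2, not the authors' code. It synthesizes the flow of the sideways-looking camera(s) under pure forward translation and scans $J_4$ over $\omega_z$, first with camera 1 alone and then with both cameras; per Figs. 4 and 5, the one-camera curve is expected to show two local minima while the two-camera curve has a single one.

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_field(R, b, omega, t, depth=2000.0, half_fov_deg=15.0, n=9):
    """Synthesize normalized image points and their optical flow (Eq. (6))
    for one camera viewing a scene of constant depth along its optical axis.
    Units follow the text: mm, mm/s and rad/s."""
    s = np.tan(np.radians(half_fov_deg))           # 30 degree field of view in total
    grid = np.linspace(-s, s, n)
    omega_k = R.T @ omega                          # Eq. (4)
    t_k = R.T @ (np.cross(omega, b) + t)           # Eq. (4)
    pts, flows = [], []
    for x in grid:
        for y in grid:
            p = np.array([x, y, 1.0])
            P = depth * p                          # 3-D point in the camera frame
            Pdot = -np.cross(omega_k, P) - t_k     # Eq. (3)
            v = Pdot / P[2] - (Pdot[2] / P[2]) * p # image motion; third component is 0
            pts.append(p)
            flows.append(v)
    return np.array(pts), np.array(flows)

def J4(omega, cams, obs):
    """Smallest eigenvalue of M(omega), i.e. the degenerate-case residual of Section 2.2."""
    M = np.zeros((3, 3))
    for (R, b), (pts, flows) in zip(cams, obs):
        for p, v in zip(pts, flows):
            m = R @ np.cross(p, v + np.cross(R.T @ omega, p))   # Eq. (11)
            M += np.outer(m, m)
    return np.linalg.eigvalsh(M)[0]

# Left camera (1) looks along +Y, right camera (2) along -Y; the vehicle's X axis is forward.
cam1 = (np.array([[1., 0., 0.], [0., 0., 1.], [0., -1., 0.]]), np.array([0., 100., 0.]))
cam2 = (np.array([[1., 0., 0.], [0., 0., -1.], [0., 1., 0.]]), np.array([0., -100., 0.]))

t_true = np.array([10., 0., 0.])                   # pure forward translation, 10 mm/s
obs = []
for R, b in (cam1, cam2):
    pts, flows = flow_field(R, b, np.zeros(3), t_true)
    noise = 0.05 * np.linalg.norm(flows, axis=1, keepdims=True) * rng.standard_normal(flows.shape)
    noise[:, 2] = 0.0                              # image flow has no component along the optical axis
    obs.append((pts, flows + noise))

for label, cams, o in (("camera 1 only", [cam1], obs[:1]), ("cameras 1 and 2", [cam1, cam2], obs)):
    wz = np.radians(np.linspace(-0.5, 0.5, 101))   # scan omega_z over +/- 0.5 deg/s
    curve = [J4(np.array([0.0, 0.0, w]), cams, o) for w in wz]
    print(label, ": global minimum of J4 at omega_z =",
          round(float(np.degrees(wz[int(np.argmin(curve))])), 3), "deg/s")
```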

4. Camera placement

Camera placement dramatically affects the robustness and accuracy of the ego-motion estimation with multiple cameras. In Section 2.2, we have described that in some


Fig. 3. (a) and (b) are the optical flow fields of camera 1 when the vehicle is translating (a) and rotating (b). They look very similar and it is difficult to distinguish between them. (c) and (d) are the optical flow fields of camera 2 when the vehicle is translating (c) and rotating (d). Considering the optical flow fields from both cameras together, their scales are the same in the two kinds of motion, but the directions are opposite only in the pure translation motion.

Fig. 4. The residual function $J_4(\omega)$ obtained by using the optical flow field of camera 1, for $\omega_z$ from $-0.5$ to $0.5^\circ$/s. The two local minima mean that the two kinds of motion are ambiguous.

Fig. 5. The residual function $J_4(\omega)$ obtained by using the optical flow fields of cameras 1 and 2, for $\omega_z$ from $-0.5$ to $0.5^\circ$/s. There is only one local minimum and the solution is unique.


Fig. 6. The seven cameras used for the simulation of camera placement.

Table 1
The average angles (in deg) between the estimated and the true translation directions for seven configurations of camera placement

Configuration   1      2      3      4      5      6        7
Cameras         1,2    1,3    1,4    1,2,3  1,2,5  1,2,3,6  1,2,3,5,6,7
1% noise        0.10   0.37   0.39   0.08   0.06   0.07     0.04
5% noise        0.51   6.56   6.47   0.56   0.32   0.46     0.23
10% noise       1.42   36.04  35.63  1.74   0.67   1.00     0.47

Table 2
The average angles (in deg) between the estimated and the true translation motions for seven configurations of camera placement

Configuration   1      2      3      4      5      6        7
Cameras         1,2    1,3    1,4    1,2,3  1,2,5  1,2,3,6  1,2,3,5,6,7
1% noise        39.91  1.46   18.81  5.41   12.76  0.39     0.17
5% noise        55.71  10.78  46.00  31.95  47.41  5.43     2.71
10% noise       56.15  24.24  52.37  46.55  52.79  13.82    11.83

special configurations of the camera placement, the problem even turns into the degenerate case and only the direction of translation can be estimated. In this section, we discuss which configurations of the camera placement give more accurate ego-motion estimates.

In Fig. 6, seven cameras are mounted on the observer and the viewing directions of cameras 1 to 7 are $Z$, $-X$, $-Z$, $Z$, $-Y$, $X$, and $Y$, respectively. The displacements between the origins of the camera coordinate systems and the origin of the observer are $[0, 0, 100]^T$, $[-100, 0, 0]^T$, $[0, 0, -100]^T$, $[100, 0, 100]^T$, $[0, -100, 0]^T$, $[100, 0, 0]^T$, and $[0, 100, 0]^T$, respectively. The optical flow fields of seven combinations of these cameras are used to compute the ego-motion and the accuracy of the results is compared.
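For reference, the seven camera extrinsics can be written out as follows. This is our own sketch (Python/NumPy): the paper specifies only each camera's viewing direction and origin, so the in-plane image orientation produced by the hypothetical `look_along` helper is an arbitrary assumption.

```python
import numpy as np

def look_along(direction):
    """Build an orthonormal R_k whose third column (the optical axis, cf.
    Section 2) points along `direction`; the in-plane image orientation is
    an arbitrary choice, since the paper does not specify it."""
    z = np.asarray(direction, float)
    z /= np.linalg.norm(z)
    up = np.array([0.0, 1.0, 0.0]) if abs(z[1]) < 0.9 else np.array([1.0, 0.0, 0.0])
    x = np.cross(up, z)
    x /= np.linalg.norm(x)
    return np.column_stack([x, np.cross(z, x), z])

# Viewing directions and origins of cameras 1-7 as listed above.
directions = [(0, 0, 1), (-1, 0, 0), (0, 0, -1), (0, 0, 1), (0, -1, 0), (1, 0, 0), (0, 1, 0)]
origins = [(0, 0, 100), (-100, 0, 0), (0, 0, -100), (100, 0, 100), (0, -100, 0), (100, 0, 0), (0, 100, 0)]
cameras = {k + 1: (look_along(d), np.array(b, float)) for k, (d, b) in enumerate(zip(directions, origins))}

# The seven evaluated combinations (configurations 1-7 of Tables 1-3).
configurations = {1: (1, 2), 2: (1, 3), 3: (1, 4), 4: (1, 2, 3),
                  5: (1, 2, 5), 6: (1, 2, 3, 6), 7: (1, 2, 3, 5, 6, 7)}
```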

In the degenerate case, 1000 random trials of 3-D translation motion (the range of each component of the translation is from $-15$ to $15$ mm/s) are generated and the optical flow fields for all the cameras are calculated. Gaussian noise of three different noise levels is applied to the optical flow fields. Because only the direction of the translation can be estimated in this case, we compare the performance of the seven configurations by calculating the angle between the estimated and the true translation directions, as shown in Table 1. From Table 1, we can observe that: (1) in general, more cameras provide more accurate motion estimation; and (2) orthogonal camera placements (configurations 1 and 5) are better than collinear (configuration 2) and coplanar (configurations 4 and 6) camera placements.

In the non-degenerate case, 1000 random trials of 3-D rotation and translation motion are generated. The range of each component of the rotation motion is from $-0.5$ to $0.5^\circ$/s and the range of each component of the translation motion is from $-15$ to $15$ mm/s. Again, the performance of the same seven camera configurations is compared, as shown in Table 2, by computing the angle between the estimated and the true translation directions. Because the three-dimensional translation including its scale can be obtained in this case, the distance between the estimated and the true translation motion is also compared in Table 3. Configurations 6 and 7, which use more cameras, still obtain more accurate motion. In this case, collinear (configuration 2) and coplanar (configurations 4 and 6) camera placements can obtain better


Table 3
The average distance (in mm) between the estimated and the true translation motions for seven configurations of camera placement

Configuration   1      2      3      4      5      6        7
Cameras         1,2    1,3    1,4    1,2,3  1,2,5  1,2,3,6  1,2,3,5,6,7
1% noise        12.99  5.36   9.74   6.36   8.94   4.44     4.03
5% noise        14.31  12.86  13.72  13.33  13.94  12.79    12.82
10% noise       14.58  14.26  14.30  14.41  14.54  14.27    14.27

Fig. 7. A picture of the IIS head.

Fig. 8. The coordinate systems and camera configuration of the real experiments.

motion estimation than orthogonal ones (configurations 1 and 5).

5. Experimental results

This section shows some results of real experiments. We used a well-calibrated binocular head [25] (referred to as the IIS head) to simulate a moving observer with two cameras mounted on it. The IIS head, built for active vision experiments, has four revolute joints and two prismatic joints, as shown in Fig. 7. The two joints on top of the IIS head are for camera vergence or gazing. The next two joints below them are for tilting and panning the stereo cameras. All of the above four joints are revolute and are mounted on an X–Y table which is composed of two prismatic joints. The lenses of the binocular head are motorized to focus on objects at different distances.

To simplify the coordinate transform, we let the global coordinate system and the left camera coordinate system be identical. Then the left camera coordinate system (LCCS) can be expressed as LCCS $= \{b_l, R_l\}$, where $b_l = 0$ and $R_l = I$. We let the angle between the optical axes of the left and right cameras be about $90^\circ$. Notice that the $z$-axis of the LCCS is the same as the optical axis of the left camera, the $x$-axis points toward the left side of the left camera, and the $y$-axis points toward the upper side of the left camera. The focal lengths of both cameras are 25 mm, and the fields of view are $15^\circ$. The coordinate systems and the camera configuration are illustrated in Fig. 8.

5.1. Experiment 1

We let the IIS head move forward, such that the left camera of the IIS head looks ahead and the right camera looks to the right. Table 4 lists the true motion parameters used in this experiment. We estimated the ego-motion for three cases: using the left camera only, using the right camera only, and using both the left and right cameras. The scenes viewed from the left and right cameras are shown in Figs. 9(a) and (b), respectively. The optical flow fields observed with the left and right cameras are shown in Figs. 9(c) and (d), respectively. The depth of the scene viewed from the left camera is in the range


Table 4
True motion parameters used in experiment 1

          Rotation (deg/frame)                      Translation direction                         Magnitude (mm)
          $\omega_x$   $\omega_y$   $\omega_z$      $\bar{t}_x$   $\bar{t}_y$   $\bar{t}_z$       $|t|$
          0.00         0.00         0.00            -0.017        0.045         1.00              20.00

Fig. 9. The images and the optical flow fields used in experiment 1: (a) The scene viewed from the left camera. (b) The scene viewed from the right camera. (c) The optical flow field obtained from the left camera. (d) The optical flow field obtained from the right camera. The optical flow vectors in the figures are enlarged by a factor of two.

Table 5
Rotational parameters estimated in experiment 1

          $\hat{\omega}$ (deg/frame)                               Error
          $\hat{\omega}_x$   $\hat{\omega}_y$   $\hat{\omega}_z$   $\|\hat{\omega} - \omega_{true}\|$
Both      0.017              -0.034             0.012              0.040
Left      0.00               -0.052             -0.052             0.073
Right     -0.012             0.22               0.00               0.22

from 1.3 to 1.5 m, while the depth of the scene viewed from the right camera is about 5 m.

Tables 5 and 6 list the estimates of the rotational parameters and translational parameters. The results

show that using both cameras performs better than using only one camera. The performance of using only the left camera is also acceptable because the translation direction is close to the optical axis of the left camera. When


Table 6
Translational parameters estimated in experiment 1. $\theta(\hat{\bar{t}}, \bar{t}_{true})$ is defined as the angle between $\hat{\bar{t}}$ and $\bar{t}_{true}$

          $\hat{\bar{t}}$ (translational direction)                    Error
          $\hat{\bar{t}}_x$   $\hat{\bar{t}}_y$   $\hat{\bar{t}}_z$    $\theta(\hat{\bar{t}}, \bar{t}_{true})$ (deg)
Both      0.040               0.054               1.00                 3.30
Left      0.062               0.049               1.00                 4.54
Right     -0.990              -0.051              0.10                 83.39

Table 7
True motion parameters used in experiment 2

          Rotation (deg/frame)                      Translation direction                         Magnitude (mm)
          $\omega_x$   $\omega_y$   $\omega_z$      $\bar{t}_x$   $\bar{t}_y$   $\bar{t}_z$       $|t|$
          0.017        0.50         -0.023          -0.62         -0.012        -0.78             1.89

Fig. 10. (a) The optical flow field obtained from the left camera. (b) The optical flow field obtained from the right camera. The optical flow vectors in the figures are enlarged by a factor of two.

Table 8
Rotational parameters estimated in experiment 2

          $\hat{\omega}$ (deg/frame)                               Error
          $\hat{\omega}_x$   $\hat{\omega}_y$   $\hat{\omega}_z$   $\|\hat{\omega} - \omega_{true}\|$
Both      0.00               0.52               -0.029             0.025
Left      0.0057             0.57               0.0057             0.086
Right     0.017              -0.0057            -0.011             0.50

Table 9
Translational parameters estimated in experiment 2. $\theta(\hat{\bar{t}}, \bar{t}_{true})$ is defined as the angle between $\hat{\bar{t}}$ and $\bar{t}_{true}$

          $\hat{\bar{t}}$ (translational direction)                    Error
          $\hat{\bar{t}}_x$   $\hat{\bar{t}}_y$   $\hat{\bar{t}}_z$    $\theta(\hat{\bar{t}}, \bar{t}_{true})$ (deg)
Both      -0.052              -0.034              -1.00                36.00
Left      0.049               0.055               -1.00                36.00
Right     0.083               0.99                -0.095               89.00

only the right camera is used, the ambiguity problem mentioned in Section 3 occurred and the translational motion was misclassified as rotational motion, because the field of view is relatively small and the depth variation in the field of view is also small.

5.2. Experiment 2

In experiment 2, we let the IIS head pan with a small angle. Table 7 lists the true motion parameters used in this experiment. Again, we estimated the ego-motion by using the left camera only, the right camera only, and both the left and right cameras, respectively. The scenes viewed from the left and right cameras are the same as Figs. 9(a) and (b). The flow fields observed with the left and right cameras are shown in Figs. 10(a) and (b), respectively.

The experimental results of this experiment are given in Tables 8 and 9. As expected, the experiment using both the left and right cameras obtains the most accurate motion. The ambiguity problem occurred again when using the right camera only because the depth of the scene viewed from the right camera was as far as 5 m. Hence, the optical flow field was very similar to the one caused by a small pure translation. Notice that the errors


of the translational direction are larger than the ones obtained in Experiment 1. The reason is that the magnitude of translation in this experiment was so small that the estimate of translation was seriously corrupted by the noise of the optical flow field.

6. Conclusions

In this paper, we have proposed a method for 3-D ego-motion estimation using a multiple-camera vision system. This method combines the information contained in the multiple optical flow fields observed with different cameras to avoid the ambiguity problem. Hence, the accuracy of the estimated motion can be improved. Two residual functions are proposed to deal with different cases: the non-degenerate case and the degenerate case. In the non-degenerate case, 3-D rotation and translation including their scales can be obtained. In the degenerate case, 3-D rotation and the direction of translation can be obtained. Simulations and real experiments show that using multiple cameras can provide a more robust and accurate estimate of ego-motion.

One potential application of our multiple-camera approach is the "inside-out" (or "outward-looking") head tracker for virtual reality. The current outward-looking head tracker requires structured environments, e.g. a regular pattern in the ceiling. Our approach does not require a specially designed environment, as long as the environment has enough features for computing optical flow.

Acknowledgements

The authors would like to thank Dr. Chu-Song Chen and An-Ting Tsao for helpful discussions. This work was supported in part by the National Science Council of Taiwan, under Grant NSC 86-2745-E-001-007.

References

[1] R.Y. Tsai, T.S. Huang, Uniqueness and estimation of three-dimensional motion parameters of rigid objects with curved surfaces, IEEE Trans. Pattern Anal. Mach. Intell. 6 (1) (1984) 13–27.

[2] J. Philip, Estimation of three-dimensional motion of rigid objects from noisy observations, IEEE Trans. Pattern Anal. Mach. Intell. 13 (1) (1991) 61–66.

[3] M.E. Spetsakis, Y. Aloimonos, Optimal visual motion estimation: A note, IEEE Trans. Pattern Anal. Mach. Intell. 14 (9) (1992) 959–964.

[4] J. Weng, P. Cohen, N. Rebibo, Motion and structure estimation from stereo image sequences, IEEE Trans. Robotics Automat. 8 (3) (1992) 362–382.

[5] J. Weng, N. Ahuja, T.S. Huang, Optimal motion and structure estimation, IEEE Trans. Pattern Anal. Mach. Intell. 15 (9) (1993) 864–884.

[6] R.J. Holt, A.N. Netravali, Number of solutions for motion and structure from multiple frame correspondence, Int. J. Comput. Vision 23 (1) (1997) 5–15.

[7] D.T. Lawton, Processing translational motion sequences, Comput. Vision Graphics Image Process. 22 (1) (1983) 116–144.

[8] D.J. Heeger, A.D. Jepson, Subspace methods for recovering rigid motion I: Algorithm and implementation, Int. J. Comput. Vision 7 (2) (1992) 95–117.

[9] R. Hummel, V. Sundareswaran, Motion parameter estimation from global flow field data, IEEE Trans. Pattern Anal. Mach. Intell. 15 (5) (1993) 459–476.

[10] L. Li, J.H. Duncan, 3-D translational motion and structure from binocular image flows, IEEE Trans. Pattern Anal. Mach. Intell. 15 (7) (1993) 657–667.

[11] S. Soatto, P. Perona, Recursive 3-D visual motion estimation using subspace constraints, Int. J. Comput. Vision 22 (3) (1997) 235–259.

[12] S. Negahdaripour, B.K.P. Horn, Direct passive navigation, IEEE Trans. Pattern Anal. Mach. Intell. 9 (1) (1987) 168–176.

[13] B.K.P. Horn, E.J. Weldon Jr., Direct methods for recovering motion, Int. J. Comput. Vision 2 (1) (1988) 51–76.

[14] W. Burger, B. Bhanu, Estimating 3-D ego-motion from perspective image sequences, IEEE Trans. Pattern Anal. Mach. Intell. 12 (11) (1990) 1040–1058.

[15] Y. Liu, T.S. Huang, Vehicle-type motion estimation from multi-frame images, IEEE Trans. Pattern Anal. Mach. Intell. 15 (8) (1993) 802–808.

[16] T. Viéville, E. Clergue, P.E.D.S. Facao, Computation of ego-motion and structure from visual and inertial sensors using the vertical cue, Proceedings of International Conference on Computer Vision, Berlin, Germany, April 1993, pp. 591–598.

[17] A. Giachetti, M. Campani, V. Torre, The use of optical flow for road navigation, IEEE Trans. Robotics Automat. 14 (1) (1998) 34–48.

[18] S.-C. Pei, L.-G. Liou, Vehicle-type motion estimation by the fusion of image point and line features, Pattern Recognition 31 (3) (1998) 333–344.

[19] M. Irani, B. Rousso, S. Peleg, Recovery of ego-motion using region alignment, IEEE Trans. Pattern Anal. Mach. Intell. 19 (3) (1997) 268–272.

[20] B.K.P. Horn, Motion fields are hardly ever ambiguous, Int. J. Comput. Vision 1 (3) (1987) 259–274.

[21] T. Brodsky, C. Fermüller, Y. Aloimonos, Directions of motion fields are hardly ever ambiguous, Int. J. Comput. Vision 26 (1) (1998) 5–24.

[22] G. Adiv, Inherent ambiguities in recovering 3-D motion and structure from a noisy flow field, IEEE Trans. Pattern Anal. Mach. Intell. 11 (5) (1989) 477–489.

[23] K. Daniilidis, H.-H. Nagel, The coupling of rotation and translation in motion estimation of planar surfaces, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, New York, June 1993, pp. 188–193.

[24] R.Y. Tsai, A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses, IEEE J. Robotics Automat. RA-3 (4) (1987) 323–344.

[25] S.-W. Shih, Y.-P. Hung, W.-S. Lin, Calibration of an active binocular head, IEEE Trans. Systems Man Cybernet. 28 (4) (1998) 426–442.

[26] R.M. Haralick, L.G. Shapiro, Computer and Robot Vision, Vol. 2, Addison-Wesley, Reading, MA, 1993.

[27] H.C. Longuet-Higgins, A computer algorithm for reconstructing a scene from two projections, Nature 293 (1981) 133–135.

[28] O. Faugeras, Three-Dimensional Computer Vision: A Geometric Viewpoint, The MIT Press, Cambridge, MA, 1993.

[29] K. Kanatani, Geometric Computation for Machine Vision, Clarendon Press, Oxford, 1993.

[30] B. Noble, J.W. Daniel, Applied Linear Algebra, Prentice-Hall, Englewood Cliffs, NJ, 1988.

About the Author: YONG-SHENG CHEN received his B.S. degree in Computer and Information Science from National Chiao Tung University, Taiwan, in 1993, and M.S. degree in Computer Science and Information Engineering from National Taiwan University, Taiwan, in 1995. He is currently a research assistant at the Institute of Information Science, Academia Sinica, Taiwan, and a Ph.D. student at National Taiwan University. His research interests include computer vision, visual tracking, visual surveillance, and human–machine interaction.

About the Author: LIN-GWO LIOU was born in Taiwan. He received his B.S. degree from National Chiao Tung University, Taiwan, in 1989 and his Ph.D. degree from National Taiwan University in 1995, both in Electrical Engineering. His research interests include motion image analysis, methods for 3-D object reconstruction, and pattern recognition in image applications.

About the Author: YI-PING HUNG received his B.S. in Electrical Engineering from National Taiwan University in 1982. He received an M.S. from the Division of Engineering, an M.S. from the Division of Applied Mathematics, and a Ph.D. from the Division of Engineering, all at Brown University, in 1987, 1988 and 1990, respectively. He then joined the Institute of Information Science, Academia Sinica, Taiwan, and became a research fellow in 1997. He has been teaching in the Department of Computer Science and Information Engineering at National Taiwan University since 1990, where he is now an adjunct professor. In 1997, he received the Outstanding Young Investigator Award of Academia Sinica. Dr. Hung has published more than 70 technical papers in the fields of computer vision, pattern recognition, image processing, and robotics. In addition to the above topics, his research interests include visual surveillance, virtual reality, human-computer interface, and visual communication.

About the Author: CHIOU-SHANN FUH received the B.S. degree in Computer Science and Information Engineering from National Taiwan University, Taipei, Taiwan, in 1983, the M.S. degree in Computer Science from the Pennsylvania State University, University Park, PA, in 1987, and the Ph.D. degree in Computer Science from Harvard University, Cambridge, MA, in 1992. He was with AT&T Bell Laboratories and engaged in performance monitoring of switching networks from 1992 to 1993. Since 1993, he has been an associate professor in the Computer Science and Information Engineering Department at National Taiwan University, Taipei, Taiwan. His current research interests include digital image processing, computer vision, pattern recognition, and mathematical morphology.
