Dynamic visual tracking control of a mobile robot with image noise and occlusion robustness


Chi-Yi Tsai *, Kai-Tai Song

Department of Electrical and Control Engineering, National Chiao Tung University, 1001 Ta Hsueh Road, Hsinchu 300, Taiwan, ROC

Article info

Article history:
Received 26 July 2007
Received in revised form 16 June 2008
Accepted 28 August 2008

Keywords:
Visual interaction model
Visual tracking control
Visual state estimation
Nonholonomic mobile robots
Temporary partial/full occlusion

Abstract

This paper presents a robust visual tracking control design for a nonholonomic mobile robot equipped with a tilt camera. The design allows the mobile robot to keep a dynamic moving target in the camera's field of view, even when the target is temporarily fully occluded. To achieve this, a control system consisting of a visual tracking controller (VTC) and a visual state estimator (VSE) is proposed. A novel visual interaction model is derived to facilitate the design of the VTC and the VSE. The VSE estimates the optimal target state and the target image velocity in the image space. The VTC then calculates the corresponding command velocities for the mobile robot in the world coordinates. The proposed VSE is not only robust against image noise but also overcomes the temporary occlusion problem. Computer simulations and practical experiments in which a mobile robot tracks a moving target have been carried out to validate the performance and robustness of the proposed system.

1. Introduction

In recent years, computer vision has become a major on-board sensor for autonomous mobile robots. Among the various applications of vision systems, visual tracking plays an important role in autonomous navigation and control. With visual tracking, a robot needs to focus on a target and interact with it accordingly. Thus, the study of visual tracking control, i.e. vision-based robot motion control to track a target of interest, has become an active area in robotics research [4–21]. Based on the motion constraints and the scenario of the robotic application, research on visual tracking control can be categorized into visual servoing for holonomic manipulators and visual tracking for nonholonomic mobile robots. Visual servoing techniques for holonomic manipulators and robot hands have been investigated extensively, and many powerful tools can be found in the literature [1–3]. However, the results for holonomic manipulators are unsuitable for most mobile robots because of the nonholonomic motion constraints on the mobile platform.

This paper addresses the problem of visual tracking control of a wheeled mobile robot equipped with an on-board monocular camera. Such a visual tracking control task encompasses several interesting issues, such as vision-based motion control of mobile robots, estimation of the Jacobian matrix, and uncertainties caused by image noise and occlusion during the visual tracking process. Most researchers focus on the design of visual tracking controllers to track a static object, such as a ground line, landmark, or reference image, for the purpose of visual navigation [4–7] or visual regulation (e.g. visual homing) [8–12]. Because tracking a dynamic moving target is an important requirement for many intelligent robots, some efforts focus on visual tracking control design for a moving (non-static) target for the purpose of visual formation control [13,14], visual interception [15,16], visual platooning [17] and human–robot interaction [18]. However, only a limited number of works deal with the external uncertainties that arise during the visual tracking process, such as image noise, varying illumination and temporary occlusion. To overcome the temporary partial occlusion problem, Malis et al. combined a model-free visual servoing method with a template-based visual tracking algorithm to build a flexible and robust visual tracking control system [19]. In their work, the visual tracking algorithm aims to estimate an optimal homography matrix between the reference pattern and the pattern in the current image even if the pattern is partially occluded. In [20], Comport et al. combined robust statistical techniques with a visual servoing control law in order to overcome the position uncertainty of image features caused by a varying light source and multiple occlusion problems. However, these reported systems might fail under full occlusion because they require feature matching. To resolve the temporary full occlusion problem, Han et al. proposed a differential approximation approach to measure the velocity of the target in the image frame (termed the target image velocity) [21]. When the target is fully occluded, the position of the target in the next image is estimated from the measured target image velocity. However, the estimate is very sensitive to image noise, which is a shortcoming of the differential approximation approach.
In the realization of tracking control schemes, it has been noted that image noise usually occurs during the visual tracking process of a mobile robot because of position-dependent and light-dependent external uncertainties. Therefore, their method cannot efficiently overcome the image noise caused by the external uncertainties. Further, in the estimation of the Jacobian matrix, the reported methods usually assume that the target is stationary during the visual tracking operation. However, when the target is non-stationary, the existing methods fail to provide a suitable solution.

Image and Vision Computing
0262-8856/$ - see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.imavis.2008.08.011
* Corresponding author.
E-mail addresses: [email protected] (C.-Y. Tsai), [email protected] (K.-T. Song).

From the above discussion, we note that it is still a challenge to develop a control scheme that is robust against image noise and temporary (partial/full) occlusion uncertainties during visual tracking tasks. These problems have not yet been addressed, which motivated us to design a robust visual tracking control system to track a dynamic moving target and overcome image noise and temporary occlusion. To achieve this, a novel image-based camera–object visual interaction model is derived in order to help the estimation of the Jacobian matrix under the non-stationary target condition. Next, a robust visual tracking control system, which consists of a visual tracking controller (VTC) and a visual state estimator (VSE), is developed based on the proposed visual interaction model. The VSE aims to estimate the optimal target state and target image velocity directly from the image plane, and the VTC then calculates the corresponding control velocities for the mobile robot. The main advantages of the proposed method compared to the existing methods are summarized in the following.

(1) The proposed method allows the robot to track a predictable as well as an unpredictable moving target. Compared to the method given in [16], where only straight motion of a unicycle in 2D space is reported, the proposed method can handle holonomic target motion in 3D space.

(2) Based on Kalman ﬁltering algorithm [22], the proposed method provides the best linear estimate of target state from the observed image sequence, which contains both random noise and temporary occlusion uncertainties. Therefore, the proposed system is robust to the uncertainties of image noise and temporary occlusion.

Note that the main issue addressed in this paper is the vision-based estimation problem for robotic visual tracking control applications. For a discussion of the vision-based control problem (e.g. the analysis of system parametric robustness), please refer to [23] for the technical details.

The rest of this paper is organized as follows. Section 2 derives the proposed visual interaction model. Section 3 presents the results of the VTC design. In Section 4, the design of the VSE is presented to estimate the optimal system state in the image plane for handling the uncertainties caused by image noise and temporary occlusion. In Section 5, computer simulations are employed to validate the system robustness against random image noise. Practical experiments using a robot to track a moving target have been conducted to verify the system robustness against temporary occlusion. Section 6 concludes this paper.

2. Camera–object visual interaction model

The basic assumptions of the proposed method are listed as follows:

(1) The proposed VSE has the same basic assumptions as Kalman filters in terms of Gaussian-distributed uncertainty, smooth motion, and a uniform sampling rate.

(2) The on-board camera is assumed to be a calibrated pinhole camera. Because the proposed VTC possesses some degree of robustness against parametric uncertainty [23], a simple linear camera calibration method [24] can be used to estimate the intrinsic parameters of the camera.

(3) In the derivations below, the width of the target is assumed to be a constant known a priori in order to simplify the depth estimation process. However, any algorithm or sensor that provides depth information can be combined with the proposed method.

In the following, the visual interaction model between a mobile robot and a dynamic moving target is derived. We first introduce the kinematics model of the wheeled mobile robot in relation to the target discussed in this paper. The mathematical derivations of the proposed model are then presented and explained.

2.1. Kinematics model of wheeled mobile robot and target

Fig. 1 illustrates the model of the wheeled mobile robot and the target considered in the nonholonomic visual tracking control problem. The wheeled mobile robot is equipped with a tilt camera to track a dynamically moving target, which is assumed to be a well-recognizable object with appropriate dimensions in the image plane and zero angular motion relative to the robot. The tilt camera is mounted on top of the mobile robot and its optical axis faces the target of interest, for instance, a human face. Fig. 1(a) shows the model of the wheeled mobile robot and the target in the world coordinate frame $F_f$ (see Fig. 2), in which the motion of the target is assumed to be holonomic such that

$$\dot{X}_f^t = V_f^t, \tag{1}$$

where $X_f^t$ and $V_f^t$ denote the position and the velocity of the target in the world coordinates.

Fig. 1. (a) A model of the wheeled mobile robot and the target in the world coordinate frame. (b) Side view of the wheeled mobile robot with a tilt camera mounted on top of it.

The motion of the wheeled mobile robot and the tilt camera is described by

$$\dot{X}_f^m = \begin{bmatrix} -v_f^m \sin\theta_f^m \\ 0 \\ v_f^m \cos\theta_f^m \end{bmatrix} \quad \text{and} \quad \dot{\theta}_f^m = w_f^m, \quad \dot{\phi} = w_t^m, \tag{2}$$

where $X_f^m = [\,x_f^m \ \ y_f^m \ \ z_f^m\,]^T$ is the position of the mobile robot in the world coordinates, $(\theta_f^m, \phi)$ represent the orientation angle of the mobile robot and the tilt angle of the onboard camera, $w_t^m$ is the tilt velocity of the camera, and $(v_f^m, w_f^m)$ are the linear and angular velocities of the mobile robot. In practice, $(v_f^m, w_f^m)$ [...] $(v_l^m, v_r^m)$ are the left- and right-wheel velocities, respectively, and $D$ represents the distance between the two drive wheels. In the rest of this paper, the target model (1) and the mobile robot model (2) will be utilized to derive the visual interaction model and to design the visual tracking control system.
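The relation between the wheel velocities and the body velocities $(v_f^m, w_f^m)$ can be illustrated with the textbook differential-drive model. The sketch below is an assumption made for illustration, not a reproduction of the paper's Eq. (3): the function names are hypothetical, and the sign convention (positive angular velocity when the right wheel is faster) may differ from the one used in the paper.

```python
def wheels_to_body(v_left, v_right, wheel_distance):
    """Textbook differential-drive model (assumed, not taken from the paper).

    Returns the (linear, angular) velocity of the robot center point.
    """
    linear = 0.5 * (v_right + v_left)
    angular = (v_right - v_left) / wheel_distance
    return linear, angular


def body_to_wheels(linear, angular, wheel_distance):
    """Inverse mapping: body velocities back to (left, right) wheel velocities."""
    half = 0.5 * wheel_distance * angular
    return linear - half, linear + half


# D = 40 cm as in the simulation parameters; wheel speeds in cm/s.
v, w = wheels_to_body(18.0, 22.0, 40.0)
v_left, v_right = body_to_wheels(v, w, 40.0)
```

The inverse mapping is what a controller would use to turn commanded body velocities into wheel commands.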

2.2. Kinematics of camera–object interaction in the camera coordinate frame

Fig. 2 illustrates the relationship between the world, camera and image coordinate frames. Let $X_f = X_f^t - X_f^m$ denote the relative position between the mobile robot and the target in the world coordinate frame. In order to describe a mobile robot interacting with the target in the image coordinate frame, a visual interaction model is derived by transferring the kinematics of $X_f$ from the world coordinate frame into the image coordinate frame. This subsection presents the transformation of the kinematics of $X_f$ from the world coordinate frame into the camera coordinate frame.

As shown in Fig. 2, $X_c = [\,x_c \ \ y_c \ \ z_c\,]^T$ denotes the relative position in the camera coordinate frame and can be calculated by the coordinate transformation such that

$$X_c = R(\phi, \theta_f^m) X_f - \delta Y, \tag{4}$$

where

$$R(\phi, \theta_f^m) = R(\phi) R(\theta_f^m) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\phi & -\sin\phi \\ 0 & \sin\phi & \cos\phi \end{bmatrix} \begin{bmatrix} \cos\theta_f^m & 0 & \sin\theta_f^m \\ 0 & 1 & 0 \\ -\sin\theta_f^m & 0 & \cos\theta_f^m \end{bmatrix}, \quad \delta Y = [\,0 \ \ \delta y \ \ 0\,]^T,$$

and $\delta y$ is the distance between the center of the robot tilt platform and the onboard camera. Because $\delta Y = [\,0 \ \ \delta y \ \ 0\,]^T$ is a constant translational vector, the derivative of (4) becomes

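$$\dot{X}_c = \left( \frac{\partial R(\phi, \theta_f^m)}{\partial \phi} \dot{\phi} + \frac{\partial R(\phi, \theta_f^m)}{\partial \theta_f^m} \dot{\theta}_f^m \right) X_f + R(\phi, \theta_f^m) \dot{X}_f, \tag{5}$$

where

$$\frac{\partial R(\phi, \theta_f^m)}{\partial \phi} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix} R(\phi, \theta_f^m) \equiv W_1 R(\phi, \theta_f^m), \quad \frac{\partial R(\phi, \theta_f^m)}{\partial \theta_f^m} = \begin{bmatrix} 0 & -\sin\phi & \cos\phi \\ \sin\phi & 0 & 0 \\ -\cos\phi & 0 & 0 \end{bmatrix} R(\phi, \theta_f^m) \equiv W_\phi R(\phi, \theta_f^m).$$

Substituting (1), (2) and (4) into (5), the kinematics of interaction between the robot and the target in the camera frame can be obtained such that

$$\dot{X}_c = (W_1 w_t^m + W_\phi w_f^m) R(\phi, \theta_f^m) X_f + R(\phi, \theta_f^m) \dot{X}_f = A_c X_c + B_c u + R(\phi, \theta_f^m) V_f^t, \tag{6}$$

where

$$A_c = W_1 w_t^m + W_\phi w_f^m = \begin{bmatrix} 0 & -w_f^m \sin\phi & w_f^m \cos\phi \\ w_f^m \sin\phi & 0 & -w_t^m \\ -w_f^m \cos\phi & w_t^m & 0 \end{bmatrix} \quad \text{and} \quad B_c = \begin{bmatrix} 0 & -\delta y \sin\phi & 0 \\ \sin\phi & 0 & 0 \\ -\cos\phi & 0 & \delta y \end{bmatrix},$$

and $u = [\,v_f^m \ \ w_f^m \ \ w_t^m\,]^T$ is the control velocity of the mobile robot and the on-board tilt camera. In the following, the kinematics model (6) will be used to derive the interaction model in the image coordinate frame.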

2.3. Kinematics of camera–object interaction in the image coordinate frame

In this section, the relative position $X_c$ is transformed into the image coordinate frame to derive the visual interaction model based on (6). We first define the system state in the image plane for the estimator and controller design. Fig. 3 illustrates the definition of the observed and desired system states in the image plane. In Fig. 3, $x_i$ and $y_i$ are, respectively, the horizontal and vertical coordinates of the centroid of the target in the image plane, and $d_x$ is the width of the target in the image plane. Similar to human visual tracking behavior, the purpose of the visual tracking control design is to drive the centroid position and width of the target from an initial state to the desired state in the image plane.

In the following, the visual interaction model is derived using (6) and the selected system state. Based on the pinhole camera model, the diffeomorphism (see [26] for a detailed description) in the image plane can be defined by the standard projection equations [24] such that:

Fig. 2. World (subscript f), camera (subscript c) and image (subscript i) coordinate frames of robotic visual interaction.


$$X_i = [\,x_i \ \ y_i \ \ d_x\,]^T = [\,k_x x_c \ \ k_y y_c \ \ k_x k_w z_c\,]^T, \quad k_x = f_x / z_c, \quad k_y = f_y / z_c, \quad k_w = W / z_c, \tag{7}$$

where $(f_x, f_y)$ represent the fixed focal lengths along the image x-axis and y-axis, respectively, and $W$ denotes the actual width of the target. By taking the derivative of (7), the kinematic relationship between the image and camera coordinate frames can be found such that
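The projection (7) can be sketched in a few lines. The function name `project_to_image_state` and the sample camera-frame position are illustrative assumptions; the focal lengths and target width are the values listed for the simulations.

```python
def project_to_image_state(xc, yc, zc, fx, fy, W):
    """Map a camera-frame position (xc, yc, zc) to the image state (xi, yi, dx), Eq. (7)."""
    if zc <= 0:
        raise ValueError("the target must be in front of the camera (zc > 0)")
    kx = fx / zc
    ky = fy / zc
    kw = W / zc
    # dx = kx * kw * zc, which simplifies to fx * W / zc.
    return kx * xc, ky * yc, kx * kw * zc


# Focal lengths (294, 312) pixels and W = 12 cm; the position (in cm) is made up.
xi, yi, dx = project_to_image_state(10.0, -5.0, 100.0, 294.0, 312.0, 12.0)
```

Note that the third state $d_x$ depends only on the depth $z_c$, which is what later allows the depth-dependent scalars to be recovered from the observed target width.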

$$\dot{X}_i = P_i \dot{X}_c, \quad P_i = \begin{bmatrix} k_x & 0 & -k_x f_x^{-1} x_i \\ 0 & k_y & -k_y f_y^{-1} y_i \\ 0 & 0 & -k_x f_x^{-1} d_x \end{bmatrix}. \tag{8}$$
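As a sanity check on (8), the matrix $P_i$ can be compared against a finite-difference derivative of the projection (7). This is a minimal sketch: the helper names, the sample position and the sample camera-frame velocity are assumptions chosen for illustration.

```python
def project(xc, yc, zc, fx, fy, W):
    """Image state [xi, yi, dx] from the camera-frame position, Eq. (7)."""
    kx, ky = fx / zc, fy / zc
    return [kx * xc, ky * yc, kx * W]


def interaction_matrix(xi, yi, dx, zc, fx, fy):
    """Matrix P_i of Eq. (8), obtained by differentiating Eq. (7)."""
    kx, ky = fx / zc, fy / zc
    return [
        [kx, 0.0, -kx * xi / fx],
        [0.0, ky, -ky * yi / fy],
        [0.0, 0.0, -kx * dx / fx],
    ]


def mat_vec(M, v):
    """Product of a 3x3 matrix (list of rows) and a 3-vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


fx, fy, W = 294.0, 312.0, 12.0
Xc = [10.0, -5.0, 100.0]   # camera-frame position (cm), made up
Vc = [3.0, -2.0, 4.0]      # camera-frame velocity (cm/s), made up
h = 1e-6                   # finite-difference step (s)

X0 = project(Xc[0], Xc[1], Xc[2], fx, fy, W)
moved = [p + h * v for p, v in zip(Xc, Vc)]
X1 = project(moved[0], moved[1], moved[2], fx, fy, W)

numeric = [(b - a) / h for a, b in zip(X0, X1)]
analytic = mat_vec(interaction_matrix(X0[0], X0[1], X0[2], Xc[2], fx, fy), Vc)
```

The two vectors agree up to the finite-difference error, which confirms the negative signs in the third column of $P_i$.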

The visual interaction model (9) indicates that the elements of the system matrix $A_i$ and the vector $C_i$ are functions of the target velocity. Thus, expression (9) can be rewritten as

$$\dot{X}_i = (A_i X_i + C_i) + B_i u = J_i V_f^t + B_i u. \tag{10}$$

Expression (10) shows that the visual interaction model consists of two parts: the part describing the target motion, $\dot{X}_i^t \equiv [\,\dot{x}_i^t \ \ \dot{y}_i^t \ \ \dot{d}_x^t\,]^T = J_i V_f^t$, and the effect of the mobile robot motion, $\dot{X}_i^m \equiv [\,\dot{x}_i^m \ \ \dot{y}_i^m \ \ \dot{d}_x^m\,]^T = B_i u$. Thus, (10) can be rewritten as a dual-Jacobian equation such that

$$\dot{X}_i = \dot{X}_i^t + \dot{X}_i^m = J_i V_f^t + B_i u, \tag{11}$$

where the matrix $J_i$, termed the target image Jacobian, transforms the target velocity $V_f^t$ into the target image velocity $\dot{X}_i^t$; the matrix $B_i$, termed the robot image Jacobian, transforms the mobile robot control velocity $u$ into the robot image velocity $\dot{X}_i^m$. In other words, the image velocity $\dot{X}_i$ is caused by a combination of the target image velocity $\dot{X}_i^t$ and the robot image velocity $\dot{X}_i^m$. Therefore, the visual interaction between the robot and the target in the image coordinate frame can be modeled as a dual-Jacobian visual interaction model (11), which combines the motion effect of the mobile robot with that of the moving target.

Remark 1. The scalars $k_x = f_x / z_c$ and $k_y = f_y / z_c$ in (7) depend on the depth between the camera and the target. The estimation of depth is a demanding task in visual tracking control design, especially when a single camera is used. Thus, an algorithm or sensor that provides depth information is usually adopted during the visual tracking process. In order to simplify the depth estimation process, an alternative is to assume that the width of the target is known a priori. The scalars $k_x$ and $k_y$ can then be calculated directly from the state variable $d_x$, based on the fact that $k_x = f_x / z_c = d_x / W$ and $k_y = k_x f_y / f_x$, where $W$ denotes the known width of the specific target.
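Remark 1 translates directly into a small helper. The name `depth_scalars` is hypothetical, and the numbers are the desired target width in the image (35 pixels) together with the simulation camera parameters; the sketch only restates the relations given in the remark.

```python
def depth_scalars(dx, fx, fy, W):
    """Recover kx, ky and the depth zc from the observed target width dx (Remark 1)."""
    if dx <= 0:
        raise ValueError("the observed target width must be positive")
    kx = dx / W          # kx = fx / zc = dx / W
    ky = kx * fy / fx    # ky = kx * fy / fx
    zc = fx / kx         # depth implied by kx = fx / zc
    return kx, ky, zc


# dx = 35 pixels, (fx, fy) = (294, 312) pixels, W = 12 cm.
kx, ky, zc = depth_scalars(35.0, 294.0, 312.0, 12.0)
```

With these values the implied depth is about 100.8 cm, i.e. the desired width of 35 pixels corresponds to keeping the target roughly one meter in front of the camera.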

3. Visual tracking controller (VTC)

In this section, a visual tracking control law for tracking a target of interest in the image plane is derived from the proposed dual-Jacobian visual interaction model (11) using feedback linearization and pole placement.

3.1. Dynamic error state model

In order to drive the system state variables from an initial state to the desired state, an error state model is formed to facilitate the tracking controller design. The error state in the image plane is defined as

$$X_e = [\,x_e \ \ y_e \ \ d_e\,]^T = [\,\bar{x}_i - \hat{x}_i \ \ \ \bar{y}_i - \hat{y}_i \ \ \ \bar{d}_x - \hat{d}_x\,]^T = \bar{X}_i - \hat{X}_i, \tag{12}$$

where $\bar{X}_i = [\,\bar{x}_i \ \ \bar{y}_i \ \ \bar{d}_x\,]^T$ is the vector of fixed desired states in the image plane (as shown in Fig. 3) and $\hat{X}_i = [\,\hat{x}_i \ \ \hat{y}_i \ \ \hat{d}_x\,]^T$ is the vector of the estimated states from the VSE (see Section 4). Using (10), a dynamic error state model in the image plane can be derived by taking the derivative of (12) such that

$$\dot{X}_e = -\dot{X}_i^t - \dot{X}_i^m = -J_i V_f^t - B_i u. \tag{13}$$

With the new error state $X_e$, the visual tracking control problem is transformed into a stability problem. If $X_e$ converges to zero, then the visual tracking control problem is solved.

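The target image Jacobian $J_i = P_i R(\phi, \theta_f^m)$ appearing in (10) and (11) is given by

$$J_i = \begin{bmatrix} k_x\left(\frac{x_i}{f_x}\cos\phi\sin\theta_f^m + \cos\theta_f^m\right) & -k_x\frac{x_i}{f_x}\sin\phi & k_x\left(-\frac{x_i}{f_x}\cos\phi\cos\theta_f^m + \sin\theta_f^m\right) \\ k_y\left(\frac{y_i}{f_y}\cos\phi\sin\theta_f^m + \sin\phi\sin\theta_f^m\right) & k_y\left(-\frac{y_i}{f_y}\sin\phi + \cos\phi\right) & k_y\left(-\frac{y_i}{f_y}\cos\phi\cos\theta_f^m - \sin\phi\cos\theta_f^m\right) \\ k_x\frac{d_x}{f_x}\cos\phi\sin\theta_f^m & -k_x\frac{d_x}{f_x}\sin\phi & -k_x\frac{d_x}{f_x}\cos\phi\cos\theta_f^m \end{bmatrix}.$$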

3.2. Feedback linearization and pole placement

Based on the dynamic error state model (13), we choose the feedback linearization control law for both the mobile robot and the tilt camera such that

[...] $(\alpha_1, \alpha_2, \alpha_3) > 0$ are positive constants, the estimated system state $\hat{X}_i(t)$ will converge exponentially to the desired system state $\bar{X}_i$. This implies that if the controller $u$ and the gain matrix $K_g$ are chosen as given in (14) and (15), respectively, the closed-loop visual tracking system described in (13) will be transformed into an asymptotically stable linear time-invariant (LTI) system and the visual tracking control problem is solved.
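The exponential convergence stated above can be illustrated numerically. The sketch below assumes that the closed loop reduces to the decoupled linear error dynamics $\dot{X}_e = -K_g X_e$ with $K_g = \mathrm{diag}(\alpha_1, \alpha_2, \alpha_3)$; this form, the initial error and the forward-Euler integration are assumptions made for illustration only. The gains and the sampling period are the simulation values.

```python
import math


def simulate_error_decay(x_e0, gains, T, steps):
    """Integrate dXe/dt = -Kg * Xe (Kg diagonal, assumed form) with forward Euler."""
    x_e = list(x_e0)
    history = [tuple(x_e)]
    for _ in range(steps):
        x_e = [(1.0 - a * T) * e for a, e in zip(gains, x_e)]
        history.append(tuple(x_e))
    return history


gains = (5.0 / 4.0, 3.0, 0.5)       # (alpha1, alpha2, alpha3)
T = 0.035                           # sampling period in seconds
x_e0 = (40.0, -30.0, 10.0)          # made-up initial error (pixels)
steps = 200

history = simulate_error_decay(x_e0, gains, T, steps)
final = history[-1]

# Continuous-time reference: Xe(t) = exp(-alpha * t) * Xe(0) at t = steps * T.
reference = [e0 * math.exp(-a * steps * T) for a, e0 in zip(gains, x_e0)]
```

Each component decays at its own rate $\alpha_k$, so the pole placement amounts to choosing how quickly the horizontal, vertical and width errors vanish.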

Remark 2. Although the proposed image control law (14) results in a smooth convergence in the image plane, it remains to be proven that the robot actually follows the target. The proof is presented in the Appendix.

3.3. Singularity analysis

The feedback linearization control law (14) poses a singularity problem for the matrix $B_i$. By directly computing the determinant of $B_i$, the sole singularity of $B_i$ can be found such that

$$f_y = (y_i + S d_x) \tan\phi, \tag{18}$$

where $S = (f_y \, \delta y) / (f_x W)$ is a fixed scalar factor. Since $k_x = f_x / z_c = d_x / W$ and $k_y = k_x f_y / f_x$, (18) can be rewritten such that

$$\tan\phi = \frac{f_y}{y_i + k_y \, \delta y}. \tag{19}$$

Moreover, since $f_y = k_y z_c$ and $y_i = k_y y_c$, (19) equals

$$\tan\phi = \frac{z_c}{y_c + \delta y}. \tag{20}$$

As shown in Fig. 4, let $\phi'$ be the angle related to the location of the target; we have the following geometric relationship:

$$\tan(\phi + \phi') = \frac{z_c}{y_c + \delta y}. \tag{21}$$

From (20) and (21), it is clear that the matrix $B_i$ becomes singular when $\phi'$ equals 0 or $\pi$. The physical meaning of these conditions is that the target is directly above or directly below the robot, and the robot will not be able to approach the target in any way because of insufficient degrees of freedom. Therefore, the robot will stop tracking temporarily under such circumstances.
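A controller can guard against the singular configuration by monitoring the residual of condition (18). The sketch below is illustrative: the function name and the sample geometry are assumptions, and the singular tilt angle is constructed from (20) to confirm that (18) and (20) describe the same condition.

```python
import math


def singularity_residual(yi, dx, phi, fx, fy, W, delta_y):
    """Return fy - (yi + S*dx)*tan(phi); a value of zero corresponds to Eq. (18)."""
    S = (fy * delta_y) / (fx * W)
    return fy - (yi + S * dx) * math.tan(phi)


fx, fy, W, delta_y = 294.0, 312.0, 12.0, 10.0

# Made-up camera-frame geometry (cm) and the corresponding image state, Eq. (7).
zc, yc = 100.0, 40.0
yi = (fy / zc) * yc
dx = fx * W / zc

# Tilt angle satisfying Eq. (20): tan(phi) = zc / (yc + delta_y).
phi_singular = math.atan2(zc, yc + delta_y)

r_singular = singularity_residual(yi, dx, phi_singular, fx, fy, W, delta_y)
r_regular = singularity_residual(yi, dx, 0.0, fx, fy, W, delta_y)
```

In practice one would compare the magnitude of the residual with a small threshold and suspend the control update when it is too close to zero; the threshold value is a design choice not specified here.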

4. Visual state estimator

In Section 3.2, the visual tracking control law (14) requires information about the target 3D velocity $V_f^t$ or the target image velocity $\dot{X}_i^t$. If $V_f^t$ is known, the first visual tracking control law (14-1) only needs an estimate of the target status $X_i$ to calculate the control signal $u$. However, in practical applications, it is difficult to estimate $V_f^t$ online in real time when using only one camera. In this situation, the second visual tracking control law (14-2) provides a useful solution, which only needs the target image velocity $\dot{X}_i^t$ in the image plane. In this section, two VSEs are proposed in order to estimate the information required by the VTC. The first VSE is developed under the condition that the target velocity $V_f^t$ is known, and the second VSE is designed by relaxing this condition, which facilitates more general applications of the proposed tracking control scheme in the image plane.

4.1. VSE design with target velocity information (VSE-WTV)

In the case that the target 3D velocity $V_f^t$ is known, the VSE-WTV can be designed based on the system model (9) to estimate the optimal target status $X_i$ in the image plane. To achieve this, a propagation model is required in order to facilitate the design of the VSE-WTV. This subsection presents the derivation of the required propagation model and the proposed VSE-WTV algorithm.

4.1.1. Propagation model for VSE-WTV

The first step in the derivation of the propagation model is to discretize the system model (9) into the corresponding discrete form. By the definition $\dot{x}(t) = \lim_{T \to 0} [x(t) - x(t - T)] / T$, where $T$ denotes the sampling time of the digital system, we can approximate the system model (9) as

$$X_i^p[n] = (I_3 + T A_i) \hat{X}_i[n-1] + T B_i u_{n-1} + T C_i \quad \text{for } n = 1, 2, \ldots, \tag{22}$$

where $X_i^p[n]$ is the propagated system state at a sample instant $n$, $I_3$ is a $3 \times 3$ identity matrix, $\hat{X}_i[n-1] = [\,\hat{x}_i \ \ \hat{y}_i \ \ \hat{d}_x\,]^T$ denotes the estimated system state at a sample instant $n-1$, and $u_{n-1} = [\,v_f^m \ \ w_f^m \ \ w_t^m\,]^T$ [...] $(\ldots, v_{rc}^m, w_{tc}^m)$ and velocity commands [...] $(\ldots, \delta w_f^m, \delta w_t^m)$ can be obtained by the velocity inverse transformation [...] is the estimated velocity error, and [...] where $W_{n-1} = E\{\delta u[n-1] \, \delta u^T[n-1]\}$ is the covariance matrix of the estimated velocity error. Applying (22) and (27), the system state and the corresponding covariance matrix can be propagated.

Fig. 4. Physical meaning of the singularity condition (18).

4.1.2. Observation and correction for VSE-WTV

In this section, the propagated system state and the propagation covariance matrix are corrected using the observation from the camera,

$$Z_n = I_3 X_i[n] + \delta Z_n,$$

where $\delta Z_n \sim N(0, R_n)$ denotes Gaussian observation uncertainty with zero mean and covariance matrix $R_n$ at a sample instant $n$. The correction procedure is given by

$$\hat{X}_i[n] = X_i^p[n] + K_n \left( Z_n - X_i^p[n] \right) \tag{28}$$

and

$$\hat{P}_n = (I_3 - K_n) P_n, \tag{29}$$

where $K_n$ is the Kalman gain matrix given by

$$K_n = P_n (P_n + R_n)^{-1}. \tag{30}$$

Finally, the corrected system state $\hat{X}_i[n]$ and the corresponding covariance matrix $\hat{P}_n$ are the optimal estimates at a sample instant $n$.

4.1.3. Summary of the proposed VSE-WTV algorithm

Based on the propagation equations (22) and (27) and the correction equations (28)–(30), the VSE-WTV can be summarized as follows:

(1) Assume that the initial position of the target is located in the field of view of the camera; then initialize the estimated system state $\hat{X}_i[0]$ and the propagation covariance matrix $P_0$ by the first observation such that $\hat{X}_i[0] = Z_0$ and $P_0 = R_0$. The proposed VTC starts to work.

(2) Compute the propagated system state $X_i^p[n]$ and the corresponding covariance matrix $P_n$ using (22) and (27), respectively.

(3) If the target to be tracked is detected in the observed image, then compute the Kalman gain matrix $K_n$ using (30); otherwise set $\hat{X}_i[n] = X_i^p[n]$ and $\hat{P}_n = P_n$, and go to step 5.

(4) Correct the estimated state vector $\hat{X}_i[n]$ and the corresponding covariance matrix $\hat{P}_n$ using (28) and (29), respectively.

(5) Let $\hat{X}_i[n-1] = \hat{X}_i[n]$ and $\hat{P}_{n-1} = \hat{P}_n$, then go to step 2.

Remark 3. Because the observation uncertainty usually varies with the conditions of the target motion (such as orientation and rotation of the target) and of the working environment (such as light variation and occlusion), the corresponding covariance matrix $R_n$ is time-varying across operating conditions. To handle this, a real-time self-tuning algorithm that chooses a suitable observation covariance matrix $R_n$ under varying environmental conditions has been proposed in [27] and is employed in this work. More technical details can be found in [27].

4.2. VSE design without target velocity information (VSE-WoTV)

The VSE-WoTV aims to estimate the optimal target status $X_i$ and the target image velocity $\dot{X}_i^t$ directly from the image space, without information about the target 3D velocity $V_f^t$. In this case, the dual-Jacobian visual interaction model (11) plays an essential role in the estimator design. The same procedure presented in Section 4.1 is adopted in the design of the VSE-WoTV algorithm.

4.2.1. Propagation model for VSE-WoTV

To derive the required propagation model for the design of the VSE-WoTV, we first discretize the system model (11) into the discrete form such that

$$X_i^p[n] = \hat{X}_i[n-1] + T \dot{X}_i^t[n-1] + T B_i u_{n-1} \quad \text{for } n = 1, 2, \ldots \tag{31}$$

Suppose that the target motion is close to a smooth motion within a sampling period; then the target image velocity can be approximated as a constant velocity between two consecutive sample instants:

$$\dot{X}_i^t[n] = \dot{X}_i^t[n-1]. \tag{32}$$

Based on (31) and (32), the propagation equation of the VSE-WoTV is given by

$$X_n^p = \begin{bmatrix} I_3 & T I_3 \\ 0_3 & I_3 \end{bmatrix} \hat{X}_{n-1} + \begin{bmatrix} T B_i \\ 0_3 \end{bmatrix} u_{n-1} \equiv A_{est} \hat{X}_{n-1} + B_{est} u_{n-1}, \tag{33}$$

where $(X_n^p)^T = [\,(X_i^p[n])^T \ \ (\dot{X}_i^t[n])^T\,]$ is the propagated system state at a sample instant $n$, $0_3$ is a $3 \times 3$ zero matrix, and $(\hat{X}_{n-1})^T = [\,(\hat{X}_i[n-1])^T \ \ (\dot{X}_i^t[n-1])^T\,]$ denotes the estimated system state at a sample instant $n-1$.

Next, the covariance matrix of the propagation equation (33) at a sample instant $n$ is given by

$$P_n = A_{est} \hat{P}_{n-1} A_{est}^T + Q_{n-1}, \tag{34}$$

where $\hat{P}_{n-1}$ is the estimated covariance matrix at a sample instant $n-1$, and $Q_n$ is the covariance matrix of the Gaussian propagation uncertainty. Note that (32) is an oversimplified assumption and will induce propagation error when the target motion is not smooth. However, this kind of error can be compensated by the observation information.

Remark 4. A major difference between the VSE-WTV and the VSE-WoTV is that the propagation covariance matrix of the VSE-WoTV includes the covariance matrix of the Gaussian propagation uncertainty, $Q_n$. The main reason is that if $V_f^t$ is known a priori, the prediction of the target state is more precise and has small uncertainty. Thus, the covariance matrix $Q_n$ can be approximated by the matrix $T^2 B_i W_{n-1} B_i^T$ in the propagation covariance matrix of the VSE-WTV. On the other hand, if $V_f^t$ is unknown, the uncertainty of the predicted target state becomes larger. Therefore, the propagation covariance matrix of the VSE-WoTV should take the covariance matrix $Q_n$ into account.

4.2.2. Observation and correction for VSE-WoTV

Since the observed image only contains information about the target status $X_i$ at each sample instant, the observation model of the VSE-WoTV is given by

$$Z_n = [\,I_3 \ \ 0_3\,] X_n + \delta Z_n \equiv H_{est} X_n + \delta Z_n, \tag{35}$$

where $\delta Z_n \sim N(0, R_n)$ is the observation uncertainty with zero mean and covariance matrix $R_n$. Based on Eqs. (33) to (35), the optimal estimate and the corresponding covariance matrix at a sample instant $n$ are given by

$$\hat{X}_n = X_n^p + K_n (Z_n - H_{est} X_n^p) \quad \text{and} \quad \hat{P}_n = (I_6 - K_n H_{est}) P_n, \tag{36}$$

where $K_n = P_n H_{est}^T (H_{est} P_n H_{est}^T + R_n)^{-1}$ is the Kalman gain matrix, and $I_6$ is a $6 \times 6$ identity matrix.

4.2.3. Summary of the proposed VSE-WoTV algorithm

Combining the propagation equations (33) and (34) with the correction equation (36), the processing steps of the VSE-WoTV are summarized as follows:

(1) Choose an initial covariance matrix $Q_0$.

(2) Assume that the initial position of the target is located in the field of view of the camera; then initialize the estimated system state $\hat{X}_0$ and the propagation covariance matrix $P_0$ by the first observation such that $\hat{X}_0 = [\,Z_0^T \ \ 0 \ \ 0 \ \ 0\,]^T$ and $P_0 = I_6$.

(3) Compute the ideal propagated state $X_n^p$ using (33) and the corresponding propagation covariance matrix $P_n$ using (34).

(4) If the target is detected in the observed image, then compute the Kalman gain matrix $K_n$ and calculate the optimal estimate $\hat{X}_n$ with the corresponding covariance matrix $\hat{P}_n$ using (36); else set $\hat{X}_n = X_n^p$ and $\hat{P}_n = P_n$; go to step 5.

(5) Let $\hat{X}_{n-1} = \hat{X}_n$, $\hat{P}_{n-1} = \hat{P}_n$ and $Q_{n-1} = Q_0$; go to step 3.

Fig. 5. Simulation setup for the performance evaluation of the proposed visual tracking control system.

Table 1
Parameters used in the simulations

Symbol                                  Quantity                   Description
$(f_x, f_y)$                            (294, 312) pixels          Camera focal length in retinal coordinates
$W$                                     12 cm                      Width of the target
$D$                                     40 cm                      Distance between two drive wheels
$\delta y$                              10 cm                      Distance between the center of robot tilt platform and the onboard camera
$T$                                     35 ms                      Sampling period of the control system
$(\bar{x}_i, \bar{y}_i, \bar{d}_x)$     (0, 0, 35)                 Desired system state in the image plane
$(\alpha_1, \alpha_2, \alpha_3)$        (5/4, 3, 1/2)              Positive control gains used in the experiments
$Q_0$                                   diag(1, 1, 1, 4, 4, 4)     Initial covariance matrix
$K_n$                                   15                         Noise gain
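The processing steps above can be sketched for a single image axis. This is a simplified illustration, not the authors' implementation: it assumes that $Q$, $R$ and the initial covariance are diagonal, so that the six-state filter of (33)–(36) decouples into three independent two-state filters (position and image velocity), and it passes the robot term $(B_i u)$ for that axis as a constant `bu`. The function names, the unit sampling period and the synthetic observations are all hypothetical.

```python
def propagate(x, v, P, T, bu, q_pos, q_vel):
    """Prediction for one axis, cf. Eqs. (33)-(34); P = (p00, p01, p10, p11)."""
    p00, p01, p10, p11 = P
    x_p = x + T * v + T * bu            # position advances with the image velocity
    P_p = (
        p00 + T * (p01 + p10) + T * T * p11 + q_pos,
        p01 + T * p11,
        p10 + T * p11,
        p11 + q_vel,
    )
    return x_p, v, P_p                  # velocity is held constant, Eq. (32)


def correct(x_p, v_p, P_p, z, r):
    """Correction for one axis with H = [1 0], cf. Eq. (36)."""
    p00, p01, p10, p11 = P_p
    s = p00 + r                         # innovation variance
    k0, k1 = p00 / s, p10 / s           # Kalman gain
    innovation = z - x_p
    P = ((1.0 - k0) * p00, (1.0 - k0) * p01, p10 - k1 * p00, p11 - k1 * p01)
    return x_p + k0 * innovation, v_p + k1 * innovation, P


def track_axis(observations, T, q_pos, q_vel, r, bu=0.0):
    """Run steps (2)-(5); an observation of None marks an occluded frame."""
    x, v = observations[0], 0.0         # step (2): initialize from the first observation
    P = (1.0, 0.0, 0.0, 1.0)            # identity initial covariance
    estimates = [(x, v)]
    for z in observations[1:]:
        x, v, P = propagate(x, v, P, T, bu, q_pos, q_vel)   # step (3)
        if z is not None:                                   # step (4)
            x, v, P = correct(x, v, P, z, r)
        estimates.append((x, v))                            # step (5)
    return estimates


# Target moving at 2 pixels per sample, fully occluded during samples 21-25.
observations = ([2.0 * n for n in range(21)]
                + [None] * 5
                + [2.0 * n for n in range(26, 31)])
estimates = track_axis(observations, T=1.0, q_pos=1.0, q_vel=4.0, r=1.0)
```

During the occluded frames the estimate is carried forward by the prediction alone, so the position keeps advancing with the last estimated image velocity until observations resume.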

5. Simulation and experimental results

Several computer simulations and practical experiments are presented in this section to validate the tracking performance and robustness of the proposed visual tracking control system. First, MATLAB was used to verify the tracking performance of the proposed VTC and the estimation performance of the proposed VSE-WoTV. Next, two experiments were performed on an experimental mobile robot to validate the robustness against the occlusion uncertainty. Because estimation without target velocity information is the more demanding case, only the simulation results for this case are presented. Practical experimental results of both the VSE-WTV and the VSE-WoTV are illustrated using a video clip and recorded photos.

Fig. 6. The computer simulation results of the proposed VTC combined with VSE-WoTV. (a) Robot trajectory in the world coordinate frame. (b) Control velocities of the center point and the tilt camera of tracking robot. (c) Tracking errors with random noise. (d) Tracking errors estimated by VSE-WoTV. (e) Estimated target image velocity. (f) Estimation errors.


5.1. Simulation results

In order to evaluate the performance of the proposed visual tracking control system, a simulation environment has been set up using MATLAB. Fig. 5 shows the architecture of the simulation setup. In Fig. 5, $X_n$, which includes the target state $X_i[n]$ and the target image velocity $\dot{X}_i^t$, denotes the ideal state to be estimated by the VSE-WoTV. $X_i[n]$ is obtained from the coordinate transformations (4) and (7), and $\dot{X}_i^t$ is calculated by (11) such that

$$\dot{X}_i^t = \dot{X}_i - \dot{X}_i^m = \frac{X_i[n] - X_i[n-1]}{T} - B_i u_{n-1}. \tag{37}$$
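Equation (37) is a finite difference with the robot-induced image velocity removed. The sketch below uses a made-up diagonal matrix in place of $B_i$ (whose actual entries depend on the current state and are not reproduced here), and the function name and sample values are assumptions.

```python
def target_image_velocity(X_now, X_prev, T, B_i, u_prev):
    """Eq. (37): finite-difference image velocity minus the robot contribution B_i * u."""
    robot_part = [sum(b * u for b, u in zip(row, u_prev)) for row in B_i]
    return [(a - b) / T - r for a, b, r in zip(X_now, X_prev, robot_part)]


# Placeholder values only: B_i here is NOT the paper's robot image Jacobian.
B_i = [[1.0, 0.0, 0.0],
       [0.0, 2.0, 0.0],
       [0.0, 0.0, 0.5]]
u_prev = [10.0, 0.5, 0.0]           # [v_f, w_f, w_t] at sample n-1
X_prev = [0.3, 2.0, 35.0]           # image state at sample n-1
X_now = [1.0, 2.0, 35.5]            # image state at sample n
T = 0.035                           # sampling period in seconds

velocity = target_image_velocity(X_now, X_prev, T, B_i, u_prev)
```

In the simulation this quantity serves as the ideal reference against which the velocity estimated by the VSE-WoTV is compared.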

The observation signal $Z_n$ is obtained by rounding the value of $X_i[n]$ with random noise (RN) to an integer. In this paper, the random noise is given by

[...] $v_f^t \cos\theta_f^t)$,

Fig. 7. Experimental mobile robots used to test the tracking performance of the proposed visual tracking control system. (a) Experimental robots used in experiment 1. The left robot is called tracking robot, and the right one is called target robot. (b) Experimental robot used in experiment 2.

Table 2
Parameters used in the experiments

Symbol                                  Experiment 1           Experiment 2                 Description
$(f_x, f_y)$                            (294, 312) pixels      (393.4, 391.8) pixels        Camera focal length in retinal coordinates
$W$                                     12 cm                  12 cm                        Width of the target
$D$                                     30 cm                  40 cm                        Distance between two drive wheels
$\delta y$                              10 cm                  10 cm                        Distance between the center of robot tilt platform and the onboard camera
$T$                                     100 ms                 100 ms                       Sampling period of the control system
$(\bar{x}_i, \bar{y}_i, \bar{d}_x)$     (0, 0, 35)             (0, 0, 35)                   Desired system state in the image plane
$(\alpha_1, \alpha_2, \alpha_3)$        (5/16, 6/8, 4/16)      (5/4, 3, 1/2)                Positive control gains used in the experiments
$Q_0$                                   ×                      diag(5, 5, 5, 20, 20, 20)    Initial covariance matrix

×: Do not care.


where $v_f^t = 20$ cm/s and $\theta_f^t(\mathrm{new}) = \theta_f^t(\mathrm{old}) + (T\pi/18)$ rad with $\theta_f^t(0) = 0$. From Fig. 6(a), we see that the motion trajectory of the tracking robot is also a circular path as a result of following the target. Fig. 6(b) shows the control velocities of the robot center point and the tilt camera. The simulation results reveal that the linear and angular velocities of the tracking robot converge to constant values when the tracking errors decay to zero. Therefore, the tracking robot keeps tracking the target continuously. Fig. 6(c) shows the tracking errors with random noise (39), and Fig. 6(d) shows the corresponding tracking errors estimated by the VSE-WoTV. In Fig. 6(c) and (d), the dotted lines illustrate the ideal tracking errors while the solid lines show the observation and estimation results of the tracking errors. A comparison of Fig. 6(c) with Fig. 6(d) shows that the random noise in each error state is largely removed, especially in the error states $y_e$ and $d_e$. Thus, the robustness of the proposed VSE-WoTV against the random noise uncertainty is verified. Moreover, in Fig. 6(d), each error state converges to zero exponentially, which validates the tracking performance of the proposed VTC. Fig. 6(e) and (f) present, respectively, the estimation results and the estimation errors of the target image velocity from the VSE-WoTV. In Fig. 6(e), the dotted lines indicate the ideal target image velocity while the solid lines show the estimation results of the target image velocity. It is clear that each estimate converges to the corresponding ideal value. This result can also be seen in Fig. 6(f), which shows that each estimation error converges to zero. Therefore, these simulation results validate the estimation performance of the proposed VSE-WoTV.

Fig. 9. Experimental results of tracking a moving target when it is temporarily partially occluded. (a) Before partial occlusion. (b)–(d) Partial occlusion occurred. (e)–(f) After partial occlusion, the moving target is still under tracking.

Fig. 10. Experimental results of tracking a moving target when it is temporarily fully occluded. (a) Before full occlusion. (b)–(e) Full occlusion occurred. The moving target is estimated only using the prediction information. (f) After full occlusion, the moving target is still under tracking.
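The circular target motion used in the simulation of Section 5.1 (speed 20 cm/s, heading increased by $T\pi/18$ rad per sample, initial heading 0) can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: a planar world frame with velocity components $(v\cos\theta, v\sin\theta)$, a start position at the origin, and $T = 0.1$ s are all assumed here, since the paper's own MATLAB setup and Table 1 are not reproduced in this section.

```python
import math

def simulate_target(v=20.0, T=0.1, steps=360):
    """Integrate a planar target moving at speed v (cm/s) whose heading
    grows by T*pi/18 rad at every sample."""
    x, y, theta = 0.0, 0.0, 0.0      # theta(0) = 0; the start position is hypothetical
    path = [(x, y)]
    for _ in range(steps):
        x += v * math.cos(theta) * T  # forward-Euler position update
        y += v * math.sin(theta) * T
        theta += T * math.pi / 18.0   # heading increment per sample
        path.append((x, y))
    return path

path = simulate_target()
radius = 20.0 / (math.pi / 18.0)      # speed divided by turn rate, in cm
```

Because the heading increment scales with $T$, the turn rate is $\pi/18$ rad/s regardless of the sampling period, so the target moves on a circle of radius $360/\pi \approx 114.6$ cm and closes the loop after 36 s.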

5.2. Experimental results

Two experiments have been carried out using an experimental mobile robot to validate the performance of the proposed control scheme: the first experiment is to track a moving object, and the second one is to track a moving person. Fig. 7(a) and (b) show the mobile robots of experiments 1 and 2, respectively. In Fig. 7(a), the left robot (called the tracking robot) is equipped with a USB camera and a tilt camera platform to track another robot, on which a cylindrical object of interest was installed (called the target robot). Fig. 7(b) shows another experimental robot constructed to serve as a test bed for the study of visual tracking of a moving target without its velocity information. Table 2 tabulates the parameters used for the tracking robot in the experiments. Note that the processing time of the proposed visual tracking control system (including target detection, estimation and control computations) is less than 50 ms. This means that the overall tracking system has an acceptable computational load and can track the target in real time. However, the sampling period of the control system, $T$, was set to 100 ms in the experiments due to other image processing computations such as image compression and storage. In the following, the experimental results of visual tracking with occlusion are presented to validate the system performance and robustness.

5.2.1. Experiment 1: Visual tracking of a moving object

This section presents the experimental results of tracking a moving target with the target temporarily partially and fully occluded. Fig. 8 shows the block diagram of the implemented visual tracking control system in experiment 1. In this experiment, we set another robot as the moving target with a priori known motion velocity in order to verify the performance of the VSE-WTV proposed in Section 4.1. A cylindrical object was placed on the robot to facilitate easy recognition of the target, which is moving along a circular path with velocity $v_f^t = 10.5$ cm/s and $\theta_f^t(\mathrm{new}) = \theta_f^t(\mathrm{old}) + 0.01$ rad with $\theta_f^t(0) = \pi$. The information on the target velocity is then used in the VSE-WTV to estimate the state of the target and overcome the occlusion problem even if the target is temporarily fully occluded.

Fig. 11. Experimental results of the proposed VTC combined with the VSE-WTV. (a) Command velocities of mobile robot and tilt camera. (b) Estimated (solid lines) and observed (dotted lines with spikes) tracking errors.

Fig. 9 illustrates the partial occlusion experiment recorded by the tilt camera on-board the tracking robot (the robot with a camera). Fig. 9(a) shows the tracked target before partial occlusion. In Fig. 9(b)–(d), the moving target is temporarily partially occluded by a moving object. Fig. 9(e) and (f) show that the moving target is still tracked after partial occlusion. In Fig. 10, the target was fully blocked by a moving person. Fig. 10(a) shows the tracked target before full occlusion. In Fig. 10(b)–(e), the moving target is temporarily fully occluded by a passing person. Since the target is not observable in the observed image, the VSE-WTV estimated the moving target using only prediction information. Hence, the moving target is still tracked even though it is unobservable. Fig. 10(f) shows that the moving target is tracked successfully after full occlusion.

Fig. 13. Experimental results of tracking a moving person. (a)–(f) Image sequence recorded from a DV camera. (a)–(b) The tracking robot started to track the user. (c)–(e) Full occlusion occurred when another person walked across temporarily. (f) The tracking robot still tracked the user after full occlusion.

Fig. 14. Experimental results of tracking a moving person. (a)–(f) Image sequence recorded from on-board USB camera. (a)–(b) The tracking robot started to track the user. (c)–(e) Full occlusion occurred when another person walked across temporarily. (f) The tracking robot still tracked the user after full occlusion.

Fig. 11 presents the recorded experimental results of tracking a moving object. Fig. 11(a) shows the control velocities of both the mobile robot and the tilt camera. Fig. 11(b) shows a comparison between the estimated error states (the solid lines) and the observed ones (the dotted lines with spikes). From Fig. 11(b), we see that the random noise caused by the temporary occlusion effect is removed efficiently by utilizing the proposed VSE-WTV. Therefore, based on the above occlusion experiments, the robust estimation performance of the proposed visual tracking control system is verified. A video clip of the experimental results is available online [28].

5.2.2. Experiment 2: Visual tracking of a moving person

In this section, the tracking performance of the proposed VTC combined with the VSE-WoTV is demonstrated by tracking a moving person. In order to detect and track the person in the image plane, the visual tracking control system is combined with a real-time face detection and tracking algorithm presented in our previous work [29]. Fig. 12 illustrates the complete visual tracking system, which encompasses the face detection/tracking algorithm, the VSE-WoTV described in Section 4.2 and the VTC presented in Section 3. Because the velocity of human motion is unknown, the VSE-WoTV estimates the target image velocity, instead of the motion velocity, for use by the VTC and thus overcomes the temporary occlusion problem.

Figs. 13 and 14 show the recorded images of the mobile robot interacting with the moving person in the experiment. Fig. 13(a)–(f) show the recorded photos of the experimental scenario, and Fig. 14(a)–(f) are the corresponding pictures recorded by the on-board USB camera. In the beginning, the person sat on a stool, and the robot started to track his face using the proposed visual tracking control system (Figs. 13(a) and 14(a)). Next, the person stood up to walk around in the room, and the mobile robot kept following and tracking the person's face with the tilt camera (Figs. 13(b) and 14(b)). When the person was walking, another person passed between the tracked person and the robot temporarily (Fig. 13(c)–(e)). Thus, in Fig. 14(c)–(e), the person's face was temporarily fully blocked by the passing person. With the proposed VSE-WoTV algorithm, the propagation information dominates the estimation results in this situation even if the target is fully unobservable. Therefore, the VSE-WoTV still estimated the positions and velocities of the person's face in the image plane successfully even during the temporary full occlusion conditions. Finally, the person sat down on the stool, and the robot tracked him continuously.

Fig. 15 presents the recorded experimental results of tracking a moving person. Fig. 15(a) shows the control velocities of both the mobile robot and the tilt camera. Fig. 15(b) compares the estimated tracking errors (solid lines) with the observed ones (dotted lines with spikes). From Fig. 15(b), it is clear that the random noise is also removed efficiently by the proposed VSE-WoTV algorithm. This occlusion experiment validates the robust estimation performance of the proposed visual tracking control system. A video clip of this experiment is available online in [28].

Fig. 15. Experimental results of the proposed VTC combined with the VSE-WoTV. (a) Command velocities of the mobile robot and the tilt camera. (b) Estimated (solid lines) and observed (dotted lines with spikes) tracking errors.

Fig. 16. Experimental pan-tilt platform used to demonstrate the robust property of the proposed visual tracking scheme.

Remark 5. The main differences between the proposed method and the existing video color object tracking (VCOT) methods, such as the CamShift algorithm [30], are twofold. First, the existing VCOT methods usually assume that the target is already located in the camera's field of view and do not consider the camera motion effect. In contrast, the proposed method considers both camera and target motion effects to increase the tracking performance and system robustness. Second, the existing VCOT methods usually do not deal with the temporary full occlusion problem, whereas the proposed method uses the propagation information to deal with it. Moreover, the propagation covariance matrix can be used to evaluate the reliability of the tracking state under full occlusion. For example, if one of the diagonal values of the propagation covariance matrix is larger than a preset threshold, then the propagation is not reliable, and thus the visual tracking control system should be stopped and reinitialized.
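The reliability test mentioned in Remark 5 amounts to comparing the diagonal of the propagation covariance matrix with a threshold. A minimal sketch follows; the function name, the threshold of 50 and the sample matrices are hypothetical and only illustrate the idea.

```python
import numpy as np

def propagation_is_reliable(P, threshold):
    """Return False as soon as any diagonal entry (variance) of the
    propagation covariance matrix P exceeds the preset threshold."""
    return bool(np.all(np.diag(np.asarray(P, float)) <= threshold))

# Hypothetical covariance matrices for a six-dimensional state.
P_ok = np.diag([5.0, 5.0, 5.0, 20.0, 20.0, 20.0])
P_bad = np.diag([5.0, 80.0, 5.0, 20.0, 20.0, 20.0])

reliable_ok = propagation_is_reliable(P_ok, threshold=50.0)
reliable_bad = propagation_is_reliable(P_bad, threshold=50.0)
```

In a tracking loop, a `False` result would be the cue to stop the controller and reinitialize the tracker, as the remark suggests.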

5.2.3. Experiment of occlusion-robustness property

Since the current VSE design is based on the Kalman filter algorithm, the estimation performance is dependent on the accuracy of the covariance matrices $P_n$ and $R_n$. In order to demonstrate this property, the proposed visual tracking control system is extended to control a pan-tilt camera platform in this experiment. Fig. 16 shows the experimental pan-tilt platform equipped with a camera to track the face of a user. The control velocities of the pan-tilt platform can be computed by simplifying the proposed control law (14) such that

\[ \begin{bmatrix} w_f^{pan} \\ w_f^{tilt} \end{bmatrix} = \begin{bmatrix} B_{12} & B_{13} \\ B_{22} & B_{23} \end{bmatrix}^{-1} \begin{bmatrix} \alpha_1 x_e - \dot{x}_i^t \\ \alpha_2 y_e - \dot{y}_i^t \end{bmatrix}, \tag{39} \]

where $w_f^{pan}$ is the pan control velocity, $w_f^{tilt}$ is the tilt control velocity, and $B_{mn}$ denotes the element of matrix $B_i$ in the $m$th row and $n$th column.
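One possible reading of the pan-tilt law (39) is sketched below. It assumes that the right-hand side has the form $\alpha\,e - \dot{X}_i^t$ in each channel; the signs inside that vector are an assumption of this sketch, as are the function name and every numerical value, including the entries of the matrix `B`.

```python
import numpy as np

def pan_tilt_velocities(B, err, target_vel, gains):
    """Solve the 2x2 system of (39) for the pan and tilt velocities.

    B          : 3x3 input matrix B_i (rows/columns numbered from 1 in the text)
    err        : (x_e, y_e) tracking errors in the image plane
    target_vel : estimated target image velocity along x and y
    gains      : (alpha_1, alpha_2) positive control gains
    """
    B = np.asarray(B, float)
    B_pt = np.array([[B[0, 1], B[0, 2]],    # B12, B13
                     [B[1, 1], B[1, 2]]])   # B22, B23
    rhs = np.array([gains[0] * err[0] - target_vel[0],   # assumed sign convention
                    gains[1] * err[1] - target_vel[1]])
    return np.linalg.solve(B_pt, rhs)       # avoids forming the inverse explicitly

# Hypothetical input matrix and signals.
B = [[0.0, 2.0, 0.0],
     [0.0, 0.0, 4.0],
     [1.0, 0.0, 0.0]]
w_pan, w_tilt = pan_tilt_velocities(B, err=(8.0, -4.0),
                                    target_vel=(1.0, 2.0), gains=(0.5, 0.5))
```

Using `np.linalg.solve` rather than an explicit matrix inverse is a common numerical choice; it fails loudly if the 2x2 block is singular.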

Fig. 17 presents the experimental results. Fig. 17(a)–(c) show the recorded images, in which the green and magenta windows indicate the observation and propagation, respectively. Fig. 17(d) and (e), respectively, illustrate the variance value of state $x_i$ in the propagation and observation covariance matrices. Since the face tracking algorithm employed in the current system only uses the skin color to detect the human face in a local search window, the algorithm will track another person's face when it moves between the user's face and the camera. In this situation, the variance value of the observation covariance matrix increases greatly due to the rapid change in the observation. Thus, in Fig. 17(e), we see that the variance value of the observed state $x_i$ (denoted by $R_x$) increases rapidly due to the sudden change in observation. On the other hand, the variance value of the propagated state $x_i$ (denoted by $P_x$) increases smoothly and remains much smaller than the observed one. Therefore, after the correction step in the Kalman filtering algorithm, the propagation state dominates the estimation result, which tracks the correct user's face. Finally, in Fig. 17(c), the face tracking algorithm detects the human face close to the estimation result, and the observation is corrected. A video clip of this experiment is available online in [28].

Fig. 17. The experimental results of occlusion using VSE-WoTV. (a)–(c) Recorded camera view, observation states and propagation states, (d) variance of propagation states, (e) variance of observation states.

Fig. 18. Simulation result of the distance between mobile robot and motion target, $\|X_f\|$.
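The reason the propagation dominates when $R_x$ is much larger than $P_x$ can be seen from a scalar Kalman correction step. The following sketch uses the textbook scalar form, not the paper's full VSE, and all numbers are hypothetical.

```python
def kalman_correct(x_pred, P, z, R):
    """Textbook scalar Kalman correction.

    x_pred, P : propagated state and its variance
    z, R      : observation and its variance
    """
    K = P / (P + R)                    # Kalman gain, between 0 and 1
    x_est = x_pred + K * (z - x_pred)  # blend prediction and observation
    P_est = (1.0 - K) * P              # variance after the correction
    return x_est, P_est, K

# Hypothetical case: the observation jumps to another face (z = 60 pixels)
# while its variance R is about 100 times the propagation variance P.
x_est, P_est, K = kalman_correct(x_pred=0.0, P=4.0, z=60.0, R=396.0)
```

Here the gain is 0.01, so the estimate moves only 0.6 pixels toward the outlying observation and stays close to the propagated state, which mirrors the behaviour reported for Fig. 17.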

Remark 6. If there is an object with features and motion similar to those of the target, then the proposed VSE may track this object when it moves between the target and the camera. However, this problem can be resolved by combining an object recognition algorithm with the visual tracking algorithm. In this paper, we do not cover the object recognition problem and focus only on the visual tracking problem.

6. Conclusion and future work

A novel visual tracking control model of visual interaction between a mobile robot and a dynamic moving object in the image plane has been derived in this paper. Based on the control model, a tracking controller is proposed to resolve visual tracking control of a dynamic moving target with asymptotic convergence. Based on the proposed visual interaction model, two visual state estimators have been proposed to estimate the optimal system state from the noisy observation and temporary occlusion during visual tracking operation. The merit of this design is that the image processing procedures are much simplified for the desired motion control in the world coordinates due to image-based computation. Simulation and experimental results of tracking a moving target validate the performance and robustness of the proposed control schemes.

Since the performance of the current VSE design is restricted by the assumptions of the Kalman filter (e.g. Gaussian-distributed uncertainty, smooth motion, and a uniform sampling rate), future work will focus on developing other types of VSE, such as a neural-network-based VSE, to relax these restrictions and improve the accuracy of the visual estimation results.

Acknowledgements

The authors thank Fu-Sheng Huang, Chen-Yang Lin, and Chun-Wei Chen for their assistance in the experiments. This work was supported by the Ministry of Economic Affairs under Grant 95-EC-17-A-04-S1-054 and the National Science Council of Taiwan, ROC under Grant NSC 95-2218-E-009-024.

Appendix

In this appendix, we show that when the system state $X_i$ converges to the desired system state $\bar{X}_i$, the mobile robot follows the moving target. Recall the diffeomorphism defined in (7)

\[ X_i = P_c^i X_c, \tag{A1} \]

where $P_c^i = \mathrm{diag}(\lambda_x, \lambda_y, \lambda_x \lambda_w)$. Suppose that the system state $X_i$ has converged to the desired system state $\bar{X}_i$; we then have the following results based on (A1)

\[ \bar{X}_i = P_c^i X_c^d \tag{A2} \]

and

\[ X_c^d = [\, x_c^d \;\; y_c^d \;\; z_c^d \,]^T = R(\phi, \theta_f^m) X_f^d - \delta Y, \tag{A3} \]

where $X_c^d$ and $X_f^d$, respectively, are the relative positions between the mobile robot and the motion target in the camera and world coordinate frames when $X_i = \bar{X}_i$. Because $P_c^i$ is invertible, the following relation between $X_f^d$ and $\bar{X}_i$ can be obtained by substituting (A3) into (A2) such that

\[ X_f^d = R^T(\phi, \theta_f^m)\,[(P_c^i)^{-1} \bar{X}_i + \delta Y]. \tag{A4} \]

Let $\|A\|$ denote the 2-norm value of vector or matrix $A$. The key idea is that if $\|X_f^d\|$ is bounded, it implies that the mobile robot has followed the motion target. Using (A4), $\|X_f^d\|$ is given by

\[ \|X_f^d\| = \|R^T(\phi, \theta_f^m)[(P_c^i)^{-1} \bar{X}_i + \delta Y]\| \le \|R^T(\phi, \theta_f^m)\|\,\|(P_c^i)^{-1} \bar{X}_i + \delta Y\| \le \|(P_c^i)^{-1} \bar{X}_i\| + \|\delta Y\| \le \|(P_c^i)^{-1}\|\,\|\bar{X}_i\| + \|\delta Y\|. \tag{A5} \]

Because $\|(P_c^i)^{-1}\| = \|\mathrm{diag}(\lambda_x^{-1}, \lambda_y^{-1}, \lambda_x^{-1}\lambda_w^{-1})\| = z_c \|\mathrm{diag}(f_x^{-1}, f_y^{-1}, f_x^{-1}W^{-1})\| = z_c \sqrt{\lambda_{max}}$, where $z_c = f_x W / \bar{d}_x$ and $\lambda_{max} = \max(f_y^{-1}, f_x^{-1}W^{-1})$, we have the following result:

\[ \|X_f^d\| \le \frac{f_x W}{\bar{d}_x} \sqrt{\lambda_{max}}\, \|\bar{X}_i\| + \|\delta Y\|. \tag{A6} \]

From (A6), it is clear that $\|X_f^d\|$ is bounded, and hence the proof is completed.

We use the simulation presented in Section 5.1 as an example to explain the physical meaning of (A6). By using the parameters listed in Table 1, the term on the right-hand side of (A6) can be calculated by

\[ \|X_f^d\| \le \frac{294 \times 12}{35} \sqrt{0.0032} \times 35 + 10 = 209.7337 \ \text{(cm)}, \tag{A7} \]

which means that when $X_i$ converges to $\bar{X}_i$, the distance between the mobile robot and the motion target is bounded by 209.7337 cm. Fig. 18 shows the simulation result of the distance between the mobile robot and the motion target. In Fig. 18, the solid line presents the 2-norm value of $X_f$, and the dotted line denotes the bound calculated in (A7). From Fig. 18, we see that the distance between the mobile robot and the target finally converges to about 100 cm, which satisfies the bound (A7). Because the target is always moving and the distance between the robot and the target is bounded, this implies that the robot has followed the target as expected.
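The arithmetic in (A7) can be checked with a short script. The values $f_x = 294$, $W = 12$ cm and $\bar{d}_x = 35$ appear in (A7) itself; $f_y = 312$ and $\|\delta Y\| = 10$ cm are assumptions made here by analogy with Experiment 1 of Table 2, because Table 1 is not reproduced in this section.

```python
import math

f_x, f_y = 294.0, 312.0      # focal lengths in pixels (f_y is an assumption)
W = 12.0                     # target width in cm
d_x_des = 35.0               # desired target width in the image, in pixels
delta_y = 10.0               # camera offset in cm (assumed equal to ||delta Y||)
X_i_des = (0.0, 0.0, d_x_des)

z_c = f_x * W / d_x_des                          # depth at the desired state
lam_max = max(1.0 / f_y, 1.0 / (f_x * W))        # lambda_max as stated before (A6)
norm_X_i = math.sqrt(sum(c * c for c in X_i_des))

bound = z_c * math.sqrt(lam_max) * norm_X_i + delta_y   # right-hand side of (A6)
```

Under these assumptions the script reproduces the bound of (A7) to four decimal places, and the depth $z_c = 100.8$ cm is close to the distance of about 100 cm that Fig. 18 is reported to converge to.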

References

[1] S. Hutchinson, G.D. Hager, P.I. Corke, A tutorial on visual servo control, IEEE Transactions on Robotics and Automation 12 (5) (1996) 651–670.

[2] F. Chaumette, S. Hutchinson, Visual servo control part I: basic approaches, IEEE Robotics and Automation Magazine 13 (4) (2006) 82–90.

[3] F. Chaumette, S. Hutchinson, Visual servo control part II: advanced approaches, IEEE Robotics and Automation Magazine 14 (1) (2007) 109–118.

[4] Y. Ma, J. Košecká, S.S. Sastry, Vision guided navigation for a nonholonomic mobile robot, IEEE Transactions on Robotics and Automation 15 (3) (1999) 521–536.

[5] J.-B. Coulaud, G. Campion, G. Bastin, M.D. Wan, Stability analysis of a vision-based control design for an autonomous mobile robot, IEEE Transactions on Robotics 22 (5) (2006) 1062–1069.

[6] H. Zhang, J.P. Ostrowski, Visual motion planning for mobile robots, IEEE Transactions on Robotics and Automation 18 (2) (2002) 199–208.

[7] T. Nierobisch, W. Fischer, F. Hoffmann, Large view visual servoing of a mobile robot with a pan-tilt camera, in: Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems, Beijing, China, 2006, pp. 3307–3312.

[8] J. Chen, W.E. Dixon, D.M. Dawson, M. McIntyre, Homography-based visual servo tracking control of a wheeled mobile robot, IEEE Transactions on Robotics 22 (2) (2006) 407–416.

[9] Y. Fang, W.E. Dixon, D.M. Dawson, P. Chawda, Homography-based visual servo regulation of mobile robots, IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics 35 (5) (2005) 1041–1049.

[10] G.L. Mariottini, G. Oriolo, D. Prattichizzo, Image-based visual servoing for nonholonomic mobile robots using epipolar geometry, IEEE Transactions on Robotics 23 (1) (2007) 87–100.

[11] G.L. Mariottini, D. Prattichizzo, G. Oriolo, Image-based visual servoing for nonholonomic mobile robots with central catadioptric camera, in: Proceedings IEEE International Conference on Robotics and Automation, Orlando, FL, USA, 2006, pp. 538–544.

[12] G. López-Nicolás, C. Sagüés, J.J. Guerrero, D. Kragic, P. Jensfelt, Switching visual control based on epipoles for mobile robots, Journal of Robotics and Autonomous Systems 56 (7) (2008) 592–603.

[13] A.K. Das, R. Fierro, V. Kumar, J.P. Ostrowski, J. Spletzer, C.J. Taylor, A vision-based formation control framework, IEEE Transactions on Robotics and Automation 18 (5) (2002) 813–825.

[14] R. Vidal, O. Shakernia, S. Sastry, Following the ﬂock [formation control], IEEE Robotics and Automation Magazine 11 (4) (2004) 14–20.


[15] J.A. Borgstadt, N.J. Ferrier, Interception of a projectile using a human vision-based strategy, in: Proceedings IEEE International Conference on Robotics and Automation, San Francisco, USA, 2000, pp. 3189–3196.

[16] L. Freda, G. Oriolo, Vision-based interception of a moving target with a nonholonomic mobile robot, Journal of Robotics and Autonomous Systems 55 (6) (2007) 419–432.

[17] H.Y. Wang, S. Itani, T. Fukao, N. Adachi, Image-based visual adaptive tracking control of nonholonomic mobile robots, in: Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems, Maui, Hawaii, USA, 2001, pp. 1–6.

[18] C.-Y. Tsai, K.-T. Song, Face tracking interaction control of a nonholonomic mobile robot, in: Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems, Beijing, China, 2006, pp. 3319–3324.

[19] E. Malis, S. Benhimane, A unified approach to visual tracking and servoing, Journal of Robotics and Autonomous Systems 52 (1) (2005) 39–52.

[20] A.I. Comport, É. Marchand, F. Chaumette, Statistically robust 2-D visual servoing, IEEE Transactions on Robotics and Automation 22 (2) (2006) 416–421.

[21] Y. Han, H. Hahn, Visual tracking of a moving target using active contour based SSD algorithm, Journal of Robotics and Autonomous Systems 53 (3–4) (2005) 265–281.

[22] J.D. Schutter, J.D. Geeter, T. Lefebvre, H. Bruyninckx, Kalman ﬁlters: a tutorial, Journal A 40 (4) (1999) 52–59.

[23] C.-Y. Tsai, K.-T. Song, Visual tracking control of a wheeled mobile robot with a system model and velocity quantization robustness, IEEE Transactions on Control Systems Technology, accepted for publication.

[24] W.I. Grosky, L.A. Tamburino, A uniﬁed approach to the linear camera calibration problem, IEEE Transactions on Pattern Analysis and Machine Intelligence 12 (7) (1990) 663–671.

[25] T.-C. Lee, C.-Y. Tsai, K.-T. Song, Fast parking control of mobile robots: a motion planning approach with experimental validation, IEEE Transactions on Control Systems Technology 12 (5) (2004) 661–676.

[26] J.-J.E. Slotine, W. Li, Applied Nonlinear Control, Prentice-Hall, Englewood Cliffs, NJ, 1991.

[27] C.-Y. Tsai, K.-T. Song, X. Dutoit, H. Van Brussel, M. Nuttin, Robust mobile robot visual tracking control system using self-tuning Kalman ﬁlter, in: Proceedings IEEE International Symposium on Computational Intelligence in Robotics and Automation, Jacksonville, Florida, 2007, pp. 161–166.

[28] The video website. Available: <http://isci.cn.nctu.edu.tw/video/RVTCS_IVC/>.

[29] K.-T. Song, J.-S. Hu, C.-Y. Tsai, C.-M. Chou, C.-C. Cheng, W.-H. Liu, C.-H. Yang, Speaker attention system for mobile robots using microphone array and face tracking, in: Proceedings IEEE International Conference on Robotics and Automation, Orlando, FL, 2006, pp. 3624–3629.

[30] G.R. Bradski, S. Clara, Computer vision face tracking for use in a perceptual user interface, Intel Technology Journal 2 (2) (1998) 1–15.