Real-Time Tracking Using Trust-Region Methods


Tyng-Luh Liu, Hwann-Tzong Chen

Abstract: Optimization methods based on iterative schemes can be divided into two classes: line-search methods and trust-region methods. While line-search techniques are commonly found in vision applications, trust-region methods have received little attention. Motivated by the fact that line-search methods can be considered special cases of trust-region methods, we propose a trust-region framework for real-time tracking. Our approach is characterized by three key contributions. First, since a trust-region tracking system is more effective, it often yields better performance than other trackers that rely on iterative optimization, e.g., a line-search based mean-shift tracker. Second, we formulate a representation model that uses two coupled weighting schemes derived from the covariance ellipse to integrate an object's color probability distribution and edge density information. As a result, the system can address rotation and non-uniform scaling in a continuous space, rather than working on a few presumed discrete values of rotation angle and scale. Third, the framework is flexible in that a variety of distance functions can be adapted easily. Experimental results and comparative studies are provided to demonstrate the efficiency of the proposed method.

Keywords: Tracking, vision, iterative optimization, trust-region methods.

I. Introduction

A key component of a successful tracking system is its ability to search efficiently for the target. Focusing on this goal, we propose a new approach for tracking using trust-region methods [6]. Previous uses of trust-region methods have been in areas other than real-time tracking, e.g., [12], [13]. While the applications are different, the efficiency of trust-region methods as an optimization tool has been demonstrated.

Recently, Chen and Liu [4] have applied trust-region methods to tracking, and Sminchisescu and Triggs [16] have used them for 3-D body tracking.

A. Our Approach

We view a tracking process as a sequence of iterative optimization problems: For each image frame the task is to find an optimal solution that best describes the status of a target object. It requires an effective method to solve the underlying optimization problem appropriately. Most of the iterative optimization techniques used in tracking as well as other vision research are line-search in that the iterates are restricted to some iteration-dependent directions, e.g., the gradient. We instead use trust-region methods for their efficiency and reliability.

In addition, motivated by [5], we formulate a flexible object representation. It integrates both color and edge information via two coupled weighting schemes derived from a covariance ellipse model. Unlike previous related works [2], [5], where the values of scale are limited to a few pre-determined ones, the representation allows a system to perform optimization over a continuous space to yield better performance.

T.-L. Liu and H.-T. Chen are with the Institute of Information Science (IIS), Academia Sinica, Taipei, Taiwan. E-mail: {liutyng, pras}@iis.sinica.edu.tw. H.-T. Chen is also a Ph.D. student at the Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan.

B. Previous Work

Methods based on the Bayesian framework have been playing an important role in tracking, e.g., [11], [15], [18]. Among them, the CONDENSATION algorithm [11], introduced by Isard and Blake to track contours in clutter via factored sampling, is perhaps the best known. Its main idea is to pinpoint the inappropriateness of the Gaussian state density assumption for tracking in clutter while multiple competing observations exist.

It is also possible to track objects of more complicated shapes using a learning approach [1], [7], [8], [9], [17]. Different from CONDENSATION, Freedman and Brandstein [7], [8] consider the contour-tracking problem without assuming any dynamical model. They establish, via learning, a subset tracker to perform tracking through minimization. Exemplar-based methods [9], [17] require an off-line learning phase to generate object representations from examples, and then use distance measures to perform template matching.

If the objects to be tracked are non-rigid, it is convenient to represent them with probability distributions. A straightforward way to derive a distribution model is through histogram analysis [2], [3], [4], [5]. Birchfield [2] has proposed an algorithm to track a person's head by modeling it as a vertical ellipse with a fixed aspect ratio. In [3], Bradski presents a CAMSHIFT (continuously adaptive mean shift) system for use in a perceptual user interface to track faces. Comaniciu et al. [5] have used the mean shift to track non-rigid objects. They model objects by color distributions, and then measure the similarity between the target and candidate distributions using a Bhattacharyya coefficient. Note that the mean-shift technique is indeed a line search, and later we will discuss the comparisons between the mean-shift and our approach.

II. Trust-Region Methods

Iterative algorithms for optimization can be divided into two classes: line-search and trust-region. For a line-search one, the iterates are determined along some specific directions, e.g., steepest descent locates its iterates by considering the gradient directions. A trust-region method, however, derives its iterates by solving the corresponding optimization problem in a bounded region iteratively. So, there are more options to select the iterates. In fact, line-search methods can be considered special cases of trust-region methods [6].

The concept of trust-region methods can be better understood by considering a typical unconstrained minimization problem,

$$\min_{x \in V} f(x), \tag{1}$$

where $V$ is a vector space, and $f$ is some objective function to be minimized.

Essentially, there are three elements to any trust-region method: (i) trust-region radius, to determine the size of a trust region; (ii) trust-region subproblem, to approximate a minimizer in the region; and (iii) trust-region fidelity, to evaluate the accuracy of an approximating solution.

To illustrate, suppose an initial guess $x_0$ and an initial trust-region radius $\Delta_0 > 0$ are given, and let $\eta_1$ and $\eta_2$ be some constants satisfying $0 < \eta_1 \le \eta_2 < 1$. For each iteration $k \ge 0$, we first define, for the vector space $V$, an iteration-dependent norm $\|\cdot\|_k$ and an iteration-dependent inner product $\langle \cdot, \cdot \rangle_k$ by

$$\|s\|_k^2 = \langle s, s \rangle_k \overset{\text{def}}{=} \langle s, M_k s \rangle, \quad \text{for any } s \in V,$$

where $\langle \cdot, \cdot \rangle$ is the inner product, and $M_k$ is an iteration-dependent matrix. (We will discuss how to determine $M_k$ later.) Then at iteration $k$, with iterate $x_k$ and trust-region radius $\Delta_k$, the following three steps are performed within the trust region $B_k = \{x \in V \mid \|x - x_k\|_k \le \Delta_k\}$.

1. Trust-region subproblem: We first construct a model $m_k$ to approximate $f$ in $B_k$. In our system, a quadratic model is used for the approximation, i.e.,

$$m_k(x_k + s) = m_k(x_k) + \langle g_k, s \rangle + \tfrac{1}{2} \langle s, H_k s \rangle, \tag{2}$$

where $m_k(x_k) = f(x_k)$, $g_k = \nabla_x f(x_k)$, and $H_k$ is the Hessian of $f$ at $x_k$. When $H_k \ne 0$, $m_k$ is said to be a second-order model. A trust-region subproblem is then to compute an $s_k$, where $\|s_k\|_k \le \Delta_k$, such that the model $m_k$ is "sufficiently reduced," that is,

$$s_k = \arg\min_{\|s\|_k \le \Delta_k} \psi_k(s), \qquad \psi_k(s) \overset{\text{def}}{=} \langle g_k, s \rangle + \tfrac{1}{2} \langle s, H_k s \rangle. \tag{3}$$

2. Trust-region fidelity: After solving a subproblem, the trial point $x_k + s_k$ will be tested to see if it is a good candidate for the next iterate. This is evaluated explicitly by

$$r_k = \frac{f(x_k) - f(x_k + s_k)}{m_k(x_k) - m_k(x_k + s_k)}.$$

If $r_k \ge \eta_1$, then the trial point is accepted, i.e., $x_{k+1} = x_k + s_k$. Otherwise, $x_{k+1} = x_k$. Since $\eta_1$ is a small positive number, the above rule favors a trial point only when the value of the objective function $f$ is also reduced. When $m_k$ approximates $f$ well and yields a large $r_k$, the trust-region radius will be expanded for the next iteration. On the other hand, if $r_k$ is smaller than $\eta_1$ or $r_k$ is negative, it suggests that the objective function $f$ is not well approximated by the model $m_k$ within the current trust region $B_k$. Therefore, the iterate remains unchanged, and the trust-region radius will be shrunk to derive a more appropriate model and subproblem for the next iteration.

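3. Trust-region radius: More specifically, the trust-region radius can be updated as follows.

$$\Delta_{k+1} = \begin{cases} \max\{\alpha_1 \|s_k\|_k,\ \Delta_k\} & \text{if } r_k \ge \eta_2, \\ \Delta_k & \text{if } r_k \in [\eta_1, \eta_2), \\ \alpha_2 \|s_k\|_k & \text{if } r_k < \eta_1, \end{cases}$$

where, following [6, p.782], we have $\eta_1 = 0.05$, $\eta_2 = 0.9$, $\alpha_1 = 2.5$, and $\alpha_2 = 0.25$. The iterative optimization process for (1) will be repeated until the sequence of iterates $\{x_k\}$ converges.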
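The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: it takes the identity scaling ($M_k = I$), and it replaces an exact subproblem solver with a simple rule (take the full Newton step when it fits inside the region, otherwise the Cauchy point along the negative gradient). The function names `solve_subproblem` and `trust_region_minimize` are hypothetical; the constants are the ones quoted above.

```python
import numpy as np

def solve_subproblem(g, H, radius):
    """Approximately minimise psi(s) = <g, s> + 0.5 <s, H s> with ||s|| <= radius."""
    try:
        np.linalg.cholesky(H)                 # succeeds only if H is positive definite
        s_newton = -np.linalg.solve(H, g)
        if np.linalg.norm(s_newton) <= radius:
            return s_newton                   # unconstrained minimiser lies in the region
    except np.linalg.LinAlgError:
        pass
    # Cauchy point: best step along -g that stays inside the region.
    g_norm = np.linalg.norm(g)
    curvature = g @ H @ g
    tau = 1.0
    if curvature > 0:
        tau = min(1.0, g_norm**3 / (radius * curvature))
    return -(tau * radius / g_norm) * g

def trust_region_minimize(f, grad, hess, x0, radius0=4.0,
                          eta1=0.05, eta2=0.9, alpha1=2.5, alpha2=0.25,
                          tol=1e-8, max_iter=200):
    x = np.asarray(x0, dtype=float)
    radius = radius0
    for _ in range(max_iter):
        g, H = grad(x), hess(x)
        if np.linalg.norm(g) < tol:
            break
        s = solve_subproblem(g, H, radius)             # step 1: subproblem
        predicted = -(g @ s + 0.5 * s @ H @ s)         # m_k(x_k) - m_k(x_k + s_k)
        actual = f(x) - f(x + s)
        r = actual / predicted if predicted > 0 else -1.0   # step 2: fidelity
        s_norm = np.linalg.norm(s)
        if r >= eta2:                                  # step 3: radius update
            radius = max(alpha1 * s_norm, radius)
        elif r < eta1:
            radius = alpha2 * s_norm
        if r >= eta1:                                  # accept the trial point
            x = x + s
    return x

# Usage on a convex quadratic f(x) = 0.5 x^T A x - b^T x (minimiser solves A x = b).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_star = trust_region_minimize(
    f=lambda x: 0.5 * x @ A @ x - b @ x,
    grad=lambda x: A @ x - b,
    hess=lambda x: A,
    x0=[10.0, -7.0],
)
print(x_star)
```

Because the quadratic model is exact for this test function, every trial point is accepted and the radius only grows; on a general objective the ratio test is what rejects poor steps and shrinks the region.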

A. Trust-Region Scaled Norm

An objective function $f(x)$ may have variables whose typical values are of different orders of magnitude. For example, in real-time tracking, the values of spatial variables are often much larger than the values of scale variables. Without re-scaling the variables properly, the contributions from variables of small values tend to be dominated by those from variables of large values. This can lead to unexpected optimization results. To deal with such issues, the re-scaling is done for each iteration $k$ through a nonsingular matrix $S_k$ to ensure every trust-region subproblem is solved in a reasonably scaled space. In particular, we have used nonsingular diagonal matrices $S_k$, where the diagonal entries correspond to typical values of the respective variables. It follows that the new variables, say $\tilde{x}$, in the scaled space are derived by $\tilde{x} = S_k^{-1} x$. As a result, the components of $\tilde{x}$ will be of comparable scales after the re-scaling. Moreover, as is proved in [6], it is not necessary to reformulate a trust-region subproblem using the new variables since re-scaling the variables is equivalent to using an iteration-dependent scaled norm defined by $\|s\|_k^2 = \langle s, M_k s \rangle = \langle s, S_k^{-T} S_k^{-1} s \rangle$, where $M_k = S_k^{-T} S_k^{-1}$ is an iteration-dependent matrix.
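The equivalence between re-scaling the variables and using the scaled norm can be checked numerically. The sketch below assumes a fixed diagonal scaling with the typical values (10, 10, 1, 1, 0.1) reported later in the experiments; the helper name `scaled_norm` and the sample step `s` are illustrative only.

```python
import numpy as np

# Typical magnitudes of the five state variables (values quoted in the
# experiments section); S is the diagonal scaling matrix.
typical_values = np.array([10.0, 10.0, 1.0, 1.0, 0.1])
S = np.diag(typical_values)
S_inv = np.linalg.inv(S)
M = S_inv.T @ S_inv                      # M = S^{-T} S^{-1}

def scaled_norm(s, M):
    """Return ||s||_k = sqrt(<s, M s>)."""
    return float(np.sqrt(s @ M @ s))

# A step that is "half a typical value" in every coordinate.
s = np.array([5.0, -5.0, 0.5, 0.5, 0.05])

norm_via_M = scaled_norm(s, M)                          # norm in the original variables
norm_in_scaled_space = float(np.linalg.norm(S_inv @ s)) # Euclidean norm of S^{-1} s
print(norm_via_M, norm_in_scaled_space)
```

Both quantities agree, so a step of 5 pixels in position and a step of 0.05 in the last variable contribute equally to the trust-region constraint.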

B. Trust-Region vs. Line-Search

In our approach, we have used a quadratic model $m_k$ for the implementation. If, instead, a linear model is used, then the RHS of (2) is reduced to the first two terms. This implies that a trust-region method with a linear model approximation is almost like gradient descent, but it often achieves better performance owing to its ability to adjust trust regions adaptively throughout the iterations. This is why line-search methods can be considered special cases of trust-region methods.

Both trust-region and line-search methods are guaranteed to converge to a local minimum. However, not all local minima are of interest in real applications. A typical line-search method, e.g., steepest descent, or even a trust-region method with a linear model approximation, may often converge to a local minimum that is inferior to a nearby one. In Fig. 1, we construct an objective function with three local minima, $x_1$, $x_2$, and $x_3$. Among them, $x_1$ is clearly the global minimum. We test the three schemes, using 1000 different initial positions $x_0$, sampled uniformly from [56.96, 66.95]. Though the $x_0$'s are indeed close to the global minimum $x_1$, steepest descent fails to converge to $x_1$ 258 times. Trust-region methods are tested with different $\Delta_0$, and are more successful in converging to $x_1$. In passing, note that the mean-shift technique in [5] is a more conservative line search. Instead of taking the largest/steepest steps along gradients, it usually progresses by small steps, computed from the information within fixed-size windows. Such an approach tends to converge to a nearby local minimum regardless of its significance. Thus, both a more sophisticated model approximation and a mechanism to iteratively adjust the regions of interest are needed to reduce the chance of converging to a local minimum not of interest.

[Figure 1: plot of f(x) = (x−60)(x−65)(x−70)(x−75)(x−78)(x−80) for x between 60 and 80, vertical axis in units of 10^5, with the local minima x1 = 61.64, x2 = 72.24, x3 = 79.22 and the sampling interval [56.96, 66.95] for x0 marked. An accompanying table gives the number of runs converging to (x1, x2, x3) for Steepest Descent, TR+Linear (∆0 = 2, 4, 6, 8), TR+Linear (∆0 = 10), and TR+Quadratic (∆0 = 2, 4, ..., 22). Steepest Descent: (742, 12, 246); the other rows of the table read (1000, 0, 0), (773, 23, 204), and (1000, 0, 0).]

Fig. 1. Optimizing with Steepest Descent, TR+linear model, and TR+quadratic model. Out of 1000 runs, with initial positions $x_0$ sampled uniformly from [56.96, 66.95], we record in each entry the number of times that a method converges to a local minimum.
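The test function of Fig. 1 is easy to reproduce. The following sketch only locates its local minima numerically and confirms which one is global; it does not rerun the 1000-sample convergence experiment. Working in the shifted variable t = x − 70 is our own choice to keep the polynomial coefficients small.

```python
import numpy as np

# f(x) = (x-60)(x-65)(x-70)(x-75)(x-78)(x-80), expressed in t = x - 70.
roots = np.array([60.0, 65.0, 70.0, 75.0, 78.0, 80.0])
shift = 70.0
coeffs = np.poly(roots - shift)        # monic polynomial with the shifted roots
d1 = np.polyder(coeffs)                # first derivative
d2 = np.polyder(coeffs, 2)             # second derivative

# Critical points are the real roots of f'; minima have f'' > 0.
crit = np.roots(d1)
crit = np.sort(crit[np.abs(crit.imag) < 1e-9].real)
minima_t = crit[np.polyval(d2, crit) > 0]
minima_x = minima_t + shift
values = np.polyval(coeffs, minima_t)
global_min = minima_x[np.argmin(values)]

for x_min, f_min in zip(minima_x, values):
    print(f"local minimum near x = {x_min:.2f}, f = {f_min:.3e}")
print("global minimum near x =", round(float(global_min), 2))
```

The three minima found this way agree with the values $x_1$, $x_2$, $x_3$ marked in Fig. 1, and the lowest function value is attained at $x_1$.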

III. Representations and Objective Functions

Motivated by the work of Comaniciu et al. [5], we also use probability distributions to represent targets. But unlike [5], where the analysis relies on kernel properties, we simply treat the color distribution as a weighted color histogram to account for the possible non-rigidity of objects.

A. Representation Models for Tracking

Tracking objects by distribution is efficient but not necessarily sufficient. Suppose the scale of a target object of monotone color is enlarged. Then, it is not guaranteed that the appropriate scale will always be recovered since, in this case, any sub-portion of the object has a similar distribution to that of the object. Other tracking cues are needed to elevate the performance, e.g., [14]. In our system, the representation model consists of two elements: the first is to characterize the RGB color distribution, and the second is to estimate the edge density near the object boundary. Since the edge density is contributed mainly from samples near the boundary, and more prone to be affected by the background, we choose color distribution as the primary cue for tracking.

A.1 Color Distribution

For computing the weighted color histogram, the RGB color space is first divided into $n$ bins, and a bin assignment function $b$ is defined uniquely by each pixel $x_i$'s RGB value as $b : x_i \to \{1, \ldots, n\}$. We then formulate a color weighting scheme based on the bivariate normal distribution, defined by

$$\phi(x, \mu, \Sigma) = \frac{1}{2\pi |\Sigma|^{1/2}}\, e^{-(x-\mu)^T \Sigma^{-1} (x-\mu)/2},$$

where $x = (x_1, x_2)^T$, $\mu = (\mu_1, \mu_2)^T$ is the mean vector, and $\Sigma$ is the covariance matrix.

Let the correlation coefficient $\rho = \sigma_{12}/(\sigma_1 \sigma_2)$. Then, when $|\rho| < 1$, the bivariate normal distribution can be rewritten as

$$\phi(x; \zeta) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1-\rho^2}} \exp\left\{ -\frac{\varepsilon(x; \zeta)}{2} \right\}, \tag{4}$$

where, to simplify the notation, we have $\sigma = (\sigma_1, \sigma_2)^T$, $\zeta = (\mu, \sigma, \rho) = (\mu_1, \mu_2, \sigma_1, \sigma_2, \rho)$, and

$$\varepsilon(x; \zeta) = \frac{1}{1-\rho^2} \left[ \frac{(x_1-\mu_1)^2}{\sigma_1^2} - \frac{2\rho (x_1-\mu_1)(x_2-\mu_2)}{\sigma_1 \sigma_2} + \frac{(x_2-\mu_2)^2}{\sigma_2^2} \right].$$

It follows from (4) that lines of constant $\phi$ correspond to constant exponents, i.e., $\varepsilon(x; \zeta) = \text{constant}$ represents an ellipse centered at $\mu$. We focus only on the covariance ellipses, which satisfy $\varepsilon(x; \zeta) = 1$ (denoted as $\varepsilon_1(\zeta)$), to construct the color weighting scheme.

[Figure 2: four panels. (a) Color weights: surface plot over (x1, x2). (b) Edge weights: surface plot over (x1, x2). (c) Covariance ellipse, annotated with σ1, σ2, p1, p2, and θ. (d) 1-D plot along x1 comparing the bivariate normal with crater functions for parameter values 2 and 4.]

Fig. 2. (a) Bivariate normal for color weights. (b) Crater function for edge weights. (c) A covariance ellipse can be represented either by $(p_1, p_2, \theta)$ or by $(\sigma_1, \sigma_2, \rho)$, where $p_1$, $p_2$ are lengths of the principal semi-diameters, and $\theta$ is the angle between the $p_1$ semi-diameter and the $x_1$ axis. (d) Peaks of a crater function occur near the loci of the coupled covariance ellipse.
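As a sanity check on the exponent $\varepsilon(x; \zeta)$, the sketch below evaluates the closed form in terms of $(\mu, \sigma, \rho)$ and compares it with the quadratic form $(x-\mu)^T \Sigma^{-1} (x-\mu)$ built from the corresponding covariance matrix. The function names and the sample numbers are illustrative assumptions, not part of the paper.

```python
import numpy as np

def epsilon(x, mu, sigma, rho):
    """Closed-form exponent of the bivariate normal for zeta = (mu, sigma, rho)."""
    d1 = (x[0] - mu[0]) / sigma[0]
    d2 = (x[1] - mu[1]) / sigma[1]
    return (d1**2 - 2.0 * rho * d1 * d2 + d2**2) / (1.0 - rho**2)

def covariance_matrix(sigma, rho):
    """Covariance matrix with standard deviations sigma and correlation rho."""
    s1, s2 = sigma
    return np.array([[s1 * s1, rho * s1 * s2],
                     [rho * s1 * s2, s2 * s2]])

mu = np.array([3.0, -1.0])
sigma = np.array([4.0, 2.0])
rho = 0.3

x = np.array([5.0, 0.5])
eps_closed = epsilon(x, mu, sigma, rho)
diff = x - mu
eps_matrix = diff @ np.linalg.inv(covariance_matrix(sigma, rho)) @ diff
inside = eps_closed <= 1.0           # True when x lies in the covariance ellipse

# The point mu + (sigma1, rho * sigma2) lies exactly on the covariance ellipse.
boundary_point = mu + np.array([sigma[0], rho * sigma[1]])
eps_boundary = epsilon(boundary_point, mu, sigma, rho)
print(eps_closed, eps_matrix, inside, eps_boundary)
```

The membership test `epsilon(...) <= 1` is the same test that defines which pixels contribute to the color distribution in the next subsection.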

Now, let $I_0$ be the first image frame and $\zeta_0 = (\mu_0, \sigma_0, \rho_0)$. Then, a target object initially centered at $\mu_0$ can be associated with $A_1(\zeta_0) = \{x \mid \varepsilon(x; \zeta_0) \le 1\}$, the area enclosed by $\varepsilon_1(\zeta_0)$. We define the target's color distribution within $A_1(\zeta_0)$, denoted as $p(u; \zeta_0)$, by

$$p(u; \zeta_0) = \frac{1}{C_p} \sum_{x_i \in A_1(\zeta_0)} w_c(x_i; \zeta_0)\, \delta(b(x_i) - u), \qquad w_c(x_i; \zeta_0) = \exp\left\{ -\frac{\varepsilon(x_i; \zeta_0)}{2} \right\}, \tag{5}$$

where $\delta$ is the Kronecker delta function, and $w_c$ is the derived color weighting function. That $p(u; \zeta_0)$ is a probability implies $C_p = \sum_{x_i \in A_1(\zeta_0)} w_c(x_i; \zeta_0)$. For convenience, the notation $p(u; \zeta_0)$ will be abbreviated to $p(u)$ since $\zeta_0$ only describes the target's initial state. Analogously, during tracking, the color distribution of some $A_1(\zeta)$, denoted as $q(u; \zeta)$, is

$$q(u; \zeta) = \frac{1}{C_q} \sum_{x_i \in A_1(\zeta)} w_c(x_i; \zeta)\, \delta(b(x_i) - u),$$

where $C_q$ is the total weight such that $\sum_{u=1}^{n} q(u; \zeta) = 1$.
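A weighted color histogram of this kind can be sketched with NumPy as follows. Several details are assumptions made for illustration: the image is an 8-bit RGB array, $x_1$ is taken as the column index and $x_2$ as the row index, bins are indexed from 0 rather than 1, and each channel is quantized uniformly into 16 levels (the bin count used in the experiments). The function names are hypothetical.

```python
import numpy as np

def epsilon(x1, x2, mu, sigma, rho):
    """Exponent of the bivariate normal, evaluated element-wise on pixel grids."""
    d1 = (x1 - mu[0]) / sigma[0]
    d2 = (x2 - mu[1]) / sigma[1]
    return (d1**2 - 2.0 * rho * d1 * d2 + d2**2) / (1.0 - rho**2)

def bin_index(rgb, bins_per_channel=16):
    """Map uint8 RGB triples to a single bin index in 0 .. bins_per_channel**3 - 1."""
    q = (rgb.astype(np.int64) * bins_per_channel) // 256
    return (q[..., 0] * bins_per_channel + q[..., 1]) * bins_per_channel + q[..., 2]

def color_distribution(image, mu, sigma, rho, bins_per_channel=16):
    """Weighted histogram over the pixels inside the covariance ellipse."""
    height, width, _ = image.shape
    rows, cols = np.mgrid[0:height, 0:width]
    eps = epsilon(cols, rows, mu, sigma, rho)
    inside = eps <= 1.0                               # pixels in A_1(zeta)
    weights = np.exp(-0.5 * eps[inside])              # color weights w_c
    bins = bin_index(image[inside], bins_per_channel)
    hist = np.bincount(bins, weights=weights, minlength=bins_per_channel**3)
    return hist / hist.sum()                          # divide by the total weight

# Usage on synthetic data: a random image and a single-color image.
rng = np.random.default_rng(0)
random_image = rng.integers(0, 256, size=(60, 80, 3), dtype=np.uint8)
q = color_distribution(random_image, mu=(40.0, 30.0), sigma=(12.0, 8.0), rho=0.2)

flat_image = np.full((60, 80, 3), (200, 10, 30), dtype=np.uint8)
q_flat = color_distribution(flat_image, mu=(40.0, 30.0), sigma=(12.0, 8.0), rho=0.2)
print(q.sum(), int(np.argmax(q_flat)))
```

For the single-color image all of the weight falls into one bin, which illustrates why color alone cannot recover the scale of a monotone object.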

A.2 Edge Density

For every $w_c$ in (5), we can use a crater function to define a coupled edge-point weighting function $w_e$. Specifically, we have

$$w_e(x_i; \zeta) = \gamma\, \varepsilon(x_i; \zeta) \exp\left\{ -\frac{\gamma}{2}\, \varepsilon(x_i; \zeta) \right\},$$

where $\gamma$ is the parameter to adjust the shape of a crater function and the size of a crater's opening. It can be verified by a straightforward calculation that for $\gamma = 2$, the peaks of the level surface of a crater function occur at the loci of the associated covariance ellipse as shown in Fig. 2d.

In practice, we find better tracking performance can be achieved by using a slightly larger $\gamma$, say $\gamma = 4$, in that significant values of edge weight are within the covariance ellipse.
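The location of the crater's rim can be checked numerically. The sketch assumes the crater function has the form $w_e = \gamma\, \varepsilon \exp(-\gamma \varepsilon / 2)$, which is our reading of the formula above; under that assumption the weight peaks at $\varepsilon = 2/\gamma$, i.e., on the covariance ellipse ($\varepsilon = 1$) for $\gamma = 2$ and inside it ($\varepsilon = 0.5$) for $\gamma = 4$.

```python
import numpy as np

def edge_weight(eps, gamma):
    """Crater-shaped edge weight as a function of the exponent eps (assumed form)."""
    return gamma * eps * np.exp(-0.5 * gamma * eps)

# Scan eps on a fine grid and locate the maximum for two values of gamma.
eps_grid = np.linspace(0.0, 4.0, 4001)
peak_gamma2 = eps_grid[np.argmax(edge_weight(eps_grid, 2.0))]
peak_gamma4 = eps_grid[np.argmax(edge_weight(eps_grid, 4.0))]
print(peak_gamma2, peak_gamma4)
```

Setting the derivative with respect to $\varepsilon$ to zero gives the same result analytically: $\gamma e^{-\gamma \varepsilon / 2} (1 - \gamma \varepsilon / 2) = 0$ implies $\varepsilon = 2/\gamma$.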

Finally, we adopt the notation $e(\zeta)$ to represent the edge density within and near the boundary of a covariance elliptic region $A_1(\zeta)$. The scale-invariant definition of $e(\zeta)$ is as follows.

$$e(\zeta) = \frac{1}{\sigma_1 \sigma_2} \sum_{x_i \in A_1(\zeta)} w_e(x_i; \zeta)\, E(x_i), \tag{6}$$

where $E(x_i)$ is a binary edge map, derived from a high-pass 5 × 5 Laplacian filter in [10].

B. Objective Functions for Tracking

In view of the object representation model just described, a tracking process for an arbitrary target can then be characterized by the evolution dynamics of a covariance ellipse, $\varepsilon_1(\zeta_t)$, where we simply denote the process as $\zeta_0 \to \zeta_1 \to \zeta_2 \to \cdots$. To decide an optimal $\zeta_t$ for each frame $I_t$, we still need to formulate an appropriate objective function to complete the framework.

Since there are two rather distinct features included in the representation model, the resulting objective function must address these two factors justifiably. First, to measure the similarity between two color distributions, we consider the Kullback-Leibler distance, i.e.,

$$f_c(\zeta) = \sum_{u=1}^{n} p(u) \log \frac{p(u)}{q(u; \zeta)}, \tag{7}$$

where $p(u)$ is the target (true) color distribution and $q(u; \zeta)$ is the one for the covariance ellipse $\varepsilon_1(\zeta)$. Second, to estimate whether the boundary edge density of a candidate covariance ellipse is comparable to that of the target, we embed the edge density ratio, denoted as $h(\zeta) = e(\zeta)/e(\zeta_0)$, into a sigmoid function to derive the following,

$$f_e(\zeta) = 1 - \frac{1}{1 + \exp\{-\alpha (h(\zeta) - \beta)\}} = \frac{1}{1 + \exp\{\alpha (h(\zeta) - \beta)\}}, \tag{8}$$

where $\alpha$ and $\beta$ are parameters for setting up the initial sigmoid function. (We use $\alpha = 5$ and $\beta = 1$ for all the experiments.) Finally, with the definitions in (7) and (8), the underlying optimization problem for each image frame $I_t$ can be formally written as

$$\zeta_t = \arg\min_{\zeta \in \Omega_t} f(\zeta), \qquad f(\zeta) = f_c + \lambda f_e, \tag{9}$$

where $\lambda$ is a parameter to weigh the relative importance of the two terms, and $\Omega_t$ is the space consisting of all possible $\zeta$'s for any combinations of translation, scale, and orientation.

[Figure 3: two rows of tracking results, MS + BH (top) and TR + KL (bottom), on four frames: (a) Magnet #080, (b) Magnet #110, (c) Magnet #190, (d) Magnet #228.]

Fig. 3. TR: Trust-Region, MS: Mean-Shift, BH: Bhattacharyya, and KL: Kullback-Leibler. In each image, the final convergent circle is plotted in white and the intermediate ones in yellow.
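The objective in (9) can be written down directly once the two distributions and the two edge densities are available. The sketch below is illustrative only: the function names are hypothetical, the small floor `tiny` that guards against empty bins in $q$ is our own assumption (the text does not say how empty bins are handled), and $\lambda = 0.2$, $\alpha = 5$, $\beta = 1$ are the values reported for the experiments.

```python
import numpy as np

def kl_distance(p, q, tiny=1e-12):
    """Kullback-Leibler distance sum_u p(u) log(p(u) / q(u)); bins with p(u) = 0 add nothing."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    support = p > 0
    return float(np.sum(p[support] * np.log(p[support] / np.maximum(q[support], tiny))))

def edge_term(h, alpha=5.0, beta=1.0):
    """Sigmoid of the edge density ratio h; small when the candidate has enough edges."""
    return 1.0 / (1.0 + np.exp(alpha * (h - beta)))

def objective(p, q, e_candidate, e_target, lam=0.2):
    """f = f_c + lambda * f_e for one candidate state."""
    h = e_candidate / e_target
    return kl_distance(p, q) + lam * edge_term(h)

# Toy distributions over three bins.
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.3, 0.5])

f_match = objective(p, p, e_candidate=1.0, e_target=1.0)      # identical colors, h = 1
f_mismatch = objective(p, q, e_candidate=0.5, e_target=1.0)   # different colors, fewer edges
print(f_match, f_mismatch)
```

With identical color distributions the first term vanishes, and with $h = \beta$ the sigmoid evaluates to 0.5, so the matching candidate scores $\lambda / 2$.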

IV. Experimental Results and Discussion

We demonstrate the efficiency of our method by (i) making comparisons with a mean-shift tracker [5], and (ii) carrying out a variety of experiments of different scenarios.

A. Trust-Region vs. Mean-Shift

In [5], the color distribution is used as the only cue for tracking, and the Bhattacharyya coefficient, defined by $\sum_{u=1}^{n} \sqrt{p(u)\, q(u; x)}$, is chosen to be the objective function to be maximized. Since a mean-shift vector simply approximates the gradient of an objective function, for the sake of comparison we implement a trust-region tracker with a linear model approximation, and use the exact color representation model described in [5] for all comparisons. This implies we are dealing with two trackers, trust-region (TR) and mean-shift (MS), and two objective functions, Kullback-Leibler distance (KL) and Bhattacharyya coefficient (BH). In total there are four possible combinations: MS+BH, TR+BH, MS+KL, and TR+KL.
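For reference, the Bhattacharyya coefficient between two discrete distributions is a one-line computation. The sketch uses toy three-bin distributions chosen for illustration; the function name is hypothetical.

```python
import numpy as np

def bhattacharyya_coefficient(p, q):
    """Sum over bins of sqrt(p(u) * q(u)); equals 1 when p and q coincide."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(np.sqrt(p * q)))

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.3, 0.5])

bh_same = bhattacharyya_coefficient(p, p)
bh_diff = bhattacharyya_coefficient(p, q)
print(bh_same, bh_diff)
```

Note the opposite conventions of the two objectives: BH is a similarity to be maximized, whereas KL is a distance to be minimized.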

In Fig. 3, we show some of the results obtained by using MS+BH and TR+KL, respectively. The main advantage of experimenting with such a sequence is that the resulting level surfaces are mostly smooth but with sporadic local extrema. Thus, it is easier to pinpoint the causes of different outcomes. We also examine the values of objective functions explicitly. To do so, we randomly generate 500 initial positions for an arbitrary image frame from the Magnet sequence, then perform optimizations from each position using MS+BH and TR+BH, respectively. The same process is repeated for MS+KL and TR+KL. We then count the number of occurrences of converging to a better objective function value by trust-region. This quantitative analysis is also performed for the other three sequences shown in Fig. 5. Our results, in Fig. 4, indicate that no matter which objective function is used, BH or KL, the probability that a trust-region tracker is more effective is about 90%, by converging to better values 3675 times out of 4000 tests. Note that the efficiency can be further improved by using a quadratic model approximation.

B. Tracking by TR+KL+Edge

We now turn our attention to experimenting with the complete algorithm, i.e., using trust-region with a quadratic model to perform tracking via optimizing with (9). In all our experiments, the RGB space is divided into 16 × 16 × 16 = 4096 bins. Other parameters used include: $\lambda = 0.2$, initial trust-region radius $\Delta_0 = 4$, and typical values of the diagonal of $S_k$ are (10, 10, 1, 1, 0.1). The experiments are carried out on a Pentium-4 2.4GHz PC.

The first sequence is to show the tracker's ability to pursue a fast moving object (a kid jumping around), accounting for the 2-D translation factor only. Note that, in Fig. 5a-5d, the intermediate iterates/ellipses are plotted in green to illustrate the underlying optimization process. In the second experiment, we demonstrate, in Fig. 5e-5h, the effectiveness of optimizing over a 5-dimensional continuous space to capture various changes in the object's scale, shape, and orientation. We emphasize that if a system has a status variable $\zeta$ limited to just some pre-determined discrete values of scale and orientation, it generally could not deliver a comparable performance. The third test is performed using a pan/tilt/zoom camera where the target person in the scene moves back and forth to bring about rapid and substantial changes in the size of the face appearing in the Face sequence. While most tracking-by-distribution systems cannot handle such difficulties, our method addresses the issues of scale robustly as shown in Fig. 5i-5l.

C. Complexity Analysis

Since the algorithm is iterative, and it typically takes just a few iterations to converge, it suffices to analyze the time complexity for one iteration. For frame $t$, let $m$ be the number of pixels within $A_1(\zeta_t)$ and $d$ be the dimensionality of $\zeta$. (In our formulation, $d = 5$.) We first need to compute the color histogram $q(u; \zeta_t)$, the edge density $e(\zeta_t)$ in (6), the gradient $g_k$, and the Hessian matrix $H_k$, which take $O(m)$, $O(m)$, $O(dm)$, and $O(d^2 m)$ time, respectively. Next, it takes $O(n)$ time to evaluate $f_c$ by summing up $p(u) \log (p(u)/q(u))$, and $O(1)$ time for $f_e$. (Recall that $n$ is the number of color bins.) Finally, for solving a trust-region subproblem, since the number of iterations is assumed to be less than a fixed number, the time complexity only depends on the dimensionality $d$. In particular, to find a minimizer of the subproblem, we have to compute $\psi_k$ and $\|s_k\|_k$ in (3), or find the intersection on the region boundary. The first requires $O(d^2)$ time, and the latter two take $O(d)$ computation time. When the quadratic model is non-convex (i.e., $H_k$ is not positive semidefinite), we need extra $O(d^2)$ time to find the other possible iterate. Therefore, the time complexity for one iteration of the complete TR tracking algorithm is $O(d^2 m + n)$.

[Figure 4: two plots of sorted differences over 500 tests (horizontal axis 0 to 500), each with four curves: Magnet, Kid, Hand, Face. (a) TR+Linear+BH vs. MS+BH, vertical axis $f_{MS+BH} - f_{TR+BH}$, range −1 to 1. (b) TR+Linear+KL vs. MS+KL, vertical axis $f_{TR+KL} - f_{MS+KL}$, range −40 to 40.]

Fig. 4. Sorted differences in the objective function values derived by TR+Linear and MS.

D. Discussion

Our approach focuses mainly on two important issues: optimization and representation. Specifically, we have discussed three choices for optimization: line-search, trust-region with a linear-model approximation, and trust-region with a quadratic-model approximation. While the three all have the desired property to converge to a local minimum, we investigate the quality of a solution. To emphasize, we note that a line-search method may fail to converge to a better, nearby extremum due to a crude approximation to the local shape of an objective function. We then compare our method with a well-known mean-shift tracker to demonstrate the advantages of being able to find the iterates in a region and to adjust the size of the region adaptively. Nonetheless, it is difficult to evaluate quantitatively the performances of two different tracking methods because when testing with a video sequence, they often start at different initial positions for each intermediate image frame. Thus, we instead do the quantitative analysis for one arbitrary image frame with randomly generated starting positions. This is equivalent to solving an iterative optimization problem using the two methods, respectively, for each initial value. Such modifications make it possible to analyze the results explicitly, and to further verify that a trust-region implementation for tracking is often more reliable and effective than a line-search one.

Other efforts have been made to design a good representation model. We have formulated a covariance-ellipse representation to integrate color and edge density information. It enables the system to perform optimization over a continuous space to yield more accurate results. Our future work includes extending the framework for multiple-object tracking, and exploiting other possible applications in computer vision using trust-region methods.

Acknowledgments

This work was supported by NSC grants 90-2213-E-001-016 and 91-2213-E-001-023, and in part by the Institute of Information Science, Academia Sinica of Taiwan. H.-T. would like to thank the Foundation for the Advancement of Outstanding Scholarship for a student travel grant.

References

[1] S. Avidan, “Support Vector Tracking,” Proc. Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 184–191, Kauai, Hawaii, 2001.

[2] S.T. Birchfield, “Elliptical Head Tracking Using Intensity Gradients and Color Histograms,” Proc. Conf. Computer Vision and Pattern Recognition, pp. 232–237, Santa Barbara, CA, 1998.

[3] G.R. Bradski, “Computer Vision Face Tracking for Use in a Perceptual User Interface,” Intel Technology Journal, 1998.

[4] H.T. Chen and T.L. Liu, “Trust-Region Methods for Real-Time Tracking,” Proc. Eighth IEEE Int’l Conf. Computer Vision, vol. 2, pp. 717–722, Vancouver, Canada, 2001.

[5] D. Comaniciu, V. Ramesh, and P. Meer, “Real-Time Tracking of Non-Rigid Objects using Mean Shift,” Proc. Conf. Computer Vision and Pattern Recognition, vol. 2, pp. 142–149, Hilton Head Island, South Carolina, 2000.

[6] A.R. Conn, N.I.M. Gould, and P.L. Toint, Trust-Region Methods, SIAM, 2000.

[7] D. Freedman and M.S. Brandstein, “Contour Tracking in Clutter: A Subset Approach,” Int’l J. Computer Vision, vol. 38, no. 2, pp. 173–186, July 2000.

[8] D. Freedman and M.S. Brandstein, “Provably Fast Algorithms for Contour Tracking,” Proc. Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 139–144, Hilton Head Island, South Carolina, 2000.

[9] D. Gavrila and V. Philomin, “Real-Time Object Detection for Smart Vehicles,” Proc. Seventh IEEE Int’l Conf. Computer Vision, pp. 87–93, Corfu, Greece, 1999.

[10] Intel Corporation, Intel Image Processing Library Reference Manual, 2000, Document Number 663791-005.

[11] M. Isard and A. Blake, “Contour Tracking by Stochastic Propagation of Conditional Density,” Proc. Fourth European Conf. Computer Vision, vol. 1, pp. 343–356, Cambridge, England, 1996.

[12] M. Jagersand, O. Fuentes, and R.C. Nelson, “Experimental Evaluation of Uncalibrated Visual Servoing for Precision Manipulation,” Proc. 1997 Int’l Conf. Robotics and Automation, Albuquerque, NM, 1997.

[13] T.Q. Phong, R. Horaud, A. Yassine, and P.D. Tao, “Object Pose from 2-D to 3-D Point and Line Correspondences,” Int’l J. Computer Vision, vol. 15, no. 3, pp. 225–243, July 1995.

[14] C. Rasmussen and G.D. Hager, “Probabilistic Data Association Methods for Tracking Complex Visual Objects,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 6, pp. 560–576, June 2001.

[15] H. Sidenbladh and M.J. Black, “Learning Image Statistics for Bayesian Tracking,” Proc. Eighth IEEE Int’l Conf. Computer Vision, vol. 2, pp. 709–716, Vancouver, Canada, 2001.

[16] C. Sminchisescu and B. Triggs, “Covariance Scaled Sampling for Monocular 3D Body Tracking,” Proc. Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 447–454, Kauai, Hawaii, 2001.

[17] K. Toyama and A. Blake, “Probabilistic Tracking in a Metric Space,” Proc. Eighth IEEE Int’l Conf. Computer Vision, vol. 2, pp. 50–57, Vancouver, Canada, 2001.

[18] Y. Wu and T.S. Huang, “A Co-inference Approach to Robust Visual Tracking,” Proc. Eighth IEEE Int’l Conf. Computer Vision, vol. 2, pp. 26–33, Vancouver, Canada, 2001.

[Figure 5: tracking results on three sequences. (a) Kid #000, (b) Kid #096, (c) Kid #152, (d) Kid #348; (e) Hand #000, (f) Hand #461, (g) Hand #577, (h) Hand #588; (i) Face #000, (j) Face #136, (k) Face #185, (l) Face #415.]

Fig. 5. (a)-(d) Kid sequence: track a target with rapid motion (frame rate: > 200fps). (e)-(h) Hand sequence: track a target with substantial changes in size, shape, and orientation (frame rate: 35fps). (i)-(l) Face sequence: the strength of a trust-region tracker is even more appreciable where a wide range of scales of target face are tracked properly (frame rate: 20fps).
