Advanced Topics in Learning and Vision

Ming-Hsuan Yang

mhyang@csie.ntu.edu.tw

Announcements

• Term project presentation: Dec 28 and Dec 29.

• All critiques are due on Jan 9, 2006 (midnight, Taipei local time). No late critiques will be accepted.

• Final term project report: due on Jan 16, 2006 (midnight, Taipei local time). No late reports will be accepted.

• Supplementary reading:

- M. Black and A. Jepson. EigenTracking: Robust matching and tracking of articulated objects using a view-based representation. International Journal of Computer Vision, vol. 26, no. 1, pp. 63–84, 1998.

- A. Jepson, D. Fleet, and T. El-Maraghi. Robust Online Appearance Models for Visual Tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 10, pp. 1296–1311, 2003.

Overview

• Particle filters

• 3D human tracking

• On-line visual tracking

• Mean shift algorithm and applications

Problems with Particle Filter

• Sampling in high dimensional space is difficult and inefficient.

• Need a lot of particles.

• Curse of dimensionality: particle filters usually do not scale well beyond roughly 10 dimensions.
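The scaling problem can be illustrated with a toy experiment. The sketch below is not from the lecture: it assumes an isotropic Gaussian prior, an isotropic Gaussian likelihood centred at the origin, and uses the effective sample size (ESS) of the importance weights as a rough indicator of how many particles still carry weight. The function names and the chosen standard deviations are illustrative assumptions.

```python
import numpy as np

def effective_sample_size(log_w):
    """ESS = 1 / sum(w_i^2) for normalized importance weights."""
    w = np.exp(log_w - np.max(log_w))  # subtract max for numerical stability
    w /= w.sum()
    return 1.0 / np.sum(w ** 2)

def ess_for_dimension(dim, n_particles=2000, prior_std=1.0, lik_std=0.5, seed=0):
    """Toy setup (assumption): particles drawn from an isotropic Gaussian prior
    are weighted by an isotropic Gaussian likelihood centred at the origin."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, prior_std, size=(n_particles, dim))
    log_w = -0.5 * np.sum(x ** 2, axis=1) / lik_std ** 2
    return effective_sample_size(log_w)

if __name__ == "__main__":
    for d in (1, 5, 10, 20, 30):
        print(d, round(ess_for_dimension(d), 1))
```

With a fixed particle budget, the ESS in this toy setting drops sharply as the dimension grows, which is one way to see why many more particles are needed in high-dimensional state spaces.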

Multiple Hypothesis Particle Filter

• Multiple hypothesis filter (MHF) is a classical approach to representing multi-modal distributions with Kalman filters.

• In the mode-based multiple hypothesis filter [Cham and Rehg CVPR99], each mode is modeled by a truncated Gaussian (as opposed to a set of discrete samples used in the condensation algorithm).

• Sample weights:

$$p(x) = k \max_{i=1,\dots,N} \left\{ p_i \exp\left(-\tfrac{1}{2}(x - m_i)^T S_i^{-1} (x - m_i)\right) \right\}$$

where $p_i$ is the Gaussian mixture weight and $m_i$ is the $i$-th Gaussian center with covariance $S_i$.
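As a minimal sketch of how the max-of-Gaussians expression above could be evaluated, the following function takes the weights $p_i$, centers $m_i$, and covariances $S_i$ as plain arrays. The function name and the default `k=1.0` are assumptions for illustration, not part of the original method description.

```python
import numpy as np

def piecewise_gaussian(x, weights, means, covs, k=1.0):
    """Evaluate p(x) = k * max_i { p_i * exp(-0.5 (x-m_i)^T S_i^{-1} (x-m_i)) }.

    k is left as a caller-supplied constant (assumption: 1.0 by default).
    """
    x = np.asarray(x, dtype=float)
    values = []
    for p_i, m_i, S_i in zip(weights, means, covs):
        d = x - np.asarray(m_i, dtype=float)
        # Squared Mahalanobis distance, computed without forming S_i^{-1}.
        maha = d @ np.linalg.solve(np.asarray(S_i, dtype=float), d)
        values.append(p_i * np.exp(-0.5 * maha))
    return k * max(values)
```

Unlike a standard Gaussian mixture, only the dominant mode contributes at each point, so each mode keeps its own peak value $k\,p_i$ at its center.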

• Use 2D scaled prismatic models with 19 degrees of freedom.

• Observation model:

$$p(z_t \mid x_t) \propto \prod_u \exp\left(-\frac{(I(u) - T(u, x_t))^2}{2}\right) \qquad (1)$$

where $u$ ranges over image pixel coordinates, $I(u)$ is the image pixel value at $u$, and $T(u, x_t)$ is the overlapping template pixel value at $u$ for state vector $x_t$.
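A product of exponentials over many pixels underflows quickly, so in practice one would likely work with the logarithm of Eq. (1). The sketch below assumes the template has already been rendered into image coordinates for a given state $x_t$ (the rendering step is not shown and is hypothetical here), and that a boolean mask marks the pixels where the template overlaps the image.

```python
import numpy as np

def log_observation_likelihood(image, template, mask):
    """Unnormalized log p(z_t | x_t) = -0.5 * sum_u (I(u) - T(u, x_t))^2.

    image, template: arrays of the same shape; template is assumed to be
    already rendered for the state x_t. mask: boolean array selecting the
    overlapping pixels u.
    """
    # Convert to float first so unsigned integer images do not wrap around.
    diff = (image.astype(float) - template.astype(float))[mask]
    return -0.5 * np.sum(diff ** 2)
```

Comparing two candidate states then reduces to comparing their log-likelihoods, since the proportionality constant in Eq. (1) is shared.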

T.-J. Cham and J. Rehg. A multiple hypothesis approach to figure tracking. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 239–245, 1999.

Annealed Particle Filter

• Incorporate the concept of simulated annealing with particle filter [Deutscher et al. CVPR00]

• Based on the Markov chain method of simulated annealing used in optimization [Kirkpatrick et al. 83].

• Handling multiple modes.

• Multi-layer search, akin to coarse to fine search.
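The multi-layer idea can be sketched as a loop over annealing layers: in each layer the particles are weighted by $\exp(-\beta \cdot \text{energy})$, resampled, and diffused with noise. This is a simplified illustration under stated assumptions, not the procedure from [Deutscher et al. CVPR00]: the schedule of `betas` and `noise_stds` is fixed by hand here, and `energy` is any user-supplied cost function.

```python
import numpy as np

def anneal_particles(particles, energy, betas, noise_stds, rng):
    """Run one pass of annealing layers over a particle set.

    particles: array of shape (n, dim). energy: callable mapping one particle
    to a non-negative cost. betas: increasing sharpness values (assumed
    hand-chosen). noise_stds: diffusion noise per layer, typically shrinking.
    """
    for beta, std in zip(betas, noise_stds):
        e = np.array([energy(p) for p in particles])
        log_w = -beta * e
        w = np.exp(log_w - log_w.max())  # stable normalization
        w /= w.sum()
        # Resample in proportion to the weights, then diffuse.
        idx = rng.choice(len(particles), size=len(particles), p=w)
        particles = particles[idx] + rng.normal(0.0, std, size=particles.shape)
    return particles
```

Early layers with small `beta` keep the weighting function broad so that particles can move between modes; later layers sharpen it, which is what makes the search resemble a coarse-to-fine strategy.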

• The 3D human model is based on a kinematic chain with 29 degrees of freedom.

• Each limb is represented by conical sections with elliptical cross sections.

• Uses edge features and background subtraction with calibrated cameras.

$$\Sigma^e(x, z) = \frac{1}{N} \sum_{i=1}^{N} \left(1 - p_i^e(x, z)\right)^2 \qquad (2)$$

where $x$ is the model state vector and $z$ is the image from which the pixel map is derived. $p_i^e(x, z)$ are the values of the edge pixel map at the $N$ sampling points taken along the model's silhouette.

• Likewise,

$$\Sigma^r(x, z) = \frac{1}{N} \sum_{i=1}^{N} \left(1 - p_i^r(x, z)\right)^2 \qquad (3)$$

where $p_i^r(x, z)$ are the values of the foreground pixel map at the $N$ sampling points taken from the interior of the conical sections.

• Sample weight:

$$w(x, z) = \exp\left(-\left(\Sigma^e(x, z) + \Sigma^r(x, z)\right)\right)$$

where $x$ is the model configuration vector and $z$ is the image observation (edge $e$ and region $r$).

• Can incorporate observations from multiple cameras.
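The edge and region terms and the resulting sample weight can be sketched as follows. The pixel-map values are assumed to be already sampled at the $N$ model points and to lie in $[0, 1]$. The multi-camera variant sums the per-camera terms inside the exponent, which is one plausible way to combine views and is stated here as an assumption.

```python
import numpy as np

def sum_squared_term(p):
    """(1/N) * sum_i (1 - p_i)^2 for pixel-map values p_i in [0, 1]."""
    p = np.asarray(p, dtype=float)
    return np.mean((1.0 - p) ** 2)

def sample_weight(p_edge, p_region):
    """w = exp(-(edge term + region term)) for a single camera."""
    return np.exp(-(sum_squared_term(p_edge) + sum_squared_term(p_region)))

def multi_camera_weight(per_camera):
    """per_camera: list of (p_edge, p_region) pairs, one per camera.

    Assumption: the per-camera terms are summed before exponentiating.
    """
    total = sum(sum_squared_term(pe) + sum_squared_term(pr)
                for pe, pr in per_camera)
    return np.exp(-total)
```

A configuration whose silhouette lies entirely on edges and whose interior lies entirely on foreground gets the maximum weight of 1.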

J. Deutscher, A. Blake, and I. Reid. Articulated body motion capture by annealed particle filtering. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 126–133, 2000.

Tracking Loose Limbed People

• A body is represented by a graphical model [Sigal et al. NIPS03]:

- node: body part (torso, upper leg, etc)

- edge: spatial and angular constraints between two adjacent body parts

• Each body part is modeled by 5 fixed parameters (lengths and widths at proximal and distal ends, and offsets) and 6 estimated parameters (3 global positions and 3 angular orientations).

• Each directed edge between parts i and j is modeled by a conditional distribution ψij(xi, xj).

• Conditional distributions capture physical constraints and can be learned from motion capture data or constructed by hand.

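$$\psi_{ij}(x_i, x_j) = \lambda_0\, \mathcal{N}(x_j; \mu_{ij}, \lambda_{ij}) + (1 - \lambda_0) \sum_{m=1}^{M_{ij}} \delta_{ijm}\, \mathcal{N}(x_j; F_{ijm}(x_i), G_{ijm}(x_i)) \qquad (4)$$

where $\lambda_0$ is a fixed outlier probability, and $\mu_{ij}$ and $\lambda_{ij}$ are the mean and covariance of the Gaussian outlier process. $F_{ijm}(\cdot)$ and $G_{ijm}(\cdot)$ give the mean and covariance of the $m$-th Gaussian mixture component, which has weight $\delta_{ijm}$.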
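To make the structure of Eq. (4) concrete, the sketch below evaluates the conditional as an outlier Gaussian plus a weighted mixture whose means and covariances depend on $x_i$. The helper `gaussian_pdf` and the argument names are illustrative assumptions; `F` and `G` are passed as lists of callables standing in for $F_{ijm}(\cdot)$ and $G_{ijm}(\cdot)$.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Multivariate normal density N(x; mean, cov)."""
    x = np.atleast_1d(x).astype(float)
    mean = np.atleast_1d(mean).astype(float)
    cov = np.atleast_2d(cov).astype(float)
    d = x - mean
    norm = np.sqrt((2.0 * np.pi) ** d.size * np.linalg.det(cov))
    return np.exp(-0.5 * d @ np.linalg.solve(cov, d)) / norm

def pairwise_potential(x_i, x_j, lam0, mu, outlier_cov, deltas, F, G):
    """psi_ij(x_i, x_j): outlier term plus mixture conditioned on x_i.

    deltas: mixture weights (assumed to sum to 1).
    F, G: lists of callables returning the component mean and covariance.
    """
    outlier = lam0 * gaussian_pdf(x_j, mu, outlier_cov)
    mixture = sum(d * gaussian_pdf(x_j, f(x_i), g(x_i))
                  for d, f, g in zip(deltas, F, G))
    return outlier + (1.0 - lam0) * mixture
```

The outlier term keeps the potential from vanishing when the two parts are far from any learned configuration, which matters when one of the part estimates is wrong.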

• Image likelihood is based on multi-scale edge and ridge filter responses.

• Because each configuration vector $x_i$ has 6 continuous dimensions, approximating it with a discretized state space makes the traditional belief propagation algorithm inefficient.

• Non-parametric belief propagation [Isard CVPR03] [Sudderth et al. CVPR 03] is a generalized particle filter algorithm that operates on continuous valued random variables.

• Non-parametric belief propagation treats the particle set as an approximation to the distribution at each node and forms each message from the product of the incoming messages.

$$m_{ij}(x_j) = \int \psi_{ij}(x_i, x_j)\, \phi(x_i) \prod_{k \in A_i,\, k \neq j} m_{ki}(x_i)\, dx_i \qquad (5)$$

where $A_i$ is the set of neighbors of node $i$ and $\phi(x_i)$ is the local likelihood associated with node $i$.

• The message $m_{ij}(x_j)$ can be approximated by importance sampling from a proposal function $f(x_i)$.

• See [Isard CVPR03] [Sudderth et al. CVPR 03] for details on non-parametric belief propagation.
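The importance-sampling idea can be illustrated with a plain Monte Carlo estimate of the message integral at a single value of $x_j$. This is a one-dimensional toy sketch, not the non-parametric belief propagation algorithm itself (which represents messages as kernel density estimates). All function and argument names are assumptions.

```python
import numpy as np

def normal_pdf(x, mean, std):
    """Univariate normal density, vectorized over x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def estimate_message(x_j, psi, phi, incoming, proposal_sample, proposal_pdf,
                     n_samples, rng):
    """Estimate m_ij(x_j) = integral of psi(x_i, x_j) phi(x_i) prod_k m_ki(x_i).

    Samples x_i are drawn from the proposal f and reweighted by
    phi(x_i) * prod_k m_ki(x_i) / f(x_i).
    """
    x_i = proposal_sample(rng, n_samples)
    w = phi(x_i) / proposal_pdf(x_i)
    for m_ki in incoming:  # messages from the other neighbours of node i
        w = w * m_ki(x_i)
    return np.mean(w * psi(x_i, x_j))
```

The estimate is only as good as the proposal: if $f(x_i)$ places little mass where the integrand is large, the weights become highly variable.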

• Extended to body tracking over time [Sigal et al. CVPR04].

• Able to detect body parts automatically.

• Tracking body parts over time.

L. Sigal, M. Isard, B. Sigelman, and M. Black. Attractive people: Assembling loose-limbed models using non-parametric belief propagation. Advances in Neural Information Processing Systems, pp. 1539–1546, MIT Press, 2004.

L. Sigal, S. Bhatia, S. Roth, M. Black, and M. Isard. Tracking loose-limbed people. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 421–428, 2004.

Scaled Covariance Sampling

• Instead of sampling from an isotropic Gaussian distribution, scale each Gaussian along the eigenvectors of its covariance.

• Reminiscent of the Mahalanobis distance metric.

• 1. Condensation (dashed circle) randomizes each sample by dynamic noise. 2. MHF (solid circle) samples within covariance support (dashed circle) and applies the same noise model. 3. Covariance scaled sampling (pattern ellipse) focuses on good cost minima (flat filled ellipses) by inflating the highly uncertain region (dashed ellipse).
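A minimal sketch of the sampling step, assuming the mode's mean and covariance are already known: samples are drawn along the covariance eigenvectors, with each axis stretched by a common inflation factor. The function name and the `scale` parameter are illustrative assumptions.

```python
import numpy as np

def covariance_scaled_samples(mean, cov, scale, n, rng):
    """Draw n samples from N(mean, scale^2 * cov) via the eigen-decomposition.

    Directions with large eigenvalues (high uncertainty) receive
    proportionally wider sampling than well-constrained directions.
    """
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative values
    z = rng.standard_normal((n, len(mean)))
    # Scale each principal axis, then rotate back to the original coordinates.
    return mean + (z * (scale * np.sqrt(eigvals))) @ eigvecs.T
```

Compared with adding isotropic noise, this spends the sample budget along the directions in which the cost surface is flat, where neighbouring minima are more likely to be found.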

• Scaled covariance sampling algorithm:

C. Sminchisescu and B. Triggs. Covariance scaled sampling for monocular 3D body tracking. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 447–454, 2001.

3D Human Tracking with Coordinated Mixture of Factor Analyzers

• Instead of drawing samples directly from the high dimensional space, learn the nonlinear manifold structure from motion capture data.

• Learn the bi-directional nonlinear projection function using a mixture of factor analyzers within a global coordinate system [Teh and Roweis NIPS02].

• Mixture of factor analyzers concurrently carries out dimensionality reduction and clustering.
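The bi-directional mapping can be illustrated for a single factor analyzer, which is the building block of the mixture. This sketch covers only one component and omits the clustering and the global coordination step. It uses the standard factor analysis model $x = \mu + \Lambda z + \epsilon$ with $z \sim \mathcal{N}(0, I)$ and diagonal noise $\Psi$. Function names are assumptions.

```python
import numpy as np

def latent_to_data(z, mu, Lambda):
    """Noise-free generative direction of one factor analyzer: x = mu + Lambda z."""
    return mu + Lambda @ z

def data_to_latent(x, mu, Lambda, psi):
    """Posterior mean E[z | x] with z ~ N(0, I) and noise covariance diag(psi).

    Uses E[z | x] = (I + L^T Psi^-1 L)^-1 L^T Psi^-1 (x - mu).
    """
    Lt_Pinv = Lambda.T / psi  # Lambda^T Psi^{-1} for diagonal Psi
    A = np.eye(Lambda.shape[1]) + Lt_Pinv @ Lambda
    return np.linalg.solve(A, Lt_Pinv @ (x - mu))
```

In a full mixture, each component would have its own $\mu$, $\Lambda$, and $\Psi$, and a data point would be mapped using the components weighted by their posterior responsibilities.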

• Traverse sample trajectory along the nonlinear manifold.

• Able to draw samples using multiple hypothesis filter akin to [Cham and Rehg CVPR99].

• Compared with results obtained using GPLVM [Urtasun et al. ICCV05], the annealed particle filter [Deutscher et al. CVPR00], and a simple particle filter.

Other Particle Filters

• Joint probabilistic data association filter (JPDAF).

• Unscented particle filter.

• ...
