# Yung-Yu Chuang Digital Visual Effects Features


(1)

### Features

Digital Visual Effects Yung-Yu Chuang

with slides by Trevor Darrell Cordelia Schmid, David Lowe, Darya Frolova, Denis Simakov, Robert Collins and Jiwon Kim

(2)

### Outline

• Features

• Harris corner detector

• SIFT

• Extensions

• Applications

(3)

(4)

### Features

• Also known as interest points, salient points or keypoints: points whose correspondences in multiple images can be easily identified using only local information.

?

(5)

### Desired properties for features

• Distinctive: a single feature can be correctly matched with high probability.

• Invariant: invariant to scale, rotation, affine distortion, illumination and noise, so that matching is robust across a substantial range of viewpoint change. That is, it is repeatable.

(6)

### Applications

• Object or scene recognition

• Structure from motion

• Stereo

• Motion tracking

• …

(7)

### Components

• Feature detection locates where they are

• Feature description describes what they are

• Feature matching decides whether two are the same one

(8)

(9)

### Moravec corner detector (1980)

• We should easily recognize the point by looking through a small window

• Shifting a window in any direction should give a large change in intensity

(10)

[Figure: shifting-window responses for flat, edge, corner, and isolated-point regions]

(14)

### Moravec corner detector

Change of intensity for the shift [u,v]:

$$E(u,v) = \sum_{x,y} w(x,y)\,[I(x+u,\,y+v) - I(x,y)]^2$$

where $w(x,y)$ is the (binary) window function, $I(x+u,y+v)$ is the shifted intensity and $I(x,y)$ the original intensity.

Four shifts are tested: (u,v) = (1,0), (1,1), (0,1), (-1,1). Look for local maxima in min{E}.
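As a concrete illustration, the Moravec measure can be written in a few lines of NumPy. This is a minimal sketch (the function names and toy image below are our own, not from the original lecture):

```python
import numpy as np

def moravec_E(img, x, y, u, v, win=1):
    """Sum of squared differences between a window centered at (x, y)
    and the same window shifted by (u, v); binary window of half-size win."""
    a = img[y - win:y + win + 1, x - win:x + win + 1].astype(float)
    b = img[y + v - win:y + v + win + 1,
            x + u - win:x + u + win + 1].astype(float)
    return float(np.sum((b - a) ** 2))

def moravec_response(img, x, y):
    # Minimum E over the four tested shifts; corners give large values.
    shifts = [(1, 0), (1, 1), (0, 1), (-1, 1)]
    return min(moravec_E(img, x, y, u, v) for u, v in shifts)
```

On a synthetic step corner, the response is zero in flat regions and positive at the corner, matching the intuition above.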

(15)

### Problems of Moravec detector

• Noisy response due to a binary window function

• Only shifts at every 45 degrees are considered

• Only the minimum of E is taken into account

Harris corner detector (1988) solves these problems.

(16)

### Harris corner detector

Noisy response due to a binary window function
→ Use a Gaussian window function

(17)

### Harris corner detector

Only shifts at every 45 degrees are considered
→ Consider all small shifts via Taylor expansion

(18)

### Harris corner detector

Only shifts at every 45 degrees are considered
→ Consider all small shifts via Taylor expansion:

$$E(u,v) = \sum_{x,y} w(x,y)\,[I(x+u,\,y+v) - I(x,y)]^2 \approx Au^2 + 2Cuv + Bv^2$$

where

$$A = \sum_{x,y} w(x,y)\,I_x^2(x,y), \qquad B = \sum_{x,y} w(x,y)\,I_y^2(x,y), \qquad C = \sum_{x,y} w(x,y)\,I_x(x,y)\,I_y(x,y)$$

(19)

### Harris corner detector

Equivalently, for small shifts [u,v] we have a bilinear approximation:

$$E(u,v) \approx [u\;\;v]\,M\begin{bmatrix} u \\ v \end{bmatrix}$$

where M is a 2×2 matrix computed from image derivatives:

$$M = \sum_{x,y} w(x,y)\begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$

(20)


(21)

### Harris corner detector

Only the minimum of E is taken into account
→ Use a new corner measure based on the shape of the error function.

$$E(u,v) \approx [u\;\;v]\,M\begin{bmatrix} u \\ v \end{bmatrix}$$ represents a quadratic function, so we can analyze E's shape by looking at the properties of M.

(22)

### Harris corner detector

High-level idea: what shape of the error function will we prefer for features?

[Figure: error surfaces E(u,v) for flat, edge, and corner regions]

(23)

• Quadratic form (homogeneous polynomial of degree two) of n variables $x_i$:

$$f(\mathbf{x}) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}\,x_i x_j$$

(24)

### Symmetric matrices

• Quadratic forms can be represented by a real symmetric matrix A, where $f(\mathbf{x}) = \mathbf{x}^T A\,\mathbf{x}$

(25)

(26)

(27)

A real symmetric matrix has the spectral decomposition $A = Q\Lambda Q^T = \lambda_1 \mathbf{q}_1\mathbf{q}_1^T + \lambda_2 \mathbf{q}_2\mathbf{q}_2^T + \cdots$, with orthonormal eigenvectors $\mathbf{q}_i$ and eigenvalues $\lambda_i$.

(28)

### Harris corner detector

Intensity change in the shifting window: eigenvalue analysis

$\lambda_1, \lambda_2$ – eigenvalues of M

The ellipse $E(u,v) = \text{const}$ has its axes along the directions of fastest and slowest change, with axis lengths $(\lambda_{\max})^{-1/2}$ and $(\lambda_{\min})^{-1/2}$ respectively.

(29)

$$A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}^{T}$$

(30)

$$A = \begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}^{T}$$

(31)

$$A = \begin{bmatrix} 1.75 & 1.30 \\ 1.30 & 3.25 \end{bmatrix} = \begin{bmatrix} 0.50 & -0.87 \\ 0.87 & 0.50 \end{bmatrix}\begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0.50 & -0.87 \\ 0.87 & 0.50 \end{bmatrix}^{T}$$

(32)

$$A = \begin{bmatrix} 3.25 & 3.90 \\ 3.90 & 7.75 \end{bmatrix} = \begin{bmatrix} 0.50 & -0.87 \\ 0.87 & 0.50 \end{bmatrix}\begin{bmatrix} 10 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0.50 & -0.87 \\ 0.87 & 0.50 \end{bmatrix}^{T}$$

(33)

### Harris corner detector

Classification of image points using the eigenvalues of M:

• Corner: $\lambda_1$ and $\lambda_2$ are large, $\lambda_1 \sim \lambda_2$; E increases in all directions

• Edge: $\lambda_1 \gg \lambda_2$ or $\lambda_2 \gg \lambda_1$

• Flat: $\lambda_1$ and $\lambda_2$ are small; E is almost constant in all directions

(34)

### Harris corner detector

Measure of corner response:

$$R = \det M - k\,(\operatorname{trace} M)^2$$

$$\det M = \lambda_1\lambda_2, \qquad \operatorname{trace} M = \lambda_1 + \lambda_2$$

(k – empirical constant, k = 0.04-0.06)

For reference only (not needed to compute R), the eigenvalues of $M = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix}$ are

$$\lambda_{1,2} = \frac{(a_{00}+a_{11}) \pm \sqrt{4a_{01}a_{10} + (a_{00}-a_{11})^2}}{2}$$

(35)

(36)

(37)

(38)

(39)

### Summary of Harris detector

1. Compute x and y derivatives of image

2. Compute products of derivatives at every pixel

3. Compute the sums of the products of derivatives at each pixel

$$I_x = G^x_\sigma * I, \qquad I_y = G^y_\sigma * I$$

$$I_{x^2} = I_x \cdot I_x, \qquad I_{y^2} = I_y \cdot I_y, \qquad I_{xy} = I_x \cdot I_y$$

$$S_{x^2} = G_{\sigma'} * I_{x^2}, \qquad S_{y^2} = G_{\sigma'} * I_{y^2}, \qquad S_{xy} = G_{\sigma'} * I_{xy}$$

(40)

### Summary of Harris detector

4. Define the matrix at each pixel

$$H(x,y) = \begin{bmatrix} S_{x^2}(x,y) & S_{xy}(x,y) \\ S_{xy}(x,y) & S_{y^2}(x,y) \end{bmatrix}$$

5. Compute the response of the detector at each pixel

$$R = \det H - k\,(\operatorname{trace} H)^2$$

6. Threshold on the value of R and apply non-maximum suppression.
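The steps above can be sketched in NumPy/SciPy. This is a minimal sketch, assuming a Gaussian window and k = 0.04; thresholding and non-max suppression (step 6) are left to the caller:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(img, sigma=1.0, k=0.04):
    """Harris corner response R = det(H) - k * trace(H)^2 at every pixel."""
    img = img.astype(float)
    # 1. x and y derivatives (central differences)
    Iy, Ix = np.gradient(img)
    # 2. products of derivatives at every pixel
    Ix2, Iy2, Ixy = Ix * Ix, Iy * Iy, Ix * Iy
    # 3. Gaussian-weighted sums of the products
    Sx2 = gaussian_filter(Ix2, sigma)
    Sy2 = gaussian_filter(Iy2, sigma)
    Sxy = gaussian_filter(Ixy, sigma)
    # 4.-5. response of the 2x2 matrix H = [[Sx2, Sxy], [Sxy, Sy2]]
    det = Sx2 * Sy2 - Sxy * Sxy
    trace = Sx2 + Sy2
    return det - k * trace ** 2
```

On a synthetic step corner, R is zero in flat regions, negative along edges, and peaks near the corner.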

(41)

(42)

(43)

(44)

(45)

(46)

### Corner detection demo

http://www.cim.mcgill.ca/~dparks/CornerDetector/mainApplet.htm

(47)

### Harris detector: summary

• Average intensity change in direction [u,v] can be expressed as a bilinear form:

$$E(u,v) \approx [u\;\;v]\,M\begin{bmatrix} u \\ v \end{bmatrix}$$

• Describe a point in terms of the eigenvalues of M; measure of corner response:

$$R = \lambda_1\lambda_2 - k(\lambda_1 + \lambda_2)^2$$

• A good (corner) point should have a large intensity change in all directions, i.e. R should be large positive

(48)

### Now we know where features are

• But how do we match them?

• What is the descriptor for a feature? The simplest solution is the intensities of its spatial neighbors, flattened into a vector; this might not be robust to brightness change or small shift/rotation.

$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \rightarrow \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \end{pmatrix}$$
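A minimal sketch of this simplest descriptor (flattened neighborhood intensities, compared by sum of squared differences); the helper names are our own:

```python
import numpy as np

def patch_descriptor(img, x, y, half=1):
    """Flatten the intensities of the (2*half+1)^2 spatial neighborhood
    of (x, y) into a vector (row-major order)."""
    return img[y - half:y + half + 1, x - half:x + half + 1].ravel().astype(float)

def ssd(d1, d2):
    # Sum of squared differences between two descriptors; lower is better.
    return float(np.sum((d1 - d2) ** 2))
```

For the 3x3 example above, the descriptor of the center pixel is exactly the vector (1, 2, ..., 9).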

(49)

### Harris detector: some properties

• Partial invariance to affine intensity change

 Only derivatives are used => invariance to intensity shift I → I + b

 Intensity scale: I → a·I

[Plots: corner response R vs. image coordinate x, with threshold, before and after intensity scaling]

(50)

### Harris Detector: Some Properties

• Rotation invariance

Ellipse rotates but its shape (i.e. eigenvalues) remains the same

Corner response R is invariant to image rotation

(51)

### Harris Detector is rotation invariant

Repeatability rate = # correspondences / # possible correspondences

(52)

### Harris Detector: Some Properties

• But: not invariant to image scale!

At the finer scale, all points along the contour are classified as edges; only at the appropriate scale is the corner detected.

(53)

### Harris detector: some properties

• Quality of Harris detector for different scale changes

Repeatability rate = # correspondences / # possible correspondences

(54)

### Scale invariant detection

• Consider regions (e.g. circles) of different sizes around a point

• Regions of corresponding sizes will look the same in both images

(55)

### Scale invariant detection

• The problem: how do we choose corresponding circles independently in each image?

• Aperture problem

(56)

(57)

### SIFT

• SIFT is a carefully designed procedure with empirically determined parameters for extracting invariant and distinctive features.

(58)

### SIFT stages:

• Scale-space extrema detection

• Keypoint localization

• Orientation assignment

• Keypoint descriptor

SIFT provides both a detector and a local descriptor. A 500x500 image gives about 2000 features.

(59)

### 1. Detection of scale-space extrema

• For scale invariance, search for stable features across all possible scales using a continuous function of scale, called scale space.

• SIFT uses the DoG filter for its scale space because it is efficient and nearly as stable as the scale-normalized Laplacian of Gaussian.

(60)

### DoG filtering

Convolution with a variable-scale Gaussian:

$$L(x,y,\sigma) = G(x,y,\sigma) * I(x,y)$$

Difference-of-Gaussian (DoG) filter and convolution with it:

$$D(x,y,\sigma) = \big(G(x,y,k\sigma) - G(x,y,\sigma)\big) * I(x,y) = L(x,y,k\sigma) - L(x,y,\sigma)$$
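A sketch of DoG filtering with SciPy's Gaussian blur; the default k = 2^(1/3) assumes s = 3 scales per octave:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog(img, sigma, k=2 ** (1 / 3)):
    """Difference-of-Gaussian: D = L(k*sigma) - L(sigma), where L is the
    image blurred with a variable-scale Gaussian."""
    L1 = gaussian_filter(img.astype(float), sigma)
    L2 = gaussian_filter(img.astype(float), k * sigma)
    return L2 - L1
```

Since both blurs preserve constants, the DoG of a uniform image is zero everywhere (it is a band-pass filter).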

(61)

### Scale space

• σ doubles for the next octave

• k = 2^(1/s), where s is the number of scales per octave

• Dividing into octaves is for efficiency only.
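The scale schedule implied by k = 2^(1/s) and per-octave doubling can be sketched as follows (a hypothetical helper, assuming the base σ₀ = 1.6 from the pre-smoothing slide):

```python
def octave_sigmas(sigma0=1.6, s=3, octaves=2):
    """Scale schedule: within an octave, sigma is multiplied by k = 2^(1/s)
    at each step; after s steps sigma has doubled, starting the next octave."""
    k = 2 ** (1 / s)
    return [[sigma0 * (2 ** o) * (k ** i) for i in range(s + 1)]
            for o in range(octaves)]
```

With s = 3, the last scale of one octave equals the first scale of the next (sigma has exactly doubled).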

(62)

(63)

### Keypoint localization

A sample point X is selected if it is larger or smaller than all 26 neighbors in its 3x3x3 scale-space neighborhood (8 in its own DoG image, 9 in the scale above and 9 in the scale below)
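The 26-neighbor test can be sketched as a hypothetical helper operating on three adjacent DoG levels:

```python
import numpy as np

def is_extremum(dog_prev, dog_cur, dog_next, y, x):
    """True if dog_cur[y, x] is larger or smaller than all 26 neighbors
    in the 3x3x3 scale-space cube around it."""
    cube = np.stack([d[y - 1:y + 2, x - 1:x + 2]
                     for d in (dog_prev, dog_cur, dog_next)])
    v = dog_cur[y, x]
    others = np.delete(cube.ravel(), 13)  # drop the center sample itself
    return bool(np.all(v > others) or np.all(v < others))
```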

(64)

### Decide scale sampling frequency

• It is impossible to sample the whole space; trade off efficiency against completeness.

• Decide the best sampling frequency by experimenting on 32 real images subjected to synthetic transformations (rotation, scaling, affine stretch, brightness and contrast change, added noise, …).

(65)

(66)

### Decide scale sampling frequency

s = 3 is the best choice; for larger s, the detector produces too many unstable features.

[Plots: detector repeatability and descriptor distinctiveness vs. number of scales sampled per octave]

(67)

### Pre-smoothing

 =1.6, plus a double expansion

(68)

(69)

### 2. Accurate keypoint localization

• Reject points with low contrast (flat) and poorly localized along an edge (edge)

• Fit a 3D quadratic function for sub-pixel maxima

[Figure: 1D illustration with sample values 1, 6, 5 at positions -1, 0, +1]

(70)

### 2. Accurate keypoint localization

• Reject points with low contrast (flat) and poorly localized along an edge (edge)

• Fit a 3D quadratic function for sub-pixel maxima

1D illustration with sample values f(−1) = 1, f(0) = 6, f(+1) = 5:

$$f(x) \approx f(0) + f'(0)\,x + \frac{1}{2}f''(0)\,x^2 = 6 + 2x - 3x^2$$

$$f'(x) = 2 - 6x = 0 \;\Rightarrow\; \hat{x} = \frac{1}{3}, \qquad f(\hat{x}) = 6\tfrac{1}{3}$$
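The same 1D sub-pixel refinement as code; `subpixel_peak` is a hypothetical helper that reproduces the arithmetic of this example:

```python
def subpixel_peak(fm1, f0, fp1):
    """Fit f(x) ~ f(0) + f'(0) x + f''(0) x^2 / 2 through samples at
    x = -1, 0, +1; return the offset of the extremum and its value."""
    d1 = (fp1 - fm1) / 2.0      # f'(0), central difference
    d2 = fp1 + fm1 - 2.0 * f0   # f''(0)
    x_hat = -d1 / d2            # where the fitted parabola is extremal
    return x_hat, f0 + 0.5 * d1 * x_hat
```

For the sample values 1, 6, 5 this returns the offset 1/3 and the refined value 6 1/3, as derived above.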

(71)

### 2. Accurate keypoint localization

• Taylor series of several variables

• Two variables:

$$f(x,y) \approx f(0,0) + \frac{\partial f}{\partial x}x + \frac{\partial f}{\partial y}y + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}x^2 + \frac{\partial^2 f}{\partial x\,\partial y}xy + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}y^2$$

(72)

### Accurate keypoint localization

• Taylor expansion in matrix form: x is a vector, f maps x to a scalar

$$f(\mathbf{x}) \approx f(\mathbf{0}) + \left(\frac{\partial f}{\partial \mathbf{x}}\right)^{T}\mathbf{x} + \frac{1}{2}\,\mathbf{x}^{T}\,\frac{\partial^2 f}{\partial \mathbf{x}^2}\,\mathbf{x}$$

where $\partial f/\partial \mathbf{x}$ is the gradient vector and $\partial^2 f/\partial \mathbf{x}^2$ is the Hessian matrix of second partial derivatives.

(73)

(74)

(75)

Setting the derivative of this quadratic approximation to zero gives the sub-pixel offset

$$\hat{\mathbf{x}} = -\left(\frac{\partial^2 D}{\partial \mathbf{x}^2}\right)^{-1}\frac{\partial D}{\partial \mathbf{x}}$$
(80)

### Accurate keypoint localization

• x is a 3-vector

• Change sample point if offset is larger than 0.5

• Throw out low contrast (<0.03)

(81)

### Accurate keypoint localization

• Throw out low contrast: reject the keypoint if $|D(\hat{\mathbf{x}})| < 0.03$, where

$$D(\hat{\mathbf{x}}) = D + \frac{1}{2}\left(\frac{\partial D}{\partial \mathbf{x}}\right)^{T}\hat{\mathbf{x}}$$

(82)

### Eliminating edge responses

Compute the Hessian matrix at the keypoint location:

$$H = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix}$$

Let r be the ratio of the larger eigenvalue to the smaller one (r = 10). Keep the points with

$$\frac{(\operatorname{trace} H)^2}{\det H} < \frac{(r+1)^2}{r}$$
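A sketch of this curvature test; `passes_edge_test` is a hypothetical helper taking the second derivatives of D at the keypoint:

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep a keypoint only if trace(H)^2 / det(H) < (r+1)^2 / r, i.e.
    the ratio of principal curvatures does not exceed r."""
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:  # curvatures of opposite sign: not a clean peak
        return False
    return tr * tr / det < (r + 1) ** 2 / r
```

An isotropic peak passes easily; an edge-like response with one curvature far larger than the other is rejected.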

(83)

(84)

(85)

### Keypoint detector

233x89 image: 832 extrema → 729 after contrast filtering → 536 after curvature filtering

(86)

### 3. Orientation assignment

• By assigning a consistent orientation, the keypoint descriptor can be made orientation invariant.

• For a keypoint, L is the Gaussian-smoothed image with the closest scale. Gradient magnitude and orientation are computed from (Lx, Ly):

$$m(x,y) = \sqrt{L_x^2 + L_y^2}, \qquad \theta(x,y) = \tan^{-1}(L_y/L_x)$$

and accumulated into an orientation histogram (36 bins).

(87)

(88)

(89)

(90)

### Orientation assignment

σ=1.5*scale of the keypoint

(91)

(92)

(93)

### Orientation assignment

The accurate peak position is determined by fitting a parabola to the three histogram values closest to the peak.

(94)

### Orientation assignment

• 36-bin orientation histogram over 360°, weighted by m and a Gaussian falloff with σ = 1.5 x scale

• The highest peak gives the keypoint orientation

• Any other local peak within 80% of the highest peak creates an additional orientation for the keypoint

• About 15% of keypoints have multiple orientations, and they contribute a lot to stability
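The histogram construction and the 80% multiple-orientation rule can be sketched as follows (hypothetical helpers; binning is simplified to nearest-bin voting, without Lowe's interpolation):

```python
import numpy as np

def orientation_histogram(Lx, Ly, weights=None, bins=36):
    """Accumulate gradient orientations into a 36-bin histogram over 360
    degrees, weighted by gradient magnitude (times an optional falloff)."""
    m = np.sqrt(Lx ** 2 + Ly ** 2)
    theta = np.degrees(np.arctan2(Ly, Lx)) % 360.0
    if weights is not None:
        m = m * weights
    hist = np.zeros(bins)
    idx = (theta // (360.0 / bins)).astype(int) % bins
    np.add.at(hist, idx.ravel(), m.ravel())
    return hist

def keypoint_orientations(hist, peak_ratio=0.8):
    # Local peaks within 80% of the maximum each yield an orientation.
    n = len(hist)
    return [i for i in range(n)
            if hist[i] >= peak_ratio * hist.max()
            and hist[i] >= hist[i - 1] and hist[i] >= hist[(i + 1) % n]]
```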

(95)

(96)

### 4. Local image descriptor

• Thresholded image gradients are sampled over a 16x16 array of locations in scale space

• Create an array of orientation histograms (w.r.t. the keypoint orientation)

• 8 orientations x 4x4 histogram array = 128 dimensions

• Normalize, clip values larger than 0.2, and renormalize

σ=0.5*width
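The normalize / clip / renormalize step as a minimal sketch (the 0.2 clipping threshold follows the text above):

```python
import numpy as np

def normalize_descriptor(vec, clip=0.2):
    """SIFT descriptor post-processing: normalize to unit length, clip
    entries above 0.2 to limit the influence of large gradient magnitudes
    (non-linear illumination effects), then renormalize."""
    v = np.asarray(vec, dtype=float)
    v = v / np.linalg.norm(v)
    v = np.minimum(v, clip)
    return v / np.linalg.norm(v)
```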

(97)

(98)

(99)

### Feature matching

• For a feature x, find the closest feature x1 and the second closest feature x2 in the other image. If the distance ratio d(x, x1) / d(x, x2) is smaller than 0.8, the match is accepted.
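The ratio test as a sketch over descriptor arrays (brute-force distances for clarity; in practice a k-d tree with Best Bin First replaces the linear scan):

```python
import numpy as np

def ratio_test_match(desc1, desc2, ratio=0.8):
    """For each descriptor in desc1, find the nearest and second-nearest
    descriptors in desc2; accept the match only if the distance ratio
    d(x, x1) / d(x, x2) is below the threshold."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        j1, j2 = np.argsort(dists)[:2]
        if dists[j1] < ratio * dists[j2]:
            matches.append((i, int(j1)))
    return matches
```

An unambiguous nearest neighbor is accepted; two near-equidistant candidates are rejected as unreliable.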

(100)

(101)

(102)

(103)

(104)

(105)
(106)

### Estimated rotation

• Computed affine transformation from rotated image to original image:

      0.7060  -0.7052   128.4230
      0.7057   0.7100  -128.9491
      0        0           1.0000

• Actual transformation from rotated image to original image:

      0.7071  -0.7071   128.6934
      0.7071   0.7071  -128.6934
      0        0           1.0000

(107)

(108)

(109)

### PCA-SIFT

• Only change step 4

• Pre-compute an eigen-space for local gradient patches of size 41x41

• 2x39x39=3042 elements

• Only keep 20 components

• A more compact descriptor

(110)

• 17 location bins x 16 orientation bins

• Analyze the 17x16 = 272-d histogram in an eigen-space and keep 128 components

• SIFT is still considered the best.

(111)

### Multi-Scale Oriented Patches

• Simpler than SIFT. Designed for image matching. [Brown, Szeliski, Winder, CVPR'2005]

• Feature detector

– Multi-scale Harris corners

– Orientation from blurred gradient

– Geometrically invariant to rotation

• Feature descriptor

– Bias/gain normalized sampling of local patch (8x8)

– Photometrically invariant to affine changes in intensity

(112)

### Multi-Scale Harris corner detector

• Image stitching is mostly concerned with matching images that have the same scale, so a sub-octave pyramid might not be necessary.

(113)

### Multi-Scale Harris corner detector

Corner detection function (harmonic mean of the eigenvalues):

$$f_{HM} = \frac{\det M}{\operatorname{trace} M} = \frac{\lambda_1\lambda_2}{\lambda_1 + \lambda_2}$$

Pick local maxima over 3x3 neighborhoods with response larger than 10.

(114)

### Keypoint detection function

Experiments show roughly the same performance.

(115)

### Non-maximal suppression

• Restrict the maximal number of interest points, while keeping them spatially well distributed

• Only retain maxima within a neighborhood of radius r

• Sort the points by strength and decrease r from infinity until the desired number of keypoints (500) is reached
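The adaptive radius scheme can be sketched as follows (a hypothetical helper; the 0.9 robustness factor used in the MOPS paper is omitted for brevity):

```python
import numpy as np

def adaptive_nms(points, strengths, num=500):
    """Adaptive non-maximal suppression: each point's suppression radius is
    the distance to the nearest stronger point; keep the `num` points with
    the largest radii, which are spatially well distributed."""
    pts = np.asarray(points, dtype=float)
    s = np.asarray(strengths, dtype=float)
    n = len(pts)
    radii = np.full(n, np.inf)  # the global maximum keeps radius = infinity
    for i in range(n):
        stronger = s > s[i]
        if stronger.any():
            d = np.linalg.norm(pts[stronger] - pts[i], axis=1)
            radii[i] = d.min()
    keep = np.argsort(-radii)[:num]
    return sorted(int(k) for k in keep)
```

Note how a strong point crowded next to a stronger one gets a small radius and is suppressed before a weaker but isolated point.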

(116)

(117)

(118)

(119)

### Descriptor Vector

• Rotation Invariant Frame

Scale-space position (x, y, s) + orientation (θ)

(120)

### MOPS descriptor vector

• 8x8 oriented patch sampled at 5 x scale. See TR for details.

• 8x8 = 64 pixels sampled from a 40x40-pixel window with a spacing of 5 pixels

(121)

### MOPS descriptor vector

• 8x8 oriented patch sampled at 5 x scale. See TR for details.

• Bias/gain normalisation: I’ = (I − μ)/σ

• Wavelet transform

(122)

(123)

### Summary

• Multi-scale Harris corner detector

• Sub-pixel refinement

• Blurred intensity patch as descriptor

(124)

### Feature matching

• Exhaustive search

– for each feature in one image, look at all the other features in the other image(s)

• Hashing

– compute a short descriptor from each feature vector, or hash longer descriptors (randomly)

• Nearest neighbor techniques

– k-d trees and their variants (Best Bin First)

(125)

### Wavelet-based hashing

• Compute a short (3-vector) descriptor from an 8x8 patch using a Haar “wavelet”

• Quantize each value into 10 (overlapping) bins (10³ total entries)

• [Brown, Szeliski, Winder, CVPR’2005]

(126)

### Nearest neighbor techniques

• k-D tree and Best Bin First (BBF)

Indexing Without Invariants in 3D Object Recognition, Beis and Lowe, PAMI’99

(127)

(128)

SIFT Features

(129)

(130)

(131)

### Office of the past

Video of the desk and images from PDFs are tracked and recognized (frames T, T+1) into an internal representation: a scene graph of the desk.
(132)

### Image retrieval

> 5000 images; change in viewing angle

(133)

22 correct matches

(134)

### Image retrieval

> 5000 images; change in viewing angle + scale change

(135)

(136)

### Robotics: Sony Aibo

SIFT is used for

• recognizing the charging station

• communicating with visual cards

• teaching object recognition

• soccer

(137)

### Structure from Motion

• The SFM Problem

– Reconstruct scene geometry and camera motion from two or more images

SFM pipeline: track 2D features → estimate 3D → optimize → fit surfaces

(138)

### Structure from Motion

[Figure: poor mesh vs. good mesh]

(139)

(140)

(141)

(142)

(143)

(144)

(145)

### Reference

• Chris Harris, Mike Stephens, A Combined Corner and Edge Detector, 4th Alvey Vision Conference, 1988, pp. 147-151.

• David G. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, 60(2), 2004, pp. 91-110.

• Yan Ke, Rahul Sukthankar, PCA-SIFT: A More Distinctive Representation for Local Image Descriptors, CVPR 2004.

• Krystian Mikolajczyk, Cordelia Schmid, A Performance Evaluation of Local Descriptors, submitted to PAMI, 2004.

• SIFT Keypoint Detector, David Lowe.

• Matlab SIFT Tutorial, University of Toronto.
