### Computer Vision

Computer Science & Information Technology
*Yung-Yu Chuang*

2009/04/10

*with slides by Li Zhang, Szymon Rusinkiewicz, Srinivasa Narasimhan*

**What is computer vision?**

• The goal of computer vision is to write computer programs that can interpret images and understand the scene. The holy grail is to mimic the human vision system.

[Diagram: inputs vs. outputs. Computer Graphics: descriptions → images. Computer Vision: images → descriptions. Image Processing: images → images.]

**Can computer match human perception? **

• Yes and no (but mostly no!)

– computers can be better at "easy" things

– humans are much better at "hard" things

**Computer vision vs Human Vision**

What we see What a computer sees


Srinivasa Narasimhan’s slide

**Components of a computer vision system**

Camera

Lighting

Computer

Scene

Scene Interpretation

Srinivasa Narasimhan’s slide

**Camera**

**Camera trial #1**

scene film

Put a piece of film in front of an object.

**Pinhole camera**

pinhole camera

scene barrier film

Add a barrier to block off most of the rays.

• It reduces blurring

• The pinhole is known as the aperture

• The image is inverted

**Shrinking the aperture**

Why not make the aperture as small as possible?

• Less light gets through

• Diffraction effect

**Adding a lens**

“circle of confusion”

scene lens film


A lens focuses light onto the film

• There is a specific distance at which objects are “in focus”

• other points project to a "circle of confusion" in the image

**Lenses**

Thin lens equation:

$$\frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{f}$$

• Any object point satisfying this equation is in focus

• Thin lens applet:


http://www.phy.ntnu.edu.tw/java/Lens/lens_e.html
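To make the equation concrete, here is a minimal Python sketch (not from the slides; the function name and units are illustrative) that solves the thin lens equation for the image distance:

```python
def image_distance(d_o, f):
    """Solve the thin lens equation 1/d_o + 1/d_i = 1/f for d_i.

    d_o: object distance, f: focal length (same units, e.g. mm).
    """
    if d_o == f:
        raise ValueError("object at the focal plane: image forms at infinity")
    return 1.0 / (1.0 / f - 1.0 / d_o)

# A 50 mm lens focused on an object 2 m away:
print(image_distance(2000.0, 50.0))  # ~51.3 mm behind the lens
```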

**Exposure = aperture + shutter speed**

• Aperture of diameter D restricts the range of rays (the aperture may be on either side of the lens)

• Shutter speed is the amount of time that light is allowed to pass through the aperture

**Exposure**

• Two main parameters:

– Aperture (in f stop)

– Shutter speed (in fraction of a second)

**Exposure**

• Two main parameters:

– Aperture (in f stop)

– Shutter speed (in fraction of a second)

• Reciprocity

**The same exposure is obtained with an exposure twice as long and an aperture area half as big**

– Hence the square-root-of-two progression of f stops vs. the power-of-two progression of shutter speeds

– Reciprocity can fail for very long exposures

From Photography, London et al.
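As a sketch of reciprocity (an illustration under assumptions, not from the slides): exposure depends on aperture area × time, so the exposure value EV = log₂(N²/t) stays constant across equivalent aperture/shutter pairs:

```python
import math

def exposure_value(f_number, shutter_time):
    """EV = log2(N^2 / t); equal EV means equal exposure (reciprocity)."""
    return math.log2(f_number ** 2 / shutter_time)

# One stop smaller aperture, twice the time: (nearly) the same exposure.
print(exposure_value(8.0, 1 / 125))   # ~12.97
print(exposure_value(5.6, 1 / 250))   # ~12.94 (f/5.6 is a rounded stop)
```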

**Effects of shutter speeds**

• Slower shutter speed => more light, but more motion blur

• Faster shutter speed freezes motion

From Photography, London et al.

Walking people: 1/125 · Running people: 1/250 · Car: 1/500 · Fast train: 1/1000

**Depth of field**

Changing the aperture size affects depth of field.

A smaller aperture increases the range in which the object is approximately in focus.

[Diagram: diaphragm, lens, point in focus, sensor, object with texture]


**Depth of field**

From Photography, London et al.

**Film camera**

[Diagram: scene → lens → aperture & shutter → film, advanced by a motor]

**Digital camera**

[Diagram: scene → lens → aperture & shutter → sensor array]

• A digital camera replaces film with a sensor array

• Each cell in the array is a light-sensitive diode that converts photons to electrons

**CCD v.s. CMOS**

• CCD is less susceptible to noise (special process, higher fill factor)

• CMOS is more flexible, less expensive (standard process), less power consumption

CCD CMOS


**SLR (Single-Lens Reflex)**

• Reflex (the R in SLR) means that we see through the same lens used to take the image.

• Not the case for compact cameras

**SLR view finder**

[Diagram: light from the scene passes through the lens to a mirror and up through a prism to your eye when viewing; the mirror flips up during exposure so the light reaches the film/sensor]

**Color**

So far, we've only talked about monochrome sensors. Color imaging has been implemented in a number of ways:

• Field sequential

• Multi-chip

• Color filter array

• X3 sensor


**Field sequential**

**Prokudin-Gorskii (early 1900's)**

Lantern projector

http://www.loc.gov/exhibits/empire/

**Multi-chip**

wavelength dependent

**Color filter array**

Bayer pattern; color filter arrays (CFAs) are also called color filter mosaics

**Bayer’s pattern**

**Demosaicking CFA’s**

Bilinear interpolation

[Images: original · input mosaic · linear interpolation]

**Demosaicking CFA’s**

[Comparison: bilinear · Cok · Freeman · LaRoche]
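A minimal sketch of bilinear demosaicking, assuming an RGGB Bayer mosaic stored as a single-channel float array; the kernels follow the standard bilinear scheme and are illustrative, not the exact methods compared above:

```python
import numpy as np
from scipy.signal import convolve2d

def demosaic_bilinear(mosaic):
    """Bilinear demosaicking of an RGGB Bayer mosaic (H x W float array)."""
    h, w = mosaic.shape
    # Binary masks marking where each color was actually sampled.
    r_mask = np.zeros((h, w)); r_mask[0::2, 0::2] = 1
    b_mask = np.zeros((h, w)); b_mask[1::2, 1::2] = 1
    g_mask = 1 - r_mask - b_mask

    # Kernels: green from its 4 neighbors, red/blue from rows/corners.
    k_g = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0

    out = np.empty((h, w, 3))
    out[..., 0] = convolve2d(mosaic * r_mask, k_rb, mode="same")
    out[..., 1] = convolve2d(mosaic * g_mask, k_g, mode="same")
    out[..., 2] = convolve2d(mosaic * b_mask, k_rb, mode="same")
    return out
```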

**Digital camera review website**

• A cool video illustrating how a digital camera works


• http://www.dpreview.com/

**Now, we have images**

**• We can think of an image as a function, f: R² → R:**

**– f(x, y) gives the intensity at position (x, y)**

• What about color images?
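In code (a NumPy illustration, not from the slides), a grayscale image is a sampled function of two coordinates, and a color image maps each position to an RGB triple:

```python
import numpy as np

f = np.random.rand(480, 640)         # grayscale: f(x, y) -> intensity
print(f[120, 200])                   # intensity at row 120, column 200

f_rgb = np.random.rand(480, 640, 3)  # color: f(x, y) -> (R, G, B)
print(f_rgb[120, 200])               # RGB triple at the same position
```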

**Write a program to interpret images**

Camera

Lighting

Computer

Scene

Scene Interpretation

Srinivasa Narasimhan’s slide

**Computer vision programs**

**Low-level vision (early vision)**

• Considers local properties of an image

“There’s an edge!”

**Mid-level vision **

• Grouping and segmentation

"There's an object and a background!"

**High-level vision**

• Recognition

“It’s a chair!”

**Low-level vision**

**Detection**

• Edges

• Lines

• Corners

**Image filtering**

• Convolution with a mask

0 0 1 0 0
0 1 1 1 0
1 1 1 1 1
0 1 1 1 0
0 0 1 0 0

**Image filtering (motion blur)**

• Convolution with a mask

1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 1

**Image filtering (sharpening)**

• Convolution with a mask

0 0 0 0 0
0 -1 -1 -1 0
0 -1 9 -1 0
0 -1 -1 -1 0
0 0 0 0 0

**Demo with PaintShop Pro**
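For instance, the sharpening mask above can be applied with an off-the-shelf 2D convolution; a SciPy-based sketch, with a random array standing in for a real image:

```python
import numpy as np
from scipy.signal import convolve2d

sharpen = np.array([[ 0,  0,  0,  0,  0],
                    [ 0, -1, -1, -1,  0],
                    [ 0, -1,  9, -1,  0],
                    [ 0, -1, -1, -1,  0],
                    [ 0,  0,  0,  0,  0]], dtype=float)

image = np.random.rand(100, 100)   # stand-in for a real grayscale image
result = convolve2d(image, sharpen, mode="same", boundary="symm")
# The weights sum to 1 (9 - 8), so flat regions keep their brightness.
```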

**Gaussian filters**

• One-dimensional Gaussian:

$$G(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{x^2}{2\sigma^2}}$$

• Two-dimensional Gaussian:

$$G(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}$$
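A small sketch (an illustrative helper, not from the slides) that samples the 2D Gaussian above into a discrete kernel suitable for convolution:

```python
import numpy as np

def gaussian_kernel_2d(sigma, radius=None):
    """Sampled 2D Gaussian G(x, y), renormalized to sum to 1."""
    if radius is None:
        radius = int(3 * sigma)          # +/- 3 sigma captures ~99.7%
    ax = np.arange(-radius, radius + 1)
    x, y = np.meshgrid(ax, ax)
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return g / g.sum()                   # renormalize the truncated kernel

print(gaussian_kernel_2d(1.0).shape)     # (7, 7)
```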

**Gaussian filters**

**Computing Discrete Convolutions**

$$Out(x, y) = \sum_i \sum_j f(i, j)\, In(x - i,\, y - j)$$

• If In is n×n and f is m×m, this takes O(m²n²) time

• OK for small filter kernels, bad for large ones
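A direct transcription of the formula (a naive sketch to make the O(m²n²) cost visible; real implementations exploit separability or FFTs for large kernels):

```python
import numpy as np

def convolve_naive(inp, f):
    """Out(x, y) = sum_{i,j} f(i, j) * In(x - i, y - j), evaluated directly.

    With an n x n input and an m x m kernel this costs O(m^2 n^2):
    fine for small kernels, slow for large ones.
    """
    n1, n2 = inp.shape
    r = f.shape[0] // 2
    out = np.zeros_like(inp)
    for x in range(n1):
        for y in range(n2):
            for i in range(-r, r + 1):
                for j in range(-r, r + 1):
                    if 0 <= x - i < n1 and 0 <= y - j < n2:
                        out[x, y] += f[i + r, j + r] * inp[x - i, y - j]
    return out
```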

**Example: smoothing**

**Original** **Smoothed with Gaussian kernel**

**Canny edge detector**

• Smooth

• Find derivative

• Thresholding

• Thinning

**Canny edge detector**

• First, smooth with a Gaussian of some width σ

**Original image** **Blurred image**

**Canny edge detector**

• Next, find the "derivative"

• What is a derivative in 2D? The gradient:

$$\nabla f(x, y) = \left( \frac{\partial f}{\partial x},\; \frac{\partial f}{\partial y} \right)$$

Derivative masks (central differences, shown as 5×5 kernels; the factor 1/2 normalizes the two-pixel spacing):

∂/∂x (× 1/2):
0 0 0 0 0
0 0 0 0 0
0 -1 0 1 0
0 0 0 0 0
0 0 0 0 0

∂/∂y (× 1/2):
0 0 0 0 0
0 0 -1 0 0
0 0 0 0 0
0 0 1 0 0
0 0 0 0 0

**Canny edge detector**

**Horizontal gradient** **Vertical gradient**

**Canny edge detector**

**Original Image** **Smoothed Gradient Magnitude**

**Canny edge detector**

• Thresholding

**Original Image** **Threshold Gradient Magnitude**

**Canny edge detector**

• Thinning

**Original Image** **Edges**

**Canny edge detector**

• Nonmaximum suppression

*– Eliminate all but local maxima in the magnitude of the gradient*

*– At each pixel, look along the direction of the gradient:* if either neighbor is bigger, set to zero

– In practice, quantize direction to horizontal, vertical, and two diagonals

– Result: "thinned edge image"

Canny demo
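The four steps (smooth, gradient, threshold, thin) are bundled in common libraries; a hedged OpenCV sketch, with hypothetical file names:

```python
import cv2

# Explicit Gaussian smoothing, then OpenCV's Canny, which internally
# computes the gradient, applies hysteresis thresholding, and thins
# edges via nonmaximum suppression.
img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
blurred = cv2.GaussianBlur(img, (5, 5), 1.4)         # sigma = 1.4
edges = cv2.Canny(blurred, 50, 150)                  # low/high thresholds
cv2.imwrite("edges.png", edges)
```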

**Detecting lines**

• What is the difference between line detection and edge detection?

– Edges = local

– Lines = nonlocal

• Line detection is usually performed on the output of an edge detector

**Hough transform**

• General idea: transform from image coordinates to the parameter space of a feature

– Need a parameterized model of features

– For each pixel, determine all parameter values that might have given rise to that pixel; vote

– At the end, look for peaks in parameter space

**Hough transform for lines**

*• Generic line: y = ax+b*

*• Parameters: a and b*

**Hough transform for lines**

*1. Initialize a table of buckets, indexed by a and b, to zero*

*2. For each detected edge pixel (x, y):*

*a. Determine all (a, b) such that y = ax + b*
*b. Increment bucket (a, b)*

3. Buckets with many votes indicate probable lines (see the sketch below)
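A minimal voting sketch. It uses the angle/distance (r, θ) parameterization discussed below rather than (a, b), since that avoids the vertical-line problem; all names are illustrative:

```python
import numpy as np

def hough_lines(edges, n_theta=180, n_r=400):
    """Vote in (r, theta) space: each edge pixel votes for every line
    x*cos(theta) + y*sin(theta) = r passing through it."""
    h, w = edges.shape
    r_max = np.hypot(h, w)
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((n_r, n_theta), dtype=np.int64)
    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        r = x * np.cos(thetas) + y * np.sin(thetas)       # one r per theta
        r_idx = np.round((r + r_max) / (2 * r_max) * (n_r - 1)).astype(int)
        acc[r_idx, np.arange(n_theta)] += 1               # cast the votes
    return acc, thetas  # peaks in acc indicate probable lines
```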

**Hough transform for lines**

[Figure: votes accumulating in the (a, b) parameter space]

**Issues**

• Slope / intercept parameterization not ideal

– Non-uniform sampling of directions

– Can't represent vertical lines

• Angle/distance parameterization

*– Line represented as (r, θ) where x cos θ + y sin θ = r*

Hough transform demo · Hough transform demo2

**Detection of corners**

• Also known as features, interest points, salient points, or keypoints: points whose correspondences in multiple images you can easily point out using only local information.

**Moravec corner detector (1980)**

• We should easily recognize the point by looking through a small window

*• Shifting the window in any direction should give a large change in intensity*

**Moravec corner detector**

[Figures: window responses for a flat region, an edge, a corner, and an isolated point]

**Moravec corner detector**

*Change of intensity for the shift [u, v]:*

$$E(u, v) = \sum_{x, y} w(x, y)\, \big[ I(x + u,\, y + v) - I(x, y) \big]^2$$

where w(x, y) is the window function, I(x+u, y+v) the shifted intensity, and I(x, y) the intensity.

Four shifts: (u, v) = (1, 0), (1, 1), (0, 1), (−1, 1)

*Look for local maxima in min{E}*

**Problems of Moravec detector**

• Noisy response due to a binary window function

• Only a set of shifts at every 45 degrees is considered

• Only the minimum of E is taken into account

⇒ The Harris corner detector (1988) solves these problems.

**Harris corner detector**

Noisy response due to a binary window function

⇒ Use a Gaussian window function

**Harris corner detector**

Only a set of shifts at every 45 degrees is considered

⇒ Consider all small shifts via Taylor expansion

**Harris corner detector**

$$E(u, v) = \sum_{x, y} w(x, y)\, \big[ I(x + u,\, y + v) - I(x, y) \big]^2$$

$$\qquad = \sum_{x, y} w(x, y)\, \big[ I_x u + I_y v + O(u^2, v^2) \big]^2$$

$$\qquad \approx A u^2 + 2 C u v + B v^2$$

where

$$A = \sum_{x, y} w(x, y)\, I_x^2(x, y), \quad B = \sum_{x, y} w(x, y)\, I_y^2(x, y), \quad C = \sum_{x, y} w(x, y)\, I_x(x, y)\, I_y(x, y)$$

**Harris corner detector**

*Equivalently, for small shifts [u,v] we have a bilinear approximation:*

$$E(u, v) \cong \begin{bmatrix} u & v \end{bmatrix} \mathbf{M} \begin{bmatrix} u \\ v \end{bmatrix}$$

, where M is a 2×2 matrix computed from image derivatives:

$$\mathbf{M} = \sum_{x, y} w(x, y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$

**Harris corner detector**

*Only the minimum of E is taken into account*

⇒ A new corner measurement, obtained by investigating the shape of the error function

$\mathbf{u}^T \mathbf{M}\, \mathbf{u}$ represents a quadratic function; thus, we can analyze E's shape by looking at the properties **of M**

**Harris corner detector**

High-level idea: what shape of the error function will we prefer for features?

[Plots: error surfaces E(u, v) for a flat region, an edge, and a corner]

**Harris corner detector**

Intensity change in a shifting window: eigenvalue analysis

λ₁, λ₂ – eigenvalues of M

$$E(u, v) \cong \begin{bmatrix} u & v \end{bmatrix} \mathbf{M} \begin{bmatrix} u \\ v \end{bmatrix}$$

[Figure: the ellipse E(u, v) = const, with axes along the directions of fastest and slowest change, of lengths (λ_max)^(−1/2) and (λ_min)^(−1/2)]

**Harris corner detector**

Classification of image points using the eigenvalues of M:

– Corner: λ₁ and λ₂ are large, λ₁ ~ λ₂; E increases in all directions

– Edge: λ₁ >> λ₂ (or λ₂ >> λ₁)

– Flat: λ₁ and λ₂ are small; E is almost constant in all directions

**Harris corner detector**

Eigenvalues of M (only for reference; you do not need them to compute R):

$$\lambda_{1,2} = \frac{1}{2}\left[ (a_{00} + a_{11}) \pm \sqrt{(a_{00} - a_{11})^2 + 4\, a_{10} a_{01}} \right]$$

Measure of corner response:

$$R = \det \mathbf{M} - k\, (\operatorname{trace} \mathbf{M})^2$$

$$\det \mathbf{M} = \lambda_1 \lambda_2, \qquad \operatorname{trace} \mathbf{M} = \lambda_1 + \lambda_2$$

*(k – empirical constant, k = 0.04–0.06)*

**Summary of Harris detector**

1. Compute x and y derivatives of the image:

$$I_x = G_\sigma^x * I, \qquad I_y = G_\sigma^y * I$$

2. Compute products of derivatives at every pixel:

$$I_{x^2} = I_x \cdot I_x, \qquad I_{y^2} = I_y \cdot I_y, \qquad I_{xy} = I_x \cdot I_y$$

3. Compute the sums of the products of derivatives at each pixel:

$$S_{x^2} = G_{\sigma'} * I_{x^2}, \qquad S_{y^2} = G_{\sigma'} * I_{y^2}, \qquad S_{xy} = G_{\sigma'} * I_{xy}$$

**Summary of Harris detector**

4. Define the matrix at each pixel:

$$\mathbf{M}(x, y) = \begin{bmatrix} S_{x^2}(x, y) & S_{xy}(x, y) \\ S_{xy}(x, y) & S_{y^2}(x, y) \end{bmatrix}$$

5. Compute the response of the detector at each pixel:

$$R = \det \mathbf{M} - k\, (\operatorname{trace} \mathbf{M})^2$$

6. Threshold on the value of R; compute nonmax suppression. (A code sketch follows.)
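A compact sketch of steps 1–6 using SciPy building blocks; the derivative and window kernels are assumptions, since the slides leave the exact choices open:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel, maximum_filter

def harris_response(img, sigma=1.0, k=0.05):
    """Steps 1-5: derivatives, smoothed products, R = det M - k (trace M)^2."""
    ix = sobel(img, axis=1)                 # 1. x and y derivatives
    iy = sobel(img, axis=0)
    sxx = gaussian_filter(ix * ix, sigma)   # 2.-3. Gaussian-weighted sums
    syy = gaussian_filter(iy * iy, sigma)
    sxy = gaussian_filter(ix * iy, sigma)
    det = sxx * syy - sxy ** 2              # 4.-5. response from M's entries
    trace = sxx + syy
    return det - k * trace ** 2

def harris_corners(img, thresh=0.01):
    """Step 6: threshold R and keep local maxima (nonmax suppression)."""
    r = harris_response(img)
    mask = (r > thresh * r.max()) & (r == maximum_filter(r, size=3))
    return np.argwhere(mask)                # (row, col) corner locations
```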

**Harris corner detector (input)**

**Corner response R** **Threshold on R**

**Local maximum of R** **Harris corner detector**

**Mid-level vision**

**Segmentation and clustering**

• Defining regions

– Should they be compact? Smooth boundary?

• Defining similarity

– Color, texture, motion, …

• Defining similarity of regions

– Minimum distance, mean, maximum

**Clustering based on color**

• Let’s make a few concrete choices:

– Arbitrary regions

– Similarity based on color only

– Similarity of regions =

distance between mean colors

**k-means Clustering**


*1. Pick the number of clusters k*

*2. Randomly scatter k "cluster centers" in color space*

3. Repeat:

a. Assign each data point to its closest cluster center

b. Move each cluster center to the mean of the points assigned to it

(A code sketch follows.)
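A plain NumPy sketch of the loop above, applied to pixel colors; the initialization and iteration count are arbitrary choices:

```python
import numpy as np

def kmeans(points, k, n_iter=20, seed=0):
    """Plain k-means on an (N, D) array, e.g. N pixels with D = 3 colors."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # a. assign each data point to its closest cluster center
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # b. move each cluster center to the mean of its assigned points
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels, centers

pixels = np.random.rand(1000, 3)        # stand-in for reshaped image pixels
labels, centers = kmeans(pixels, k=5)
```

To segment an image, reshape its pixels to an (N, 3) array, cluster, and recolor each pixel with its cluster's mean color.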

**k-means Clustering**

[Animation: successive k-means iterations, repeated over several slides]

**Results of Clustering**

**Original Image** **k-means, k=5** **k-means, k=11**

**Results of Clustering**

*Sample clusters with k-means clustering based on color*

**Other Distance Measures**

• Suppose we want to have compact regions

• New feature space: 5D

(2 spatial coordinates, 3 color components)

• Points close in this space are close both in color and in actual proximity

**Interactive segmentation**

video1 video2

**Matting**

**High-level vision**

**Recognition**

**Recognition problems**

• What is it?

– Object detection

• Who is it?

– Recognizing identity

• What are they doing?

– Activities

**• All of these are classification problems**

– Choose one class from a list of possible candidates

**Face detection**

• How to tell if a face is present?

**One simple method: skin detection**

[Plot: "skin" region in R–G color space]

• Skin pixels have a distinctive range of colors – Corresponds to region(s) in RGB color space

• for visualization, only R and G components are shown above

Skin classifier

• A pixel X = (R, G, B) is skin if it is in the skin region

• But how to find this region?

**Skin detection**

**• Learn the skin region from examples**

– Manually label pixels in one or more "training images" as skin or not skin

– Plot the training data in RGB space

• skin pixels shown in orange, non-skin pixels shown in blue

• some skin pixels may be outside the region, non-skin pixels inside. Why?

Skin classifier

• Given X = (R,G,B): how to determine if it is skin or not?

**Skin classification techniques**

Skin classifier

• Given X = (R,G,B): how to determine if it is skin or not?

• Nearest neighbor

– find the labeled pixel closest to X

– choose the label for that pixel

• Data modeling

– fit a model (curve, surface, or volume) to each class

• Probabilistic data modeling – fit a probability model to each class

**Probability**

• Basic probability

– X is a random variable

– P(X) is the probability that X achieves a certain value; P(X) is called a PDF (probability distribution/density function)

– Conditional probability: P(X | Y)

• the probability of X given that we already know Y

**Probabilistic skin classification**

• Now we can model uncertainty

– Each pixel has a probability of being skin or not skin

Skin classifier

• Given X = (R, G, B): how to determine if it is skin or not?

• Choose the interpretation of highest probability

– set X to be a skin pixel if and only if P(skin | X) > P(¬skin | X)

Where do we get P(skin | X) and P(¬skin | X)?

**Learning conditional PDF’s**

• We can calculate P(R | skin) from a set of training images

– It is simply a histogram over the pixels in the training images

• each bin Rᵢ contains the proportion of skin pixels with color Rᵢ

This doesn't work as well in higher-dimensional spaces. Why not?

Approach: fit parametric PDF functions

• a common choice is a rotated Gaussian

– center

– covariance

» orientation and size defined by eigenvectors, eigenvalues

**Learning conditional PDF’s**

• We can calculate P(R | skin) from a set of training images

– It is simply a histogram over the pixels in the training images

• each bin Rᵢ contains the proportion of skin pixels with color Rᵢ

But this isn't quite what we want

• Why not? How to determine if a pixel is skin?

• We want P(skin | R), not P(R | skin)

• How can we get it?

**Bayes rule**

• In terms of our problem:

$$P(\text{skin} \mid R) = \frac{P(R \mid \text{skin})\, P(\text{skin})}{P(R)}$$

– what we want (**posterior**) = what we measure (**likelihood**) × domain knowledge (**prior**) / **normalization** term

The prior: P(skin)

• Could use domain knowledge

– P(skin) may be larger if we know the image contains a person

– for a portrait, P(skin) may be higher for pixels in the center

• Could learn the prior from the training set. How?

– P(skin) may be proportion of skin pixels in training set
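Putting the pieces together, a hedged sketch of a histogram-based Bayesian skin classifier; the bin count and quantization are arbitrary choices, and the labeled training pixels are assumed given:

```python
import numpy as np

def train_skin_model(pixels, labels, n_bins=32):
    """Histogram likelihoods P(color|skin), P(color|~skin), prior P(skin).

    pixels: (N, 3) uint8 training colors; labels: (N,) booleans (skin or not).
    """
    idx = (pixels // (256 // n_bins)).astype(int)   # quantize colors to bins
    flat = idx[:, 0] * n_bins**2 + idx[:, 1] * n_bins + idx[:, 2]
    size = n_bins**3
    p_skin_c = np.bincount(flat[labels], minlength=size).astype(float)
    p_not_c = np.bincount(flat[~labels], minlength=size).astype(float)
    p_skin_c /= max(p_skin_c.sum(), 1)              # P(color | skin)
    p_not_c /= max(p_not_c.sum(), 1)                # P(color | ~skin)
    prior = labels.mean()                           # P(skin) from training set
    return p_skin_c, p_not_c, prior, n_bins

def classify(pixels, model):
    """MAP rule: skin iff P(color|skin)P(skin) > P(color|~skin)P(~skin)."""
    p_skin_c, p_not_c, prior, n_bins = model
    idx = (pixels // (256 // n_bins)).astype(int)
    flat = idx[:, 0] * n_bins**2 + idx[:, 1] * n_bins + idx[:, 2]
    return p_skin_c[flat] * prior > p_not_c[flat] * (1 - prior)
```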

**Bayesian estimation**

likelihood · posterior (unnormalized)

• Bayesian estimation

– Goal is to choose the label (skin or ~skin) that maximizes the posterior = minimize the probability of misclassification

**• this is called Maximum A Posteriori (MAP) estimation**

**Skin detection results**

Classifier

**Viola/Jones: features**

“Rectangle filters”

Differences between sums of pixels in adjacent rectangles

$$h_t(x) = \begin{cases} +1 & \text{if } f_t(x) > \theta_t \\ -1 & \text{otherwise} \end{cases}$$

60,000 × 100 = 6,000,000 unique features

$$Y(x) = \sum_t \alpha_t h_t(x) \qquad \text{(select 200 features by AdaBoost)}$$

$$\text{Detection} = \begin{cases} \text{face} & \text{if } Y(x) > 0 \\ \text{non-face} & \text{otherwise} \end{cases}$$

Robust Realtime Face Detection, IJCV 2004, Viola and Jones
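These rectangle features are fast to evaluate because of the integral image, which Viola and Jones use to compute any rectangle sum in constant time; a minimal sketch, with illustrative names:

```python
import numpy as np

def integral_image(img):
    """Cumulative sum table: ii[y, x] = sum of img[:y, :x]."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(0).cumsum(1)
    return ii

def rect_sum(ii, y, x, h, w):
    """Sum of pixels in rectangle [y, y+h) x [x, x+w) in O(1)."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

# A two-rectangle feature: difference between sums of adjacent rectangles.
img = np.random.rand(24, 24)        # stand-in for a detection window
ii = integral_image(img)
feature = rect_sum(ii, 4, 4, 8, 4) - rect_sum(ii, 4, 8, 8, 4)
```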

**Viola/Jones: handling scale**

Larger Scale

Smallest Scale

50,000 Locations/Scales

**Viola/Jones results: **

Run-time: 15 fps (384×288-pixel image on a 700 MHz Pentium III)

**Application**

Smart cameras: auto focus, red eye removal, auto color correction

**Application**

Lexus LS600 Driver Monitor System

**Face recognition**

• Suppose you want to recognize a *particular face*

*• How does this face differ from the average face?*

• Consider variation from average face

• Not all variations are equally important

– Variation in a single pixel is relatively unimportant

• If an image is a high-dimensional vector, we want to find directions in this space with high variation

**PCA**

• Principal Components Analysis (PCA):

approximating a high-dimensional data set with a lower-dimensional subspace

[Figure: scatter of data points with the original axes and the first and second principal components]

**PCA on Faces: "Eigenfaces"**

[Figure: the average face, the first principal component, and other components]

For all except the average: "gray" = 0, "white" > 0, "black" < 0

**Using PCA for Recognition**

• Store each person as coefficients of projection onto the first few principal components:

$$\text{image} = \sum_{i=0}^{i_{\max}} a_i \cdot \text{Eigenface}_i$$

• Compute projections of the target image, compare to the database ("nearest neighbor classifier")

**Choosing the dimension K**

[Plot: eigenvalue magnitude vs. index i = K … NM]

• How many eigenfaces to use?

• Look at the decay of the eigenvalues

– the eigenvalue tells you the amount of variance "in the direction" of that eigenface

– ignore eigenfaces with low variance
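A minimal eigenfaces sketch via SVD; the data layout and function names are assumptions, not the slides' exact procedure:

```python
import numpy as np

def pca(faces, k):
    """Eigenfaces via SVD: faces is (N, P), each row a flattened face image."""
    mean = faces.mean(axis=0)
    u, s, vt = np.linalg.svd(faces - mean, full_matrices=False)
    eigenfaces = vt[:k]                        # first k principal components
    eigenvalues = (s ** 2) / (len(faces) - 1)  # variance along each component
    return mean, eigenfaces, eigenvalues

def project(face, mean, eigenfaces):
    """Coefficients a_i used to store and compare faces."""
    return eigenfaces @ (face - mean)
```

Inspecting the decay of `eigenvalues` is one way to choose the dimension K discussed above.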

**Advanced topics**

**High dynamic range imaging/display**

**Image warping/morphing**

someone very famous

someone not that famous

video

**Image warping/morphing**

**Tracking**

Feature tracking

**Image stitching**

**MatchMove**

*Move matching using scene planes*

**Photo tourism**

**Video matching**

*MOCO (Motion control camera), Matrix*

**Video matching**

*Video matching*

**Matting and compositing**

*Titanic*

**Matting**

**Image manipulation**

*GraphCut Texture*

**Image manipulation**

*Poisson blending*

**Image-based modeling**

*photogrammetric modeling and projective texture-mapping*

**Image-based modeling**

*Tour into a picture*

**Structured Light and Ranging Scanning**

http://graphics.stanford.edu/projects/mich/

**3D photography (active)**

*Cyberware whole body scanner*

**3D photography (active)**

*Photometric stereo*

**3D photography (passive)**

*Stereo: left and right views → depth*

**Image-based rendering**

*Surface lightfield*

**View interpolation**

Bullet time video

**View interpolation**

High-Quality Video View Interpolation

**Making face**

*Spacetime face*
*Gollum*

**Video rewrite**

Trainable videorealistic speech animation

**Inpainting (wire removal)**

*Inpainting*

**Texture synthesis/replacement**

*Texture replacement*

**Semi-automatic matte painting**

*Image analogies*

**Video editing**

*Flow-based video editing*

**Face Detection and Recognition**

**Motion Estimation**

Application

Andy Serkis, Gollum, Lord of the Rings

**Novel Cameras and Displays**

http://www1.cs.columbia.edu/CAVE/projects/cc.htm

**Capturing Light Field **

Camera Arrays, Graphics Lab, Stanford University

**Digital visual effects**


**Reality?**

**Retouching**

**Iraq War, LA Times, April 2003**

Digital photomontage

**Bush campaign’s TV AD, 2004**

**Texture synthesis and inpainting**

**Domestic example**

The Liberty Times, 2007.12.17

**Production pipeline**

**Production pipeline**

[Flowchart: Preproduction (artwork, storyboard, reference) → Production (stills, plates) → Post-production (editing, VFX, music, sound) → Film. Effects draw on the artwork, reference, stills, and plates.]

**Preproduction**

Storyboard

**Preproduction**

Artwork

**Preproduction**

Reference & Research

**Production**

Shooting

**Post-production**

**Visual effects production**

[Flowchart: stills and plates from production are scanned; matchmove and matchmodel recover the camera and scene geometry; modeling produces assets (geometry, textures, materials, skeleton, skinning); character animation and effects animate them; lighting and rendering produce shots; compositing combines the shots; the result goes to film recording.]