Making faces
Digital Visual Effects, Spring 2005 Yung-Yu Chuang
2005/6/8
with slides by Richard Szeliski, Steve Seitz and Alex Efros
Announcements
• Project #3 artifacts voting
Outline
• 3D acquisition for faces
• Statistical methods
• Face models from single images
• Image-based faces
• Relighting for faces
3D acquisition for faces
Cyberware scanners
face & head scanner; whole body scanner
Making facial expressions from photos
• Similar to Façade, use a generic face model and view-dependent texture mapping
• Procedure
1. Take multiple photographs of a person
2. Establish corresponding feature points
3. Recover 3D points and camera parameters
4. Deform the generic face model to fit the points
5. Extract textures from the photos
Reconstruct a 3D model
input photographs + generic 3D face model → pose estimation → more features → deformed model
Mesh deformation
• Involves two steps:
– Compute displacement of feature points
– Apply scattered data interpolation
generic model → displacement → deformed model
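The scattered data interpolation step can be sketched with radial basis functions; the Gaussian kernel and the `sigma` value are assumptions for illustration, not the paper's exact interpolant:

```python
import numpy as np

def rbf_interpolate(feature_pts, displacements, query_pts, sigma=0.5):
    """Spread known displacements at a few feature points to arbitrary
    query points (e.g. all mesh vertices) with Gaussian radial basis
    functions."""
    # Pairwise distances between feature points -> n x n basis matrix
    d = np.linalg.norm(feature_pts[:, None, :] - feature_pts[None, :, :], axis=-1)
    phi = np.exp(-(d / sigma) ** 2)
    # Solve for per-basis weights (one linear system per coordinate)
    w = np.linalg.solve(phi, displacements)
    # Evaluate the interpolant at the query points
    dq = np.linalg.norm(query_pts[:, None, :] - feature_pts[None, :, :], axis=-1)
    return np.exp(-(dq / sigma) ** 2) @ w

# Tiny check: the interpolant reproduces the known displacements exactly
pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
disp = np.array([[0.1, 0.0, 0.0], [0.0, 0.2, 0.0], [0.0, 0.0, 0.3]])
recovered = rbf_interpolate(pts, disp, pts)
```

At the feature points themselves the interpolant is exact; elsewhere it blends the displacements smoothly, which is what lets a handful of matched features deform the whole generic mesh.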
Texture extraction
• The color at each point is a weighted combination of the colors in the photos
• Texture can be:
– View-independent
– View-dependent
• Considerations for weighting
– Occlusion
– Smoothness
– Positional certainty
– View similarity
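A minimal sketch of how such per-photo weights might combine; the functional forms (dot products for positional certainty and view similarity) are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def blend_weight(normal, view_dir, render_dir=None, occluded=False):
    """Illustrative blending weight for one photo at one surface point."""
    if occluded:
        return 0.0  # occluded photos contribute nothing
    # positional certainty: surfaces facing the camera are trusted more
    certainty = max(0.0, float(np.dot(normal, view_dir)))
    if render_dir is None:       # view-independent texture
        return certainty
    # view similarity: favor photos taken near the rendering viewpoint
    similarity = max(0.0, float(np.dot(view_dir, render_dir)))
    return certainty * similarity

def blend_colors(colors, weights):
    """Color at a point = weighted combination of the photo colors."""
    w = np.asarray(weights, dtype=float)
    if w.sum() == 0.0:
        return np.zeros(3)
    return (np.asarray(colors, dtype=float) * w[:, None]).sum(axis=0) / w.sum()

n = np.array([0.0, 0.0, 1.0])
w_facing = blend_weight(n, n)   # camera looking straight at the surface
color = blend_colors([np.array([1.0, 0.0, 0.0]),
                      np.array([0.0, 0.0, 1.0])], [1.0, 1.0])
```

Passing a `render_dir` gives the view-dependent variant; omitting it gives a single view-independent texture.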
Texture extraction
view-independent vs. view-dependent
Model reconstruction
Use images to adapt a generic face model.
Creating new expressions
• In addition to global blending we can use:
– Regional blending
– Painterly interface
Creating new expressions
Applying a global blend
New expressions are created with 3D morphing: averaging two expression meshes vertex by vertex (½·A + ½·B) yields a blend.
Applying a region-based blend
weight mask × expression A + weight mask × expression B = regional blend
Creating new expressions
Using a painterly interface
several expressions combined with painted per-region weights
Creating new expressions
Drunken smile
Animating between expressions
Morphing over time creates animation:
“neutral” → “joy”
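Global blending and morphing over time both reduce to linear interpolation of per-vertex displacement fields; a minimal sketch, with mesh layout and names assumed:

```python
import numpy as np

def blend_expressions(neutral, deltas, weights):
    """New expression = neutral vertices + weighted sum of per-expression
    displacement fields (a global blend; regional blending would replace
    the scalar weights with per-region weight masks)."""
    v = neutral.astype(float).copy()
    for delta, w in zip(deltas, weights):
        v += w * delta
    return v

def morph(neutral, deltas, w_from, w_to, t):
    """Animate between two expressions by interpolating their blend
    weights with parameter t in [0, 1]."""
    w = [(1.0 - t) * a + t * b for a, b in zip(w_from, w_to)]
    return blend_expressions(neutral, deltas, w)

neutral = np.zeros((4, 3))      # toy mesh: 4 vertices at the origin
joy = np.ones((4, 3))           # toy displacement field for "joy"
halfway = morph(neutral, [joy], [0.0], [1.0], 0.5)
```

At t = 0 the morph returns the neutral face, at t = 1 the full "joy" expression, and in between a smooth transition.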
Video
Spacetime faces
video projectors, color cameras, black & white cameras
Spacetime faces
In standard stereo and active stereo, the matching window extends over space at a single time instant; spacetime stereo extends the matching window over both space and time.

Spacetime Stereo
(animation over time = 1 … 5: the face surface moves, and the spatiotemporal matching window tracks the surface motion)
Better:
• spatial resolution
• temporal stability
Spacetime stereo matching
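The core idea, extending the matching window over time, can be sketched as a spatiotemporal SSD cost; this is a toy version with a rigid window, whereas the actual method also lets the window deform to follow surface motion:

```python
import numpy as np

def spacetime_ssd(left, right, x, y, t, d, w=1, tw=1):
    """SSD matching cost over a (2*tw+1) x (2*w+1) x (2*w+1) space-time
    window; left/right are (T, H, W) rectified video volumes and d is
    the candidate disparity."""
    L = left[t - tw:t + tw + 1, y - w:y + w + 1, x - w:x + w + 1]
    R = right[t - tw:t + tw + 1, y - w:y + w + 1, x - d - w:x - d + w + 1]
    return float(((L - R) ** 2).sum())

def best_disparity(left, right, x, y, t, max_d, w=1, tw=1):
    """Brute-force winner-take-all search over candidate disparities."""
    costs = [spacetime_ssd(left, right, x, y, t, d, w, tw)
             for d in range(max_d + 1)]
    return int(np.argmin(costs))

rng = np.random.default_rng(0)
left = rng.normal(size=(3, 8, 16))
right = np.roll(left, -2, axis=2)   # simulate a true disparity of 2
d_hat = best_disparity(left, right, x=8, y=4, t=1, max_d=4)
```

With more frames in the window the cost is computed over more samples, which is what buys the improved spatial resolution and temporal stability.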
Video
Fitting
FaceIK
Animation
3D face applications: The One
3D face applications: Gladiator
3D face applications: Spiderman 2
Statistical methods
Statistical methods
parameters z → f(z) + ε → observed signal y

z* = arg max_z P(z | y)
   = arg max_z P(y | z) P(z) / P(y)
   = arg min_z L(y | z) + L(z)

(L denotes a negative log-likelihood, so maximizing the probability becomes minimizing a cost.)
Examples: super-resolution, de-noising, de-blocking, inpainting, …
Statistical methods
parameters z → f(z) + ε → observed signal y

z* = arg min_z L(y | z) + L(z)

where L(y | z) = ‖y − f(z)‖² / (2σ²) is the evidence (data term) and L(z) encodes a-priori knowledge.
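As a concrete instance of minimizing L(y|z) + L(z): for denoising, f(z) = z, and with a quadratic smoothness prior the cost can be minimized by plain gradient descent. The step size, λ, and σ here are illustrative choices:

```python
import numpy as np

def denoise_map(y, sigma=1.0, lam=0.3, iters=200, step=0.1):
    """MAP denoising: minimize ||y - z||^2 / (2 sigma^2) + lam * sum |grad z|^2
    (f(z) = z, Gaussian MRF prior) by gradient descent."""
    z = y.astype(float).copy()
    for _ in range(iters):
        data_grad = (z - y) / sigma ** 2
        # discrete (periodic) Laplacian: gradient of the smoothness term
        lap = (np.roll(z, 1, 0) + np.roll(z, -1, 0)
               + np.roll(z, 1, 1) + np.roll(z, -1, 1) - 4.0 * z)
        z -= step * (data_grad - 2.0 * lam * lap)
    return z

rng = np.random.default_rng(1)
clean = np.ones((16, 16))
noisy = clean + rng.normal(0.0, 0.3, clean.shape)
restored = denoise_map(noisy)
```

The data term pulls z toward the observation y while the prior pulls it toward smoothness; λ trades the two off.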
Statistical methods
There are approximately 10^240 possible 10×10 gray-level images. Even humankind as a whole has not seen them all yet. There must be a strong statistical bias.
Takeo Kanade
Approximately 8×10^11 image blocks are seen per day per person.
Generic priors
“Smooth images are good images.”
L(z) = Σ_x ρ(V(z_x)), where V is a local variation (derivative) operator

Gaussian MRF: ρ(d) = d²

Huber MRF: ρ(d) = d² for |d| ≤ T, and T² + 2T(|d| − T) for |d| > T
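The two penalty functions can be written directly; note how the Huber penalty grows only linearly past the threshold T:

```python
import numpy as np

def rho_gaussian(d):
    """Gaussian MRF penalty: rho(d) = d^2."""
    return np.asarray(d, dtype=float) ** 2

def rho_huber(d, T=1.0):
    """Huber MRF penalty: quadratic for |d| <= T, linear beyond, so
    large intensity edges are penalized less harshly than by d^2."""
    a = np.abs(np.asarray(d, dtype=float))
    return np.where(a <= T, a ** 2, T ** 2 + 2.0 * T * (a - T))
```

This is why the Huber prior preserves edges better: a step edge of height 3 costs 9 under the Gaussian penalty but only 5 under Huber (T = 1), so smoothing is not forced across it.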
Example-based priors
“Existing images are good images.”
six 200×200 images ⇒ 2,000,000 pairs
Example-based priors
Example-based priors
low-resolution → high-resolution
Model-based priors
“Face images are good images when working on face images …”
Parametric model: z = WX + μ, with prior L(X)

z* = arg min_z L(y | z) + L(z) becomes, under the parametric model:

X* = arg min_X L(y | WX + μ) + L(X),  z* = WX* + μ
PCA
• Principal Components Analysis (PCA):
approximating a high-dimensional data set with a lower-dimensional subspace
(figure: 2-D data points in the original axes; the first principal component points along the direction of greatest variance, and the second principal component is orthogonal to it)
PCA
• Given n k-dimensional points
• Calculate the mean
• Calculate the covariance matrix
• SVD (eigen-analysis) on the covariance matrix
SVD
A = U W Vᵀ,  W = diag(w₁, …, wₙ)

(For the symmetric covariance matrix, V = U; the columns of U are the principal directions and the wᵢ measure the variance along them.)
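The recipe above (mean, covariance, eigen-analysis) in a few lines of NumPy; taking the SVD of the centered data matrix is equivalent to eigen-analysis of the covariance and avoids forming it explicitly:

```python
import numpy as np

def pca(points, k):
    """PCA of n d-dimensional points: returns the mean, the top-k
    principal directions (as rows), and the corresponding standard
    deviations along those directions."""
    mean = points.mean(axis=0)
    X = points - mean                      # center the data
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return mean, Vt[:k], S[:k] / np.sqrt(len(points) - 1)

# Data stretched along a known direction: the first component recovers it
rng = np.random.default_rng(0)
t = rng.normal(size=(500, 1))
data = t @ np.array([[3.0, 1.0]]) + 0.01 * rng.normal(size=(500, 2))
mean, comps, sdevs = pca(data, 1)
```

For eigenfaces, `points` would be the vectorized face images; `mean` is the average face and the rows of `comps` are the eigenfaces.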
PCA on faces: “eigenfaces”
Average face
First principal component
Other components
For all except the average: “gray” = 0, “white” > 0, “black” < 0
Model-based priors
“Face images are good images when working on face images …”
Parametric model: z = WX + μ, with prior L(X)

X* = arg min_X L(y | WX + μ) + L(X),  z* = WX* + μ
Super-resolution
(a) input low-resolution 24×32 (b) our results (c) cubic B-spline (d) Freeman et al. (e) Baker et al. (f) original high-resolution 96×128
Face models from single images
Morphable model of 3D faces
• Start with a catalogue of 200 aligned 3D Cyberware scans
• Build a model of average shape and texture, and principal variations using PCA
Morphable model
Morphable model of 3D faces
Divide face into 4 regions (eyes, nose, mouth, head)
For each new prototype, find the amount of deviation from the reference shape and texture.
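Synthesis in a morphable model is a linear combination of the average and the principal variations; a sketch for shape (texture works the same way with its own modes and coefficients), with array layout assumed:

```python
import numpy as np

def synthesize_shape(mean_shape, shape_modes, alphas):
    """new shape = average shape + sum_i alpha_i * mode_i.
    mean_shape is a flattened (3n,) vertex vector and shape_modes is
    (m, 3n), one principal shape variation per row."""
    return mean_shape + np.asarray(alphas, dtype=float) @ shape_modes

mean_shape = np.zeros(6)        # toy "average face": 2 vertices, flattened
modes = np.eye(2, 6)            # two toy variation modes
new_shape = synthesize_shape(mean_shape, modes, [2.0, 3.0])
```

Fitting a single image then amounts to searching for the coefficients (and pose/lighting parameters) whose rendered model best matches the photo.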
Morphable model of 3D faces
• Adding some variations
Reconstruction from a single image
Modifying a single image
Animating from a single image
Video
Reanimating faces
35 static scans at different expressions
Videos
exercise, speech
Exchanging faces
Morphable model for the human body
Image-based faces (lip sync.)
Video rewrite
Analysis stage: training video + model → lip video
Synthesis stage: select lip video segments and stitch them together with the background video
Results
• Video database
– 8 minutes of Ellen
– 2 minutes of JFK
• Only about half usable, mainly because of head rotation

training video: “Read my lips.” “I never met Forrest Gump.”
Relighting faces
Light is additive
lamp #1 + lamp #2
Light stage 1.0
64×32 lighting directions
Input images
Reflectance function
occlusion, flare
Relighting
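Because light is additive, relighting with the captured reflectance data reduces to a weighted sum of the one-light-at-a-time basis images; the array shapes here are assumptions for illustration:

```python
import numpy as np

def relight(basis_images, light_weights):
    """Image under a novel environment = sum_i w_i * basis_i.
    basis_images has shape (n_lights, H, W), one image per lighting
    direction (e.g. the 64x32 light stage directions); light_weights
    samples the new environment's intensity in each direction."""
    w = np.asarray(light_weights, dtype=float)
    return np.tensordot(w, np.asarray(basis_images, dtype=float), axes=1)

basis = np.stack([np.zeros((2, 2)), np.ones((2, 2))])  # two toy basis images
img = relight(basis, [0.25, 0.75])
```

In practice the weights come from sampling an environment map at the light stage directions, so the face can be composited into any captured lighting environment.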
Results
Changing viewpoints
Results
Video
Spiderman 2
real vs. synthetic
Light stage 3
Application: The Matrix Reloaded
Reference
• F. Pighin, J. Hecker, D. Lischinski, D. H. Salesin, and R. Szeliski.
Synthesizing realistic facial expressions from photographs.
SIGGRAPH 1998, pp. 75-84.
• Brian Guenter, Cindy Grimm, Henrique Malvar, Daniel Wood, Making Faces, SIGGRAPH 1998.
• Li Zhang, Noah Snavely, Brian Curless, Steven M. Seitz, Spacetime Faces: High Resolution Capture for Modeling and Animation, SIGGRAPH 2004.
• Blanz, V. and Vetter, T., A Morphable Model for the Synthesis of 3D Faces, SIGGRAPH 1999, pp. 187-194.
• V. Blanz, C. Basso, T. Poggio and T. Vetter, Reanimating Faces in Images and Video, EUROGRAPHICS 2003.
• V. Blanz, K. Scherbaum, T. Vetter, H.P. Seidel, Exchanging Faces in Images, EUROGRAPHICS 2004.
• George Borshukov et al., Universal Capture - Image-based Facial Animation for "The Matrix Reloaded", SIGGRAPH 2003 Sketch.
Reference
• George Borshukov et al., Realistic Human Face Rendering for "The Matrix Reloaded", SIGGRAPH 2003 Sketch.
• Paul Debevec, Tim Hawkins, Chris Tchou, Haarm-Pieter Duiker, Westley Sarokin, Mark Sagar, Acquiring the Reflectance Field of a Human Face, SIGGRAPH 2000.
• Paul Debevec, Chris Tchou, Andreas Wenger, Tim Hawkins, Andy Gardner, Brian Emerson, Ansul Panday, A Lighting Reproduction Approach to Live-Action Compositing, SIGGRAPH 2002.
• Mark Sagar, Reflectance Field Rendering of Human Faces for
"Spider-Man 2", SIGGRAPH 2004 Sketch.
• Christoph Bregler, Malcolm Slaney, Michele Covell, Video Rewrite: Driving Visual Speech with Audio, SIGGRAPH 1997.
• Tony Ezzat, Gadi Geiger, Tomaso Poggio, Trainable Videorealistic Speech Animation, SIGGRAPH 2002.
• Brett Allen, Brian Curless, Zoran Popovic, The Space of Human Body Shapes: Reconstruction and Parameterization From Range Scans, SIGGRAPH 2003.