Faces and Image-Based Lighting
Digital Visual Effects, Spring 2007 Yung-Yu Chuang
2007/6/12
with slides by Richard Szeliski, Steve Seitz, Alex Efros, Li-Yi Wei and Paul Debevec
Announcements
• TA evaluation
• Final project:
– Demo on 6/27 (Wednesday) at 1:30pm in this room
– Reports and videos due on 6/28 (Thursday) at 11:59pm
Outline
• Image-based lighting
• 3D acquisition for faces
• Statistical methods (with application to face super-resolution)
• 3D Face models from single images
• Image-based faces
• Relighting for faces
Image-based lighting
Rendering
• Rendering is a function of geometry, reflectance, lighting and viewing.
• To composite CGI into a real scene, we have to match all four factors.
• Viewing can be obtained from calibration or structure from motion.
• Geometry can be captured using 3D photography or made by hand.
• How to capture lighting and reflectance?
Reflectance
• The Bidirectional Reflectance Distribution Function (BRDF)
– Given an incoming ray ω_i and an outgoing ray ω_o, what proportion of the incoming light is reflected along the outgoing ray?
– The answer is given by the BRDF: f(ω_i, ω_o)
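The BRDF definition above can be sketched numerically. This is an illustrative Lambertian (ideal diffuse) BRDF, not a model from the slides; the function names are hypothetical:

```python
import numpy as np

def lambertian_brdf(albedo):
    """Ideal diffuse BRDF: reflects incoming light equally toward all
    outgoing directions; the 1/pi keeps total reflection <= albedo."""
    return albedo / np.pi

def reflected_radiance(albedo, L_i, n, w_i):
    """Radiance reflected from a single incoming ray, following the
    BRDF definition: f * L_i * cos(theta_i)."""
    cos_theta = max(np.dot(n, w_i), 0.0)  # clamp light below the horizon
    return lambertian_brdf(albedo) * L_i * cos_theta

n = np.array([0.0, 0.0, 1.0])    # surface normal
w_i = np.array([0.0, 0.0, 1.0])  # light straight overhead
print(reflected_radiance(0.5, 1.0, n, w_i))  # 0.5/pi ~ 0.159
```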
Rendering equation

L_o(p, ω_o) = L_e(p, ω_o) + ∫_{S²} f(p, ω_o, ω_i) L_i(p, ω_i) cos θ_i dω_i

The outgoing radiance L_o(p, ω_o) is a 5D light field (3D position + 2D direction).
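The integral in the rendering equation is usually estimated by Monte Carlo sampling. A minimal sketch under simplifying assumptions (Lambertian BRDF, constant incoming radiance, uniform hemisphere sampling); the analytic answer in this special case is albedo × L_i:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_hemisphere(n_samples, rng):
    """Uniform directions on the upper hemisphere (z up); by Archimedes'
    theorem, z = cos(theta) is uniform in [0, 1]."""
    z = rng.uniform(0.0, 1.0, n_samples)
    phi = rng.uniform(0.0, 2 * np.pi, n_samples)
    r = np.sqrt(1.0 - z * z)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

def render_mc(albedo, L_i, n_samples=200_000, rng=rng):
    """Monte Carlo estimate of the reflection integral for a Lambertian
    BRDF (f = albedo/pi) under constant radiance L_i; the pdf of uniform
    hemisphere sampling is 1/(2*pi), hence the 2*pi factor."""
    dirs = sample_hemisphere(n_samples, rng)
    cos_theta = dirs[:, 2]             # normal is (0, 0, 1)
    f = albedo / np.pi
    return np.mean(f * L_i * cos_theta) * 2 * np.pi

print(render_mc(0.8, 1.0))   # converges to albedo * L_i = 0.8
```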
Complex illumination

L_o(p, ω_o) = L_e(p, ω_o) + ∫_{S²} f(p, ω_o, ω_i) L_i(p, ω_i) cos θ_i dω_i

With distant lighting L_d, the reflected radiance becomes

B(p, ω_o) = ∫_{S²} f(p, ω_o, ω_i) L_d(p, ω_i) cos θ_i dω_i

where f is the reflectance and L_d the lighting.
Point lights
Classically, rendering is performed assuming point light sources
directional source
Environment maps
Miller and Hoffman, 1984
Capturing reflectance
Acquiring the Light Probe
HDRI Sky Probe
Clipped Sky + Sun Source
Lit by sun only
Lit by sky only
Lit by sun and sky
Illuminating a Small Scene
Real Scene Example
• Goal: place synthetic objects on table
Light Probe / Calibration Grid
real scene
Modeling the Scene
light-based model
The Light-Based Room Model
Rendering into the Scene
• Background Plate
Rendering into the scene
• Objects and Local Scene matched to Scene
Differential rendering
• Local scene w/o objects, illuminated by model
Differential rendering
final result = background plate + (local scene with objects − local scene without objects)
Differential Rendering
• Final Result
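Differential rendering can be sketched as image arithmetic. This is a simplified sketch assuming aligned float images in [0, 1] and a known mask of synthetic-object pixels (both assumptions; the paper restricts the arithmetic to the local-scene region):

```python
import numpy as np

def differential_render(background, with_objects, without_objects, obj_mask):
    """Where the synthetic object covers the pixel, take the full render;
    elsewhere, add only the *change* the object causes (shadows,
    interreflections) on top of the real photograph."""
    delta = with_objects - without_objects     # object's effect on local scene
    out = np.clip(background + delta, 0.0, 1.0)
    out[obj_mask] = with_objects[obj_mask]     # object pixels come from the render
    return out

bg = np.full((2, 2), 0.5)
with_obj = np.array([[0.9, 0.3], [0.4, 0.4]])  # object at (0,0), shadow at (0,1)
without_obj = np.full((2, 2), 0.4)
mask = np.array([[True, False], [False, False]])
print(differential_render(bg, with_obj, without_obj, mask))
```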
Environment map from single image?
Eye as light probe! (Nayar et al.)
The cornea is approximately an ellipsoid
Results
Application in “The Matrix Reloaded”
3D acquisition for faces
Cyberware scanners
face & head scanner whole body scanner
Making facial expressions from photos
• Similar to Façade, use a generic face model and view-dependent texture mapping
• Procedure
1. Take multiple photographs of a person
2. Establish corresponding feature points
3. Recover 3D points and camera parameters
4. Deform the generic face model to fit points
5. Extract textures from photos
Reconstruct a 3D model
input photographs
generic 3D face model
pose estimation
more features
deformed model
Mesh deformation
generic model displacement deformed model
– Compute displacement of feature points
– Apply scattered data interpolation
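The two deformation steps above can be sketched with a Gaussian radial basis function interpolant (an assumption for illustration; the paper's exact scattered-data kernel may differ):

```python
import numpy as np

def rbf_interpolate(feature_pts, displacements, query_pts, sigma=1.0):
    """Scattered-data interpolation: spread known displacements at a few
    feature points onto arbitrary query points (e.g., all mesh vertices)."""
    def kernel(a, b):
        d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=2)
        return np.exp(-d2 / (2 * sigma ** 2))
    # Solve for RBF weights that exactly reproduce the known displacements.
    w = np.linalg.solve(kernel(feature_pts, feature_pts), displacements)
    # Evaluate the interpolant at the query points.
    return kernel(query_pts, feature_pts) @ w

pts = np.array([[0.0, 0.0], [1.0, 0.0]])
disp = np.array([[0.1], [0.3]])
# At the feature points themselves we recover the input displacements.
print(rbf_interpolate(pts, disp, pts))
```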
Texture extraction
• The color at each point is a weighted combination of the colors in the photos
• Texture can be:
– view-independent
– view-dependent
• Considerations for weighting
– occlusion
– smoothness
– positional certainty
– view similarity
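The weighted combination can be sketched as follows; the weights here are illustrative placeholders, whereas a real system would derive them from the considerations listed above:

```python
import numpy as np

def blend_texture(colors, weights):
    """Per-point texture color as a weighted combination of the colors
    observed in each photo."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()   # normalize weights to sum to 1
    return np.tensordot(weights, np.asarray(colors, dtype=float), axes=1)

# Three photos observe the same surface point with different confidence.
colors = [[0.8, 0.2, 0.2], [0.6, 0.2, 0.2], [0.9, 0.3, 0.3]]
weights = [0.5, 0.3, 0.2]   # e.g., favor the most frontal, unoccluded view
print(blend_texture(colors, weights))  # [0.76, 0.22, 0.22]
```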
Texture extraction
Texture extraction
view-independent view-dependent
Texture extraction
Model reconstruction
Use images to adapt a generic face model.
Creating new expressions
• In addition to global blending we can use:
– Regional blending
– Painterly interface
Creating new expressions
Applying a global blend
½ · (expression A) + ½ · (expression B) = blended expression
New expressions are created with 3D morphing:
Applying a region-based blend
Creating new expressions
Using a painterly interface
Creating new expressions
Drunken smile
Animating between expressions
Morphing over time creates animation:
“neutral” “joy”
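Morphing over time is linear interpolation of corresponding vertices; a sketch assuming the two expression meshes share the same topology:

```python
import numpy as np

def morph(verts_a, verts_b, t):
    """Linear 3D morph between two expression meshes with shared
    topology; t=0 gives expression A, t=1 gives expression B."""
    return (1.0 - t) * verts_a + t * verts_b

neutral = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
joy = np.array([[0.0, 0.2, 0.0], [1.0, 0.4, 0.0]])
# Sampling t over [0, 1] yields the in-between animation frames.
print(morph(neutral, joy, 0.5))
```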
Video
Spacetime faces
video projectors color cameras
black & white cameras
Spacetime faces
[Figure: the face surface recovered over time by stereo, active stereo, and spacetime stereo; spacetime stereo extends the matching window across both space and time]
Spacetime Stereo
surface motion
Better
• spatial resolution
• temporal stability
Spacetime stereo matching
Video
Fitting
FaceIK
Animation
3D face applications: The one
3D face applications: Gladiator
Statistical methods
Statistical methods
observed signal: y = f(z) + ε, where z are the parameters

z* = arg max_z P(z | y)
   = arg max_z P(y | z) P(z) / P(y)
   = arg min_z [ L(y | z) + L(z) ]      (L = −log P)

Examples: super-resolution, de-noising, de-blocking, inpainting, …
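The MAP formulation above has a closed form in the simplest case. A toy sketch with f(z) = z (de-noising) and a quadratic Gaussian prior λ‖z‖², a hypothetical choice for illustration:

```python
import numpy as np

def map_denoise(y, sigma=0.1, lam=1.0):
    """MAP estimate with Gaussian likelihood L(y|z) = ||y - z||^2 / (2 sigma^2)
    and toy prior L(z) = lam * ||z||^2. Setting the derivative to zero gives
    the closed form z* = y / (1 + 2 * lam * sigma^2): the observation is
    shrunk toward the prior's mean (zero)."""
    return y / (1.0 + 2.0 * lam * sigma ** 2)

y = np.array([1.0, -0.5, 0.2])
print(map_denoise(y, sigma=0.1, lam=1.0))  # y / 1.02
```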
Statistical methods
z* = arg min_z [ L(y | z) + L(z) ]

L(y | z) = ‖y − f(z)‖² / (2σ²)   ← data evidence
L(z)                             ← a-priori knowledge

observed signal: y = f(z) + ε, where z are the parameters
Statistical methods
There are approximately 10^240 possible 10×10 gray-level images. Even humankind has not seen them all yet. There must be a strong statistical bias.
Takeo Kanade
Approximately 8×10^11 image blocks are seen per day per person.
Generic priors
“Smooth images are good images.”
L(z) = Σ_x ρ(∇z(x))

Gaussian MRF: ρ(d) = d²

Huber MRF: ρ(d) = d² if |d| ≤ T;  ρ(d) = 2T|d| − T² if |d| > T
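The two potentials can be written directly. A sketch of the Huber MRF smoothness prior summed over image gradients (the threshold T and the test image are illustrative):

```python
import numpy as np

def huber_rho(d, T=0.5):
    """Huber MRF potential: quadratic for small gradients (|d| <= T),
    linear beyond, so strong edges are penalized less than by a
    Gaussian MRF (which is quadratic everywhere)."""
    d = np.abs(d)
    return np.where(d <= T, d ** 2, 2 * T * d - T ** 2)

def smoothness_prior(z, T=0.5):
    """L(z): sum of the potential over horizontal and vertical gradients."""
    gx = np.diff(z, axis=1)
    gy = np.diff(z, axis=0)
    return huber_rho(gx, T).sum() + huber_rho(gy, T).sum()

z = np.array([[0.0, 0.1], [0.0, 1.0]])  # one weak edge, one strong edge
print(smoothness_prior(z))  # 0.01 + 0.75 + 0.65 = 1.41
```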
Generic priors
Example-based priors
“Existing images are good images.”
six 200×200 images ⇒ 2,000,000 patch pairs
Example-based priors
L(z)
Example-based priors
low-resolution high-resolution
Model-based priors
“Face images are good images when working on face images …”
Parametric model: z = WX + μ, with prior L(X)

z* = arg min_z [ L(y | z) + L(z) ]

⇒ X* = arg min_X [ L(y | WX + μ) + L(X) ],  z* = WX* + μ
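Fitting the parametric model to an observation: with a Gaussian likelihood and a flat prior on X, minimizing L(y | WX + μ) reduces to linear least squares. A toy sketch with made-up W and μ, not a real face basis:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model z = W X + mu: 6-dimensional "images", 2-dimensional parameters.
W = rng.standard_normal((6, 2))
mu = rng.standard_normal(6)

def fit_parameters(y, W, mu):
    """Least-squares estimate of X given observation y = W X + mu + noise."""
    X, *_ = np.linalg.lstsq(W, y - mu, rcond=None)
    return X

X_true = np.array([0.7, -1.2])
y = W @ X_true + mu                 # noise-free observation
print(fit_parameters(y, W, mu))     # recovers X_true
```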
PCA
• Principal Components Analysis (PCA):
approximating a high-dimensional data set with a lower-dimensional subspace
[Figure: 2D data points with the original axes and the first and second principal components]
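PCA itself is a few lines of linear algebra. A sketch on toy data (the real eigenfaces computation flattens each face image into one row of the data matrix):

```python
import numpy as np

def pca(data, k):
    """PCA via SVD: center the data, then keep the top-k right singular
    vectors as principal components ('eigenfaces' when the rows are
    flattened face images)."""
    mean = data.mean(axis=0)
    _, _, Vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, Vt[:k]              # average face, k components

def project(face, mean, components):
    """Low-dimensional code of a face in the PCA subspace."""
    return components @ (face - mean)

# Toy data: five 'faces' of 4 pixels, varying mostly along one direction.
data = np.array([[0., 0, 0, 0], [1, 1, 0, 0], [2, 2, 0, 0],
                 [3, 3, 0, 0], [4, 4, 0, 0]])
mean, comps = pca(data, k=1)
print(mean)                           # the average face
print(project(data[0], mean, comps))  # coordinate along 1st component
```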
PCA on faces: “eigenfaces”
Average face
First principal component
Other components
For all except the average: “gray” = 0, “white” > 0, “black” < 0
Model-based priors
“Face images are good images when working on face images …”
Parametric model: z = WX + μ, with prior L(X)

z* = arg min_z [ L(y | z) + L(z) ]

⇒ X* = arg min_X [ L(y | WX + μ) + L(X) ],  z* = WX* + μ
Super-resolution
(a) (b) (c) (d) (e) (f)
(a) Input low 24×32 (b) Our results (c) Cubic B-Spline
(d) Freeman et al. (e) Baker et al. (f) Original high 96×128
Face models from single images
Morphable model of 3D faces
• Start with a catalogue of 200 aligned 3D Cyberware scans
• Build a model of average shape and texture, and principal variations using PCA
Morphable model
shape exemplars texture exemplars
Morphable model of 3D faces
• Adding some variations
Reconstruction from single image
Rendering must be similar to
the input if we guess right
Reconstruction from single image
shape and texture priors are learnt from the database; ρ is the set of rendering parameters, including camera pose, lighting, and so on
prior
Modifying a single image
Animating from a single image
Video
Morphable model for human body
Image-based faces
(lip sync.)
Video rewrite (analysis)
Video rewrite (synthesis)
Results
• Video database
– 2 minutes of JFK
• Only half usable
• Head rotation
training video Read my lips.
I never met Forrest Gump.
Morphable speech model
Preprocessing
Prototypes (PCA + k-means clustering)
We find I_i and C_i for each prototype image.
Morphable model
I(α, β)
synthesis
analysis
Morphable model
synthesis
analysis
Synthesis
Results
Relighting faces
Light is additive
lamp #1 lamp #2
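Additivity is what makes light-stage relighting work: images captured one light at a time can be recombined linearly under any new lighting. A minimal numpy sketch with made-up 2×2 "photos":

```python
import numpy as np

def relight(basis_images, light_intensities):
    """Because light transport is linear, an image under any lighting is a
    weighted sum of one-light-at-a-time basis images (shape: [n_lights, H, W])."""
    w = np.asarray(light_intensities, dtype=float)
    return np.tensordot(w, basis_images, axes=1)

# Two 'lamps': photographs of the same scene under each lamp alone.
lamp1 = np.array([[0.2, 0.1], [0.0, 0.3]])
lamp2 = np.array([[0.0, 0.4], [0.2, 0.1]])
both = relight(np.stack([lamp1, lamp2]), [1.0, 1.0])
print(both)   # equals lamp1 + lamp2, pixel by pixel
```

Scaling the weights simulates dimming or brightening each source without re-photographing the scene.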
Light stage 1.0
Light stage 1.0
64×32 lighting directions
Input images
Reflectance function
occlusion flare
Relighting
Results
Changing viewpoints
Results
3D face applications: Spiderman 2
Spiderman 2
real synthetic
Application: The Matrix Reloaded
Application: The Matrix Reloaded
References
• Paul Debevec, Rendering Synthetic Objects into Real Scenes:
Bridging Traditional and Image-based Graphics with Global Illumination and High Dynamic Range Photography,
SIGGRAPH 1998.
• F. Pighin, J. Hecker, D. Lischinski, D. H. Salesin, and R.
Szeliski. Synthesizing realistic facial expressions from photographs. SIGGRAPH 1998, pp. 75-84.
• Li Zhang, Noah Snavely, Brian Curless, Steven M. Seitz,
Spacetime Faces: High Resolution Capture for Modeling and Animation, SIGGRAPH 2004.
• Blanz, V. and Vetter, T., A Morphable Model for the Synthesis of 3D Faces, SIGGRAPH 1999, pp. 187-194.
• Paul Debevec, Tim Hawkins, Chris Tchou, Haarm-Pieter Duiker, Westley Sarokin, Mark Sagar, Acquiring the
Reflectance Field of a Human Face, SIGGRAPH 2000.
• Christoph Bregler, Malcolm Slaney, Michele Covell, Video
Rewrite: Driving Visual Speech with Audio, SIGGRAPH 1997.
• Tony Ezzat, Gadi Geiger, Tomaso Poggio, Trainable Videorealistic Speech Animation, SIGGRAPH 2002.