Faces and Image-Based Lighting
Digital Visual Effects
Yung-Yu Chuang
with slides by Richard Szeliski, Steve Seitz, Alex Efros, Li-Yi Wei and Paul Debevec
Outline
• Image-based lighting
• 3D acquisition for faces
• Statistical methods (with application to face super-resolution)
• 3D Face models from single images
• Image-based faces
• Relighting for faces
Image-based lighting
Rendering
• Rendering is a function of geometry, reflectance, lighting and viewing.
• To synthesize CGI into a real scene, we have to match these four factors.
• Viewing can be obtained from calibration or structure from motion.
• Geometry can be captured using 3D photography or made by hand.
• How to capture lighting and reflectance?
Reflectance
• The Bidirectional Reflectance Distribution Function (BRDF)
– Given an incoming ray $\omega_i$ and an outgoing ray $\omega_o$ at a surface point $p$ with a given surface normal, what proportion of the incoming light is reflected along the outgoing ray?
– Answer given by the BRDF: $f(p, \omega_o, \omega_i)$
Rendering equation
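$$L_o(p,\omega_o) = L_e(p,\omega_o) + \int_{S^2} f(p,\omega_o,\omega_i)\, L_i(p,\omega_i) \cos\theta_i \, d\omega_i$$
[Figure: incident radiance $L_i(p,\omega_i)$ arriving along $\omega_i$ and outgoing radiance $L_o(p,\omega_o)$ leaving along $\omega_o$ at point $p$; $L_o(p,\omega_o)$ is labeled as the 5D light field]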
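As a rough numerical illustration of the rendering equation, the sketch below estimates the reflected term by Monte Carlo integration. It makes several assumptions that are not in the slides: the emission term $L_e$ is zero, the surface is opaque so only the upper hemisphere around the normal (taken as the +z axis) contributes, and the names `sample_hemisphere` and `reflected_radiance` are hypothetical helpers.

```python
import math
import random

def sample_hemisphere(rng):
    """Uniformly sample a unit direction with z >= 0 (pdf = 1 / (2*pi))."""
    z = rng.random()                      # cos(theta) is uniform in [0, 1)
    phi = 2.0 * math.pi * rng.random()
    r = math.sqrt(max(0.0, 1.0 - z * z))
    return (r * math.cos(phi), r * math.sin(phi), z)

def reflected_radiance(brdf, incident, n_samples=20000, seed=0):
    """Monte Carlo estimate of the integral of f * L_i * cos(theta_i) d(omega_i)."""
    rng = random.Random(seed)
    pdf = 1.0 / (2.0 * math.pi)
    total = 0.0
    for _ in range(n_samples):
        wi = sample_hemisphere(rng)
        cos_theta = wi[2]                 # the normal is the +z axis
        total += brdf(wi) * incident(wi) * cos_theta / pdf
    return total / n_samples

albedo = 0.5
lambert = lambda wi: albedo / math.pi     # constant (Lambertian) BRDF
uniform_sky = lambda wi: 1.0              # unit radiance from every direction
L_o = reflected_radiance(lambert, uniform_sky)
```

For a Lambertian BRDF under uniform unit radiance the integral has the closed form `albedo`, so `L_o` should land close to 0.5; this is a convenient sanity check for the estimator.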
Complex illumination
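$$L_o(p,\omega_o) = L_e(p,\omega_o) + \int_{S^2} f(p,\omega_o,\omega_i)\, L_i(p,\omega_i) \cos\theta_i \, d\omega_i$$
$$B(p,\omega_o) = \int_{S^2} f(p,\omega_o,\omega_i)\, L_d(p,\omega_i) \cos\theta_i \, d\omega_i$$
$$B_p(\omega_o) = \int_{S^2} f_{p,\omega_o}(\omega_i)\, L_d(\omega_i) \cos\theta_i \, d\omega_i$$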
Point lights
Classically, rendering is performed assuming point light sources
[Figure: a directional source]
Natural illumination
People perceive materials more easily under natural illumination than simplified illumination.
Images courtesy Ron Dror and Ted Adelson
Natural illumination
Rendering with natural illumination is more expensive than rendering with simplified illumination
[Figure: rendering with a directional source vs. natural illumination]
Environment maps
Miller and Hoffman, 1984
HDR lighting
Examples of complex environment light
Complex illumination
$$B_p(\omega_o) = \int_{S^2} f_{p,\omega_o}(\omega_i)\, L_d(\omega_i) \cos\theta_i \, d\omega_i$$
Here $f_{p,\omega_o}$ is the reflectance and $L_d$ is the lighting. Both are spherical functions.
Function approximation
• G(x): the function to approximate
• B1(x), B2(x), … Bn(x): basis functions
• We want
$$G(x) \approx \sum_{i=1}^{n} c_i B_i(x)$$
• Storing a finite number of coefficients ci gives an approximation of G(x)
Function approximation
• How to find coefficients $c_i$?
– Minimize an error measure
• What error measure?
– L2 error
$$E_{L_2} = \int_I \Big[ G(x) - \sum_i c_i B_i(x) \Big]^2 dx$$
• Coefficients
$$c_i = \langle G, B_i \rangle = \int_X G(x)\, B_i(x)\, dx$$
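A small sketch of this projection, assuming an orthonormal basis so that each coefficient is just the inner product of the function with a basis function. The basis used here (the first three normalized Legendre polynomials on [-1, 1]), the midpoint-rule integrator and all function names are choices made for illustration, not something the slides specify.

```python
import math

# First three Legendre polynomials, scaled to be orthonormal on [-1, 1].
BASIS = [
    lambda x: math.sqrt(0.5),
    lambda x: math.sqrt(1.5) * x,
    lambda x: math.sqrt(5.0 / 8.0) * (3.0 * x * x - 1.0),
]

def integrate(fn, a=-1.0, b=1.0, n=2000):
    """Midpoint-rule approximation of the integral of fn over [a, b]."""
    h = (b - a) / n
    return h * sum(fn(a + (k + 0.5) * h) for k in range(n))

def project(g, basis=BASIS):
    """Coefficients c_i = integral of g(x) * B_i(x) dx."""
    return [integrate(lambda x, b=b: g(x) * b(x)) for b in basis]

def reconstruct(coeffs, x, basis=BASIS):
    """Approximation sum_i c_i * B_i(x)."""
    return sum(c * b(x) for c, b in zip(coeffs, basis))

coeffs = project(math.exp)
approx = reconstruct(coeffs, 0.3)   # compare with math.exp(0.3)
```

With only three coefficients the reconstruction of `exp` is a quadratic, so it is close but not exact; a function that already lies in the span of the basis (such as `x * x`) is reproduced up to integration error.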
Function approximation
• Basis functions are pieces of signal that can be used to produce approximations to a function
[Figure: a signal projected onto three basis functions, giving coefficients $c_1$, $c_2$, $c_3$]
Function approximation
• We can then use these coefficients to reconstruct an approximation to the original signal
[Figure: each basis function scaled by its coefficient $c_1$, $c_2$, $c_3$]
Function approximation
• We can then use these coefficients to reconstruct an approximation to the original signal
$$\sum_{i=1}^{N} c_i B_i(x)$$
Orthogonal basis functions
• Orthogonal basis functions
– These are families of functions with special properties
$$\int B_i(x)\, B_j(x)\, dx = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}$$
– Intuitively, it's like functions don't overlap each other's footprint
• A bit like the way a Fourier transform breaks a function into component sine waves
Integral of product
$$I = \int F(x)\, G(x)\, dx$$
$$F(x) = \sum_i f_i B_i(x), \qquad G(x) = \sum_j g_j B_j(x)$$
$$\int F(x)\, G(x)\, dx = \int \Big( \sum_i f_i B_i(x) \Big) \Big( \sum_j g_j B_j(x) \Big) dx = \sum_i \sum_j f_i g_j \int B_i(x)\, B_j(x)\, dx = \sum_i f_i g_i = \hat{F} \cdot \hat{G}$$
$$B_p(\omega_o) = \int_{S^2} f_{p,\omega_o}(\omega_i)\, L_d(\omega_i) \cos\theta_i \, d\omega_i$$
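The identity can be checked numerically: for an orthonormal basis, the integral of a product of two functions equals the dot product of their coefficient vectors. The sketch below assumes the same illustrative basis of normalized Legendre polynomials on [-1, 1]; the two test functions are picked to lie inside the span of the basis so that the identity holds without truncation error. All names are hypothetical.

```python
import math

# Orthonormal basis on [-1, 1] (normalized Legendre polynomials).
BASIS = [
    lambda x: math.sqrt(0.5),
    lambda x: math.sqrt(1.5) * x,
    lambda x: math.sqrt(5.0 / 8.0) * (3.0 * x * x - 1.0),
]

def integrate(fn, a=-1.0, b=1.0, n=4000):
    """Midpoint-rule approximation of the integral of fn over [a, b]."""
    h = (b - a) / n
    return h * sum(fn(a + (k + 0.5) * h) for k in range(n))

def project(g):
    """Coefficient vector of g in BASIS."""
    return [integrate(lambda x, b=b: g(x) * b(x)) for b in BASIS]

F = lambda x: 1.0 + 2.0 * x
G = lambda x: x * x - x

direct = integrate(lambda x: F(x) * G(x))                    # integral of F * G
dot = sum(f * g for f, g in zip(project(F), project(G)))     # F_hat . G_hat
```

Both `direct` and `dot` approximate the same value, which is why lighting integrals of this form can be reduced to a dot product once reflectance and lighting are expressed in the same orthonormal basis.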
Basis functions
• Transform data to a space in which we can capture the essence of the data better
• Spherical harmonics, similar to the Fourier transform but in the spherical domain, are used in PRT (precomputed radiance transfer).
Real spherical harmonics
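• A system of signed, orthogonal functions over the sphere
• Represented in spherical coordinates by the function
$$y_l^m(\theta,\varphi) = \begin{cases} \sqrt{2}\, K_l^m \cos(m\varphi)\, P_l^m(\cos\theta), & m > 0 \\ \sqrt{2}\, K_l^m \sin(-m\varphi)\, P_l^{-m}(\cos\theta), & m < 0 \\ K_l^0\, P_l^0(\cos\theta), & m = 0 \end{cases}$$
where l is the band and m is the index within the band; $K_l^m$ is a normalization constant (depending on $|m|$) and $P_l^m$ is an associated Legendre polynomial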
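A minimal sketch of evaluating the real spherical harmonics from this definition. The recurrence used for the associated Legendre polynomials and the form of the normalization constant are the standard ones; the function names `legendre`, `norm_k` and `real_sh` are hypothetical, and sign conventions for $m \neq 0$ differ between references.

```python
import math

def legendre(l, m, x):
    """Associated Legendre polynomial P_l^m(x) for 0 <= m <= l, by recurrence."""
    pmm = 1.0
    if m > 0:
        somx2 = math.sqrt((1.0 - x) * (1.0 + x))
        fact = 1.0
        for _ in range(m):
            pmm *= -fact * somx2
            fact += 2.0
    if l == m:
        return pmm
    pmmp1 = x * (2.0 * m + 1.0) * pmm
    if l == m + 1:
        return pmmp1
    pll = 0.0
    for ll in range(m + 2, l + 1):
        pll = ((2.0 * ll - 1.0) * x * pmmp1 - (ll + m - 1.0) * pmm) / (ll - m)
        pmm, pmmp1 = pmmp1, pll
    return pll

def norm_k(l, m):
    """Normalization constant K_l^m; it depends only on |m|."""
    m = abs(m)
    return math.sqrt((2.0 * l + 1.0) * math.factorial(l - m)
                     / (4.0 * math.pi * math.factorial(l + m)))

def real_sh(l, m, theta, phi):
    """Real spherical harmonic y_l^m at polar angle theta and azimuth phi."""
    x = math.cos(theta)
    if m == 0:
        return norm_k(l, 0) * legendre(l, 0, x)
    if m > 0:
        return math.sqrt(2.0) * norm_k(l, m) * math.cos(m * phi) * legendre(l, m, x)
    return math.sqrt(2.0) * norm_k(l, m) * math.sin(-m * phi) * legendre(l, -m, x)
```

Band 0 is the constant $\sqrt{1/(4\pi)}$, so `real_sh(0, 0, theta, phi)` returns the same value for every direction.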
Real spherical harmonics
Reading SH diagrams
[Figure: an SH lobe diagram with positive (+) and negative (–) lobes; arrows mark "This direction" and "Not this direction"]
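The SH functions
[Figure: plots of $y_0^0$; $y_1^{-1}$, $y_1^0$, $y_1^1$; $y_2^{-2}$, $y_2^{-1}$, $y_2^0$, $y_2^1$, $y_2^2$]
Spherical harmonics
[Figure: $Y_{lm}(\theta,\varphi)$ arranged by band $l = 0, 1, 2$ and index $m = -2, \dots, 2$; $l = 0$: $1$; $l = 1$: $y$, $z$, $x$; $l = 2$: $xy$, $yz$, $3z^2 - 1$, $zx$, $x^2 - y^2$]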
SH projection
• First we define a strict order for SH functions
$$i = l(l+1) + m$$
• Project a spherical function into a vector of SH coefficients
$$c_i = \int_S f(s)\, y_i(s)\, ds$$
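One possible way to carry out the projection is Monte Carlo integration over uniformly sampled directions. The sketch below is limited to bands 0 and 1, written in Cartesian form with the commonly used constants 0.282095 and 0.488603; the sampling scheme, the sample count and the names `sh_index`, `sample_sphere` and `sh_project` are assumptions for illustration.

```python
import math
import random

def sh_index(l, m):
    """Flatten (band l, index m) into the single index i = l(l+1) + m."""
    return l * (l + 1) + m

# Real SH for bands 0 and 1 in Cartesian form, ordered by sh_index.
# Sign conventions vary between references; these are the unsigned forms.
SH_BASIS = [
    lambda x, y, z: 0.282095,        # l=0, m=0
    lambda x, y, z: 0.488603 * y,    # l=1, m=-1
    lambda x, y, z: 0.488603 * z,    # l=1, m=0
    lambda x, y, z: 0.488603 * x,    # l=1, m=1
]

def sample_sphere(rng):
    """Uniformly sample a unit direction on the sphere (pdf = 1 / (4*pi))."""
    z = 1.0 - 2.0 * rng.random()
    phi = 2.0 * math.pi * rng.random()
    r = math.sqrt(max(0.0, 1.0 - z * z))
    return (r * math.cos(phi), r * math.sin(phi), z)

def sh_project(f, n_samples=20000, seed=1):
    """Estimate c_i = integral over the sphere of f(s) * y_i(s) ds."""
    rng = random.Random(seed)
    coeffs = [0.0] * len(SH_BASIS)
    for _ in range(n_samples):
        s = sample_sphere(rng)
        value = f(*s)
        for i, basis in enumerate(SH_BASIS):
            coeffs[i] += value * basis(*s)
    weight = 4.0 * math.pi / n_samples   # 1 / (N * pdf)
    return [c * weight for c in coeffs]

clamped_cosine = lambda x, y, z: max(z, 0.0)
coeffs = sh_project(clamped_cosine)
```

For the clamped cosine lobe around +z, only the `l=0` and `(l=1, m=0)` coefficients should be significantly different from zero, which matches the symmetry of the function.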
SH reconstruction
• To reconstruct the approximation to a function from $N$ bands ($N^2$ coefficients)
$$\tilde{f}(s) = \sum_{i=0}^{N^2 - 1} c_i\, y_i(s)$$
• We truncate the infinite series of SH functions to give a low frequency approximation
Examples of reconstruction
An example
• Take a function comprised of two area light sources
– SH project them into 4 bands = 16 coefficients
[Equation: the vector of 16 projected SH coefficients]
Low frequency light source
• We reconstruct the signal
– Using only these coefficients to find a low frequency approximation to the original light source
SH lighting for diffuse objects
• An Efficient Representation for Irradiance Environment Maps, Ravi Ramamoorthi and Pat Hanrahan, SIGGRAPH 2001
• Assumptions
– Diffuse surfaces
– Distant illumination
– No shadowing, interreflection
$$B(p,\omega_o) = \int_{S^2} f(p,\omega_o,\omega_i)\, L_d(p,\omega_i) \cos\theta_i \, d\omega_i$$
$$B(p, n) = \rho(p)\, E(n)$$
irradiance is a function of surface normal
Diffuse reflection
$$B = \rho E$$
radiosity (image intensity) = reflectance (albedo/texture) × irradiance (incoming light)
[Figure: a Quake light map as an example]
Irradiance environment maps
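[Figure: Illumination Environment Map $L$ and the resulting Irradiance Environment Map $E$, indexed by the surface normal $n$]
$$E(n) = \int_{\Omega(n)} L(\omega)\, (n \cdot \omega)\, d\omega$$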
Spherical harmonic expansion
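Expand lighting (L), irradiance (E) in basis functions
$$L(\theta,\varphi) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} L_{lm} Y_{lm}(\theta,\varphi)$$
$$E(\theta,\varphi) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} E_{lm} Y_{lm}(\theta,\varphi)$$
[Figure: lighting expressed as .67 × (first basis image) + .36 × (second basis image) + …]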
Analytic irradiance formula
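Lambertian surface acts like low-pass filter
$$E_{lm} = A_l L_{lm}$$
[Figure: plot of $A_l$ (the cosine term) against $l$, with $A_0 = \pi$, $A_1 = 2\pi/3$, $A_2 = \pi/4$]
$$A_l = 2\pi\, \frac{(-1)^{l/2 - 1}}{(l+2)(l-1)} \left[ \frac{l!}{2^l \left( (l/2)! \right)^2} \right], \quad l \text{ even}$$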
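The analytic formula is easy to tabulate. The sketch below evaluates $A_l$ and applies $E_{lm} = A_l L_{lm}$ to a dictionary of lighting coefficients; it uses $A_1 = 2\pi/3$ and $A_l = 0$ for odd $l > 1$, which are the known values for the clamped cosine kernel. The function names and the dictionary layout keyed by `(l, m)` are hypothetical.

```python
import math

def lambert_coeff(l):
    """A_l for the clamped-cosine (Lambertian) kernel."""
    if l == 1:
        return 2.0 * math.pi / 3.0
    if l % 2 == 1:
        return 0.0                       # odd bands above 1 vanish
    half = l // 2
    sign = -1.0 if (half - 1) % 2 else 1.0
    return (2.0 * math.pi * sign / ((l + 2) * (l - 1))
            * math.factorial(l) / (2 ** l * math.factorial(half) ** 2))

def irradiance_coeffs(lighting):
    """Map lighting coefficients {(l, m): L_lm} to irradiance {(l, m): E_lm}."""
    return {(l, m): lambert_coeff(l) * value for (l, m), value in lighting.items()}

# Hypothetical lighting coefficients for bands 0..2 (9 parameters).
lighting = {(l, m): 1.0 for l in range(3) for m in range(-l, l + 1)}
irradiance = irradiance_coeffs(lighting)
```

Because $A_l$ decays quickly with $l$, bands above 2 contribute little, which is the motivation for keeping only 9 coefficients.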
9 parameter approximation
[Figure: exact image compared with the order 0 approximation (1 term); RMS error = 25%]
9 Parameter Approximation
[Figure: exact image compared with the order 1 approximation (4 terms); RMS error = 8%]
9 Parameter Approximation
[Figure: exact image compared with the order 2 approximation (9 terms); RMS error = 1%]
For any illumination, average error < 3% [Basri Jacobs 01]
Comparison
Incident illumination: 300x300
Irradiance map by hemispherical integration: texture 256x256, 2 hrs, time ∝ 300 × 300 × 256 × 256
Irradiance map by spherical harmonic coefficients: texture 256x256, 1 sec, time ∝ 9 × 256 × 256
Complex geometry
Assume no shadowing: simply use surface normal
Natural illumination
For diffuse objects, rendering with natural illumination can be done quickly
[Figure: rendering with a directional source vs. natural illumination]
Video
Acquiring the Light Probe
HDRI Sky Probe
[Figure panels: Clipped Sky + Sun Source; Lit by sun only; Lit by sky only; Lit by sun and sky]
Illuminating a Small Scene
Real Scene Example
• Goal: place synthetic objects on table
Light Probe / Calibration Grid
Modeling the Scene
[Figure: real scene and its light-based model]
The Light-Based Room Model
Rendering into the Scene
• Background Plate
Rendering into the scene
• Objects and Local Scene matched to Scene
Differential rendering
• Local scene w/o objects, illuminated by model
Differential rendering
[Figure: (local scene with objects) - (local scene w/o objects) = difference]
Differential rendering
[Figure: background plate + difference = final composite]
Differential Rendering
• Final Result
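The compositing step of differential rendering can be sketched per pixel: outside the synthetic objects, the background plate is adjusted by the difference between the two renderings of the local scene (with and without objects); inside the objects, the rendered pixel is used directly. The sketch assumes grayscale images stored as nested lists of floats and a boolean object mask; the function name and data layout are hypothetical.

```python
def differential_render(background, with_objects, without_objects, object_mask):
    """Composite synthetic objects into a background plate.

    All inputs are 2D lists of equal size; object_mask is True where a
    synthetic object covers the pixel.
    """
    result = []
    for bg_row, obj_row, noobj_row, mask_row in zip(
            background, with_objects, without_objects, object_mask):
        row = []
        for bg, obj, noobj, is_object in zip(bg_row, obj_row, noobj_row, mask_row):
            if is_object:
                row.append(obj)                       # use the rendering directly
            else:
                # add shadows and interreflections as a difference on the plate
                row.append(max(0.0, bg + (obj - noobj)))
        result.append(row)
    return result

background      = [[0.8, 0.8, 0.8]]
with_objects    = [[0.5, 0.3, 0.6]]   # the middle pixel falls in a shadow
without_objects = [[0.5, 0.5, 0.6]]
object_mask     = [[False, False, True]]
final = differential_render(background, with_objects, without_objects, object_mask)
```

Pixels that the objects do not affect keep their background value exactly, so errors in the modeled local scene cancel out wherever the two renderings agree.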
Environment map from single image?
Eye as light probe! (Nayar et al)
Results
Application in “Superman Returns”
Capturing reflectance
Application in “The Matrix Reloaded”
3D acquisition for faces
Cyberware scanners
[Figure: face & head scanner; whole body scanner]
Making facial expressions from photos
• Similar to Façade, use a generic face model and view-dependent texture mapping
• Procedure
1. Take multiple photographs of a person
2. Establish corresponding feature points
3. Recover 3D points and camera parameters
4. Deform the generic face model to fit points
5. Extract textures from photos
Reconstruct a 3D model
[Figure: input photographs and generic 3D face model → pose estimation → more features → deformed model]
Mesh deformation
– Compute displacement of feature points
– Apply scattered data interpolation
[Figure: generic model, displacement, deformed model]
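To illustrate how feature-point displacements can be spread over the rest of a mesh, the sketch below uses inverse distance weighting (Shepard interpolation). This is a deliberately simple stand-in for scattered data interpolation and is not necessarily the interpolant used in the original work; the function names, the `power` parameter and the toy data are hypothetical.

```python
import math

def interpolate_displacement(vertex, feature_points, displacements, power=2.0):
    """Inverse-distance-weighted displacement at a vertex."""
    weights = []
    for point, disp in zip(feature_points, displacements):
        d = math.dist(vertex, point)
        if d == 0.0:
            return tuple(disp)            # vertex coincides with a feature point
        weights.append(1.0 / d ** power)
    total = sum(weights)
    return tuple(
        sum(w * disp[k] for w, disp in zip(weights, displacements)) / total
        for k in range(3)
    )

def deform_mesh(vertices, feature_points, displacements):
    """Move every vertex of the generic model by its interpolated displacement."""
    deformed = []
    for v in vertices:
        d = interpolate_displacement(v, feature_points, displacements)
        deformed.append(tuple(v[k] + d[k] for k in range(3)))
    return deformed

features = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
moves = [(0.0, 1.0, 0.0), (0.0, 3.0, 0.0)]
mesh = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
deformed = deform_mesh(mesh, features, moves)
```

Vertices that coincide with a feature point move exactly with it, and the vertex halfway between the two features receives the average of their displacements.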
Texture extraction
• The color at each point is a weighted combination of the colors in the photos
• Texture can be:
– view-independent
– view-dependent
• Considerations for weighting
– occlusion
– smoothness
– positional certainty
– view similarity
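A minimal sketch of the weighted combination, covering only two of the listed considerations: occlusion (an occluded photo gets zero weight) and view similarity (the weight is the cosine between the photo's camera direction and the current viewing direction). Smoothness and positional certainty are left out, and the function name and sample layout are hypothetical.

```python
def blend_texture_color(samples, view_dir):
    """Blend the colors seen by several photos at one surface point.

    samples: list of (rgb, camera_dir, visible), with unit-length directions.
    Returns None if no photo sees the point from a compatible direction.
    """
    weights = []
    for rgb, camera_dir, visible in samples:
        similarity = sum(a * b for a, b in zip(camera_dir, view_dir))
        weights.append(max(0.0, similarity) if visible else 0.0)
    total = sum(weights)
    if total == 0.0:
        return None
    return tuple(
        sum(w * rgb[k] for w, (rgb, _, _) in zip(weights, samples)) / total
        for k in range(3)
    )

samples = [
    ((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), True),    # frontal photo, visible
    ((0.0, 0.0, 1.0), (1.0, 0.0, 0.0), True),    # side photo, visible
    ((0.0, 1.0, 0.0), (0.0, 0.0, 1.0), False),   # frontal photo, occluded
]
color = blend_texture_color(samples, (0.0, 0.0, 1.0))
```

Recomputing the weights for every new viewing direction gives a view-dependent texture; fixing the weights once gives a view-independent one.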
Texture extraction
[Figure: view-independent vs. view-dependent texture]
Model reconstruction
Use images to adapt a generic face model.
Creating new expressions
• In addition to global blending we can use:
– Regional blending
– Painterly interface
Creating new expressions
New expressions are created with 3D morphing:
[Figure: applying a global blend, (expression A)/2 + (expression B)/2 = new expression]
Creating new expressions
[Figure: applying a region-based blend, each expression multiplied by a region mask and then summed]
Creating new expressions
[Figure: using a painterly interface, several expressions combined into one]
Drunken smile
Animating between expressions
Morphing over time creates animation:
“neutral” “joy”
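Morphing between two expressions can be sketched as a linear blend of corresponding vertex positions, with the blend weight swept over time. The sketch assumes both meshes share the same vertex ordering; the function names and the toy vertex data are hypothetical.

```python
def blend_expressions(neutral, target, t):
    """Per-vertex linear blend: (1 - t) * neutral + t * target."""
    return [
        tuple((1.0 - t) * a + t * b for a, b in zip(v0, v1))
        for v0, v1 in zip(neutral, target)
    ]

def animate(neutral, target, n_frames):
    """Frames morphing from neutral (t = 0) to target (t = 1); n_frames >= 2."""
    return [
        blend_expressions(neutral, target, k / (n_frames - 1))
        for k in range(n_frames)
    ]

neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
joy     = [(0.0, 0.2, 0.0), (1.0, 0.4, 0.0)]
frames = animate(neutral, joy, 5)
```

A global blend uses one weight for the whole face; a regional blend would apply the same formula with a weight that varies per vertex.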
Video
Spacetime faces
[Figure: capture rig with black & white cameras, color cameras and video projectors]
[Figure: face surface over time, comparing stereo, active stereo and spacetime stereo]
Spacetime Stereo
[Figure: surface motion over time, shown for time = 1 to 5]
Better
• spatial resolution
• temporal stability
Spacetime stereo matching
Video
Fitting
FaceIK
Animation
3D face applications: The One
3D face applications: Gladiator
[Figure: extra 3M]
Statistical methods
Statistical methods
[Figure: parameters $z$ pass through $f(z) + \varepsilon$ to give the observed signal $y$]
$$z^* = \arg\max_z P(z \mid y) = \arg\max_z \frac{P(y \mid z)\, P(z)}{P(y)} = \arg\min_z L(y \mid z) + L(z)$$
Example: super-resolution, de-noising, de-blocking, inpainting, …
Statistical methods
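[Figure: parameters $z$ pass through $f(z) + \varepsilon$ to give the observed signal $y$]
$$z^* = \arg\min_z L(y \mid z) + L(z)$$
$$L(y \mid z) = \frac{\lVert y - f(z) \rVert^2}{\sigma^2} \quad \text{(data evidence)}, \qquad L(z) \quad \text{(a-priori knowledge)}$$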
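As a toy instance of this minimization, the sketch below de-noises a 1D signal: $f$ is taken to be the identity, the data term is $\lVert y - z \rVert^2 / \sigma^2$, and the prior is a quadratic smoothness penalty on neighboring differences. The weight `lam`, the plain gradient descent solver and the function names are all assumptions made for illustration.

```python
def energy(z, y, sigma=1.0, lam=2.0):
    """L(y|z) + L(z) for identity f and a quadratic smoothness prior."""
    data = sum((yi - zi) ** 2 for yi, zi in zip(y, z)) / sigma ** 2
    prior = lam * sum((z[i + 1] - z[i]) ** 2 for i in range(len(z) - 1))
    return data + prior

def map_denoise(y, sigma=1.0, lam=2.0, step=0.05, iters=500):
    """Minimize the energy by gradient descent, starting from the observation."""
    z = list(y)
    n = len(z)
    for _ in range(iters):
        grad = [2.0 * (z[i] - y[i]) / sigma ** 2 for i in range(n)]
        for i in range(n - 1):
            d = z[i + 1] - z[i]
            grad[i] -= 2.0 * lam * d
            grad[i + 1] += 2.0 * lam * d
        z = [zi - step * gi for zi, gi in zip(z, grad)]
    return z

observed = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
estimate = map_denoise(observed)
```

The estimate trades fidelity to the observation against smoothness; increasing `lam` pulls neighboring values closer together.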
Statistical methods
There are approximately $10^{240}$ possible 10×10 gray-level images. Even human beings have not seen them all yet. There must be a strong statistical bias.
Takeo Kanade
Approximately $8 \times 10^{11}$ blocks per day per person.
Generic priors
“Smooth images are good images.”
$$L(z) = \sum_x \rho(V(x))$$
Gaussian MRF
$$\rho(d) = d^2$$
Huber MRF
$$\rho(d) = \begin{cases} d^2 & |d| \le T \\ T^2 + 2T(|d| - T) & |d| > T \end{cases}$$
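The two potentials can be compared directly in code. The sketch below assumes the usual Huber form, quadratic up to a threshold `T` and linear beyond it; the function names are hypothetical.

```python
def gaussian_potential(d):
    """Gaussian MRF potential: quadratic everywhere."""
    return d * d

def huber_potential(d, T=1.0):
    """Huber MRF potential: quadratic for |d| <= T, linear beyond."""
    a = abs(d)
    if a <= T:
        return a * a
    return T * T + 2.0 * T * (a - T)

def smoothness_prior(z, potential):
    """Sum of the potential over neighboring differences of a 1D signal."""
    return sum(potential(z[i + 1] - z[i]) for i in range(len(z) - 1))

edge = [0.0, 0.0, 3.0, 3.0]          # a signal with one sharp step
gaussian_cost = smoothness_prior(edge, gaussian_potential)
huber_cost = smoothness_prior(edge, huber_potential)
```

The Huber potential charges the sharp step less than the Gaussian one does, so a Huber prior tends to preserve edges while still smoothing small variations.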
Generic priors
Example-based priors
“Existing images are good images.”
six 200×200 images ⇒ 2,000,000 pairs
L(z)
Example-based priors
[Figure: pairs of high-resolution and low-resolution examples]
Model-based priors
“Face images are good images when working on face images …”
Parametric model: $Z = WX + \mu$, with prior $L(X)$
$$z^* = \arg\min_z L(y \mid z) + L(z)$$
$$X^* = \arg\min_X L(y \mid WX + \mu) + L(X)$$
$$z^* = WX^* + \mu$$
PCA
• Principal Components Analysis (PCA):
approximating a high-dimensional data set with a lower-dimensional subspace
[Figure: data points plotted against the original axes, with the first and second principal components overlaid]
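A small sketch of PCA on a handful of 2D points, using power iteration on the covariance matrix to find the first principal component and then projecting the centered data onto it. Power iteration is one of several ways to obtain the leading eigenvector and is chosen here only to keep the example dependency-free; the function names and the sample points are hypothetical.

```python
import math

def first_principal_component(points, iters=200):
    """Return (mean, unit direction of largest variance) for a list of points."""
    n, dim = len(points), len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(dim)]
    centered = [[p[k] - mean[k] for k in range(dim)] for p in points]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    v = [1.0] * dim                      # arbitrary starting direction
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def project(points, mean, direction):
    """1D coordinates of the points along the principal direction."""
    return [sum((p[k] - mean[k]) * direction[k] for k in range(len(mean)))
            for p in points]

points = [(-2.0, -4.1), (-1.0, -1.9), (0.0, 0.1), (1.0, 2.0), (2.0, 3.9)]
mean, axis = first_principal_component(points)
coords = project(points, mean, axis)
```

The points lie close to the line y = 2x, so the recovered axis should have a slope near 2, and the single coordinate per point in `coords` captures almost all of the variation.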
PCA on faces: “eigenfaces”
[Figure: average face, first principal component, and other components]
For all except average, “gray” = 0, “white” > 0, “black” < 0
Super-resolution
(a) Input low 24×32 (b) Our results (c) Cubic B-Spline (d) Freeman et al. (e) Baker et al. (f) Original high 96×128
Face models from single images
Morphable model of 3D faces
• Start with a catalogue of 200 aligned 3D Cyberware scans
• Build a model of average shape and texture, and principal variations using PCA
Morphable model
[Figure: shape exemplars and texture exemplars]
Morphable model of 3D faces
• Adding some variations
Reconstruction from single image
Rendering must be similar to the input if we guess right
Reconstruction from single image
• Shape and texture priors are learnt from the database.
• ρ is the set of parameters for shading, including camera pose, lighting and so on.
Modifying a single image
Animating from a single image
Video
Exchanging faces in images
Morphable model for human body
Image-based faces (lip sync.)
Video rewrite (analysis)
Video rewrite (synthesis)
Results
• Video database
– 2 minutes of JFK
• Only half usable
• Head rotation
[Figure: training video; synthesized sentences “Read my lips.” and “I never met Forrest Gump.”]
Morphable speech model
Preprocessing
Prototypes (PCA + k-means clustering)
We find $I_i$ and $C_i$ for each prototype image.
Morphable model
[Figure: analysis maps an image $I$ to parameters $(\alpha, \beta)$; synthesis maps parameters back to an image]
Synthesis
Results
Relighting faces
Light is additive
lamp #1 lamp #2
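Because light is additive, an image lit by several lamps is the sum of the images lit by each lamp alone, and scaling a lamp scales its image. The sketch below relights a scene as a weighted sum of single-lamp basis images; grayscale images are assumed to be nested lists of floats, and the function name and pixel values are hypothetical.

```python
def relight(basis_images, weights):
    """Weighted sum of images, one per light source."""
    height = len(basis_images[0])
    width = len(basis_images[0][0])
    result = [[0.0] * width for _ in range(height)]
    for image, w in zip(basis_images, weights):
        for r in range(height):
            for c in range(width):
                result[r][c] += w * image[r][c]
    return result

lamp1 = [[0.2, 0.0]]     # scene photographed with only lamp #1 on
lamp2 = [[0.1, 0.4]]     # scene photographed with only lamp #2 on
both = relight([lamp1, lamp2], [1.0, 1.0])       # both lamps on
dimmed = relight([lamp1, lamp2], [1.0, 0.5])     # lamp #2 at half intensity
```

With one basis image per lighting direction, the same weighted sum can reproduce the scene under any lighting that is expressed as weights over those directions.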
Light stage 1.0
64x32 lighting directions
[Figure: input images and the reflectance function, with occlusion and flare marked]
Relighting
Results
Changing viewpoints
Results
3D face applications: Spiderman 2
[Figure: real vs. synthetic]
Spiderman 2
Video
Light stage 3
Light stage 6
Application: The Matrix Reloaded
References
• Paul Debevec, Rendering Synthetic Objects into Real Scenes: Bridging Traditional and Image-based Graphics with Global Illumination and High Dynamic Range Photography, SIGGRAPH 1998.
• F. Pighin, J. Hecker, D. Lischinski, D. H. Salesin, and R. Szeliski, Synthesizing Realistic Facial Expressions from Photographs, SIGGRAPH 1998, pp75-84.
• Li Zhang, Noah Snavely, Brian Curless, Steven M. Seitz, Spacetime Faces: High Resolution Capture for Modeling and Animation, SIGGRAPH 2004.
• V. Blanz and T. Vetter, A Morphable Model for the Synthesis of 3D Faces, SIGGRAPH 1999, pp187-194.
• Paul Debevec, Tim Hawkins, Chris Tchou, Haarm-Pieter Duiker, Westley Sarokin, Mark Sagar, Acquiring the Reflectance Field of a Human Face, SIGGRAPH 2000.
• Christoph Bregler, Malcolm Slaney, Michele Covell, Video Rewrite: Driving Visual Speech with Audio, SIGGRAPH 1997.
• Tony Ezzat, Gadi Geiger, Tomaso Poggio, Trainable Videorealistic Speech Animation, SIGGRAPH 2002.