# Faces and Image-Based Lighting



### Faces and Image-Based Lighting

Digital Visual Effects, Yung-Yu Chuang

with slides by Richard Szeliski, Steve Seitz, Alex Efros, Li-Yi Wei and Paul Debevec

### Outline

• Image-based lighting

• 3D acquisition for faces

• Statistical methods (with application to face super-resolution)

• 3D face models from single images

• Image-based faces

• Relighting for faces

### Rendering

• Rendering is a function of geometry, reflectance, lighting, and viewing.

• To synthesize CGI into a real scene, we have to match the above four factors.

• Viewing can be obtained from calibration or structure from motion.

• Geometry can be captured using 3D photography or made by hand.

• How do we capture lighting and reflectance?


### Reflectance

• The Bidirectional Reflectance Distribution Function (BRDF)

– Given an incoming ray and an outgoing ray, what proportion of the incoming light is reflected along the outgoing ray? (Both directions are measured relative to the surface normal.)

### Rendering equation

The outgoing radiance $L_o(p,\omega_o)$ (a 5D light field) equals the emitted radiance plus the reflected incoming radiance:

$$L_o(p,\omega_o) = L_e(p,\omega_o) + \int_{S^2} f(p,\omega_o,\omega_i)\,L_i(p,\omega_i)\cos\theta_i\,d\omega_i$$

For distant lighting, $L_i(p,\omega_i) = L_d(\omega_i)$ depends only on direction, so the reflected radiance becomes

$$B(p,\omega_o) = \int_{S^2} f(p,\omega_o,\omega_i)\,L_d(\omega_i)\cos\theta_i\,d\omega_i$$
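As a sanity check on the rendering equation, the reflection integral can be estimated numerically. The sketch below (an illustrative Monte Carlo estimator, not from the slides) integrates a Lambertian BRDF against a constant environment, where the analytic answer is simply the albedo:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_sphere(n):
    # uniformly distributed unit directions on the sphere S^2
    u, v = rng.random(n), rng.random(n)
    z = 1.0 - 2.0 * u
    phi = 2.0 * np.pi * v
    r = np.sqrt(np.maximum(0.0, 1.0 - z * z))
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

def reflected_radiance(brdf, L_d, normal, n_samples=200_000):
    """Monte Carlo estimate of B = ∫ f · L_d · cos(θi) dωi, clamping the
    cosine to the hemisphere around `normal` (uniform sampling, pdf = 1/4π,
    hence the 4π factor)."""
    w = sample_sphere(n_samples)
    cos_t = np.maximum(w @ normal, 0.0)
    return np.mean(brdf(w) * L_d(w) * cos_t) * 4.0 * np.pi

# Lambertian BRDF f = albedo/π under a constant environment L_d = 1;
# analytically B = (albedo/π) · ∫ cosθ dω = albedo.
albedo = 0.5
B = reflected_radiance(lambda w: albedo / np.pi,
                       lambda w: 1.0,
                       np.array([0.0, 0.0, 1.0]))
```

With 200k samples the estimate lands within a few percent of the analytic value.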

### Point lights

Classically, rendering is performed assuming point light sources (or a directional source).


### Natural illumination

People perceive materials more easily under natural illumination than under simplified illumination.

Images courtesy of Ron Dror and Ted Adelson

### Natural illumination

Rendering with natural illumination is more expensive than rendering with simplified illumination.

directional source natural illumination

### Environment maps

Miller and Hoffman, 1984

HDR lighting


For a fixed point $p$, the reflected radiance is

$$B_p(\omega_o) = \int_{S^2} f_p(\omega_o,\omega_i)\,L_d(\omega_i)\cos\theta_i\,d\omega_i$$

Both reflectance and lighting are spherical functions.

### Function approximation

• G(x): the function to approximate

• B1(x), B2(x), … Bn(x): basis functions

• We want

$$\tilde G(x) = \sum_{i=1}^{n} c_i B_i(x)$$

• Storing a finite number of coefficients ci gives an approximation of G(x)


### Function approximation

• How do we find the coefficients ci?

– Minimize an error measure

• What error measure?

– The L2 error:

$$E_{L_2} = \int \left[\tilde G(x) - G(x)\right]^2 dx$$

– For an orthonormal basis, minimizing the L2 error gives the projection

$$c_i = \int G(x)\,B_i(x)\,dx$$
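A minimal numerical sketch of this projection (assuming, for illustration, the orthonormal sine basis √2·sin(iπx) on [0, 1] and the target G(x) = x²):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 10_000)
dx = x[1] - x[0]

def integrate(f):
    # simple Riemann-sum stand-in for the integral over [0, 1]
    return f.sum() * dx

G = x**2                                        # the function to approximate
basis = [np.sqrt(2.0) * np.sin(i * np.pi * x)   # orthonormal basis on [0, 1]
         for i in range(1, 9)]

# c_i = ∫ G(x) B_i(x) dx  (projection onto an orthonormal basis)
c = [integrate(G * B) for B in basis]

# reconstruct the approximation G̃(x) = Σ c_i B_i(x)
G_approx = sum(ci * B for ci, B in zip(c, basis))

# the L2 error shrinks as more basis functions are kept
err = integrate((G_approx - G)**2)
```

Keeping 8 terms already gives a small L2 error; the residual comes mostly from the boundary, where the sine basis vanishes but G does not.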

### Function approximation

• Basis functions are pieces of signal that can be used to produce approximations to a function

Figure: a signal and a set of basis functions with their coefficients.

### Function approximation

• We can then use these coefficients to reconstruct an approximation to the original signal:

$$\tilde G(x) = \sum_{i=1}^{N} c_i B_i(x)$$


### Orthogonal basis functions

• Orthogonal basis functions

– These are families of functions with special properties:

$$\int B_i(x)\,B_j(x)\,dx = 0 \quad \text{for } i \neq j$$

– Intuitively, it is as if the functions don't overlap each other's footprint

• A bit like the way a Fourier transform breaks a function into component sine waves

$$B_p(\omega_o) = \int_{S^2} f_p(\omega_o,\omega_i)\,L_d(\omega_i)\cos\theta_i\,d\omega_i$$

### Basis functions

• Transform data to a space in which we can capture the essence of the data better

• Spherical harmonics, analogous to the Fourier transform but in the spherical domain, are used in PRT (precomputed radiance transfer)

### Real spherical harmonics

• A system of signed, orthogonal functions over the sphere

• Represented in spherical coordinates by the function

$$y_l^m(\theta,\phi) = \begin{cases} \sqrt{2}\,K_l^m \cos(m\phi)\,P_l^m(\cos\theta) & m > 0 \\ \sqrt{2}\,K_l^m \sin(-m\phi)\,P_l^{-m}(\cos\theta) & m < 0 \\ K_l^0\,P_l^0(\cos\theta) & m = 0 \end{cases}$$

where $P_l^m$ are the associated Legendre polynomials and $K_l^m$ is a normalization constant,

where l is the band and m is the index within the band

Figure: visualizations of the real spherical harmonic basis functions for bands l = 0, 1, 2, indexed by m = −l, …, l.

### SH projection

• First we define a strict order for the SH functions:

$$i = l(l+1) + m + 1$$

• Project a spherical function into a vector of SH coefficients:

$$c_i = \int_S f(s)\,y_i(s)\,ds$$
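To make the projection concrete, here is a small sketch (illustrative, not from the slides) that Monte Carlo projects a spherical function onto the first two SH bands, whose real basis functions have the well-known closed forms 0.282095 and 0.488603·{y, z, x}:

```python
import numpy as np

rng = np.random.default_rng(1)

def real_sh_band01(w):
    """First four real SH basis functions (bands l = 0, 1) at unit directions w (n, 3)."""
    x, y, z = w[:, 0], w[:, 1], w[:, 2]
    c0 = 0.282095                 # Y_0^0 = (1/2) sqrt(1/π)
    c1 = 0.488603                 # Y_1^m = sqrt(3/(4π)) · {y, z, x}
    return np.stack([np.full_like(x, c0), c1 * y, c1 * z, c1 * x], axis=1)

def sh_project(f, n_samples=400_000):
    """c_i = ∫ f(s) y_i(s) ds, estimated by uniform Monte Carlo on the sphere."""
    u, v = rng.random(n_samples), rng.random(n_samples)
    z = 1.0 - 2.0 * u
    phi = 2.0 * np.pi * v
    r = np.sqrt(np.maximum(0.0, 1.0 - z * z))
    w = np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)
    Y = real_sh_band01(w)         # (n, 4) basis values at the samples
    return (f(w)[:, None] * Y).mean(axis=0) * 4.0 * np.pi

# Example: project L(ω) = max(ω_z, 0), a soft light "from above";
# by symmetry only the constant and the z-linear coefficients survive.
coeffs = sh_project(lambda w: np.maximum(w[:, 2], 0.0))
```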


### SH reconstruction

• To reconstruct the approximation to a function, sum the first N² terms:

$$\tilde f(s) = \sum_{i=0}^{N^2-1} c_i\,y_i(s)$$

• We truncate the infinite series of SH functions to give a low-frequency approximation

### An example

• Take a function comprised of two area light sources

– SH project them into 4 bands = 16 coefficients

Figure: the resulting vector of 16 SH coefficients.

### Low frequency light source

• We reconstruct the signal

– Using only these coefficients to find a low-frequency approximation to the original light source


### SH lighting for diffuse objects

• An Efficient Representation for Irradiance Environment Maps, Ravi Ramamoorthi and Pat Hanrahan, SIGGRAPH 2001

• Assumptions:

– Diffuse surfaces
– Distant illumination

For a diffuse surface the BRDF is a constant albedo, so with distant lighting the reflected radiance reduces to

$$B(p) = \rho_p\,E(\mathbf{n}_p), \qquad E(\mathbf{n}) = \int_{S^2} L_d(\omega_i)\,\max(\cos\theta_i, 0)\,d\omega_i$$

irradiance is a function of the surface normal only.

B = reflectance (albedo/texture) × irradiance E, like a Quake light map.

Illumination Environment Map → Irradiance Environment Map

### Spherical harmonic expansion

Expand the lighting (L) and the irradiance (E) in basis functions:

$$L(\omega) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} L_{lm}\,Y_{lm}(\omega), \qquad E(\mathbf{n}) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} E_{lm}\,Y_{lm}(\mathbf{n})$$

= .67 + .36 + …


### Lambertian surface acts like a low-pass filter

Convolution with the clamped-cosine kernel attenuates high frequencies, so each lighting coefficient is simply scaled: $E_{lm} \propto A_l L_{lm}$. The kernel's SH coefficients are

$$A_0 = \pi, \qquad A_1 = \frac{2\pi}{3}, \qquad A_2 = \frac{\pi}{4}$$

and more generally, for even $l$,

$$A_l = 2\pi\,\frac{(-1)^{l/2-1}}{(l+2)(l-1)}\cdot\frac{l!}{2^l\left((l/2)!\right)^2},$$

with $A_l = 0$ for odd $l > 1$.

### 9 parameter approximation

The first 9 SH basis functions are the constant 1; the linear terms y, z, x; and the quadratic terms xy, yz, 3z²−1, zx, x²−y².

• Order 0 (1 term): RMS error = 25%

• Order 1 (4 terms): RMS error = 8%

• Order 2 (9 terms): RMS error = 1%

For any illumination, average error < 3% [Basri Jacobs 01]
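Ramamoorthi and Hanrahan show that the 9 coefficients can be packed into a 4×4 matrix so that irradiance is a quadratic form in the normal, E(n) = nᵀMn with n = (x, y, z, 1). A sketch using the constants from their paper (the coefficient ordering below is an assumption of this example):

```python
import numpy as np

def irradiance_matrix(L):
    """4x4 matrix M with E(n) = nᵀ M n for n = (x, y, z, 1), built from the 9
    lighting coefficients L = [L00, L1-1, L10, L11, L2-2, L2-1, L20, L21, L22]
    (constants from Ramamoorthi and Hanrahan 2001)."""
    c1, c2, c3, c4, c5 = 0.429043, 0.511664, 0.743125, 0.886227, 0.247708
    L00, L1m1, L10, L11, L2m2, L2m1, L20, L21, L22 = L
    return np.array([
        [c1 * L22,  c1 * L2m2, c1 * L21,  c2 * L11],
        [c1 * L2m2, -c1 * L22, c1 * L2m1, c2 * L1m1],
        [c1 * L21,  c1 * L2m1, c3 * L20,  c2 * L10],
        [c2 * L11,  c2 * L1m1, c2 * L10,  c4 * L00 - c5 * L20],
    ])

def irradiance(M, normal):
    n = np.append(normal, 1.0)
    return n @ M @ n

# Sanity check: a constant environment L(ω) = 1 has L00 = 0.282095 · 4π ≈ 3.5449
# and all other coefficients zero; the irradiance is then π for every normal.
M = irradiance_matrix([3.5449, 0, 0, 0, 0, 0, 0, 0, 0])
E = irradiance(M, np.array([0.0, 0.0, 1.0]))
```

Evaluating E per vertex is a handful of multiply-adds, which is what makes the 9-parameter representation so cheap.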


### Comparison

For a 300×300 image with a 256×256 environment texture:

| | Hemispherical integration | Spherical harmonic coefficients |
| --- | --- | --- |
| Time | 2 hrs | 1 sec |
| Work | 300 × 300 × 256 × 256 | 9 × 256 × 256 |

### Complex geometry

Assume no shadowing: simply use the surface normal.

### Natural illumination

For diffuse objects, rendering with natural illumination can be done quickly.

directional source natural illumination

### Video


### Acquiring the Light Probe


### Real Scene Example

• Goal: place synthetic objects on the table

### Modeling the Scene

light-based model

real scene


### Rendering into the Scene

• Background plate

### Rendering into the scene

• Objects and local scene matched to the scene

### Differential rendering

• Local scene without objects, illuminated by the model


### Differential Rendering

• Final result
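The differential-rendering steps above amount to a simple composite: keep the real background plate, and add only the change the synthetic objects make to the local scene. A minimal sketch (function and variable names are illustrative):

```python
import numpy as np

def differential_composite(background, with_objects, without_objects, obj_mask):
    """Debevec-style differential rendering: outside the objects, add only the
    change the objects cause to the local scene (shadows, interreflections) on
    top of the real background plate; inside the mask, use the rendering."""
    delta = with_objects - without_objects          # what the objects change
    out = np.where(obj_mask, with_objects, background + delta)
    return np.clip(out, 0.0, 1.0)

# A one-pixel example: the object darkens the local scene from 0.4 to 0.3,
# so the real plate value 0.5 receives a shadow of the same strength.
out = differential_composite(np.array([0.5]), np.array([0.3]),
                             np.array([0.4]), np.array([False]))
```

Because errors in the local-scene model largely cancel in the difference, the real plate's detail is preserved.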


### Cyberware scanners

face & head scanner; whole body scanner

### Making facial expressions from photos

• Similar to Façade, use a generic face model and view-dependent texture mapping

• Procedure:

1. Take multiple photographs of a person
2. Establish corresponding feature points
3. Recover 3D points and camera parameters
4. Deform the generic face model to fit the points
5. Extract textures from the photos

### Reconstruct a 3D model

Figure: input photographs → pose estimation and feature correspondence → generic 3D face model deformed to fit.


### Mesh deformation

– Compute displacements of the feature points
– Apply scattered data interpolation

Figure: generic model + displacements → deformed model
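Scattered data interpolation can be sketched with radial basis functions: solve for weights that reproduce the known feature-point displacements, then evaluate the smooth displacement field at every vertex. (A Gaussian kernel is assumed here for illustration; the paper's exact basis may differ.)

```python
import numpy as np

def rbf_interpolate(feat_src, feat_dst, query, sigma=1.0):
    """Interpolate the feature displacements (feat_dst - feat_src) with
    Gaussian radial basis functions and apply them to `query` points."""
    disp = feat_dst - feat_src                                   # (n, 3)
    d2 = ((feat_src[:, None, :] - feat_src[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2.0 * sigma ** 2))                         # kernel matrix
    w = np.linalg.solve(K + 1e-9 * np.eye(len(K)), disp)         # fit weights
    q2 = ((query[:, None, :] - feat_src[None, :, :]) ** 2).sum(-1)
    return query + np.exp(-q2 / (2.0 * sigma ** 2)) @ w          # deformed pts

# Moving every feature point by (0.1, 0, 0) moves the queried points with it.
src = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
dst = src + np.array([0.1, 0.0, 0.0])
deformed = rbf_interpolate(src, dst, src)
```

At the feature points themselves the interpolant reproduces the target positions exactly (up to the tiny regularization term).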

### Texture extraction

• The color at each point is a weighted combination of the colors in the photos

• The texture can be:

– view-independent
– view-dependent

• Considerations for weighting:

– occlusion
– smoothness
– positional certainty
– view similarity


### Model reconstruction

Use images to adapt a generic face model.

### Creating new expressions

• In addition to global blending, we can use:

– Regional blending
– Painterly interface

### Creating new expressions

New expressions are created with 3D morphing:

Figure: applying a global blend, averaging two expressions with weights 1/2 and 1/2.
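With registered meshes, a global blend is just a convex combination of vertex positions. A minimal illustrative sketch:

```python
import numpy as np

def blend_expressions(expressions, weights):
    """Global blend: the new expression is a convex combination of registered
    expression meshes (vertex arrays of identical shape); textures can be
    blended the same way."""
    expressions = np.asarray(expressions, dtype=float)   # (k, n_vertices, 3)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                      # normalize to sum to 1
    return np.tensordot(w, expressions, axes=1)

# Halfway between a "neutral" mesh (all zeros) and a displaced one (all ones):
mid = blend_expressions([np.zeros((4, 3)), np.ones((4, 3))], [1.0, 1.0])
```

Regional blending applies the same idea with per-vertex weights instead of one global weight per expression.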


Figure: applying a region-based blend.

Figure: using a painterly interface.

### Drunken smile

### Animating between expressions

Morphing over time creates animation:

“neutral” “joy”


### Spacetime faces

black & white cameras, color cameras, and video projectors

Figure: recovering the moving face surface over time — comparing stereo, active stereo, and spacetime stereo.


### Spacetime Stereo

Figure: the surface moves over time (frames t = 1, …, 5); the matching window extends over both space and time.

Better:

• spatial resolution

• temporal stability


### MAP estimation

$$z^* = \arg\max_z P(z \mid y)$$

Examples: super-resolution, de-noising, de-blocking, inpainting.
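As a toy instance of the MAP formulation, take a linear-Gaussian model y = Az + noise with a Gaussian prior on z; the MAP estimate then reduces to ridge regression (this stands in for the richer image priors used in super-resolution and inpainting):

```python
import numpy as np

def map_estimate(A, y, s2=0.1, t2=1.0):
    """MAP estimate z* = argmax_z P(z | y) for y = A z + noise with
    noise ~ N(0, s2·I) and prior z ~ N(0, t2·I); in closed form this is
    z* = (AᵀA + (s2/t2)·I)⁻¹ Aᵀ y (ridge regression)."""
    d = A.shape[1]
    return np.linalg.solve(A.T @ A + (s2 / t2) * np.eye(d), A.T @ y)

# With A = I the prior simply shrinks the observation toward zero: z = y / 1.1.
z = map_estimate(np.eye(3), np.array([1.0, 2.0, 3.0]))
```

The prior term is what makes under-determined problems like super-resolution well-posed.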


### Statistical methods

There are approximately $10^{240}$ possible 10×10 gray-level images. Even humankind has not seen them all yet. There must be a strong statistical bias.

Approximately 8X1011 blocks per day per person.



Figure: the low-resolution observation x and the corresponding high-resolution image z.

### PCA

• Principal Components Analysis (PCA): approximating a high-dimensional data set with a lower-dimensional subspace

Figure: data points plotted against the original axes, with the first and second principal components overlaid.
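A minimal PCA sketch via the SVD of the mean-centered data (illustrative, using numpy):

```python
import numpy as np

def pca(X, k):
    """Approximate data X (n_samples, dim) with a k-dimensional affine subspace."""
    mean = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:k]                      # principal directions
    coeffs = (X - mean) @ components.T       # low-dimensional coordinates
    X_approx = mean + coeffs @ components    # reconstruction from k components
    return components, coeffs, X_approx

# Rank-1 data in 2D is reconstructed exactly from a single component.
X = np.outer(np.arange(10.0), np.array([1.0, 2.0]))
components, coeffs, X_approx = pca(X, 1)
```

Applied to a database of aligned face images, the rows of `components` are exactly the "eigenfaces" of the next slide.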


### PCA on faces: “eigenfaces”

Figure: the average face, the first principal component, and the other components.

For all components except the average: "gray" = 0, "white" > 0, "black" < 0.

### Super-resolution

(a) Input low-resolution 24×32 (b) Our results (c) Cubic B-spline (d) Freeman et al. (e) Baker et al. (f) Original high-resolution 96×128


### Morphable model of 3D faces

Cyberware scans

• Build a model of average shape and texture, and principal variations, using PCA

### Morphable model

shape exemplars, texture exemplars

### Reconstruction from single image

Rendering must be similar to the input if we guess right.


### Reconstruction from single image

The shape and texture priors are learnt from the database; ρ is the set of shading parameters, including camera pose, lighting, and so on.


### Video rewrite (synthesis): results

• Video database

– 2 minutes of JFK

• Only half usable

Training video: "Read my lips."

Output: "I never met Forrest Gump."


### Prototypes (PCA + k-means clustering)

We find $I_i$ and $C_i$ for each prototype image.


lamp #1 lamp #2


### Light stage 1.0

64x32 lighting directions

occlusion flare
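Relighting from light-stage data exploits the linearity of light transport: the image under any distant environment is a weighted sum of the one-light-at-a-time basis images. A minimal sketch (array shapes are assumptions of this example):

```python
import numpy as np

def relight(reflectance_field, env_weights):
    """Light-stage relighting: by linearity of light transport, the image
    under any distant environment is a weighted sum of the images captured
    with one light on at a time.
    reflectance_field: (n_lights, H, W) images; env_weights: (n_lights,),
    the environment's intensity sampled at each light-stage direction."""
    return np.tensordot(env_weights, reflectance_field, axes=1)

# Two lights: half of the first basis image plus twice the second.
rf = np.stack([np.full((2, 2), 1.0), np.full((2, 2), 2.0)])
img = relight(rf, np.array([0.5, 2.0]))
```

For the 64×32 rig this is a 2048-term sum per pixel (per color channel), done once per target environment.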


real vs. synthetic video


### Light stage 6

### Application: The Matrix Reloaded

### References

• Paul Debevec, Rendering Synthetic Objects into Real Scenes: Bridging Traditional and Image-based Graphics with Global Illumination and High Dynamic Range Photography, SIGGRAPH 1998.

• F. Pighin, J. Hecker, D. Lischinski, D. H. Salesin, and R. Szeliski, Synthesizing Realistic Facial Expressions from Photographs, SIGGRAPH 1998, pp. 75-84.

• Li Zhang, Noah Snavely, Brian Curless, Steven M. Seitz, Spacetime Faces: High Resolution Capture for Modeling and Animation, SIGGRAPH 2004.

• Volker Blanz and Thomas Vetter, A Morphable Model for the Synthesis of 3D Faces, SIGGRAPH 1999, pp. 187-194.

• Paul Debevec, Tim Hawkins, Chris Tchou, Haarm-Pieter Duiker, Westley Sarokin, Mark Sagar, Acquiring the Reflectance Field of a Human Face, SIGGRAPH 2000.

• Christoph Bregler, Malcolm Slaney, Michele Covell, Video Rewrite: Driving Visual Speech with Audio, SIGGRAPH 1997.

• Tony Ezzat, Gadi Geiger, Tomaso Poggio, Trainable Videorealistic Speech Animation, SIGGRAPH 2002.
