Making faces
Digital Visual Effects, Spring 2005 Yung-Yu Chuang
2005/6/8
with slides by Richard Szeliski, Steve Seitz and Alex Efros
Announcements
• Project #3 artifacts voting
Outline
• 3D acquisition for faces
• Statistical methods
• Face models from single images
• Image-based faces
• Relighting for faces
3D acquisition for faces
Cyberware scanners
face & head scanner; whole body scanner
Making facial expressions from photos
• Similar to Façade, use a generic face model and view-dependent texture mapping
• Procedure
1. Take multiple photographs of a person
2. Establish corresponding feature points
3. Recover 3D points and camera parameters
4. Deform the generic face model to fit the points
5. Extract textures from the photos
Reconstruct a 3D model
input photographs + generic 3D face model → pose estimation → more features → deformed model
Mesh deformation
• Involves two steps:
– Compute displacement of feature points
– Apply scattered data interpolation
generic model → displacement → deformed model
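The scattered data interpolation step can be sketched with radial basis functions; the Gaussian kernel and the `sigma` value are assumptions for illustration, not the paper's exact interpolant:

```python
import numpy as np

def rbf_interpolate(feature_pts, displacements, query_pts, sigma=0.5):
    """Spread known displacements at a few feature points to arbitrary
    query points (e.g. all mesh vertices) with Gaussian radial basis
    functions."""
    # Pairwise distances between feature points -> n x n basis matrix
    d = np.linalg.norm(feature_pts[:, None, :] - feature_pts[None, :, :], axis=-1)
    phi = np.exp(-(d / sigma) ** 2)
    # Solve for per-basis weights (one linear system per coordinate)
    w = np.linalg.solve(phi, displacements)
    # Evaluate the interpolant at the query points
    dq = np.linalg.norm(query_pts[:, None, :] - feature_pts[None, :, :], axis=-1)
    return np.exp(-(dq / sigma) ** 2) @ w

# Tiny check: the interpolant reproduces the known displacements exactly
pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
disp = np.array([[0.1, 0.0, 0.0], [0.0, 0.2, 0.0], [0.0, 0.0, 0.3]])
recovered = rbf_interpolate(pts, disp, pts)
```

At the feature points themselves the interpolant is exact; elsewhere it blends the displacements smoothly, which is what lets a handful of matched features deform the whole generic mesh.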
Texture extraction
• The color at each point is a weighted combination of the colors in the photos
• Texture can be:
– View-independent
– View-dependent
• Considerations for weighting
– Occlusion
– Smoothness
– Positional certainty
– View similarity
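A minimal sketch of how such per-photo weights might combine; the functional forms (dot products for positional certainty and view similarity) are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def blend_weight(normal, view_dir, render_dir=None, occluded=False):
    """Illustrative blending weight for one photo at one surface point."""
    if occluded:
        return 0.0  # occluded photos contribute nothing
    # positional certainty: surfaces facing the camera are trusted more
    certainty = max(0.0, float(np.dot(normal, view_dir)))
    if render_dir is None:       # view-independent texture
        return certainty
    # view similarity: favor photos taken near the rendering viewpoint
    similarity = max(0.0, float(np.dot(view_dir, render_dir)))
    return certainty * similarity

def blend_colors(colors, weights):
    """Color at a point = weighted combination of the photo colors."""
    w = np.asarray(weights, dtype=float)
    if w.sum() == 0.0:
        return np.zeros(3)
    return (np.asarray(colors, dtype=float) * w[:, None]).sum(axis=0) / w.sum()

n = np.array([0.0, 0.0, 1.0])
w_facing = blend_weight(n, n)   # camera looking straight at the surface
color = blend_colors([np.array([1.0, 0.0, 0.0]),
                      np.array([0.0, 0.0, 1.0])], [1.0, 1.0])
```

Passing a `render_dir` gives the view-dependent variant; omitting it gives a single view-independent texture.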
Texture extraction
view-independent vs. view-dependent
Model reconstruction
Use images to adapt a generic face model.
Creating new expressions
• In addition to global blending we can use:
– Regional blending
– Painterly interface
Creating new expressions
Applying a global blend
New expressions are created with 3D morphing: averaging two expression meshes vertex by vertex (½·A + ½·B) yields a blend.
Applying a region-based blend
weight mask × expression A + weight mask × expression B = regional blend
Creating new expressions
Using a painterly interface
several expressions combined with painted per-region weights
Creating new expressions
Drunken smile
Animating between expressions
Morphing over time creates animation:
“neutral” → “joy”
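Global blending and morphing over time both reduce to linear interpolation of per-vertex displacement fields; a minimal sketch, with mesh layout and names assumed:

```python
import numpy as np

def blend_expressions(neutral, deltas, weights):
    """New expression = neutral vertices + weighted sum of per-expression
    displacement fields (a global blend; regional blending would replace
    the scalar weights with per-region weight masks)."""
    v = neutral.astype(float).copy()
    for delta, w in zip(deltas, weights):
        v += w * delta
    return v

def morph(neutral, deltas, w_from, w_to, t):
    """Animate between two expressions by interpolating their blend
    weights with parameter t in [0, 1]."""
    w = [(1.0 - t) * a + t * b for a, b in zip(w_from, w_to)]
    return blend_expressions(neutral, deltas, w)

neutral = np.zeros((4, 3))      # toy mesh: 4 vertices at the origin
joy = np.ones((4, 3))           # toy displacement field for "joy"
halfway = morph(neutral, [joy], [0.0], [1.0], 0.5)
```

At t = 0 the morph returns the neutral face, at t = 1 the full "joy" expression, and in between a smooth transition.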
Video
Spacetime faces
video projectors, color cameras, black & white cameras
Spacetime faces
In standard stereo and active stereo, the matching window extends over space at a single time instant; spacetime stereo extends the matching window over both space and time.

Spacetime Stereo
(animation over time = 1 … 5: the face surface moves, and the spatiotemporal matching window tracks the surface motion)
Better:
• spatial resolution
• temporal stability
Spacetime stereo matching
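The core idea, extending the matching window over time, can be sketched as a spatiotemporal SSD cost; this is a toy version with a rigid window, whereas the actual method also lets the window deform to follow surface motion:

```python
import numpy as np

def spacetime_ssd(left, right, x, y, t, d, w=1, tw=1):
    """SSD matching cost over a (2*tw+1) x (2*w+1) x (2*w+1) space-time
    window; left/right are (T, H, W) rectified video volumes and d is
    the candidate disparity."""
    L = left[t - tw:t + tw + 1, y - w:y + w + 1, x - w:x + w + 1]
    R = right[t - tw:t + tw + 1, y - w:y + w + 1, x - d - w:x - d + w + 1]
    return float(((L - R) ** 2).sum())

def best_disparity(left, right, x, y, t, max_d, w=1, tw=1):
    """Brute-force winner-take-all search over candidate disparities."""
    costs = [spacetime_ssd(left, right, x, y, t, d, w, tw)
             for d in range(max_d + 1)]
    return int(np.argmin(costs))

rng = np.random.default_rng(0)
left = rng.normal(size=(3, 8, 16))
right = np.roll(left, -2, axis=2)   # simulate a true disparity of 2
d_hat = best_disparity(left, right, x=8, y=4, t=1, max_d=4)
```

With more frames in the window the cost is computed over more samples, which is what buys the improved spatial resolution and temporal stability.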
Video
Fitting
FaceIK
Animation
3D face applications: The One
3D face applications: Gladiator
3D face applications: Spiderman 2
Statistical methods
Statistical methods
parameters z → f(z) + ε → observed signal y

z* = arg max_z P(z | y)
   = arg max_z P(y | z) P(z) / P(y)
   = arg min_z L(y | z) + L(z)

(L denotes a negative log-likelihood, so maximizing the probability becomes minimizing a cost.)
Examples: super-resolution, de-noising, de-blocking, inpainting, …
Statistical methods
parameters z → f(z) + ε → observed signal y

z* = arg min_z L(y | z) + L(z)

where L(y | z) = ‖y − f(z)‖² / (2σ²) is the evidence (data term) and L(z) encodes a-priori knowledge.
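As a concrete instance of minimizing L(y|z) + L(z): for denoising, f(z) = z, and with a quadratic smoothness prior the cost can be minimized by plain gradient descent. The step size, λ, and σ here are illustrative choices:

```python
import numpy as np

def denoise_map(y, sigma=1.0, lam=0.3, iters=200, step=0.1):
    """MAP denoising: minimize ||y - z||^2 / (2 sigma^2) + lam * sum |grad z|^2
    (f(z) = z, Gaussian MRF prior) by gradient descent."""
    z = y.astype(float).copy()
    for _ in range(iters):
        data_grad = (z - y) / sigma ** 2
        # discrete (periodic) Laplacian: gradient of the smoothness term
        lap = (np.roll(z, 1, 0) + np.roll(z, -1, 0)
               + np.roll(z, 1, 1) + np.roll(z, -1, 1) - 4.0 * z)
        z -= step * (data_grad - 2.0 * lam * lap)
    return z

rng = np.random.default_rng(1)
clean = np.ones((16, 16))
noisy = clean + rng.normal(0.0, 0.3, clean.shape)
restored = denoise_map(noisy)
```

The data term pulls z toward the observation y while the prior pulls it toward smoothness; λ trades the two off.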
Statistical methods
There are approximately 10^240 possible 10×10 gray-level images. Even humankind as a whole has not seen them all yet. There must be a strong statistical bias.
Takeo Kanade
Approximately 8×10^11 image blocks are seen per day per person.
Generic priors
“Smooth images are good images.”
L(z) = Σ_x ρ(V(z_x)), where V is a local variation (derivative) operator

Gaussian MRF: ρ(d) = d²

Huber MRF: ρ(d) = d² for |d| ≤ T, and T² + 2T(|d| − T) for |d| > T
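The two penalty functions can be written directly; note how the Huber penalty grows only linearly past the threshold T:

```python
import numpy as np

def rho_gaussian(d):
    """Gaussian MRF penalty: rho(d) = d^2."""
    return np.asarray(d, dtype=float) ** 2

def rho_huber(d, T=1.0):
    """Huber MRF penalty: quadratic for |d| <= T, linear beyond, so
    large intensity edges are penalized less harshly than by d^2."""
    a = np.abs(np.asarray(d, dtype=float))
    return np.where(a <= T, a ** 2, T ** 2 + 2.0 * T * (a - T))
```

This is why the Huber prior preserves edges better: a step edge of height 3 costs 9 under the Gaussian penalty but only 5 under Huber (T = 1), so smoothing is not forced across it.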
Example-based priors
“Existing images are good images.”
six 200×200 images ⇒ 2,000,000 pairs
Example-based priors
Example-based priors
low-resolution → high-resolution
Model-based priors
“Face images are good images when working on face images …”
Parametric model: z = WX + μ, with prior L(X)

z* = arg min_z L(y | z) + L(z) becomes, under the parametric model:

X* = arg min_X L(y | WX + μ) + L(X),  z* = WX* + μ
PCA
• Principal Components Analysis (PCA):
approximating a high-dimensional data set with a lower-dimensional subspace
(figure: 2-D data points in the original axes; the first principal component points along the direction of greatest variance, and the second principal component is orthogonal to it)
PCA
• Given n k-dimensional points
• Calculate the mean
• Calculate the covariance matrix
• SVD (eigen-analysis) on the covariance matrix
SVD
A = U W Vᵀ,  W = diag(w₁, …, wₙ)

(For the symmetric covariance matrix, V = U; the columns of U are the principal directions and the wᵢ measure the variance along them.)
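The recipe above (mean, covariance, eigen-analysis) in a few lines of NumPy; taking the SVD of the centered data matrix is equivalent to eigen-analysis of the covariance and avoids forming it explicitly:

```python
import numpy as np

def pca(points, k):
    """PCA of n d-dimensional points: returns the mean, the top-k
    principal directions (as rows), and the corresponding standard
    deviations along those directions."""
    mean = points.mean(axis=0)
    X = points - mean                      # center the data
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return mean, Vt[:k], S[:k] / np.sqrt(len(points) - 1)

# Data stretched along a known direction: the first component recovers it
rng = np.random.default_rng(0)
t = rng.normal(size=(500, 1))
data = t @ np.array([[3.0, 1.0]]) + 0.01 * rng.normal(size=(500, 2))
mean, comps, sdevs = pca(data, 1)
```

For eigenfaces, `points` would be the vectorized face images; `mean` is the average face and the rows of `comps` are the eigenfaces.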
PCA on faces: “eigenfaces”
Average face
First principal component
Other components
For all except the average: “gray” = 0, “white” > 0, “black” < 0
Model-based priors
“Face images are good images when working on face images …”
Parametric model: z = WX + μ, with prior L(X)

X* = arg min_X L(y | WX + μ) + L(X),  z* = WX* + μ
Super-resolution
(a) input low-resolution 24×32 (b) our results (c) cubic B-spline (d) Freeman et al. (e) Baker et al. (f) original high-resolution 96×128
Face models from single images
Morphable model of 3D faces
• Start with a catalogue of 200 aligned 3D Cyberware scans
• Build a model of average shape and texture, and principal variations using PCA
Morphable model
Morphable model of 3D faces
Divide face into 4 regions (eyes, nose, mouth, head)
For each new prototype, find the amount of deviation from the reference shape and texture.
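Synthesis in a morphable model is a linear combination of the average and the principal variations; a sketch for shape (texture works the same way with its own modes and coefficients), with array layout assumed:

```python
import numpy as np

def synthesize_shape(mean_shape, shape_modes, alphas):
    """new shape = average shape + sum_i alpha_i * mode_i.
    mean_shape is a flattened (3n,) vertex vector and shape_modes is
    (m, 3n), one principal shape variation per row."""
    return mean_shape + np.asarray(alphas, dtype=float) @ shape_modes

mean_shape = np.zeros(6)        # toy "average face": 2 vertices, flattened
modes = np.eye(2, 6)            # two toy variation modes
new_shape = synthesize_shape(mean_shape, modes, [2.0, 3.0])
```

Fitting a single image then amounts to searching for the coefficients (and pose/lighting parameters) whose rendered model best matches the photo.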
Morphable model of 3D faces
• Adding some variations
Reconstruction from a single image
Modifying a single image
Animating from a single image
Video
Reanimating faces
35 static scans at different expressions
Videos
exercise, speech
Exchanging faces
Morphable model for the human body
Image-based faces (lip sync.)
Video rewrite
Analysis stage: training video + model → lip video
Synthesis stage: select lip video segments and stitch them together with the background video
Results
• Video database
– 8 minutes of Ellen
– 2 minutes of JFK
• Only about half usable, mainly because of head rotation

training video: “Read my lips.” “I never met Forrest Gump.”
Relighting faces
Light is additive
lamp #1 + lamp #2
Light stage 1.0
64×32 lighting directions
Input images
Reflectance function
occlusion, flare
Relighting
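Because light is additive, relighting with the captured reflectance data reduces to a weighted sum of the one-light-at-a-time basis images; the array shapes here are assumptions for illustration:

```python
import numpy as np

def relight(basis_images, light_weights):
    """Image under a novel environment = sum_i w_i * basis_i.
    basis_images has shape (n_lights, H, W), one image per lighting
    direction (e.g. the 64x32 light stage directions); light_weights
    samples the new environment's intensity in each direction."""
    w = np.asarray(light_weights, dtype=float)
    return np.tensordot(w, np.asarray(basis_images, dtype=float), axes=1)

basis = np.stack([np.zeros((2, 2)), np.ones((2, 2))])  # two toy basis images
img = relight(basis, [0.25, 0.75])
```

In practice the weights come from sampling an environment map at the light stage directions, so the face can be composited into any captured lighting environment.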
Results
Changing viewpoints
Results
Video
Spiderman 2
real vs. synthetic
Light stage 3
Application: The Matrix Reloaded
Reference
• F. Pighin, J. Hecker, D. Lischinski, D. H. Salesin, and R. Szeliski.
Synthesizing realistic facial expressions from photographs.
SIGGRAPH 1998, pp. 75-84.
• Brian Guenter, Cindy Grimm, Henrique Malvar, Daniel Wood, Making Faces, SIGGRAPH 1998.
• Li Zhang, Noah Snavely, Brian Curless, Steven M. Seitz, Spacetime Faces: High Resolution Capture for Modeling and Animation, SIGGRAPH 2004.
• Blanz, V. and Vetter, T., A Morphable Model for the Synthesis of 3D Faces, SIGGRAPH 1999, pp. 187-194.
• V. Blanz, C. Basso, T. Poggio and T. Vetter, Reanimating Faces in Images and Video, EUROGRAPHICS 2003.
• V. Blanz, K. Scherbaum, T. Vetter, H.P. Seidel, Exchanging Faces in Images, EUROGRAPHICS 2004.
• George Borshukov et al., Universal Capture - Image-based Facial Animation for "The Matrix Reloaded", SIGGRAPH 2003 Sketch.
Reference
• George Borshukov et al., Realistic Human Face Rendering for "The Matrix Reloaded", SIGGRAPH 2003 Sketch.
• Paul Debevec, Tim Hawkins, Chris Tchou, Haarm-Pieter Duiker, Westley Sarokin, Mark Sagar, Acquiring the Reflectance Field of a Human Face, SIGGRAPH 2000.
• Paul Debevec, Chris Tchou, Andreas Wenger, Tim Hawkins, Andy Gardner, Brian Emerson, Ansul Panday, A Lighting Reproduction Approach to Live-Action Compositing, SIGGRAPH 2002.
• Mark Sagar, Reflectance Field Rendering of Human Faces for
"Spider-Man 2", SIGGRAPH 2004 Sketch.
• Christoph Bregler, Malcolm Slaney, Michele Covell, Video Rewrite: Driving Visual Speech with Audio, SIGGRAPH 1997.
• Tony Ezzat, Gadi Geiger, Tomaso Poggio, Trainable Videorealistic Speech Animation, SIGGRAPH 2002.
• Brett Allen, Brian Curless, Zoran Popovic, The Space of Human Body Shapes: Reconstruction and Parameterization From Range Scans, SIGGRAPH 2003.