Making faces
Digital Visual Effects, Spring 2005 Yung-Yu Chuang
2005/6/8
with slides by Richard Szeliski, Steve Seitz and Alex Efros
Outline
• 3D acquisition for faces
• Statistical methods
• Face models from single images
• Image-based faces
• Relighting for faces
Cyberware scanners
[Figure: face & head scanner; whole body scanner]
Face modeling and view-dependent texture mapping
• Procedure:
  1. Take multiple photographs of a person
  2. Establish corresponding feature points
  3. Recover 3D points and camera parameters
  4. Deform generic face model to fit points
  5. Extract textures from photos
Reconstruct a 3D model
[Figure: input photographs + generic 3D face model → pose estimation → more features → deformed model]
Mesh deformation
– Compute displacement of feature points
– Apply scattered data interpolation
[Figure: generic model → displacement → deformed model]
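The scattered-data-interpolation step can be sketched with radial basis functions: displacements known at the feature points are propagated to every remaining vertex. The function names and the linear kernel here are illustrative assumptions, not the paper's exact formulation, though Pighin et al. use a similar radial-basis scheme.

```python
import numpy as np

def rbf_interpolate(feat_pts, feat_disp, query_pts, eps=1e-8):
    """feat_pts: (m, 3) feature points; feat_disp: (m, 3) known displacements;
       query_pts: (n, 3) mesh vertices to deform."""
    def kernel(r):
        return r  # linear radial kernel phi(r) = r (a common, simple choice)
    # Solve K w = displacements for the RBF weights
    K = kernel(np.linalg.norm(feat_pts[:, None] - feat_pts[None], axis=-1))
    K += eps * np.eye(len(feat_pts))          # small regularization
    w = np.linalg.solve(K, feat_disp)
    # Evaluate the interpolant at the query vertices
    Kq = kernel(np.linalg.norm(query_pts[:, None] - feat_pts[None], axis=-1))
    return Kq @ w
```

Evaluating the interpolant back at the feature points reproduces the known displacements, which is the defining property of the interpolation.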
Texture extraction
• The texture color is a weighted combination of the colors in the photos
• Texture can be:
  – View-independent
  – View-dependent
• Considerations for weighting:
  – Occlusion
  – Smoothness
  – Positional certainty
  – View similarity
[Figure: view-independent vs. view-dependent texture extraction]
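The view-similarity consideration can be sketched as weighting each photo by how closely its viewing direction matches the rendering direction. The cosine weighting below is an illustrative choice; the full method also accounts for occlusion, smoothness, and positional certainty.

```python
import numpy as np

def view_weights(photo_dirs, render_dir):
    """photo_dirs: (n, 3) unit view directions of the source photos;
       render_dir: (3,) unit direction of the novel view."""
    sim = photo_dirs @ render_dir          # cosine view similarity
    w = np.clip(sim, 0, None)              # ignore back-facing views
    return w / w.sum()                     # normalized blend weights
```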
Use images to adapt a generic face model.
Creating new expressions
• In addition to global blending we can use:
  – Regional blending
  – Painterly interface
Applying a global blend
[Figure: expression A + expression B, each weighted by 1/2, gives the blended expression]
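A global blend is simply a convex combination of the vertex positions (and textures) of two corresponding expression meshes; a minimal sketch with hypothetical arrays:

```python
import numpy as np

def blend_expressions(verts_a, verts_b, w=0.5):
    """verts_*: (n_vertices, 3) arrays of corresponding mesh vertices;
       w: blend weight for expression A (1 - w goes to B)."""
    return w * verts_a + (1 - w) * verts_b
```

A region-based blend would replace the scalar `w` with a per-vertex weight mask.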
New expressions are created with 3D morphing:
Applying a region-based blend
[Figure: expressions combined with per-region weight masks]
Creating new expressions
Using a painterly interface
[Figure: four expressions combined into a "drunken smile"]
Morphing over time creates animation:
“neutral” “joy”
Video
Capture rig: video projectors, color cameras, and black & white cameras.
Spacetime faces
[Figure: stereo vs. active stereo vs. spacetime stereo]
Spacetime stereo
Matching is extended from space into time: the matching window spans neighboring frames (time = 1, 2, …, 5), so the face surface and its surface motion are estimated jointly.
Better:
• spatial resolution
• temporal stability
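The idea of extending the matching window into time can be sketched as a sum-of-squared-differences cost over a spatiotemporal window. All names and the rigid-window assumption are illustrative; the actual method also estimates the window's motion.

```python
import numpy as np

def spacetime_ssd(left, right, x, y, t, d, win=2, twin=1):
    """SSD matching cost over a (2*twin+1) x (2*win+1) x (2*win+1)
       spacetime window. left, right: (T, H, W) image sequences;
       d: candidate disparity at pixel (x, y) and frame t."""
    a = left[t - twin:t + twin + 1,
             y - win:y + win + 1,
             x - win:x + win + 1]
    b = right[t - twin:t + twin + 1,
              y - win:y + win + 1,
              x - d - win:x - d + win + 1]
    return np.sum((a - b) ** 2)
```

Averaging evidence over several frames is what buys the improved spatial resolution and temporal stability noted above.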
Video
FaceIK
3D face applications: The One
extra 3M
3D face applications: Spider-Man 2
Statistical methods
Signal model: y = f(z) + ε, where z are the parameters and y is the observed signal.

MAP estimation:
  z* = arg max_z P(z | y)
     = arg max_z P(y | z) P(z) / P(y)
     = arg min_z L(y | z) + L(z)

Examples: super-resolution, de-noising, de-blocking, inpainting, …

  z* = arg min_z L(y | z) + L(z),  where
  L(y | z) = ||y − f(z)||² / (2σ²)   (data evidence)
  L(z)                               (a-priori knowledge)
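The MAP framework above can be sketched for de-noising, assuming the identity observation f(z) = z and a Gaussian smoothness prior; parameter values here are illustrative, not from the slides.

```python
import numpy as np

def map_denoise(y, sigma=1.0, lam=5.0, steps=1000, lr=0.01):
    """Minimize L(y|z) + L(z) = ||y - z||^2 / (2*sigma^2)
       + lam * sum_i (z[i+1] - z[i])^2  by gradient descent."""
    z = y.copy()
    for _ in range(steps):
        data_grad = (z - y) / sigma ** 2       # gradient of the evidence term
        d = np.diff(z)                         # neighbor differences
        # gradient of the smoothness prior (with boundary terms)
        prior_grad = 2 * lam * np.concatenate(
            [[-d[0]], d[:-1] - d[1:], [d[-1]]])
        z -= lr * (data_grad + prior_grad)
    return z
```

The smoothness prior pulls each sample toward its neighbors while the evidence term keeps the estimate near the observation; the trade-off is set by σ and λ.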
Statistical methods
There are approximately 10^240 possible 10×10 gray-level images. Even humankind has not seen them all yet. There must be a strong statistical bias. — Takeo Kanade
(Approximately 8×10^11 blocks per day per person.)
Generic priors
“Smooth images are good images.”

  L(z) = Σ_x ρ(V(z(x)))

Gaussian MRF:  ρ(d) = d²
Huber MRF:     ρ(d) = d²                 if |d| ≤ T
               ρ(d) = T² + 2T(|d| − T)   if |d| > T
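The two generic MRF penalties above can be written directly (T is the Huber threshold; the value is illustrative):

```python
import numpy as np

def rho_gaussian(d):
    # Gaussian MRF penalty: quadratic in the neighbor difference
    return d ** 2

def rho_huber(d, T=1.0):
    # Huber MRF penalty: quadratic near zero, linear beyond T
    d = np.abs(d)
    return np.where(d <= T, d ** 2, T ** 2 + 2 * T * (d - T))
```

Beyond T the Huber penalty grows only linearly, so large intensity edges are penalized less than under the Gaussian MRF and discontinuities are better preserved.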
Example-based priors
“Existing images are good images.”
Six 200×200 images ⇒ 2,000,000 (low-resolution, high-resolution) patch pairs defining L(z).
Model-based priors
“Face images are good images when working on face images …”
Parametric model: z = WX + μ, with prior L(X).

  X* = arg min_X L(y | WX + μ) + L(X)
  z* = WX* + μ
approximating a high-dimensional data set with a lower-dimensional subspace
[Figure: 2D data points with original axes; first and second principal components]
PCA
• Given n k-d points
• Calculate the mean
• Calculate the covariance matrix
• SVD (eigen-analysis) on the covariance matrix
  A  = U · diag(w₁, …, wₙ) · Vᵀ
  Aᵀ = V · diag(w₁, …, wₙ) · Uᵀ
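The PCA steps listed above (mean, covariance, eigen-analysis via SVD) fit in a few lines of numpy; this is a generic sketch, not the slides' exact implementation:

```python
import numpy as np

def pca(points):
    """points: (n, k) array of n k-dimensional points.
       Returns the mean, principal components, and eigenvalues."""
    mean = points.mean(axis=0)
    centered = points - mean
    cov = centered.T @ centered / (len(points) - 1)   # k x k covariance
    # SVD of the symmetric covariance matrix = eigen-analysis
    U, w, Vt = np.linalg.svd(cov)
    return mean, U, w      # columns of U: principal components
```

Projecting onto the first m components is `X = (z - mean) @ U[:, :m]`, and the low-dimensional reconstruction is `mean + X @ U[:, :m].T`.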
PCA on faces: “eigenfaces”
• Average face
• First principal component
• Other components
• For all except the average: “gray” = 0, “white” > 0, “black” < 0
Super-resolution
(a) Input low-resolution 24×32  (b) Our results  (c) Cubic B-spline
(d) Freeman et al.  (e) Baker et al.  (f) Original high-resolution 96×128
Morphable model of 3D faces
• Start with a catalogue of 200 aligned 3D Cyberware scans
• Build a model of average shape and texture, and principal variations using PCA
Morphable model of 3D faces
• Divide the face into 4 regions (eyes, nose, mouth, head)
• For each new prototype, find the amount of deviation from the reference shape and texture
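The morphable model itself is linear: a new face is the average shape (or texture) plus a weighted sum of the principal variations found by PCA. A minimal sketch, with array layouts as assumptions:

```python
import numpy as np

def build_model(shapes):
    """shapes: (n_scans, 3*n_vertices) flattened, aligned 3D scans."""
    mean = shapes.mean(axis=0)
    # PCA via SVD of the centered data matrix
    U, w, Vt = np.linalg.svd(shapes - mean, full_matrices=False)
    return mean, Vt            # rows of Vt: principal shape variations

def synthesize(mean, components, alpha):
    """New face = average + weighted sum of principal variations."""
    return mean + alpha @ components
```

Fitting the model to a photograph then amounts to searching for the coefficient vector `alpha` (plus pose and lighting) that best reproduces the image.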
Reconstruction from single image
Animating from a single image
Reanimating faces
35 static scans at different expressions
exercise speech
Exchanging faces
Image-based faces (lip sync.)
Pipeline: an analysis stage fits the model to the training video; a synthesis stage selects matching lip video and stitches it together with the background video.
Results
• Video database
  – 8 minutes of Ellen
  – 2 minutes of JFK
• Only half usable
• Head rotation
Training video: “Read my lips.”  Synthesized: “I never met Forrest Gump.”
Light is additive
[Figure: image under lamp #1 + image under lamp #2 = image under both lamps]
Light stage 1.0
64×32 lighting directions
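Because light is additive, an image of the face under any novel illumination is a weighted sum of the one-light-at-a-time basis images captured by the light stage. A minimal sketch (the array shapes are assumptions):

```python
import numpy as np

def relight(basis, weights):
    """basis: (n_dirs, H, W, 3) one-light-at-a-time images from the
       light stage; weights: (n_dirs,) intensity of the novel
       illumination sampled at each lighting direction."""
    # Weighted sum of the basis images over the lighting directions
    return np.tensordot(weights, basis, axes=1)
```

With 64×32 directions, `n_dirs` would be 2048, and `weights` would come from sampling an environment map at those directions.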
Reflectance function
[Figure: a pixel’s reflectance function over lighting directions; occlusion and flare visible]
Results
Results
Spider-Man 2
real synthetic
Application: The Matrix Reloaded
Reference
• F. Pighin, J. Hecker, D. Lischinski, D. H. Salesin, and R. Szeliski.
Synthesizing realistic facial expressions from photographs.
SIGGRAPH 1998, pp. 75-84.
• Brian Guenter, Cindy Grimm, Henrique Malvar, Daniel Wood, Making Faces, SIGGRAPH 1998.
• Li Zhang, Noah Snavely, Brian Curless, Steven M. Seitz, Spacetime Faces: High Resolution Capture for Modeling and Animation,
SIGGRAPH 2004.
• Blanz, V. and Vetter, T., A Morphable Model for the Synthesis of 3D Faces, SIGGRAPH 1999, pp. 187-194.
• V. Blanz, C. Basso, T. Poggio and T. Vetter, Reanimating Faces in Images and Video, EUROGRAPHICS 2003.
• V. Blanz, K. Scherbaum, T. Vetter, H.P. Seidel, Exchanging Faces in Images, EUROGRAPHICS 2004.
• George Borshukov et al., Universal Capture - Image-based Facial Animation for "The Matrix Reloaded", SIGGRAPH 2003 Sketch.
• Paul Debevec, Tim Hawkins, Chris Tchou, Haarm-Pieter Duiker, Westley Sarokin, Mark Sagar, Acquiring the Reflectance Field of a Human Face, SIGGRAPH 2000.
• Paul Debevec, Chris Tchou, Andreas Wenger, Tim Hawkins, Andy Gardner, Brian Emerson, Ansul Panday, A Lighting Reproduction Approach to Live-Action Compositing, SIGGRAPH 2002.
• Mark Sagar, Reflectance Field Rendering of Human Faces for
"Spider-Man 2", SIGGRAPH 2004 Sketch.
• Christoph Bregler, Malcolm Slaney, Michele Covell, Video Rewrite: Driving Visual Speech with Audio, SIGGRAPH 1997.
• Tony Ezzat, Gadi Geiger, Tomaso Poggio, Trainable Videorealistic Speech Animation, SIGGRAPH 2002.
• Brett Allen, Brian Curless, Zoran Popovic, The Space of Human Body Shapes: Reconstruction and Parameterization From Range Scans, SIGGRAPH 2003.