3D photography

(1)

3D photography

Digital Visual Effects, Yung-Yu Chuang

with slides by Szymon Rusinkiewicz, Richard Szeliski, Steve Seitz and Brian Curless

(2)

3D photography

• Acquisition of geometry and material

(3)

Range acquisition

(4)

Range acquisition taxonomy

• Contact

– mechanical (CMM, jointed arm)

– inertial (gyroscope, accelerometer)

– ultrasonic trackers

– magnetic trackers

• Transmissive

– industrial CT

– ultrasound

– MRI

• Reflective

– non-optical (radar, sonar)

– optical

(5)

Range acquisition taxonomy

• Optical methods

– Passive (shape from X): stereo, motion, shading, texture, focus, defocus

– Active

• Active variants of passive methods: stereo with projected texture, active depth from defocus, photometric stereo

• Time of flight

• Triangulation

(6)

Outline

• Passive approaches

– Stereo

– Multiview approach

• Active approaches

– Triangulation

– Shadow scanning

• Active variants of passive approaches

– Photometric stereo

– Example-based photometric stereo

(7)

Passive approaches

(8)

Stereo

Public Library, Stereoscopic Looking Room, Chicago, by Phillips, 1923

(9)

Stereo

• One distinguishable point being observed

– The preimage can be found at the intersection of the rays from the focal points to the image points

(10)

Stereo

• Many points being observed

– Need some method to establish correspondences

(11)

Components of stereo vision systems

• Camera calibration

• Image rectification: simplifies the search for correspondences

• Correspondence: which item in the left image corresponds to which item in the right image

• Reconstruction: recovers 3-D information from the 2-D correspondences

(12)

Epipolar geometry

• Epipolar constraint: corresponding points must lie on conjugate epipolar lines

– Search for correspondences becomes a 1-D problem

$x'^{\top} F x = 0$
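The constraint can be checked numerically. The following is a minimal sketch, assuming a hypothetical rig with identity intrinsics, no rotation and a translation along the x axis, so that F reduces to the cross-product matrix of the translation; the function names are illustrative and not part of the lecture.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v equals np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])

def epipolar_line(F, x):
    """Epipolar line l' = F x in the second image for a homogeneous point x."""
    return F @ x

def epipolar_residual(F, x, x_prime):
    """Value of x'^T F x; it is zero for a perfect correspondence."""
    return float(x_prime @ F @ x)

# Assumed setup: pure sideways translation and identity intrinsics,
# so F = [t]_x R with R = I (this is an illustration, not a calibration).
F = skew(np.array([1.0, 0.0, 0.0])) @ np.eye(3)

x = np.array([0.3, 0.2, 1.0])        # point in the first image
x_good = np.array([0.1, 0.2, 1.0])   # same row: lies on the epipolar line
x_bad = np.array([0.1, 0.5, 1.0])    # different row: violates the constraint

line = epipolar_line(F, x)
res_good = epipolar_residual(F, x, x_good)
res_bad = epipolar_residual(F, x, x_bad)
```

For this rig the epipolar line is horizontal, which is why the search for a match reduces to one dimension.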

(13)

Image rectification

• Warp images such that conjugate epipolar lines become collinear and parallel to the u axis

(14)

Disparity

• With rectified images, disparity is just the (horizontal) displacement of corresponding features in the two images

– Disparity = 0 for distant points

– Larger disparity for closer points

– Depth of point proportional to 1/disparity
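The inverse relation between depth and disparity can be made concrete with the standard rectified-stereo formula Z = f B / d. This is a small sketch; the focal length in pixels and the baseline in meters are assumed inputs that the slides do not name.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth Z = f * B / d for a rectified pair; zero disparity maps to infinity."""
    if disparity_px < 0:
        raise ValueError("disparity must be non-negative")
    if disparity_px == 0:
        return float("inf")
    return focal_px * baseline_m / disparity_px

# Hypothetical camera: 500 px focal length, 10 cm baseline.
near = depth_from_disparity(20.0, focal_px=500.0, baseline_m=0.1)
far = depth_from_disparity(5.0, focal_px=500.0, baseline_m=0.1)
```

The larger disparity gives the smaller depth, matching the bullets above.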

(15)

Reconstruction

• Geometric

– Construct the line segment perpendicular to R and R' that intersects both rays and take its midpoint
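One way to sketch the midpoint construction is to solve a small least-squares problem for the closest points on the two rays. The ray parameterization (camera center plus direction) and the function name are assumptions made for illustration.

```python
import numpy as np

def midpoint_triangulate(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2.

    Also returns the length of that segment, which measures how far the
    two rays are from actually intersecting.
    """
    A = np.stack([d1, -d2], axis=1)          # 3x2 system: s*d1 - t*d2 = c2 - c1
    b = c2 - c1
    (s, t), *_ = np.linalg.lstsq(A, b, rcond=None)
    p = c1 + s * d1                           # closest point on the first ray
    q = c2 + t * d2                           # closest point on the second ray
    return 0.5 * (p + q), float(np.linalg.norm(p - q))

# Hypothetical cameras one unit apart, both looking at the same 3-D point.
target = np.array([0.5, 0.2, 4.0])
c1 = np.array([0.0, 0.0, 0.0])
c2 = np.array([1.0, 0.0, 0.0])
point, gap = midpoint_triangulate(c1, target - c1, c2, target - c2)
```

With noisy correspondences the rays no longer meet, and the returned gap gives a rough quality measure for the reconstructed point.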

(16)

Basic stereo algorithm

For each epipolar line

For each pixel in the left image

• compare with every pixel on same epipolar line in right image

• pick pixel with minimum match cost

Improvement: match windows

(17)

Basic stereo algorithm

• For each pixel

– For each disparity

• For each pixel in window

– Compute difference

– Find disparity with minimum SSD

(18)

Reverse order of loops

• For each disparity

– For each pixel

• For each pixel in window

– Compute difference

• Find disparity with minimum SSD at each pixel
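The reversed loop order can be sketched with NumPy, where the outer loop runs over disparities and the per-pixel work is done on whole arrays. This is an illustrative sketch, assuming rectified grayscale images in which a left pixel (y, x) matches the right pixel (y, x - d); the helper names are hypothetical.

```python
import numpy as np

def window_sum(values, radius):
    """Sum of values over a (2r+1) x (2r+1) window at every pixel (edge padded)."""
    h, w = values.shape
    padded = np.pad(values, radius, mode="edge")
    total = np.zeros((h, w))
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            total += padded[dy:dy + h, dx:dx + w]
    return total

def ssd_disparity(left, right, max_disp, radius=2):
    """Winner-take-all disparity map from windowed SSD, looping over disparity first."""
    left = np.asarray(left, dtype=np.float64)
    right = np.asarray(right, dtype=np.float64)
    h, w = left.shape
    best_cost = np.full((h, w), np.inf)
    best_disp = np.zeros((h, w), dtype=np.int64)
    for d in range(max_disp + 1):
        # Columns with no valid partner at this disparity get infinite cost.
        sq_diff = np.full((h, w), np.inf)
        sq_diff[:, d:] = (left[:, d:] - right[:, :w - d]) ** 2
        cost = window_sum(sq_diff, radius)
        better = cost < best_cost
        best_cost[better] = cost[better]
        best_disp[better] = d
    return best_disp

# Synthetic check: the right image is the left image shifted by 3 pixels.
rng = np.random.default_rng(0)
left = rng.random((20, 40))
right = np.zeros_like(left)
right[:, :-3] = left[:, 3:]
disp = ssd_disparity(left, right, max_disp=6, radius=2)
```

Computing one full cost image per disparity is what makes shared window sums possible.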

(19)

Incremental computation

• Given SSD of a window, at some disparity

[Figure: Image 1 and Image 2]

(20)

Incremental computation

• Want: SSD at next location

[Figure: Image 1 and Image 2]

(21)

Incremental computation

• Subtract contributions from leftmost column, add contributions from rightmost column

[Figure: Image 1 and Image 2, with the outgoing column marked − and the incoming column marked +]
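The update rule can be sketched in one dimension. Here `values` stands for per-column sums of squared differences along a scanline, which is an assumption made to keep the example short.

```python
def sliding_window_sums(values, width):
    """Window sums along a row, updating each sum from the previous one."""
    if width <= 0 or width > len(values):
        raise ValueError("width must be between 1 and len(values)")
    total = sum(values[:width])          # first window is computed directly
    sums = [total]
    for i in range(width, len(values)):
        # Add the column entering on the right, drop the one leaving on the left.
        total += values[i] - values[i - width]
        sums.append(total)
    return sums

window_costs = sliding_window_sums([1, 2, 3, 4, 5], 3)
```

Each new window costs two operations regardless of the window width.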

(22)

Selecting window size

• Small window: more detail, but more noise

• Large window: more robustness, less detail

• Example:

(23)

Selecting window size

[Figure: 3 pixel window; 20 pixel window]

(24)

Non-square windows

• Compromise: have a large window, but higher weight near the center

• Example: Gaussian

• Example: Shifted windows

(25)

Ordering constraint

• Order of matching features usually the same in both images

• But not always: occlusion

(26)

Dynamic programming

• Treat feature correspondence as graph problem

[Figure: graph with left image features 1 to 4 and right image features 1 to 4; cost of edges = similarity of regions between image features]

(27)

Dynamic programming

• Find min-cost path through graph

[Figure: min-cost path through the graph of left image features 1 to 4 and right image features 1 to 4]
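A min-cost path of this kind can be sketched with a standard dynamic-programming table. The choices below are assumptions for illustration: features are scalar intensities, the edge cost is their absolute difference, and skipping a feature (occlusion) costs a fixed `occlusion_cost`.

```python
def dp_match(left, right, occlusion_cost=1.0):
    """Order-preserving matching of two feature lists by dynamic programming."""
    n, m = len(left), len(right)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * occlusion_cost
    for j in range(1, m + 1):
        cost[0][j] = j * occlusion_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1])
            skip_left = cost[i - 1][j] + occlusion_cost
            skip_right = cost[i][j - 1] + occlusion_cost
            cost[i][j] = min(match, skip_left, skip_right)
    # Walk back from the end of the table to recover the matched pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + occlusion_cost:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return cost[n][m], pairs

# The second left feature has no counterpart in the right image.
total, pairs = dp_match([10, 20, 30, 40], [10, 30, 40], occlusion_cost=1.0)
```

Because the path only moves forward in both lists, the ordering constraint is enforced by construction.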

(28)

Energy minimization

• Another approach to improve quality of correspondences

• Assumption: disparities vary (mostly) smoothly

• Minimize energy function:

$E_{\text{data}} + E_{\text{smoothness}}$

• E_data: how well does disparity match data

• E_smoothness: how well does disparity match that of neighbors – regularization

(29)

Energy minimization

• If data and energy terms are nice (continuous, smooth, etc.) can try to minimize via gradient descent, etc.

• In practice, disparities only piecewise smooth

• Design smoothness function that doesn’t penalize large jumps too much

– Example: $V(d_1, d_2) = \min(|d_1 - d_2|, K)$

(30)

Stereo as energy minimization

• Matching cost formulated as energy

– “data” term penalizing bad matches

$D(x, y, d) = |I(x, y) - J(x + d, y)|$

– “neighborhood term” encouraging spatial smoothness

$V(d_1, d_2)$ = cost of adjacent pixels with labels $d_1$ and $d_2$ = $|d_1 - d_2|$ (or something similar)

$E = \sum_{(x, y)} D(x, y, d_{x,y}) + \sum_{\text{neighbors } (x_1, y_1), (x_2, y_2)} V(d_{x_1, y_1}, d_{x_2, y_2})$
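To make the energy concrete, the sketch below evaluates it for a given disparity map. It assumes the data term compares I(x, y) with J(x + d, y), uses 4-connected neighbors, and takes the truncated penalty min(|d1 - d2|, K) as the smoothness cost; the weight `lam` and the clamping at the image border are additions for illustration only.

```python
import numpy as np

def stereo_energy(I, J, disp, K=2, lam=1.0):
    """Data term plus truncated-linear smoothness term for a disparity map."""
    I = np.asarray(I, dtype=np.float64)
    J = np.asarray(J, dtype=np.float64)
    disp = np.asarray(disp, dtype=np.int64)
    h, w = I.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xj = np.clip(xs + disp, 0, w - 1)            # clamp lookups at the border
    data = np.abs(I - J[ys, xj]).sum()
    vertical = np.minimum(np.abs(np.diff(disp, axis=0)), K).sum()
    horizontal = np.minimum(np.abs(np.diff(disp, axis=1)), K).sum()
    return float(data + lam * (vertical + horizontal))

I = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
flat = np.zeros((2, 3), dtype=np.int64)
bumpy = np.array([[0, 1, 0],
                  [0, 0, 0]])
e_flat = stereo_energy(I, I, flat)      # identical images, zero disparity
e_bumpy = stereo_energy(I, I, bumpy)    # one wrong label costs data and smoothness
```

An optimizer such as graph cuts or simulated annealing would search over `disp` to lower this value; the function here only scores a candidate.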

(31)

Energy minimization

• Hard to find global minima of non-smooth functions

– Many local minima

– Provably NP-hard

• Practical algorithms look for approximate minima (e.g., simulated annealing)

(32)

Stereo results

– Data from University of Tsukuba

ground truth scene

http://cat.middlebury.edu/stereo/

(33)

Results with window correlation

[Figure: normalized correlation (best window size); ground truth]

(34)

Results with graph cuts

[Figure: ground truth; graph cuts (Potts model E, expansion move algorithm)]

(35)

Stereo evaluation

(36)

Stereo—best algorithms

(37)

Volumetric multiview approaches

• Goal: find a model consistent with images

• “Model-centric” (vs. image-centric)

• Typically use discretized volume (voxel grid)

• For each voxel, compute occupied / free (for some algorithms, also color, etc.)

(38)

Photo consistency

• Result: not necessarily the correct scene

• Many scenes produce the same images

[Figure: within all scenes, the photo-consistent scenes include both the true scene and the reconstructed scene]

(39)

Silhouette carving

• Find silhouettes in all images

• Exact version:

– Back-project all silhouettes, find intersection

[Figure: binary images]

(40)

Silhouette carving

• Find silhouettes in all images

• Exact version:

– Back-project all silhouettes, find intersection

(41)

Silhouette carving

• Limit of silhouette carving is visual hull or line hull

• Complement of lines that don’t intersect object

• In general not the same as object

– Can’t recover “pits” in object

• Not the same as convex hull

(42)

Silhouette carving

• Discrete version:

– Loop over all voxels in some volume

– If projection into images lies inside all silhouettes, mark as occupied

– Else mark as free
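The discrete loop can be sketched as follows. The `projectors` argument is a hypothetical list of functions that map a 3-D point to pixel coordinates (u, v); in a real system these would come from camera calibration. The example uses two orthographic views for simplicity.

```python
import numpy as np

def carve(voxels, silhouettes, projectors):
    """Mark a voxel occupied only if it projects inside every silhouette."""
    occupied = np.ones(len(voxels), dtype=bool)
    for mask, project in zip(silhouettes, projectors):
        h, w = mask.shape
        for idx, point in enumerate(voxels):
            if not occupied[idx]:
                continue                      # already carved by an earlier view
            u, v = project(point)
            col, row = int(round(u)), int(round(v))
            inside = 0 <= row < h and 0 <= col < w and bool(mask[row, col])
            if not inside:
                occupied[idx] = False
    return occupied

# A 4 x 4 x 4 grid of voxel centers with integer coordinates.
axis = np.arange(4)
voxels = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)

# View along z sees a 2 x 2 square; view along x sees everything.
square = np.zeros((4, 4), dtype=bool)
square[1:3, 1:3] = True
full = np.ones((4, 4), dtype=bool)
projectors = [lambda p: (p[0], p[1]),   # (u, v) = (x, y)
              lambda p: (p[1], p[2])]   # (u, v) = (y, z)
occupied = carve(voxels, [square, full], projectors)
```

With only these two views the result is a 2 x 2 x 4 column, which illustrates that the carved shape is a superset of the object.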

(43)

Silhouette carving

(44)

Voxel coloring

• Seitz and Dyer, 1997

• In addition to free / occupied, store color at each voxel

• Explicitly accounts for occlusion

(45)

Voxel coloring

• Basic idea: sweep through a voxel grid

– Project each voxel into each image in which it is visible

– If colors in images agree, mark voxel with color

– Else, mark voxel as empty

• Agreement of colors based on comparing standard deviation of colors to threshold
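The agreement test can be sketched as below. Taking the largest per-channel standard deviation is one possible reading of the criterion and is an assumption, as is treating a voxel that is visible in no image as inconsistent.

```python
import numpy as np

def color_consistent(samples, threshold):
    """Return (True, mean color) if the sampled colors agree, else (False, None).

    samples holds one RGB color per image in which the voxel is visible.
    """
    samples = np.asarray(samples, dtype=np.float64)
    if samples.size == 0:
        return False, None                   # assumption: unseen voxels stay empty
    spread = samples.std(axis=0).max()       # worst channel decides
    if spread <= threshold:
        return True, samples.mean(axis=0)
    return False, None

ok, color = color_consistent([[100, 50, 20], [102, 51, 19], [98, 49, 21]], threshold=5.0)
bad, no_color = color_consistent([[255, 0, 0], [0, 0, 255]], threshold=5.0)
```

A consistent voxel is colored with the mean of its samples; an inconsistent one is marked empty.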

(46)

Voxel coloring and occlusion

(47)

Voxel coloring and occlusion

• Problem: which voxels are visible?

• Solution: constrain camera views

– When a voxel is considered, necessary occlusion information must be available

– Sweep occluders before occludees

– Constrain camera positions to allow this sweep

(48)

Voxel coloring sweep order

[Figure: scene traversal]

(49)

Voxel coloring camera positions

• Inward-looking: cameras above scene

• Outward-looking: cameras inside scene

[Figure credit: Seitz]

(50)

Image acquisition

Selected Dinosaur Images

Selected Flower Images

• Calibrated turntable

• 360° rotation (21 images)

(51)

Voxel coloring results

Dinosaur Reconstruction

• 72 K voxels colored

• 7.6 M voxels tested

• 7 min. to compute on a 250MHz SGI

Flower Reconstruction

• 70 K voxels colored

• 7.6 M voxels tested

• 7 min. to compute on a 250MHz SGI

(52)

Space carving

[Figure: Image 1 … Image N]

• Initialize to a volume V containing the true scene

• Choose a voxel on the current surface

• Project to visible input images

• Carve if not photo-consistent

• Repeat until convergence

(53)

Multi-pass plane sweep

• Faster alternative:

– Sweep plane in each of 6 principal directions

– Consider cameras on only one side of plane

– Repeat until convergence

(54)

Multi-pass plane sweep

True Scene Reconstruction

(55)

Multi-pass plane sweep

(56)

Multi-pass plane sweep

(57)

Multi-pass plane sweep

(58)

Multi-pass plane sweep

(59)

Space carving results: African violet

[Figure: input image (1 of 45); reconstruction]

(60)

Space carving results: hand

[Figure: input image (1 of 100); reconstruction]

(61)

Active approaches

(62)

Time of flight

• Basic idea: send out pulse of light (usually laser), time how long it takes to return

$r = \frac{1}{2} c \, \Delta t$
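The range computation is simple enough to sketch directly: the pulse travels to the surface and back, so the range is half the round-trip distance. The function names are illustrative.

```python
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def range_from_round_trip(seconds):
    """Range r = c * dt / 2 for a measured round-trip time dt."""
    return 0.5 * SPEED_OF_LIGHT * seconds

def round_trip_for_range(meters):
    """Inverse relation: round-trip time for a given range."""
    return 2.0 * meters / SPEED_OF_LIGHT

one_meter = range_from_round_trip(2.0 / SPEED_OF_LIGHT)
timing_for_4mm = round_trip_for_range(0.004)
```

Resolving a few millimeters of range requires timing on the order of tens of picoseconds, which is why pulse time-of-flight scanners tend to be used at long range rather than for fine detail.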

(63)

Laser scanning (triangulation)

[Figure: laser scanner setup with object, direction of travel, laser sheet, cylindrical lens, laser, CCD and CCD image plane]

Digital Michelangelo Project

http://graphics.stanford.edu/projects/mich/

• Optical triangulation

– Project a single stripe of laser light

– Scan it across the surface of the object

– This is a very precise version of structured light scanning

• Other patterns are possible

(64)

Cyberware

face and hand full body

(65)

XYZRGB

(66)

XYZRGB

(67)

Shadow scanning

[Figure: setup with desk lamp, stick or pencil, camera and desk]

http://www.vision.caltech.edu/bouguetj/ICCV98/

(68)

Basic idea

• Calibration issues:

– where’s the camera wrt. ground plane?

– where’s the shadow plane?

• depends on light source position, shadow edge

(69)

Two Plane Version

• Advantages

– don’t need to pre-calibrate the light source

– shadow plane determined from two shadow edges

(70)

Estimating shadow lines

(71)

Shadow scanning in action

(72)

Results

accuracy: 0.1mm over 10cm ~ 0.1% error

(73)

Textured objects

(74)

Scanning with the sun

accuracy: 1mm over 50cm ~ 0.5% error

(75)

Scanning with the sun

accuracy: 1cm over 2m ~ 0.5% error

(76)

Active variants of passive approaches

(77)

The BRDF

• The Bidirectional Reflection Distribution Function

– Given an incoming ray and outgoing ray, what proportion of the incoming light is reflected along outgoing ray?

[Figure: surface normal]

$I = \rho(l, v) \, (l \cdot n)$

(78)

Diffuse reflection (Lambertian)

[Figure: surface point P with directions L, V and N]

$\rho(l, v) = k_d$ (albedo)

Assuming that light strength is 1.

(79)

Photometric stereo

[Figure: N, V and light directions L₁, L₂, L₃]

• Can write this as a matrix equation:

(80)

Solving the equations

(81)

More than three lights

• Get better results by using more lights

• Least squares solution:

• Solve for N, k_d as before
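For a Lambertian surface with unit light strength, each image gives one equation I_i = k_d (N · L_i). Stacking the light directions as rows of a matrix L gives I = L G with G = k_d N, which can be solved in the least-squares sense. The sketch below follows that standard formulation; the function name and the synthetic lights are assumptions.

```python
import numpy as np

def solve_normal(lights, intensities):
    """Least-squares normal and albedo from m >= 3 lights (rows of `lights`)."""
    lights = np.asarray(lights, dtype=np.float64)
    intensities = np.asarray(intensities, dtype=np.float64)
    g, *_ = np.linalg.lstsq(lights, intensities, rcond=None)   # g = k_d * N
    albedo = float(np.linalg.norm(g))
    normal = g / albedo if albedo > 0 else g
    return normal, albedo

# Synthetic pixel: known normal and albedo, four unit light directions.
lights = np.array([[0, 0, 1], [1, 0, 1], [0, 1, 1], [1, 1, 1]], dtype=np.float64)
lights /= np.linalg.norm(lights, axis=1, keepdims=True)
true_normal = np.array([1.0, 2.0, 2.0]) / 3.0
true_albedo = 0.8
intensities = true_albedo * lights @ true_normal

normal, albedo = solve_normal(lights, intensities)
```

The length of G gives the albedo and its direction gives the normal, so no separate step is needed to split the two.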

(82)

Trick for handling shadows

• Weight each equation by the pixel brightness:

• Gives weighted least-squares matrix equation:

• Solve for N, k_d as before
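Weighting each equation by its own brightness turns I_i = k_d (N · L_i) into I_i² = I_i k_d (N · L_i), so a pixel that is in shadow under some light (I_i = 0) contributes nothing. A minimal sketch, assuming the same Lambertian model and unit light strength; the names and test data are hypothetical.

```python
import numpy as np

def solve_normal_weighted(lights, intensities):
    """Brightness-weighted least squares: shadowed observations get zero weight."""
    lights = np.asarray(lights, dtype=np.float64)
    intensities = np.asarray(intensities, dtype=np.float64)
    weights = intensities                         # weight = pixel brightness
    g, *_ = np.linalg.lstsq(lights * weights[:, None],
                            intensities * weights, rcond=None)
    albedo = float(np.linalg.norm(g))
    normal = g / albedo if albedo > 0 else g
    return normal, albedo

# Four lit observations plus one light behind the surface (observed as 0).
lights = np.array([[0, 0, 1], [1, 0, 1], [0, 1, 1], [1, 1, 1], [-1, -2, -2]],
                  dtype=np.float64)
lights /= np.linalg.norm(lights, axis=1, keepdims=True)
true_normal = np.array([1.0, 2.0, 2.0]) / 3.0
intensities = np.maximum(0.8 * lights @ true_normal, 0.0)   # shadow clamps to 0

normal_w, albedo_w = solve_normal_weighted(lights, intensities)
```

An unweighted solve would be pulled off by the clamped observation, whereas the weighted system ignores it.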

(83)

Photometric Stereo Setup

(84)

Procedure

• Calibrate camera

• Calibrate light directions/intensities

• Photograph objects (HDR recommended)

• Estimate normals

• Estimate depth

(85)

Estimating light directions

• Trick: place a chrome sphere in the scene

– the location of the highlight tells you where the light source is

• Use a ruler
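The chrome sphere trick can be sketched under a few assumptions that the slides do not state: an orthographic camera looking along the z axis, a distant light, and a sphere whose center and radius have been measured in the image. The normal at the highlight follows from its offset from the center, and the light direction is the mirror reflection of the view direction about that normal.

```python
import math

def light_from_highlight(center, radius, highlight):
    """Light direction from a highlight on a mirrored sphere (image axes, z to camera)."""
    nx = (highlight[0] - center[0]) / radius
    ny = (highlight[1] - center[1]) / radius
    rr = nx * nx + ny * ny
    if rr > 1.0:
        raise ValueError("highlight lies outside the sphere")
    nz = math.sqrt(1.0 - rr)
    # Reflect the view direction R = (0, 0, 1) about N: L = 2 (N . R) N - R.
    return (2.0 * nz * nx, 2.0 * nz * ny, 2.0 * nz * nz - 1.0)

# Hypothetical measurements in pixels.
frontal = light_from_highlight(center=(100, 100), radius=50, highlight=(100, 100))
side = light_from_highlight(center=(100, 100), radius=50, highlight=(130, 100))
```

A highlight at the sphere center means the light sits behind the camera; the further the highlight moves toward the rim, the more the light comes from the side.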

(86)

Photographing objects

(87)

Normalize light intensities

(88)

Estimate normals

(89)

Depth from normals

(90)

Results

(91)

Limitations

• Big problems

– doesn’t work for shiny things, semi-translucent things

– shadows, inter-reflections

• Smaller problems

– calibration requirements

• measure light source directions, intensities

• camera response function

(92)

Example-based photometric stereo

• Estimate 3D shape by varying illumination, fixed camera

• Operating conditions

– any opaque material

– distant camera, lighting

– reference object available

– no shadows, interreflections, transparency

(93)

same surface normal

“Orientation consistency”

(94)

(95)

(96)

(97)

(98)

(99)

(100)

Virtual views

(101)

Velvet

(102)

Virtual Views

(103)

Brushed Fur

(104)

Virtual Views

(105)

(106)

Active stereo with structured light

Li Zhang’s one-shot stereo

[Figure: camera 1, projector, camera 2]

• Project “structured” light patterns onto the object

– simplifies the correspondence problem

(107)

Spacetime Stereo

http://grail.cs.washington.edu/projects/stfaces/

(108)

3D Model Acquisition Pipeline

[Figure: 3D scanner]

(109)

3D Model Acquisition Pipeline

[Figure: view planning]

(110)

3D Model Acquisition Pipeline

[Figure: view planning, alignment]

(111)

3D Model Acquisition Pipeline

[Figure: view planning, alignment, merging]

(112)

Volumetric reconstruction

(113)

Signed distance function

(114)

Results

(115)

The Digital Michelangelo Project

• Goal: scan 10 sculptures by Michelangelo

• High-resolution (“quarter-millimeter”) geometry

• Stanford University, led by Marc Levoy

(116)

Systems, projects and applications

(117)

Scanning the David

height of gantry: 7.5 meters

weight of gantry: 800 kilograms

(118)

Range processing pipeline

• steps

1. manual initial alignment

2. ICP to one existing scan

3. automatic ICP of all overlapping pairs

4. global relaxation to spread out error

5. merging using volumetric method
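The ICP steps in this pipeline can be illustrated with a basic point-to-point variant: match each point to its nearest neighbor, solve for the best rigid transform, apply it, and repeat. This is a generic sketch with brute-force matching on small point sets, not the implementation used in the project; all names are hypothetical.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Rotation R and translation t minimizing sum |R src_i + t - dst_i|^2 (SVD method)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    sign = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0   # avoid reflections
    R = Vt.T @ np.diag([1.0, 1.0, sign]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def nearest_neighbors(src, dst):
    """Index of the closest dst point for every src point, plus RMS distance."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)
    rms = float(np.sqrt(d2[np.arange(len(src)), idx].mean()))
    return idx, rms

def icp(src, dst, iterations=20):
    """Align src to dst; returns the moved points and the RMS error per iteration."""
    current = src.copy()
    errors = []
    for _ in range(iterations):
        idx, rms = nearest_neighbors(current, dst)
        errors.append(rms)
        R, t = best_rigid_transform(current, dst[idx])
        current = current @ R.T + t
    errors.append(nearest_neighbors(current, dst)[1])
    return current, errors

# Synthetic scans: the second is a slightly rotated and shifted copy of the first.
rng = np.random.default_rng(0)
dst = rng.random((30, 3))
c, s = np.cos(0.1), np.sin(0.1)
Rz = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([0.05, -0.02, 0.03])
src = dst @ Rz.T + t_true

R_est, t_est = best_rigid_transform(dst, src)   # known correspondences
aligned, errors = icp(src, dst)                 # unknown correspondences
```

ICP only converges to the correct alignment from a reasonable starting pose, which is consistent with the manual initial alignment listed as the first step.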

(119)

Statistics about the scan

• 480 individually aimed scans

• 2 billion polygons

• 7,000 color images

• 32 gigabytes

• 30 nights of scanning

• 22 people

(120)

Comparison

[Figure: photograph; 1.0 mm computer model]

(121)

Results

(122)

The Great Buddha Project

• Great Buddha of Kamakura

• Original made of wood, completed 1243

• Covered in bronze and gold leaf, 1267

• Approx. 15 m tall

• Goal: preservation of cultural heritage

• Institute of Industrial Science, University of Tokyo, led by Katsushi Ikeuchi

(123)

Scanner

• Cyrax range scanner by Cyra Technologies

• Laser pulse time-of-flight

• Accuracy: 4 mm

• Range: 100 m

(124)

Processing

• 20 range images (a few million points)

• Simultaneous all-to-all ICP

• Variant of volumetric merging (parallelized)

(125)

Results

(126)

Applications in VFX

• 3D scanning

• Hybrid camera for IMAX

• View interpolation

(127)

3D scanning

XYZRGB Inc.

(128)

IMAX 3D

• 6K resolution, 42 linear bits per pixel

• For CG, it typically takes 6 hours for a frame

• 45-minute IMAX 3D CG film requires a 100-CPU rendering farm full-time for about a year just for rendering

• For live-action, camera is bulky (like a refrigerator)
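The "about a year" figure can be sanity-checked with a short calculation. The frame rate of 24 frames per second and the factor of two for left and right eye views are assumptions that the slide does not state.

```python
def render_days(minutes, hours_per_frame, cpus, fps=24, eyes=2):
    """Wall-clock days to render a film, assuming perfect scaling across CPUs."""
    frames = minutes * 60 * fps * eyes      # fps and eyes are assumed values
    cpu_hours = frames * hours_per_frame
    return cpu_hours / cpus / 24

days = render_days(minutes=45, hours_per_frame=6, cpus=100)
```

Under these assumptions the estimate lands a little under one year of continuous rendering, in line with the slide.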

(129)

Hybrid stereo camera

(130)

Live-action sequence

(131)

Hybrid input

left

right

(132)

Hybrid input

left

right

(133)

Combine multiple hires to lores

(134)

Results

(135)

View interpolation

Bullet time video

(136)

View interpolation

High-quality video view interpolation
