3D photography
Digital Visual Effects, Yung-Yu Chuang
with slides by Szymon Rusinkiewicz, Richard Szeliski, Steve Seitz and Brian Curless
3D photography
• Acquisition of geometry and material
Range acquisition taxonomy
range acquisition
• contact
– mechanical (CMM, jointed arm)
– inertial (gyroscope, accelerometer)
– ultrasonic trackers
– magnetic trackers
• transmissive
– industrial CT
– ultrasound
– MRI
• reflective
– non-optical (radar, sonar)
– optical
Range acquisition taxonomy
optical methods
• passive
– shape from X: stereo, motion, shading, texture, focus, defocus
• active
– active variants of passive methods: stereo w. projected texture, active depth from defocus, photometric stereo
– time of flight
– triangulation
Outline
• Passive approaches
– Stereo
– Multiview approach
• Active approaches
– Triangulation
– Shadow scanning
• Active variants of passive approaches
– Photometric stereo
– Example-based photometric stereo
Passive approaches
Stereo
Public Library, Stereoscopic Looking Room, Chicago, by Phillips, 1923
Stereo
• One distinguishable point being observed
– The preimage can be found at the intersection of the rays from the focal points to the image points
Stereo
• Many points being observed
– Need some method to establish correspondences
Components of stereo vision systems
• Camera calibration
• Image rectification: simplifies the search for correspondences
• Correspondence: which item in the left image corresponds to which item in the right image
• Reconstruction: recovers 3-D information from the 2-D correspondences
Epipolar geometry
• Epipolar constraint: corresponding points must lie on conjugate epipolar lines
– Search for correspondences becomes a 1-D problem
$$x'^{\top} F x = 0$$
where $F$ is the fundamental matrix and $x$, $x'$ are corresponding points in homogeneous coordinates
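A small sketch of the epipolar constraint, assuming an idealized rectified pair (identity intrinsics, no rotation, translation along the x axis), for which F reduces to the skew-symmetric matrix of the translation. The helper names `skew` and `epipolar_residual` are made up for this illustration.

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix [t]_x such that skew(t) @ v == np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])

# Assumption: rectified cameras, so F = [t]_x with t along the x axis.
F = skew([1.0, 0.0, 0.0])

def epipolar_residual(F, x_left, x_right):
    """Value of x'^T F x for pixel x in the left image and x' in the right image."""
    x = np.array([x_left[0], x_left[1], 1.0])
    xp = np.array([x_right[0], x_right[1], 1.0])
    return float(xp @ F @ x)
```

For this F the residual equals the difference of the two row coordinates, so it vanishes exactly when both points lie on the same scanline, which is why the search becomes 1-D.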
Image rectification
• Warp images such that conjugate epipolar lines become collinear and parallel to the u axis
Disparity
• With rectified images, disparity is just (horizontal) displacement of corresponding features in the two images
– Disparity = 0 for distant points
– Larger disparity for closer points
– Depth of point proportional to 1/disparity
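The inverse relation between depth and disparity can be sketched as follows, assuming a rectified pair with focal length `focal_px` (in pixels) and baseline `baseline` (in metres). Both parameter names and the standard formula Z = f·B/d are supplied here for illustration and do not appear on the slide.

```python
import numpy as np

def depth_from_disparity(disparity, focal_px, baseline):
    """Depth Z = f * B / d for a rectified stereo pair.

    Zero disparity maps to infinite depth (a point at infinity).
    """
    d = np.asarray(disparity, dtype=float)
    with np.errstate(divide="ignore"):
        return np.where(d > 0, focal_px * baseline / d, np.inf)
```

Halving the disparity doubles the estimated depth, so a fixed one-pixel matching error costs far more accuracy for distant points than for close ones.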
Reconstruction
• Geometric
– Construct the line segment perpendicular to R and R' that intersects both rays and take its mid-point
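A minimal sketch of this mid-point construction, assuming each ray is given by an origin (the camera center) and a direction. The function name and argument layout are hypothetical.

```python
import numpy as np

def midpoint_triangulate(o1, d1, o2, d2):
    """Mid-point of the shortest segment joining rays o1 + s*d1 and o2 + t*d2."""
    o1, d1, o2, d2 = (np.asarray(v, dtype=float) for v in (o1, d1, o2, d2))
    # Least squares for (s, t) minimizing |(o1 + s*d1) - (o2 + t*d2)|;
    # at the optimum the connecting segment is perpendicular to both rays.
    A = np.stack([d1, -d2], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, o2 - o1, rcond=None)
    p1 = o1 + s * d1
    p2 = o2 + t * d2
    return 0.5 * (p1 + p2)
```

With noise-free correspondences the two rays intersect and the mid-point is the intersection itself. With noise the rays are skew and the mid-point is a simple compromise.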
Basic stereo algorithm
For each epipolar line
For each pixel in the left image
• compare with every pixel on same epipolar line in right image
• pick pixel with minimum match cost
Improvement: match windows
Basic stereo algorithm
• For each pixel
– For each disparity
• For each pixel in window
– Compute difference
– Find disparity with minimum SSD
Reverse order of loops
• For each disparity
– For each pixel
• For each pixel in window
– Compute difference
• Find disparity with minimum SSD at each pixel
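The disparity-outermost loop order can be sketched with NumPy as below. This is one possible arrangement, assuming rectified grayscale images where left pixel x matches right pixel x − d. The names `box_sum` and `ssd_disparity`, the zero padding at the borders, and the handling of columns without a valid match are choices made for this example.

```python
import numpy as np

def box_sum(img, radius):
    """Sum over a (2r+1) x (2r+1) window around every pixel (zero padded)."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="constant")
    ii = np.pad(padded, ((1, 0), (1, 0)), mode="constant").cumsum(axis=0).cumsum(axis=1)
    return ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]

def ssd_disparity(left, right, max_disp, radius=2):
    """Winner-take-all disparity map from windowed SSD."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    h, w = left.shape
    best_cost = np.full((h, w), np.inf)
    best_disp = np.zeros((h, w), dtype=int)
    for d in range(max_disp + 1):            # outer loop: disparity
        diff2 = np.zeros((h, w))
        diff2[:, d:] = (left[:, d:] - right[:, :w - d]) ** 2
        cost = box_sum(diff2, radius)        # all windows at once
        if d > 0:
            cost[:, :d + radius] = np.inf    # window touches columns with no match
        better = cost < best_cost
        best_cost[better] = cost[better]
        best_disp[better] = d
    return best_disp
```

Putting the disparity loop outside means the squared-difference image is computed once per disparity and shared by every window that overlaps it.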
Incremental computation
• Given SSD of a window, at some disparity
[Figure: a window in Image 1 and the matching window in Image 2]
• Want: SSD at next location
• Subtract contributions from leftmost column, add contributions from rightmost column
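The column update can be sketched as follows, assuming `diff2` already holds the squared differences between the two images at one fixed disparity and the window spans all rows of that array. The function name is made up for this example.

```python
import numpy as np

def sliding_ssd(diff2, width):
    """SSD of every horizontal window position, updated incrementally."""
    col = np.asarray(diff2, dtype=float).sum(axis=0)  # contribution of each column
    n = col.size
    out = np.empty(n - width + 1)
    out[0] = col[:width].sum()                        # first window: full sum
    for x in range(1, n - width + 1):
        # drop the column that left the window, add the one that entered
        out[x] = out[x - 1] - col[x - 1] + col[x + width - 1]
    return out
```

Each step costs one subtraction and one addition regardless of the window width, instead of re-summing the whole window.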
Selecting window size
• Small window: more detail, but more noise
• Large window: more robustness, less detail
• Example: [Figure: results with a 3 pixel window and a 20 pixel window]
Non-square windows
• Compromise: have a large window, but higher weight near the center
• Example: Gaussian
• Example: Shifted windows
Ordering constraint
• Order of matching features usually the same in both images
• But not always: occlusion
Dynamic programming
• Treat feature correspondence as graph problem
[Figure: grid graph with left image features 1 to 4 on one axis and right image features 1 to 4 on the other; cost of edges = similarity of regions between image features]
• Find min-cost path through graph
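A minimal sketch of the min-cost path search on one scanline, assuming each feature is summarized by a single intensity value and that leaving a feature unmatched (an occlusion) costs a fixed penalty. The penalty, the absolute-difference edge cost, and the function name are assumptions for illustration.

```python
import numpy as np

def dp_match(left, right, occlusion_cost=1.0):
    """Order-preserving matching of two feature sequences by dynamic programming."""
    n, m = len(left), len(right)
    cost = np.zeros((n + 1, m + 1))
    cost[:, 0] = occlusion_cost * np.arange(n + 1)
    cost[0, :] = occlusion_cost * np.arange(m + 1)
    move = np.zeros((n + 1, m + 1), dtype=int)   # 0 match, 1 skip left, 2 skip right
    move[1:, 0] = 1
    move[0, 1:] = 2
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = (cost[i - 1, j - 1] + abs(left[i - 1] - right[j - 1]),
                       cost[i - 1, j] + occlusion_cost,
                       cost[i, j - 1] + occlusion_cost)
            move[i, j] = int(np.argmin(options))
            cost[i, j] = options[move[i, j]]
    # Trace the path back from the far corner of the grid.
    matches, i, j = [], n, m
    while i > 0 or j > 0:
        if move[i, j] == 0:
            matches.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move[i, j] == 1:
            i -= 1
        else:
            j -= 1
    return matches[::-1], float(cost[n, m])
```

Because the path can only move forward in both sequences, the result respects the ordering constraint, and skipped features play the role of occlusions.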
Energy minimization
• Another approach to improve quality of correspondences
• Assumption: disparities vary (mostly) smoothly
• Minimize energy function: $E_{data} + E_{smoothness}$
• $E_{data}$: how well does disparity match data
• $E_{smoothness}$: how well does disparity match that of neighbors – regularization
Energy minimization
• If data and energy terms are nice (continuous, smooth, etc.) can try to minimize via gradient descent, etc.
• In practice, disparities only piecewise smooth
• Design smoothness function that doesn’t penalize large jumps too much
– Example: $V(\alpha, \beta) = \min(|\alpha - \beta|, K)$
Stereo as energy minimization
• Matching Cost Formulated as Energy
– “data” term penalizing bad matches
$$D(x, y, d) = |I(x, y) - J(x + d, y)|$$
– “neighborhood term” encouraging spatial smoothness
$$V(d_1, d_2) = \text{cost of adjacent pixels with labels } d_1 \text{ and } d_2 = |d_1 - d_2| \text{ (or something similar)}$$
$$E = \sum_{(x,y)} D(x, y, d_{x,y}) + \sum_{\text{neighbors } (x_1,y_1),(x_2,y_2)} V(d_{x_1,y_1}, d_{x_2,y_2})$$
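To make the energy concrete, the sketch below evaluates it for a given disparity map, assuming the absolute-difference data term, a truncated-linear neighborhood term min(|d1 − d2|, K) over 4-connected neighbors, and clamping of x + d at the image border. The weight `lam` and the function name are not from the slides and are only there to show where a trade-off would enter.

```python
import numpy as np

def stereo_energy(I, J, disp, K=2, lam=1.0):
    """E = sum of data costs + lam * sum of truncated-linear neighbor costs."""
    I = np.asarray(I, dtype=float)
    J = np.asarray(J, dtype=float)
    disp = np.asarray(disp, dtype=int)
    h, w = I.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xj = np.clip(xs + disp, 0, w - 1)          # assumption: clamp at the border
    data = np.abs(I - J[ys, xj]).sum()         # D(x, y, d) = |I(x,y) - J(x+d,y)|
    vert = np.minimum(np.abs(np.diff(disp, axis=0)), K).sum()
    horiz = np.minimum(np.abs(np.diff(disp, axis=1)), K).sum()
    return float(data + lam * (vert + horiz))
```

An optimizer such as graph cuts or simulated annealing would search over `disp` for a low value of this function. The sketch only scores a candidate labeling.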
Energy minimization
• Hard to find global minima of non-smooth functions
– Many local minima
– Provably NP-hard
• Practical algorithms look for approximate minima (e.g., simulated annealing)
Stereo results
– Data from University of Tsukuba
[Figure: scene; ground truth]
http://cat.middlebury.edu/stereo/
Results with window correlation
[Figure: normalized correlation (best window size); ground truth]
Results with graph cuts
[Figure: ground truth; graph cuts (Potts model E, expansion move algorithm)]
Stereo evaluation
Stereo: best algorithms
Volumetric multiview approaches
• Goal: find a model consistent with images
• “Model-centric” (vs. image-centric)
• Typically use discretized volume (voxel grid)
• For each voxel, compute occupied / free (for some algorithms, also color, etc.)
Photo consistency
• Result: not necessarily the correct scene
• Many scenes produce the same images
[Figure: all scenes, photo-consistent scenes, reconstructed scene, true scene]
Silhouette carving
• Find silhouettes in all images
• Exact version:
– Back-project all silhouettes, find intersection
[Figure: binary images]
Silhouette carving
• Limit of silhouette carving is visual hull or line hull
• Complement of lines that don’t intersect object
• In general not the same as object
– Can’t recover “pits” in object
• Not the same as convex hull
Silhouette carving
• Discrete version:
– Loop over all voxels in some volume
– If projection into images lies inside all silhouettes, mark as occupied
– Else mark as free
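The discrete version can be sketched as below, assuming each camera is described by a 3×4 projection matrix and each silhouette is a boolean image indexed as `[row, column]`. The function name and the rounding to the nearest pixel are choices made for this example, and the sketch assumes no voxel projects with a zero homogeneous coordinate.

```python
import numpy as np

def carve(voxels, cameras, silhouettes):
    """Return a boolean mask: True where a voxel projects inside every silhouette."""
    voxels = np.asarray(voxels, dtype=float)
    homog = np.hstack([voxels, np.ones((len(voxels), 1))])
    occupied = np.ones(len(voxels), dtype=bool)
    for P, sil in zip(cameras, silhouettes):
        uvw = homog @ np.asarray(P, dtype=float).T
        u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)   # column
        v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)   # row
        h, w = sil.shape
        inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(len(voxels), dtype=bool)
        hit[inside] = sil[v[inside], u[inside]]
        occupied &= hit            # free as soon as one view disagrees
    return occupied
```

A voxel whose projection falls outside an image is treated as free here. A real system might instead ignore views that do not see the voxel.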
Silhouette carving
Voxel coloring
• Seitz and Dyer, 1997
• In addition to free / occupied, store color at each voxel
• Explicitly accounts for occlusion
Voxel coloring
• Basic idea: sweep through a voxel grid
– Project each voxel into each image in which it is visible
– If colors in images agree, mark voxel with color
– Else, mark voxel as empty
• Agreement of colors based on comparing standard deviation of colors to threshold
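The agreement test might look like the sketch below, assuming `samples` holds one RGB color per image in which the voxel is visible. Taking the largest per-channel standard deviation is an assumption, since the slide only says the standard deviation is compared to a threshold.

```python
import numpy as np

def color_voxel(samples, threshold):
    """Return the mean color if the views agree, or None to mark the voxel empty."""
    samples = np.asarray(samples, dtype=float)      # shape (num_views, 3)
    spread = samples.std(axis=0).max()              # worst channel
    if spread <= threshold:
        return samples.mean(axis=0)
    return None
```

A low threshold carves aggressively and can punch holes in noisy data, while a high threshold leaves extra voxels attached to the surface.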
Voxel coloring and occlusion
• Problem: which voxels are visible?
• Solution: constrain camera views
– When a voxel is considered, necessary occlusion information must be available
– Sweep occluders before occludees
– Constrain camera positions to allow this sweep
Voxel coloring sweep order
[Figure: scene traversal]
Voxel coloring camera positions
Inward-looking: cameras above scene
Outward-looking: cameras inside scene
Seitz
Image acquisition
Selected Dinosaur Images
Selected Flower Images
• Calibrated Turntable
• 360° rotation (21 images)
Voxel coloring results
Dinosaur Reconstruction: 72 K voxels colored, 7.6 M voxels tested, 7 min. to compute on a 250MHz SGI
Flower Reconstruction: 70 K voxels colored, 7.6 M voxels tested, 7 min. to compute on a 250MHz SGI
Space carving
[Figure: Image 1 ... Image N]
• Initialize to a volume V containing the true scene
• Choose a voxel on the current surface
• Project to visible input images
• Carve if not photo-consistent
• Repeat until convergence
Multi-pass plane sweep
• Faster alternative:
– Sweep plane in each of 6 principal directions
– Consider cameras on only one side of plane
– Repeat until convergence
Multi-pass plane sweep
[Figure: true scene; reconstruction]
Space carving results: African violet
[Figure: input image (1 of 45); reconstruction]
Space carving results: hand
[Figure: input image (1 of 100); reconstruction]
Active approaches
Time of flight
• Basic idea: send out pulse of light (usually laser), time how long it takes to return
$$r = \frac{1}{2} c \, \Delta t$$
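A short worked example of the round-trip relation, assuming propagation at the vacuum speed of light. The function names are made up for this illustration.

```python
C = 299_792_458.0  # speed of light in m/s

def tof_range(round_trip_seconds):
    """Distance to the surface: half the round-trip path length."""
    return 0.5 * C * round_trip_seconds

def timing_for_resolution(range_resolution_m):
    """Round-trip timing precision needed to resolve a given range difference."""
    return 2.0 * range_resolution_m / C
```

For instance, `timing_for_resolution(0.004)` is roughly 2.7e-11 s, so millimeter-level accuracy requires timing on the order of tens of picoseconds.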
Laser scanning (triangulation)
[Figure: laser, cylindrical lens, laser sheet, object, direction of travel, CCD, CCD image plane]
Digital Michelangelo Project
http://graphics.stanford.edu/projects/mich/
• Optical triangulation
– Project a single stripe of laser light
– Scan it across the surface of the object
– This is a very precise version of structured light scanning
• Other patterns are possible
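Triangulation with a light stripe amounts to intersecting a camera ray with the known plane of light. The sketch below assumes a pinhole camera at the origin looking along +z, pixel coordinates measured from the principal point, and a calibrated plane given by a normal and a point on it. All names are hypothetical.

```python
import numpy as np

def triangulate_pixel(pixel, focal_px, plane_normal, plane_point):
    """3-D point where the ray through `pixel` meets the light plane."""
    ray = np.array([pixel[0], pixel[1], focal_px], dtype=float)
    n = np.asarray(plane_normal, dtype=float)
    p0 = np.asarray(plane_point, dtype=float)
    denom = n @ ray
    if abs(denom) < 1e-12:
        raise ValueError("ray is parallel to the light plane")
    t = (n @ p0) / denom          # solve n . (t * ray - p0) = 0
    return t * ray
```

The same intersection applies to any plane of light or shadow once that plane has been calibrated.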
Cyberware
[Figure: face and hand scanner; full body scanner]
XYZRGB
Shadow scanning
[Figure: setup with desk lamp, stick or pencil, camera, and desk]
http://www.vision.caltech.edu/bouguetj/ICCV98/
Basic idea
• Calibration issues:
– where’s the camera wrt. ground plane?
– where’s the shadow plane?
• depends on light source position, shadow edge
Two Plane Version
• Advantages
– don’t need to pre-calibrate the light source
– shadow plane determined from two shadow edges
Estimating shadow lines
Shadow scanning in action
Results
accuracy: 0.1mm over 10cm ~ 0.1% error
Textured objects
Scanning with the sun
accuracy: 1mm over 50cm ~ 0.5% error
Scanning with the sun
accuracy: 1cm over 2m ~ 0.5% error
Active variants of passive approaches
The BRDF
• The Bidirectional Reflection Distribution Function
– Given an incoming ray and outgoing ray, what proportion of the incoming light is reflected along outgoing ray?
$$I = \rho(l, v)\,(l \cdot n)$$
where $n$ is the surface normal, $l$ the incoming (light) direction and $v$ the outgoing (view) direction
Diffuse reflection (Lambertian)
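[Figure: light direction L, view direction V and surface normal N at point P]
$$\rho(l, v) = k_d \quad \text{(albedo)}$$
so the observed intensity is $I = k_d \, (N \cdot L)$, assuming that light strength is 1.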
Photometric stereo
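[Figure: surface normal N, view direction V and light directions L1, L2, L3]
$$I_1 = k_d \, N \cdot L_1, \quad I_2 = k_d \, N \cdot L_2, \quad I_3 = k_d \, N \cdot L_3$$
• Can write this as a matrix equation:
$$\begin{bmatrix} I_1 \\ I_2 \\ I_3 \end{bmatrix} = k_d \begin{bmatrix} L_1^{\top} \\ L_2^{\top} \\ L_3^{\top} \end{bmatrix} N$$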
Solving the equations
More than three lights
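• Get better results by using more lights
• Least squares solution: stack the light directions as rows of $L$ and the intensities in $I$, let $G = k_d N$, and solve $I = L G$ by $G = (L^{\top} L)^{-1} L^{\top} I$
• Solve for N, $k_d$ as before: $k_d = \lVert G \rVert$, $N = G / k_d$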
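A minimal per-pixel sketch of the least-squares solve, assuming a Lambertian surface, unit light strength, and no shadows. `L` holds one unit light direction per row and `I` the corresponding intensities. The function name is made up for this example.

```python
import numpy as np

def photometric_stereo(L, I):
    """Recover unit normal N and albedo kd from I_i = kd * (N . L_i)."""
    L = np.asarray(L, dtype=float)
    I = np.asarray(I, dtype=float)
    G, *_ = np.linalg.lstsq(L, I, rcond=None)   # G = kd * N
    kd = np.linalg.norm(G)
    return G / kd, kd
```

With exactly three non-coplanar lights this reduces to inverting a 3×3 matrix. Extra lights average out noise.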
Trick for handling shadows
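• Weight each equation by the pixel brightness: $I_i \, (I_i) = I_i \, (k_d \, N \cdot L_i)$
• Gives weighted least-squares matrix equation:
$$\begin{bmatrix} I_1^2 \\ \vdots \\ I_n^2 \end{bmatrix} = \begin{bmatrix} I_1 L_1^{\top} \\ \vdots \\ I_n L_n^{\top} \end{bmatrix} k_d N$$
• Solve for N, $k_d$ as before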
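The brightness weighting can be sketched as follows, under the same Lambertian assumptions. A pixel that is in shadow for some light has intensity near zero, so its equation receives almost no weight. The function name is hypothetical.

```python
import numpy as np

def weighted_photometric_stereo(L, I):
    """Weighted least squares: each equation is scaled by its own intensity."""
    L = np.asarray(L, dtype=float)
    I = np.asarray(I, dtype=float)
    A = I[:, None] * L            # rows I_i * L_i
    b = I * I                     # entries I_i^2
    G, *_ = np.linalg.lstsq(A, b, rcond=None)
    kd = np.linalg.norm(G)
    return G / kd, kd
```

At least three lit, non-coplanar light directions are still needed, since shadowed rows contribute nothing to the solve.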
Photometric Stereo Setup Procedure
• Calibrate camera
• Calibrate light directions/intensities
• Photograph objects (HDR recommended)
• Estimate normals
• Estimate depth
Estimating light directions
• Trick: place a chrome sphere in the scene
– the location of the highlight tells you where the light source is
• Use a ruler
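The chrome sphere trick can be sketched with the mirror reflection law L = 2(N·R)N − R. This assumes an orthographic camera looking down the z axis (so the view direction is R = (0, 0, 1)), a sphere with known image center and radius in pixels, and an image y axis that points the same way as the world y axis. Those conventions and the function name are assumptions for this example.

```python
import numpy as np

def light_from_highlight(highlight, center, radius):
    """Unit light direction from the highlight position on a mirror sphere."""
    nx = (highlight[0] - center[0]) / radius
    ny = (highlight[1] - center[1]) / radius
    nz = np.sqrt(max(0.0, 1.0 - nx * nx - ny * ny))
    N = np.array([nx, ny, nz])          # sphere normal at the highlight
    R = np.array([0.0, 0.0, 1.0])       # direction toward the camera
    return 2.0 * (N @ R) * N - R        # mirror reflection of R about N
```

A highlight at the sphere center means the light sits directly behind the camera. The further the highlight moves toward the silhouette, the more the light comes from the side.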
Photographing objects
Normalize light intensities
Estimate normals
Depth from normals
Results
Limitations
• Big problems
– doesn’t work for shiny things, semi-translucent things
– shadows, inter-reflections
• Smaller problems
– calibration requirements
• measure light source directions, intensities
• camera response function
Example-based photometric stereo
• Estimate 3D shape by varying illumination, fixed camera
• Operating conditions
– any opaque material
– distant camera, lighting
– reference object available
– no shadows, interreflections, transparency
“Orientation consistency”: same surface normal
Virtual views
Velvet Virtual Views
Brushed Fur Virtual Views
Active stereo with structured light
Li Zhang’s one-shot stereo
[Figure: camera 1, projector, camera 2]
• Project “structured” light patterns onto the object
– simplifies the correspondence problem
Spacetime Stereo
http://grail.cs.washington.edu/projects/stfaces/
3D Model Acquisition Pipeline
• 3D Scanner
• View Planning
• Alignment
• Merging
Volumetric reconstruction
Signed distance function
Results
The Digital Michelangelo Project
• Goal: scan 10 sculptures by Michelangelo
• High-resolution (“quarter-millimeter”) geometry
• Stanford University, led by Marc Levoy
Systems, projects and applications
Scanning the David
height of gantry: 7.5 meters
weight of gantry: 800 kilograms
Range processing pipeline
• steps
1. manual initial alignment
2. ICP to one existing scan
3. automatic ICP of all overlapping pairs
4. global relaxation to spread out error
5. merging using volumetric method
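For readers unfamiliar with ICP (iterative closest point), here is a basic point-to-point sketch: pair each point with its nearest neighbor in the other scan, solve for the best rigid motion in closed form via SVD, apply it, and repeat. This is a textbook variant with brute-force nearest neighbors, not necessarily the one used in the project, and the function names are made up.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares R, t with R @ src_i + t ~ dst_i for paired points."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection in the SVD solution.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, cd - R @ cs

def icp(src, dst, iterations=20):
    """Align point set src (N x 3) to dst (M x 3); returns the moved copy of src."""
    cur = np.asarray(src, dtype=float).copy()
    dst = np.asarray(dst, dtype=float)
    for _ in range(iterations):
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        nearest = dst[d2.argmin(axis=1)]          # closest-point pairing
        R, t = best_rigid_transform(cur, nearest)
        cur = cur @ R.T + t
    return cur
```

ICP only converges to the right answer from a reasonable starting pose, which is why the pipeline begins with a manual initial alignment.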
Statistics about the scan
• 480 individually aimed scans
• 2 billion polygons
• 7,000 color images
• 32 gigabytes
• 30 nights of scanning
• 22 people
Comparison
[Figure: photograph; 1.0 mm computer model]
Results
The Great Buddha Project
• Great Buddha of Kamakura
• Original made of wood, completed 1243
• Covered in bronze and gold leaf, 1267
• Approx. 15 m tall
• Goal: preservation of cultural heritage
• Institute of Industrial Science, University of Tokyo, led by Katsushi Ikeuchi
Scanner
• Cyrax range scanner by Cyra Technologies
• Laser pulse time-of-flight
• Accuracy: 4 mm
• Range: 100 m
Processing
• 20 range images (a few million points)
• Simultaneous all-to-all ICP
• Variant of volumetric merging (parallelized)
Results
Applications in VFX
• 3D scanning
• Hybrid camera for IMAX
• View interpolation
3D scanning
XYZRGB Inc.
IMAX 3D
• 6K resolution, 42 linear bits per pixel
• For CG, it typically takes 6 hours for a frame
• 45-minute IMAX 3D CG film requires a 100-CPU rendering farm full-time for about a year just for rendering
• For live-action, camera is bulky (like a refrigerator)
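The "about a year" figure can be checked with a back-of-the-envelope calculation. The 24 frames per second and the two views per stereo frame are assumptions added here. The 6 hours per frame, 45 minutes, and 100 CPUs come from the bullets above.

```python
def render_days(minutes, hours_per_frame, cpus, fps=24, views=2):
    """Wall-clock days to render a film on a farm of identical CPUs."""
    frames = minutes * 60 * fps * views      # total images to render
    cpu_hours = frames * hours_per_frame
    return cpu_hours / cpus / 24.0
```

Under these assumptions `render_days(45, 6, 100)` comes to 324 days, which is consistent with the slide's estimate.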
Hybrid stereo camera
Live-action sequence
Hybrid input
[Figure: left and right input images]
Combine multiple hires to lores
Results
View interpolation
Bullet time video
View interpolation
High-quality video view interpolation