Structure from motion
Digital Visual Effectsg Yung-Yu Chuang
with slides by Richard Szeliski, Steve Seitz, Zhengyou Zhang and Marc Pollefyes
Outline
• Epipolar geometry and fundamental matrix
S f i
• Structure from motion
• Factorization method
• Bundle adjustment
• Applications
• Applications
Epipolar geometry &
fundamental matrix
The epipolar geometry
epipolar geometry demo
C C’ x x’ and X are coplanar C,C ,x,x and X are coplanar
The epipolar geometry
What if only C C’ x are known?
What if only C,C ,x are known?
The epipolar geometry
All points on project on l and l’
All points on project on l and l
The epipolar geometry
Family of planes and lines l and l’ intersect at e Family of planes and lines l and l intersect at e and e’
The epipolar geometry
epipolar pole
= intersection of baseline with image plane
epipolar geometry demo
= intersection of baseline with image plane
= projection of projection center in other image
epipolar plane = plane containing baseline epipolar plane plane containing baseline
epipolar line = intersection of epipolar plane with image
The fundamental matrix F
C C’
p R p’
C
T=C’-C
T Rp' p
Two reference frames are related via the extrinsic parameters
T Rp p
The fundamental matrix F T
Rp' p
0 0 z y
T T
T T T
Multiply both sides by pT
T
0 0
x y
x z
T T
T T T
T p p T Rp' T
p
T
p p
T
p
p
T Rp'
p
T0
0 '
Ep
p
essential matrix T Rp
p
0
0 Ep p
The fundamental matrix F 0 '
Ep p p p
Let M and M’ be the intrinsic matrices, then
x M
p
1p ' M '
1x ' 0
) ' ' ( )
( M
1x )
E ( M
1x ) 0 ( M x E M x
0 ' '
1
M EM x
x
0 '
Fx
x
fundamental matrixThe fundamental matrix F
• The fundamental matrix is the algebraic representation of epipolar geometry representation of epipolar geometry
Th f d t l t i ti fi th diti
• The fundamental matrix satisfies the condition that for any pair of corresponding points x↔x’
i th t i
in the two images
0 Fx'
x
TFx 0 x
Tl 0
x x l 0
The fundamental matrix F
F is the unique 3x3 rank 2 matrix that satisfies xTFx’=0 for all x↔x’
for all x↔x
1. Transpose: if F is fundamental matrix for (P,P’), then FT is fundamental matrix for (P’,P)
2 Epipolar lines: l F ’ & l’ FT 2. Epipolar lines: l=Fx’ & l’=FTx
3. Epipoles: on all epipolar lines, thus eTFx’=0, x’
eTF=0, similarly Fe’=0
e F 0, similarly Fe 0
4. F has 7 d.o.f. , i.e. 3x3-1(homogeneous)-1(rank2)
5. F is a correlation, projective mapping from a point x to a line l=Fx’ (not a proper correlation, i.e. not invertible)
The fundamental matrix F
• It can be used for
– Simplifies matching
– Allows to detect wrong matchesAllows to detect wrong matches
Estimation of F — 8-point algorithm
• The fundamental matrix F is defined by
0 '
Fx x
for any pair of matches x and x’ in two images.
f11 f12 f13
• Let x=(u,v,1)T and x’=(u’,v’,1)T,
21 22 23
13 12 11
f f f
f f f
f f f F
f31 f32 f33
each match gives a linear equation
0 '
' '
' '
' f11uv f12uf13vu f21vv f22vf23u f31v f32 f33 uu
8-point algorithm
11
f f
1
´
´
´
´
´
´ 13
12 1
1 1 1 1 1 1 1 1 1 1
1
f f f v
u v v v u v u v u u u
1 0
´
´
´
´
´
´
22 21 2
2 2 2 2 2 2 2 2 2 2 2
1 1 1 1 1 1 1 1 1 1 1 1
f v f
u v v v u v u v u u u
1
´
´
´
´
´
´
31 23
f
v f u v v v u v u v u u
un n n n n n n n n n n n
33 32
f f
33
f
• In reality, instead of solving , we seek f to minimize subj. . Find the vector Af 0
Af j f 1
corresponding to the least singular value.
8-point algorithm
• To enforce that F is of rank 2, F is replaced by F’ that minimizes subject to F F' det F' 0 F that minimizes subject to . F F' det F' 0
• It is achieved by SVD. Let , where y F UΣV, let
1
0 0
0 0
Σ
0 0
0 0 Σ'
1
, let
3 2
0 0
0 0
Σ
0 0 0
0 0
Σ' 2
then is the solution. F' UΣ'V
8-point algorithm
% Build the constraint matrix
A = [x2(1,:)‘.*x1(1,:)' x2(1,:)'.*x1(2,:)' x2(1,:)' ...[ ( , ) ( , ) ( , ) ( , ) ( , ) x2(2,:)'.*x1(1,:)' x2(2,:)'.*x1(2,:)' x2(2,:)' ...
x1(1,:)' x1(2,:)' ones(npts,1) ];
[U,D,V] = svd(A);
% Extract fundamental matrix from the column of V
% corresponding to the smallest singular value.p g g F = reshape(V(:,9),3,3)';
% E f k2 t i t
% Enforce rank2 constraint [U,D,V] = svd(F);
F = U*diag([D(1 1) D(2 2) 0])*V';
F = U diag([D(1,1) D(2,2) 0]) V ;
8-point algorithm
• Pros: it is linear, easy to implement and fast
C ibl i
• Cons: susceptible to noise
Problem with 8-point algorithm
11
f f
1
´
´
´
´
´
´ 13
12 1
1 1 1 1 1 1 1 1 1 1
1
f f f v
u v v v u v u v u u u
1 0
´
´
´
´
´
´
22 21 2
2 2 2 2 2 2 2 2 2 2 2
1 1 1 1 1 1 1 1 1 1 1 1
f v f
u v v v u v u v u u u
10000 10000 100 10000 10000 100 100 ~100 1
1
´
´
´
´
´
´
31 23
f
v f u v v v u v u v u u
un n n n n n n n n n n n
~10000 ~10000 ~100 ~10000 ~10000 ~100 ~100 ~100 1
!
Orders of magnitude difference
between column of data matrix 33
32
f f
! between column of data matrix
least-squares yields poor results
33
f
Normalized 8-point algorithm
1. Transform input by , 2 C ll 8 i b i
i
i Tx
xˆ xˆ'i Tx'i ˆ'
ˆ ˆ
2. Call 8-point on to obtain 3.
i i x x ˆˆ , T
F T F 'Τ ˆ
F
0
Fx x' Fx 0 x
ˆ 0 ˆ
T '
FT
1x x'
Fˆ
Normalized 8-point algorithm
normalized least squares yields good results
T f i [ 1 1] [ 1 1]
(700,500)
(0,500) (-1,1) (1,1)
2
Transform image to ~[-1,1]x[-1,1]
( , )
( , ) ( , ) ( , )
500 1
2 1 700 0
2
(0,0)
1
(0,0) (700,0) (-1,-1) (1,-1)
Normalized 8-point algorithm
[x1, T1] = normalise2dpts(x1);
[x2, T2] = normalise2dpts(x2);
A = [x2(1,:)‘.*x1(1,:)' x2(1,:)'.*x1(2,:)' x2(1,:)' ...
x2(2,:)'.*x1(1,:)' x2(2,:)'.*x1(2,:)' x2(2,:)' ...
[ , ] p ( );
x1(1,:)' x1(2,:)' ones(npts,1) ];
[U D V] svd(A);
[U,D,V] = svd(A);
F = reshape(V(:,9),3,3)';
F reshape(V(:,9),3,3) ; [U,D,V] = svd(F);
F = U*diag([D(1,1) D(2,2) 0])*V';
% Denormalise F = T2'*F*T1;
Normalization
function [newpts, T] = normalise2dpts(pts) c = mean(pts(1:2,:)')'; % Centroid
newp(1,:) = pts(1,:)-c(1); % Shift origin to centroid.p( , ) p ( , ) ( ); g newp(2,:) = pts(2,:)-c(2);
meandist = mean(sqrt(newp(1,:).^2 + newp(2,:).^2));
scale = sqrt(2)/meandist;
T = [scale 0 -scale*c(1) 0 scale -scale*c(2) 0 0 1 ];
t T* t newpts = T*pts;
RANSAC
repeat
select minimal sample (8 matches) compute solution(s) for F
determine inliers
until (#inliers,#samples)>95% or too many times ( , p ) y compute F based on all inliers
Results (ground truth)
Results (8-point algorithm) Results (normalized 8-point algorithm)
Structure from motion Structure from motion
Structure from motion
Unknown Unknown Unknown Unknown camera camera viewpoints viewpoints
structure for motion: automatic recovery of camera motion and scene structure from two or more images. It is a self calibration technique and called automatic camera tracking calibration technique and called automatic camera tracking or matchmoving.
Applications
• For computer vision, multiple-view shape reconstruction novel view synthesis and reconstruction, novel view synthesis and autonomous vehicle navigation.
F fil d ti l i ti f CGI
• For film production, seamless insertion of CGI into live-action backgrounds
Matchmove
example #1 example #2 example #3 example #4
Structure from motion
2D feature
tracking 3D estimation optimization (bundle adjust)
geometry fitting
SFM pipeline SFM pipeline
Structure from motion
• Step 1: Track Features
Detect good features Shi & Tomasi SIFT – Detect good features, Shi & Tomasi, SIFT – Find correspondences between frames
• Lucas & Kanade-style motion estimationy
• window-based correlation
• SIFT matching
KLT tracking
http://www ces clemson edu/~stb/klt/
http://www.ces.clemson.edu/ stb/klt/
Structure from Motion
• Step 2: Estimate Motion and Structure
Si lifi d j ti d l [T i 92]
– Simplified projection model, e.g., [Tomasi 92]
– 2 or 3 views at a time [Hartley 00]
Structure from Motion
• Step 3: Refine estimates
“B dl dj t t” i h t t – “Bundle adjustment” in photogrammetry – Other iterative methods
Structure from Motion
• Step 4: Recover surfaces (image-based triangulation silhouettes stereo ) triangulation, silhouettes, stereo…)
Good mesh Good mesh
Factorization methods Factorization methods
Problem statement
Notations
• n 3D points are seen in m views
( 1) 2D i i
• q=(u,v,1): 2D image point
• p=(x,y,z,1): 3D scene point
• : projection matrix
• : projection function
• : projection function
• qij is the projection of the i-th point on image j
j ti d th f
• ij projective depth of qij
) ( p
q
ij (
jp
i)
(x y z)(x/z y/z)q
(x,y,z)(x/z,y/z)ij z
Structure from motion
• Estimate and to minimize
jp
i) );
( ( log )
, , , , , (
1 1 1
1 j i ij
m
j n
i ij n
m p p w P Π p q
Π
Π
j
otherwise
j in view visible
is if 0
1 i
ij
w p
otherwise
j 0
• Assume isotropic Gaussian noise, it is reduced top ,
2 1
1, , , , , ) ( )
( j i ij
m n
wij Π p q p
p Π
Π
1 1 1
1, , , , , ) ( )
( j i ij
j i
ij n
m p p w Π p q
Π
Π
• Start from a simpler projection model
• Start from a simpler projection model
Orthographic projection
• Special case of perspective projection
Di t f th COP t th PP i i fi it – Distance from the COP to the PP is infinite
Image World
– Also called “parallel projection”: (x, y, z) → (x, y)
SFM under orthographic projection
2D image Orthographic projection
incorporating 3D rotation 3D scene
image offset point incorporating 3D rotation 3D scene
point
offset
t Π t Πp q
211 23 31 21 2 23 31 21
• Trick
– Choose scene origin to be centroid of 3D pointsg p – Choose image origins to be centroid of 2D points – Allows us to drop the camera translation:Allows us to drop the camera translation:
Πp
q p
q
factorization (Tomasi & Kanade)
projection of n features in one image:
n 3 3
n 2
2
n
1 2 n2
1 q q p p p
q
1 1 12
11
q q qn Π
projection of n features in m images
32
1 2 2
22 21
1 12
11
n n
n
p p
Π p q
q
q
n 3 3
n 2m 2m
2 1
qm qm qmn Πm
3 n 2m
2m
W
measurementM
motionS
shapeKey Observation: rank(W) <= 3
Factorization
3 3 2
2
W M S
known solve for
• Factorization Technique
n 3 3 m 2 n
2m
Factorization Technique
– W is at most rank 3 (assuming no noise)
– We can use singular value decomposition to factor W:
33 2
W
2M ' S '
We can use singular value decomposition to factor W:
n 33 m n 2
2m
– S’ differs from S by a linear transformation A:
) )(
( '
' S MA AS
M
W
1– Solve for A by enforcing metric constraints on M
Metric constraints
• Orthographic Camera
R f th l
T 01 10
– Rows of are orthonormal:
• Enforcing “Metric” Constraints
0 1
– Compute A such that rows of M have these properties
M A
M ' A M M '
Trick (not in original Tomasi/Kanade paper, but in followup work)
• Constraints are linear in AAT:
T T T
T A A G where G AA
' ' ' '
1 0
0 1
• Solve for G first by writing equations for every iin M
• Then G = AATby SVD (since U = V)
0 1
Then G AA by SVD (since U V)
Factorization with noisy data
M S E
W
n 2m 33 n 2m n2m
• SVD gives this solutiong
– Provides optimal rank 3 approximation W’ of W
'
W E
W
n 2m n 2m n 2mW
W
E
• Approach
– Estimate W’, then use noise-free factorization of W’
as before
– Result minimizes the SSD between positions of image features and projection of the reconstruction
Results Extensions to factorization methods
• Projective projection Wi h i i d
• With missing data
• Projective projection with missing data
Bundle adjustment Bundle adjustment
Levenberg-Marquardt method
• LM can be thought of as a combination of steepest descent and the Newton method steepest descent and the Newton method.
When the current solution is far from the correct one the algorithm behaves like a correct one, the algorithm behaves like a
steepest descent method: slow, but guaranteed to converge When the current solution is close to converge. When the current solution is close to the correct solution, it becomes a Newton’s method
method.
Nonlinear least square
find try to ,
ts measuremen of
set a
Given x
Here minimal is
distance squared
that the so
vector parameter
best
the p
T
).
ˆ ( with ˆ,
Here, minimal.
is distance
squared
p x
x
x f
Levenberg-Marquardt method
Levenberg-Marquardt method
• μ=0 → Newton’s method
d h d
• μ→∞ → steepest descent method
• Strategy for choosing μ
– Start with some small μStart with some small μ
– If error is not reduced, keep trying larger μ until it does
– If error is reduced, accept it and reduce μ for the next iteration
Bundle adjustment
• Bundle adjustment (BA) is a technique for simultaneously refining the 3D structure and simultaneously refining the 3D structure and camera parameters
It i bl f bt i i ti l
• It is capable of obtaining an optimal
reconstruction under certain assumptions on
i d l F G i
image error models. For zero-mean Gaussian image errors, BA is the maximum likelihood
ti t estimator.
Bundle adjustment
• n 3D points are seen in m views
i h j i f h i h i i j
• xij is the projection of the i-th point on image j
• ajj is the parameters for the j-th camera
• bi is the parameters for the i-th point
• BA attempts to minimize the projection error
• BA attempts to minimize the projection error
Euclidean distance
predicted projection Euclidean distance
Bundle adjustment
Bundle adjustment
3 views and 4 points
Typical Jacobian
Block structure of normal equation Bundle adjustment
Bundle adjustment
Multiplied by
Issues in SFM
• Track lifetime
N li l di i
• Nonlinear lens distortion
• Degeneracy and critical surfaces
• Prior knowledge and scene constraints
• Multiple motions
• Multiple motions
Track lifetime
every 50th frame of a 800-frame sequencey q
Track lifetime
lifetime of 3192 tracks from the previous sequencep q
Track lifetime
track length histogramg g
Nonlinear lens distortion
Nonlinear lens distortion
effect of lens distortion
Prior knowledge and scene constraints
add a constraint that several lines are parallelp
Prior knowledge and scene constraints
add a constraint that it is a turntable sequenceq
Applications of matchmove
Applications of matchmove
Jurassic park 2d3 boujou
Enemy at the Gate, Double Negative
2d3 boujou
Enemy at the Gate, Double Negative
Photo Tourism
VideoTrace
http://www.acvt.com.au/research/videotrace/
Video stabilization
Project #3 MatchMove
• It is more about using tools in this project
Y h i h lib i
• You can choose either calibration or structure from motion to achieve the goal
• Calibration
• Voodoo/Icarus
• Examples from previous classes #1 #2
• Examples from previous classes, #1, #2
References
• Richard Hartley, In Defense of the 8-point Algorithm, ICCV, 1995.
• Carlo Tomasi and Takeo Kanade,,Shape and Motion from Image p g Streams: A Factorization Method, Proceedings of Natl. Acad. Sci., 1993.
Manolis Lourakis and Antonis Argyros The Design and
• Manolis Lourakis and Antonis Argyros, The Design and
Implementation of a Generic Sparse Bundle Adjustment Software Package Based on the Levenberg-Marquardt Algorithm, FORTH- ICS/TR 320 2004
ICS/TR-320 2004.
• N. Snavely, S. Seitz, R. Szeliski,Photo Tourism: Exploring Photo Collections in 3D, SIGGRAPH 2006.,
• A. Hengel et. al.,VideoTrace: Rapid Interactive Scene Modelling from Video, SIGGRAPH 2007.