CHAPTER 4 A PROGRESSIVE OCTREE CONSTRUCTION METHOD WITH A
4.5 Analytical Analysis on the Performance of the Proposed Method
In this section we analyze the performance of the proposed progressive method theoretically. Before introducing the lemmas, we define the definitions and notations first:
Definition:
Assume N silhouette images, I0…IN-1, are used to construct the octree model for the object.
Let
(i) Oi be the i-th octant projected onto the images and applied test intersection against the silhouette image in the octree construction method;
(ii) ci be the center (or centroid) of the 8 projected octant vertices of Oi; (iii) ri be the radius of the bounding circle of the projected octant image for Oi; (iv) di be the distance value at the circle center ci in the distance map DistMap defined in Section 2;
Also let
priority be the i-th octant in the priority queue which has the below property:
(ii) Vm be the constructed octree model after examining m octants in the octree structure against the silhouette images I0 to IN-1 and
be the constructed octree model V
ueue priority_q
Vm ,Vmqueue,Vmstack
m from the octree construction methods using the priority queue, queue and stack data structure to store the octants to be processed, respectively;
(iii) Sm be the set of m octants examined in the octree structure against to the silhouette images and be the set Sm with octants retrieved from the stack, queue and priority queue data structure respectively;
Lemma 4.1: If , are the volumes generated by the octree construction methods implem ue and the priority queue, respectively, when m octants have been processed, then XOR( ≥ XOR( . Proof:
Let and be the volumes generated by performing octant projection and intersection test on the set of octants and
respectively.
(1) If and are identical then XOR( ) = XOR(
holds.
(2) If and are not identical then there exists at least one octant different between
Assume Oi ∈ and Oi ∉ , Oj ∈ and Oj ∉ , ri and rj are
stack
Sm , Squeuem , Spriority_qm ueue
queue
Vm Vmpriority_queue
ented with the que
queue
Vm ) Vmpriority_queue)
queue
Vm Vmpriority_queue
queue
Sm Spriority_qm ueue ,
queue
Vm Vmpriority_queue Vmqueue Vmpriority_queue)
queue
Vm Vmpriority_queue
queue
Sm and Spriority_qm ueue.
queue
Sm Spriority_qm ueue Spriority_qm ueue Squeuem
the radii of the bounding circles of the octant projection images for Oi and Oj. The rest of the octants in and re the same. Then the subdivision of Oi and
Oj will affect the diffe and XOR( . In other
words, let be the octree models generated by examining the octants in S -{ Oj } and S -{ Oi }, respectively. Since
-{ Oj }= -{ Oi }, we have XOR( )=XOR( ). Now we only need to analyze the influence of the subdivision of octant O j to . Fig. 4.15 is the depiction of the relation among the projection images Oi a and the silhouette images.
queue
Sm Spriority_qm ueuea
rence between XOR(Vmqueue) Vmpriority_queue)
ueue
Bounding circle for octant Oi and Oj
Object interior Object interior
Bounding circle for octant Oi and Oj
ri=rj rj<ri
(a) (b)
Fig. 4.16. The relation of radii of the bounding circles for octant Oi and Oj and their
( )
Oi error _ projmax_ and max_ proj_error
( )
Oj .Since Octants in are the breadth-first traversal of the octree, if Oj is not the same octant as the O , then the level of Oi should be smaller than or equal to the level
Since Oi is not in Spriority_qm ueue , it implies that max_proj_error
( )
Oi <The relations among and ri could
be
(i) max_proj_error
( )
Oi < max_proj_error( )
Oj < ri(=rj)The subdivisions of Oi and Oj will produce black or grey sub-octants which will not affect XOR( ) and XOR( ).
The subdivisions of Oj will produce white octants and cut off half of the error. Thus the XOR( ) will be less then the XOR( ).
Both the subdivision of Oi and Oj will produce white octants and cut off half of the error induced by the octant Oi and Oj. Thus XOR( ) and XOR( ) reduce equally.
(b) If level of Oi is less then the level of Oj, then ri ≥ 2*rj
The subdivision of Oi and Oj will produce black or grey sub-octants which will not affect XOR( ) and XOR( ).
The subdivisions of Oj will produce white octants and cut off half of the error. Thus the XOR( ) will be less then the XOR( ).
The subdivisions of Oj will produce white octants and cut off half of the error. Thus the XOR( ) will be less then the XOR( ).
Lemma 4.1: If are the volumes generated by the octree construction methods implem ue and the priority queue, respectively,
when m ≥ XOR( .
Proof:
Lemma 4.2 can be proved in a similar way as in the proof of Lemma 4.1.
Lemma 4-3: Let , and
be the queues after processing m grey octants based on the projecti given in the conventional and the two new construction m
previously. Also let , and
be the queue lengths of ,
and , respectively.
Then for any given projection error bound p
≤
≤ .
stack
Vm , Vmpriority_queue
ented with the que
octants have been processed, , then XOR(Vmstack) Vmpriority_queue)
ueue(m)
priority_q , (m)priority_queue1P priority_queue2P(m) on error definitions
ethods introduced h(m)
ueue_lengt
priority_q priority_queue_length1P(m) (m)
h ueue_lengt
priority_q 2P priority_queue(m)
(m) ueue
priority_q 1P priority_queue2P(m)
(m) h ueue_lengt
priority_q 2P (m)priority_queue_length1P h(m)
priority_q 1P priority_queue_length(m)
(a) While m=1, the first octant, the root octant, is projected onto the images and examined by the intersection test. If the root octant is partially intersected with the object silhouette images, then it is subdivided into 8 smaller sub-octants.
Let n be the number of sub-octants of the root octant which are grey and d be the number of sub-octants which are grey and their is smaller than P. Then
It follows that
(1)
(b) Suppose ≤ sustains
priority_q 1P priority_queue_length(k) 1) (k h ueue_lengt
priority_q 1P +
1)
priority_q priority_queue1P(k)
(i) Assuming Oi =Oj:
Let n be the number of sub-octants of Oi which are grey and d be the number of sub-octants which are grey and their is smaller than P then the length of
, after subdividing O
( )
Opriority_q priority_queue1P(m) i are:
1) h(k ueue_lengt
priority_q + =priority_queue_length(k)-1 + n and 1)
(k h ueue_lengt
priority_q 1P + =priority_queue_length1P(k)-1+ n-d.
Since priority_queue_length1P(k)≤priority_queue_length(k), it follows that 1) is the maximum value in the queue. This implies that is empty.
priority_q 1P ≤ for any projection error
bound p.
h(m) ueue_lengt priority_q
(2) We can also prove that ≤
by induction. It is similar to the proof for
≤ .
priority_q 1P priority_queue_length(m)
4.6 Conclusion
In this chapter the octree construction method using the best first traversal scheme with a priority queue is presented. It is an extension to the first method and the second method which construct the octree model of an object progressively. The octants to be processed are arranged in the order of maximum projection error. The octant having maximum projection error is always the one first to be processed.
The reconstruction quality in terms of XOR error of the octree projection image with the silhouette image has been shown better than the octree construction method using the breadth first traversal of depth first traversal. Besides, we also show that the memory requirement for the priority queue becomes less if we can avoid inserting octants with tolerable projection errors into the priority queue by incorporating the first method and the second octant color classification criteria into the proposed progressive method.
CHAPTER 5
A NOVEL 3D PLANAR OBJECT RECONSTRUCTION FROM MULTIPLE UNCALIBRATED IMAGES USING THE
PLANE-INDUCED HOMOGRAPHIES
5.1 Introduction
In the physical world (especially the man-made world) planar surfaces such as walls, windows, table, roof, road, and terrace can be found in the indoor as well as the outdoor scenes. Our task is to reconstruct the 3D planar surfaces in a scene from multiple uncalibrated images taken by a camera placed at different viewpoints. In general, the methods for 3D projective or uncalibrated reconstruction [33][36][37][40][41][50] are point-based. They estimate the fundamental matrix from a sufficient number of corresponding point pairs first, and then derive the epipole and the canonical geometric representation for projective views using the fundamental matrix. Then, for each pair of corresponding points, they use a triangulation technique or bundle adjustment technique to compute the 3D point coordinates in the projective space. Finally, for the determination of the uncalibrated planar scene structure [34][39][46][47][49][54][56][60], the 3D points found are fitted by planes. However, it is desirable to derive the 3D planar scene structure in terms of plane features in the images directly, for these features are more reliable than the point or line features [49].
The estimation of the 3D projective planar structure based on the projected plane feature information exclusively has not yet received much attention, although it is known that the corresponding projected plane regions in a pair of stereo images induce a homography. It is also known that homographies are useful to many other practical applications including:
(a) Fundamental matrix estimation or canonical projective geometry representation [48][49].
(a) 2D image mosaicing or view synthesis [43].
(a) Plane + parallax analysis [34][36][56].
(a) Planar motion estimation and ego-motion [45][54][60].
Recently, two methods have been proposed for the 3D projective reconstruction of planes and cameras. The first method assumes all planes are visible in all images and the second method assumes a reference plane is visible in all images [51][52]. In practice, it is not realistic to have all planes or even one plane visible in all images unless a very large ground plane is available. When there is no reference plane visible in all images, the reconstruction problem cannot be formulated within a common projective space and the reconstruction results will be inevitably obtained in different projective spaces.
We shall recover the 3D scene planar structure from the uncalibrated images using the plane-induced homographies without assuming that all planes or one plane must be seen in all images. To obtain the homographies, we must locate the projected regions of planar surfaces in the images. There are methods for detecting regions corresponding to planar surfaces in the image [55][44][59]. After the image regions of planar surfaces have been extracted, we use the Gabor filtering technique [57] to identify at least four point correspondences for every plane in the stereo images in order to obtain the initial value of the homography. Then we iteratively refine the homography based on a nonlinear minimization method given in [53]. Next, we use two homographies to compute the epipole and to find the compatible projection equations in terms of the estimated homography and an assigned plane coefficient vector of a reference plane, together with the estimated epipole. With the projection equations thus derived we then prove that the 3D equation of any other plane visible in the stereo images can be computed with respect to the reference plane equation as long as its homography is determined. Finally, we merge or integrate all reconstructed plane equations found in individual projective spaces within a common space through the coordinate (or space) transformations. Again, each required coordinate transformation matrix is expressed by the homography and plane coefficient vector information of two planes visible in the involved image pairs. Fig. 5.1 shows the flow diagram of our method.
Homography estimation for each image pair
Epipole computation from homographies
Plane equation computation for planes visible in each image pair
Integration of all plane equations under a common projective space Image sequence
Final 3D planar structure
Fig. 5.1. The flow diagram of our reconstruction method.
The remaining sections of the chapter are organized as follows. Section 2 is the preliminaries and mathematical notations for the projective reconstruction. Section 3 shows how the 3D equations of all planar surfaces visible in the stereo images can be determined from their homographies. Section 4 presents the integration of the reconstruction results obtained in different projective spaces through the coordinate transformations. Section 5 shows the estimation of the plane-induced homographies and the related epipole. Section 6 reports the experimental results on both the synthetic and real images. Section 7 is the concluding remarks.
5.2 Preliminaries and Mathematical Notations for Projective Reconstruction
Consider any two consecutive images (Ii, Ij) in an image sequence for reconstructing the visible planar surfaces. LetRi, tri
be the extrinsic parameters and be the 3x3 upper triangular intrinsic camera matrix of the i-th camera. Then the coordinates of a 3D point and its 2D projection point
in image Ii are related by a pinhole camera model [40][37][43]:
M i
T E E E
E =[x y z ]
pr
T i i
i =[u v ]
ur
[ ]
To represent the point in the projective space or the homogeneous coordinate system, we use the vectors with a tilde to denote the homogeneous coordinates of the 3D points and its image projection point such that u~i =[urTi 1]Tand~pE =[prTE 1]T. The symbol indicates an equality up to a non-zero scale in the homogeneous coordinate system.
Assume the world coordinate system is chosen to be the i-th camera coordinate system; namely, and
≅
Consider a plane , which does not pass through the optical center of the i-th camera (otherwise, its image will be degenerated into a line). Let its plane equation be
ΠA two projection equations (5-1) and (5-3), we can obtain a homography as follows [58][48][53]:
For an uncalibrated camera the intrinsic and extrinsic camera parameters in Eqs.
(5-1) and (5-3) cannot be estimated. We need to replace these two equations by some new parameters that can be estimated. This is done as follows:
LetAijbe rewritten as =
{
−1 − ~ i−1}
A similar formulation of Eq. (5-4) has been derived in [48]. In this way, the original Euclidean point ~pE becomes point T denoted by {~ }, which describes the projective geometry associated with images i pij and j.
projective space {~ }. Since the projective structure can only be determined up to a pij 4x4 non-singular projective matrix [43], the new plane coefficient vector
T
discussion on the values of
]T parameters including homography Aij , epipole e~j and plane coefficients
T
5.3 Reconstruction of All Visible Pl
In the new pr
involved in Eq. (5-4) are now all known.
Next, we shall describe how to obtain the projective reconstruction for the visible in the image pair (Ii, Ij) in the newly defined projective space
anes from a Given Image Pair
ojective space the projection equations become
[ ]
⎥Similarly, for any other plane ПB visible in (Ii, Ij) the induced homography between the plane regions in image pair (Ii, Ij) is expressed by
Next, we shall prove the fact that the relation between plane coefficient vectors of planes ПB and ПA is determined once their homographies and are found.
From above we have
Aij Bij
where ηrij =λAλe[brE −arE]Mi−1. We can apply the least-squares method to estimate the
be determ the two homographies and based on the fact that
Substituting into the above equation, we have
~ ]
In other words, the relationship between two plane coefficient vectors is given by:
⎟⎟
can be determined with respect to a~ once the planar homographyij is known.
Bij
5.4 Integration of Planes Reconstructed from Different Image Pairs
Next, we consider the integration of reconstructed planes obtained from different image pairs (Ii, Ij) and (Ij, Ik), which contain the projections of two commonly visible planes. We shall use the plane-based coordinate transformation method for integrating the reconstruction results defined in different spaces.
Let the 4x4 coordinate transformation matrix , mapping the points in the projective space { } to the points in the projective space { }, be defined by
Then, the plane coefficient vectors of a common plane, which are respectively defined in the two different projectiv
c~ij, c~jk
Thus, it requires the information of 5 common planes in the two different projective spaces in order to solve for the transformation matrix . It is usually not very practical to find five common planes in the image pairs.
On the other hand, the two respective 3x4 projection matrices associated with image Ij defined in the two projective spaces { } and { } are related directly by the matrix [38]. This relationship provides 11 linear equations in the 15 matrix elements in . Then, it is reduced to a need of two plane information to provide 6 additional linear equations to solve for the 15 unknowns. In the following we shall give a system of 24 linear equations using the information of two planes for solving for the 15 unknowns; the result will be more reliable.
From Eq. (4), we have
⎥⎦
Similarly, for the same point on planeΠB, but represented as p~ in the different jk projective space {~ }, we can relate it to the same 4x1 vector pjk
[
lju~Tj 0]
TbyAfter some algebraic manipulation, this can be reduced to
⎥⎥
Since this equality holds for all image points on plane ПB, it further implies
⎥⎦
This leads to a system of 12 linear equations in 16 unknowns: 15 from the matrix plus one from . Therefore, we need another system of equations provided by a second visible plane, say, ПG:
Combining the above two systems of equations, we have a total of 24 linear equations in 17 unknowns. Here we give a least squared solution by placing the two systems of equations in the following form
⎥⎥
where the ratio of λB λG
can find the m
has been estimated during the plane reconstruction phase (see Eq. (5-7)). We atrix using the pseudo-inverse matrix of the 4x6 matrix on the left-hand side of the above equation.
5.5 Computation of Homographies
We need to estimate from the image data associated with the planar surface ПA. We shall use the region-based matching, instead of point-based matching, to find the homography. First of all, we use the Gabor filtering technique [57] to identify at least four point correspondences in order to obtain the initial solution of the homography. We then use the Levenberg-Marquardt iterative nonlinear minimization algorithm [53] to minimize the sum of the squared intensity differences of the transformed and original image points due to the plane ПA in the image pair
Hijk
Here the transformed location [ , 1]T is obtained from the image point using an estimated and
)'
j u v is the intensityobtained by a bilinear interpolation from the original image Ij. The intensity values of the image points in the common region of the two images are normalized to remove the possible illumination difference. The above minimization method converges in a few iterations.
After finding two homographies, recall that we can compute the epipole~ from ej the skew symmetry property of B
[ ]
ej ×AT ~ . Also, in turn, we can compute the fundamental matrix F using the epipole~ as foej llows:
[ ]
ej ×A5.6 Experimental Results
Experiment 1:
In the first experiment we use a synthetic tower whose feature points and schematic diagram are given in Table 5.1 and Fig. 5.2. We take a sequence of six pictures to cover all aspects of the tower using a virtual camera looking down from the upper positions. The image resolution is 640x480 in pixel. Three consecutive images of the sequence, I1, I2, and I3, are shown in Fig. 5.3. We apply the reconstruction process to this data set. We employ a linear least-squares method based on eight corresponding image point pairs available in the synthetic data to get the true homography for each of the five planes, ΠGr, ΠA, ΠB, ΠE, ΠF visible in the pair (I1, I2).
In addition, to handle the possible problems caused by data translation and scaling change, we also use the normalization transform proposed by Hartley [42] to compute the homographies. We choose ΠGr as the reference plane. During the reconstruction process, we find the plane coefficient vectors with respect to the reference plane ΠGr
vector designated as To check the correctness of the final 3D projective reconstruction result, we convert the 3D camera centered projective space back to the 3D object centered Euclidean space using the 3D Euclidean data of the tower available in Table 5.1 to measure the reconstruction errors in the metric space. The computation times for estimating the plane coefficient vectors and the coordinate transformation matrix for space integration are within a second.
]T
1 1 1 1
[ .
1 9 2
8 7
5 6 11 12
9 4
13 17
15 16 14
ΠGr
ΠA
ΠB
ΠF
ΠE
3
10
18
19 20
23
22 21
24
z y
x
Fig. 5.2. The schematic diagram of the tower. The dimensions of the tower are 40 inches in depth (the x-direction), 40 inches in width (the z direction) and 180 inches in height (the y direction).
(a) (b) (c)
Fig. 5.3. Three distinct images I1, I2 and I3 taken at a distance of about 500 inches.
The visible planes in the three images are ΠGr, ΠA, ΠB, ΠE, ΠF in I1 and I2, and ΠGr, ΠB, ΠF, ΠC, ΠG in I3.
Table 5.1. 3D object centered coordinates of the tower feature points.
Point 1 2 3 4 5 6 7 8 9 10 11 12
X -20 -20 -20 -20 -20 -20 -20 -20 -20 -20 -20 -20
Y 0 0 120 120 50 50 70 70 80 80 110 110
Z -20 20 20 -20 -10 10 10 -10 -15 15 15 -15
Point 13 14 15 16 17 18 19 20 21 22 23 24
X -13.333 -13.333 -6.667 -6.667 0 0 -50 -70 -50 0 50 70
Y 140 140 160 160 180 0 0 0 0 0 0 0
Z -13.333 13.333 -6.667 6.667 0 -70 -50 0 50 70 50 0
In what follows we assume the noise is uniformly distributed over the interval [-R, R], where R indicates the noise strength or level. We generate 500 copies of noisy image data using the given noise model with R = 0.5, 1.0, 1.5, 2.0 pixels,
In what follows we assume the noise is uniformly distributed over the interval [-R, R], where R indicates the noise strength or level. We generate 500 copies of noisy image data using the given noise model with R = 0.5, 1.0, 1.5, 2.0 pixels,