
Chapter 2 Model Reconstruction Using Octree Algorithm

2.4 Model Triangulation

After the 3D octree is constructed, the ON cubes of the octree diagram O form an approximate 3D model of the object, and the larger the depth R, the more accurate the octree diagram becomes. However, R is limited to about 7 by computational resources: memory use and computation time grow with the third power of the per-axis resolution, which doubles with every additional level of depth. With R limited in this way, triangulating the octree not only increases the accuracy of the model but also makes texture mapping much easier. Figure 2.4.1 shows the reconstruction of a sphere using a 3D octree of max depth 3. Although max depth 3 is an extreme case that would not be chosen in a real reconstruction system, it clearly shows the difference between the octree model and the triangulated model.

(a) (b) (c)

Figure 2.4.1 Reconstruction of a sphere using a 3D octree of max depth 3: (a) original model, (b) octree model, (c) triangulation model of (b)

A typical triangulation algorithm called “Marching Cubes”, which is also adopted in this thesis, is often applied to cube-based 3D model reconstruction such as the octree. The “Marching Cubes” algorithm is based on the fact that the real object surface intersects the edges of the ON cubes. By finding the intersection points cube by cube, triangles can be constructed from the geometric relationships of these intersection points. Figure 2.4.2 illustrates the 15 patterns for transforming a cube into triangles in the Marching Cubes algorithm.

Figure 2.4.2 The 15 patterns for transforming a cube into triangles in Marching Cubes. The dotted corners represent corners lying inside the surface [23]

Because each of the eight vertices of a cube may lie inside or outside the surface, there are 256 possible geometric relationships for a cube, which are difficult to express as mathematical equations. Hence, a general way to implement the Marching Cubes algorithm is to encode the inside/outside states of the cube vertices into a unique index number ranging from 0 to 255. The indexing scheme is shown in Figure 2.4.3. The table below the cube shows the correspondence between the eight bits and the vertices vi, i = 1, ..., 8, of the cube. Each bit of the index number is set to 1 if the corresponding vertex lies inside the surface and to 0 otherwise. The resulting index number is then used to look up the corresponding triangle structure, which is defined using the edge numbers ej, j = 1, ..., 12, in the look-up table.
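The index encoding described above can be sketched in a few lines of Python. This is only an illustration: the function name `cube_index` and the assumption that bit i-1 corresponds to vertex vi are hypothetical, and the actual bit order must follow Figure 2.4.3 and the look-up table in use.

```python
def cube_index(inside):
    """Pack eight inside/outside flags (v1..v8) into an index in 0..255.

    Assumption: bit i-1 of the index corresponds to vertex v_i. The real
    bit order has to match the look-up table that is used afterwards.
    """
    if len(inside) != 8:
        raise ValueError("a cube has exactly eight vertices")
    index = 0
    for bit, flag in enumerate(inside):
        if flag:  # vertex lies inside the surface
            index |= 1 << bit
    return index


# Example: only v1 and v4 lie inside the surface, so bits 0 and 3 are set.
flags = [True, False, False, True, False, False, False, False]
example_index = cube_index(flags)  # 1 + 8
```

An index of 0 or 255 means the cube lies entirely outside or entirely inside the surface, so no triangles are generated for it.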


Figure 2.4.3 Indexing scheme of Marching Cubes [23]

From Section 2.2.4, every ON cube has some vertices lying inside the object and others lying outside, so some edges of the ON cube intersect the object surface. Each of these edges has an inside vertex at one end and an outside vertex at the other, which means that an intersection point exists on the edge. The exact intersection points on these edges can be determined by geometric algorithms such as binary search [37].
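A minimal sketch of such a binary search along one edge is given below. The inside test is passed in as a function; `inside_unit_sphere` is a hypothetical stand-in used only for the example, since the actual inside/outside test of the reconstruction system is not reproduced here.

```python
def find_intersection(p_in, p_out, is_inside, iterations=20):
    """Bisect the edge between an inside vertex and an outside vertex.

    p_in, p_out: 3D tuples; is_inside: callable returning True for points
    inside the object (a placeholder for the system's real test).
    """
    a, b = p_in, p_out
    for _ in range(iterations):
        mid = tuple((x + y) / 2.0 for x, y in zip(a, b))
        if is_inside(mid):
            a = mid  # surface lies between mid and the outside end
        else:
            b = mid  # surface lies between the inside end and mid
    return tuple((x + y) / 2.0 for x, y in zip(a, b))


def inside_unit_sphere(p):
    """Hypothetical object: a sphere of radius 1 centred at the origin."""
    return sum(c * c for c in p) <= 1.0


hit = find_intersection((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), inside_unit_sphere)
```

Each iteration halves the interval that contains the surface, so 20 iterations locate the intersection to within about one millionth of the edge length.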

After the intersection points are obtained, the Marching Cubes algorithm is applied to construct the triangle mesh model. Figure 2.4.4 gives an example of a triangulation result of the Marching Cubes algorithm.

(a) (b)

Figure 2.4.4 Triangulation example[37]: (a) octree model, (b) triangulation model of (a)

Chapter 3 Object Extraction Using 3D Feature Points

Segmentation of the foreground and background of an image is crucial to many computer vision applications. Two existing, still developing families of algorithms are introduced in Section 3.1. Based on the structure of the proposed reconstruction system, a new algorithm is proposed to remove the background and extract the target object in the foreground. It utilizes the results of the camera tracking system mentioned in Section 2.1, including the camera poses and the 3D positions of the feature points. The two steps of the algorithm, the 3D foreground / background segmentation and the object mask generation, are explained in Sections 3.2 and 3.3.

3.1 Existing Image Segmentation Algorithms

3.1.1 Graphical Partitioning Active Contours

“Active Contour” [19, 36] is the most widely used algorithm for object extraction. An active contour, or snake, can be represented by (3.1.1).

$$v(s) = \big(x(s),\, y(s)\big), \quad s \in [0, 1]$$ (3.1.1)

where s is the index value of each point belonging to the active contour, and v(s) represents the position of the point indexed by s. The deformation of an active contour is controlled by an energy function defined by image information such as color and edge, as shown in (3.1.2) [32].

$$E_{snake} = \int_0^1 \Big( E_{internal}\big(v(s)\big) + E_{image}\big(v(s)\big) + E_{constraint}\big(v(s)\big) \Big)\, ds$$ (3.1.2)

where Einternal represents the smoothness energy defined by the elasticity and stiffness parameters, Eimage represents the energy defined by image information including color, texture and edge, and Econstraint represents the energy defined by information other than the image, such as an object shape pattern.
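As an illustration of how (3.1.2) can be evaluated on a sampled contour, the following sketch sums the three energy terms over the points of a closed snake. The discretisation, the function name `snake_energy`, and the concrete form of the internal energy (a first-difference term weighted by an elasticity parameter `alpha` plus a second-difference term weighted by a stiffness parameter `beta`) are assumptions made for the example; the text above only names the parameters.

```python
def snake_energy(points, image_energy, alpha=1.0, beta=1.0,
                 constraint_energy=lambda p: 0.0):
    """Approximate E_snake for a closed contour given as a list of (x, y).

    image_energy and constraint_energy are callables evaluated at each
    contour point; both are placeholders for the terms in (3.1.2).
    """
    n = len(points)
    total = 0.0
    for i in range(n):
        prev_pt = points[i - 1]
        pt = points[i]
        next_pt = points[(i + 1) % n]
        # First difference approximates v'(s) (elasticity term).
        d1 = (next_pt[0] - pt[0], next_pt[1] - pt[1])
        # Second difference approximates v''(s) (stiffness term).
        d2 = (next_pt[0] - 2 * pt[0] + prev_pt[0],
              next_pt[1] - 2 * pt[1] + prev_pt[1])
        internal = (alpha * (d1[0] ** 2 + d1[1] ** 2)
                    + beta * (d2[0] ** 2 + d2[1] ** 2))
        total += internal + image_energy(pt) + constraint_energy(pt)
    return total / n  # ds = 1/n for s in [0, 1]


# Example: a unit square with no image or constraint energy.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
energy = snake_energy(square, lambda p: 0.0, alpha=1.0, beta=0.0)
```

A minimisation procedure would move the contour points so that this value decreases, trading smoothness against the image term.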


iterative process of minimization is required. Many minimization processes for implementing active contours have been proposed. The most studied among them is the “Level set method” [29]. Instead of deforming the contour, the level set method assigns a height value c(x, y) to each point (x, y) in the image and uses the height value to determine the location of the contour according to (3.1.3). The height value c(x, y) is iteratively updated by calculating the effect of Esnake on point (x, y) until saturation, which means that no c(x, y) changes sign. When the iteration is completed, the contour of the target object is obtained.


The GPAC [31] algorithm was proposed recently for foreground/background segmentation of natural pictures, as shown in Figure 3.1.1(a) and 3.1.1(b). However, the result in Figure 3.1.2(b) shows that GPAC produces an unexpected segmentation of the test image in Figure 3.1.2(a). As Figure 3.1.1 suggests, the GPAC algorithm performs well when the foreground and background differ in color tone. When the color of the foreground object is similar to that of the background, the segmentation fails.

(a) (b)

Figure 3.1.1 GPAC segmentation results: (a) test image 1, (b) extracted foreground of (a)

(a) (b)

Figure 3.1.2 GPAC segmentation results: (a) test image 2, (b) extracted foreground of (a)

3.1.2 Image Segmentation

Unlike GPAC, image segmentation algorithms divide the image into several segments according to a segmentation condition, usually determined by color and edge information. The objective is that each segment contains pixels with similar features and is as distinct as possible from the other segments. The basic image segmentation algorithm is the watershed algorithm, which simply calculates the boundary of the region around every pixel that is a local minimum in color. The watershed algorithm always over-cuts the image and is therefore applied as a pre-process for other segmentation algorithms, as shown in Figure 3.1.3(b). Improvements to the watershed algorithm include merging similar segment areas and using more information, such as texture and edge, in the calculation. EDISON [8, 16] is a segmentation algorithm that implements these improvements, and its segmentation result is shown in Figure 3.1.3(c).

From the segmentation result of EDISON in Figure 3.1.3(c), it is clear that the over-cut problem remains unsolved for algorithms based on watershed. Hence, the graph-based image segmentation algorithm [12, 38] is introduced. Instead of image processing theory, this algorithm segments the image based on graph theory. The image to be segmented is represented as a graph G = (V, E), where each vi ∈ V represents a pixel in the image and each eij ∈ E represents an edge connecting two neighboring pixels vi and vj. A weight value wij = w(vi, vj) is calculated for every edge eij according to the dissimilarity of pixels vi and vj. With the weight values wij, the graph-based segmentation algorithm is able to segment the image into segments with edge sets E′k such that E′k ∩ E′l = Φ for k ≠ l, while satisfying the objective of image segmentation mentioned above. A graph-based segmentation proposed in [12] has been tested, and the result is illustrated in Figure 3.1.3(d).
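The graph construction described above can be sketched as follows. This is a simplified illustration, not the implementation of [12]: it assumes a grayscale image stored as a list of rows, 4-connected neighbors, and the absolute intensity difference as the dissimilarity weight, and the name `build_grid_graph` is hypothetical.

```python
def build_grid_graph(image):
    """Build the weighted edges of a 4-connected pixel graph.

    image: list of rows of grayscale values.
    Returns a list of (weight, pixel_a, pixel_b), pixels being (row, col).
    Assumption: weight = absolute intensity difference of the two pixels.
    """
    rows, cols = len(image), len(image[0])
    edges = []
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # edge to the right-hand neighbor
                w = abs(image[r][c] - image[r][c + 1])
                edges.append((w, (r, c), (r, c + 1)))
            if r + 1 < rows:  # edge to the neighbor below
                w = abs(image[r][c] - image[r + 1][c])
                edges.append((w, (r, c), (r + 1, c)))
    return edges


# Example: a dark region on the left and a bright column on the right.
image = [[10, 10, 200],
         [10, 12, 200]]
edges = build_grid_graph(image)
heaviest = max(edges)  # the edge crossing the dark/bright boundary
```

Edges inside a uniform region receive small weights, while edges crossing a region boundary receive large ones, which is the information a graph-based segmentation uses to decide where to cut.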

Although image segmentation algorithms perform well in dividing an image into image blocks, they cannot determine whether an image block belongs to the foreground or the background. It is impossible to piece image blocks together into a complete object image, since the algorithms provide no information about the geometry of the object. Therefore, image segmentation algorithms are not suitable for the reconstruction system either.

(a) (b)

(c) (d)

Figure 3.1.3 Image segmentation results: (a) test image, (b) result of watershed algorithm, (c) result of EDISON [16], (d) result of graph-based segmentation algorithm proposed in [12]

3.2 3D Foreground / Background Separation

3.2.1 Problem Analysis on Separation

To obtain an image sequence suitable for reconstruction, the sequence must be taken by a camera orbiting the target object, as shown in Figure 3.2.1. To satisfy this condition, the shooting environment should be arranged so that the target object is kept at a certain distance from the surrounding background. To reconstruct 3D information from the image sequence, the camera tracking system is used to recover the 3D camera pose and the 3D feature point positions. With the reconstructed 3D feature point positions, the separation between the target object and the background is also recovered by the camera tracking system. As a result, the reconstructed feature points are distributed in two groups separated by an empty gap. The group close to the camera is identified as the foreground, while the group far away from the camera is identified as the background. The average distance between points of the foreground is much smaller than that of the background, as illustrated in Figure 3.2.2. The objective of separation is to cluster the reconstructed feature points into the two groups, foreground and background.

Figure 3.2.1 Camera motion when capturing image sequence. The camera is always aiming at the object. Illustration generated using Maya PLE 8.5[2]


Figure 3.2.2 Top view of the distribution of the reconstructed 3D feature points of a frame. The circle and arrow at the bottom right represent the position and viewing direction of the camera, respectively. The smaller ellipse indicates the foreground points while the larger ellipse indicates the background points.

Although there is a clear separation between the two point groups, clustering algorithms such as K-means are not applicable for two reasons. First, most clustering algorithms attempt to separate 3D data using planes. However, the boundary between the foreground and background groups is more likely to be a sphere or an irregular shape than a plane, as shown in Figure 3.2.3. Second, it is unnecessary to find a separation surface that strictly divides the 3D feature points into foreground and background groups. Instead, background removal can be performed by an equivalent process, foreground extraction, which picks the foreground points out of the feature points.

Figure 3.2.3 Top view of the distribution of the reconstructed 3D feature points of the image sequence. The orange arrow indicates the orbit of the camera.

3.2.2 Proposed Foreground Extraction Algorithm

With the reconstructed camera position and feature point locations, an algorithm for extracting foreground points is proposed for this particular condition. Utilizing the distribution characteristic illustrated in Figure 3.2.2, the proposed algorithm first determines the initial two-point set of the foreground group by finding the two closest points of which at least one is visible to the camera. Visibility is determined by checking the included angle between two vectors, the camera viewing direction and the vector from the camera position to the feature point position, using (3.2.1).


position, the camera viewing direction, and the direction from the camera to the feature point, respectively. The point Pf is considered visible to the camera if θv is smaller than θc, which is usually the viewing angle of the camera but can also be set to a small angle, such as 10 degrees, to narrow down the range of the foreground object.
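The visibility test can be sketched as below. Since formula (3.2.1) is not legible in this copy, the sketch assumes the usual way of obtaining the included angle, the arccosine of the normalised dot product; the function name `is_visible` and the argument names (mirroring Pc, Dc, Pf and θc) are chosen for the example only.

```python
import math


def is_visible(p_c, d_c, p_f, theta_c_deg):
    """Return True if feature point p_f lies within theta_c of the view axis.

    p_c: camera position, d_c: camera viewing direction, p_f: feature point,
    theta_c_deg: angular threshold in degrees.
    Assumption: theta_v = arccos(d_c . d_f / (|d_c| |d_f|)), d_f = p_f - p_c.
    """
    d_f = [f - c for f, c in zip(p_f, p_c)]
    norm_c = math.sqrt(sum(x * x for x in d_c))
    norm_f = math.sqrt(sum(x * x for x in d_f))
    if norm_c == 0.0 or norm_f == 0.0:
        return False  # the angle is undefined for a zero-length vector
    cos_v = sum(a * b for a, b in zip(d_c, d_f)) / (norm_c * norm_f)
    cos_v = max(-1.0, min(1.0, cos_v))  # guard against rounding errors
    theta_v = math.degrees(math.acos(cos_v))
    return theta_v < theta_c_deg


# Example: camera at the origin looking along +z with a 10 degree threshold.
on_axis = is_visible((0, 0, 0), (0, 0, 1), (0, 0, 5), 10.0)
off_axis = is_visible((0, 0, 0), (0, 0, 1), (5, 0, 5), 10.0)  # 45 degrees off
```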

A classification process is then performed to iteratively integrate suitable unclassified feature points into the foreground group. An unclassified feature point is selected and integrated if its distance to any point of the foreground group is smaller than the threshold distance Thd, defined in (3.2.2).

$$Th_d = \frac{W}{d \cdot \operatorname{mean}(d_1, d_2)}$$ (3.2.2)

where W = min(ImageWidth, ImageHeight), d1 and d2 represent the distances from the camera to the two points of the initial point set, and d is the expected minimal number of feature points extracted from the 2D projection of the object surface.

From (3.2.2), Thd is proportional to the image resolution, since the pixel distance between two feature points becomes larger at higher image resolution. Thd is also inversely proportional to the distance from the camera to the foreground object, because the object appears smaller in the image and the feature points lie closer together when the object is farther away. Finally, Thd is inversely proportional to the expected minimal number of feature points, because the more feature points there are on an object surface, the closer together they are.
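The three proportionalities can be checked with a small numerical sketch. It assumes the simplest form consistent with the description, Thd = W / (d · mean(d1, d2)); this form and the function name `distance_threshold` are assumptions made for the example.

```python
def distance_threshold(image_width, image_height, d, d1, d2):
    """Threshold distance under the assumed form W / (d * mean(d1, d2)).

    d: expected minimal number of feature points on the object projection;
    d1, d2: camera distances of the two initial foreground points.
    """
    w = min(image_width, image_height)
    return w / (d * ((d1 + d2) / 2.0))


# Example values (hypothetical): a 640x480 image, 100 expected points,
# initial points at distances 4 and 6 from the camera.
base = distance_threshold(640, 480, 100, 4.0, 6.0)
farther = distance_threshold(640, 480, 100, 8.0, 12.0)  # object twice as far
denser = distance_threshold(640, 480, 200, 4.0, 6.0)    # twice as many points
```

Doubling either the object distance or the expected number of feature points halves the threshold, while doubling the smaller image dimension doubles it.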

The details of the proposed foreground extraction algorithm are presented in Table 3.2.1.

Table 3.2.1 Foreground Extraction Algorithm

Input

Minimal dimension of image width and height W, expected minimal number of feature points d, 3D camera position Pc, camera viewing direction Dc, camera viewing angle θc, and a set of 3D feature point positions P.

Output

Foreground point set F

Algorithm

1. Initialization: Determine the initial two-point set of the foreground group

2. Foreground Extraction: Iteratively integrate unclassified feature points into the foreground group

    WHILE points are still being added to F
        FOR EACH point Pi in F
            dmin := A_VERY_LARGE_NUMBER
            FOR EACH point Pj in P
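The iterative integration step can be sketched as a simple region-growing loop. This is an illustration under stated assumptions rather than the exact procedure of Table 3.2.1, whose remaining steps are not shown here: the seed pair and the threshold are passed in as given, and the names `extract_foreground`, `seed_pair` and `threshold` are hypothetical.

```python
import math


def extract_foreground(points, seed_pair, threshold):
    """Grow the foreground set F from a two-point seed.

    points: list of 3D tuples; seed_pair: indices of the two initial
    foreground points; threshold: the distance Th_d.
    Returns the sorted indices of the points classified as foreground.
    """
    foreground = set(seed_pair)
    unclassified = set(range(len(points))) - foreground
    added = True
    while added:  # repeat until a full pass adds no point
        added = False
        for j in sorted(unclassified):
            # Distance from point j to its closest foreground point.
            d_min = min(math.dist(points[j], points[i]) for i in foreground)
            if d_min < threshold:
                foreground.add(j)
                unclassified.discard(j)
                added = True
    return sorted(foreground)


# Example: four points clustered near the origin, two far away.
pts = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0),
       (1.4, 0.0, 0.0), (10.0, 0.0, 0.0), (10.3, 0.0, 0.0)]
fg = extract_foreground(pts, seed_pair=(0, 1), threshold=0.6)
```

The two distant points are never within the threshold of any foreground point, so they remain unclassified and are treated as background.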

3.3 Object Mask Generation

3.3.1 Problem Analysis on Mask Generation

To extract an object from an image, the most common approach is to apply an object mask to the image to remove the unnecessary part. The feature points belonging to the object are extracted by the foreground extraction algorithm proposed in Section 3.2, as shown in Figure 3.3.1(b). The next step is to generate a proper object mask from these feature points, for instance, to generate the object mask illustrated in Figure 3.3.1(c) from the feature points illustrated in Figure 3.3.1(b).

(a) (b)

(c) (d)

Figure 3.3.1 Object extraction operation: (a) original image, (b) extracted foreground points of (a), (c) ideal generated object mask of (a), (d) ideal extracted object of (a)

A characteristic of the feature points distributed along the edge of the object in the image can be observed in Figure 3.3.2(a). This distribution makes it extremely difficult to generate an object mask from the extracted foreground points, in two respects. First, the distances among feature points are irregular, which makes the determination of the object contour ambiguous, as shown in Figure 3.3.2(c). Second, the density of feature points varies with the object texture, which makes it ambiguous whether a sparse area is a hole or part of the surface, as shown in Figure 3.3.2(d).

(a) (b)

(c)

(d)

Figure 3.3.2 Ambiguity of determination: (a) the point set, (b) ideal contour of (a), (c) contour ambiguity of (a), (d) hole ambiguity of (a)

Several algorithms have been proposed for finding the shape that best fits a set of points. The simplest and most robust is the convex hull algorithm [9]. The convex hull algorithm generates the minimal convex contour containing all points. However,

Figure 3.3.3 Convex hull of Figure 3.3.2 (a)

To improve the performance in concave situations, the concave hull algorithm was proposed based on the convex hull algorithm [25]. Instead of looking for a global convex contour containing all points, the concave hull algorithm finds only local contours within a certain range. In theory, some concave parts can be preserved while all points are still contained by the contour. However, this introduces a new problem of determining the proper contour of the points, since there are various possible contour paths. Generally speaking, the concave hull is smoother with a larger range, as shown in Figure 3.3.4. Another issue besides the ambiguity is that the generated contour is affected by the range of the local contour and by the distribution of the points. The concave hull algorithm performs well only when the points are evenly distributed. As Figure 3.3.2(a) shows, the density of the points extracted in Section 3.2 varies considerably, and the result in Figure 3.3.5 shows a false contour caused by the point distribution.

(a) (b)

Figure 3.3.4 Ambiguity of concave hull[25]

Figure 3.3.5 Concave hull of Figure 3.3.2 (a)

Besides the above two algorithms, the alpha shape algorithm [7, 13] has been proposed, which not only preserves concave parts but also handles holes. Rather than using lines to determine the contour, the alpha shape algorithm uses circles of some radius r. If no point lies inside a circle, then the circle area is eliminated. After the elimination process, the remaining area represents the shape of the point set, as shown in Figure 3.3.6. Although the alpha shape algorithm seems capable of dealing with the problem at hand, it suffers from the same problem as the concave hull. The circle radius r plays an important role in determining the shape: a small r results in the misjudgment of holes, while a large r causes concave parts to be missed. As Figure 3.3.2(a) shows, the distribution of the points makes it almost impossible to determine a proper r for the algorithm. The result for the best r, obtained through a series of trial-and-error attempts, is shown in Figure 3.3.7(a). Although the result in Figure 3.3.7(a) is very close to the ideal contour shown in Figure 3.3.7(b), it is sensitive to r, which can currently be obtained only by trial and error.

Figure 3.3.6 Alpha shape illustration[7]


Figure 3.3.7 Alpha shape of Figure 3.3.2 (a)

3.3.2 Object Mask Generation Utilizing the Convex Hull Algorithm

Since object mask generation is applied throughout the image sequence, the most critical requirement is the stability of the algorithm. In other words, the algorithm must generate object masks with minimum error when applied to all images, not just one. Recalling the octree reconstruction process described in Chapter 2, the object model is reconstructed by eliminating the non-object part. When the model is reconstructed from object masks, a part that is over-defined in one mask may still be eliminated by other masks, whereas a part that is over-eliminated is removed from the model and cannot be recovered. Therefore, the major requirement for the mask generation algorithm is to generate masks that are over-defined rather than over-eliminated.
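The asymmetry between the two kinds of error can be illustrated with a deliberately simplified sketch. It treats each mask as the set of model cells that the mask keeps, so that carving reduces to a set intersection; this abstraction, the function name `carve`, and the cell numbers are assumptions made for the example and do not reproduce the octree procedure of Chapter 2.

```python
def carve(masks):
    """Keep only the cells that every mask marks as belonging to the object."""
    model = set(masks[0])
    for mask in masks[1:]:
        model &= set(mask)  # a cell rejected by any mask is removed for good
    return model


true_object = {1, 2, 3}
other_view = {1, 2, 3, 5}       # a second view with its own extra cell
over_defined = {1, 2, 3, 4}     # keeps the object plus an extra cell 4
over_eliminated = {1, 2}        # wrongly drops cell 3 of the object

recovered = carve([over_defined, other_view])
damaged = carve([over_eliminated, other_view])
```

The extra cells of the over-defined mask are removed by the other view, so the true object is recovered, while the cell dropped by the over-eliminated mask is missing from the result regardless of what the other views contain.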

The three algorithms described in Section 3.3.1 are now reviewed against this requirement. The alpha shape algorithm is inapplicable because there is no method for defining a proper circle radius r without human assistance. Trial and error is not suitable when images are processed one by one. If a fixed, predefined circle radius r is applied instead of optimizing it for each image, the generated object mask might be over-eliminated when holes are generated by misjudgment. The concave hull algorithm is also unsuitable, because it might misjudge concave parts and produce incomplete object masks. Both algorithms can over-eliminate the object mask, so neither is suitable for the situation concerned. Therefore, the convex hull algorithm is the only suitable choice for object mask generation.

The object mask generation step of the proposed background removal algorithm implements the gift wrapping algorithm [9], one of the convex hull algorithms. The main idea of gift wrapping is to wrap all points with a foldable line. Starting from the leftmost point p0, a line segment l01 is linked to the point p1 such that all other points lie to the right of l01. Then, starting from p1, a line segment l12 is linked to the next point p2 such that all other points lie to the right of l12, and so forth. The algorithm terminates when a line segment links back to the starting point p0, so that all points are wrapped inside a polygon composed of the line segments l01, l12, …, lm0 for some m.
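A minimal sketch of gift wrapping is given below. It tests whether a point lies to the left or right of a segment with the sign of a 2D cross product instead of comparing angles, which is a substitution made for brevity; the function names are hypothetical, and "right" is meant in a coordinate system whose y axis points up (in image coordinates with y pointing down the orientation is mirrored).

```python
def cross(o, a, b):
    """Positive if b lies to the left of the directed line o -> a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def dist2(a, b):
    """Squared distance, used to break ties between collinear points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def gift_wrap(points):
    """Return the convex hull vertices, starting from the leftmost point."""
    pts = sorted(set(points))  # leftmost point first
    if len(pts) < 3:
        return pts
    hull = []
    start = current = pts[0]
    while True:
        hull.append(current)
        candidate = pts[0] if pts[0] != current else pts[1]
        for p in pts:
            if p == current:
                continue
            turn = cross(current, candidate, p)
            # If p is left of current -> candidate, the candidate segment
            # does not keep all points on its right, so switch to p.
            # Among collinear points, prefer the farthest one.
            if turn > 0 or (turn == 0 and
                            dist2(current, p) > dist2(current, candidate)):
                candidate = p
        current = candidate
        if current == start:  # wrapped back to the starting point
            break
    return hull


# Example: a square with one interior point.
hull = gift_wrap([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)])
```

The interior point never becomes a hull vertex, and the hull is traversed clockwise from the leftmost point, so every other point lies to the right of each segment.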

The concept and execution of the gift wrapping algorithm are simple, except for the step of determining whether all points lie to the right of a line segment. Instead of searching all possible line segments, the determination can be simplified by utilizing the angle relationship of the line segment sequence. By setting a vertical pseudo line passing through the leftmost point for the initial judgment, the line segment with the minimum clockwise included angle to the previous line segment makes all points lie to its right. Hence, the gift wrapping algorithm is transformed into linking line segments in sequence, as shown in Figure 3.3.8(a). In fact, the judgment can be further simplified into the equivalent problem of finding the line segment with the minimum included angle to the extension of the previous line segment, as shown in
