Surface Layout - Object-based Segmentation

Chapter 3. 3D Image Construction from 2D Image

3.2. Object-based Segmentation

3.2.3. Surface Layout

Fig. 3.8. Surface layout [27]. On these images and elsewhere, main class labels are indicated by colors (green=support, red=vertical, blue=sky) and subclass labels are indicated by markings (left/up/right arrows for planar left/center/right, ‘O’ for porous, ‘X’ for solid).

The surface layout method proposed in [27] labels the image with geometric classes that coarsely describe the 3D scene orientation of each image region, as shown in Fig. 3.8.

Every region in the image is categorized into one of three main classes: “support”, “vertical”, and “sky”. Support surfaces are parallel to the ground and could potentially support a solid object. Vertical surfaces are solid surfaces that are too steep to support an object. The sky is the image region corresponding to the open air and clouds. The vertical class is further divided into five subclasses: planar surfaces facing to the “left”, “center”, or “right” of the viewer, and non-planar surfaces that are either “porous” or “solid”.

We believe that the surface layout representation provides useful information for detecting objects in the image. Fig. 3.9 shows the stages of the surface layout algorithm. First, the image is partitioned into many superpixels, and cues are computed for each superpixel. To obtain better results, multiple segmentations are used, so a same-label likelihood is computed for pairs of superpixels and serves as the cost information for merging them into segments. After the multiple segmentations are produced, a homogeneity likelihood is computed for each segment and is used to determine whether the segment is homogeneous. A label likelihood is also computed for each segment and superpixel to determine which category the segment belongs to. Finally, Bayes' theorem is applied to the label likelihood and the homogeneity likelihood to compute the label confidence for each superpixel. We briefly describe these stages in the following sections.

Fig. 3.9. Flow of the surface layout.

3.2.3.1 Superpixels

The use of superpixels improves the computational efficiency of our algorithm and allows complex statistics to be computed, which enhances our knowledge of the image structure. Unlike the original algorithm in [34], we adopt our initial segmentation as the superpixels.

3.2.3.2 Cues computation

To determine which orientation is most likely, we need to use all of the available cues: location, color, texture, and perspective. Table 3.1 lists the set of statistics used for classification.

Table 3.1. Statistics computed to represent superpixels [27]

Surface Cues

Location
L1. Location: normalized x and y, mean
L2. Location: normalized x and y, 10th and 90th pctl
L3. Location: normalized y wrt estimated horizon, 10th, 90th pctl
L4. Location: whether segment is above, below, or straddles estimated horizon
L5. Shape: number of superpixels in segment
L6. Shape: normalized area in image

Color
C1. RGB values: mean
C2. HSV values: C1 in HSV space
C3. Hue: histogram (5 bins)
C4. Saturation: histogram (3 bins)

Texture
T1. LM filters: mean absolute response (15 filters)
T2. LM filters: histogram of maximum responses (15 bins)

Perspective
P1. Long Lines: (number of line pixels)/sqrt(area)
P2. Long Lines: percent of nearly parallel pairs of lines
P3. Line Intersections: histogram over 8 orientations, entropy
P4. Line Intersections: percent right of image center
P5. Line Intersections: percent above image center
P6. Line Intersections: percent far from image center at 8 orientations
P7. Line Intersections: percent very far from image center at 8 orientations
P8. Vanishing Points: (num line pixels with vertical VP membership)/sqrt(area)
P9. Vanishing Points: (num line pixels with horizontal VP membership)/sqrt(area)
P10. Vanishing Points: percent of total line pixels with vertical VP membership
P11. Vanishing Points: x-pos of horizontal VP - segment center (0 if none)
P12. Vanishing Points: y-pos of highest/lowest vertical VP wrt segment center
P13. Vanishing Points: segment bounds wrt horizontal VP
P14. Gradient: x, y center of mass of gradient magnitude wrt segment center
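As an illustration of how a few of the statistics in Table 3.1 might be computed, the following minimal sketch evaluates cues L1, C1, and C3 for every superpixel. It is not the implementation of [27]. The function name `superpixel_cues`, the input layout (nested lists of RGB tuples in [0, 1] plus a label map of the same shape), and the normalization of coordinates by image width and height are assumptions made for this example.

```python
import colorsys
from collections import defaultdict


def superpixel_cues(image, labels, hue_bins=5):
    """Compute cues L1, C1 and C3 for every superpixel (illustrative only).

    image  : H x W nested list of (r, g, b) tuples with values in [0, 1]
    labels : H x W nested list of integer superpixel ids
    """
    height, width = len(image), len(image[0])

    # Group pixels by superpixel id: id -> list of (x, y, rgb).
    pixels = defaultdict(list)
    for y in range(height):
        for x in range(width):
            pixels[labels[y][x]].append((x, y, image[y][x]))

    cues = {}
    for sp, members in pixels.items():
        n = len(members)
        # L1: mean x and y, normalized by image width and height (assumed).
        mean_x = sum(x for x, _, _ in members) / (n * width)
        mean_y = sum(y for _, y, _ in members) / (n * height)
        # C1: mean RGB value.
        mean_rgb = tuple(
            sum(rgb[c] for _, _, rgb in members) / n for c in range(3)
        )
        # C3: hue histogram, normalized so that the bins sum to one.
        hist = [0.0] * hue_bins
        for _, _, (r, g, b) in members:
            hue = colorsys.rgb_to_hsv(r, g, b)[0]  # hue lies in [0, 1]
            hist[min(int(hue * hue_bins), hue_bins - 1)] += 1.0 / n
        cues[sp] = {"L1": (mean_x, mean_y), "C1": mean_rgb, "C3": hist}
    return cues
```

The remaining cues (texture filter responses, line and vanishing-point statistics) need additional image processing and are omitted from this sketch.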

3.2.3.3 Same-label Likelihoods

Same-label likelihoods are learned from training images. The same-label classifier outputs an estimate of $P(y_i = y_j \mid I)$ for the adjacent superpixels $i$ and $j$ and image data $I$. Here $y_i$ and $y_j$ are the superpixel labels. The same-label classifier is based on cue set L1, L6, C1-C4, and T1-T2 in Table 3.1. Table 3.2 lists the set of statistics used for computing same-label likelihoods.

Table 3.2. Statistics computed over pairs of superpixels

Boundary cues

Location
the absolute differences of the pixel location values x and y

Color
C1. the absolute differences of the mean RGB
C2. the absolute differences of the mean HSV
C3. the symmetrized Kullback-Leibler divergence of the hue
C4. the symmetrized Kullback-Leibler divergence of the saturation

Texture
T1. the absolute differences of the mean LM filter response
T2. the symmetrized Kullback-Leibler divergence of the texture histogram

Shape
S1. the ratio of the areas
S2. the fraction of the boundary length divided by the perimeter of the smaller superpixel
S3. the straightness of the boundary
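The location, color, and histogram statistics in Table 3.2 reduce to two operations: an element-wise absolute difference and a symmetrized Kullback-Leibler divergence. The sketch below shows one way to compute them. The function names, the dictionary keys, the smoothing constant `eps`, and the choice of the un-halved sum KL(p||q) + KL(q||p) as the symmetrized form are assumptions for this example, not details taken from [27].

```python
import math


def symmetrized_kl(p, q, eps=1e-10):
    """Symmetrized KL divergence, here taken as KL(p||q) + KL(q||p).

    p and q are histograms of equal length. eps is an assumed smoothing
    constant that keeps empty bins from producing log(0).
    """
    p = [v + eps for v in p]
    q = [v + eps for v in q]
    sum_p, sum_q = sum(p), sum(q)
    p = [v / sum_p for v in p]
    q = [v / sum_q for v in q]
    # KL(p||q) + KL(q||p) simplifies to sum((p - q) * log(p / q)).
    return sum((a - b) * math.log(a / b) for a, b in zip(p, q))


def abs_diff(u, v):
    """Element-wise absolute difference of two equal-length vectors."""
    return [abs(a - b) for a, b in zip(u, v)]


def pairwise_cues(cue_i, cue_j):
    """Pairwise statistics for two superpixels (subset of Table 3.2).

    Each argument is a dict with hypothetical keys "L1" (mean x, y),
    "C1" (mean RGB) and "C3" (hue histogram).
    """
    return {
        "location": abs_diff(cue_i["L1"], cue_j["L1"]),
        "C1": abs_diff(cue_i["C1"], cue_j["C1"]),
        "C3": symmetrized_kl(cue_i["C3"], cue_j["C3"]),
    }
```

Under these assumptions, identical histograms give a divergence of zero and the value grows as the two superpixels differ more in appearance, which is the behavior a same-label classifier can exploit.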

3.2.3.4 Multiple Segmentations

The increased spatial support of superpixels provides much better classification performance than individual pixels. Larger regions are required to use the more complex cues effectively. We therefore compute multiple segmentations and use the increased spatial support provided by each segment to better evaluate its quality. The segmentation method is based on the pairwise same-label likelihoods. A diverse sampling of segmentations is produced by varying the number of segments $n$ and using a random initialization.
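The idea of sampling several segmentations from pairwise same-label likelihoods can be sketched as follows. This is a simplified greedy scheme written for illustration and is not claimed to be the exact procedure of [27]: the function names, the use of the mean likelihood as the region affinity, and the treatment of pairs without an estimate as zero are all assumptions.

```python
import random


def sample_segmentation(num_superpixels, same_label, n_segments, seed=None):
    """Group superpixels into n_segments regions (one sampled segmentation).

    same_label maps a sorted pair (i, j) of adjacent superpixels to the
    estimated probability that they share a label.
    """
    rng = random.Random(seed)
    order = list(range(num_superpixels))
    rng.shuffle(order)  # random initialization

    # Seed each region with one randomly chosen superpixel.
    regions = [[sp] for sp in order[:n_segments]]

    def affinity(sp, region):
        # Mean same-label likelihood between sp and the region's members.
        # Pairs with no estimate (assumed non-adjacent) contribute zero.
        scores = [same_label.get((min(sp, m), max(sp, m)), 0.0) for m in region]
        return sum(scores) / len(scores)

    # Greedily attach each remaining superpixel to its best region.
    for sp in order[n_segments:]:
        best = max(regions, key=lambda region: affinity(sp, region))
        best.append(sp)

    assignment = [0] * num_superpixels
    for region_id, region in enumerate(regions):
        for sp in region:
            assignment[sp] = region_id
    return assignment


def multiple_segmentations(num_superpixels, same_label, segment_counts, seed=0):
    """Produce one segmentation per requested number of segments."""
    return [
        sample_segmentation(num_superpixels, same_label, n, seed + k)
        for k, n in enumerate(segment_counts)
    ]
```

Varying both the segment count and the seed gives different groupings of the same superpixels, so a poor grouping in one segmentation can be compensated by a better one in another.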

3.2.3.5 Label Likelihood Computation

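The label classifier distinguishes among the main classes and the subclasses, and it is based on all of the listed cues. The label classifier outputs an estimate of $P(y_j = v \mid \mathbf{x}_j, h_{ji})$ for the segment $j$, where $y_j$ is the segment label, $v$ is a candidate label, $\mathbf{x}_j$ denotes the cues of the segment, and $h_{ji}$ denotes the hypothesis that segment $j$, which contains superpixel $i$, is homogeneous.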

3.2.3.6 Homogeneity Likelihood Computation

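The homogeneity classifier determines whether a segment has a single label or is mixed, and it is based on all of the listed cues. The homogeneity classifier outputs an estimate of $P(h_{ji} \mid \mathbf{x}_j)$ for the segment $j$, where $\mathbf{x}_j$ denotes the cues of the segment and $h_{ji}$ denotes the hypothesis that segment $j$, which contains superpixel $i$, is homogeneous.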

Fig. 3.10. The result of the confidence images for each of the surface labels.

3.2.3.7 Label Confidences Computation

In the final stage, we compute the label confidence for each superpixel using the following formula:

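$$C(y_i = v \mid \mathbf{x}) = \sum_{j} P(y_j = v \mid \mathbf{x}_j, h_{ji}) \, P(h_{ji} \mid \mathbf{x}_j), \tag{3.10}$$

where the sum runs over the segments $j$ that contain superpixel $i$, $P(y_j = v \mid \mathbf{x}_j, h_{ji})$ is the label likelihood of segment $j$, and $P(h_{ji} \mid \mathbf{x}_j)$ is its homogeneity likelihood.

Fig. 3.10 shows the resulting confidence images for each of the surface labels.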
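The final combination step can be sketched in a few lines. The sketch assumes that the confidence of label v for a superpixel is the sum, over all segments containing that superpixel, of the segment's label likelihood for v multiplied by the segment's homogeneity likelihood. The function name, the data layout, and the final normalization to unit sum are assumptions made for this example.

```python
def label_confidences(segments, label_likelihood, homogeneity,
                      num_superpixels, labels):
    """Combine per-segment likelihoods into per-superpixel confidences.

    segments         : list of segments, each a list of superpixel ids,
                       pooled over all sampled segmentations
    label_likelihood : label_likelihood[j][v] is the label likelihood of
                       segment j for label v
    homogeneity      : homogeneity[j] is the homogeneity likelihood of
                       segment j
    """
    conf = [{v: 0.0 for v in labels} for _ in range(num_superpixels)]

    # Accumulate label likelihood weighted by homogeneity likelihood.
    for j, members in enumerate(segments):
        for i in members:
            for v in labels:
                conf[i][v] += label_likelihood[j][v] * homogeneity[j]

    # Normalize each superpixel's confidences to sum to one (assumed step).
    for c in conf:
        total = sum(c.values())
        if total > 0:
            for v in c:
                c[v] /= total
    return conf
```

For example, a superpixel covered by one segment that favors "support" but is unlikely to be homogeneous, and by another that favors "vertical" and is very likely homogeneous, ends up with a higher confidence for "vertical".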

Source document: 2D-to-3D Image Conversion Using Object-based Segmentation (pp. 42-48)