Chapter 3. 3D Image Construction from 2D Image
3.2. Object-based Segmentation
3.2.3. Surface Layout
Fig. 3.8. Surface layout [27]. On these images and elsewhere, main class labels are indicated by colors (green=support, red=vertical, blue=sky) and subclass labels are indicated by markings (left/up/right arrows for planar left/center/right, ‘O’ for porous, ‘X’ for solid).
The surface layout method proposed in [27] labels the image with geometric classes that coarsely describe the 3D scene orientation of each image region, as shown in Fig. 3.8.
Every region in the image is categorized into one of three main classes: “support”, “vertical”, and “sky”. Support surfaces are parallel to the ground and could potentially support a solid object. Vertical surfaces are solid surfaces that are too steep to support an object. The sky is the image region corresponding to the open air and clouds. The vertical class is further divided into five subclasses: planar surfaces facing to the “left”, “center”, or “right” of the viewer, and non-planar surfaces that are either “porous” or “solid”.
We believe that the surface layout representation provides useful information for detecting objects in the image. Fig. 3.9 shows the stages of the surface layout algorithm. First, the image is partitioned into many superpixels, and cues are computed for each superpixel. To obtain better results, multiple segmentations are used, so a same-label likelihood is computed to serve as the cost for merging superpixels into segments. After the multiple segmentations are generated, a homogeneity likelihood is computed for each segment and is used to determine whether the segment is homogeneous. A label likelihood is also computed for each segment to determine which category the segment belongs to. Finally, the label likelihood and the homogeneity likelihood are combined to compute the label confidence for each superpixel. We briefly describe each stage in the following sections.
Fig. 3.9. Flow of the surface layout.
3.2.3.1 Superpixels
The use of superpixels improves the computational efficiency of our algorithm and allows more complex statistics to be computed, which improves our knowledge of the image structure. Unlike the original algorithm in [34], we use our initial segmentation as the superpixels.
3.2.3.2 Cues computation
To determine which orientation is most likely, we use all of the available cues: location, color, texture, and perspective. Table 3.1 lists the set of statistics used for classification.
Table 3.1. Statistics computed to represent superpixels [27]
Surface Cues
Location
L1. Location: normalized x and y, mean
L2. Location: normalized x and y, 10th and 90th pctl
L3. Location: normalized y wrt estimated horizon, 10th, 90th pctl
L4. Location: whether segment is above, below, or straddles estimated horizon
L5. Shape: number of superpixels in segment
L6. Shape: normalized area in image
Color
C1. RGB values: mean
C2. HSV values: C1 in HSV space
C3. Hue: histogram (5 bins)
C4. Saturation: histogram (3 bins)
Texture
T1. LM filters: mean absolute response (15 filters)
T2. LM filters: histogram of maximum responses (15 bins)
Perspective
P1. Long Lines: (number of line pixels)/sqrt(area)
P2. Long Lines: percent of nearly parallel pairs of lines
P3. Line Intersections: histogram over 8 orientations, entropy
P4. Line Intersections: percent right of image center
P5. Line Intersections: percent above image center
P6. Line Intersections: percent far from image center at 8 orientations
P7. Line Intersections: percent very far from image center at 8 orientations
P8. Vanishing Points: (num line pixels with vertical VP membership)/sqrt(area)
P9. Vanishing Points: (num line pixels with horizontal VP membership)/sqrt(area)
P10. Vanishing Points: percent of total line pixels with vertical VP membership
P11. Vanishing Points: x-pos of horizontal VP - segment center (0 if none)
P12. Vanishing Points: y-pos of highest/lowest vertical VP wrt segment center
P13. Vanishing Points: segment bounds wrt horizontal VP
P14. Gradient: x, y center of mass of gradient magnitude wrt segment center
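A few of the statistics in Table 3.1 can be sketched in code. The following is a minimal Python illustration, not the implementation from [27]: it computes L1, L2, C1, and C3 for one superpixel, assuming the image is a nested list of RGB tuples with values in [0, 1] and the superpixel map is a nested list of integer ids. The function names, the nearest-rank percentile rule, and the uniform hue binning are assumptions made for illustration only.

```python
import colorsys
import math


def percentile(values, q):
    """Nearest-rank percentile (q in [0, 100]) of a non-empty list (assumed rule)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def superpixel_cues(image, labels, sp_id, hue_bins=5):
    """Compute cues L1, L2, C1, and C3 for the superpixel with id sp_id.

    image  : H x W nested list of (r, g, b) floats in [0, 1]
    labels : H x W nested list of superpixel ids
    """
    height, width = len(image), len(image[0])
    xs, ys, pixels = [], [], []
    for row in range(height):
        for col in range(width):
            if labels[row][col] == sp_id:
                # Normalize coordinates to [0, 1].
                xs.append(col / (width - 1) if width > 1 else 0.0)
                ys.append(row / (height - 1) if height > 1 else 0.0)
                pixels.append(image[row][col])
    n = len(pixels)
    if n == 0:
        raise ValueError("superpixel id not present in the label map")

    # C1: mean RGB value of the superpixel.
    mean_rgb = tuple(sum(p[c] for p in pixels) / n for c in range(3))

    # C3: normalized hue histogram with uniform bins (binning is an assumption).
    hue_hist = [0.0] * hue_bins
    for r, g, b in pixels:
        hue = colorsys.rgb_to_hsv(r, g, b)[0]
        hue_hist[min(int(hue * hue_bins), hue_bins - 1)] += 1.0 / n

    return {
        "L1_mean_xy": (sum(xs) / n, sum(ys) / n),
        "L2_pctl_xy": (percentile(xs, 10), percentile(xs, 90),
                       percentile(ys, 10), percentile(ys, 90)),
        "C1_mean_rgb": mean_rgb,
        "C3_hue_hist": hue_hist,
    }
```

For a 2 x 2 image whose bottom row is pure green and forms one superpixel, calling `superpixel_cues(image, labels, 1)` places all of the hue mass in a single bin and reports a mean normalized y of 1.0. The texture and perspective cues require filter banks and line detection and are not covered by this sketch.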
3.2.3.3 Same-label Likelihoods
The same-label likelihoods are learned from training images. The same-label classifier outputs an estimate of $P(y_i = y_j \mid I)$ for adjacent superpixels $i$ and $j$ and image data $I$. Here $y_i$ and $y_j$ are the superpixel labels. The same-label classifier is based on cue sets L1, L6, C1-C4, and T1-T2 in Table 3.1. Table 3.2 lists the set of statistics used for computing same-label likelihoods.
Table 3.2. Statistics computed over pairs of superpixels
Boundary cues
Location
the absolute differences of the pixel location values x and y
Color
C1. the absolute differences of the mean RGB
C2. the absolute differences of the mean HSV
C3. the symmetrized Kullback-Leibler divergence of the hue
C4. the symmetrized Kullback-Leibler divergence of the saturation
Texture
T1. the absolute differences of the mean LM filter response
T2. the symmetrized Kullback-Leibler divergence of texture histogram
Shape
S1. the ratio of the area
S2. the fraction of the boundary length divided by the perimeter of the smaller superpixel
S3. the straightness of the boundary
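The color entries of Table 3.2 can be illustrated with a short Python sketch. It is an assumption-laden example rather than the code used in [27]: the symmetrized Kullback-Leibler divergence is taken here as the sum KL(p||q) + KL(q||p) (some authors use the average instead), a small smoothing constant `eps` is added to avoid taking the logarithm of zero, and the dictionary keys `mean_rgb` and `hue_hist` are hypothetical names.

```python
import math


def symmetrized_kl(p, q, eps=1e-10):
    """Return KL(p||q) + KL(q||p) for two histograms of equal length.

    eps is an assumed smoothing constant; both inputs are renormalized.
    """
    p = [x + eps for x in p]
    q = [x + eps for x in q]
    sum_p, sum_q = sum(p), sum(q)
    p = [x / sum_p for x in p]
    q = [x / sum_q for x in q]
    # KL(p||q) + KL(q||p) simplifies to sum((p - q) * log(p / q)).
    return sum((a - b) * math.log(a / b) for a, b in zip(p, q))


def color_boundary_cues(cues_i, cues_j):
    """Pairwise cues C1 and C3 for two superpixels (hypothetical dict layout)."""
    return {
        "C1_rgb_abs_diff": [abs(a - b) for a, b in
                            zip(cues_i["mean_rgb"], cues_j["mean_rgb"])],
        "C3_hue_sym_kl": symmetrized_kl(cues_i["hue_hist"], cues_j["hue_hist"]),
    }
```

Identical superpixels yield zero for both cues, and the divergence grows as the two hue histograms move apart, which is the behavior a same-label classifier would rely on.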
3.2.3.4 Multiple Segmentations
The increased spatial support of superpixels provides much better classification performance than individual pixels, but larger regions are required to use the more complex cues effectively. We therefore compute multiple segmentations and use the increased spatial support provided by each segment to better evaluate its quality. The segmentation method is based on the pairwise same-label likelihoods. A diverse sampling of segmentations is produced by varying the number of segments $n_s$ and using a random initialization.
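One way to picture this sampling is the greedy grouping sketched below in Python. It is only an illustration under stated assumptions, not the procedure of [27]: superpixels are shuffled, the first few seed the segments, and each remaining superpixel joins the segment with the highest mean log same-label likelihood. The callable `same_label_prob`, the function names, and the decision to ignore spatial adjacency are all simplifications introduced here.

```python
import math
import random


def sample_segmentation(num_superpixels, same_label_prob, n_s, rng):
    """Group superpixel ids 0..num_superpixels-1 into at most n_s segments.

    same_label_prob(i, j) is a hypothetical callable returning the pairwise
    same-label likelihood in [0, 1]. Adjacency is ignored in this sketch.
    """
    if n_s < 1:
        raise ValueError("n_s must be at least 1")
    order = list(range(num_superpixels))
    rng.shuffle(order)  # random initialization
    segments = [[sp] for sp in order[:n_s]]  # seed one segment per superpixel

    for sp in order[n_s:]:
        def affinity(segment):
            # Mean log-likelihood that sp shares a label with the members.
            logs = [math.log(max(same_label_prob(sp, m), 1e-12)) for m in segment]
            return sum(logs) / len(logs)

        max(segments, key=affinity).append(sp)
    return segments


def multiple_segmentations(num_superpixels, same_label_prob, segment_counts, seed=0):
    """Produce one segmentation per value of n_s in segment_counts."""
    rng = random.Random(seed)
    return [sample_segmentation(num_superpixels, same_label_prob, n_s, rng)
            for n_s in segment_counts]
```

Calling `multiple_segmentations(6, prob, [2, 3])` returns two partitions of the same six superpixels, one with two segments and one with three. Changing `seed` or the list of segment counts gives a different set of hypotheses.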
3.2.3.5 Label Likelihood Computation
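The label classifier distinguishes among the main classes and the subclasses, and it is based on all of the listed cues. The label classifier outputs an estimate of $P(y_j = v \mid I, h_{ji})$, the likelihood of label $v$, for the segment $h_{ji}$. Here $h_{ji}$ denotes the segment that contains superpixel $i$ in the $j$-th segmentation, and $y_j$ is the label of that segment.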
3.2.3.6 Homogeneity Likelihood Computation
The homogeneity classifier determines whether a segment has a single label or is mixed, and it is based on all of the listed cues. The homogeneity classifier outputs an estimate of $P(h_{ji} \mid I)$ for the segment $h_{ji}$.
Fig. 3.10. The result of the confidence images for each of the surface labels.
3.2.3.7 Label Confidences Computation
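In the final stage, we compute the label confidence for each superpixel using the following formula:
$C(y_i = v \mid I) = \sum_{j} P(y_j = v \mid I, h_{ji}) \, P(h_{ji} \mid I)$, (3.10)
where the sum runs over the multiple segmentations, $h_{ji}$ is the segment that contains superpixel $i$ in the $j$-th segmentation, $y_i$ is the superpixel label, $y_j$ is the segment label, $v$ is a possible label value, and $I$ is the image data. Fig. 3.10 shows the resulting confidence images for each of the surface labels.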
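The final combination can be sketched in a few lines of Python. The sketch assumes that the confidence of a label for a superpixel is the sum, over the segmentations, of the label likelihood of the containing segment weighted by that segment's homogeneity likelihood. The data layout (one dict per segmentation mapping superpixel ids to segment ids, plus nested dicts of classifier outputs) and all names are hypothetical and chosen for illustration.

```python
def label_confidences(assignments, label_likelihood, homogeneity, labels):
    """Combine per-segment classifier outputs into per-superpixel confidences.

    assignments         : list (one entry per segmentation) of dicts
                          superpixel id -> segment id
    label_likelihood[j] : dict segment id -> dict label -> likelihood
    homogeneity[j]      : dict segment id -> homogeneity likelihood
    labels              : iterable of label names
    """
    labels = list(labels)
    confidences = {}
    for j, assignment in enumerate(assignments):
        for sp, seg in assignment.items():
            per_label = confidences.setdefault(sp, {v: 0.0 for v in labels})
            for v in labels:
                # Label likelihood of the segment, weighted by its homogeneity.
                per_label[v] += label_likelihood[j][seg][v] * homogeneity[j][seg]
    return confidences
```

A segment judged unlikely to be homogeneous contributes little to the confidence of the superpixels it contains, so poor segmentations are down-weighted rather than discarded. The values are not normalized here; dividing by their sum over the labels would give a distribution per superpixel if one is needed.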