The proposal is mainly built on two relatively simple ideas: 1) to make more effective use of the object's known approximate location and shape to narrow the search area, and 2) to make more effective use of the detected edges interior to the search area.
To facilitate algorithm development and the following discussion, we assume the use of change detection to roughly delineate the moving objects, but other techniques can also be employed. Change detection finds image areas that exhibit significant changes from one video frame to another. Normally, the result (termed the “change detection mask,” or CDM for short) will consist of pixels from both the moving objects and the background. For edge detection, we employ the Canny edge detector.
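As a rough illustration of the kind of CDM assumed here, the following sketch thresholds the absolute difference between two grayscale frames. It is only one simple form of change detection and is not necessarily the detector used in our experiments; the function name `change_detection_mask` and the default `threshold` value are assumptions made for this example.

```python
def change_detection_mask(prev_frame, curr_frame, threshold=20):
    """Return a binary CDM from two equally sized grayscale frames.

    Frames are lists of rows of intensity values. A pixel is marked 1
    when its absolute intensity change exceeds `threshold`
    (a hypothetical default chosen for illustration).
    """
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]
```

A mask produced this way typically contains holes and stray background pixels, which is why the subsequent steps do not use it directly as the object mask.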
Our edge linking method consists of two stages: mask sketch and mask refinement. In the former we define the outer perimeter of the area that contains an object of interest, and in the latter we refine the estimated object boundary. They are discussed in separate subsections below. The overall procedure is illustrated in Fig. 2-1.
Fig. 2-1: The first proposed method of video segmentation.
1. Mask Sketch
We illustrate the procedure using the arbitrary CDM example shown in Fig. 2-2(a), where pixels in the CDM are marked in gray.
To start, we round out the CDM to obtain a solid region. In addition to defining the maximum support of the object, this step serves two functions. First, it makes the edges that lie inside the CDM's outline but are not part of the CDM available for subsequent edge-based processing. Second, it stops one- and two-pixel-wide “cracks” in the CDM. More than one way exists to obtain this result; one of them is described in [21]. To conserve space, we omit the details here. For the arbitrary CDM example of Fig. 2-2(a), we obtain Fig. 2-2(b) as the result.
Fig. 2-2: Arbitrary example illustrating the proposed algorithm. (a) CDM. (b) Result of CDM round-out. (c) Result of CDM round-out with edge pixels therein marked in black. (d) Result of segmental row scans. (e) Result of segmental column scans. (f) Result of boundary tightening with edge pixels in the mask marked in black and other pixels in gray.
Next, we tighten the boundary of the rounded CDM by working with the edge pixels therein. Note that each row of pixels in the rounded CDM may consist of more than one connected segment, and likewise each column. We do “segmental orthogonal scans” as follows. First, for each connected horizontal segment that contains two or more edge pixels, we connect the two that are farthest apart. Then we regard the boundary pixels in the result as virtual edge pixels and, for each connected segment in each column of the rounded CDM, we again connect the two edge pixels that are farthest apart. This completes the boundary tightening step. For the above example, let the edge pixels in the rounded CDM be as shown in black in Fig. 2-2(c). Then the resulting pixel maps after the segmental horizontal and vertical scans are as illustrated in Figs. 2-2(d) and (e), respectively. To further appreciate the effects of these scans, Fig. 2-2(f) shows the result again, with the edge pixels marked in black and the others in gray. Comparing it with Fig. 2-2(c), we see that the mask is indeed tightened to match the edge contours better. While the technique of orthogonal scans may look much like that of maximal scan, the segmental nature leads to a very different result.
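The segmental orthogonal scans can be sketched as follows. This is a minimal illustration under stated assumptions, not our actual implementation: masks and edge maps are lists of rows of 0/1 values, segments with fewer than two edge pixels are dropped (the text above does not specify their treatment), and the virtual edge pixels are taken to be the pixels of the row-scan result that have a four-connected neighbor outside it. All function names are hypothetical.

```python
def runs(line):
    """Yield (start, end) of each maximal run of nonzero values, end exclusive."""
    start = None
    for i, value in enumerate(line):
        if value and start is None:
            start = i
        elif not value and start is not None:
            yield start, i
            start = None
    if start is not None:
        yield start, len(line)


def scan_rows(mask, marks):
    """For each connected row segment of `mask`, fill between its two
    outermost marked pixels; segments with fewer than two marks stay empty."""
    out = [[0] * len(row) for row in mask]
    for y, row in enumerate(mask):
        for start, end in runs(row):
            hits = [x for x in range(start, end) if marks[y][x]]
            if len(hits) >= 2:
                for x in range(hits[0], hits[-1] + 1):
                    out[y][x] = 1
    return out


def transpose(grid):
    return [list(column) for column in zip(*grid)]


def boundary(region):
    """Pixels of `region` with a four-connected neighbor outside it."""
    h, w = len(region), len(region[0])

    def inside(y, x):
        return 0 <= y < h and 0 <= x < w and region[y][x]

    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return [
        [1 if region[y][x] and not all(inside(y + dy, x + dx) for dy, dx in steps)
         else 0 for x in range(w)]
        for y in range(h)
    ]


def tighten(mask, edges):
    """Row scans on the real edges, then column scans on the virtual edges."""
    row_result = scan_rows(mask, edges)
    virtual_edges = boundary(row_result)
    col_result = transpose(scan_rows(transpose(mask), transpose(virtual_edges)))
    return row_result, col_result
```

Because each scan is confined to a connected segment of the rounded CDM, a gap in a row or column of the mask is never bridged, which is where the behavior departs from a maximal scan over the whole row or column.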
2. Mask Refinement
Now that we have bounded the outer perimeter of the object, we proceed to refine the estimated object boundary. For this we employ Dijkstra's shortest-path algorithm to find and to link up the outermost edges in the boundary-tightened mask.
Figure 2-3 illustrates how our algorithm works using a section of the mask shown in Fig. 2-2(f), the result of mask sketch. Consider, for example, edge linking between points A and B shown in Fig. 2-3(a). The algorithm considers all edge pixels on the boundary of the mask as belonging to the object boundary. First, all non-edge boundary pixels in the mask are identified. In Fig. 2-3(a), the non-edge boundary pixels between points A and B are marked by cross-hatching. Next, we stop all edge gaps around the mask boundary that are only one pixel wide. This is done by examining each non-edge boundary pixel: if two of its orthogonal four-connected neighbors are edge pixels, it is declared to be an edge pixel.
Fig. 2-3(b) shows the result for the example, where all pixels on the mask boundary (between A and B) that are now considered to belong to the object boundary are marked in black. The others remain cross-hatched. For clarity, in this figure we omit the black marking of the edge pixels that are not on the mask boundary. Note that the above gap-stopping method is in effect a shortest-path algorithm over one-pixel gaps, but with lower complexity than the general Dijkstra algorithm. In the example, we are now left with two edge discontinuities between A and B, defined by the pixel pairs (a,b) and (c,B), respectively.
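The one-pixel gap-stopping rule can be sketched as below. The sketch makes two assumptions that the text does not spell out: “two of its neighbors” is read as “at least two,” and all decisions are made against the original edge map, so a newly declared edge pixel does not influence its neighbors in the same pass. The name `stop_one_pixel_gaps` is hypothetical.

```python
NEIGHBOURS4 = ((1, 0), (-1, 0), (0, 1), (0, -1))


def stop_one_pixel_gaps(mask, edges):
    """Return a copy of `edges` in which every non-edge boundary pixel of
    `mask` with at least two four-connected edge neighbors is set to 1."""
    h, w = len(mask), len(mask[0])

    def get(grid, y, x):
        # Pixels outside the image count as background / non-edge.
        return grid[y][x] if 0 <= y < h and 0 <= x < w else 0

    out = [row[:] for row in edges]
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or edges[y][x]:
                continue  # only non-edge pixels inside the mask are examined
            on_boundary = any(not get(mask, y + dy, x + dx) for dy, dx in NEIGHBOURS4)
            edge_neighbours = sum(
                1 for dy, dx in NEIGHBOURS4 if get(edges, y + dy, x + dx)
            )
            if on_boundary and edge_neighbours >= 2:
                out[y][x] = 1
    return out
```

Each pixel is visited once and only its four neighbors are inspected, so this pass is linear in the mask size.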
Fig. 2-3: Illustration of the mask refinement method. (a) Zoomed-in section of the mask sketch result for illustration use. (b) After stopping of one-pixel gaps. (c) Respective search areas of shortest-path algorithm for edge discontinuities (a,b) and (c,B). (d) Final result of edge linking between A and B.
The algorithm continues by considering separately each remaining edge discontinuity along the mask boundary. For each discontinuity, we search in the mask for the shortest path that bridges it, where each edge pixel in the mask is given a smaller equivalent length and each non-edge pixel a larger equivalent length. To control the computational complexity, we may limit the search area to a band around the mask's boundary. Let Dw be the bandwidth in number of pixels. For example, Fig. 2-3(c) illustrates the two search areas for the edge discontinuities (a,b) and (c,B), respectively, with Dw = 5. After executing Dijkstra's shortest-path algorithm over the two search areas separately, we obtain the final result shown in Fig. 2-3(d) for edge linking between A and B.
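The bridging of a single edge discontinuity can be sketched as follows. This is an illustrative sketch rather than our implementation: it uses a heap-based variant of Dijkstra's algorithm, four-connected moves, and placeholder equivalent lengths of 1 for edge pixels and 10 for non-edge pixels. These values, the connectivity, the band definition (mask pixels fewer than Dw four-connected steps from the mask boundary), and the names `boundary_band` and `bridge_gap` are all assumptions.

```python
import heapq
from collections import deque

STEPS4 = ((1, 0), (-1, 0), (0, 1), (0, -1))


def boundary_band(mask, dw):
    """Set of mask pixels fewer than `dw` four-connected steps from the
    mask boundary, i.e. a band `dw` pixels thick."""
    h, w = len(mask), len(mask[0])

    def in_mask(y, x):
        return 0 <= y < h and 0 <= x < w and mask[y][x]

    dist, queue = {}, deque()
    for y in range(h):
        for x in range(w):
            if mask[y][x] and any(not in_mask(y + dy, x + dx) for dy, dx in STEPS4):
                dist[(y, x)] = 0
                queue.append((y, x))
    while queue:  # multi-source breadth-first search inward from the boundary
        y, x = queue.popleft()
        if dist[(y, x)] + 1 >= dw:
            continue
        for dy, dx in STEPS4:
            nxt = (y + dy, x + dx)
            if in_mask(*nxt) and nxt not in dist:
                dist[nxt] = dist[(y, x)] + 1
                queue.append(nxt)
    return set(dist)


def bridge_gap(edges, area, start, goal, edge_cost=1, non_edge_cost=10):
    """Cheapest path from `start` to `goal` through pixels in `area`,
    where entering an edge pixel costs less than entering a non-edge pixel.
    Returns the path as a list of (row, col) pairs, or None if unreachable."""
    best = {start: 0}
    parent = {}
    heap = [(0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return path[::-1]
        if cost > best[node]:
            continue  # stale heap entry
        for dy, dx in STEPS4:
            nxt = (node[0] + dy, node[1] + dx)
            if nxt not in area:
                continue
            new_cost = cost + (edge_cost if edges[nxt[0]][nxt[1]] else non_edge_cost)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    return None
```

With a wider band the search can follow interior edge pixels to bridge the discontinuity, whereas a band of width 1 confines the path to the mask boundary itself.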
3. On Algorithm Complexity
The complexity of the above algorithm depends on the detailed organization of the operations involved. Nevertheless, we can see that a “mask sketch” as described above involves several passes over the CDM and its interior, each pass involving some simple logical operations on each pixel. Therefore, the complexity of mask sketch is on the order of the size of the extracted object. The complexity of mask refinement depends on the total length of the edge discontinuities. A typical implementation of the Dijkstra algorithm has $O(n^2)$ complexity, where $n$ is the number of pixels in the search area. Thus the complexity of mask refinement is at most $O(L^2 D_w^2)$, where $L$ is the perimeter of the extracted object.
4. Some Experimental Results
Experiments show that the proposed method is efficient and performs well. Figure 2-4 shows some results from using two different values of Dw in mask refinement. For more discussion on the efficiency and the performance of the method, see [12].
Fig. 2-4: Result of mask refinement at different search bandwidths for a frame in the Mother-and-Daughter sequence. (a) Dw = 2. (b) Dw = 5.