
Chapter 7 Conclusion and Future Work

7.2 Future Work

We could further speed up the program by implementing it entirely in C++; the current implementation is half C++ and half MATLAB. Additionally, since we apply the same calculation to every superpixel, the process can be parallelized.

We have not resolved the shrinking bias problem that is inherent to graph cut segmentation. One way to address this problem is to adaptively tune the combination of different feature information when assigning the energy terms. For example, when the image's foreground and background colors are similar, greater weight is given to the texture (or other feature) component to provide greater region coherence and avoid boundary shortcutting.
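One possible form of this adaptive weighting is sketched below. It measures how much the foreground and background color histograms overlap (using the Bhattacharyya coefficient) and shifts weight from the color term to the texture term as the overlap grows. The function names, the histogram inputs, and the linear mapping from overlap to weight are all assumptions made for illustration; this is not part of our implementation.

```python
import math

def normalize(hist):
    """Scale a histogram (list of bin counts) so its bins sum to 1."""
    total = float(sum(hist))
    return [h / total for h in hist]

def bhattacharyya(p, q):
    """Overlap of two normalized histograms: 0 = disjoint, 1 = identical."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def feature_weights(fg_hist, bg_hist):
    """Return (color_weight, texture_weight), which sum to 1.

    Hypothetical rule: the more the foreground and background colors
    overlap, the more weight the texture term receives.
    """
    overlap = bhattacharyya(normalize(fg_hist), normalize(bg_hist))
    texture_weight = min(1.0, max(0.0, overlap))
    return 1.0 - texture_weight, texture_weight
```

With fully separated color histograms the energy would rely on color alone, and with identical histograms it would rely on texture alone; intermediate cases blend the two.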

Appendix A

In this appendix, we describe the Simple Linear Iterative Clustering (SLIC) superpixel algorithm in detail. We also introduce SLICO, the zero-parameter version of SLIC, which is the variant we chose for our final implementation, and give a brief comparison between SLIC and SLICO.

A.1 Simple Linear Iterative Clustering Superpixel (SLIC)

As introduced in Section 4.2, SLIC is fast, memory efficient, and exhibits state-of-the-art boundary adherence. It is an adaptation of k-means for superpixel generation, with two important distinctions:

1. The number of distance calculations in the optimization is reduced by limiting the search space to a region proportional to superpixel size. This reduces the complexity to be linear in the number of pixels N and independent of the number of superpixels K.

2. A weighted distance measure combines color and spatial proximity while simultaneously providing control over the size and compactness of the superpixels.

SLIC is simple to use and understand. By default, the only parameter of the algorithm is K, the desired number of approximately equally sized superpixels. For a color image in the CIELAB color space, the clustering procedure begins with an initialization step where K initial cluster centers $C_k = [l_k, a_k, b_k, x_k, y_k]^T$ with $k \in [1, K]$ are sampled on a regular grid spaced S pixels apart. To produce roughly equally sized superpixels, the grid interval is $S = \sqrt{N/K}$, so each superpixel contains approximately $N/K$ pixels for an image with N pixels. The centers are then moved to seed locations corresponding to the lowest gradient position in a $3 \times 3$ neighborhood. This is done to avoid centering a superpixel on an edge and to reduce the chance of seeding a superpixel with a noisy pixel.
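The initialization step can be sketched as follows. This is a minimal illustration, assuming the image is given as a nested list of CIELAB tuples and that the gradient is the squared color difference between opposite neighbors; the function name `init_centers` and the tie-breaking rule (keep the grid point when gradients are equal) are our own choices for the example.

```python
import math

def init_centers(lab, K):
    """lab: H x W nested list of (l, a, b) tuples.
    Returns (centers, S) where each center is [l, a, b, x, y]."""
    H, W = len(lab), len(lab[0])
    S = max(1, int(round(math.sqrt(H * W / K))))  # grid interval S = sqrt(N / K)

    def gradient(x, y):
        # Squared color difference between opposite neighbors, clamped at borders.
        x0, x1 = max(x - 1, 0), min(x + 1, W - 1)
        y0, y1 = max(y - 1, 0), min(y + 1, H - 1)
        gx = sum((p - q) ** 2 for p, q in zip(lab[y][x1], lab[y][x0]))
        gy = sum((p - q) ** 2 for p, q in zip(lab[y1][x], lab[y0][x]))
        return gx + gy

    centers = []
    for cy in range(S // 2, H, S):
        for cx in range(S // 2, W, S):
            # Grid point first, so it is kept when all gradients are equal.
            candidates = [(cx, cy)] + [
                (x, y)
                for y in range(max(cy - 1, 0), min(cy + 2, H))
                for x in range(max(cx - 1, 0), min(cx + 2, W))
            ]
            x, y = min(candidates, key=lambda p: gradient(*p))
            l, a, b = lab[y][x]
            centers.append([l, a, b, x, y])
    return centers, S
```

On a uniform image the centers stay on the regular grid; when a grid point falls on an edge, the center shifts to a neighboring pixel with a lower gradient.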

Next, in the assignment step, each pixel i is associated with the nearest cluster center whose search region overlaps its location, as depicted in Fig. 2.4. This is the key to speeding up our algorithm because limiting the size of the search region significantly reduces the number of distance calculations, and results in a significant speed advantage over conventional k-means clustering where each pixel must be compared with all cluster centers. Since the expected spatial extent of a superpixel is a region of approximate size $S \times S$, the search for similar pixels is done in a region $2S \times 2S$ around the superpixel center.

Fig. A.1 Reducing the superpixel search regions. (a) Standard k-means searches the entire image. (b) SLIC searches a limited region.
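The restricted search can be illustrated with a short sketch. It assumes each pixel and each center is stored as a list [l, a, b, x, y] and takes the distance as a callable, so it does not depend on a particular definition of D; the name `assign_pixels` and the returned evaluation counter are illustrative additions.

```python
def assign_pixels(pixels, centers, S, dist):
    """pixels: H x W nested list of [l, a, b, x, y] vectors.
    centers: list of [l, a, b, x, y]; dist: callable(center, pixel) -> float.
    Returns (labels, n_evals)."""
    H, W = len(pixels), len(pixels[0])
    labels = [[-1] * W for _ in range(H)]
    best = [[float("inf")] * W for _ in range(H)]
    n_evals = 0
    for k, c in enumerate(centers):
        cx, cy = int(round(c[3])), int(round(c[4]))
        # Visit only the 2S x 2S window around the center, clipped to the image.
        for y in range(max(cy - S, 0), min(cy + S, H)):
            for x in range(max(cx - S, 0), min(cx + S, W)):
                d = dist(c, pixels[y][x])
                n_evals += 1
                if d < best[y][x]:
                    best[y][x] = d
                    labels[y][x] = k
    return labels, n_evals
```

Each center touches at most $4S^2$ pixels regardless of K, so the total number of distance evaluations is bounded by about $4N$, whereas conventional k-means needs $K \cdot N$ evaluations per iteration.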

The remaining question is how to define the distance measure D. While the maximum possible distance between two colors in the CIELAB space is limited, the spatial distance in the xy plane depends on the image size. It is therefore not possible to simply use the Euclidean distance in this 5D space without normalizing the spatial distances. To cluster pixels in this 5D space, a new distance measure that considers superpixel size is introduced. It enforces color similarity as well as pixel proximity such that the expected cluster sizes and their spatial extents are approximately equal. The measure combines the color proximity and the spatial proximity, each normalized by its respective maximum distance within a cluster, $N_c$ and $N_s$, as follows:

$$d_c = \sqrt{(l_j - l_i)^2 + (a_j - a_i)^2 + (b_j - b_i)^2}$$

$$d_s = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2}$$

$$D = \sqrt{\left(\frac{d_c}{N_c}\right)^2 + \left(\frac{d_s}{N_s}\right)^2}$$

where $d_c$ and $d_s$ are the distances in color and spatial space, respectively. The parameter m is a constant that represents the maximum color distance $N_c$, and the maximum spatial distance expected within a given cluster should correspond to the sampling interval, so $N_s = S$.

In SLICO, the distance measure D is defined as

$$D = \sqrt{\left(\frac{d_c}{m}\right)^2 + \left(\frac{d_s}{S}\right)^2}$$

When m is large, spatial proximity is more important and the resulting superpixels are more compact (i.e., they have a higher area-to-perimeter ratio). When m is small, the resulting superpixels adhere more tightly to image boundaries, but have less regular size and shape. When using the CIELAB color space, m can be in the range [1, 40].
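The effect of m can be checked numerically. The sketch below assumes the normalized form of D with $N_c = m$ and $N_s = S$, i.e. $D = \sqrt{(d_c/m)^2 + (d_s/S)^2}$; the function name `slic_distance` and the [l, a, b, x, y] list layout are illustrative choices.

```python
import math

def slic_distance(center, pixel, S, m):
    """center, pixel: [l, a, b, x, y]; S: grid interval; m: compactness."""
    d_c = math.sqrt(sum((center[i] - pixel[i]) ** 2 for i in range(3)))
    d_s = math.sqrt(sum((center[i] - pixel[i]) ** 2 for i in range(3, 5)))
    return math.sqrt((d_c / m) ** 2 + (d_s / S) ** 2)

# A pixel between two candidate centers (values chosen for illustration).
pixel = [50.0, 0.0, 0.0, 10, 10]
same_color_far = [50.0, 0.0, 0.0, 18, 10]   # identical color, 8 pixels away
other_color_near = [60.0, 0.0, 0.0, 11, 10]  # color differs by 10, 1 pixel away

for m in (1, 40):
    d_far = slic_distance(same_color_far, pixel, S=10, m=m)
    d_near = slic_distance(other_color_near, pixel, S=10, m=m)
    winner = "same_color_far" if d_far < d_near else "other_color_near"
    print(m, winner)
```

With m = 1 the color term dominates and the pixel joins the center with the same color; with m = 40 the spatial term dominates and it joins the closer center.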

Once each pixel has been associated with the nearest cluster center, an update step adjusts the cluster centers to be the mean $[l\ a\ b\ x\ y]^T$ vector of all the pixels belonging to the cluster. The L2 norm is used to compute a residual error E between the new cluster center locations and the previous cluster center locations. The assignment and update steps can be repeated iteratively until the error converges, but 10 iterations suffice for most images, and we report all results in this work using this criterion.
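A minimal sketch of the update step is given below. It assumes pixels are stored as [l, a, b, x, y] lists and that the residual error E is measured on the (x, y) center locations only; the function name `update_centers` and the handling of empty clusters are our own assumptions for the example.

```python
import math

def update_centers(pixels, labels, centers):
    """Return (new_centers, E): per-cluster means and the L2 residual
    between new and previous center locations (x, y)."""
    K = len(centers)
    sums = [[0.0] * 5 for _ in range(K)]
    counts = [0] * K
    for row_px, row_lb in zip(pixels, labels):
        for px, k in zip(row_px, row_lb):
            if k < 0:
                continue  # skip unassigned pixels
            counts[k] += 1
            for j in range(5):
                sums[k][j] += px[j]
    new_centers = []
    for k in range(K):
        if counts[k] == 0:
            new_centers.append(list(centers[k]))  # empty cluster stays in place
        else:
            new_centers.append([s / counts[k] for s in sums[k]])
    E = math.sqrt(sum((n - o) ** 2
                      for new, old in zip(new_centers, centers)
                      for n, o in zip(new[3:], old[3:])))
    return new_centers, E
```

The caller would repeat the assignment and update steps until E drops below a chosen threshold or a fixed number of iterations (10 in our case) has been reached.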

Finally, a post-processing step enforces connectivity by reassigning disjoint pixels to the largest nearby superpixel. The complete algorithm is shown in Table A.1.
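The connectivity step can be sketched as a connected-component pass over the label map. The version below is one simple way to do it, not necessarily the procedure used in the reference SLIC code: it assumes 4-connectivity, treats any component smaller than a hypothetical `min_size` as a disjoint fragment, and merges it into the largest adjacent component.

```python
from collections import deque

def _neighbors(x, y, W, H):
    """4-connected neighbors of (x, y) inside a W x H image."""
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < W and 0 <= ny < H:
            yield nx, ny

def enforce_connectivity(labels, min_size):
    H, W = len(labels), len(labels[0])
    comp = [[-1] * W for _ in range(H)]  # connected-component id per pixel
    members = []                         # pixel list of each component
    for sy in range(H):
        for sx in range(W):
            if comp[sy][sx] != -1:
                continue
            cid, pix, queue = len(members), [], deque([(sx, sy)])
            comp[sy][sx] = cid
            while queue:  # breadth-first flood fill over equal labels
                x, y = queue.popleft()
                pix.append((x, y))
                for nx, ny in _neighbors(x, y, W, H):
                    if comp[ny][nx] == -1 and labels[ny][nx] == labels[sy][sx]:
                        comp[ny][nx] = cid
                        queue.append((nx, ny))
            members.append(pix)
    out = [row[:] for row in labels]
    for cid, pix in enumerate(members):
        if len(pix) >= min_size:
            continue
        adjacent = {comp[ny][nx] for x, y in pix for nx, ny in _neighbors(x, y, W, H)}
        adjacent = [n for n in adjacent if n != cid and len(members[n]) >= min_size]
        if adjacent:
            # Relabel the fragment with the label of its largest neighbor.
            tx, ty = members[max(adjacent, key=lambda n: len(members[n]))][0]
            for x, y in pix:
                out[y][x] = labels[ty][tx]
    return out
```

For example, a single stray pixel of label 1 surrounded by label 0 is absorbed into the label 0 region, while the large label 1 region elsewhere in the image is left untouched.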

In the figure below, the first column of images shows SLIC output with a constant compactness factor for all superpixels, while the second column of images shows the output of SLICO, which chooses the compactness factor adaptively for each superpixel.

If the image is smooth in certain regions but highly textured in others, SLIC produces smooth, regular-sized superpixels in the smooth regions and highly irregular superpixels in the textured regions. It therefore becomes tricky to choose the right compactness parameter for each image. In SLICO, on the other hand, the user no longer has to set the compactness parameter or try different values of it. SLICO adaptively chooses the compactness parameter for each superpixel, which generates regularly shaped superpixels in both textured and non-textured regions alike. Note that the improvement comes with hardly any compromise on computational efficiency: SLICO continues to be as fast as SLIC.

Fig. A.2 Segmentation results of SLIC (first column) and SLICO (second column).
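The adaptive compactness can be sketched as follows. The idea illustrated here is that each cluster keeps track of the largest color distance it has observed and uses that value, instead of a global constant m, to normalize the color term. The function names, the use of squared distances, and the running-maximum update are assumptions made for this example rather than a transcription of the reference SLICO code.

```python
def slico_distance_sq(center, pixel, S, max_dc_sq):
    """Squared distance with per-cluster color normalization.

    max_dc_sq is the largest squared color distance seen so far in this
    cluster; it plays the role of m squared. Returns (D_sq, dc_sq).
    """
    dc_sq = sum((center[i] - pixel[i]) ** 2 for i in range(3))
    ds_sq = sum((center[i] - pixel[i]) ** 2 for i in range(3, 5))
    return dc_sq / max_dc_sq + ds_sq / float(S * S), dc_sq

def update_max_color_dist(max_dc_sq, assignments):
    """assignments: iterable of (cluster index, squared color distance),
    one entry per assigned pixel. Returns the updated per-cluster maxima."""
    new_max = list(max_dc_sq)
    for k, dc_sq in assignments:
        if dc_sq > new_max[k]:
            new_max[k] = dc_sq
    return new_max
```

In a textured cluster the observed maximum is large, so a given color difference contributes less to D and the spatial term keeps the superpixel regular; in a smooth cluster the maximum is small, so the superpixel remains sensitive to weak edges.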

Table A.1 Algorithm of SLICO superpixel segmentation.

Algorithm SLICO Superpixel Segmentation

Input: Image with N pixels, number of superpixels K, compactness parameter m
Output: Segmented map

1: Initialize K cluster centers $C_k = [l_k, a_k, b_k, x_k, y_k]^T$ by sampling pixels at regular grid step S, $S = \sqrt{N/K}$.
2: Move cluster centers to the lowest gradient position in a $3 \times 3$ neighborhood.
3: Set label of pixel i, $l(i) = -1$, for each pixel.
4: Set distance between nearest cluster center and pixel i, $d(i) = \infty$, for each pixel.
5: repeat
6:     for each cluster center $C_k$ do
7:         for each pixel i in a $2S \times 2S$ region around $C_k$ do
8:             Compute the distance D between $C_k$ and i with (A.4).
9:             if $D < d(i)$ then
10:                Set distance between nearest cluster center and pixel i, $d(i) = D$.
11:                Set label of pixel i, $l(i) = k$.
12:            end if
13:        end for
14:    end for
15:    Compute new cluster centers using the mean value of pixels in each cluster.
16:    Compute residual error E, the L2 distance between previous centers and recomputed centers.
17: until $E \le \text{threshold}$
18: Enforce connectivity by reassigning disjoint pixels.
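For reference, the main loop of Table A.1 can be written as a compact runnable sketch. It covers steps 1 and 3 to 17 with a fixed compactness m; the gradient-based seed adjustment (step 2) and the connectivity pass (step 18) are left out to keep it short. The function name `slic`, the default `threshold` and `max_iter`, the per-iteration reset of the distance buffer, and measuring E on the (x, y) locations are assumptions of this sketch.

```python
import math

def slic(lab, K, m, threshold=0.5, max_iter=10):
    """lab: H x W nested list of (l, a, b). Returns an H x W label map."""
    H, W = len(lab), len(lab[0])
    S = max(1, int(round(math.sqrt(H * W / K))))                      # step 1
    centers = [[*lab[y][x], x, y]
               for y in range(S // 2, H, S) for x in range(S // 2, W, S)]
    labels = [[-1] * W for _ in range(H)]                             # step 3
    for _ in range(max_iter):                                         # step 5
        dist = [[math.inf] * W for _ in range(H)]                     # step 4
        for k, (cl, ca, cb, cx, cy) in enumerate(centers):            # step 6
            for y in range(max(int(cy) - S, 0), min(int(cy) + S, H)):  # step 7
                for x in range(max(int(cx) - S, 0), min(int(cx) + S, W)):
                    l, a, b = lab[y][x]
                    d_c2 = (l - cl) ** 2 + (a - ca) ** 2 + (b - cb) ** 2
                    d_s2 = (x - cx) ** 2 + (y - cy) ** 2
                    D = math.sqrt(d_c2 / m ** 2 + d_s2 / S ** 2)      # step 8
                    if D < dist[y][x]:                                # step 9
                        dist[y][x], labels[y][x] = D, k               # steps 10-11
        sums = [[0.0] * 5 for _ in centers]                           # step 15
        counts = [0] * len(centers)
        for y in range(H):
            for x in range(W):
                k = labels[y][x]
                counts[k] += 1
                for j, v in enumerate((*lab[y][x], x, y)):
                    sums[k][j] += v
        new = [[s / n for s in row] if n else c
               for row, n, c in zip(sums, counts, centers)]
        E = math.sqrt(sum((p[3] - q[3]) ** 2 + (p[4] - q[4]) ** 2     # step 16
                          for p, q in zip(new, centers)))
        centers = new
        if E <= threshold:                                            # step 17
            break
    return labels
```

On a small test image split into a dark left half and a bright right half, no superpixel produced by this sketch straddles the boundary between the two halves.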

