Chapter 5. Conclusion and Future Work
5.2. Future work
Two issues remain in our 2D-to-3D conversion system. First, many depth cues are still unused. For example, temporal information could be exploited by incorporating a video segmentation method, which would make the object segmentation more accurate. Second, the algorithm remains computationally slow. In future work, we will therefore optimize the speed of the object segmentation algorithm and port it to a parallel processor.
Appendix
This appendix briefly describes the formulas and parameters of the object boundary tracing method and the constraint segmentation. Section A.1 gives the detailed formulas and parameters for the object boundary tracing method, and Section A.2 gives those for the constraint segmentation.
A.1
In the initial boundary selection process, we detect the “sky-vrt”, “gnd-vrt”, and “vrt-vrt” classes of initial object boundaries. The formulas for these detectors are listed below.
A boundary of the “gnd-vrt” class is selected as an initial object boundary if the following conditions are satisfied:
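[Conditions not recoverable from the extracted text. Surviving constants, in reading order. First condition group: 0.4; 1, 2; 2, 2.0; 0.4, 0.4. Second condition group: 0.4, 20; 0.4, 0.4; 0.4, 0.4.]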
The quantities used in these conditions are the same-label likelihood; the ground, vertical, and sky label confidences; the length of the boundary along the x axis; the length of the boundary along the y axis; and the total number of pixels on the boundary.
A boundary of the “sky-vrt” class is selected as an initial object boundary if the following condition is satisfied:
[Condition not recoverable from the extracted text. Surviving constants, in reading order: 1, 0.5; 0.3, 0.3.]
A boundary of the “vrt-vrt” class is selected as an initial object boundary if the following conditions are satisfied:
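[Condition not recoverable from the extracted text. Surviving constants, in reading order: 1, 0.7; 0.4.]
When two fragments of a junction have the ground label, the other fragments of the junction belong to the “vrt-vrt” class of initial object boundaries if they satisfy the following formula:
[Formula not recoverable from the extracted text. Surviving constants, in reading order: 1; 0.8.]
The quantity in this formula denotes the label confidence of the subclass of segment i for segment j.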
A.2
In the constraint segmentation process, we merge segments if the following conditions are satisfied.
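Condition 1: Events 1, 2
Event 1:
[Color-difference test with threshold 0.6, built from the term pairs $(v_i, v_j)$, $(s_i \cos h_i\pi,\ s_j \cos h_j\pi)$, and $(s_i \sin h_i\pi,\ s_j \sin h_j\pi)$; the operators were lost in extraction],
where h, s, and v denote the color values in the HSV color space.
Event 2:
[Test relating Main class_i and Main class_j; the operators were lost in extraction]
Condition 2: Events 1, 2, 6
Event 1:
[Color-difference test with threshold 1.2, built from the term pairs $(v_i, v_j)$, $(s_i \cos h_i\pi,\ s_j \cos h_j\pi)$, and $(s_i \sin h_i\pi,\ s_j \sin h_j\pi)$; the operators were lost in extraction],
where h, s, and v denote the color values in the HSV color space.
Event 2:
[Formula lost in extraction]
Max denotes the rightmost position of the segment in the image. Min denotes the leftmost position of the segment in the image.
Event 4:
$|\mathrm{Mean}_i - \mathrm{Mean}_j| < 0.1$
Mean denotes the mean position of the segment along the x axis in the image.
Condition 4: Events 2, 5
Event 2:
[Test relating Main class_i and Main class_j, and subclass_i and subclass_j; the operators were lost in extraction]
Event 5:
[Test with threshold 0.1, built from Seg, Max(Seg, Seg), and Min(Seg, Seg); the subscripts and operators were lost in extraction]
Seg_i denotes the area of the bounding box of segment i.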
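As an illustration of the Event 1 color test, the following Python sketch computes a distance between two HSV colors from the terms v, s cos hπ, and s sin hπ. It assumes a Euclidean combination of the three differences and a strict less-than comparison against the threshold; neither is confirmed by the formulas above. The function names hsv_cone_distance and colors_similar are hypothetical, and hue is assumed to be scaled so that hπ is an angle in radians.

```python
import math


def hsv_cone_distance(c1, c2):
    """Distance between two (h, s, v) colors.

    Assumption: each color is mapped to the point
    (s*cos(h*pi), s*sin(h*pi), v) and the Euclidean distance is taken.
    """
    h1, s1, v1 = c1
    h2, s2, v2 = c2
    dv = v1 - v2
    dx = s1 * math.cos(h1 * math.pi) - s2 * math.cos(h2 * math.pi)
    dy = s1 * math.sin(h1 * math.pi) - s2 * math.sin(h2 * math.pi)
    return math.sqrt(dv * dv + dx * dx + dy * dy)


def colors_similar(c1, c2, threshold=0.6):
    """Event 1 sketch: True when the color distance is below the threshold.

    The default 0.6 is the constant listed under Condition 1; Condition 2
    lists 1.2 instead. The strict '<' is an assumption.
    """
    return hsv_cone_distance(c1, c2) < threshold
```

Under these assumptions, calling colors_similar with threshold=1.2 gives the looser test listed under Condition 2.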
Table A.1. Events of constraint segmentation
Event 1: the color of the segment is similar to the other.
Event 2: the label confidence of the segment is similar to the other.
Event 3: the shape of the segment is similar to the other.
Event 4: the y axis position of the segment is similar to the other.
Event 5: the segment is inside of the other segment.
Event 6: the segment is small enough.
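The position test of Event 4 can be sketched in Python as follows. The sketch assumes that each segment is given as a list of pixel coordinates along a single axis, that the coordinates are normalized to [0, 1] by the image size so that the 0.1 threshold is a fraction of the image, and that the comparison is a strict less-than. The function names mean_position and positions_similar are hypothetical.

```python
def mean_position(coords):
    """Mean of a segment's pixel coordinates along one axis.

    Assumption: coords is a non-empty list of floats normalized to [0, 1].
    """
    return sum(coords) / len(coords)


def positions_similar(coords_i, coords_j, threshold=0.1):
    """Event 4 sketch: True when the mean positions differ by less than threshold."""
    return abs(mean_position(coords_i) - mean_position(coords_j)) < threshold
```

The function is independent of the axis, so the same sketch applies whether the coordinates are taken along the x axis or the y axis.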
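Author Biography
Name: 陳奕均    Native place: Taipei County
Education:
Taipei Municipal Songshan Senior High School (September 2001 to June 2004)
B.S., Department of Electronics Engineering, National Chiao Tung University (September 2004 to June 2008)
M.S., Systems Group, Institute of Electronics, National Chiao Tung University (September 2008 to September 2010)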