Conclusions and Future Work - Evaluating Depth Estimation Algorithms Using RGB-D Images

6.1 Conclusions

In this thesis, we propose a method for evaluating stereo matching algorithms using our own dataset of stereo image pairs. These images are designed to include many factors that may affect the performance of stereo matching algorithms. Our evaluation focuses on the foreground, because we assume that the depth map is used in human-computer interaction applications. With this evaluation dataset and procedure, we can characterize the behavior of a specific stereo matching algorithm and determine whether it is robust to particular disturbance factors.

We summarize below the characteristics of the three disparity estimation algorithms tested in this thesis.

WTA (stereo matching using the non-local aggregation method): When the background is complex, the accuracy of WTA increases. Whether the background has repeated patterns or irregular complex patterns, WTA gives better results than it does with a simple background. Reducing the textureless background region improves its performance. When there are several objects in the scene, WTA produces very poor estimates. A person holding the arms out horizontally, or a person wearing a T-shirt with a single-color plaid pattern, degrades the performance. At PSNR = 40 dB, WTA produces results similar to those for noise-free images. Rectification error has little impact on WTA.

DERS (Depth Estimation Reference Software): When the background is complex, the estimated disparity in the foreground has more errors. Cutting off the textureless region improves the estimation accuracy. We found that, without a left-right cross check, DERS does not handle occluded regions well. DERS gives very poor results in repeated-pattern regions, but it works well with the complex background. Increasing the number of objects in a scene decreases the performance of DERS. A person holding the arms out horizontally, or a person wearing a T-shirt with a single-color plaid pattern, increases the errors. A PSNR of 40 dB is good enough for DERS to perform depth estimation. Rectification error has a large impact on DERS: the errors increase considerably.
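The left-right cross check mentioned above can be illustrated with a minimal sketch. It is not taken from DERS; the function name `left_right_check`, the `tolerance` parameter, and the convention that a left-image pixel at column x with disparity d matches the right-image pixel at column x - d are assumptions made for illustration. A pixel passes only if the two disparity maps agree at the matched position, which is why occluded pixels tend to be rejected.

```python
def left_right_check(disp_left, disp_right, tolerance=1):
    """Return a boolean mask marking left-image pixels whose disparity
    is confirmed by the right-image disparity map.

    disp_left, disp_right: lists of rows of disparities (same size).
    Assumed convention: left pixel x matches right pixel x - d.
    """
    valid = []
    for row_left, row_right in zip(disp_left, disp_right):
        width = len(row_left)
        row_valid = []
        for x, d in enumerate(row_left):
            x_right = x - int(round(d))  # matched column in the right image
            consistent = (
                0 <= x_right < width
                and abs(d - row_right[x_right]) <= tolerance
            )
            row_valid.append(consistent)
        valid.append(row_valid)
    return valid


if __name__ == "__main__":
    left = [[1, 1, 2, 1]]
    right = [[1, 1, 1, 5]]
    print(left_right_check(left, right, tolerance=0))
```

Pixels that fail the check are usually treated as occluded or unreliable and are either excluded or filled from neighboring valid disparities.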

GC-NS (stereo matching with nonparametric smoothness priors in feature space): The estimated disparities in the foreground change little when the background becomes complex. Cutting off the textureless region helps to improve the performance. The performance increases considerably in repeated-pattern regions and irregular complex regions. When the number of objects increases, its performance gets worse. GC-NS does not perform well on images of a person holding the arms out horizontally or a person wearing a T-shirt with a single-color plaid pattern. Gaussian noise has little impact on GC-NS when PSNR = 40 dB. Rectification error has some influence on the performance, but not much.
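To give a sense of how mild the noise is at PSNR = 40 dB, the following sketch relates a target PSNR to the standard deviation of additive Gaussian noise. It assumes 8-bit intensities (peak value 255) and ignores clipping, so that the mean squared error is approximately the noise variance. The function names are hypothetical and the exact noise procedure used in the experiments is not reproduced here.

```python
import math
import random


def noise_sigma_for_psnr(psnr_db, peak=255.0):
    """Standard deviation of zero-mean Gaussian noise that yields the
    target PSNR, from PSNR = 20 * log10(peak / sigma)."""
    return peak / (10.0 ** (psnr_db / 20.0))


def add_gaussian_noise(image, psnr_db, peak=255.0, seed=0):
    """Return a noisy copy of a grayscale image (list of rows),
    clipped to the range [0, peak]."""
    rng = random.Random(seed)
    sigma = noise_sigma_for_psnr(psnr_db, peak)
    return [
        [min(peak, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
        for row in image
    ]


def psnr(reference, test, peak=255.0):
    """Measured PSNR in dB between two images of the same size."""
    count = sum(len(row) for row in reference)
    mse = sum(
        (a - b) ** 2
        for row_a, row_b in zip(reference, test)
        for a, b in zip(row_a, row_b)
    ) / count
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    flat = [[128.0] * 64 for _ in range(64)]
    noisy = add_gaussian_noise(flat, psnr_db=40.0)
    print(noise_sigma_for_psnr(40.0), psnr(flat, noisy))
```

Under these assumptions, PSNR = 40 dB corresponds to a noise standard deviation of about 2.55 gray levels, which is consistent with the observation that all three algorithms behave almost as they do on noise-free images.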

6.2 Future Work

For the proposed dataset, generating stereo image pairs together with their ground truth takes time, and we have tried our best to cover all of the relevant factors. However, some factors could still be added to the dataset, such as illumination, motion blur, and shape complexity. Moreover, factors such as background complexity could be quantified more rigorously. It would also be valuable to use image sequences instead of single images. For the ground-truth disparity map, a better active sensor that produces fewer black holes would make the ground truth more reliable.

For the evaluation, we use BPR and MSE to measure performance. Since our purpose is to evaluate the algorithms for novel applications, we should select one such application and use it to complete the evaluation.
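As a reminder of what the two measures capture, the sketch below computes them over foreground pixels only. It assumes that BPR denotes the fraction of pixels whose absolute disparity error exceeds a threshold, as in the common bad-pixel measure; the function name `foreground_errors`, the default `threshold` of 1.0, and the mask representation are illustrative assumptions and not the exact settings of the thesis. Pixels without reliable ground truth (for example, black holes in the sensor depth map) would simply be left out of the mask.

```python
def foreground_errors(estimated, ground_truth, foreground, threshold=1.0):
    """Return (bpr, mse) computed over pixels where `foreground` is True.

    estimated, ground_truth: lists of rows of disparities (same size).
    foreground: list of rows of booleans selecting the evaluated pixels.
    threshold: absolute error above which a pixel counts as bad (assumed).
    """
    abs_errors = [
        abs(e - g)
        for row_e, row_g, row_m in zip(estimated, ground_truth, foreground)
        for e, g, m in zip(row_e, row_g, row_m)
        if m
    ]
    if not abs_errors:
        raise ValueError("foreground mask selects no pixels")
    bpr = sum(1 for err in abs_errors if err > threshold) / len(abs_errors)
    mse = sum(err * err for err in abs_errors) / len(abs_errors)
    return bpr, mse


if __name__ == "__main__":
    estimated = [[10, 12], [20, 25]]
    ground_truth = [[10, 10], [20, 20]]
    foreground = [[True, True], [True, False]]
    print(foreground_errors(estimated, ground_truth, foreground))
```

BPR counts how many pixels are wrong regardless of how wrong they are, while MSE is dominated by large errors, so the two can rank algorithms differently. This is one reason an application-specific criterion would help complete the evaluation.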

