
6. HARDWARE IMPLEMENTATION


6.7. Performance Result

TABLE 6-2 and TABLE 6-3 give an overall comparison of different implementations. Most of the implementations in TABLE 6-2 run on a programmable GPU, which provides high bandwidth and computation resources. Because the image size, disparity range, and frame rate (FPS) differ widely among the designs, the implementations are difficult to compare directly. Therefore, the million disparity evaluation (MDE) measure is used. TABLE 6-3 lists the error rates of the different implementations. The test sequences are from the Middlebury vision website.
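As a rough illustration of the MDE measure, the sketch below assumes the commonly used definition MDE = image width × image height × disparity range × FPS / 10^6. This formula is an assumption and is not quoted from this section; the function name `mde` is hypothetical. Applied to the RealTimeGPU [38] figures reported in TABLE 6-2 (320x240, 16 disparities, 16 FPS), it gives about 19.66, which is consistent with the listed value of 19.6.

```python
def mde(width, height, disparities, fps):
    """Million disparity evaluations per second.

    Assumed definition (not stated explicitly in the text):
    width * height * disparity range * frames per second / 1e6.
    """
    return width * height * disparities * fps / 1e6


# RealTimeGPU [38] row of TABLE 6-2: 320x240 image, 16 disparities, 16 FPS
realtime_gpu = mde(320, 240, 16, 16)
print(f"RealTimeGPU: {realtime_gpu:.2f} MDE")
```

Normalizing by pixels, disparity levels, and frame rate in this way puts designs with different image sizes and disparity ranges on one scale.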


TABLE 6-2 (entries that could not be recovered are left blank)

Implementation        Hardware                Image size          Disparity  FPS    MDE
EffectAggr [46]       Intel C2D 2.14 GHz      320x240, 463x370
CBiased [36]          Geforce 7900            512x512, 256x256
SepLaplacian [37]     Geforce 7900            256x256, 512x512
RealTimeGPU [38]      Radeon 9800, P4 3GHz    320x240             16         16     19.6
ReliableGPU [34]      Radeon 9800             -                   -          16.6   -
GradientGuided [24]   Radeon 9800XT           512x384             40         14.7   117

Fig. 6-13 The implementation results of different methods (panels: Ground Truth, Proposed Method, TrellisDP, SepLaplacian, RealDP, CBiased, ReliableGPU, RealTimeB, GradientGuided, RealTimeGPU)

Conclusion

The main contribution of this thesis is a hardware-friendly algorithm and an architecture design for real-time local stereo matching. Our design gives a quality depth result for real-time applications. The proposed algorithm reduces the computational complexity by about 95.14% compared with the original ADSW, while the average quality drop with a disparity tolerance of 1 is about 0.515%. The implemented design achieves 43 frames per second with 64 disparities at CIF image size under a 100 MHz clock rate. The chip consumes a total of 562,642 K gate counts and 21.3 KB of internal memory. We also consider the bandwidth issue at the system level. The final bandwidth requirement is only 45 MB/s, which is about one ninth of the total bandwidth, so the design can be easily integrated with other IP for different kinds of applications.
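To put the reported operating point in perspective, the following sketch derives the per-pixel cycle budget and the disparity evaluation rate from the figures above. It assumes the standard CIF resolution of 352x288 and the usual product definition of MDE; both are assumptions of this sketch, and the variable names are illustrative only.

```python
# Figures quoted in the conclusion
FPS = 43                 # frames per second
DISPARITIES = 64         # disparity range
CLOCK_HZ = 100e6         # 100 MHz clock rate

# Assumption: CIF means the standard 352x288 resolution
CIF_WIDTH, CIF_HEIGHT = 352, 288

pixels_per_second = CIF_WIDTH * CIF_HEIGHT * FPS
cycles_per_pixel = CLOCK_HZ / pixels_per_second

# Assumed MDE definition: pixels/s * disparity range / 1e6
mde_per_second = pixels_per_second * DISPARITIES / 1e6

# Disparity candidates that must be resolved per clock cycle on average
evals_per_cycle = pixels_per_second * DISPARITIES / CLOCK_HZ

print(f"pixel rate        : {pixels_per_second} pixels/s")
print(f"cycle budget      : {cycles_per_pixel:.2f} cycles/pixel")
print(f"throughput        : {mde_per_second:.1f} MDE")
print(f"evaluations/cycle : {evals_per_cycle:.2f}")
```

Under these assumptions the design has roughly 23 clock cycles per output pixel and must resolve close to three disparity candidates per cycle on average, which suggests why parallel processing and data reuse matter in the architecture.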

Future Work

Although our algorithm gives a quality result, the disparity map in occluded areas may be incorrect because no disparity refinement is applied. In addition, the depth result may be unreliable if an object is tiny or lacks color information. Moreover, the chip area is large and is dominated by the large internal storage and multiple RAM banks. Therefore, the unreliable regions of the disparity map and the high cost of internal storage may limit its application.

Two issues remain in our work. First, the practicability for different applications needs to be investigated, such as scene reconstruction and 3D-TV, which may require smooth depth at edges and in occluded areas. The second issue is the high cost of the internal memory. To reduce the internal memory size, there are three feasible options: decreasing the number of census bits, truncating the intermediate results of cost aggregation, and using single-port instead of dual-port memory. However, the reduction of the memory area is still limited under the data reuse strategy of the proposed architecture. For a low-memory-cost implementation, further research on the stereo algorithm or architecture is required.

References

[1] P. Kauff, N. Brandenburg, M. Karl, and O. Schreer, "Fast hybrid block- and pixel recursive disparity analysis for real-time applications in immersive teleconference scenarios," in Proceedings of 9th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision, pp. 198-205, 2001.

[2] T. Kanade, A. Yoshida, K. Oda, H. Kano, and M. Tanaka, "A stereo machine for video-rate dense depth mapping and its new applications," in Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, 1996.

[3] M. Z. Brown, D. Burschka, and G. Hager, "Advances in Computational Stereo," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 8, pp. 993-1008, August 2003.

[4] D. Scharstein and R. Szeliski, "A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms," International Journal of Computer Vision, vol. 47, pp. 7-42, 2002.

[5] H. Hirschmuller and D. Scharstein, "Evaluation of Cost Functions for Stereo Matching," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-8, 17-22 June 2007.

[6] R. Zabih and J. Woodfill, "Non-parametric Local Transforms for Computing Visual Correspondence," in Proceedings of Third European Conference on Computer Vision, vol. 2, pp. 151-158, 1994.

[7] G. Egnal, "Mutual information as a stereo correspondence measure," Computer and Information Science, University of Pennsylvania, Philadelphia, USA, Tech. Rep. MS-CIS-00-20, 2000.

[8] M. Hariyama, H. Sasaki, and M. Kameyama, "Architecture of a stereo matching VLSI processor based on hierarchically parallel memory access," The 2004 47th Midwest Symposium on Circuits and Systems, vol. 2, pp. II-245-II-247, 2004.

[9] M. Okutomi and T. Kanade, "A locally adaptive window for signal matching," International Journal of Computer Vision, vol. 7, pp. 143-162, 1992.

[10] M. Hariyama, T. Takeuchi, and M. Kameyama, "Reliable stereo matching for highly-safe intelligent vehicles and its VLSI implementation," in Proceedings of the IEEE Intelligent Vehicles Symposium (IV), pp. 128-133, 2000.

[11] P. B. Chou and C. M. Brown, "The theory and practice of Bayesian image labeling," International Journal of Computer Vision, vol. 4, pp. 185-210, 1990.

[12] H. Tao, H. S. Sawhney, and R. Kumar, "A global matching framework for stereo computation," in Proceedings of International Conference on Computer Vision, vol. 1, pp. 532-539, 2001.

[13] A. F. Bobick and S. S. Intille, "Large Occlusion Stereo," International Journal of Computer Vision, vol. 33, pp. 181-200, 1999.

[14] S. B. Kang, R. Szeliski, and J. Chai, "Handling Occlusions in Dense Multi-View Stereo," in Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, 2001.

[15] K. J. Yoon and I. S. Kweon, "Adaptive Support-weight Approach for Correspondence Search," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006.

[16] M. Gerrits and P. Bekaert, "Local Stereo Matching with Segmentation-based Outlier Rejection," in Proceedings of the 3rd Canadian Conference on Computer and Robot Vision, p. 66, 7-9 June 2006.

[17] F. Tombari, S. Mattoccia, and L. Di Stefano, "Segmentation-Based Adaptive Support for Accurate Stereo Correspondence," Lecture Notes in Computer Science, vol. 4872, p. 427, 2007.

[18] F. Tombari, S. Mattoccia, L. Di Stefano, and E. Addimanda, "Classification and evaluation of cost aggregation methods for stereo correspondence," in Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition, June 24-26, 2008.

[19] ISO/IEC JTC1/SC29/WG11 N6501, "Requirements on Multi-view Video Coding," Redmond, USA, July 2004.

[20] O. Veksler, "Fast variable window for stereo correspondence using integral images," in Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. I-556-I-561.

[21] S. Kang, R. Szeliski, and J. Chai, "Handling occlusions in dense multi-view stereo," in Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 103-110, 2001.

[22] H. Hirschmuller, P. R. Innocent, and J. Garibaldi, "Real-Time Correlation-Based Stereo Vision with Reduced Border Errors," International Journal of Computer Vision, vol. 47, pp. 229-246, 2002.

[23] S. Chan, Y. Wong, and J. Danie, "Dense stereo correspondence based on recursive adaptive size multi-windowing," Image and Vision Computing New Zealand, pp. 26-28, 2003.

[24] M. Gong and R. Yang, "Image-gradient-guided real-time stereo on graphics hardware," in Proceedings of Fifth International Conference on 3-D Digital Imaging and Modeling, pp. 548-555, 2005.

[25] C. Demoulin and M. Van Droogenbroeck, "A method based on multiple adaptive windows to improve the determination of disparity maps," in Proceedings of IEEE Workshop on Circuit, Systems and Signal Processing, pp. 615-618, 2005.

[26] Y. Boykov, O. Veksler, and R. Zabih, "A variable window approach to early vision," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, pp. 1283-1294, 1998.

[27] J. C. Kim, K. M. Lee, B. T. Choi, and S. U. Lee, "A Dense Stereo Matching Using Two-Pass Dynamic Programming with Generalized Ground Control Points," in Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, 2005.

[28] M. Okutomi, Y. Katayama, and S. Oka, "A Simple Stereo Algorithm to Recover Precise Object Boundaries and Smooth Surfaces," International Journal of Computer Vision, vol. 47, pp. 261-273, 2002.

[29] Y. Ohta and T. Kanade, "Stereo by intra- and inter-scanline search using dynamic programming," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 7, pp. 139-154, 1985.

[30] S. Roy and I. J. Cox, "A Maximum-Flow Formulation of the N-Camera Stereo Correspondence Problem," in Proceedings of the Sixth International Conference on Computer Vision, 1998.

[31] Y. Boykov, O. Veksler, and R. Zabih, "Fast Approximate Energy Minimization via Graph Cuts," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1222-1239, 2001.

[32] Y. Boykov and V. Kolmogorov, "An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1124-1137, 2004.

[33] H. Hirschmuller, "Improvements in real-time correlation-based stereo vision," IEEE Workshop on Stereo and Multi-Baseline Vision, pp. 141-148, 2001.

[34] G. Minglun and Y. Yee-Hong, "Near real-time reliable stereo matching using programmable graphics hardware," in Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 924-931, 2005.

[35] S. Forstmann, Y. Kanou, O. Jun, S. Thuering, and A. Schmitt, "Real-Time Stereo by using Dynamic Programming," in Proceedings of Computer Vision and Pattern Recognition Workshop on Real-Time 3D Sensors and Their Use, p. 29, 2004.

[36] L. Jiangbo, G. Lafruit, and F. Catthoor, "Fast Variable Center-Biased Windowing for High-Speed Stereo on Programmable Graphics Hardware," in Proceedings of IEEE International Conference on Image Processing, pp. VI-568-VI-571, 2007.

[37] L. Jiangbo, S. Rogmans, G. Lafruit, and F. Catthoor, "Real-Time Stereo Correspondence using a Truncated Separable Laplacian Kernel Approximation on Graphics Hardware," in Proceedings of IEEE International Conference on Multimedia and Expo, pp. 1946-1949, 2007.

[38] L. Wang, M. Liao, M. Gong, R. Yang, and D. Nister, "High-quality real-time stereo using adaptive cost aggregation and dynamic programming," in Proceedings of the Third International Symposium on 3D Data Processing, Visualization, and Transmission (3DPVT'06), pp. 798-805, 2006.

[39] K. Konolige, "Small Vision Systems: Hardware and Implementation," in Proceedings of Eighth Int'l Symp. Robotics Research, Oct. 1997.

[40] N. Chang, T. M. Lin, T. H. Tsai, Y. C. Tseng, and T. S. Chang, "Real-Time DSP Implementation on Local Stereo Matching," in Proceedings of IEEE International Conference on Multimedia and Expo, pp. 2090-2093, 2007.

[41] M. Hariyama, T. Takeuchi, and M. Kameyama, "VLSI processor for reliable stereo matching based on adaptive window-size selection," in Proceedings of IEEE International Conference on Robotics and Automation, vol. 2, 2001.

[42] Q. Yang, L. Wang, R. Yang, S. Wang, M. Liao, and D. Nister, "Real-time global stereo matching using hierarchical belief propagation," in Proceedings of The British Machine Vision Conference, 2006.

[43] S. Park, C. Chen, and H. Jeong, "VLSI Architecture for MRF Based Stereo Matching," Lecture Notes in Computer Science, vol. 4599, pp. 55-64, 2007.

[44] M. Gong, R. Yang, and L. Wang, "A Performance Study on Different Cost Aggregation Approaches Used in Real-Time Stereo Matching," International Journal of Computer Vision, vol. 75, pp. 283-296, 2007.

[45] S. Park and H. Jeong, "Real-time Stereo Vision FPGA Chip with Low Error Rate," in Proceedings of the 2007 International Conference on Multimedia and Ubiquitous Engineering, pp. 751-756, 2007.

[46] F. Tombari, S. Mattoccia, L. Di Stefano, and E. Addimanda, "Near real-time stereo based on effective cost aggregation," in Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, 2008.

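Author Biography

Name: 蔡宗憲

Native place: Taipei City

Education:

Taipei Municipal Chien Kuo High School (Sep. 1999 - Jun. 2002)

B.S., Department of Electronics Engineering, National Chiao Tung University (Sep. 2002 - Jun. 2006)

M.S., Institute of Electronics (System Group), National Chiao Tung University (Sep. 2006 - Sep. 2008)

Publications:

Domestic conferences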

[1] T. H. Tsai, Y. C. Chang, and T. S. Chang, "Hierarchical Decision Table for Bad Pixel Detection in Stereo Vision," in Proceedings of VLSI Design/CAD Symposium, Spring 2007.

[2] T. H. Tsai, Y. C. Chang, Y. C. Tseng, and T. S. Chang, "Census diffusion with segmentation constraint for disparity estimation in stereo vision," in Proceedings of Computer Vision, Graphics, and Image Processing (CVGIP), Aug. 2007.

International conferences

[3] N. Chang, T. M. Lin, T. S. Tsai, Y. C. Tseng, and T. S. Chang, "Real-Time DSP Implementation on Local Stereo Matching," in Proceedings of IEEE International Conference on Multimedia and Expo, pp. 2090-2093, 2-5 July 2007.

[4] T. S. Tsai, N. Y.-C. Chang, and T. S. Chang, "Data reuse analysis of local stereo matching," in Proceedings of IEEE International Symposium on Circuits and Systems, pp. 812-815, 18-21 May 2008.
