6. HARDWARE IMPLEMENTATION
6.7. Performance Result
TABLE 6-2 and TABLE 6-3 show the overall comparison of the different implementation results. Most of the implementations in TABLE 6-2 run on a programmable GPU, which provides high bandwidth and abundant computation resources. Because the image size, disparity range, and frame rate (FPS) differ considerably between the designs, the implementations are difficult to compare directly. Therefore, the million disparity evaluation (MDE) measure is used as a common throughput figure. TABLE 6-3 shows the error rates of the different implementations. The test sequences are from the Middlebury vision website.
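As a rough illustration of how such a throughput figure can be computed, the sketch below assumes the common definition MDE/s = width x height x disparity range x FPS / 10^6. This section does not state the formula explicitly, so the definition and the function name `mde_per_second` are assumptions. The example values are the RealTimeGPU [38] figures (320x240, 16 disparities, 16 FPS).

```python
def mde_per_second(width, height, disparity_range, fps):
    """Million disparity evaluations per second.

    Assumed definition (not spelled out in the text):
    every pixel is tested against every disparity candidate, once per frame.
    """
    return width * height * disparity_range * fps / 1e6


if __name__ == "__main__":
    # RealTimeGPU [38]: 320x240 image, 16 disparities, 16 FPS
    value = mde_per_second(320, 240, 16, 16)
    print(f"RealTimeGPU [38]: {value:.2f} MDE/s")
```

Under this assumed definition the RealTimeGPU [38] entry evaluates to about 19.66, in line with the value of 19.6 listed for that design.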
TABLE 6-2
Method               Platform               Image size         Disparity range  FPS   MDE
EffectAggr [46]      Intel C2D 2.14 GHz     320x240, 463x370
CBiased [36]         Geforce 7900           512x512, 256x256
SepLaplacian [37]    Geforce 7900           256x256, 512x512
RealTimeGPU [38]     Radeon 9800, P4 3GHz   320x240            16               16    19.6
ReliableGPU [34]     Radeon 9800            -                  -                16.6  -
GradientGuided [24]  Radeon 9800XT          512x384            40               14.7  117
Fig. 6-13 The implementation results of the different methods. Panels: Ground Truth, Proposed Method, HBP, RealDP, TrellisDP, SepLaplacian, RealTimeBP, CBiased, ReliableGPU, GradientGuided, RealTimeGPU.
Conclusion
The main contribution of this thesis is a hardware-friendly algorithm and an architecture design for real-time local stereo matching. The design delivers good-quality depth results for real-time applications. The proposed algorithm reduces the computational complexity by about 95.14% compared with the original ADSW, while the average quality drop with a disparity tolerance of 1 is about 0.515%. The implemented design achieves 43 frames per second with 64 disparities at CIF image size under a 100 MHz clock rate. The chip uses a total gate count of 562,642 K and 21.3K bytes of internal memory. We also consider the bandwidth issue at the system level: the final bandwidth requirement is only 45 MB/s, which is about one ninth of the total bandwidth, so the design can be easily integrated with other IP for different kinds of applications.
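The throughput figures in this paragraph imply a per-pixel cycle budget that can be checked with simple arithmetic. The sketch below assumes the standard CIF resolution of 352x288 (the text only says "CIF") and counts one disparity evaluation per pixel, per disparity candidate, per frame. The function name `design_budget` and the dictionary keys are hypothetical.

```python
CIF_WIDTH, CIF_HEIGHT = 352, 288  # assumed: standard CIF resolution


def design_budget(width, height, disparities, fps, clock_hz):
    """Derive throughput and cycle budget from the stated operating point."""
    pixels_per_second = width * height * fps
    evals_per_second = pixels_per_second * disparities
    return {
        # million disparity evaluations per second (assumed definition)
        "mde_per_second": evals_per_second / 1e6,
        # clock cycles available to produce one output pixel
        "cycles_per_pixel": clock_hz / pixels_per_second,
        # disparity candidates that must be evaluated per clock cycle on average
        "evals_per_cycle": evals_per_second / clock_hz,
    }


if __name__ == "__main__":
    budget = design_budget(CIF_WIDTH, CIF_HEIGHT, disparities=64, fps=43,
                           clock_hz=100e6)
    for name, value in budget.items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the design sustains roughly 279 MDE/s with about 23 clock cycles available per output pixel, so on average nearly three disparity candidates have to be evaluated in every cycle.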
Future Work
Although our algorithm gives good-quality results, the disparity map in occluded areas may be incorrect because no disparity refinement is performed. In addition, the depth result may be unreliable for objects that are tiny or lack color information. Furthermore, the chip area is large and is dominated by the large internal storage and the multiple RAM banks. Therefore, the unreliable disparity regions and the high cost of the internal storage may limit the applications of the design.
Two issues remain in our work. First, the practicability for different applications needs to be investigated, such as scene reconstruction and 3D-TV, which may require smooth depth at edges and in occluded areas. The second issue is the high cost of the internal memory. To reduce the internal memory size, there are three feasible options: decreasing the number of census bits, truncating the intermediate results of cost aggregation, and using single-port instead of dual-port memory. However, the achievable reduction of the memory area is still limited by the data reuse strategy of the proposed architecture. For a low-memory-cost implementation, further research on the stereo algorithm or the architecture is required.
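To make the first option concrete, the sketch below shows a generic census transform in the style of [6] and how the code length, and hence the storage per buffered image row, shrinks with the window size. It is not the thesis's architecture: the window sizes, the function names, and the row-buffer model are assumptions chosen only for illustration.

```python
def census_bits(window_w, window_h):
    """Bits in one census code: one comparison per neighbour, centre excluded."""
    return window_w * window_h - 1


def census_transform(image, x, y, radius):
    """Census code of pixel (x, y) as an integer.

    Each neighbour in the (2*radius+1)^2 window contributes one bit:
    1 if the neighbour is darker than the centre pixel, else 0.
    The caller must keep the window inside the image.
    """
    center = image[y][x]
    code = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dx == 0 and dy == 0:
                continue
            code = (code << 1) | (1 if image[y + dy][x + dx] < center else 0)
    return code


def hamming_distance(code_a, code_b):
    """Matching cost between two census codes: number of differing bits."""
    return bin(code_a ^ code_b).count("1")


def row_buffer_bytes(image_width, bits_per_code):
    """Storage for one buffered row of census codes (hypothetical buffer model)."""
    return (image_width * bits_per_code + 7) // 8


if __name__ == "__main__":
    for size in (7, 5, 3):
        bits = census_bits(size, size)
        print(f"{size}x{size} window: {bits} bits/pixel, "
              f"{row_buffer_bytes(352, bits)} bytes per 352-pixel row")
```

In this simplified model, going from a 7x7 to a 5x5 window halves the code length from 48 to 24 bits per pixel, at the price of a less discriminative matching cost.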
References
[1] P. Kauff, N. Brandenburg, M. Karl, and O. Schreer, “Fast hybrid block- and pixel recursive
disparity analysis for real-time applications in immersive teleconference scenarios,” in
Proceedings of 9th International Conference in Central Europe on Computer Graphics
Visualization and Computer Vision, pp. 198-205, 2001.
[2] T. Kanade, A. Yoshida, K. Oda, H. Kano, and M. Tanaka, “A stereo machine for video-rate
dense depth mapping and its new applications,” in Proceedings of the IEEE International
Conference on Computer Vision and Pattern Recognition, 1996.
[3] M.Z. Brown, D. Burschka, and G. Hager, “Advances in Computational Stereo,” IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no.8, pp. 993-1008,
August 2003.
[4] D. Scharstein and R. Szeliski, "A Taxonomy and Evaluation of Dense Two-Frame Stereo
Correspondence Algorithms," International Journal of Computer Vision, vol. 47, pp. 7-42,
2002.
[5] H. Hirschmuller and D. Scharstein, "Evaluation of Cost Functions for Stereo Matching," in
Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-8, 17-22
June 2007.
[6] R. Zabih and J. Woodfill, “Non-parametric Local Transforms for Computing Visual
Correspondence,” in Proceedings of third European Conference on Computer Vision, vol. 2,
pp. 151–158, 1994.
[7] G. Egnal, "Mutual information as a stereo correspondence measure," Computer and
Information Science, University of Pennsylvania, Philadelphia, USA, Tech. Rep.
MS-CIS-00-20, 2000.
[8] M. Hariyama, H. Sasaki, and M. Kameyama, “Architecture of a stereo matching VLSI
processor based on hierarchically parallel memory access,” The 2004 47th Midwest Symposium
on Circuits and Systems, vol. 2, pp. II245-II247, 2004.
[9] M. Okutomi and T. Kanade, "A locally adaptive window for signal matching," International
Journal of Computer Vision, vol. 7, pp. 143-162, 1992.
[10] M. Hariyama, T. Takeuchi, and M. Kameyama, "Reliable stereo matching for highly-safe
intelligent vehicles and its VLSI implementation," in Proceedings of the IEEE Intelligent
Vehicles Symposium. IV, pp. 128-133, 2000.
[11] P. B. Chou and C. M. Brown, "The theory and practice of Bayesian image labeling,"
International Journal of Computer Vision, vol. 4, pp. 185-210, 1990.
[12] H. Tao, H. S. Sawhney, and R. Kumar, "A global matching framework for stereo
computation," Proc. Int’l Conf. Computer Vision, vol. 1, pp. 532-539, 2001.
[13] A. F. Bobick and S. S. Intille, "Large Occlusion Stereo," International Journal of Computer
Vision, vol. 33, pp. 181-200, 1999.
[14] S. B. Kang, R. Szeliski, and J. Chai, "Handling Occlusions in Dense Multi-View Stereo," in
Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, 2001.
[15] K.J. Yoon and I.S. Kweon, “Adaptive Support-weight Approach for Correspondence search,”
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006.
[16] M. Gerrits, and P. Bekaert, "Local Stereo Matching with Segmentation-based Outlier
Rejection," in Proceedings of the 3rd Canadian Conference on Computer and Robot Vision, pp.
66-66, 07-09 June 2006.
[17] F. Tombari, S. Mattoccia, and L. Di Stefano, "Segmentation-Based Adaptive Support for
Accurate Stereo Correspondence," Lecture Notes in Computer Science, vol. 4872, p. 427,
2007.
[18] F. Tombari, S. Mattoccia, L. Di Stefano, and E. Addimanda, “Classification and evaluation of
cost aggregation methods for stereo correspondence," in Proceedings of IEEE International
Conference on Computer Vision and Pattern Recognition, June 24-26, 2008.
[19] ISO/IEC JTC1/SC29/WG11 N6501, "Requirements on Multi-view Video Coding," Redmond,
USA, July 2004.
[20] O. Veksler, "Fast variable window for stereo correspondence using integral images," in
Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern
Recognition, vol.1, pp. I-556-I-561
[21] S. Kang, R. Szeliski, and J. Chai, “Handling occlusions in dense multi-view stereo,” in
Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 103–110, 2001.
[22] H. Hirschmuller, P. R. Innocent, and J. Garibaldi, "Real-Time Correlation-Based Stereo
Vision with Reduced Border Errors," International Journal of Computer Vision, vol. 47, pp.
229-246, 2002.
[23] S. Chan, Y. Wong, and J. Danie, "Dense stereo correspondence based on recursive adaptive
size multi-windowing," Image and Vision Computing New Zealand, pp. 26-28, 2003.
[24] M. Gong and R. Yang, "Image-gradient-guided real-time stereo on graphics hardware," in
Proceedings of Fifth International Conference on 3-D Digital Imaging and Modeling, pp.
548-555, 2005.
[25] C. Demoulin and M. Van Droogenbroeck. “A method based on multiple adaptive windows to
improve the determination of disparity maps,” in Proceedings of IEEE Workshop on Circuit,
Systems and Signal Processing, pp. 615–618, 2005.
[26] Y. Boykov, O. Veksler, and R. Zabih, "A variable window approach to early vision," IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 20, pp. 1283-1294, 1998.
[27] J. C. Kim, K. M. Lee, B. T. Choi, and S. U. Lee, "A Dense Stereo Matching Using Two-Pass
Dynamic Programming with Generalized Ground Control Points," in Proceedings of IEEE
Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, 2005.
[28] M. Okutomi, Y. Katayama, and S. Oka, "A Simple Stereo Algorithm to Recover Precise
Object Boundaries and Smooth Surfaces," International Journal of Computer Vision, vol. 47,
pp. 261-273, 2002.
[29] Y. Ohta and T. Kanade, "Stereo by intra- and inter-scanline search using dynamic
programming," IEEE transactions on pattern analysis and machine intelligence, vol. 7, pp.
139-154, 1985.
[30] S. Roy and I. J. Cox, "A Maximum-Flow Formulation of the N-Camera Stereo
Correspondence Problem," in Proceedings of the Sixth International Conference on Computer
Vision, 1998.
[31] Y. Boykov, O. Veksler, and R. Zabih, "Fast Approximate Energy Minimization via Graph
Cuts," IEEE transactions on pattern analysis and machine intelligence, pp. 1222-1239, 2001.
[32] Y. Boykov and V. Kolmogorov, "An Experimental Comparison of Min-Cut/Max-Flow
Algorithms for Energy Minimization in Vision," IEEE transactions on pattern analysis and
machine intelligence, pp. 1124-1137, 2004.
[33] H. Hirschmuller, "Improvements in real-time correlation-based stereo vision," IEEE Workshop
on Stereo and Multi-Baseline Vision, pp. 141-148, 2001.
[34] M. Gong and Y.-H. Yang, "Near real-time reliable stereo matching using programmable
graphics hardware," in Proceedings of IEEE Computer Society Conference on Computer
Vision and Pattern Recognition, vol.1, pp. 924-931, 2005.
[35] S. Forstmann, Y. Kanou, O. Jun, S. Thuering, and A. Schmitt, "Real-Time Stereo by using
Dynamic Programming," in Proceedings of Computer Vision and Pattern Recognition
Workshop on Real-Time 3D Sensor and Their Use, pp. 29-29, 2004.
[36] J. Lu, G. Lafruit, and F. Catthoor, "Fast Variable Center-Biased Windowing for
High-Speed Stereo on Programmable Graphics Hardware," in Proceedings of IEEE
International Conference on Image Processing, pp. VI-568-VI-571, 2007.
[37] J. Lu, S. Rogmans, G. Lafruit, and F. Catthoor, "Real-Time Stereo Correspondence using
a Truncated Separable Laplacian Kernel Approximation on Graphics Hardware," in
Proceedings of IEEE International Conference on Multimedia and Expo, pp. 1946-1949,
2007.
[38] L. Wang, M. Liao, M. Gong, R. Yang, and D. Nister, "High-quality real-time stereo using
adaptive cost aggregation and dynamic programming," in Proceedings of the Third
International Symposium on 3D Data Processing, Visualization, and Transmission (3DPVT'06), pp. 798-805, 2006.
[39] K. Konolige, “Small Vision Systems: Hardware and Implementation,” in Proceedings of Eighth
Int'l Symp. Robotics Research, Oct. 1997.
[40] N. Chang, T. M. Lin, T. H. Tsai, Y. C. Tseng, and T. S. Chang, "Real-Time DSP
Implementation on Local Stereo Matching," in Proceedings of IEEE International Conference
on Multimedia and Expo, pp. 2090-2093, 2007.
[41] M. Hariyama, T. Takeuchi, and M. Kameyama, "VLSI processor for reliable stereo matching
based on adaptive window-size selection," in Proceedings of IEEE International Conference
on Robotics and Automation, vol. 2, 2001.
[42] Q. Yang, L. Wang, R. Yang, S. Wang, M. Liao, and D. Nister, "Real-time global stereo
matching using hierarchical belief propagation," in Proceedings of The British Machine Vision
Conference, 2006.
[43] S. Park, C. Chen, and H. Jeong, “VLSI Architecture for MRF Based Stereo Matching,” Lecture
Notes in Computer Science, vol. 4599, pp. 55-64, 2007.
[44] M. Gong, R. Yang, and L. Wang, "A Performance Study on Different Cost Aggregation
Approaches Used in Real-Time Stereo Matching," International Journal of Computer Vision,
vol. 75, pp. 283-296, 2007.
[45] S. Park and H. Jeong, "Real-time Stereo Vision FPGA Chip with Low
Error Rate," Proceedings of the 2007 International Conference on Multimedia and Ubiquitous
Engineering, pp. 751-756, 2007.
[46] F. Tombari, S. Mattoccia, L. Di Stefano, and E. Addimanda. “Near real-time stereo based on
effective cost aggregation,” in Proceedings of the IEEE International Conference on Computer
Vision and Pattern Recognition, 2008.
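Author Biography
Name: 蔡宗憲    Native place: Taipei City
Education:
Taipei Municipal Jianguo High School (Sept. 1999 ~ June 2002)
National Chiao Tung University, Department of Electronics Engineering, B.S. (Sept. 2002 ~ June 2006)
National Chiao Tung University, Institute of Electronics, Systems Group, M.S. (Sept. 2006 ~ Sept. 2008)
Publications:
Domestic Conferences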
[1] T. H. Tsai, Y. C. Chang, and T. S. Chang, “Hierarchical Decision Table for Bad Pixel Detection
in Stereo Vision” in Proceedings of VLSI Design/CAD Symposium, Spring 2007.
[2] T. H. Tsai, Y. C. Chang, Y. C. Tseng, and T. S. Chang, “Census diffusion with segmentation
constraint for disparity estimation in stereo vision,” in Proceedings of Computer Vision,
Graphics, and Image Processing (CVGIP), Aug. 2007.
International Conferences
[3] N. Chang, T.M. Lin, T.S. Tsai, Y.C. Tseng, and T.S. Chang, "Real-Time DSP Implementation
on Local Stereo Matching," in Proceedings of IEEE International Conference on Multimedia
and Expo, pp.2090-2093, 2-5 July 2007
[4] T.S. Tsai, N.Y.-C. Chang, and T.S. Chang, "Data reuse analysis of local stereo matching," in
Proceedings of IEEE International Symposium on Circuits and Systems, pp.812-815, 18-21 May 2008