
Chapter 5 Chip Implementation

5.1 Chip Specification

Table 5.1 H.264/AVC main profile decoder specification for motion compensation

Table 5.1 lists the specification of our bandwidth-efficient motion compensation architecture for the H.264 HDTV decoder. After synthesis with Cadence RTL Compiler using UMC 0.13 μm CMOS technology, the total gate count is 557730 (including embedded SRAM); the gate count of each video decoder component is listed in Table 5.2. The die size of the H.264 decoder is 3100 μm × 3100 μm. Table 5.3 lists the on-chip and off-chip memory used by each module in our design. The chip photo of the H.264 decoder is shown in Figure 5.1. The average power consumption of the system is approximately 50 mW. In the synthesis results for the proposed motion compensation and memory controller, the motion compensation consumes 9.53 mW and the memory controller consumes 3.9 mW at 100 MHz, with gate counts of 83515 and 8584 respectively.
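
To put the gate counts above in proportion, the short calculation below works out the share of the total taken by the motion compensation and memory controller blocks. It is only a sketch: it assumes the per-block counts are directly comparable to the 557730 total (which includes embedded SRAM), and the variable and function names are illustrative.

```python
# Gate counts quoted in the text.
TOTAL_GATES = 557_730       # whole decoder, including embedded SRAM
MC_GATES = 83_515           # motion compensation
MEMCTRL_GATES = 8_584       # memory controller


def share_percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Assumption: the block counts are measured on the same basis as the total.
mc_share = share_percent(MC_GATES, TOTAL_GATES)
memctrl_share = share_percent(MEMCTRL_GATES, TOTAL_GATES)

print(f"motion compensation: {mc_share:.2f}% of total gates")
print(f"memory controller:   {memctrl_share:.2f}% of total gates")
```

Under that assumption, motion compensation accounts for roughly 15% of the gates and the memory controller for under 2%.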

Table 5.2 Synthesis results of the H.264/AVC main profile decoder, including SRAM

Table 5.3 On-chip and off-chip memory size for each module in the H.264 main profile decoder


Figure 5.1 Chip photo of the H.264/AVC main profile decoder

Chapter 6 Conclusion

In this thesis, we present a bandwidth-efficient motion compensation and memory controller organization for an H.264 HDTV decoder that supports the high-quality 1080HD 30 fps format at Level 4.
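
The 1080HD 30 fps target can be turned into a rough processing budget. The sketch below assumes a coded frame of 1920x1088 (1080 rounded up to a multiple of 16), 8-bit 4:2:0 sampling, and that the 100 MHz clock quoted for the synthesis results is also the operating clock; none of these assumptions is stated in this chapter, and the names are illustrative.

```python
# Assumed coded frame size and sampling format (not stated in this chapter).
WIDTH, HEIGHT, FPS = 1920, 1088, 30
CLOCK_HZ = 100_000_000          # assumed operating clock (100 MHz)
MB_SIZE = 16                    # H.264 macroblock is 16x16 luma samples

# Macroblock throughput the decoder must sustain.
mbs_per_frame = (WIDTH // MB_SIZE) * (HEIGHT // MB_SIZE)    # 120 * 68 = 8160
mbs_per_second = mbs_per_frame * FPS                        # 244800

# One 4:2:0 frame holds 1.5 bytes per luma sample position at 8 bits.
bytes_per_frame = WIDTH * HEIGHT * 3 // 2
bytes_per_second = bytes_per_frame * FPS                    # 94003200

# Average clock cycles available to decode one macroblock.
cycles_per_mb = CLOCK_HZ / mbs_per_second

print(f"{mbs_per_second} MB/s, {bytes_per_second / 1e6:.1f} MB/s of pixels, "
      f"{cycles_per_mb:.1f} cycles per macroblock")
```

Under these assumptions only about 408 cycles are available per macroblock, and writing out decoded frames alone moves about 94 MB of pixel data per second before any reference-frame reads for motion compensation are counted, which is why external memory traffic dominates the design.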

The proposed motion compensation engine realizes all the advanced features of the H.264/AVC main profile, including MV generators with direct modes, a combined luma/chroma interpolator, and weighted prediction. For the interpolator, the 4-parallel separate 1-D architecture is the most suitable for a high-throughput video decoder among the architectures compared. An Extend-2D column-major approach is presented, and the proposed data reuse technique for fractional motion compensation adds a content buffer, a content-swap operation, and register-file shifting to the interpolator design. This design improves bandwidth by 50%-60% for B-slices on the external data bus. Additionally, the combined luma/chroma interpolator is proposed to save area and achieves a cost reduction of approximately 44%. Altogether, memory usage and bandwidth are optimized by the proposed design.
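
For readers unfamiliar with the arithmetic behind a separate 1-D interpolator, the following sketch shows the standard H.264 luma half-sample filter with taps (1, -5, 20, 20, -5, 1), applied first along rows and then along a column. It is a sequential software illustration of the filter defined by the standard, not the 4-parallel hardware architecture of this thesis, and the function names are illustrative.

```python
TAPS = (1, -5, 20, 20, -5, 1)   # H.264 luma half-sample filter taps


def clip255(value):
    """Clip to the 8-bit sample range."""
    return max(0, min(255, value))


def tap6(samples):
    """Unrounded 6-tap sum over six consecutive samples."""
    return sum(t * s for t, s in zip(TAPS, samples))


def half_pel_1d(samples):
    """Half-sample value between samples[2] and samples[3] (one 1-D pass)."""
    return clip255((tap6(samples) + 16) >> 5)


def half_pel_center(block6x6):
    """Centre half-sample: filter each row, then filter the six unrounded
    row results as a column, rounding only once at the end."""
    intermediates = [tap6(row) for row in block6x6]
    return clip255((tap6(intermediates) + 512) >> 10)


flat = [100] * 6
edge = [0, 0, 0, 255, 255, 255]
print(half_pel_1d(flat), half_pel_1d(edge), half_pel_center([flat] * 6))
```

Because each half-sample needs six neighbouring integer samples in each direction, a 4x4 block at a fractional position reads a 9x9 reference window, which is the overhead that data reuse between neighbouring blocks aims to cut.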

In addition, the decoder system bottleneck results from the performance limits of the off-chip SDRAM subsystem, which leads system designers to put more effort into SDRAM efficiency.

In conventional SDRAM controller designs, the different SDRAM service requirements of heterogeneous system components are often considered, and high bandwidth utilization can be achieved for special applications such as high-definition TV. For this reason, the proposed memory controller reduces bandwidth over the external bus using memory scheduling and improves the data access hit rate using data arrangement. To reduce bus utilization, a memory controller architecture is proposed and related approaches are employed as well. The design target of the interpolator and the frame memory access controller is to reduce external memory access and improve the throughput of the entire video decoder. The SDRAM access controller attached to the video decoder is presented to cope with the heavy transfer of pixel data to and from external frame memories. To achieve efficient memory access, we discuss not only memory scheduling but also data arrangement within the SDRAM. The proposed data arrangement in our scheduling scheme minimizes the miss ratio within the same bank, the case that contributes the maximum latency among all scheduling cases. We create a system-level, hardware-like C++ model and use data utilization to analyze system performance. Compared with the unscheduled case, the experimental results show that access latency can be reduced by 50% to 90% and bandwidth utilization can be improved by up to 90%. Meanwhile, the throughput of the overall video decoder improves by about 50% to 60% after combining the extended RSO method with memory scheduling.
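
The interplay of memory scheduling and data arrangement can be illustrated with a toy open-row SDRAM model. The sketch below is not the C++ model of this thesis: the timing values (T_RP, T_RCD, T_CL), the four-bank count, the strictly serial access model, and the function names are all assumptions chosen for illustration.

```python
# Assumed SDRAM timings in clock cycles (illustrative, not from the thesis).
T_RP, T_RCD, T_CL = 3, 3, 3


def total_latency(requests, num_banks=4):
    """Sum access latency for (bank, row) requests served strictly in order."""
    open_row = [None] * num_banks       # currently open row in each bank
    cycles = 0
    for bank, row in requests:
        if open_row[bank] == row:
            cycles += T_CL                      # row hit: read only
        elif open_row[bank] is None:
            cycles += T_RCD + T_CL              # idle bank: activate, read
        else:
            cycles += T_RP + T_RCD + T_CL       # miss in same bank: precharge first
        open_row[bank] = row
    return cycles


def schedule(requests):
    """Reorder requests so accesses to the same (bank, row) are adjacent."""
    return sorted(requests, key=lambda r: (r[0], r[1]))


# Two rows of the same bank accessed alternately: every access after the first misses.
unscheduled = [(0, 1), (0, 2), (0, 1), (0, 2)]
# Data arrangement alternative: place the second row in a different bank.
rearranged = [(0, 1), (1, 2), (0, 1), (1, 2)]

print(total_latency(unscheduled),            # 33 cycles
      total_latency(schedule(unscheduled)),  # 21 cycles
      total_latency(rearranged))             # 18 cycles
```

In this toy trace, reordering saves 12 of 33 cycles and moving the conflicting row to another bank saves 15. These figures only illustrate the mechanism and are not comparable to the reported 50% to 90% latency reduction.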

In the synthesis results, the gate counts of the motion compensation and the memory controller are 83515 and 8584 respectively. Their average power consumption at 100 MHz is approximately 9.45 mW and 3.9 mW respectively.


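Author Biography

Name: Kang-Cheng Hou (侯康正)

Place of birth: Taipei City, Taiwan

Date of birth: 1979.12.05

Education: 1985.9 ~ 1991.6 Zhuwei Elementary School, Taipei County (台北縣竹圍國民小學)

1991.9 ~ 1994.6 Tamsui Junior High School, Taipei County (台北縣立淡水國民中學)

1994.9 ~ 1999.6 Technology and Science Institute of Northern Taiwan (北台科學技術學院), Department of Electronic Engineering

1999.9 ~ 2001.6 National Kaohsiung University of Applied Sciences (國立高雄應用科技大學), Department of Electronic Engineering, B.S.

2004.9 ~ 2006.6 National Chiao Tung University (國立交通大學), Institute of Electronics, System Group, M.S.

Honors and Awards

Spring 1999: Ministry of Education Single-Chip Microprocessor Design contest, Honorable Mention

Fall 1999: ELAN Cup, Single-Chip Microcontroller Design category, Excellence Award

Spring 2006: University/College Integrated Circuit (IC) Design Contest, academic year 94 (2005), Fourth Place

Publications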

Kang-Cheng Hou, Sheng-Zen Wang, Yi-Hong Huang, Tsu-Ming Liu, and Chen-Yi Lee, “A Bandwidth-Efficient Motion Compensation Architecture for H.264/AVC HDTV Decoder,” in Proceedings of the 17th VLSI/CAD Symposium, August 2006.

Yi-Hong Huang, Ping-Chang Lin, Kang-Cheng Hou, Yueh-Chi Hung, Tsu-Ming Liu, and Chen-Yi Lee, “A High-Throughput SRAM-Based Context Adaptive Binary Arithmetic Decoder (CABAD) for H.264/AVC,” in Proceedings of the 17th VLSI/CAD Symposium, August 2006.

Tsu-Ming Liu, Ting-An Lin, Sheng-Zen Wang, Wen-Ping Lee, Kang-Cheng Hou, Jiun-Yan Yang, and Chen-Yi Lee, “A 125-μW, Fully Scalable MPEG-2 and H.264/AVC Video Decoder for Mobile Applications,” accepted by IEEE Journal of Solid-State Circuits.

Tsu-Ming Liu, Ting-An Lin, Sheng-Zen Wang, Wen-Ping Lee, Kang-Cheng Hou, Jiun-Yan Yang, and Chen-Yi Lee, “A 125-μW, Fully Scalable MPEG-2 and H.264/AVC Video Decoder for Mobile Applications,” ISSCC Dig. of Tech. Papers, pp. 402-403, San Francisco, USA, Feb. 2006.

Tsu-Ming Liu, Ting-An Lin, Sheng-Zen Wang, Wen-Ping Lee, Kang-Cheng Hou, Jiun-Yan Yang, and Chen-Yi Lee, “An 865-μW H.264/AVC Video Decoder for Mobile Applications,” IEEE Asian Solid-State Circuits Conference, pp. 301-304, HsinChu, Taiwan, Nov. 2005.