
Chapter 5 Architecture Designs of LDPC Code Decoders

5.2 Hardware Performance Comparison and Summary

To compare the area, speed, latency, and power consumption of the architectures discussed in this section, we describe the hardware architectures in VHDL and then simulate and synthesize them using the Synopsys EDA tools PrimePower and DesignAnalyzer. The process technology is the UMC 0.18 μm process. Table 5.2 lists the results for the CNU using the min-sum algorithm and the proposed modified min-sum algorithm.

Table 5.2 Area, speed, and power consumption of the CNU using the min-sum algorithm and the modified min-sum algorithm

| | 6-input CNU | 6-input CNU (modified) | 7-input CNU | 7-input CNU (modified) |
|---|---|---|---|---|
| Area (gate count) | 0.52k | 0.57k | 0.72 | 0.79 |
| Speed (MHz) | 100 | 100 | 100 | 100 |
| Power consumption (mW) | 4.82 | 4.96 | 6.77 | 7.1 |

As mentioned before, two different codewords are processed concurrently without any stalls. In our proposed design, the BNUs and CNUs have no idle time, which leads to efficient utilization of the functional units. The design takes four cycles to complete a decoding iteration for each codeword: two cycles for the horizontal steps in the CNUs and two cycles for the vertical steps in the BNUs. Loading the channel values takes two extra cycles per codeword. Since the maximum number of decoding iterations is 10, the total number of cycles needed to decode two different codewords is 2+2+10*4=44 cycles. According to our initial synthesis results, the clock frequency is 100 MHz, so the data decoding throughput is 100*[1152*(1/2)]/44 ≈ 1309 Mbps ≈ 1.31 Gbps.
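The cycle count and throughput figures in the preceding paragraph can be reproduced with a short calculation. The Python sketch below is illustrative only: the function and parameter names are hypothetical, and the numbers are the ones quoted in the text.

```python
def decoding_cycles(load_cycles, codewords, max_iterations, cycles_per_iteration):
    """Total cycles: channel-value loading for every codeword plus all iterations."""
    return load_cycles * codewords + max_iterations * cycles_per_iteration


def throughput_bps(clock_hz, info_bits, cycles):
    """Information bits delivered per second for one decoding pass."""
    return clock_hz * info_bits / cycles


# Two codewords, 2 load cycles each, 10 iterations of 4 cycles: 2 + 2 + 10*4.
cycles = decoding_cycles(load_cycles=2, codewords=2,
                         max_iterations=10, cycles_per_iteration=4)

# Two 576-bit codewords at code rate 1/2 carry 1152 * (1/2) message bits.
info_bits = (2 * 576) // 2

rate = throughput_bps(clock_hz=100_000_000, info_bits=info_bits, cycles=cycles)
print(cycles, info_bits, round(rate / 1e9, 2))
```

Keeping the loading cycles and the iteration cycles as separate parameters makes it easy to see how a different maximum iteration count would change the throughput.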

The proposed LDPC decoder is compared with other designs in Table 5.3. The objective of our design is a high-throughput LDPC decoder with a small chip area, and a partial-parallel decoder architecture meets this demand. Compared with [19], our design has lower data throughput because it has a shorter code length and a lower code rate: one codeword has 288 message bits in our design, whereas one codeword has 720 message bits in [19]. Moreover, considering the BER performance, we choose an iteration number of 10, which also reduces the data throughput. The advantage of our design is its chip area. Although we use more quantization bits, the chip area of our design is 82.6% of that of the design in [19] and 54.3% of that of the design in [17].

Table 5.3 Comparison of LDPC decoders

| | Proposed LDPC decoder | [19] | [17] |
|---|---|---|---|
| Code length | 576 | 1200 | 1024 |
| Code rate | 1/2 | 3/5 | 1/2 |
| Quantization bits | 7 | 6 | 4 |
| Iteration number | 10 | 8 | 10 |
| Architecture | Partial-parallel | Partial-parallel | Fully-parallel |
| Process technology (μm) | 0.18 | 0.18 | 0.16 |
| Clock rate (MHz) | 100 | 83 | 64 |
| Power (mW) | 620 | 644 | 690 |
| Area (gate count) | 950k | 1150k | 1750k |
| Throughput (Mbps) | 1310 | 3330 | 500 |
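The area percentages quoted in the comparison follow directly from the gate counts in Table 5.3. A minimal Python check, with hypothetical variable names and the table values entered in thousands of gates:

```python
# Gate counts from Table 5.3, in thousands of gates.
AREA_K_GATES = {"proposed": 950, "[19]": 1150, "[17]": 1750}


def area_ratio_percent(design, reference):
    """Area of `design` as a percentage of the area of `reference`."""
    return 100.0 * AREA_K_GATES[design] / AREA_K_GATES[reference]


vs_19 = round(area_ratio_percent("proposed", "[19]"), 1)
vs_17 = round(area_ratio_percent("proposed", "[17]"), 1)
print(vs_19, vs_17)
```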

Chapter 6 Conclusions and Future Work

6.1 Conclusions

From this work, we conclude that using the dynamic normalized-offset technique in an LDPC decoder can further improve the error-correction performance compared with the conventional method. Various simulation results for the LDPC decoder are investigated, and the optimal choice considering the tradeoff between hardware complexity and performance has been discussed in this thesis.

In this thesis, high-throughput and area-efficient LDPC code decoders with a partial-parallel architecture are proposed for high-speed communication systems. A (576, 288) LDPC code from the 802.16e standard has been implemented, for which the code rate is 1/2, the code length is 576 bits, and the maximum number of decoding iterations is 10. The LDPC decoder in our design achieves a data throughput of 1.31 Gbps with a chip area of 950k gates in the UMC 0.18 μm process technology.

6.2 Future Work

The normalization factor β and the offset factor α strongly influence the BER performance of the decoder. Through our research, we found that our proposed dynamic normalized-offset technique and the dynamic normalization technique [23] have similar BER decoding performance. Another idea is to dynamically adjust the two factors α and β at the same time. The threshold values of α and β may be obtained through simulations. Moreover, as mentioned in Appendix A, the 802.16e standard defines many different codeword lengths and code rates. Our future work is to integrate these into a multi-mode 802.16e LDPC decoder design.
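To make the roles of the two factors concrete, the following Python sketch shows a generic min-sum check-node update in which the minimum magnitude is scaled by a normalization factor and reduced by an offset. It is not the dynamic normalized-offset algorithm of this thesis: the combined form max(β·min − α, 0), the fixed (non-dynamic) factors, and the function name are assumptions made only for illustration. With β = 1 it reduces to offset min-sum, and with α = 0 to normalized min-sum.

```python
def min_sum_cnu(messages, beta=1.0, alpha=0.0):
    """Check-node update: one outgoing message per incoming edge.

    beta  -- normalization factor applied to the minimum magnitude (assumed <= 1)
    alpha -- offset subtracted from the magnitude, clamped at zero
    """
    outputs = []
    for i in range(len(messages)):
        # Exclude the message that arrived on the edge being updated.
        others = messages[:i] + messages[i + 1:]
        negatives = sum(1 for v in others if v < 0)
        sign = -1 if negatives % 2 else 1
        magnitude = min(abs(v) for v in others)
        magnitude = max(beta * magnitude - alpha, 0.0)
        outputs.append(sign * magnitude)
    return outputs


llrs = [2.0, -3.0, 4.0, -1.0, 5.0, 6.0]      # a 6-input CNU
plain = min_sum_cnu(llrs)                    # plain min-sum
normalized = min_sum_cnu(llrs, beta=0.75)    # scaled magnitudes
offset = min_sum_cnu(llrs, alpha=0.5)        # offset magnitudes
print(plain, normalized, offset)
```

The quadratic loop is chosen for readability; a hardware CNU would typically track only the smallest and second-smallest magnitudes.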

Appendix A

LDPC Codes Specification in IEEE 802.16e OFDMA

The LDPC code in IEEE 802.16e is a systematic linear block code, where $k$ systematic information bits are encoded to $n$ coded bits by adding $m = n - k$ parity-check bits. The code rate is $k/n$.

The LDPC code in IEEE 802.16e is defined based on a parity-check matrix $H$ of size $m \times n$ that is expanded from a binary base matrix $H_b$ of size $m_b \times n_b$, where $m = z \cdot m_b$ and $n = z \cdot n_b$. In this standard, there are six different base matrices. One for the rate 1/2 code is depicted in Figure A.1. There are two for the rate 2/3 codes: type A in Figure A.2 and type B in Figure A.3. There are two for the rate 3/4 codes: type A in Figure A.4 and type B in Figure A.5. One for the rate 5/6 code is depicted in Figure A.6. In these base matrices, $n_b$ is an integer equal to 24 and the expansion factor $z$ is an integer between 24 and 96. Therefore, the minimum code length is $n_{\min} = 24 \times 24 = 576$ bits and the maximum code length is $n_{\max} = 24 \times 96 = 2304$ bits.
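The relations $n = z \cdot n_b$, $m = n - k$, and rate $= k/n$ can be tabulated for any expansion factor. The Python sketch below is a minimal illustration with a hypothetical function name; it uses only the values given in the text ($n_b = 24$, $z$ between 24 and 96).

```python
from fractions import Fraction

N_B = 24  # number of columns of every base matrix


def code_parameters(z, rate):
    """Return (n, k, m) for expansion factor z and code rate k/n."""
    n = N_B * z
    k = n * Fraction(rate)
    if k.denominator != 1:
        raise ValueError("rate does not give an integer number of message bits")
    k = int(k)
    return n, k, n - k


smallest = code_parameters(24, Fraction(1, 2))   # shortest rate-1/2 code
largest = code_parameters(96, Fraction(1, 2))    # longest rate-1/2 code
high_rate = code_parameters(24, Fraction(5, 6))  # shortest rate-5/6 code
print(smallest, largest, high_rate)
```

The first result corresponds to the (576, 288) code implemented in this thesis.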

For codes 1/2, 2/3B, 3/4A, 3/4B, and 5/6, the shift sizes $p(f,i,j)$ for a code size corresponding to the expansion factor $z_f$ are derived from $p(i,j)$, which is the element at the $i$-th row and $j$-th column of the base matrices, by scaling $p(i,j)$ proportionally to $z_f$. Each resulting shift size specifies a $z_f \times z_f$ permutation matrix. The permutation matrix represents a circular right shift by $p(f,i,j)$ positions.
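A minimal Python sketch of the proportional scaling and of the circular right shift follows. The floor-based rule floor(p(i,j) · z_f / z_0) with z_0 = 96, and the convention that non-positive entries pass through unchanged, are assumptions based on the IEEE 802.16e specification and should be checked against it; the function names are hypothetical.

```python
def scale_shift(p, z_f, z_0=96):
    """Scale a base-matrix shift p(i, j) to the expansion factor z_f.

    Assumed rule: non-positive entries are kept as they are, positive
    entries become floor(p * z_f / z_0).
    """
    if p <= 0:
        return p
    return (p * z_f) // z_0


def permutation_matrix(shift, z):
    """z-by-z identity matrix with every row circularly shifted right by `shift`."""
    return [[1 if col == (row + shift) % z else 0 for col in range(z)]
            for row in range(z)]


scaled = scale_shift(94, 24)        # a shift of 94 scaled down to z_f = 24
small = permutation_matrix(2, 4)    # small example that is easy to inspect
for row in small:
    print(row)
```

Each row of the printed matrix contains a single 1, moved two columns to the right of the diagonal and wrapped around at the edge.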

Figure A.1 Base matrix of the rate 1/2 code

Rate 2/3 A code:

Figure A.2 Base matrix of the rate 2/3, type A code

Rate 2/3 B code:

Figure A.3 Base matrix of the rate 2/3, type B code

Rate 3/4 A code:

Figure A.4 Base matrix of the rate 3/4, type A code

Rate 3/4 B code:

Figure A.5 Base matrix of the rate 3/4, type B code

Rate 5/6 code:

Figure A.6 Base matrix of the rate 5/6 code

References

[1] R. G. Gallager, Low-density parity-check codes, Cambridge, MA: MIT Press, 1963.

[2] D. J. C. MacKay and R. M. Neal, “Near Shannon limit performance of low density parity check codes,” Electron. Lett., Vol. 32, pp. 1645-1646, Aug. 1996.

[3] T. J. Richardson and R. L. Urbanke, “Efficient encoding of low-density parity-check codes,” IEEE Trans. Inform. Theory, Vol. 47, pp. 638-656, Feb. 2001.

[4] D. J. C. MacKay, S. T. Wilson, and M. C. Davey, “Comparison of constructions of irregular Gallager codes,” IEEE Trans. Comm., Vol. 47, pp. 1449-1454, Oct. 1999.

[5] S. J. Johnson and S. R. Weller, “A family of irregular LDPC codes with low encoding complexity,” IEEE Comm. Lett., Vol. 7, pp. 79-81, Feb. 2003.

[6] J. Chen, A. Dholakia, E. Eleftheriou, M. P. C. Fossorier, and X. Y. Hu, “Reduced-complexity decoding of LDPC codes,” IEEE Trans. Commun., Vol. 53, pp. 1288-1299, July 2005.

[7] R. Tanner, “A recursive approach to low complexity codes,” IEEE Trans. Inform. Theory, Vol. 27, pp. 533-547, Sep. 1981.

[8] M. Luby, M. Mitzenmacher, A. Shokrollahi, D. Spielman, and V. Stemann, “Practical loss-resilient codes,” IEEE Trans. Inform. Theory, Vol. 47, pp. 569-584, Feb. 2001.

[9] T. J. Richardson, M. A. Shokrollahi, and R. L. Urbanke, “Design of capacity-approaching irregular low-density parity-check codes,” IEEE Trans. Inform. Theory, Vol. 47, pp. 619-637, Feb. 2001.

[10] D. J. C. MacKay, “Good error-correcting codes based on very sparse matrices,” IEEE Trans. Inform. Theory, Vol. 45, pp. 399-431, Mar. 1999.

[11] F. R. Kschischang, B. J. Frey, and H. A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inform. Theory, Vol. 47, pp. 498-519, Feb. 2001.

[12] H. Futaki and T. Ohtsuki, “Low-density parity-check (LDPC) coded OFDM systems,” IEEE VTS, Vol. 1, pp. 82-86, Fall 2001.

[13] X. Y. Hu, E. Eleftheriou, D. M. Arnold, and A. Dholakia, “Efficient implementation of the sum-product algorithm for decoding LDPC codes,” IEEE GLOBECOM’01, Vol. 02, pp. 1036-1036E, Nov. 2001.

[14] Jinghu Chen and Marc P. C. Fossorier, “Near Optimum Universal Belief Propagation Based Decoding of Low-Density Parity Check Codes,” IEEE Trans. on Commun., Vol. 50, No. 3, pp. 583-587, Mar. 2002.

[15] Marjan Karkooti and Joseph R. Cavallaro, “Semi-parallel reconfigurable architectures for real-time LDPC decoding,” IEEE ITCC’04, Vol. 65, pp. 683-689.

[16] Z. Wang, Y. Chen, and K. K. Parhi, “Area efficient decoding of quasi-cyclic low density parity check codes,” IEEE ICASSP’04, Vol. 5, pp. 49-52, May 2004.

[17] A. J. Blanksby and C. J. Howland, “A 690-mW 1-Gb/s 1024-b, rate-1/2 low-density parity-check code decoder,” IEEE J. Solid-State Circuits, Vol. 37, pp. 404-412, Mar. 2002.

[18] Y. Chen and D. Hocevar, “A FPGA and ASIC implementation of rate 1/2, 8088-b irregular low density parity check decoder,” IEEE GLOBECOM’03, Vol. 3, pp. 113-117, Dec. 2003.

[19] Chien-Ching Lin, Kai-Li Lin, Hsie-Chia Chang and Chen-Yi Lee, “A 3.33Gb/s (1200,720) low-density parity check code decoder,” IEEE Proceedings of ESSCIRC, Grenoble, France, 2005.

[20] T.M.N. Ngatched, M. Bossert, and A. Fahrner, “Two decoding algorithms for low-density parity check codes,” IEEE ITCC’04, Vol. 32, pp. 253-257.

[21] Yuan-Jih Chu and Sau-Gee Chen, “An efficient LDPC code structure combined with the concept of difference family,” IWCMC’06, Vol. 18, pp. 355-360.

[22] I. V. Kozintsev. Software for low-density parity-check codes. [Online] Available at: http://www.kozintsev.net/soft.html.

[23] Yen-Chin Liao, Chien-Ching Lin, Chih-Wei Liu, and Hsie-Chia Chang, “A dynamic normalization technique for decoding LDPC codes,” IEEE Signal Processing Systems Design and Implementation, pp. 768-772, Nov. 2005.

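Autobiography

邱敏杰 was born on June 15, 1982, in Kaohsiung County. After graduating from the Department of Electrical Engineering, National Chi Nan University, in 2004, the author entered the Institute of Electronics, National Chiao Tung University, to pursue a master's degree. The author's research interests are communication systems and digital signal processing. The master's thesis is titled "Improvement of Low-Density Parity-Check Code Decoding Algorithms and Design of the High-Speed Decoder Architecture."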
