CHAPTER 1 INTRODUCTION
1.2 Thesis Organization
The remainder of this thesis is organized as follows. Chapter 2 describes the characteristics and decoding algorithms of LDPC codes. Chapter 3 introduces high-speed applications that have adopted, or may adopt, LDPC codes as the FEC kernel, and also discusses simulation results and performance analysis. Chapter 4 presents the proposed LDPC code decoders in detail, including the functional units, data rescheduling and memory arrangement, together with the chip implementation results and comparisons with state-of-the-art designs. Finally, Chapter 5 gives the conclusion and future work.
Chapter 2
Low-Density Parity-Check Codes
Low-density parity-check (LDPC) codes are linear block codes specified by sparse parity check matrices containing mostly 0’s and only a small number of 1’s [1]. The code structures and decoding algorithms can be represented by a bipartite graph [2]. Furthermore, it has been shown that the codes can achieve performance near the Shannon limit with large block lengths. In this chapter, the code characteristics and decoding algorithms are presented.
2.1 LDPC Codes
The parity check matrix H, which has N columns and M rows, defines an LDPC code with a block length of N bits and M parity checks. Assuming the matrix is of full rank, the number of information bits is K = N – M, and the code rate is R = 1 – M/N = K/N. It was shown by Gallager [2] that for large block lengths, the minimum distance of the code grows linearly with N. Thus the block lengths of LDPC codes are often designed to be as large as possible. For a regular LDPC code, every column of H contains the same number of 1’s and every row of H contains the same number of 1’s, so that all columns share one weight and all rows share one weight. Otherwise, the code is termed irregular. It has been shown that irregular codes outperform regular codes due to the wave effect [12]. An example of a regular LDPC code parity check matrix is shown in Fig. 2.1.
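The parameter relations above can be checked on a toy matrix. The following is a minimal sketch, not the thesis's own code: the function names and the 4 x 8 matrix are hypothetical, and the matrix is far too small to be a practical LDPC code. A GF(2) rank routine is included because K = N – M only holds under the full-rank assumption.

```python
def code_parameters(H):
    """Return (N, M, K, R), assuming H (a list of 0/1 rows) has full rank."""
    M, N = len(H), len(H[0])
    K = N - M
    return N, M, K, K / N


def is_regular(H):
    """True if all rows share one weight and all columns share one weight."""
    row_weights = {sum(row) for row in H}
    col_weights = {sum(col) for col in zip(*H)}
    return len(row_weights) == 1 and len(col_weights) == 1


def gf2_rank(H):
    """Rank of H over GF(2), used to check the full-rank assumption."""
    rows = [int("".join(str(bit) for bit in row), 2) for row in H]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            lowest = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & lowest else r for r in rows]
    return rank


# Hypothetical regular matrix: column weight 2, row weight 4.
H = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [1, 0, 1, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 1, 0, 1],
]
N, M, K, R = code_parameters(H)
print(N, M, K, R, is_regular(H), gf2_rank(H))
```

In this toy matrix the four rows sum to zero modulo 2, so the rank is smaller than M. This illustrates why the full-rank assumption matters: when it fails, the actual number of information bits is N minus the rank rather than N – M.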
Generating a set of valid codewords requires the generator matrix G, which can be derived from H. The relationship between G and H can be expressed as
$$\mathbf{G} \cdot \mathbf{H}^{T} = \mathbf{0}. \tag{2.1}$$
Let $\mathbf{u} = (u_1, u_2, u_3, \ldots, u_K)$ with $u_i \in \{0, 1\}$ be the information bits. An LDPC code C is defined as
$$C = \{\mathbf{x} \mid \mathbf{x} = \mathbf{u} \cdot \mathbf{G}\}. \tag{2.2}$$
Note that the matrix G is generally not sparse; as a result, the complexity of the encoding process is much higher due to the large and dense matrix multiplication. From equations (2.1) and (2.2), a valid codeword vector $\mathbf{x} = (x_1, x_2, x_3, \ldots, x_N)$ should satisfy the M parity check equations
$$\mathbf{x} \cdot \mathbf{H}^{T} = \mathbf{0}.$$
Fig. 2.1 Example of regular LDPC code parity check matrix
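The relation between G, H and the parity check equations can be illustrated numerically. The sketch below assumes a systematic construction G = [I | P], H = [P^T | I], which is only one way to obtain G from H and is not stated in the text. The matrix P is a hypothetical (7, 4) Hamming-style example, chosen for size rather than sparsity, and the helper names are illustrative.

```python
from itertools import product


def matmul_gf2(A, B):
    """Matrix product over GF(2) for matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*B)]
            for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


# Hypothetical parity part P (K x M); G = [I_K | P], H = [P^T | I_M].
P = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
K, M = len(P), len(P[0])
G = [[int(i == j) for j in range(K)] + P[i] for i in range(K)]
H = [[P[i][j] for i in range(K)] + [int(j == c) for c in range(M)]
     for j in range(M)]


def encode(u):
    """Codeword x = u . G over GF(2)."""
    return matmul_gf2([u], G)[0]


def syndrome(x):
    """Syndrome x . H^T over GF(2); all zeros for a valid codeword."""
    return matmul_gf2([x], transpose(H))[0]


# G . H^T = 0, so every encoded word passes all M parity checks.
assert all(v == 0 for row in matmul_gf2(G, transpose(H)) for v in row)
for u in product((0, 1), repeat=K):
    assert syndrome(encode(list(u))) == [0] * M
```

Flipping any single bit of a codeword in this example produces a nonzero syndrome, which is the property the syndrome check in the decoder relies on.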
LDPC codes can also be represented by a bipartite graph. On one side, the graph has N bit nodes, which correspond to the N columns of H; on the other side, it has M check nodes, which correspond to the M rows of H. An edge connecting bit node Bj and check node Ci corresponds to a 1 in entry (i, j) of H. Fig. 2.2 is the corresponding bipartite graph of the LDPC code specified by the parity check matrix in Fig. 2.1.
Fig. 2.2 Bipartite graph of the code specified by the matrix in Fig. 2.1
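The mapping from H to the bipartite graph can be sketched as two adjacency lists, one per side of the graph. The function name, the 0-based indexing and the small matrix below are assumptions made for illustration only.

```python
def bipartite_graph(H):
    """Build adjacency lists from a parity check matrix H (list of 0/1 rows).

    bits_of_check[i] lists the bit nodes joined to check node i (row i),
    checks_of_bit[j] lists the check nodes joined to bit node j (column j).
    """
    M, N = len(H), len(H[0])
    bits_of_check = [[j for j in range(N) if H[i][j]] for i in range(M)]
    checks_of_bit = [[i for i in range(M) if H[i][j]] for j in range(N)]
    return bits_of_check, checks_of_bit


# Hypothetical 4 x 8 example matrix (indices are 0-based here).
H = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [1, 0, 1, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 1, 0, 1],
]
bits_of_check, checks_of_bit = bipartite_graph(H)
num_edges = sum(len(bits) for bits in bits_of_check)
print(bits_of_check[0], checks_of_bit[0], num_edges)
```

Each 1 in H appears exactly once in each list, so the number of edges equals the number of 1's in H. This edge count is the quantity that drives message memory size in a decoder.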
2.2 Message Passing Algorithm
In this section, the message passing algorithm, which is used to perform probabilistic decoding, is introduced. The intrinsic probability $P_E^{int}(x = a)$ represents the probability that the variable x takes the value a. The extrinsic probability $P_E^{ext}(x = a)$ describes the new information for variable x that is obtained from the event E. Moreover, the a posteriori probability $P_E^{post}(x = a)$ represents the conditional probability that the variable x takes the value a based on the knowledge of event E.
2.2.1 Principle of Message Passing Algorithm
The key idea of the message passing algorithm is to iteratively pass and exchange probabilistic messages in a graph. Extrinsic and a posteriori probabilities can be evaluated based on the given intrinsic probabilities and the construction of the graph.
Consider a node G with K+1 edges, which are associated with the variables $e_0, e_1, \ldots, e_K$ belonging to the alphabet sets $A_0, A_1, \ldots, A_K$, respectively. The connection is shown in Fig. 2.3.
For simplicity, only the case of binary variables is discussed in the following. That is, $A_i = Z_2$. Denote the intrinsic, extrinsic and a posteriori probabilities for $e_i$ with respect to event G as $P_G^{int}(e_i = \xi_i)$, $P_G^{ext}(e_i = \xi_i)$ and $P_G^{post}(e_i = \xi_i)$, respectively. Assuming that the intrinsic probability for variable $e_i$ is available, the a posteriori probability can be derived by Bayes’ theorem as
Fig. 2.3 Message passing on a node
Note that the extrinsic probability is proportional to $P(G \mid e_i = \xi_i)$. That is
To evaluate the extrinsic and a posteriori probabilities of the variables $\{e_i\}_{i=0}^{K}$, the probabilities of variable $e_0$ are considered without loss of generality. Note that the product of alphabets $A_1 \times A_2 \times \cdots \times A_K$ forms a complete set of values for the variables $(e_1, e_2, \ldots, e_K)$. In this way, the probability of event G can be decomposed as
the following result is obtained.
Because event G is true only when equation (2.6) is satisfied, the first term in equation (2.10) can be written as
2.2.2 Message Passing on Bit Nodes
Representing one bit of the codeword, a bit node in a bipartite graph corresponds to a specified column in the parity check matrix H which defines the code. Thus the constraint on a bit node specifies that the associated variables should be equal. The constraint set SB on bit node B, which connects to K+1 check nodes, can be expressed as
$$S_B = \{(e_0, e_1, \ldots, e_K) \mid e_0 = e_1 = \cdots = e_K\}. \tag{2.14}$$
The connection is also shown in Fig. 2.4.
Fig. 2.4 Message passing on a bit node
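The equality constraint on a bit node leads to a simple product rule for the outgoing messages. The sketch below shows the standard form of that rule and is not a reproduction of the thesis's own update equation: the message sent along one edge is the normalized product of the messages arriving on all other edges. The function name and the message representation as (p0, p1) pairs are assumptions.

```python
def bit_node_update(incoming):
    """Outgoing (p0, p1) message on every edge of a bit (equality) node.

    incoming[i] = (P(e_i = 0), P(e_i = 1)) arriving on edge i. The message
    leaving on edge k uses every incoming message except the one on edge k,
    and is normalized so that the two probabilities sum to one.
    """
    outgoing = []
    for k in range(len(incoming)):
        q0 = q1 = 1.0
        for i, (p0, p1) in enumerate(incoming):
            if i != k:
                q0 *= p0
                q1 *= p1
        total = q0 + q1  # assumed nonzero, i.e. no contradictory certainties
        outgoing.append((q0 / total, q1 / total))
    return outgoing


# Three edges with hypothetical incoming probabilities.
messages = bit_node_update([(0.9, 0.1), (0.6, 0.4), (0.5, 0.5)])
print(messages)
```

Excluding the message on the destination edge is what makes the output extrinsic: a node never sends back information it received from the same neighbor.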
2.2.3 Message Passing on Check Nodes
In a bipartite graph, a check node, denoting a parity check equation of the code, corresponds to a specified row in the parity check matrix H. Thus the constraint on a check node specifies that the summation of the associated bits should be zero. The constraint set $S_C$ on check node C, which connects to K+1 bit nodes, can be expressed as
$$S_C = \{(e_0, e_1, \ldots, e_K) \mid e_0 + e_1 + \cdots + e_K = 0\}, \tag{2.16}$$
where the operation “+” represents the modulo-2 summation. The connection is shown in Fig. 2.5.
Fig. 2.5 Message passing on a check node
The input message vector along edge $e_i$ is denoted by $\mu_{B_i \to C}(e_i)$ for $i = 1, \ldots, K$. With equation … check equation is satisfied. Because the indicator function involves a large number of possible configurations, the summation operation in equation (2.17) is very complicated. Thus we first consider the case of K = 2 for simplicity. Therefore,
the expression in equation (2.19) can be rewritten as
By induction [13], the results in equation (2.20) can be generalized for K > 2 and become
As a result, the output messages can be expressed in terms of the input messages:
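For a parity constraint, the sum over all valid configurations collapses to a well-known closed form: the probability that an even number of the other bits are 1 equals (1 + ∏(1 − 2p_i)) / 2, where p_i is the probability that bit i is 1. The sketch below compares this closed form with a brute-force enumeration. It is offered as an illustration of the idea and may differ in notation from the thesis's equations; all names are hypothetical.

```python
from itertools import product
from math import prod


def check_node_update(p1):
    """Closed-form outgoing (q0, q1) on every edge of a parity check node.

    p1[i] is the incoming probability that bit i equals 1.
    """
    outgoing = []
    for k in range(len(p1)):
        delta = prod(1 - 2 * p for i, p in enumerate(p1) if i != k)
        outgoing.append(((1 + delta) / 2, (1 - delta) / 2))
    return outgoing


def check_node_brute_force(p1, k):
    """Same message for edge k, by summing over all configurations."""
    others = [p for i, p in enumerate(p1) if i != k]
    q0 = 0.0
    for bits in product((0, 1), repeat=len(others)):
        weight = prod(p if b else 1 - p for b, p in zip(bits, others))
        if sum(bits) % 2 == 0:  # even parity of the others forces e_k = 0
            q0 += weight
    return q0, 1 - q0


p1 = [0.1, 0.3, 0.2, 0.45]
closed_form = check_node_update(p1)
for k in range(len(p1)):
    brute = check_node_brute_force(p1, k)
    assert abs(closed_form[k][0] - brute[0]) < 1e-12
```

The brute-force version visits 2^K configurations per edge, while the closed form needs only K multiplications. This is the practical reason for deriving the product expression.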
2.3 LDPC Code Decoding Algorithm
2.3.1 Sum-Product Algorithm (SPA)
For an M×N parity check matrix H and the corresponding graph, $B_i$ for i = 1, 2, …, N denote the bit nodes, $C_j$ for j = 1, 2, …, M denote the check nodes, and $e_{ij}$ is the edge connecting $B_i$ and $C_j$. Furthermore, M(i) is the set of check nodes connected to bit node $B_i$, and L(j) is the set of bit nodes associated with check node $C_j$. The codeword is represented by $\mathbf{x} = [x_1, x_2, \ldots, x_N]$. The intrinsic probabilities with respect to the LDPC code can thus be written as
$$P_{LDPC}^{int}(x_i = \xi_i) = P(x_i = \xi_i), \tag{2.23}$$
where $\xi_i \in \{0, 1\}$ and $P(x_i = 0) = 1 - P(x_i = 1)$, assuming a binary symmetric channel.
Fig. 2.6 illustrates the iterative decoding flow of LDPC codes, where each step is described as follows [5].
Fig. 2.6 Iterative decoding flow chart for LDPC codes
(1) Initialization: The messages from bit node $B_i$ to check node $C_j$ are initialized as
(2) Horizontal step: As shown in Fig. 2.7(a), message updates associated with check nodes are completed in this step. As shown in equation (2.22), the update equations can be expressed as
(3) Vertical step: In the vertical step, the messages associated with bit nodes are updated as illustrated in Fig. 2.7(b). According to equation (2.15), the update equations can be expressed as
(4) Syndrome check: The a posteriori probabilities for each codeword bit can be computed as
It is then verified whether the estimated sequence $\hat{\mathbf{x}} = [\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_N]$ is a valid codeword.
The decoding process halts when the syndrome check equation is satisfied; otherwise it goes into the next decoding iteration. A failure is declared if the maximum number of iterations is reached without finding a valid codeword.
Fig. 2.7 Message passing in LDPC code decoding
2.3.2 Log-Likelihood Ratio Sum-Product Algorithm (LLR-SPA)
For a binary codeword, the decoding operations can be performed in terms of log-likelihood ratios [15]. The log-likelihood ratio (LLR) of a random variable U can be defined as
Therefore, the decoding flow can be modified as follows.
(1) Initialization: The messages sent from bit node $B_i$ to check node $C_j$ are initialized by the so-called “channel value” or “channel information”.
(2) Horizontal step: Based on equation (2.25), the update operation in the logarithmic domain can be rewritten as
Based on the hyperbolic tangent function and the arc-hyperbolic tangent function,
(3) Vertical step: Using LLR, the update equation can be rewritten as
(4) Syndrome check: The pseudo-a posteriori probabilities for each codeword bit can be computed as
Compared with the SPA, multiplications are replaced by additions and the normalization factors are eliminated in the LLR-SPA. Implementation complexity is therefore lower when the LLR-SPA is employed.
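The four steps of the LLR-SPA can be put together in a compact software model. The following is a minimal sketch under stated assumptions, not the decoder proposed in this thesis. It assumes the sign convention LLR = ln(P(bit = 0) / P(bit = 1)), so a positive value favors 0. The horizontal step uses the standard tanh rule, the function name is hypothetical, and the (7, 4) Hamming-style matrix and channel values are toy inputs.

```python
import math


def llr_spa_decode(H, channel_llr, max_iters=8):
    """Return (x_hat, iterations_used, success) for parity check matrix H."""
    M, N = len(H), len(H[0])
    bits_of_check = [[n for n in range(N) if H[m][n]] for m in range(M)]
    # (1) Initialization: bit-to-check messages start at the channel values.
    q = {(m, n): channel_llr[n] for m in range(M) for n in bits_of_check[m]}
    r = {}
    x_hat = [0 if value >= 0 else 1 for value in channel_llr]
    for iteration in range(1, max_iters + 1):
        # (2) Horizontal step: tanh rule over all other bits of each check.
        for m in range(M):
            for n in bits_of_check[m]:
                t = 1.0
                for k in bits_of_check[m]:
                    if k != n:
                        t *= math.tanh(q[(m, k)] / 2)
                t = max(-1 + 1e-12, min(1 - 1e-12, t))  # keep atanh finite
                r[(m, n)] = 2 * math.atanh(t)
        # (3) Vertical step: channel value plus all check-to-bit messages,
        # then remove the message that came from the destination check.
        posterior = list(channel_llr)
        for (m, n), value in r.items():
            posterior[n] += value
        for (m, n) in q:
            q[(m, n)] = posterior[n] - r[(m, n)]
        # (4) Syndrome check on the hard decisions.
        x_hat = [0 if value >= 0 else 1 for value in posterior]
        if all(sum(x_hat[n] for n in bits_of_check[m]) % 2 == 0
               for m in range(M)):
            return x_hat, iteration, True
    return x_hat, max_iters, False


# Toy example: all-zero codeword with one unreliable, wrong-signed bit.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
channel = [-0.5, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
x_hat, used, ok = llr_spa_decode(H, channel)
print(x_hat, used, ok)
```

Note that the vertical step contains only additions and subtractions, in line with the complexity remark above. The only nonlinear operations are the tanh and atanh calls in the horizontal step.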
2.3.3 Min-Sum Algorithm (MS)
In the LLR-SPA, the horizontal step is the most computationally complex part because of the hyperbolic tangent functions. Hence a hardware implementation based on the LLR-SPA is difficult. To further simplify the decoding process, the min-sum algorithm [16] is introduced.
We first consider a check node with 3 edges without loss of generality. Combining equations (2.20), (2.31) and (2.32), we can obtain
Based on the approximation in [17], equation (2.36) becomes
By induction [15], the result in equation (2.37) can be generalized to obtain a sub-optimal expression of the horizontal step, which is
This approximation results in a significant reduction of hardware complexity with only a small performance penalty [18]. In the min-sum algorithm, all decoding steps are the same as in the LLR-SPA except for the horizontal step. Thus the min-sum algorithm can be derived by simply replacing equation (2.33) with (2.38) in the LLR-SPA.
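The difference between the two horizontal steps can be seen by evaluating both on the same inputs. The sketch below uses the commonly cited forms, which are assumed here rather than copied from the thesis's equations. The tanh rule stands in for the LLR-SPA, and the min-sum rule takes the product of the signs times the minimum magnitude of the other incoming messages. Function names and input values are hypothetical.

```python
import math


def check_update_spa(llrs):
    """LLR-SPA horizontal step for one check node (tanh rule)."""
    outgoing = []
    for k in range(len(llrs)):
        t = math.prod(math.tanh(v / 2) for i, v in enumerate(llrs) if i != k)
        outgoing.append(2 * math.atanh(t))
    return outgoing


def check_update_min_sum(llrs):
    """Min-sum horizontal step: sign product times minimum magnitude."""
    outgoing = []
    for k in range(len(llrs)):
        others = [v for i, v in enumerate(llrs) if i != k]
        sign = -1 if sum(v < 0 for v in others) % 2 else 1
        outgoing.append(sign * min(abs(v) for v in others))
    return outgoing


incoming = [1.5, -0.8, 2.4, -3.0]
spa = check_update_spa(incoming)
min_sum = check_update_min_sum(incoming)
for exact, approx in zip(spa, min_sum):
    # Same sign, and min-sum never underestimates the magnitude.
    assert (exact > 0) == (approx > 0)
    assert abs(approx) >= abs(exact)
```

The min-sum output always has the same sign as the exact value but a magnitude that is at least as large. This overestimation is the source of the small performance loss, while the comparisons and sign logic are far cheaper in hardware than tanh evaluations.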
Chapter 3
High-Speed Communication Systems with LDPC Codes
In communication systems, channel coding is a key technique to mitigate the errors introduced by a noisy channel. Due to their excellent error-correcting ability and inherent parallelism, LDPC codes are suitable for high-speed applications. In this chapter, high-speed communication systems that have adopted, or may adopt, LDPC codes as the channel coding technology are introduced. The simulation results of the error-correcting performance are also shown.
3.1 Introduction to High-Speed Communication Systems
3.1.1 Satellite Wireless Communication
Digital video broadcasting (DVB) standards are established to deliver video and various entertainment services to subscribers. Over the past few years, different broadcasting modes have been designed for different purposes, including terrestrial, cable and satellite broadcasts. The original satellite digital video broadcasting standard (DVB-S) was developed in 1994 [19]; its forward error correction (FEC) technology is the concatenation of convolutional codes and Reed-Solomon codes. It is now used worldwide by most satellite operators for data and television broadcasting services. To improve the overall performance of the digital satellite transmission technology, the second generation of DVB-S (DVB-S.2) was developed [20]. As a successor to the current DVB-S standard, DVB-S.2 is expected to provide not only existing but also new services, including TV, High Definition Television (HDTV), audio and other multimedia services.
Employing a powerful FEC system based on LDPC codes concatenated with BCH codes, DVB-S.2 allows quasi-error-free (QEF) operation at about 0.7 dB to 1.0 dB from the Shannon limit, depending on the transmission mode [20]. Moreover, a capacity gain on the order of 30 percent over DVB-S is achieved due to higher order modulation schemes. The functional block diagram of the DVB-S.2 system is illustrated in Fig. 3.1.
Fig. 3.1 Functional block diagram of the DVB-S.2 system
To transmit data via satellite, DVB-S.2 targets a robust and reliable communication service. The corresponding packet error rate for DVB-S.2 at QEF over the AWGN channel is $10^{-7}$, which is very low compared to other systems. Therefore LDPC codes with large block lengths of 64,800 and 16,200 bits are chosen to accomplish excellent error performance. In addition, different coding rates of LDPC codes are specified to accommodate various transmission modes.
3.1.2 60GHz Band Wireless Communication
Recently, the Federal Communications Commission (FCC) released the RF band around 60 GHz, leading to a new era in millimeter-wave communications. This band can potentially support a variety of applications, including high-speed wireless personal area networks (WPANs), automotive radar at nearby frequencies and multimedia communications. The corresponding standard (IEEE 802.15.3c) is now under development by the IEEE 802.15 Working Group for WPANs. It is intended to offer higher data rates, higher frequency reuse and better coexistence than existing wireless systems. The working group also suggests that IEEE 802.15.3c will be widely used for Gigabit Ethernet and will replace cables and other wired links.
One of the optional data rates suggested by IEEE 802.15.3c is greater than 2 Gb/s, in order to satisfy the evolving demands of the consumer multimedia industry in WPAN communications. Due to the required high data rate, LDPC codes are potential candidates for the FEC technique. With parallel implementation, LDPC code decoders can meet the demand for Gb/s data rates.
3.1.3 Ultra-Wideband System
Ultra-wideband (UWB) is an emerging wireless physical (PHY)-layer technology that uses a very large bandwidth [21], [22]. It possesses unique advantages that are attractive to communication applications: i) the potential for very high data throughput and a large increase in user capacity; ii) potentially small size and low processing power in implementation; and iii) ultra-high-precision ranging at the centimeter level [22].
Due to the lack of available spectral bands, the applications of UWB devices prior to 2001 were mainly for military use. In the spring of 2002, the FCC released the 3.1 GHz to 10.6 GHz RF band to support growing high-speed data transmission. Responding to this FCC ruling, industry, government agencies and academic institutions have made many research efforts to adopt UWB technology in various areas. These include short-range high-speed wireless communication, localization systems, high-resolution radar and imaging systems. In this thesis, we focus on the UWB applications for wireless networks.
UWB addresses short-range connections among digital home electronic appliances in a wireless personal area network (WPAN). It is expected to provide high-speed data exchange among storage systems and real-time video/audio distribution for home entertainment devices. Due to its low power consumption and high data rate, UWB technology is expected to replace existing wireless services.
In [23], the multi-band orthogonal frequency-division multiplexing (MB-OFDM) PHY-layer proposal indicates that a coded-OFDM-based solution can provide up to 480 Mb/s in a 528 MHz UWB system. The desired range in MB-OFDM is 10 m for 110 Mb/s and can be reduced for higher data rates [23]. To enhance the overall system performance, convolutional codes and interleaving techniques are applied in the FEC mechanism, whose block diagram is shown in Fig. 3.2.
Fig. 3.2 Block diagram of MB-OFDM UWB system
To improve PHY-layer capacity, LDPC codes can increase the throughput to over 500 Mb/s in future WLAN applications [24]. Moreover, an LDPC coded OFDM baseband system has been silicon proven to achieve a 480 Mb/s data rate [25]. To provide better performance, the original convolutional codes and bit interleaving are replaced with LDPC codes in MB-OFDM UWB systems [25], as shown in Fig. 3.3. The overall system performance is described and discussed later.
Fig. 3.3 Block diagram of the proposed LDPC-COFDM UWB system (baseband TX path: TX data, scrambler, LDPC encoder, OFDM modulator, DAC, RF; baseband RX path: RF, ADC, OFDM demodulator, LDPC decoder, de-scrambler, RX data)
3.2 Error-Correcting Performance of LDPC Codes in UWB System
In the MB-OFDM UWB systems [25], a maximum data rate of 480 Mb/s with a bandwidth of 528 MHz is specified. The time domain spreading scheme is used to change the data rate for different channel state information. In the following, the simulation results are based on the system illustrated in Fig. 3.3, whose detailed specification is given in Table 3.1.
Two different irregular LDPC codes are constructed by the progressive edge-growth (PEG) algorithm [26] to enhance the system performance. One is a (600, 450) LDPC code (Code I), and the other is a (1200, 720) LDPC code (Code II).
Table 3.1 Specification of referenced MB-OFDM UWB system
Data rate (Mb/s)          120    240    480
Spreading gain            4      2      1
Constellation             QPSK
Data carrier              100
FFT size                  128
Packet size (Bytes)       1024
Signal bandwidth (MHz)    528
Channel model             Additive White Gaussian Noise (AWGN)
As stated in Chapter 2, the pseudo-a posteriori probabilities of the codeword bits gradually converge to the real a posteriori probabilities as the number of decoding iterations grows, and the internal messages exchanged between check nodes and bit nodes are soft values. However, since infinite decoding iterations and infinite signal precision are impossible in a practical implementation, the maximum iteration number and the number of quantization bits have to be decided. These implementation limitations introduce some performance degradation. As a result, the trade-off between performance and hardware cost is considered in the following.
3.2.1 Performance Analysis of Code I
Code I is a (600, 450) rate-3/4 irregular LDPC code whose column weights are fixed to 3 and whose row weights range from 11 to 14. Based on the referenced MB-OFDM UWB system, its performance under different numbers of decoding iterations, including the bit-error rate (BER) and the packet-error rate (PER), which is required to be less than 8% [21], is shown in Fig. 3.4.
Fig. 3.4 Performance results of the (600, 450) LDPC code
Note that the required signal-to-noise ratio (SNR) is reduced as the iteration number increases. In Fig. 3.4(b), a 3 dB SNR gain at PER = 8% is achieved as the number of decoding iterations increases from 1 to 8. However, the improvement beyond 8 iterations is insignificant, only about 0.3 dB. As a result, LDPC decoding for Code I with 8 iterations in the referenced MB-OFDM UWB system is considered a good trade-off for practical implementation.
Quantization has to be performed for two types of signal values: the channel values and the internal messages. Fig. 3.5 shows the fixed point simulation results of Code I, where the notation (p, q) indicates that the bit widths of the channel values and internal messages are p and q bits, respectively. The numbers of bits used for the integer and fractional parts in each (p, q) quantization scheme are shown in Table 3.2.
Table 3.2 Bit width distribution for different quantization schemes
Quantization    Channel value                     Internal message
scheme          Integer part    Fractional part   Integer part    Fractional part
(4, 5)          1               3                 1               4
(5, 6)          1               4                 1               5
Many combinations of the quantization schemes and the bit width distributions have been tested through simulations. The performance of quantization schemes with more precision than the (5, 6) scheme is almost the same as that with infinite precision. Consequently, the (5, 6) scheme, together with the bit width distribution listed in Table 3.2, is used for the proposed LDPC Code I decoder.
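The effect of a given bit width distribution can be explored with a small fixed-point model. The sketch below rests on an assumption that the text does not confirm. Since the integer and fractional widths in Table 3.2 add up to the total width, the single integer bit is taken to be the sign bit of a two's-complement number, and out-of-range values are saturated. The function name and the rounding rule are likewise illustrative.

```python
def quantize(value, int_bits, frac_bits):
    """Round value to a saturating two's-complement fixed-point grid.

    Assumption: total width = int_bits + frac_bits, sign bit included in
    int_bits. The result is returned as a float on the quantization grid.
    """
    step = 2.0 ** -frac_bits
    total_bits = int_bits + frac_bits
    lowest = -(1 << (total_bits - 1))      # most negative integer code
    highest = (1 << (total_bits - 1)) - 1  # most positive integer code
    code = round(value / step)
    code = max(lowest, min(highest, code))
    return code * step


# Channel value in the (5, 6) scheme: 1 integer bit, 4 fractional bits.
print(quantize(0.3, 1, 4), quantize(5.0, 1, 4), quantize(-5.0, 1, 4))
# Internal message in the (5, 6) scheme: 1 integer bit, 5 fractional bits.
print(quantize(0.3, 1, 5))
```

Under this assumption, one extra fractional bit halves the quantization step while leaving the representable range almost unchanged. A model of this kind can be wrapped around the messages of a floating-point decoder to reproduce a fixed-point simulation.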
Fig. 3.5 Fixed point simulation of the (600, 450) LDPC code
3.2.2 Performance Analysis of Code II
Code II is a (1200, 720) rate-3/5 irregular LDPC code whose column weights are also fixed to 3 and whose row weights range from 7 to 9. Its performance on the MB-OFDM UWB system, including BER and PER under different numbers of decoding iterations, is shown in Fig. 3.6.
In Fig. 3.6(b), a 4.5 dB SNR gain at PER = 8% is obtained as the number of decoding iterations grows from 1 to 8, but only 0.4 dB is gained from 8 iterations to 64 iterations. Therefore, LDPC decoding for Code II with 8 iterations is considered a good trade-off between implementation and error-correcting performance. The fixed point simulation results of Code II are shown in Fig. 3.7, and the bit width distributions are given in Table 3.2. According to the results, the (5, 6) quantization scheme is chosen as the implementation parameter for the proposed decoder for Code II.
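The parameters quoted for Code I and Code II can be cross-checked with simple edge counting. With a fixed column weight, the number of 1's in H is N times that weight, and dividing by the number of rows gives the average row weight. The sketch below assumes M = N – K, that is, a full-rank H, and the function name is hypothetical.

```python
from fractions import Fraction


def code_summary(n, k, column_weight):
    """Return (M, rate, edges, average row weight) for an (n, k) code.

    Assumes M = n - k parity checks and the same weight in every column.
    """
    m = n - k
    rate = Fraction(k, n)
    edges = n * column_weight
    return m, rate, edges, Fraction(edges, m)


for name, n, k in (("Code I", 600, 450), ("Code II", 1200, 720)):
    m, rate, edges, avg_row = code_summary(n, k, 3)
    print(name, m, rate, edges, float(avg_row))
```

The resulting average row weights, 12 for Code I and 7.5 for Code II, fall inside the stated ranges of 11 to 14 and 7 to 9. The edge counts, 1800 and 3600, indicate how many messages are exchanged in each half iteration.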
Fig. 3.6(a) BER of the (1200, 720) LDPC code (x-axis: SNR [dB]; y-axis: BER; curves for 1, 4, 8, 16 and 64 iterations)
Fig. 3.7(a) Fixed point simulation of BER for the (1200, 720) LDPC code (curves for 1, 8 and 64 iterations with (4, 5), (5, 6) and floating-point precision)
Fig. 3.7(b) Fixed point simulation of PER for the (1200, 720) LDPC code (x-axis: SNR [dB]; y-axis: PER; curves for 1, 8 and 64 iterations with (4, 5), (5, 6) and floating-point precision; reference line at PER = 8%)
3.2.3 Performance Comparison with Convolutional Codes
In Fig. 3.8, the performance of the LDPC codes is compared to that of the 64-state convolutional coded system proposed in [23], where two different rates obtained by puncturing the R = 1/3 convolutional code are selected as references. The results show that, with only 8 iterations, both LDPC codes outperform the punctured convolutional codes. The short block length and the small number of decoding iterations will facilitate high-speed implementation.
(Fig. 3.8 legend: R=3/4 convolutional code; R=5/8 convolutional code; (600,450) LDPC code with 8 iters.)