Chapter 2 Low-Density Parity-Check Code
2.4 Decoding
A hard decision is made on $q_l$, and the resulting decoded vector $\hat{x}$ is checked against the parity-check matrix $H$. If $H\hat{x}^T = 0$, the decoder stops and outputs $\hat{x}$. Otherwise it repeats steps 1-3 until the specified maximum number of iterations is reached.
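The stopping rule can be sketched as follows. This is a minimal illustration, assuming the a posteriori values are log-likelihood ratios in which a non-negative value favors bit 0 (the sign convention is an assumption, not taken from the text). The function names and the toy matrix are hypothetical.

```python
def hard_decision(q):
    """Map a posteriori LLRs to bits (assumed convention: q >= 0 -> 0, q < 0 -> 1)."""
    return [0 if value >= 0 else 1 for value in q]


def syndrome(H, x_hat):
    """Compute H * x_hat^T over GF(2); H is a list of rows with 0/1 entries."""
    return [sum(h * x for h, x in zip(row, x_hat)) % 2 for row in H]


def should_stop(H, q):
    """Return (stop, x_hat); stop is True when every parity check is satisfied."""
    x_hat = hard_decision(q)
    return all(s == 0 for s in syndrome(H, x_hat)), x_hat


# Hypothetical toy parity-check matrix (not one of the codes in this work).
H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]
stop, x_hat = should_stop(H, [2.1, -0.7, 1.3, -3.0, -0.2, 0.9])
```

In an iterative decoder this test would run once per iteration, with the loop also bounded by the maximum iteration count.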
2.4.3 Efficient Check Node Computation
According to equation (2.41), the check node update computation can be implemented in a serial configuration. Consider a particular check node $m$ with $l$ connections from variable nodes. The incoming messages are then $q_{m,1}, q_{m,2}, \ldots, q_{m,l}$.
3, the outgoing message from check node m can be simply expressed as
).
The total computational load consists of the forward recursive computation of $\mathrm{CHK}(f_i)$, the backward recursive computation of $\mathrm{CHK}(b_i)$, and the final pair-wise part in equation (2.50), which amounts to $3(l-1)$ core operations of the type $\mathrm{CHK}(a \oplus b)$ per check node. Clearly, the above procedure is exactly the forward-backward algorithm, as shown in Figure 2.7. The serial nature of the computations results in a latency of $O(l)$ units of time for computing a check node update.
Figure 2.7 Serial configuration for computing check node update
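The serial forward-backward procedure can be sketched as follows. This is an illustration only: it assumes the core operation $\mathrm{CHK}(a \oplus b)$ takes the usual tanh-rule form on log-likelihood ratios, and the names `chk_xor` and `check_node_serial` are hypothetical.

```python
import math


def chk_xor(a, b):
    """Core operation CHK(a ⊕ b) on two LLRs (tanh-rule form, assumed)."""
    return 2.0 * math.atanh(math.tanh(a / 2.0) * math.tanh(b / 2.0))


def check_node_serial(q):
    """Forward-backward check node update for l >= 2 incoming messages q."""
    l = len(q)
    f = [0.0] * l  # f[i] combines q[0..i] (forward recursion)
    b = [0.0] * l  # b[i] combines q[i..l-1] (backward recursion)
    f[0], b[l - 1] = q[0], q[l - 1]
    for i in range(1, l):
        f[i] = chk_xor(f[i - 1], q[i])
        b[l - 1 - i] = chk_xor(b[l - i], q[l - 1 - i])
    r = []
    for i in range(l):
        if i == 0:
            r.append(b[1])        # everything except q[0]
        elif i == l - 1:
            r.append(f[l - 2])    # everything except q[l-1]
        else:
            r.append(chk_xor(f[i - 1], b[i + 1]))  # pair-wise combination
    return r
```

Each outgoing message excludes exactly one incoming message, and the recursions run one after another, which is where the $O(l)$ latency comes from.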
An efficient implementation for computing the check node update is introduced in [13]. A simple parallel configuration that enables a fast check node update is described here. First, the message $\mathrm{CHK}(S_m)$ of an auxiliary binary random variable $S_m$, the modulo-2 sum taken over all $l$ variable nodes connected to check node $m$, can be computed using the parallel configuration shown in Figure 2.8. The computation at each node in the parallel configuration is $\mathrm{CHK}(a \oplus b)$. The latency in computing $S_m$ is of order $O(\log l)$, resulting in a speed-up factor of $O[l/\log(l)]$ compared to the serial configuration. Having obtained $S_m$, the outgoing message in equation (2.51) is exactly equivalent to the outgoing message $r_{m,i}$ from check node $m$ to each variable node $i$.
Lastly, let us define
$$r_{m,i} = \mathrm{CHK}(S_m \ominus q_{m,i}), \quad \text{where } i = 1, 2, \ldots, l. \qquad (2.54)$$
It can be seen that for each $i \in \{1, 2, \ldots, l\}$, the message $r_{m,i}$ can be computed simultaneously by a parallel implementation of the new core computation $\mathrm{CHK}(S_m \ominus q_{m,i})$, as shown in Figure 2.8. Clearly, only $l-1$ core computations of type $\mathrm{CHK}(a \oplus b)$ and $l$ core computations of type $\mathrm{CHK}(a \ominus b)$ are necessary for a particular check node update in this parallel configuration.
Figure 2.8 Parallel configuration for computing check node update
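The parallel configuration can be sketched as follows, again as an illustration under assumptions: the $\oplus$ operation is taken in tanh-rule form, the second core operation is taken to be its inverse ($\ominus$), and all function names are hypothetical. The pairwise tree reduction mimics the $O(\log l)$ depth. A practical implementation would also need to guard against incoming messages equal to zero or saturated, which this sketch does not do.

```python
import math


def chk_xor(a, b):
    """CHK(a ⊕ b) in tanh-rule form (assumed)."""
    return 2.0 * math.atanh(math.tanh(a / 2.0) * math.tanh(b / 2.0))


def chk_minus(s, q):
    """CHK(s ⊖ q): remove the contribution of q from s (assumed inverse of chk_xor)."""
    return 2.0 * math.atanh(math.tanh(s / 2.0) / math.tanh(q / 2.0))


def tree_reduce(values):
    """Combine all values pairwise, level by level; uses l - 1 chk_xor operations."""
    values = list(values)
    while len(values) > 1:
        paired = [chk_xor(values[k], values[k + 1])
                  for k in range(0, len(values) - 1, 2)]
        if len(values) % 2:
            paired.append(values[-1])  # odd element passes to the next level
        values = paired
    return values[0]


def check_node_parallel(q):
    """Compute S_m once, then all l outgoing messages independently."""
    s_m = tree_reduce(q)
    return [chk_minus(s_m, q_i) for q_i in q]
```

The final list comprehension has no dependency between its elements, so in hardware all $l$ outputs can be produced at the same time.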
At the end of this section, we bring together the contents of sections 2.4.1, 2.4.2 and 2.4.3 and summarize the sum-product algorithm, the min-sum based algorithm and the min-sum algorithm in Table 2.4, Table 2.5 and Table 2.6, respectively.
Table 2.4 Summary of the sum-product algorithm
1. Initialization:
2. Message passing:
Step1: Message passing from check nodes to variable nodes. For each $l, m$,
Step3: Decoding
Table 2.5 Summary of the min-sum based algorithm
1. Initialization:
2. Message passing:
Step1: Message passing from check nodes to variable nodes. First, compute
Step3: Decoding
Table 2.6 Summary of the min-sum algorithm
1. Initialization:
2. Message passing:
Step1: Message passing from check nodes to variable nodes. First, compute
Then, for each $l, m$, compute
Step3: Decoding
Chapter 3
A New Structure for Low-Density Parity-Check Code Using the Difference Family
This chapter is divided into two sections. Section 3.1 introduces the difference family and the construction of an irregular quasi-cyclic code based on it. Section 3.2 proposes a new structure for the low-density parity-check code, which is expected to improve performance.
3.1 The Difference Family
In [5], the difference family was used to construct an irregular quasi-cyclic code whose Tanner graph is free of 4-cycles. A difference family is an arrangement of a group of $v$ elements, such as $Z_v$, into not necessarily disjoint subsets of equal size which meet certain difference requirements. More precisely:
Definition 1: The $t$ $\gamma$-element subsets of the group $Z_v$, $D_1, D_2, \ldots, D_t$ with $D_i = \{d_{i,1}, d_{i,2}, \ldots, d_{i,\gamma}\}$, form a $(v, \gamma, \lambda)$ difference family if the differences $(d_{i,x} - d_{i,y}) \bmod v$, $(i = 1, 2, \ldots, t;\ x, y = 1, 2, \ldots, \gamma,\ x \neq y)$, give each nonzero element of $Z_v$ exactly $\lambda$ times.
For example, the subsets $D_1 = \{1, 2, 5\}$, $D_2 = \{1, 3, 9\}$ of $Z_{13}$ form a (13,3,1) difference family with differences
From $D_1$: 2−1=1, 1−2=12, 5−1=4, 1−5=9, 5−2=3, 2−5=10
From $D_2$: 3−1=2, 1−3=11, 9−1=8, 1−9=5, 9−3=6, 3−9=7.
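Definition 1 can be checked mechanically. The sketch below counts all within-subset differences modulo $v$ and verifies the (13,3,1) example; the function names are hypothetical.

```python
from collections import Counter


def difference_counts(subsets, v):
    """Count every difference (x - y) mod v with x != y inside each subset."""
    counts = Counter()
    for subset in subsets:
        for x in subset:
            for y in subset:
                if x != y:
                    counts[(x - y) % v] += 1
    return counts


def is_difference_family(subsets, v, lam):
    """True if each nonzero element of Z_v occurs exactly lam times as a difference."""
    counts = difference_counts(subsets, v)
    return all(counts[element] == lam for element in range(1, v))


example_ok = is_difference_family([[1, 2, 5], [1, 3, 9]], v=13, lam=1)
```

For the example, the twelve differences listed above cover the nonzero elements 1 to 12 of $Z_{13}$ exactly once each, so the check succeeds.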
In this work, difference families with $\lambda = 1$ are used, since they allow the design of codes free of 4-cycles. For an irregular quasi-cyclic code, define the column weight distribution of a length $vl$, rate $(l-1)/l$ code as the vector $W = [w_1, w_2, \ldots, w_l]$, where $w_j$ is the column weight of the columns in the $j$th circulant. Let $w_{\max}$ denote the maximum column weight of the parity-check matrix $H$:
$$w_{\max} = \max\{w_1, w_2, \ldots, w_l\}. \qquad (3.1)$$
To construct an irregular quasi-cyclic code with length $vl$ and rate $(l-1)/l$, so that its parity-check matrix $H = [a_1(x), a_2(x), \ldots, a_l(x)]$ has a weight distribution $W = [w_1, w_2, \ldots, w_l]$, take $l$ sets $D_1, D_2, \ldots, D_l$ of a $(v, \gamma, 1)$ difference family with $\gamma \geq w_{\max}$. Then $a_j(x)$ can be defined using $w_j$ of the elements of $D_j$ as
$$a_j(x) = x^{d_{j,1}} + x^{d_{j,2}} + \cdots + x^{d_{j,w_j}}. \qquad (3.2)$$
To ensure that the code can be encoded, at least one of the $a_j(x)$ must be relatively prime to $x^v - 1$, so that the corresponding circulant is invertible.
For a regular code, all of the elements in each set are included in each circulant, while for an irregular code the choice of which elements in the set to use is arbitrary.
The row weight, $\rho$, of the parity-check matrix is constant and given by
$$\rho = \sum_{i=1}^{l} w_i. \qquad (3.3)$$
To demonstrate that the quasi-cyclic codes are free of 4-cycles, we need a well-known result on difference families.
Lemma 3.1 [5]: A pair of elements from $Z_v$ occurs together exactly $\lambda$ times in the set of translates of every set in a $(v, \gamma, \lambda)$ difference family.
Lemma 3.2: The codes constructed using difference families have Tanner graphs free of 4-cycles.
Proof: Follows from the choice of $\lambda = 1$. First consider the regular case. Each column of $H = [a_1(x), a_2(x), \ldots, a_l(x)]$ is a translate of one of the sets $D_j$ in the difference family. To show that there can be no 4-cycles in $H$, we need to show that no two columns of $H$ can have a nonzero entry in the same two rows, which is equivalent to requiring that two elements of $Z_v$ occur together in at most one of all the translates of the sets in the difference family. Since two elements occur together in exactly $\lambda$ translates, we need only choose $\lambda = 1$ to avoid 4-cycles. The argument carries over to the irregular construction: only $w_j$ of the elements in a given set of the difference family are taken, and removing elements from the translates cannot create a 4-cycle.
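Lemma 3.2 can be illustrated numerically. The sketch below builds the columns of $H$ as translates of the sets and tests every pair of columns for two shared rows. It assumes the convention that column $s$ of the circulant of $a(x)$ has ones in rows $(d + s) \bmod v$; the function names are hypothetical.

```python
from itertools import combinations


def circulant_columns(exponents, v):
    """Columns of the v x v circulant of a(x) = sum of x^d, as sets of row indices."""
    return [frozenset((d + s) % v for d in exponents) for s in range(v)]


def has_four_cycle(columns):
    """A 4-cycle exists when two columns have nonzero entries in the same two rows."""
    return any(len(c1 & c2) >= 2 for c1, c2 in combinations(columns, 2))


v = 13
# Regular case: both sets of the (13,3,1) difference family are used in full.
regular = circulant_columns([1, 2, 5], v) + circulant_columns([1, 3, 9], v)
# Irregular case: only two elements of the second set are kept.
irregular = circulant_columns([1, 2, 5], v) + circulant_columns([1, 3], v)
```

Both column sets pass the test, while a pair of sets with repeated differences, such as {0, 1, 3} and {0, 2, 3}, does not.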
3.2 The Proposed Structure of LDPC Code
According to section 3.1, a difference family can be used to construct an irregular quasi-cyclic code free of 4-cycles. In this section we describe the construction we propose for LDPC codes using these difference families.
Below is our proposed structure of the parity-check matrix $H$,
$$H = \begin{bmatrix} A_1 & A_2 & \cdots & A_{l-1} & 0 \\ B_1 & B_2 & \cdots & B_{l-1} & B_l \end{bmatrix}. \qquad (3.4)$$
where $A_1, A_2, \ldots, A_{l-1}, B_1, B_2, \ldots, B_l$ are all $v \times v$ circulant matrices. The code length is $vl$ and the code rate is $1 - 2/l$. We can use the difference families to determine the polynomials $a_i(x)$ and $b_j(x)$ of each circulant matrix, where $i \in \{1, 2, \ldots, l-1\}$ and $j \in \{1, 2, \ldots, l\}$, just as for the quasi-cyclic code. To avoid any 4-cycles in the new structure of the parity-check matrix, we provide a new difference family. First, construct two $(v, \gamma, 1)$ difference families, Family A and Family B, and combine them to form a new difference Family C, which must satisfy the following two constraints.
Constraint 1: For each $i = 1, 2, \ldots, l-1$, the differences $(a_{i,x} - a_{i,y}) \bmod v$ and the differences $(b_{i,x} - b_{i,y}) \bmod v$, where $x, y = 1, 2, \ldots, \gamma$, $x \neq y$, must have no value in common.
Constraint 2: For each $i, j = 1, 2, \ldots, l-1$ with $i \neq j$, the differences $(a_{i,x} - a_{j,y}) \bmod v$ and the differences $(b_{i,x} - b_{j,y}) \bmod v$, where $x, y = 1, 2, \ldots, \gamma$, must have no value in common.
More precisely, a parity-check matrix is free of 4-cycles if no two columns of $H$ have a nonzero entry in the same two rows. Suppose the new circulant matrix is $C_i = [A_i, B_i]^T$ where $i \in \{1, 2, \ldots, l\}$. Constraint 1 is added to avoid the case where any two columns of $C_i$ have a nonzero entry in the same two rows. Constraint 2 is added to avoid the case where a column of $C_i$, $i \in \{1, 2, \ldots, l\}$, and a column of $C_j$, $j \in \{1, 2, \ldots, l\}$, $i \neq j$, have a nonzero entry in the same two rows.
For example, the subsets from the difference Family A are $A_1 = \{3, 7\}$ and $A_2 = \{1, 6\}$, and the subsets from the difference Family B are $B_1 = \{1, 7\}$, $B_2 = \{2, 3\}$ and $B_3 = \{4, 6\}$ of $Z_{13}$, which form a new (13,2,1) difference family C. The differences from Constraint 1:
From $A_1$: 3−7=9, 7−3=4
From $B_1$: 1−7=7, 7−1=6
From $A_2$: 1−6=8, 6−1=5
From $B_2$: 2−3=12, 3−2=1.
The differences from Constraint 2:
From $A_1$ and $A_2$: 3−1=2, 3−6=10, 7−1=6, 7−6=1
From $B_1$ and $B_2$: 1−2=12, 1−3=11, 7−2=5, 7−3=4.
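The example can be checked directly on the stacked structure of (3.4). The sketch below builds the columns of $H$ for $v = 13$, with the last block column having a zero upper block, and tests every pair of columns for two shared rows. The column-shift convention and all function names are assumptions made for illustration.

```python
from itertools import combinations


def stacked_columns(a_sets, b_sets, v):
    """Columns of H = [[A_1 .. A_{l-1} 0], [B_1 .. B_l]] as sets of row indices."""
    columns = []
    for i, b_set in enumerate(b_sets):
        # The last block column has no A circulant (zero upper block).
        a_set = a_sets[i] if i < len(a_sets) else []
        for s in range(v):
            top = {(d + s) % v for d in a_set}          # rows 0 .. v-1
            bottom = {v + (d + s) % v for d in b_set}   # rows v .. 2v-1
            columns.append(top | bottom)
    return columns


def is_four_cycle_free(columns):
    """True when no two columns share nonzero entries in two rows."""
    return all(len(c1 & c2) < 2 for c1, c2 in combinations(columns, 2))


v = 13
family_a = [[3, 7], [1, 6]]
family_b = [[1, 7], [2, 3], [4, 6]]
columns = stacked_columns(family_a, family_b, v)
example_free = is_four_cycle_free(columns)
```

Replacing $B_1$ with {3, 7}, so that $A_1$ and $B_1$ share the same differences, violates Constraint 1 and the same test then reports a 4-cycle.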
Regarding the encoding for the new structure, suppose that the two circulant matrices $A_{l-1}$ and $B_l$ are invertible. We can then derive two generator matrices, $G_1$ and $G_2$, in systematic form. The $v(l-2)$ message bits $d$ are treated as $l-2$ blocks of bits, each having the same length $v$. The encoding procedure is partitioned into two steps.
Encoding Step1: The first parity bits $p_1$ are derived from the generator matrix $G_1$ and the message bits $d$. Then, combine the parity bits $p_1$ with the message bits $d$ to form an intermediate codeword $c'$, where $c' = [d, p_1]$.
Encoding Step2: The last parity bits $p_2$ can be derived from the generator matrix $G_2$ and the intermediate codeword $c'$. That is,
$$p_2 = c' \times G_2. \qquad (3.8)$$
In fact, the encoding procedure for the proposed structure is very similar to that of the quasi-cyclic code discussed in section 2.3. The parity bits $p_1$ can be generated with linear complexity by using a shift register of size $v(l-2)$, whereas random codes are encoded via matrix multiplication. For example, Encoding Step1 requires $v\alpha_1$ binary operations, where $\alpha_1$ is one less than the column weight of $G_1$, while matrix multiplication requires $v[2v(l-2)-1]$ binary operations. Similarly, the parity bits $p_2$ can be obtained by using a shift register of size $v(l-1)$ that needs $v\alpha_2$ binary operations, where $\alpha_2$ is one less than the column weight of $G_2$. Since the encoding complexities of Encoding Step1 and Encoding Step2 are linear functions of the code length, so is the total encoding complexity of the proposed structure, which can be implemented with shift registers and some combinational logic.
Chapter 4
Simulation Results
This chapter first compares the error correction performance of different parity-check matrix structures: the irregular quasi-cyclic code, the randomly constructed code and the proposed irregular code. It then compares the error correction performance of different decoding algorithms: the sum-product algorithm, the min-sum based algorithm and the min-sum algorithm. Finally, it analyzes the finite-precision effects on the decoding performance and chooses suitable finite word lengths for the variables, considering the tradeoff between performance and hardware cost.
Before proceeding to the simulations, some parameters should be described:
1: The polynomials of each of the circulant matrices of the proposed LDPC code structure are shown in Table 4.1. Three irregular LDPC codes with the proposed structure have been constructed. When the rate is 2/3 and the code length is 720 with degree distribution W=[4, 4, 4, 4, 5, 3], the parity-check matrix is of the form
$$H = \begin{bmatrix} A_5 & A_6 & A_7 & A_8 & A_9 & 0 \\ B_5 & B_6 & B_7 & B_8 & B_9 & B_{10} \end{bmatrix} \qquad (4.1)$$
where $A_5, A_6, \ldots, A_9, B_5, B_6, \ldots, B_9$ and $B_{10}$ are $120 \times 120$ circulant matrices. When the rate is 3/4 and the code length is 960 with degree distribution W=[4, 4, 4, 4, 4, 4, 5, 3], the parity-check matrix is of the form (3.4) with $l = 8$. When the rate is 4/5 and the code length is 1200 with degree distribution W=[..., 5, 3], the parity-check matrix is of the form (3.4) with $l = 10$.
Table 4.1 Polynomials of each of the circulant matrices of the proposed LDPC code structure
2: The polynomials of each of the circulant matrices of the irregular quasi-cyclic codes are shown in Table 4.2. Three quasi-cyclic irregular LDPC codes have been constructed. When the rate is 2/3 and the code length is 720 with degree distribution W=[4, 5, 3], the parity-check matrix is of the form
$$H = [A_3 \;\; A_4 \;\; A_5] \qquad (4.4)$$
where $A_3$, $A_4$ and $A_5$ are $240 \times 240$ circulant matrices. When the rate is 3/4 and the code length is 960 with degree distribution W=[4, 4, 5, 3], the parity-check matrix is of the form
$$H = [A_2 \;\; A_3 \;\; A_4 \;\; A_5] \qquad (4.5)$$
where $A_2$, $A_3$, $A_4$ and $A_5$ are $240 \times 240$ circulant matrices. When the rate is 4/5 and the code length is 1200 with degree distribution W=[4, 4, 4, 5, 3], the parity-check matrix is of the form
$$H = [A_1 \;\; A_2 \;\; A_3 \;\; A_4 \;\; A_5] \qquad (4.6)$$
where $A_1$, $A_2$, $A_3$, $A_4$ and $A_5$ are $240 \times 240$ circulant matrices.
Table 4.2 Polynomials of each of the circulant matrices of the quasi-cyclic irregular LDPC codes
$a_1(x)$: $1 + x^{3} + x^{21} + x^{45}$
$a_2(x)$: $x^{3} + x^{43} + x^{84} + x^{101}$
$a_3(x)$: $x + x^{51} + x^{57} + x^{65}$
$a_4(x)$: $x^{2} + x^{6} + x^{11} + x^{18} + x^{33}$
$a_5(x)$: $1 + x^{10} + x^{30}$
3: The randomly constructed codes are derived from [14] and [15]; they have a regular column weight of four and parameters similar to those of the other codes. That is, for a rate of 2/3 and a code length of 720 with a random structure, the column weight is four and the average row weight is twelve. Similarly, for a rate of 3/4 and a code length of 960, the column weight is four and the average row weight is sixteen. Finally, for a rate of 4/5 and a code length of 1200, the column weight is four and the average row weight is twenty.
4: For the decoding algorithm, we adopt the sum-product algorithm, the min-sum based algorithm and the min-sum algorithm. The maximum number of iterations is 10.
5: We use the AWGN channel and BPSK modulation method as our test environment.
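A test environment of this kind can be sketched as follows. Several details are assumptions rather than facts from the text: SNR is read as $E_b/N_0$ in dB, bits are mapped as 0 to +1 and 1 to −1, and the channel output is converted to log-likelihood ratios as $2y/\sigma^2$. The function names are hypothetical.

```python
import math
import random


def noise_sigma(snr_db, rate):
    """Noise standard deviation for unit-energy BPSK, assuming SNR means Eb/N0 in dB."""
    ebn0 = 10.0 ** (snr_db / 10.0)
    return math.sqrt(1.0 / (2.0 * rate * ebn0))


def bpsk_awgn_llrs(bits, snr_db, rate, rng):
    """Send bits over AWGN (assumed mapping 0 -> +1, 1 -> -1) and return channel LLRs."""
    sigma = noise_sigma(snr_db, rate)
    received = [(1.0 - 2.0 * bit) + rng.gauss(0.0, sigma) for bit in bits]
    return [2.0 * y / sigma ** 2 for y in received]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# The all-zero word is a codeword of any linear code, so it can serve as test input.
llrs = bpsk_awgn_llrs([0] * 720, snr_db=3.0, rate=2.0 / 3.0, rng=rng)
raw_errors = sum(1 for value in llrs if value < 0)
```

A bit error rate estimate would then come from decoding many such frames and counting the bits that remain wrong after at most 10 iterations.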
4.1 Floating-Point Simulations
Figures 4.1-4.3 show the error correction performance of the different parity-check matrix structures when the sum-product algorithm is used for iterative decoding. In all three figures, the proposed structure achieves the best decoding performance compared with the irregular quasi-cyclic codes and the randomly constructed codes. Figures 4.4-4.6 show the error correction performance of the proposed parity-check matrix structure under the different decoding algorithms (the sum-product algorithm, the min-sum based algorithm and the min-sum algorithm) for several code lengths and code rates. In Figures 4.4-4.6, the decoding performances of the sum-product and the min-sum based algorithms are almost the same. The min-sum algorithm has the worst performance of the compared algorithms, because its check node update is an approximation, which causes a performance penalty of about 0.5dB.
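The effect of the approximation can be seen on a single core operation. The sketch below compares an exact pairwise check node operation, written as a sign-min term plus logarithmic correction terms, with the min-sum form that drops the correction. The function names are hypothetical, and how the min-sum based algorithm of section 2.4.2 relates to this correction is not assumed here.

```python
import math


def chk_exact(a, b):
    """Exact pairwise operation: sign-min term plus two correction terms."""
    sign = math.copysign(1.0, a) * math.copysign(1.0, b)
    correction = (math.log1p(math.exp(-abs(a + b)))
                  - math.log1p(math.exp(-abs(a - b))))
    return sign * min(abs(a), abs(b)) + correction


def chk_min_sum(a, b):
    """Min-sum approximation: keep only the sign-min term."""
    sign = math.copysign(1.0, a) * math.copysign(1.0, b)
    return sign * min(abs(a), abs(b))


pairs = [(1.0, 1.0), (1.0, -1.0), (0.3, 2.5), (-4.0, -0.7)]
gaps = [abs(chk_min_sum(a, b)) - abs(chk_exact(a, b)) for a, b in pairs]
```

The gap is never negative: the min-sum form overestimates the reliability of the outgoing message, which is consistent with the performance loss reported above.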
[Plot: BER versus SNR, code rate 2/3; curves: random structure, irregular quasi-cyclic, proposed structure]
Figure 4.1 Floating-point simulations of various parity-check matrix structures in AWGN channel, code length=720, code rate=2/3, maximum iteration=10, using the sum-product algorithm
[Plot: BER versus SNR, code rate 3/4; curves: random structure, irregular quasi-cyclic, proposed structure]
Figure 4.2 Floating-point simulations of various parity-check matrix structures in AWGN channel, code length=960, code rate=3/4, maximum iteration=10, using the sum-product algorithm
[Plot: BER versus SNR, code rate 4/5; curves: random structure, irregular quasi-cyclic, proposed structure]
Figure 4.3 Floating-point simulations of various parity-check matrix structures in AWGN channel, code length=1200, code rate=4/5, maximum iteration=10, using the sum-product algorithm
[Plot: BER versus SNR, code rate 2/3; curves: sum-product algorithm, min-sum based algorithm, min-sum algorithm]
Figure 4.4 Floating-point simulations of the proposed parity-check matrix structure, under the three decoding algorithms in AWGN channel, code length=720, code rate=2/3, maximum iteration=10
[Plot: BER versus SNR, code rate 3/4; curves: sum-product algorithm, min-sum based algorithm, min-sum algorithm]
Figure 4.5 Floating-point simulations of the proposed parity-check matrix structure, under the three decoding algorithms in AWGN channel, code length=960, code rate=3/4, maximum iteration=10
[Plot: BER versus SNR, code rate 4/5; curves: sum-product algorithm, min-sum based algorithm, min-sum algorithm]
Figure 4.6 Floating-point simulations of the proposed parity-check matrix structure, under the three decoding algorithms in AWGN channel, code length=1200, code rate=4/5, maximum iteration=10
4.2 Fixed-Point Simulations
In this section, we further analyze the finite-word-length performance of the proposed LDPC codes and discuss the possible tradeoff between hardware complexity and decoding performance. It is shown that the performance degradation relative to infinite precision is negligible if 6 bits are used for the initially received signal and 6 bits for the extrinsic messages $r_{m,l}$ and $q_{m,l}$.
4.2.1 Quantization of Initially Received Signal
We first consider the quantization of the initially received signal. Since a receiving buffer is needed to store the received signal, its quantization significantly affects the total decoder complexity. A long word length not only increases the hardware overhead of the buffers but also requires more hardware for the iterative decoding computation, while a short word length may result in very poor performance. Let [t:f] denote the quantization scheme in which a total of t bits are used, of which f bits are used for the fractional part of the value. The quantization schemes [5:2], [6:2] and [7:3] for the initially received signal are investigated here. It should be noted that if the min-sum based algorithm is used for iterative decoding, the quantized initially received signal cannot be 0: when the quantized signal is 0, the results of the check node update operation will also be 0 and the decoder loses its error correction ability. Therefore, when the min-sum based algorithm is adopted, the quantized signal is restricted to a specified minimum magnitude whenever the initially received signal is close to 0. For the [5:2] and [6:2] schemes the minimum quantized values are ±0.25, and for the [7:3] scheme they are ±0.125. Figures 4.7-4.12 show the decoding performance of these three quantization schemes for various code lengths. It can be seen that the difference between the [6:2] and [7:3] quantization schemes is quite small, while the [5:2] scheme is worse than both by more than 0.2dB. Thus the [6:2] scheme is the best choice.
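A [t:f] quantizer with the nonzero restriction can be sketched as follows. The sketch assumes that the t bits include the sign bit and use a two's complement range with saturation, and that values are rounded to the nearest level; none of these details is stated in the text. The function name and the `avoid_zero` flag are hypothetical.

```python
import math


def quantize(value, total_bits, frac_bits, avoid_zero=False):
    """Quantize to the signed fixed-point grid [t:f] with saturation (assumed format)."""
    step = 2.0 ** -frac_bits
    max_code = 2 ** (total_bits - 1) - 1
    min_code = -(2 ** (total_bits - 1))
    code = math.floor(value / step + 0.5)      # round to nearest, ties upward
    code = max(min_code, min(max_code, code))  # saturate at the range limits
    if avoid_zero and code == 0:
        # Restrict to the smallest nonzero level, keeping the sign of the input.
        code = 1 if value >= 0 else -1
    return code * step


near_zero_6_2 = quantize(0.05, 6, 2, avoid_zero=True)
near_zero_7_3 = quantize(-0.05, 7, 3, avoid_zero=True)
```

With `avoid_zero=True`, inputs close to zero map to ±0.25 for two fractional bits and to ±0.125 for three, matching the minimum values quoted above.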
[Plot: BER versus SNR, code rate 2/3; curves: [5:2], [6:2], [7:3]]
Figure 4.7 Three different fixed-point simulation results of the proposed parity-check matrix structure, based on the sum-product decoding algorithm in AWGN channel, code length=720, code rate=2/3, maximum iteration=10
[Plot: BER versus SNR, code rate 2/3; curves: [5:2], [6:2], [7:3]]
Figure 4.8 Three different fixed-point simulation results of the proposed parity-check matrix structure, based on the min-sum based decoding algorithm in AWGN channel, code length=720, code rate=2/3, maximum iteration=10
[Plot: BER versus SNR, code rate 3/4; curves: [5:2], [6:2], [7:3]]
Figure 4.9 Three different fixed-point simulation results of the proposed parity-check matrix structure, based on the sum-product decoding algorithm in AWGN channel, code length=960, code rate=3/4, maximum iteration=10
[Plot: BER versus SNR, code rate 3/4; curves: [5:2], [6:2], [7:3]]
Figure 4.10 Three different fixed-point simulation results of the proposed parity-check matrix structure, based on the min-sum based decoding algorithm in AWGN channel, code length=960, code rate=3/4, maximum iteration=10
[Plot: BER versus SNR, code rate 4/5; curves: [5:2], [6:2], [7:3]]
Figure 4.11 Three different fixed-point simulation results of the proposed parity-check matrix structure, based on the sum-product decoding algorithm in AWGN channel, code length=1200, code rate=4/5, maximum iteration=10
[Plot: BER versus SNR, code rate 4/5; curves: [5:2], [6:2], [7:3]]
Figure 4.12 Three different fixed-point simulation results of the proposed parity-check matrix structure, based on the min-sum based decoding algorithm in AWGN channel, code length=1200, code rate=4/5, maximum iteration=10
4.2.2 Quantization of $r_{m,l}$ and $q_{m,l}$
The whole decoding process mainly consists of iteratively exchanging and updating the extrinsic messages $r_{m,l}$ and $q_{m,l}$, performed by the check node update operations and the variable node update operations, respectively. Therefore, quantization of $r_{m,l}$ and $q_{m,l}$ is also critical for hardware implementation. The quantization schemes [6:2] and [7:3] for the extrinsic messages $r_{m,l}$ and $q_{m,l}$ have been examined in this work. It turns out that there is almost no difference in decoding performance between the [6:2] and [7:3] quantization schemes. Simulation results for these schemes with various code lengths are shown in Figures 4.13-4.18. Thus we suggest the [6:2] scheme as the best choice.
[Plot: BER versus SNR, code rate 2/3; curves: [6:2], [7:3]]
Figure 4.13 Two different fixed-point simulation results of the proposed parity-check matrix structure, based on the sum-product decoding algorithm in AWGN channel, code length=720, code rate=2/3, maximum iteration=10
[Plot: BER versus SNR, code rate 2/3; curves: [6:2], [7:3]]
Figure 4.14 Two different fixed-point simulation results of the proposed parity-check matrix structure, based on the min-sum based decoding algorithm in AWGN channel, code length=720, code rate=2/3, maximum iteration=10
[Plot: BER versus SNR, code rate 3/4; curves: [6:2], [7:3]]
Figure 4.15 Two different fixed-point simulation results of the proposed parity-check matrix structure, based on the sum-product decoding algorithm in AWGN channel, code length=960, code rate=3/4, maximum iteration=10
[Plot: BER versus SNR, code rate 3/4; curves: [6:2], [7:3]]
Figure 4.16 Two different fixed-point simulation results of the proposed parity-check matrix structure, based on the min-sum based decoding algorithm in AWGN channel, code length=960, code rate=3/4, maximum iteration=10
[Plot: BER versus SNR, code rate 4/5; curves: [6:2], [7:3]]
Figure 4.17 Two different fixed-point simulation results of the proposed parity-check matrix structure, based on the sum-product decoding algorithm in AWGN channel, code length=1200, code rate=4/5, maximum iteration=10
[Plot: BER versus SNR, code rate 4/5; curves: [6:2], [7:3]]
Figure 4.18 Two different fixed-point simulation results of the proposed parity-check