A Structured LDPC Code Construction for Efficient Encoder Design


Jean-Baptiste Doré, Marie-Hélène Hamon and Pierre Pénard

France Telecom R&D, 4 rue du clos courtel, 35512 Cesson-Sévigné, France {jeanbaptiste.dore, mhelene.hamon, pierre.penard}@rd.francetelecom.com

Abstract: Low-Density Parity Check codes have been widely investigated over the past few years, and the construction of Low-Density Parity Check codes enabling efficient implementation design represents one of the main topics of interest. This article proposes a new code family, based on a structured code construction with an inherent parallelism. This construction divides the parity check matrix into layers which can be processed simultaneously, improving encoding throughput and latency. Possible optimizations and examples are proposed and illustrated.

I. INTRODUCTION

Low-Density Parity Check (LDPC) codes have raised significant interest in the scientific community in the past few years and are now increasingly considered for introduction into communication systems [1].

This evolution towards industrialization motivates the design of LDPC codes suitable for hardware design, especially at the encoder side. Some of the key requirements for an efficient design include parallelism, as the race to higher throughput calls for simultaneous processing of the information, and design flexibility, in order to avoid duplication of material for each configuration. We propose here a structured LDPC code construction satisfying these constraints, thanks to an inherent parallelism. We focus on medium code sizes (a few kilobits), which are suitable for hardware design and largely considered for introduction into wireless communication systems. After reviewing some existing design rules and a general construction for hardware-efficient LDPC codes, we propose a new code construction enabling parallel processing of the parity information, and then optimize this code structure through an appropriate choice of the constituent codes for improved performance. Finally, we illustrate this construction method.

II. CONSTRUCTION OF ENCODER HARDWARE EFFICIENT LDPC CODES

This section describes LDPC codes suitable for simple and flexible hardware design. After reviewing some basic design rules, we propose a code structure enabling an efficient encoding scheme.

A. Design rules

Design of hardware efficient LDPC codes requires some constraints on the code structure. Design aspects have to be taken into account from the beginning of the code design process. One of the first constraints is related to the check and variable node degrees, which should be restricted to a small set of values to enable a simple decoder architecture. This design rule allows an efficient reuse of node processors during the decoding process. Another requirement constraining the design of the code is parallelism. Indeed, parallel encoding and decoding architectures are crucial to ensure high throughputs. The LDPC code structure should be suited to such architectures.

Deterministic constructions are potentially good candidates [1]. For long code lengths, random or pseudorandom constructions of irregular LDPC codes have been shown to closely approach the theoretical limits for the Additive White Gaussian Noise (AWGN) channel [2]. On the other hand, for medium LDPC code sizes, deterministic constructions can outperform random ones [3]. Several authors have proposed additional design rules. The most important one consists in avoiding short cycles in the code graph (at least length-four cycles, which are the shortest possible cycles). Particular attention should be paid to short cycles involving only degree-two variable nodes [4]. According to [4], degree-two variable nodes should correspond to the parity portion of the codeword. Indeed, low degree variable nodes converge more slowly than variable nodes with larger degrees [5]. However, as shown in [4], the presence of degree-two variable nodes is preferable to optimize the irregularity of the code. Actually, low degree variable nodes must be present to balance out the presence of the low degree check nodes, which have fewer opportunities to fail as fewer bits are involved in the parity equation. Therefore, the design aims at optimizing the distance profile of the code.

B. Code structure

The design rules described above motivate the choice of LDPC codes with a parity check matrix which can be represented as:

$$H = [H_s \;\; H_p] \qquad (1)$$

where $H_s$ is a sparse $(N-K) \times K$ matrix containing no weight-two columns and $H_p$ is an $(N-K) \times (N-K)$ triangular matrix. Using the notation $x = [c \;\; p]$, where $c$ is the information word and $p$ is the parity word, $x$ is a codeword if and only if:

$$H x^T = [H_s \;\; H_p] \begin{bmatrix} c^T \\ p^T \end{bmatrix} = 0^T \qquad (2)$$

We denote by $v$ the projection vector:

$$v^T = H_s c^T \qquad (3)$$

Parity bits are computed using:

$$p^T = H_p^{-1} v^T \qquad (4)$$

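$H_p$ is chosen to be the dual-diagonal matrix given by:

$$H_p = \begin{bmatrix} 1 & & & & \\ 1 & 1 & & & \\ & 1 & 1 & & \\ & & \ddots & \ddots & \\ & & & 1 & 1 \end{bmatrix} \qquad (5)$$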

This family of matrices enables very efficient encoding. Indeed, the generator matrix $G$ of the code is provided by:

$$G = \begin{bmatrix} I & H_s^T H_p^{-T} \end{bmatrix} \qquad (6)$$

and it can be easily verified that:

$$H_p^{-T} = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ & 1 & 1 & \cdots & 1 \\ & & 1 & \cdots & 1 \\ & & & \ddots & \vdots \\ 0 & & & & 1 \end{bmatrix} \qquad (7)$$

which corresponds to a differential encoder (accumulator) with transfer function $1/(1 \oplus D)$. Thus the encoder for this class of LDPC codes benefits from a low complexity configuration, as depicted in Figure 1.

Fig. 1. Efficient encoder using dual-diagonal matrix (blocks $H_s^T$ and $1/(1 \oplus D)$; input $c$; outputs $c$ and $p$)

The parity bit computation involves multiplication of the information word $c$ by the low density matrix $H_s^T$, followed by a differential encoding. Thanks to the presence of this accumulator, hardware constraints on the encoding process are reduced. The authors of [6] mentioned that these codes are similar to the systematic version of IRA (Irregular Repeat Accumulate) codes [7]. For this reason, this class of LDPC codes is also referred to as extended IRA (eIRA).
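The encoding chain of equations (3) and (4) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the toy sizes and the random (non-sparse) $H_s$ are assumptions made only for the example. The accumulator is written as a running XOR, which is what multiplication by $H_p^{-1}$ amounts to for the dual-diagonal $H_p$.

```python
import random

def dual_diagonal(m):
    """H_p of equation (5): ones on the diagonal and on the first subdiagonal."""
    return [[1 if j in (i, i - 1) else 0 for j in range(m)] for i in range(m)]

def mat_vec_gf2(H, x):
    """Matrix-vector product over GF(2)."""
    return [sum(h * b for h, b in zip(row, x)) % 2 for row in H]

def encode_eira(H_s, c):
    """Return x = [c p] with v = H_s c^T and p = H_p^{-1} v (accumulator)."""
    v = mat_vec_gf2(H_s, c)            # projection vector, equation (3)
    p, acc = [], 0
    for bit in v:                      # 1/(1 xor D): p_i = v_i xor p_{i-1}
        acc ^= bit
        p.append(acc)
    return list(c) + p

random.seed(0)
K, M = 8, 8                            # toy sizes (hypothetical)
H_s = [[random.randint(0, 1) for _ in range(K)] for _ in range(M)]
c = [random.randint(0, 1) for _ in range(K)]
x = encode_eira(H_s, c)

# Parity check with H = [H_s H_p], equation (2): the syndrome should be all zero.
H = [rs + rp for rs, rp in zip(H_s, dual_diagonal(M))]
syndrome = mat_vec_gf2(H, x)
print(syndrome)
```

The loop in `encode_eira` is inherently serial: each parity bit depends on the previous one, which is the latency issue addressed later in the paper.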

Intrinsically, the sub-matrix $H_p$ is free of short cycles involving only degree-two variable nodes, and therefore only $H_s$ needs to be designed cautiously to avoid these cycles and satisfy the general design rules described earlier.

C. Rate compatibility

An interesting advantage of this matrix construction is the flexibility offered for puncturing. Indeed, dual diagonal matrices enable "good" puncturing patterns.

More generally, puncturing patterns applied to LDPC codes require a careful design in order to avoid dramatic performance losses. A puncturing pattern should leave each check node with at most one erased bit.

The dual diagonal matrix allows a simple puncturing rule, as it is sufficient to puncture the parity bits in an alternate pattern (i.e. eliminating bits in the parity portion such that two adjacent bits are not both erased), until the desired rate is reached. In other words, under this rule, a mother code of rate $R_c$ can be punctured to any rate $R$ within the range:

$$R_c \le R \le \frac{2R_c}{R_c + 1} \qquad (8)$$
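The upper bound in (8) can be checked with a short count. The derivation below assumes that the alternate pattern erases at most every second parity bit, that is at most $(N-K)/2$ of the $N-K$ parity bits, and writes $R_c = K/N$:

```latex
R_{\max} = \frac{K}{K + \frac{N-K}{2}} = \frac{2K}{N+K} = \frac{2R_c}{R_c + 1},
\qquad \text{e.g. } R_c = \tfrac{1}{2} \;\Rightarrow\; R_{\max} = \tfrac{2}{3}.
```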

In order to illustrate the possible losses introduced by various puncturing patterns, an $R = 1/2$, $N = 960$ LDPC code has been simulated with three different puncturing patterns. The transmitted bits $x_t$ are computed using:

$$x_t^T = H_{punct}\, x^T \qquad (9)$$

where $H_{punct}$ is the $N \times N$ puncturing matrix.

Puncturing is applied only to the parity bits, and $H_{punct}$ can therefore be written:

$$H_{punct} = \begin{bmatrix} I & 0 \\ 0 & P \end{bmatrix} \qquad (10)$$

where $P$ is an $(N-K) \times (N-K)$ matrix. Figure 2 compares the performance of three different puncturing patterns defined by puncturing sub-matrices $P_1$, $P_2$ and $P_3$:

$$P_1 = \mathrm{diag}[1\ 0\ 1\ 0\ 1\ \cdots\ 1\ 0\ 1] \qquad (11)$$

$$P_2 = \mathrm{diag}[\mathrm{Random}(0, 1)] \qquad (12)$$

$$P_3 = \mathrm{diag}[0\ 0\ \cdots\ 0\ 1\ 1\ \cdots\ 1] \qquad (13)$$

Pattern $P_2$ is constructed randomly, and performance for this case has been averaged over 10 trials. Each sub-matrix has 240 ones on the diagonal. The simulation results show that the puncturing pattern described above (i.e. $P_1$) achieves better performance than the other cases.
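The rule that a check node should see at most one erased bit can be tested directly on the three patterns. The sketch below is only illustrative. It assumes that a 1 on the diagonal of $P$ means the parity bit is transmitted, and it uses the fact that, in the dual-diagonal $H_p$, check node $i$ is connected to parity bits $i$ and $i-1$. The helper name `erased_per_check` and the seed are hypothetical, and the alternate pattern is written as 1, 0, 1, 0, ... over 480 positions so that it contains exactly 240 ones.

```python
import random

def erased_per_check(keep):
    """For a dual-diagonal H_p, check i involves parity bits i and i-1.
    Return, for each check node, how many of its parity bits are punctured."""
    counts = []
    for i in range(len(keep)):
        neighbours = [i] if i == 0 else [i - 1, i]
        counts.append(sum(1 for j in neighbours if keep[j] == 0))
    return counts

N, K = 960, 480                           # R = 1/2 mother code
M = N - K                                 # number of parity bits
p1 = [1 if i % 2 == 0 else 0 for i in range(M)]   # alternate pattern
p3 = [0] * (M // 2) + [1] * (M // 2)              # one contiguous erased block
random.seed(1)
p2 = p1[:]                                        # random pattern, same weight
random.shuffle(p2)

for name, keep in (("P1", p1), ("P2", p2), ("P3", p3)):
    rate = K / (K + sum(keep))            # information bits are never punctured
    print(name, "kept parity bits:", sum(keep), "rate:", round(rate, 3),
          "max erased bits per check:", max(erased_per_check(keep)))
```

Under these assumptions, only the alternate pattern keeps every check node at one erased parity bit or fewer, while the contiguous pattern leaves many check nodes with both of their parity bits erased.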

D. Proposed encoding structure

As mentioned in the previous sections, the eIRA LDPC codes are very interesting due to their simple encoding scheme. However, the differential encoder computes each parity bit serially, and the latency introduced is proportional to $M$, the number of parity bits. We propose here an encoding structure with an inherent parallelism, enabling a simultaneous processing of the parity bits and resulting in a reduction of the latency, or equivalently in a throughput increase, without introducing additional operations per bit.

Fig. 2. Performance obtained with three different puncturing patterns (Block Error Rate versus $E_b/N_0$ in dB; K = 60 bytes, QPSK, AWGN, 50 iterations, R = 2/3; curves for P1, P2 and P3)

In order to reduce the processing time of encoding, parallelism should be introduced as much as possible in the parity bit computation, without complexity overhead and with an impact on the performance as small as possible. To achieve this goal, we first focused on a structure based on the parallel concatenation of two eIRA LDPC codes. The parity check matrix representation of this new code is:

$$H = \begin{bmatrix} H_1 & 0 & H_p & 0 \\ 0 & H_2 & 0 & H_p \end{bmatrix} \qquad (14)$$

In this example, the first half of the bits is encoded using one code (represented by $[H_1 \; H_p]$), while the other half is encoded with another code (represented by $[H_2 \; H_p]$). The time required for the encoding process is therefore divided by two. However, there is no information exchange between the two codes during the iterative decoding process, as there is no "link" between them, whereas such exchanges generally help improve the performance of "concatenated" schemes. We thus introduce "link" matrices enabling information exchanges between the two codes. To illustrate this, the new parity check matrix can be represented as:

$$H = \begin{bmatrix} H_1 & A_1 & H_p & 0 \\ A_2 & H_2 & 0 & H_p \end{bmatrix} \qquad (15)$$

where $A_1$ and $A_2$ are the "link" matrices. The "link" matrices can be constrained to ensure good performance and a simple encoding process. First, link matrices have to be easily characterizable and should not add significant complexity or latency to the encoding process. The authors of [8] have studied, in another context, the influence of the interconnections between constituent block matrices of an LDPC matrix, and have shown that performance improves with the number of connections between constituent block matrices, up to a given threshold.

As a consequence, link matrices should be full rank and fully interconnect the constituent block matrices in order to improve the message passing during the iterative decoding. Each information variable node of one constituent code is connected to a parity check node of the other constituent code.

Given an arbitrary information vector $c$, the parity vector $p$ is generated using:

$$\begin{bmatrix} H_1 & A_1 & H_p & 0 \\ A_2 & H_2 & 0 & H_p \end{bmatrix} \begin{bmatrix} c^T \\ p^T \end{bmatrix} = 0^T \qquad (16)$$

Using the notation $c = [c_1 \;\; c_2]$ and $p = [p_1 \;\; p_2]$:

$$p_1^T = H_p^{-1} \left( H_1 c_1^T + A_1 c_2^T \right) \qquad (17)$$

$$p_2^T = H_p^{-1} \left( H_2 c_2^T + A_2 c_1^T \right) \qquad (18)$$

The choice of link matrices is a degree of freedom in this construction. To respect the hardware constraints, $H_i$ and $A_i$ should be chosen judiciously. For the special case in which link matrices are both identity matrices, an efficient encoder is represented in Figure 3.

Fig. 3. Proposed encoder when the link matrices are identity matrices (blocks $H_1^T$, $H_2^T$ and two accumulators $1/(1 \oplus D)$; inputs $c_1$, $c_2$; outputs $c_1$, $c_2$, $p_1$, $p_2$)

Parity vectors $p_1$ and $p_2$ are computed in parallel using sub-vectors $c_1$ and $c_2$. This encoder allows a parallel computation of the parity bits, which reduces the latency of the encoding process. Indeed, the encoding time can be reduced by a factor of two without noticeably increasing the number of operations per bit.
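The two-branch computation of equations (17) and (18) can be sketched as follows for the identity-link case. This is a hedged illustration with toy sizes. The helper names, the random constituent matrices and the seed are assumptions, and the branches are simply evaluated one after the other here, although nothing in one branch depends on the other.

```python
import random

def mat_vec_gf2(H, x):
    """Matrix-vector product over GF(2)."""
    return [sum(h * b for h, b in zip(row, x)) % 2 for row in H]

def accumulate(v):
    """Differential encoder 1/(1 xor D), i.e. multiplication by H_p^{-1}."""
    out, acc = [], 0
    for bit in v:
        acc ^= bit
        out.append(acc)
    return out

def encode_branch(H_own, c_own, c_other):
    """One branch of equations (17)-(18) with an identity link matrix."""
    v = [a ^ b for a, b in zip(mat_vec_gf2(H_own, c_own), c_other)]
    return accumulate(v)

def rand_matrix(m):
    return [[random.randint(0, 1) for _ in range(m)] for _ in range(m)]

random.seed(0)
m = 6                                     # toy size: each H_i is m x m (hypothetical)
H1, H2 = rand_matrix(m), rand_matrix(m)
c1 = [random.randint(0, 1) for _ in range(m)]
c2 = [random.randint(0, 1) for _ in range(m)]

# The two branches share no state, so they can run concurrently.
p1 = encode_branch(H1, c1, c2)
p2 = encode_branch(H2, c2, c1)
x = c1 + c2 + p1 + p2

# Build H of equation (15) with A1 = A2 = I and check that H x^T = 0.
I = [[int(i == j) for j in range(m)] for i in range(m)]
Z = [[0] * m for _ in range(m)]
Hp = [[1 if j in (i, i - 1) else 0 for j in range(m)] for i in range(m)]
top = [r1 + r2 + r3 + r4 for r1, r2, r3, r4 in zip(H1, I, Hp, Z)]
bottom = [r1 + r2 + r3 + r4 for r1, r2, r3, r4 in zip(I, H2, Z, Hp)]
syndrome = mat_vec_gf2(top + bottom, x)
print(syndrome)
```

Each accumulator now runs over half of the parity bits, which is where the factor-of-two reduction in serial steps comes from. With identity links the only extra work per branch is one XOR per projection bit.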

This design enables a natural division of the parity-check matrix into layers which can be processed simultaneously, and therefore improves the encoding throughput and/or decreases the latency. The selection of the constituent matrices is essential for the resulting performance, and the next section proposes optimizations of this structure.

III. OPTIMIZATION OF THE PROPOSED STRUCTURE

The LDPC code proposed in the previous section is interesting from the viewpoint of design. We now focus on possible optimizations of these codes in order to improve performance. We propose a method to select LDPC codes with good properties (i.e. no short cycles and a good distance profile).


A. Elimination of short cycles

First we assume that each sub-matrix $H_i$ can be described by a set of parameters $P$. We describe here an algorithm eliminating short cycles of length four in the proposed structure. In the first step of the algorithm, an exhaustive search on all possible sets of parameters (i.e. on all possible code matrices) is performed and yields the set $C_1$ of parameters $P$ defining code matrices $H_i$ such that the global graph defined by the matrix $[H_i \; H_p]$ has no length-four cycles. Then, in a second step, a search is performed on the set $C_1$ to determine a set $C_2$ of pairs of parameter sets $(P_i, P_j)$, both taken from $C_1$, where $P_i$ and $P_j$ determine respectively the constituent matrices $H_i$ and $H_j$, guaranteeing that the graph defined by the matrix:

$$H_s = \begin{bmatrix} H_i & A_i \\ A_j & H_j \end{bmatrix}$$

has no length-four cycle. In those two steps, the search is restricted to the systematic part of the parity-check matrix. Indeed, if there are no length-four cycles in the graph defined by the matrix $[H_i \; H_p]$, there will be no length-four cycles in the graph defined by the matrix $[H_i \; I \; H_p]$. The proof is straightforward: a length-four cycle can only involve variable nodes with at least two connections. Therefore adding variable nodes with one connection will not introduce new length-four cycles, and the search can be restricted to the systematic portion of the matrix. The proposed method eliminates all length-four cycles, but not cycles of larger length.
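The test applied at each step of such a search can be sketched as a column-overlap check: a length-four cycle exists exactly when two columns of the matrix have ones in two or more common rows. The code below is a minimal illustration of that test and of the weight-one column argument. The function names and the two hand-made 4 x 4 matrices are hypothetical and are not Pi-rotation matrices.

```python
from itertools import combinations

def has_length_four_cycle(H):
    """True if two columns of H share ones in at least two rows."""
    supports = [{i for i, row in enumerate(H) if row[j]}
                for j in range(len(H[0]))]
    return any(len(a & b) >= 2 for a, b in combinations(supports, 2))

def hstack(*blocks):
    """Concatenate matrices with the same number of rows side by side."""
    return [sum(rows, []) for rows in zip(*blocks)]

def identity(m):
    return [[int(i == j) for j in range(m)] for i in range(m)]

def dual_diagonal(m):
    return [[1 if j in (i, i - 1) else 0 for j in range(m)] for i in range(m)]

# Toy constituent matrices (hypothetical, chosen by hand)
H_good = [[1, 0, 0, 1],
          [0, 1, 0, 0],
          [0, 0, 1, 0],
          [0, 1, 0, 1]]
H_bad = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]

Hp = dual_diagonal(4)
print(has_length_four_cycle(hstack(H_good, Hp)))               # candidate kept
print(has_length_four_cycle(hstack(H_good, identity(4), Hp)))  # identity adds no cycle
print(has_length_four_cycle(hstack(H_bad, Hp)))                # candidate rejected
```

An exhaustive search in the spirit of the first step would call `has_length_four_cycle` once per candidate parameter set and keep the candidates for which it returns `False`.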

B. Search for good distance profiles

The last step of the algorithm consists in maximizing the distance profile of the LDPC codes obtained through the first steps. Several authors have focused on the minimum distance problem of linear error correcting codes. In [9], the authors first introduced the error impulse method to optimize the design of Turbo codes. The error impulse method was first extended to LDPC codes in [10]. More recently, the authors of [11] proposed a modified error impulse method for LDPC codes with the dual diagonal structure. This method uses the property that the differential encoder structure will generate its lowest weight codewords primarily when information vectors at the input of the encoder have low weight. This can be shown analytically through a computation of the weight probability function of the code. The number of codewords with weight $w$ at the input of the differential encoder and weight $h$ at the output is given by [12]:

$$A_{w,h} = \binom{N-K-h}{\lfloor w/2 \rfloor} \binom{h-1}{\lceil w/2 \rceil - 1} \qquad (19)$$

where

$$\binom{x}{y} = \frac{x!}{(x-y)!\, y!} \qquad (20)$$
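Equation (19) can be checked numerically on a short accumulator. The sketch below reads the two halves of $w$ as $\lfloor w/2 \rfloor$ and $\lceil w/2 \rceil - 1$, which is needed for odd $w$, and compares the closed form with a brute-force count over all inputs of a given weight. The function names and the length `n = 6` (standing for $N-K$) are assumptions made for the example.

```python
from itertools import combinations
from math import comb

def accumulator_iowe(n, w, h):
    """Closed form of equation (19) for an accumulator of length n = N - K."""
    if w == 0:
        return 1 if h == 0 else 0
    if h < 1 or h > n:
        return 0
    return comb(n - h, w // 2) * comb(h - 1, (w + 1) // 2 - 1)

def brute_force_iowe(n, w, h):
    """Count weight-w inputs whose accumulated output has weight h."""
    count = 0
    for ones in combinations(range(n), w):
        acc, weight = 0, 0
        for i in range(n):
            acc ^= int(i in ones)      # running XOR of the input bits
            weight += acc
        count += int(weight == h)
    return count

n = 6
agree = all(accumulator_iowe(n, w, h) == brute_force_iowe(n, w, h)
            for w in range(1, n + 1) for h in range(0, n + 1))
print(agree)

# Conditional probability Pr(h | w), assuming all weight-w inputs are equally likely
pr_h_given_w = [accumulator_iowe(n, 2, h) / comb(n, 2) for h in range(n + 1)]
print(pr_h_given_w)
```

Under the equally-likely assumption stated in the comment, dividing $A_{w,h}$ by $\binom{N-K}{w}$ gives the term $\Pr(h \mid w)$ used when the input weight distribution is known.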

With the knowledge of the distribution $\Pr(w \mid s)$, where $s$ represents the weight of the information vector, the distribution $\Pr(h \mid s)$ can be computed with:

$$\Pr(h \mid s) = \sum_{w} \Pr(w \mid s) \Pr(h \mid w) \qquad (21)$$

This method is employed by the author of [13] to optimize Pi-rotation matrices. When applying this method to the proposed structure, it appeared that low weight information words have the largest probabilities of producing low weight parity words. For example, if the $H_i$ are Pi-rotation matrices and the $A_i$ are identity matrices, the distribution $\Pr(h \mid s)$ is illustrated in Figure 4.

Fig. 4. Weight probability density function for the proposed structure using Pi-rotation matrix, N = 384, R = 1/2 (probability $P(h+s \mid s)$ versus codeword weight $s+h$; theoretical and simulated curves for $s = 1$ and $s = 2$)

As a consequence, the last step of the algorithm consists in determining LDPC codes maximizing the weight of codewords generated from weight-one and weight-two information words. It would be advisable to continue for weights higher than two, but our investigations have shown the viability of the proposed method for the considered structure.
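A direct way to picture this selection step is to enumerate every information word of weight one and two, encode it, and keep the candidate whose lightest resulting codeword is heaviest. The sketch below does this by exhaustive enumeration on toy matrices. It is not the error impulse method of [9]-[11], and the function names and the two 6 x 3 candidates are hypothetical.

```python
from itertools import combinations

def parity_weight(H_s, ones):
    """Weight of the parity word for an information word with ones at `ones`."""
    acc, weight = 0, 0
    for row in H_s:
        v = sum(row[j] for j in ones) % 2   # one bit of the projection vector
        acc ^= v                            # accumulator output bit
        weight += acc
    return weight

def min_low_weight_codeword(H_s, max_info_weight=2):
    """Smallest codeword weight over information words of weight 1..max_info_weight."""
    k = len(H_s[0])
    return min(s + parity_weight(H_s, ones)
               for s in range(1, max_info_weight + 1)
               for ones in combinations(range(k), s))

def best_candidate(candidates):
    """Pick the candidate H_s with the largest low-weight distance metric."""
    return max(candidates, key=min_low_weight_codeword)

# Two toy candidates with 6 parity checks and 3 information bits (hypothetical)
H_a = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
H_b = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(min_low_weight_codeword(H_a), min_low_weight_codeword(H_b))
print(best_candidate([H_a, H_b]) is H_b)
```

In this toy case the weight-two columns of `H_b` close the accumulator run after three steps, whereas adjacent weight-one columns of `H_a` combine into a very light codeword, so `H_b` is preferred.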

IV. ILLUSTRATION

In order to illustrate the proposed method, we applied the algorithm to design codes based on Pi-rotation matrices.

A. Pi rotation matrix

Pi rotation LDPC codes were first introduced in [13].

Pi rotation codes are a class of deterministic LDPC codes which benefit from an original description and an efficient encoding scheme. The parity check matrix is decomposed into two sub-matrices, $H = [H_\pi \;\; H_p]$.

$H_p$ is an $(N-K) \times (N-K)$ dual diagonal square matrix. $H_\pi$ is composed of a $q \times t$ array of $m \times m$ permutation matrices. The construction of $H_\pi$ is based on a single permutation matrix $\pi_A$. The author of [13] proposed to create four permutation matrices from the four rotational orientations of $\pi_A$, labelled $\pi_A$, $\pi_B$, $\pi_C$ and $\pi_D$. The original permutation $\pi_A$ is described by a permutation vector generated by a key $[m, a, b]$ [13].

For example, using the $m = 3$ permutation vector $[1\ 3\ 2]$, we obtain the following four $\pi$ rotations:


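$$\pi_A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}, \quad \pi_B = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \pi_C = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \quad \pi_D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$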

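Arranging them as follows, rate 1/2 codes can be built:

$$H_\pi = \begin{bmatrix} \pi_A & \pi_B & \pi_C & \pi_D \\ \pi_B & \pi_C & \pi_D & \pi_A \\ \pi_C & \pi_D & \pi_A & \pi_B \\ \pi_D & \pi_A & \pi_B & \pi_C \end{bmatrix}$$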

As shown in [13], this structure benefits from a simple and hardware-efficient encoding process, well suited to compact hardware design.
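The $m = 3$ example can be reproduced in a few lines. The four matrices listed above are consistent with successive 90 degree counterclockwise rotations of $\pi_A$, and the sketch below assumes that reading. It starts from the $\pi_A$ given in the example and does not implement the $[m, a, b]$ key generation of [13]; the helper names are hypothetical.

```python
def rot90_ccw(M):
    """Rotate a square matrix by 90 degrees counterclockwise."""
    n = len(M)
    return [[M[j][n - 1 - i] for j in range(n)] for i in range(n)]

def block_matrix(blocks):
    """Assemble a matrix from a 2-D list of equally sized square blocks."""
    return [sum((blk[r] for blk in block_row), [])
            for block_row in blocks
            for r in range(len(block_row[0]))]

pi_A = [[0, 1, 0],
        [0, 0, 1],
        [1, 0, 0]]
pi_B = rot90_ccw(pi_A)
pi_C = rot90_ccw(pi_B)
pi_D = rot90_ccw(pi_C)

H_pi = block_matrix([[pi_A, pi_B, pi_C, pi_D],
                     [pi_B, pi_C, pi_D, pi_A],
                     [pi_C, pi_D, pi_A, pi_B],
                     [pi_D, pi_A, pi_B, pi_C]])

for row in H_pi:
    print(row)
```

Because every block is a permutation matrix, each row and each column of the resulting $4m \times 4m$ matrix has weight four, and appending a $4m \times 4m$ dual-diagonal $H_p$ gives a rate 1/2 code.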

B. Example

As an example we designed an LDPC code with the following parameters: K = 60 bytes, R = 1/2. $H_1$ and $H_2$ are chosen to be Pi-rotation matrices, while the link matrices are both identity matrices. The encoder for such a code is the one illustrated in Figure 3. We first specified the parameter $m$ corresponding to the size of the Pi-rotation pattern. Using the algorithm described above, we selected the set of parameters $[m, a, b, c, d]$ which provides the best distance profile, where $[a, b]$ (resp. $[c, d]$) are the Pi-rotation parameters of $H_1$ (resp. $H_2$). Figure 5 depicts some simulation results on the AWGN channel. The codes have been simulated with 50 iterations of the classical Belief Propagation algorithm.

Fig. 5. Performance comparison of the proposed construction with Pi-rotation codes (Block Error Rate versus $E_b/N_0$ in dB; R = 1/2, QPSK, AWGN, 50 iterations; curves: proposed construction, K = 480; (3,6) MacKay, K = 504; Pi-rotation [126, 18, 119], K = 504)

Various coding rates can be obtained with the puncturing patterns described in previous sections. Codes with rates $R = 3/5$ and $R = 2/3$ have been generated from the mother code of rate $R = 1/2$. Performance is illustrated in Figure 6.

Fig. 6. Performance comparison for various coding rates (Block Error Rate versus $E_b/N_0$ in dB; K = 60 bytes, QPSK, AWGN, 50 iterations; curves for R = 1/2, R = 3/5 and R = 2/3)

The proposed structure using Pi-rotation achieves performance similar to that of the classical Pi-rotation LDPC codes, but the encoding latency is divided by two thanks to the inherent parallelism of the proposed construction (i.e. two codes operating in parallel). This parallelism can also be employed to double the encoding throughput.

This results in a very efficient encoder taking advantage of both the parallelism of the general construction and the compact hardware design enabled by Pi-rotation constituent codes.

V. CONCLUSION

We have proposed here a new LDPC code construction enabling efficient hardware design of the encoder thanks to an inherent parallelism. Indeed, the method divides the parity-check matrix into layers which can be processed simultaneously, and therefore improves the encoding throughput and/or decreases the latency. The selection of the constituent matrices is essential for the resulting performance, and can be performed with the algorithm presented in this paper.

An illustration of the proposed code structure with Pi-rotation matrices as constituent matrices has been provided, but this construction is highly generic and other combinations could be explored. This original code construction may also offer opportunities for an efficient decoder design which would take advantage of the structure.

REFERENCES

[1] Draft Amendment to IEEE Standard for Local and Metropolitan Area Networks - Part 16: Air Interface for Fixed Broadband Wireless Access Systems - Physical and Medium Access Control Layers for Combined Fixed and Mobile Operation in Licensed Bands, IEEE Project P802.16e Std. 9, June 2005.

[2] S.-Y. Chung, G. Forney, T. Richardson, and R. Urbanke, “On the design of low-density parity-check codes within 0.0045 dB of the Shannon limit,” IEEE Communications Letters, vol. 5, Feb 2001.

[3] M. Fossorier, “Quasi-cyclic low-density parity-check codes from circulant permutation matrices,” IEEE Transactions on Information Theory, vol. 50, Aug 2004.

[4] T. Richardson, A. Shokrollahi, and R. Urbanke, “Design of capacity approaching irregular low-density parity-check codes,” IEEE Transactions on Information Theory, vol. 47, Feb 2001.


[5] S.-Y. Chung, T. Richardson, and R. Urbanke, “Analysis of sum-product decoding of low-density parity-check codes using a Gaussian approximation,” IEEE Transactions on Information Theory, vol. 47, Feb 2001.

[6] M. Yang, W. Ryan, and Y. Li, “Design of efficiently encodable moderate-length high-rate irregular LDPC codes,” IEEE Transactions on Communications, vol. 52, April 2004.

[7] H. Jin, A. Khandekar, and R. McEliece, “Irregular repeat-accumulate codes,” Second International Conference on Turbo Codes, Sept 2000.

[8] F. Verdier and D. Declercq, “A LDPC parity check matrix construction for parallel hardware decoding,” 3rd International Symposium on Turbo-Codes and Related Topics, Sept 2003.

[9] C. Berrou, M. Jezequel, and C. Douillard, “Computing the minimum distance of linear codes by the error impulse method,” IEEE Global Telecommunications Conference, vol. 2, Nov 2002.

[10] X.-Y. Hu, M. Fossorier, and E. Eleftheriou, “On the computation of the minimum distance of low-density parity-check codes,” IEEE International Conference on Communications, vol. 2, June 2004.

[11] F. Daneshgaran, M. Laddomada, and M. Mondin, “An algorithm for the estimation of the minimum distance of LDPC codes,” IEEE Wireless Communications and Networking Conference, vol. 2, March 2005.

[12] D. Divsalar, H. Jin, and R. McEliece, “Coding theorems for turbo-like codes,” Proceedings of the 36th Allerton Conference on Communication, Control and Computing, 1998.

[13] R. Echard, “On the construction of some deterministic low-density parity-check codes,” Ph.D. dissertation, 2002.
