
Low-Density Parity-Check Codes Based on Finite Geometries: A Rediscovery and New Results

Yu Kou, Student Member, IEEE, Shu Lin, Fellow, IEEE, and Marc P. C. Fossorier, Senior Member, IEEE

Abstract—This paper presents a geometric approach to the construction of low-density parity-check (LDPC) codes. Four classes of LDPC codes are constructed based on the lines and points of Euclidean and projective geometries over finite fields. Codes of these four classes have good minimum distances and their Tanner graphs have girth 6. Finite-geometry LDPC codes can be decoded in various ways, ranging from low to high decoding complexity and from reasonably good to very good performance. They perform very well with iterative decoding. Furthermore, they can be put in either cyclic or quasi-cyclic form. Consequently, their encoding can be achieved in linear time and implemented with simple feedback shift registers. This advantage is not shared by other LDPC codes in general and is important in practice. Finite-geometry LDPC codes can be extended and shortened in various ways to obtain other good LDPC codes. Several techniques of extension and shortening are presented. Long extended finite-geometry LDPC codes have been constructed and they achieve a performance only a few tenths of a decibel away from the Shannon theoretical limit with iterative decoding.

Index Terms—Bit flipping decoding, column splitting, cyclic code, Euclidean geometry, iterative decoding, low-density parity-check (LDPC) codes, projective geometry, quasi-cyclic code, row splitting, shortening.

I. INTRODUCTION

Low-density parity-check (LDPC) codes were first discovered by Gallager [1], [2] in the early 1960s and have recently been rediscovered and generalized [3]–[14]. It has been shown that these codes achieve a remarkable performance with iterative decoding that is very close to the Shannon limit [4], [9]–[14]. Consequently, these codes have become strong competitors to turbo codes [23]–[26] for error control in many communication and digital storage systems where high reliability is required.

An LDPC code is defined as the null space of a parity-check matrix $\mathbf{H}$ with the following structural properties: 1) each row consists of $\rho$ "ones"; 2) each column consists of $\gamma$ "ones"; 3) the number of "ones" in common between any two columns, denoted $\lambda$, is no greater than $1$; 4) both $\rho$ and $\gamma$ are small compared to the length of the code and the number of rows in $\mathbf{H}$ [1], [2]. Since $\rho$ and $\gamma$ are small, $\mathbf{H}$ has a small density of "ones" and hence is a sparse matrix. For this reason, the code specified by $\mathbf{H}$ is called an LDPC code. The LDPC code defined above is known as a regular LDPC code. If not all the columns or all the rows of the parity-check matrix $\mathbf{H}$ have the same number of "ones" (or weights), an LDPC code is said to be irregular.

Manuscript received October 15, 1999; revised June 5, 2001. This work was supported by the National Science Foundation (NSF) under Grants CCR-0096191, CCR-0098029, and by NASA under Grants NAG 5-9025 and NAG 5-10480.

Y. Kou and S. Lin are with the Department of Electrical and Computer Engineering, University of California, Davis, CA 95616 USA (e-mail: ykou@att.net; shulin@ece.ucdavis.edu).

M. P. C. Fossorier is with the Department of Electrical Engineering, University of Hawaii at Manoa, Honolulu, HI 96822 USA (e-mail: marc@spectra.eng.hawaii.edu).

Communicated by F. R. Kschischang, Associate Editor for Coding Theory.

Publisher Item Identifier S 0018-9448(01)08588-1.

Although LDPC codes have been shown to achieve outstanding performance, no analytic (algebraic or geometric) method has been found for constructing these codes. Gallager only provided a class of pseudorandom LDPC codes [1], [2]. Good LDPC codes that have been found are largely computer generated, especially long codes. Encoding of these long computer-generated LDPC codes is quite complex due to the lack of code structure such as cyclic or quasi-cyclic structure. Furthermore, their minimum distances are either poor or hard to determine.

In this paper, we investigate the construction of LDPC codes from a geometric approach. The construction is based on the lines and points of a finite geometry. Well-known finite geometries are Euclidean and projective geometries over finite fields.

Based on these two families of finite geometries, four classes of LDPC codes are constructed. Codes of these four classes are either cyclic or quasi-cyclic, and, therefore, their encoding can be implemented with linear feedback shift registers based on their generator (or characterization) polynomials [27], [28].

This linear time encoding is very important in practice and is not shared by other LDPC codes in general. We call codes of these four classes finite-geometry LDPC codes.

Finite-geometry LDPC codes have relatively good minimum distances and their Tanner graphs do not contain cycles of length $4$. They can be decoded with various decoding methods, ranging from low to high complexity and from reasonably good to very good performance. These decoding methods include: one-step majority-logic (MLG) decoding [28], [31], Gallager's bit flipping (BF) decoding [2], weighted MLG decoding [49], weighted BF decoding, a posteriori probability (APP) decoding [2], and iterative decoding based on belief propagation (commonly known as sum-product algorithm (SPA)) [10], [11], [15], [20]–[22]. Finite-geometry LDPC codes, especially high-rate codes, perform very well with the iterative SPA decoding.

A finite-geometry LDPC code can be extended by splitting each column of its parity-check matrix into multiple columns. This column splitting results in a new sparse matrix and hence a new LDPC code of longer length. If column splitting is done properly, the extended code performs amazingly well using the SPA decoding. An error performance only a few tenths of a decibel away from the Shannon limit can be achieved. New LDPC codes can also be constructed by splitting each row of the parity-check matrix of a finite-geometry LDPC code into multiple rows. Combining column and row splittings of the parity-check matrices of finite-geometry LDPC codes, we can obtain a large class of LDPC codes with a wide range of code lengths and rates. A finite-geometry LDPC code can also be shortened by puncturing the columns of its parity-check matrix that correspond to the points on a set of lines or a subgeometry of the geometry based on which the code is constructed. Shortened finite-geometry LDPC codes also perform well with the SPA decoding.

The paper is organized as follows. Section II presents a construction method of LDPC codes based on the lines and points of a finite geometry. Two types of codes are constructed and their minimum distances are lower-bounded. Section III gives the construction and characterization of LDPC codes based on Euclidean and projective geometries. Various decoding methods for finite-geometry LDPC codes are discussed in Section IV. A simple weighted BF decoding algorithm and a two-stage hybrid soft/hard decoding scheme are presented.

Section V presents simulation results of error performance of some finite-geometry LDPC codes using various decoding methods. Techniques for extending and shortening finite-geometry LDPC codes are given in Sections VI and VII, respectively.

Section VIII discusses the possible combinations of finite-geometry LDPC codes and turbo codes in concatenation form.

Finally, Section IX concludes this paper with some remarks and suggestions of further research work.

II. FINITE-GEOMETRY LDPC CODES AND THEIR GENERAL STRUCTURE

This section presents a simple construction of LDPC codes based on the lines and points of finite geometries. Two types of codes are constructed and their general structural properties are investigated. Lower bounds on their minimum distances are derived.

Let $G$ be a finite geometry with $n$ points and $J$ lines which has the following fundamental structural properties: 1) every line consists of $\rho$ points; 2) any two points are connected by one and only one line; 3) every point is intersected by $\gamma$ lines (i.e., every point lies on $\gamma$ lines); and 4) two lines are either parallel (i.e., they have no point in common) or they intersect at one and only one point. There are two families of finite geometries which have the above fundamental structural properties, namely, Euclidean and projective geometries over finite fields.

Form a $J \times n$ matrix $\mathbf{H}_G^{(1)} = [h_{i,j}]$ over GF$(2)$ whose rows and columns correspond to the lines and points of the finite geometry $G$, respectively, where $h_{i,j} = 1$ if and only if the $i$th line of $G$ contains the $j$th point of $G$, and $h_{i,j} = 0$ otherwise. A row in $\mathbf{H}_G^{(1)}$ simply displays the points on a specific line of $G$ and has weight $\rho$. A column in $\mathbf{H}_G^{(1)}$ simply displays the lines that intersect at a specific point in $G$ and has weight $\gamma$. The rows of $\mathbf{H}_G^{(1)}$ are called the incidence vectors of the lines in $G$, and the columns are called the intersecting vectors of the points in $G$. Therefore, $\mathbf{H}_G^{(1)}$ is the incidence matrix of the lines in $G$ over the points in $G$. It follows from the second structural property of $G$ that every two columns have exactly one "1-component" in common, and it follows from the fourth structural property of $G$ that any two rows have at most one "1-component" in common.

The density of this matrix, denoted $r$, is defined as the ratio of the total number of "ones" in $\mathbf{H}_G^{(1)}$ to the total number of entries in $\mathbf{H}_G^{(1)}$. Then we readily see that $r = \rho/n = \gamma/J$. If $\rho$ and $\gamma$ are small compared to $n$ and $J$, then $\mathbf{H}_G^{(1)}$ is a low-density matrix which has all the structural properties defined in Section I.
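As a minimal sketch of this construction, the Python snippet below builds the incidence matrix of the smallest projective plane (7 points and 7 lines) and checks the row and column weights, the pairwise overlaps, and the density. The choice of geometry, the base line {0, 1, 3}, and all helper names are illustrative assumptions and do not come from the paper.

```python
from itertools import combinations

# Assumed toy geometry: the 7-point projective plane, whose lines are the
# cyclic shifts of the base line {0, 1, 3} modulo 7.
N_POINTS = 7
BASE_LINE = (0, 1, 3)

def incidence_matrix():
    """Rows are lines, columns are points; entry 1 means the line contains the point."""
    rows = []
    for shift in range(N_POINTS):
        line = {(p + shift) % N_POINTS for p in BASE_LINE}
        rows.append([1 if j in line else 0 for j in range(N_POINTS)])
    return rows

def overlap(a, b):
    """Number of positions in which both binary vectors have a 1."""
    return sum(x & y for x, y in zip(a, b))

H = incidence_matrix()
columns = [list(col) for col in zip(*H)]

row_weights = {sum(r) for r in H}            # rho for every line
col_weights = {sum(c) for c in columns}      # gamma for every point
col_overlaps = {overlap(a, b) for a, b in combinations(columns, 2)}
row_overlaps = {overlap(a, b) for a, b in combinations(H, 2)}
density = sum(map(sum, H)) / (len(H) * len(H[0]))

print(row_weights, col_weights, col_overlaps, row_overlaps, density)
```

In this toy plane no two lines are parallel, so every pair of rows overlaps in exactly one position; in a Euclidean geometry some pairs of rows would have no overlap at all.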

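The null space over GF$(2)$ of $\mathbf{H}_G^{(1)}$ gives a binary LDPC code of length $n$. Such a code is called the type-I geometry-$G$ LDPC code, denoted $\mathbf{C}_G^{(1)}$. The rows of $\mathbf{H}_G^{(1)}$ are not necessarily linearly independent. Let $R$ be the rank of $\mathbf{H}_G^{(1)}$. Then $\mathbf{C}_G^{(1)}$ is a binary $(n, n-R)$ linear code with $\mathbf{H}_G^{(1)}$ as its parity-check matrix.

Let $\mathbf{H}_G^{(2)}$ be the transpose of $\mathbf{H}_G^{(1)}$, i.e., $\mathbf{H}_G^{(2)} = [\mathbf{H}_G^{(1)}]^T$. Then $\mathbf{H}_G^{(2)}$ is also a low-density matrix with row weight $\gamma$ and column weight $\rho$. The null space over GF$(2)$ of $\mathbf{H}_G^{(2)}$ gives a binary LDPC code of length $J$, denoted $\mathbf{C}_G^{(2)}$. Since $\mathbf{H}_G^{(1)}$ and $\mathbf{H}_G^{(2)}$ have the same rank $R$, $\mathbf{C}_G^{(2)}$ is a binary $(J, J-R)$ linear code. This code is called the type-II geometry-$G$ LDPC code. $\mathbf{C}_G^{(1)}$ and $\mathbf{C}_G^{(2)}$ are called companion codes and have the same number of parity-check symbols.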

Let $\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_J$ be the rows of $\mathbf{H}_G^{(1)}$, where $\mathbf{h}_j = (h_{j,0}, h_{j,1}, \ldots, h_{j,n-1})$ for $1 \le j \le J$. An $n$-tuple $\mathbf{v} = (v_0, v_1, \ldots, v_{n-1})$ is a codeword of the type-I geometry-$G$ LDPC code $\mathbf{C}_G^{(1)}$ specified by $\mathbf{H}_G^{(1)}$ if and only if the inner product

$$s_j = \mathbf{v} \cdot \mathbf{h}_j = \sum_{l=0}^{n-1} v_l h_{j,l} = 0 \tag{1}$$

for $1 \le j \le J$. The sum given by (1) is called a parity-check sum (or simply check sum), which is simply a linear sum of a subset of code bits. A code bit $v_l$ is said to be checked by the sum $s_j$ (or the row $\mathbf{h}_j$) if $h_{j,l} = 1$. Let $A_l$ be the set of rows in $\mathbf{H}_G^{(1)}$ that check on the code bit $v_l$. Let $S_l$ denote the set of check sums formed by the rows in $A_l$. It follows from the structural properties of $\mathbf{H}_G^{(1)}$ that the code bit $v_l$ is contained in every check sum in $S_l$ and any of the other $n-1$ code bits is contained in at most one check sum in $S_l$. The check sums in $S_l$ (or the rows in $A_l$) are said to be orthogonal on the code bit $v_l$ [28], [31]. The check sums in $S_l$ are called the orthogonal check sums on code bit $v_l$ and the rows in $A_l$ are called the orthogonal vectors on $v_l$. For $0 \le l \le n-1$, each code bit $v_l$ is checked by exactly $\gamma$ orthogonal check sums. These orthogonal check sums can be used for majority-logic decoding of the code [28], [31]. The code is capable of correcting any error pattern with $\lfloor \gamma/2 \rfloor$ or fewer errors using one-step majority-logic decoding [28], [31]. As a result, the minimum distance $d_G^{(1)}$ of the type-I geometry-$G$ LDPC code $\mathbf{C}_G^{(1)}$ is at least $\gamma + 1$.

Similarly, it can be shown that there are $\rho$ check sums orthogonal on each code bit of a codeword in the type-II geometry-$G$ code $\mathbf{C}_G^{(2)}$. Therefore, $\mathbf{C}_G^{(2)}$ is also one-step majority-logic decodable and has a minimum distance $d_G^{(2)}$ at least $\rho + 1$.
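The orthogonality argument can be illustrated with a short sketch of one-step majority-logic decoding. The toy code below is an assumption made for illustration: it is the null space of the 7 x 7 circulant with first row 1101000 (column weight 3, so one error should be correctable). The function name and the hard-coded codeword are likewise hypothetical.

```python
# Assumed toy parity-check matrix: cyclic shifts of the line {0, 1, 3} mod 7.
N = 7
BASE = (0, 1, 3)
H = [[1 if (j - i) % N in BASE else 0 for j in range(N)] for i in range(N)]

def one_step_mlg(z, H):
    """Flip every bit for which a strict majority of its orthogonal check sums fail."""
    syndrome = [sum(h * b for h, b in zip(row, z)) % 2 for row in H]
    decoded = list(z)
    for l in range(len(z)):
        # Rows that check on bit l; they are orthogonal on bit l by construction.
        checks = [j for j, row in enumerate(H) if row[l] == 1]
        failures = sum(syndrome[j] for j in checks)
        if 2 * failures > len(checks):
            decoded[l] ^= 1
    return decoded

# The complement of the base line is a codeword: every line meets it in 0 or 2 points.
codeword = [0, 0, 1, 0, 1, 1, 1]

received = list(codeword)
received[4] ^= 1                      # inject a single error
print(one_step_mlg(received, H) == codeword)
```

With a single error, all three check sums on the erroneous bit fail, while every other bit sees exactly one failing check sum, so only the erroneous bit is flipped.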

For a linear block code of length $n$ specified by a parity-check matrix of $J$ rows, a graph can be constructed to display the relationship between its code bits and the check sums that check on them. This graph consists of two levels of vertices. The first level consists of $n$ vertices which represent the $n$ code bits of the code. These vertices, denoted $v_0, v_1, \ldots, v_{n-1}$, are called the code-bit (or variable) vertices. The second level consists of $J$ vertices which represent the $J$ check sums, $s_1, s_2, \ldots, s_J$, that the code bits must satisfy. These vertices are called the check-sum vertices. A code-bit vertex $v_l$ is connected to a check-sum vertex $s_j$ by an edge, denoted $(v_l, s_j)$, if and only if the code bit $v_l$ is contained in the check sum $s_j$. No two code-bit vertices are connected and no two check-sum vertices are connected.

This graph is a bipartite graph [32] which was first proposed by Tanner [3] to study the structure and iterative decoding of LDPC codes, and hence it is called the Tanner graph. The number of edges that are connected to (or incident at) a code-bit vertex $v_l$, called the degree of $v_l$, is simply the number of check sums that contain $v_l$. The number of edges that are incident at the check-sum vertex $s_j$, called the degree of $s_j$, is simply the number of code bits that are checked by the check sum $s_j$. For a regular LDPC code, the degrees of all the code-bit vertices are the same and the degrees of all the check-sum vertices are the same. Such a Tanner graph is said to be regular.

A cycle in a graph of vertices and edges is defined as a sequence of connected edges which starts from a vertex and ends at the same vertex, and satisfies the condition that no vertex (except the initial and the final vertex) appears more than once [32]. The number of edges on a cycle is called the length of the cycle. The length of the shortest cycle in a graph is called the girth of the graph. The Tanner graph of a linear block code contains no cycles of length $2$ and no cycles of odd lengths. Therefore, the girth of the Tanner graph of a linear block code is at least $4$.

In decoding a linear block code with the SPA decoding, the performance very much depends on cycles of short lengths in its Tanner graph. These short cycles, especially cycles of length $4$, make successive decoding iterations highly correlated and hence severely limit the decoding performance [3], [10], [11], [20], [33]–[35]. Therefore, to use the SPA for decoding, it is important to design codes without short cycles in their Tanner graphs, especially cycles of length $4$.

Both types of geometry-$G$ LDPC codes are regular and hence their Tanner graphs are regular. Since the row and column weights of $\mathbf{H}_G^{(1)}$ are $\rho$ and $\gamma$, respectively, the degrees of each check-sum vertex and each code-bit vertex in the Tanner graph of the type-I geometry-$G$ LDPC code $\mathbf{C}_G^{(1)}$ are $\rho$ and $\gamma$, respectively. Since $\mathbf{H}_G^{(2)}$ is the transpose of $\mathbf{H}_G^{(1)}$, the degrees of each check-sum vertex and each code-bit vertex in the Tanner graph of the type-II geometry-$G$ code $\mathbf{C}_G^{(2)}$ are $\gamma$ and $\rho$, respectively. In fact, the Tanner graphs of the type-I and type-II geometry-$G$ LDPC codes are dual graphs, i.e., the code-bit vertices of one graph become the check-sum vertices of the other graph and the check-sum vertices of one graph become the code-bit vertices of the other graph.

It follows from the structural properties of the parity-check matrices $\mathbf{H}_G^{(1)}$ and $\mathbf{H}_G^{(2)}$ that no two code bits are checked simultaneously by two check sums. This implies that the Tanner graphs of both types of geometry-$G$ LDPC codes do not contain cycles of length $4$. However, they do contain cycles of length $6$. To show this, we use the fundamental property of a finite geometry that any two points are connected by a line. Let $p_1$ and $p_2$ be any two points in the finite geometry $G$. Then there is a line $L_1$ connecting $p_1$ and $p_2$. Let $p_3$ be a third point in $G$ but not on $L_1$. Then there is a line $L_2$ connecting $p_1$ and $p_3$ and a line $L_3$ connecting $p_2$ and $p_3$. These three lines enclose a triangle with $p_1$, $p_2$, and $p_3$ as the vertices. In the Tanner graph of the type-I geometry-$G$ LDPC code, these three lines $L_1$, $L_2$, and $L_3$ correspond to three check-sum vertices, say $s_1$, $s_2$, and $s_3$, and the three points $p_1$, $p_2$, and $p_3$ correspond to three code-bit vertices, say $v_1$, $v_2$, and $v_3$. Each of these three check-sum vertices $s_1$, $s_2$, and $s_3$ is connected to only two of the three code-bit vertices $v_1$, $v_2$, and $v_3$. Since no two check-sum vertices are connected to the same pair of code-bit vertices, the edges connecting $s_1$, $s_2$, and $s_3$ to $v_1$, $v_2$, and $v_3$ form a cycle of length $6$ in the Tanner graph of the type-I geometry-$G$ code $\mathbf{C}_G^{(1)}$. The number of cycles of length $6$ is equal to the number of triangles in $G$, which can be enumerated and is

$$N_6 = \frac{1}{6}\, n (n-1)(n-\rho). \tag{2}$$

Since the Tanner graphs of type-I and type-II geometry-$G$ codes are dual, they have the same girth and the same cycle distribution. The above analysis shows that the girth of the Tanner graph of an LDPC code constructed based on the lines and points of a finite geometry is $6$.
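The triangle argument can be checked by brute force on a small case. The sketch below counts cycles of length 4 and 6 directly from a parity-check matrix and compares the 6-cycle count with n(n-1)(n-rho)/6. The 7-point projective plane used here (lines are cyclic shifts of {0, 1, 3} mod 7) and the function name are assumptions chosen for illustration.

```python
from itertools import combinations

# Assumed toy geometry: 7 points, 7 lines, each line has rho = 3 points.
N, RHO = 7, 3
BASE = (0, 1, 3)
H = [[1 if (j - i) % N in BASE else 0 for j in range(N)] for i in range(N)]

def count_short_cycles(H):
    """Return (number of 4-cycles, number of 6-cycles) in the Tanner graph of H."""
    n_rows, n_cols = len(H), len(H[0])
    # checks[j] = set of check-sum vertices adjacent to code-bit vertex j
    checks = [{i for i in range(n_rows) if H[i][j]} for j in range(n_cols)]

    four = 0
    for a, b in combinations(range(n_cols), 2):
        shared = len(checks[a] & checks[b])
        four += shared * (shared - 1) // 2   # two bits sharing two checks

    six = 0
    for a, b, c in combinations(range(n_cols), 3):
        # One distinct check between each pair of the three code bits.
        for r_ab in checks[a] & checks[b]:
            for r_bc in checks[b] & checks[c]:
                for r_ac in checks[a] & checks[c]:
                    if len({r_ab, r_bc, r_ac}) == 3:
                        six += 1
    return four, six

four_cycles, six_cycles = count_short_cycles(H)
formula = N * (N - 1) * (N - RHO) // 6
print(four_cycles, six_cycles, formula)
```

Collinear triples share a single check sum and are rejected by the distinctness test, which is why only the non-collinear triples (the triangles) are counted.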

III. EUCLIDEAN AND PROJECTIVE GEOMETRY LDPC CODES

Euclidean and projective geometries over finite fields form two large families of finite geometries. The structures of these two families of finite geometries have been well studied and can be found in any major text in combinatorics or groups of finite order. References [36]–[38] give a good exposition of this subject. A simple discussion of these two types of finite geometries can also be found in [28]. To make this paper self-contained, the fundamental structural properties of lines and points of these two types of geometries are briefly described before the code construction.

Based on the lines and points of Euclidean and projective geometries, four classes of finite-geometry LDPC codes can be constructed. They are: 1) type-I Euclidean geometry (EG)-LDPC codes; 2) type-II EG-LDPC codes; 3) type-I projective geometry (PG)-LDPC codes; and 4) type-II PG-LDPC codes. Among these four classes of codes, two are cyclic and two are quasi-cyclic.

A. Type-I EG-LDPC Codes

Let EG$(m, 2^s)$ be an $m$-dimensional Euclidean geometry over the Galois field GF$(2^s)$, where $m$ and $s$ are two positive integers. This geometry consists of $2^{ms}$ points; each point is simply an $m$-tuple over GF$(2^s)$. The all-zero $m$-tuple $\mathbf{0} = (0, 0, \ldots, 0)$ is called the origin. The $2^{ms}$ $m$-tuples over GF$(2^s)$ that represent the points of EG$(m, 2^s)$ form an $m$-dimensional vector space over GF$(2^s)$. Therefore, EG$(m, 2^s)$ is simply the $m$-dimensional vector space of all the $2^{ms}$ $m$-tuples over GF$(2^s)$. A line in EG$(m, 2^s)$ is either a one-dimensional (1-D) subspace of EG$(m, 2^s)$ or a coset of a 1-D subspace. Therefore, a line in EG$(m, 2^s)$ consists of $2^s$ points. There are

$$\frac{2^{(m-1)s}\,(2^{ms}-1)}{2^s-1} \tag{3}$$

lines in EG$(m, 2^s)$. Every line has $2^{(m-1)s} - 1$ lines parallel to it. For any point in EG$(m, 2^s)$, there are

$$\frac{2^{ms}-1}{2^s-1} \tag{4}$$

lines intersecting at this point.
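The counts above are easy to evaluate for concrete parameters. The following sketch computes them for a given pair (m, s) and checks a simple double-counting identity (points times lines per point equals lines times points per line). The function and key names are illustrative assumptions.

```python
def eg_parameters(m, s):
    """Point and line counts of the Euclidean geometry EG(m, 2^s)."""
    q = 2 ** s
    return {
        "points": q ** m,
        "points_per_line": q,
        "lines": q ** (m - 1) * (q ** m - 1) // (q - 1),      # count of all lines
        "lines_per_point": (q ** m - 1) // (q - 1),           # lines through one point
        "parallel_lines": q ** (m - 1) - 1,                   # lines parallel to a given line
    }

for m, s in [(2, 2), (3, 2)]:
    p = eg_parameters(m, s)
    # Each incidence (point, line) is counted once from either side.
    assert p["points"] * p["lines_per_point"] == p["lines"] * p["points_per_line"]
    print((m, s), p)
```

For EG(2, 2^2) this gives 16 points, 20 lines, 5 lines through each point, and 3 lines parallel to any given line.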

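Let GF$(2^{ms})$ be the extension field of GF$(2^s)$. Each element in GF$(2^{ms})$ can be represented as an $m$-tuple over GF$(2^s)$. Therefore, the $2^{ms}$ elements in GF$(2^{ms})$ may be regarded as the $2^{ms}$ points in EG$(m, 2^s)$ and hence GF$(2^{ms})$ may be regarded as the Euclidean geometry EG$(m, 2^s)$. Let $\alpha$ be a primitive element of GF$(2^{ms})$. Then $0 = \alpha^{\infty}, \alpha^0, \alpha^1, \alpha^2, \ldots, \alpha^{2^{ms}-2}$ form the $2^{ms}$ points of EG$(m, 2^s)$, where $0 = \alpha^{\infty}$ is the origin.

Let $\alpha^j$ be a nonorigin point in EG$(m, 2^s)$. Then, the $2^s$ points

$$\{\beta \alpha^j\} \triangleq \{\beta \alpha^j : \beta \in \mathrm{GF}(2^s)\}$$

form a line in EG$(m, 2^s)$. Since $\beta \alpha^j = 0$ for $\beta = 0$, the line contains the origin $\alpha^{\infty}$ as a point. We say that $\{\beta \alpha^j\}$ passes through the origin. Let $\alpha^i$ and $\alpha^j$ be two linearly independent points in EG$(m, 2^s)$. Then the collection of the following points:

$$\{\alpha^i + \beta \alpha^j\} \triangleq \{\alpha^i + \beta \alpha^j : \beta \in \mathrm{GF}(2^s)\}$$

form a line in EG$(m, 2^s)$ that passes through the point $\alpha^i$. Lines $\{\beta \alpha^j\}$ and $\{\alpha^i + \beta \alpha^j\}$ do not have any point in common and hence they are parallel. Let $\alpha^k$ be a point which is linearly independent of $\alpha^i$ and $\alpha^j$. Then lines $\{\alpha^i + \beta \alpha^j\}$ and $\{\alpha^i + \beta \alpha^k\}$ intersect at the point $\alpha^i$.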

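Let $\mathbf{H}_{EG}^{(1)}(m, s)$ be a matrix over GF$(2)$ whose rows are the incidence vectors of all the lines in EG$(m, 2^s)$ that do not pass through the origin and whose columns correspond to the $2^{ms} - 1$ nonorigin points in EG$(m, 2^s)$. The columns are arranged in the order of $\alpha^0, \alpha^1, \ldots, \alpha^{2^{ms}-2}$, i.e., the $i$th column corresponds to the point $\alpha^i$. Then $\mathbf{H}_{EG}^{(1)}(m, s)$ consists of $n = 2^{ms} - 1$ columns and

$$J = \frac{(2^{(m-1)s}-1)(2^{ms}-1)}{2^s-1} \tag{5}$$

rows. $\mathbf{H}_{EG}^{(1)}(m, s)$ has the following structures: 1) each row has weight $\rho = 2^s$; 2) each column has weight $\gamma = \frac{2^{ms}-1}{2^s-1} - 1$; 3) any two columns have at most one "1-component" in common, i.e., $\lambda = 0$ or $1$; 4) any two rows have at most one "1-component" in common. The density of $\mathbf{H}_{EG}^{(1)}(m, s)$ is $r = \rho/n = 2^s/(2^{ms}-1)$, which is small for $m \ge 2$ and $s \ge 2$. Therefore, $\mathbf{H}_{EG}^{(1)}(m, s)$ is a low-density matrix.

Let $\mathbf{C}_{EG}^{(1)}(m, s)$ be the null space of $\mathbf{H}_{EG}^{(1)}(m, s)$. Then $\mathbf{C}_{EG}^{(1)}(m, s)$ is a regular LDPC code of length $n = 2^{ms} - 1$. We call this code the type-I $m$-dimensional EG-LDPC code. Since the column weight of $\mathbf{H}_{EG}^{(1)}(m, s)$ is $\gamma = \frac{2^{ms}-1}{2^s-1} - 1$, the minimum distance of $\mathbf{C}_{EG}^{(1)}(m, s)$ is at least $\gamma + 1 = \frac{2^{ms}-1}{2^s-1}$.

It turns out that this EG-LDPC code is the one-step majority-logic decodable $(0, s)$th-order EG code constructed based on EG$(m, 2^s)$ [28], [39], [40] and is the dual code of a polynomial code [40]–[43]. Therefore, it is cyclic and its generator polynomial is completely characterized by its roots in GF$(2^{ms})$.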

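Let $h$ be a nonnegative integer less than $2^{ms}$. Then $h$ can be expressed in radix-$2^s$ form as follows:

$$h = \delta_0 + \delta_1 2^s + \cdots + \delta_{m-1} 2^{(m-1)s}$$

where $0 \le \delta_i < 2^s$ for $0 \le i < m$. The $2^s$-weight of $h$, denoted $W_{2^s}(h)$, is defined as the following sum:

$$W_{2^s}(h) = \delta_0 + \delta_1 + \cdots + \delta_{m-1}. \tag{6}$$

For a nonnegative integer $l$, let $h^{(l)}$ be the remainder resulting from dividing $2^l h$ by $2^{ms} - 1$. Then $0 \le h^{(l)} < 2^{ms} - 1$.

Let $g_{EG}^{(1)}(X)$ be the generator polynomial of the type-I $m$-dimensional EG-LDPC code. Let $\alpha$ be a primitive element of GF$(2^{ms})$. Then $\alpha^h$ is a root of $g_{EG}^{(1)}(X)$ if and only if [28], [39], [40]

$$0 < \max_{0 \le l < s} W_{2^s}\!\left(h^{(l)}\right) \le (m-1)(2^s-1). \tag{7}$$

From the above characterization of the roots of $g_{EG}^{(1)}(X)$, it has been shown [40] that $g_{EG}^{(1)}(X)$ has the following sequence of consecutive powers of $\alpha$:

$$\alpha, \alpha^2, \ldots, \alpha^{(2^{ms}-1)/(2^s-1)-1}$$

as roots. It follows from the Bose–Chaudhuri–Hocquenghem (BCH) bound [27]–[30] that the minimum distance $d_{EG}^{(1)}$ of the type-I $m$-dimensional EG-LDPC code is lower-bounded as follows:

$$d_{EG}^{(1)} \ge \frac{2^{ms}-1}{2^s-1}. \tag{8}$$

This bound is exactly the same as the bound given above based on majority-logic decoding.

The number of parity-check symbols of the type-I $m$-dimensional EG-LDPC code is, of course, equal to the degree of its generator polynomial $g_{EG}^{(1)}(X)$. However, a combinatorial expression for this number has been derived by Lin [42].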
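The root characterization can be turned into a small enumeration. The sketch below lists the exponents h for which the power of the primitive element is a root, using the radix weight and the condition that the maximum weight over the shifted exponents is positive and at most (m-1)(2^s - 1). Function names are illustrative assumptions; the exponent range 0 to 2^{ms} - 2 is used because the primitive element has order 2^{ms} - 1.

```python
def radix_weight(h, s, digits):
    """Sum of the radix-2^s digits of h, taken over the given number of digits."""
    q = 2 ** s
    total = 0
    for _ in range(digits):
        total += h % q
        h //= q
    return total

def eg_root_exponents(m, s):
    """Exponents h such that alpha^h is a root of the type-I EG-LDPC generator polynomial."""
    n = 2 ** (m * s) - 1
    bound = (m - 1) * (2 ** s - 1)
    roots = []
    for h in range(n):
        # h^(l) is the remainder of 2^l * h divided by 2^(ms) - 1, for 0 <= l < s.
        w = max(radix_weight((2 ** l * h) % n, s, m) for l in range(s))
        if 0 < w <= bound:
            roots.append(h)
    return roots

roots = eg_root_exponents(2, 2)
print(roots, len(roots))
```

For m = 2 and s = 2 the enumeration yields eight exponents, which is the degree of the generator polynomial, and the run 1, 2, 3, 4 of consecutive exponents is what the BCH bound uses to give a minimum distance of at least 5.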

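A special subclass of EG-LDPC codes is the class of type-I two-dimensional (2-D) EG-LDPC codes ($m = 2$). For any positive integer $s \ge 2$, the type-I 2-D EG-LDPC code $\mathbf{C}_{EG}^{(1)}(2, s)$ has the following parameters [28], [40]:

$$\begin{aligned}
\text{Length} \quad & n = 2^{2s} - 1 \\
\text{Number of parity bits} \quad & n - k = 3^s - 1 \\
\text{Minimum distance} \quad & d_{\min} = 2^s + 1 \\
\text{Row weight of the parity-check matrix} \quad & \rho = 2^s \\
\text{Column weight of the parity-check matrix} \quad & \gamma = 2^s
\end{aligned} \tag{9}$$

For this special case, the geometry EG$(2, 2^s)$ contains $2^{2s} - 1$ lines that do not pass through the origin. Therefore, the parity-check matrix $\mathbf{H}_{EG}^{(1)}(2, s)$ of the type-I 2-D EG-LDPC code is a $(2^{2s}-1) \times (2^{2s}-1)$ square matrix. Actually, $\mathbf{H}_{EG}^{(1)}(2, s)$ can be constructed easily by taking the incidence vector $\mathbf{v}_L$ of a line $L$ in EG$(2, 2^s)$ that does not pass through the origin and then cyclically shifting this incidence vector $2^{2s} - 2$ times. This results in $2^{2s} - 1$ incidence vectors for the $2^{2s} - 1$ distinct lines in EG$(2, 2^s)$ that do not pass through the origin. The incidence vector $\mathbf{v}_L$ and its $2^{2s} - 2$ cyclic shifts form the rows of the parity-check matrix $\mathbf{H}_{EG}^{(1)}(2, s)$. Therefore, $\mathbf{H}_{EG}^{(1)}(2, s)$ is a square circulant matrix. A list of type-I 2-D EG-LDPC codes is given in Table I.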

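TABLE I
A LIST OF TYPE-I 2-D EG-LDPC CODES

Example 1: Consider the 2-D Euclidean geometry EG$(2, 2^2)$. Let $\alpha$ be a primitive element of GF$(2^{2 \times 2})$. The incidence vector for the line $L = \{\alpha^7, \alpha^8, \alpha^{10}, \alpha^{14}\}$ is

$$(0\;0\;0\;0\;0\;0\;0\;1\;1\;0\;1\;0\;0\;0\;1).$$

This vector and its 14 cyclic shifts form the parity-check matrix $\mathbf{H}_{EG}^{(1)}(2, 2)$. The null space of this matrix is the $(15, 7, 5)$ type-I 2-D EG-LDPC code, the first code given in Table I.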
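A circulant parity-check matrix of this kind can be built and checked in a few lines. The sketch below assumes the line of Example 1 has its ones in positions 7, 8, 10, and 14 of a length-15 vector (an assumption about the example's incidence vector), forms the 15 cyclic shifts, and computes the rank over GF(2) to obtain the number of parity bits. Helper names are hypothetical.

```python
from itertools import combinations

N = 15
LINE_SUPPORT = (7, 8, 10, 14)   # assumed positions of the ones in the first row

def circulant_rows(support, n):
    """Rows are the n cyclic shifts of the binary vector with the given support."""
    return [[1 if (j - shift) % n in support else 0 for j in range(n)]
            for shift in range(n)]

def gf2_rank(rows):
    """Rank over GF(2) by Gaussian elimination on integer bitmasks."""
    masks = [int("".join(map(str, r)), 2) for r in rows]
    rank = 0
    for bit in reversed(range(len(rows[0]))):
        pivot = next((i for i in range(rank, len(masks)) if (masks[i] >> bit) & 1), None)
        if pivot is None:
            continue
        masks[rank], masks[pivot] = masks[pivot], masks[rank]
        for i in range(len(masks)):
            if i != rank and (masks[i] >> bit) & 1:
                masks[i] ^= masks[rank]
        rank += 1
    return rank

H = circulant_rows(LINE_SUPPORT, N)
first_row = "".join(map(str, H[0]))
rank = gf2_rank(H)
columns = list(zip(*H))
max_col_overlap = max(sum(a & b for a, b in zip(c1, c2))
                      for c1, c2 in combinations(columns, 2))
print(first_row, rank, N - rank, max_col_overlap)
```

Under this assumption the rank is 8, so the null space has dimension 7, and no two columns share more than one check, which rules out cycles of length 4.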

It follows from the analysis of finite-geometry LDPC codes given in Section II that the Tanner graph of the type-I $m$-dimensional EG code $\mathbf{C}_{EG}^{(1)}(m, s)$ does not contain cycles of length $4$. With some modifications to (2) (due to the exclusion of the origin), we find that the number of cycles of length $6$ is

$$N_{6,EG} = \frac{1}{6}\, 2^s \left(2^{(m-1)s}-1\right)\left(2^{ms} - 3 \cdot 2^s + 3\right)\left(2^{ms}-1\right). \tag{10}$$

We see that the Tanner graph of the code contains many cycles of length $6$.

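B. Type-II EG-LDPC Codes

Let

$$\mathbf{H}_{EG}^{(2)}(m, s) = \left[\mathbf{H}_{EG}^{(1)}(m, s)\right]^T.$$

Then $\mathbf{H}_{EG}^{(2)}(m, s)$ is a matrix with $2^{ms} - 1$ rows and

$$J = \frac{(2^{(m-1)s}-1)(2^{ms}-1)}{2^s-1}$$

columns. The rows of this matrix correspond to the nonorigin points of EG$(m, 2^s)$ and the columns correspond to the lines in EG$(m, 2^s)$ that do not pass through the origin. Its column and row weights are $2^s$ and $\frac{2^{ms}-1}{2^s-1} - 1$, respectively. Any two rows of this matrix have at most one "1-component" in common (two nonorigin points that lie on the same line through the origin share none), and any two columns have at most one "1-component" in common.

The null space of $\mathbf{H}_{EG}^{(2)}(m, s)$ gives an LDPC code of length $J$. This code is called the type-II $m$-dimensional EG-LDPC code, denoted $\mathbf{C}_{EG}^{(2)}(m, s)$. This code is also one-step majority-logic decodable and has minimum distance at least $2^s + 1$. Since $\mathbf{H}_{EG}^{(1)}(m, s)$ and $\mathbf{H}_{EG}^{(2)}(m, s)$ have the same rank, $\mathbf{C}_{EG}^{(1)}(m, s)$ and $\mathbf{C}_{EG}^{(2)}(m, s)$ have the same number of parity-check symbols. Since the Tanner graphs of $\mathbf{C}_{EG}^{(1)}(m, s)$ and $\mathbf{C}_{EG}^{(2)}(m, s)$ are dual, they have the same cycle distribution. For $m = 2$, since the parity-check matrix $\mathbf{H}_{EG}^{(1)}(2, s)$ of the type-I 2-D EG-LDPC code is a square matrix whose rows are the cyclic shifts of the first row, the rows of $\mathbf{H}_{EG}^{(2)}(2, s)$ are simply permutations of the rows of $\mathbf{H}_{EG}^{(1)}(2, s)$. Therefore, $\mathbf{C}_{EG}^{(1)}(2, s)$ and $\mathbf{C}_{EG}^{(2)}(2, s)$ are identical.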

In general, for $m \neq 2$, $\mathbf{C}_{EG}^{(2)}(m, s)$ is not cyclic but it can be put in quasi-cyclic form. To see this, consider the $J$ lines in EG$(m, 2^s)$ that do not pass through the origin. The incidence vectors of these lines can be partitioned into

$$K = \frac{2^{(m-1)s}-1}{2^s-1} \tag{11}$$

cyclic classes. Each of these $K$ cyclic classes contains $2^{ms} - 1$ incidence vectors of lines, which are obtained by cyclically shifting any incidence vector in the class $2^{ms} - 2$ times. For each cyclic class of incidence vectors of lines, we can choose a representative and the rest of the incidence vectors are generated by cyclically shifting this representative. Now we construct a $(2^{ms}-1) \times K$ matrix $\mathbf{H}_0$, whose $K$ columns are the $K$ representative incidence vectors of the cyclic classes. For $1 \le i \le 2^{ms} - 2$, let $\mathbf{H}_i$ be a $(2^{ms}-1) \times K$ matrix whose columns are the $i$th (downward) cyclic shifts of the columns of $\mathbf{H}_0$. Form the following matrix:

$$\mathbf{H}_{EG}^{(2)}(m, s) = \left[\mathbf{H}_0, \mathbf{H}_1, \ldots, \mathbf{H}_{2^{ms}-2}\right]. \tag{12}$$

Then the null space of $\mathbf{H}_{EG}^{(2)}(m, s)$ gives a quasi-cyclic type-II $m$-dimensional EG-LDPC code $\mathbf{C}_{EG}^{(2)}(m, s)$. Every $K$ cyclic shifts of a codeword in $\mathbf{C}_{EG}^{(2)}(m, s)$ is also a codeword in $\mathbf{C}_{EG}^{(2)}(m, s)$. Encoding of quasi-cyclic codes can also be achieved with linear feedback shift registers [27].
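The quasi-cyclic property follows from the block structure alone, which a toy experiment can illustrate. The sketch below uses a small, made-up pair of representative columns (length 5, K = 2; these are hypothetical and are not incidence vectors of any Euclidean geometry), stacks all downward cyclic shifts side by side, enumerates the null space by brute force, and checks that shifting a codeword by K positions gives another codeword.

```python
from itertools import product

def downward_shift(column, i):
    """Cyclically shift a column vector downward by i positions."""
    n = len(column)
    return [column[(r - i) % n] for r in range(n)]

def quasi_cyclic_columns(h0_columns):
    """Columns of [H_0, H_1, ..., H_{n-1}], where H_i holds the i-th downward shifts of H_0."""
    n = len(h0_columns[0])
    cols = []
    for i in range(n):
        cols.extend(downward_shift(c, i) for c in h0_columns)
    return cols

def in_null_space(v, cols):
    """True if every check sum (row of the matrix) is zero modulo 2."""
    n_rows = len(cols[0])
    return all(sum(v[j] * cols[j][r] for j in range(len(cols))) % 2 == 0
               for r in range(n_rows))

# Hypothetical representatives: K = 2 columns of length n = 5.
H0 = [[1, 1, 0, 0, 0], [1, 0, 1, 0, 0]]
K = len(H0)
cols = quasi_cyclic_columns(H0)

codewords = [v for v in product((0, 1), repeat=len(cols)) if in_null_space(v, cols)]
closed = all(in_null_space(v[-K:] + v[:-K], cols) for v in codewords)
print(len(codewords), closed)
```

The reason this works is that shifting the codeword by K positions moves each block of K bits to the next block, which has the same effect on the syndrome as cyclically shifting the syndrome itself, so a zero syndrome stays zero.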

C. Type-I PG-LDPC Codes

The construction of PG-LDPC codes for both types is based on the lines and points of projective geometries (PGs) over finite fields. For the purpose of code construction, a brief description of this family of finite geometries is given here.

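Let GF$(2^{(m+1)s})$ be the extension field of GF$(2^s)$. Let $\alpha$ be a primitive element of GF$(2^{(m+1)s})$. Let

$$n = \frac{2^{(m+1)s}-1}{2^s-1} \tag{13}$$

and $\beta = \alpha^n$. Then the order of $\beta$ is $2^s - 1$. The $2^s$ elements $0, 1, \beta, \beta^2, \ldots, \beta^{2^s-2}$ form all the elements of GF$(2^s)$. Consider the first $n$ powers of $\alpha$, $\alpha^0, \alpha^1, \ldots, \alpha^{n-1}$. Partition the nonzero elements of GF$(2^{(m+1)s})$ into $n$ disjoint subsets as follows:

$$\{\alpha^i, \beta\alpha^i, \beta^2\alpha^i, \ldots, \beta^{2^s-2}\alpha^i\} \tag{14}$$

for $0 \le i < n$. Each set consists of $2^s - 1$ elements and each element is a multiple of the first element in the set. Represent each set by its first element as follows:

$$(\alpha^i) \triangleq \{\alpha^i, \beta\alpha^i, \ldots, \beta^{2^s-2}\alpha^i\}$$

with $0 \le i < n$. For any $\alpha^j \in$ GF$(2^{(m+1)s})$, if $\alpha^j = \beta^l \alpha^i$ with $0 \le i < n$, then $\alpha^j$ is in $(\alpha^i)$ and represented by $(\alpha^i)$. If we represent each element in GF$(2^{(m+1)s})$ as an $(m+1)$-tuple over GF$(2^s)$, then $(\alpha^i)$ consists of $2^s - 1$ $(m+1)$-tuples over GF$(2^s)$. The $(m+1)$-tuple for $\alpha^i$ represents the $2^s - 1$ $(m+1)$-tuples in $(\alpha^i)$. The $(m+1)$-tuple over GF$(2^s)$ that represents $(\alpha^i)$ may be regarded as a point in a finite geometry over GF$(2^s)$. Then the points $(\alpha^0), (\alpha^1), \ldots, (\alpha^{n-1})$ form an $m$-dimensional projective geometry over GF$(2^s)$, denoted PG$(m, 2^s)$ [28], [36]–[38]. Note that the $2^s - 1$ elements in $\{\alpha^i, \beta\alpha^i, \ldots, \beta^{2^s-2}\alpha^i\}$ are considered to be the same point in PG$(m, 2^s)$ and a PG does not have an origin.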

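Let $(\alpha^i)$ and $(\alpha^j)$ be any two distinct points in PG$(m, 2^s)$. Then, the line passing through (or connecting) $(\alpha^i)$ and $(\alpha^j)$ consists of points of the following form: $(\eta_1 \alpha^i + \eta_2 \alpha^j)$, where $\eta_1$ and $\eta_2$ are from GF$(2^s)$ and are not both equal to zero. Since $(\eta_1 \alpha^i + \eta_2 \alpha^j)$ and $(\beta^k \eta_1 \alpha^i + \beta^k \eta_2 \alpha^j)$ are the same point, each line in PG$(m, 2^s)$ consists of

$$\frac{2^{2s}-1}{2^s-1} = 2^s + 1 \tag{15}$$

points.

Let $(\alpha^l)$ be a point not on the line $\{(\eta_1 \alpha^i + \eta_2 \alpha^j)\}$. Then, the line $\{(\eta_1 \alpha^i + \eta_2 \alpha^j)\}$ and the line $\{(\eta_1 \alpha^l + \eta_2 \alpha^j)\}$ have $(\alpha^j)$ as a common point (the only common point). We say that they intersect at $(\alpha^j)$. The number of lines in PG$(m, 2^s)$ that intersect at a given point is

$$\frac{2^{ms}-1}{2^s-1} = 1 + 2^s + \cdots + 2^{(m-1)s}. \tag{16}$$

There are

$$J = \frac{\left(2^{ms} + \cdots + 2^s + 1\right)\left(2^{(m-1)s} + \cdots + 2^s + 1\right)}{2^s+1} \tag{17}$$

lines in PG$(m, 2^s)$.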
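These counts can be evaluated directly. The sketch below computes the number of points, points per line, lines per point, and total lines of a projective geometry for given m and s, and cross-checks the line count by double counting incidences. The function and key names are illustrative assumptions.

```python
def pg_parameters(m, s):
    """Point and line counts of the projective geometry PG(m, 2^s)."""
    q = 2 ** s
    n_points = (q ** (m + 1) - 1) // (q - 1)
    points_per_line = q + 1
    lines_per_point = (q ** m - 1) // (q - 1)
    # Line count as a product of two geometric sums divided by (2^s + 1).
    sum_m = sum(q ** i for i in range(m + 1))       # 2^{ms} + ... + 2^s + 1
    sum_m1 = sum(q ** i for i in range(m))          # 2^{(m-1)s} + ... + 2^s + 1
    n_lines = sum_m * sum_m1 // (q + 1)
    # Double counting: each (point, line) incidence is counted once from either side.
    assert n_points * lines_per_point == n_lines * points_per_line
    return {
        "points": n_points,
        "points_per_line": points_per_line,
        "lines_per_point": lines_per_point,
        "lines": n_lines,
    }

print(pg_parameters(2, 1))
print(pg_parameters(3, 2))
```

For m = 3 and s = 2 this reproduces the 85 points and 357 lines quoted for the three-dimensional example later in this section.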

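Form a matrix $\mathbf{H}_{PG}^{(1)}(m, s)$ whose rows are the incidence vectors of the lines in PG$(m, 2^s)$ and whose columns correspond to the points of PG$(m, 2^s)$. The columns are arranged in the order $(\alpha^0), (\alpha^1), \ldots, (\alpha^{n-1})$. $\mathbf{H}_{PG}^{(1)}(m, s)$ has $J$ rows, with $J$ given by (17), and $n = \frac{2^{(m+1)s}-1}{2^s-1}$ columns. It follows from the structural properties of lines and points described above that $\mathbf{H}_{PG}^{(1)}(m, s)$ has the following structural properties: 1) each row has weight $\rho = 2^s + 1$; 2) each column has weight $\gamma = \frac{2^{ms}-1}{2^s-1}$; 3) any two columns have exactly one "1-component" in common; and 4) any two rows have at most one "1-component" in common. The density of $\mathbf{H}_{PG}^{(1)}(m, s)$ is

$$r = \frac{\rho}{n} = \frac{2^{2s}-1}{2^{(m+1)s}-1}.$$

For $m \ge 2$, $r$ is relatively small. Therefore, $\mathbf{H}_{PG}^{(1)}(m, s)$ is a sparse matrix.

Let $\mathbf{C}_{PG}^{(1)}(m, s)$ be the null space of $\mathbf{H}_{PG}^{(1)}(m, s)$. Then $\mathbf{C}_{PG}^{(1)}(m, s)$ is a regular LDPC code, called the type-I $m$-dimensional PG-LDPC code. Since the column weight of $\mathbf{H}_{PG}^{(1)}(m, s)$ is $\gamma = \frac{2^{ms}-1}{2^s-1}$, the minimum distance of $\mathbf{C}_{PG}^{(1)}(m, s)$ is at least $\gamma + 1 = \frac{2^{ms}-1}{2^s-1} + 1$. This regular LDPC code turns out to be the one-step majority-logic decodable $(1, s)$th-order PG code constructed based on the lines and points of PG$(m, 2^s)$ discovered in the late 1960s [28], [44], [45] and is the dual of a nonprimitive polynomial code [40]–[43]. It is cyclic and, therefore, can be encoded with a linear feedback shift register based on its generator polynomial.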

TABLE II
A LIST OF TYPE-I 2-D PG-LDPC CODES

Let $h$ be a nonnegative integer less than $2^{(m+1)s} - 1$. For a nonnegative integer $l$, let $h^{(l)}$ be the remainder resulting from dividing $2^l h$ by $2^{(m+1)s} - 1$. The $2^s$-weight of $h$, $W_{2^s}(h)$, is defined by (6). Let $g_{PG}^{(1)}(X)$ be the generator polynomial of the type-I $m$-dimensional PG-LDPC code constructed based on PG$(m, 2^s)$. Let $\alpha$ be a primitive element of GF$(2^{(m+1)s})$. Then $g_{PG}^{(1)}(X)$ has $\alpha^h$ as a root if and only if $h$ is divisible by $2^s - 1$ and

$$0 \le \max_{0 \le l < s} W_{2^s}\!\left(h^{(l)}\right) = j\,(2^s-1) \tag{18}$$

with $0 \le j \le m - 1$ [28], [40], [44]. Let $\xi = \alpha^{2^s-1}$. The order of $\xi$ is then $n = \frac{2^{(m+1)s}-1}{2^s-1}$. From the characterization of the roots of $g_{PG}^{(1)}(X)$ given by (18), it can be shown [39], [40] that $g_{PG}^{(1)}(X)$ has the following consecutive powers of $\xi$:

$$\xi^0 = 1, \xi^1, \xi^2, \ldots, \xi^{(2^{ms}-1)/(2^s-1)-1} \tag{19}$$

as roots. Therefore, it follows from the BCH bound that the minimum distance $d_{PG}^{(1)}$ of the type-I $m$-dimensional PG-LDPC code is lower-bounded as follows:

$$d_{PG}^{(1)} \ge \frac{2^{ms}-1}{2^s-1} + 1. \tag{20}$$

This bound is exactly the bound derived based on one-step majority-logic decoding.

The number of parity-check symbols of the type-I $m$-dimensional PG-LDPC code for a given $s$ can be enumerated by determining the roots of its generator polynomial. A combinatorial expression for this number can be found in [42].
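As a rough illustration of enumerating the roots, the sketch below lists the exponents h that satisfy the divisibility and weight conditions for small parameters and counts them. Function names are assumptions made for this example, and the weight is taken over the m + 1 radix digits of each shifted exponent.

```python
def radix_weight(h, s, digits):
    """Sum of the radix-2^s digits of h, taken over the given number of digits."""
    q = 2 ** s
    total = 0
    for _ in range(digits):
        total += h % q
        h //= q
    return total

def pg_root_exponents(m, s):
    """Exponents h such that alpha^h is a root of the type-I PG-LDPC generator polynomial."""
    q = 2 ** s
    order = q ** (m + 1) - 1
    roots = []
    for h in range(0, order, q - 1):              # h must be divisible by 2^s - 1
        w = max(radix_weight((2 ** l * h) % order, s, m + 1) for l in range(s))
        # The maximum weight must equal j * (2^s - 1) for some 0 <= j <= m - 1.
        if w % (q - 1) == 0 and w // (q - 1) <= m - 1:
            roots.append(h)
    return roots

for m, s in [(2, 1), (2, 2)]:
    roots = pg_root_exponents(m, s)
    print((m, s), roots, len(roots))
```

For the two-dimensional cases shown, the number of roots found is 3^s + 1, in agreement with the number of parity bits listed for type-I 2-D PG-LDPC codes.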

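A special subclass of PG-LDPC codes is the class of type-I 2-D PG-LDPC codes constructed based on PG$(2, 2^s)$ for various $s$. For any positive integer $s \ge 2$, the type-I 2-D PG-LDPC code $\mathbf{C}_{PG}^{(1)}(2, s)$ has the following parameters [28], [42], [46], [47]:

$$\begin{aligned}
\text{Length} \quad & n = 2^{2s} + 2^s + 1 \\
\text{Number of parity bits} \quad & n - k = 3^s + 1 \\
\text{Minimum distance} \quad & d_{\min} = 2^s + 2 \\
\text{Row weight of the parity-check matrix} \quad & \rho = 2^s + 1 \\
\text{Column weight of the parity-check matrix} \quad & \gamma = 2^s + 1
\end{aligned} \tag{21}$$

It is a difference-set code [28], [46]. The parity-check matrix $\mathbf{H}_{PG}^{(1)}(2, s)$ of this code is a $(2^{2s}+2^s+1) \times (2^{2s}+2^s+1)$ square matrix, which can be formed by taking the incidence vector of a line in PG$(2, 2^s)$ and its $2^{2s} + 2^s$ cyclic shifts as rows. A list of type-I 2-D PG-LDPC codes is given in Table II.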

D. Type-II PG-LDPC Codes

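Let $\mathbf{H}_{PG}^{(2)}(m, s)$ be the transpose of $\mathbf{H}_{PG}^{(1)}(m, s)$. Then the rows and columns of $\mathbf{H}_{PG}^{(2)}(m, s)$ correspond to the points and lines of PG$(m, 2^s)$, respectively. $\mathbf{H}_{PG}^{(2)}(m, s)$ is also a low-density matrix with row weight $\frac{2^{ms}-1}{2^s-1}$ and column weight $2^s + 1$. The null space of $\mathbf{H}_{PG}^{(2)}(m, s)$ gives a regular LDPC code, called the type-II $m$-dimensional PG-LDPC code, denoted $\mathbf{C}_{PG}^{(2)}(m, s)$. This code is also one-step majority-logic decodable with minimum distance at least $2^s + 2$. For $m = 2$, $\mathbf{H}_{PG}^{(1)}(2, s)$ is a square matrix whose rows are the cyclic shifts of the first row, and the rows of $\mathbf{H}_{PG}^{(2)}(2, s)$ are simply permutations of the rows of $\mathbf{H}_{PG}^{(1)}(2, s)$. As a result, $\mathbf{C}_{PG}^{(1)}(2, s)$ and $\mathbf{C}_{PG}^{(2)}(2, s)$ are identical. In general, for $m \neq 2$, $\mathbf{C}_{PG}^{(2)}(m, s)$ is not cyclic but it can be put in quasi-cyclic form in a similar manner as for the type-II $m$-dimensional EG-LDPC code.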

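Example 2: Let $m = 3$ and $s = 2$. The three-dimensional (3-D) projective geometry PG$(3, 2^2)$ has 85 points and 357 lines. To construct the type-I 3-D PG-LDPC code $\mathbf{C}_{PG}^{(1)}(3, 2)$, we form the parity-check matrix $\mathbf{H}_{PG}^{(1)}(3, 2)$ whose rows are the incidence vectors of all the 357 lines in PG$(3, 2^2)$ and whose columns correspond to all the 85 points in PG$(3, 2^2)$. The matrix $\mathbf{H}_{PG}^{(1)}(3, 2)$ can be put in the following form:

$$\mathbf{H}_{PG}^{(1)}(3, 2) = \begin{bmatrix} \mathbf{I}_{17} \;\; \mathbf{I}_{17} \;\; \mathbf{I}_{17} \;\; \mathbf{I}_{17} \;\; \mathbf{I}_{17} \\ \mathbf{H}_1 \\ \mathbf{H}_2 \\ \mathbf{H}_3 \\ \mathbf{H}_4 \end{bmatrix}$$

where $\mathbf{I}_{17}$ is the $17 \times 17$ identity matrix and each $\mathbf{H}_i$ is an $85 \times 85$ circulant matrix. The circulant matrices $\mathbf{H}_1$, $\mathbf{H}_2$, $\mathbf{H}_3$, $\mathbf{H}_4$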

have the following vectors (in polynomial form) as their first rows, respectively:

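The matrix $\mathbf{H}_{PG}^{(1)}(3, 2)$ has row weight $\rho = 5$ and column weight $\gamma = 21$. The null space of $\mathbf{H}_{PG}^{(1)}(3, 2)$ gives a $(85, 24)$ type-I 3-D PG-LDPC code $\mathbf{C}_{PG}^{(1)}(3, 2)$. The companion code of this code is the null space of the parity-check matrix $\mathbf{H}_{PG}^{(2)}(3, 2) = [\mathbf{H}_{PG}^{(1)}(3, 2)]^T$. The matrix $\mathbf{H}_{PG}^{(2)}(3, 2)$ has row weight $21$ and column weight $5$. $\mathbf{C}_{PG}^{(2)}(3, 2)$ has the same number of parity-check bits as $\mathbf{C}_{PG}^{(1)}(3, 2)$. Hence, it is a $(357, 296)$ PG-LDPC code with minimum distance at least $6$.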

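It follows from (2), (13), and (15) that the number of cycles of length $6$ in the Tanner graph of an $m$-dimensional (type-I or II) PG-LDPC code is

$$N_{6,PG} = \frac{1}{6}\, n (n-1)\left(n - 2^s - 1\right), \qquad n = \frac{2^{(m+1)s}-1}{2^s-1}. \tag{22}$$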

IV. DECODING OF FINITE-GEOMETRY LDPC CODES

Finite-geometry LDPC codes can be decoded in various ways, namely one-step MLG decoding [28], [31], BF decoding [1], [2], weighted MLG decoding, weighted BF decoding, APP decoding [2], [31] and SPA decoding [10], [11], [15], [20], [22]. These decoding methods range from low to high decoding complexity and from reasonably good to very good error performance. They provide a wide range of tradeoffs among decoding complexity, decoding speed, and error performance. MLG and BF decodings are hard-decision decoding and they can be easily implemented. Since finite-geometry LDPC codes have relatively good minimum distances, they provide relatively large coding gains over the uncoded system.

MLG decoding has the least decoding delay and very high decoding speed can be achieved. APP and the SPA decodings are soft-decision decoding schemes. They require extensive decoding computation but they provide the best error performance. Weighted MLG and BF decodings are between hard- and soft-decision decodings. They improve the error performance of the MLG and BF decodings with some additional computational complexity. They offer a good tradeoff between error performance and decoding complexity. The SPA decoding gives the best error performance among the six decoding methods for finite-geometry LDPC codes and yet is practically implementable.

The first MLG decoding algorithm was devised by Reed [48] for decoding Reed–Muller codes [27]. Later, Reed’s algorithm was reformulated and generalized by Massey for decoding both block and convolutional codes [31]. A thorough discussion of the various types and implementations of MLG decoding can be found in [28]. Therefore, we will not describe this decoding method here. APP decoding also gives minimum error performance; however, it is computationally intractable, and hence it will not be discussed here for decoding finite-geometry LDPC codes. A good presentation of APP decoding can be found in [1], [2].

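Suppose a finite-geometry (EG- or PG-) LDPC code $C$ is used for error control over an additive white Gaussian noise (AWGN) channel with zero mean and power spectral density $N_0/2$. Assume binary phase-shift keying (BPSK) signaling with unit energy. A codeword $\mathbf{v} = (v_0, v_1, \ldots, v_{n-1})$ is mapped into a bipolar sequence $\mathbf{x} = (x_0, x_1, \ldots, x_{n-1})$ before its transmission, where $x_l = 2v_l - 1 = +1$ for $v_l = 1$ and $x_l = -1$ for $v_l = 0$, with $0 \le l \le n-1$.

Let $\mathbf{y} = (y_0, y_1, \ldots, y_{n-1})$ be the soft-decision received sequence at the output of the receiver matched filter. For $0 \le l \le n-1$, $y_l = \pm 1 + n_l$, where $n_l$ is a Gaussian random variable with zero mean and variance $N_0/2$. Let $\mathbf{z} = (z_0, z_1, \ldots, z_{n-1})$ be the binary hard-decision received sequence obtained from $\mathbf{y}$ as follows: $z_l = 1$ for $y_l > 0$ and $z_l = 0$ for $y_l \le 0$.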

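Let $\mathbf{H}$ be the parity-check matrix of the finite-geometry LDPC code $C$ with $J$ rows and $n$ columns. Let $\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_J$ denote the rows of $\mathbf{H}$, where $\mathbf{h}_j = (h_{j,0}, h_{j,1}, \ldots, h_{j,n-1})$ for $1 \le j \le J$. Then

$$\mathbf{s} = (s_1, s_2, \ldots, s_J) = \mathbf{z} \cdot \mathbf{H}^T \tag{23}$$

gives the syndrome of the received sequence $\mathbf{z}$, where the $j$th syndrome component $s_j$ is given by the check sum

$$s_j = \mathbf{z} \cdot \mathbf{h}_j = \sum_{l=0}^{n-1} z_l h_{j,l}. \tag{24}$$

The received vector $\mathbf{z}$ is a codeword if and only if $\mathbf{s} = \mathbf{0}$. If $\mathbf{s} \ne \mathbf{0}$, errors in $\mathbf{z}$ are detected. A nonzero syndrome component $s_j$ indicates a parity failure. The total number of parity failures is equal to the number of nonzero syndrome components in $\mathbf{s}$. Let

$$\mathbf{e} = (e_0, e_1, \ldots, e_{n-1}) = (v_0, v_1, \ldots, v_{n-1}) + (z_0, z_1, \ldots, z_{n-1}). \tag{25}$$

Then $\mathbf{e}$ is the error pattern in $\mathbf{z}$. This error pattern $\mathbf{e}$ and the syndrome $\mathbf{s}$ satisfy the condition

$$\mathbf{s} = (s_1, s_2, \ldots, s_J) = \mathbf{e} \cdot \mathbf{H}^T \tag{26}$$

where

$$s_j = \mathbf{e} \cdot \mathbf{h}_j = \sum_{l=0}^{n-1} e_l h_{j,l} \tag{27}$$

for $1 \le j \le J$.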
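The hard-decision rule and the syndrome computation of (23) and (24) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, and the 7 × 7 circulant built from the line {0, 1, 3} of the projective plane over GF(2) is a toy parity-check matrix chosen for brevity, not one of the codes evaluated in this paper.

```python
def circulant(first_row):
    """Square circulant whose row j is first_row cyclically shifted right j times."""
    n = len(first_row)
    return [[first_row[(l - j) % n] for l in range(n)] for j in range(n)]

def hard_decision(y):
    """z_l = 1 if y_l > 0, else 0."""
    return [1 if y_l > 0 else 0 for y_l in y]

def syndrome(H, z):
    """s_j = sum_l z_l * h_{j,l} (mod 2), one check sum per row of H."""
    return [sum(h & b for h, b in zip(row, z)) % 2 for row in H]

# Toy example (assumption): all-zero codeword sent as -1's, bit 4 received wrongly.
H = circulant([1, 1, 0, 1, 0, 0, 0])
y = [-1.0, -0.9, -1.1, -0.8, 0.2, -1.2, -1.0]
z = hard_decision(y)
s = syndrome(H, z)
parity_failures = sum(s)
```

In this toy case the single hard-decision error at position 4 fails exactly the three check sums whose rows contain that position.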

A. BF Decoding

BF decoding of LDPC codes was devised by Gallager in the early 1960s [1], [2]. When detectable errors occur during the transmission, there will be parity failures in the syndrome $\mathbf{s} = (s_1, s_2, \ldots, s_J)$ and some of the syndrome bits are equal to $1$. BF decoding is based on the change of the number of parity failures in $\{\mathbf{z} \cdot \mathbf{h}_j : 1 \le j \le J\}$ when a bit in the received sequence $\mathbf{z}$ is changed.

First, the decoder computes all the parity-check sums based on (24) and then changes any bit in the received vector $\mathbf{z}$ that is contained in more than some fixed number $\delta$ of unsatisfied parity-check equations. Using these new values, the parity-check sums are recomputed, and the process is repeated until the parity-check equations are all satisfied. This decoding is an iterative decoding algorithm. The parameter $\delta$, called the threshold, is a design parameter which should be chosen to optimize the error performance while minimizing the number of computations of parity-check sums. The value of $\delta$ depends on the code parameters and the signal-to-noise ratio (SNR).

If decoding fails for a given value of $\delta$, then the value of $\delta$ can be reduced to allow further decoding iterations. For error patterns with a number of errors less than or equal to the error-correcting capability of the code, the decoding will be completed in one or a few iterations. Otherwise, more decoding iterations are needed. Therefore, the number of decoding iterations is a random variable and is a function of the channel SNR. A limit may be set on the number of iterations. When this limit is reached, the decoding process is terminated to avoid excessive computations. Due to the nature of LDPC codes, the above decoding algorithm corrects many error patterns with a number of errors exceeding the error-correcting capability of the code.

A very simple BF decoding algorithm is given as follows.

Step 1: Compute the parity-check sums (syndrome bits). If all the parity-check equations are satisfied (i.e., all the syndrome bits are zero), stop the decoding.

Step 2: Find the number of unsatisfied parity-check equations for each code bit position, denoted $f_i$, $i = 0, 1, \ldots, n-1$.

Step 3: Identify the set $\Omega$ of bits for which $f_i$ is the largest.

Step 4: Flip the bits in set $\Omega$.

Step 5: Repeat Steps 1 to 4 until all the parity-check equations are satisfied (in this case, the iteration stops in Step 1) or a predefined maximum number of iterations is reached.
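The five steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the 7 × 7 circulant parity-check matrix (incidence vectors of the seven lines of the projective plane over GF(2)) is a toy example picked because every bit is checked by three check sums that are orthogonal on it.

```python
def circulant(first_row):
    """Square circulant whose row j is first_row cyclically shifted right j times."""
    n = len(first_row)
    return [[first_row[(l - j) % n] for l in range(n)] for j in range(n)]

def syndrome(H, z):
    return [sum(h & b for h, b in zip(row, z)) % 2 for row in H]

def bf_decode(H, z, max_iters=20):
    """Simple bit flipping: flip every bit with the largest failure count f_i."""
    z = list(z)
    n = len(z)
    for _ in range(max_iters):
        s = syndrome(H, z)                      # Step 1
        if not any(s):
            return z, True
        # Step 2: f_i = number of unsatisfied check sums that contain bit i
        f = [sum(s[j] for j in range(len(H)) if H[j][i]) for i in range(n)]
        f_max = max(f)                          # Step 3: the set Omega
        for i in range(n):                      # Step 4: flip bits in Omega
            if f[i] == f_max:
                z[i] ^= 1
    return z, not any(syndrome(H, z))           # iteration limit reached

H = circulant([1, 1, 0, 1, 0, 0, 0])
codeword = [0, 0, 1, 0, 1, 1, 1]                # complement of the line {0, 1, 3}
received = list(codeword)
received[0] ^= 1                                # one channel error
decoded, ok = bf_decode(H, received)
```

With a single error, the erroneous bit fails all three of its check sums while every other bit fails exactly one, so one flip restores the codeword.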

BF decoding requires only logical operations. The number of logical operations performed for each decoding iteration is linearly proportional to $J\rho$ (or $n\gamma$), say $KJ\rho$, where the constant $K$ depends on the implementation of the BF decoding algorithm. Typically, $K$ is less than three. The simple BF decoding algorithm can be improved by using adaptive thresholds $\delta$’s. Of course, this improvement is achieved at the expense of more computations. EG- and PG-LDPC codes perform well with the BF decoding due to the large number of check sums orthogonal on each code bit.

B. Weighted MLG and BF Decodings

The simple hard-decision MLG and BF decodings can be improved to achieve better error performance by including some kind of reliability information (or measure) of the received symbols in their decoding decisions. Of course, additional decoding complexity is required for such performance improvement.

Consider the soft-decision received sequence $\mathbf{y} = (y_0, y_1, \ldots, y_{n-1})$. For the AWGN channel, a simple measure of the reliability of a received symbol $y_l$ is its magnitude $|y_l|$. The larger the magnitude $|y_l|$, the larger the reliability of the hard-decision digit $z_l$. Many algorithms for decoding linear block codes based on this reliability measure have been devised. In the following, this reliability measure is used to modify the one-step majority-logic decoding and the BF decoding.

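Again consider a finite-geometry LDPC code specified by a parity-check matrix $\mathbf{H}$ with $J$ rows, $\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_J$. For $0 \le l \le n-1$ and $1 \le j \le J$, define

$$|y_j|_{\min}^{(l)} = \min\{|y_i| : 0 \le i \le n-1,\ h_{j,i}^{(l)} = 1\} \tag{28}$$

and

$$E_l = \sum_{s_j^{(l)} \in S_l} \left(2 s_j^{(l)} - 1\right) |y_j|_{\min}^{(l)} \tag{29}$$

where $S_l$ is the set of check sums $s_j^{(l)}$ orthogonal on bit position $l$. The value $E_l$ is simply a weighted check sum that is orthogonal on the code bit position $l$. Let $\mathbf{e} = (e_0, e_1, \ldots, e_{n-1})$ be the error pattern to be estimated. Then the one-step MLG decoding can be modified based on the weighted check sum $E_l$ as follows:

$$e_l = \begin{cases} 1, & \text{for } E_l > 0 \\ 0, & \text{for } E_l \le 0 \end{cases} \tag{30}$$

for $0 \le l \le n-1$. The above decoding algorithm is called weighted MLG decoding and was first proposed by Kolesnik in 1971 [49] for decoding majority-logic decodable codes.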

The decision rule given by (30) can be used in BF decoding. In this case, the decoding is carried out as follows.

Step 1: Compute the check sums. If all the parity-check equations are satisfied, stop the decoding.

Step 2: Compute $E_l$ based on (29), for $0 \le l \le n-1$.

Step 3: Find the bit position $l$ for which $E_l$ is the largest.

Step 4: Flip the bit $z_l$.

Step 5: Repeat Steps 1 to 4. This process of BF continues until all the parity-check equations are satisfied or a preset maximum number of iterations is reached.

This modified BF algorithm is called weighted BF decoding algorithm.
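A minimal sketch of the weighted BF procedure is given below. It is an illustration under assumptions: the function names are hypothetical, the weight of each check sum is taken as the smallest received magnitude among the positions it checks (held fixed over iterations), and the 7 × 7 circulant is a toy parity-check matrix rather than one of the codes in this paper.

```python
def circulant(first_row):
    """Square circulant whose row j is first_row cyclically shifted right j times."""
    n = len(first_row)
    return [[first_row[(l - j) % n] for l in range(n)] for j in range(n)]

def weighted_bf_decode(H, y, max_iters=20):
    """Flip, one bit per iteration, the position with the largest weighted sum E_l."""
    J, n = len(H), len(H[0])
    z = [1 if y_l > 0 else 0 for y_l in y]                   # hard decisions
    support = [[l for l in range(n) if H[j][l]] for j in range(J)]
    # Reliability of check sum j: smallest |y_i| over the bits it checks.
    y_min = [min(abs(y[l]) for l in support[j]) for j in range(J)]

    def check_sums(bits):
        return [sum(bits[l] for l in support[j]) % 2 for j in range(J)]

    for _ in range(max_iters):
        s = check_sums(z)                                    # Step 1
        if not any(s):
            return z, True
        # Step 2: failed checks add their weight, satisfied checks subtract it.
        E = [sum((2 * s[j] - 1) * y_min[j] for j in range(J) if H[j][l])
             for l in range(n)]
        worst = max(range(n), key=lambda l: E[l])            # Step 3
        z[worst] ^= 1                                        # Step 4
    return z, not any(check_sums(z))

H = circulant([1, 1, 0, 1, 0, 0, 0])
# All-zero codeword sent as -1's; position 4 is received weakly on the wrong side.
y = [-1.0, -0.9, -1.1, -0.8, 0.2, -1.2, -1.0]
decoded, ok = weighted_bf_decode(H, y)
```

Only real additions and comparisons are needed beyond the check sums, which matches the complexity discussion that follows.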

The above weighted decoding algorithms are in a way soft-decision decoding algorithms and require real addition operations to compute the weighted check sums $E_l$’s to make decisions. Since a real addition operation is much more complex than a logical operation, the computational complexities of both weighted MLG and BF decodings are dominated by the total number of real additions needed to decode a received sequence.

From (29), we can readily see that for weighted MLG decoding, the number of real additions required for decoding a received sequence is $K(J\rho + n\gamma)$, where $K$ is a constant. However, for weighted BF decoding, the number of real additions needed for each decoding iteration is $K(J\rho + n\gamma)$. Since $J\rho$ (or $n\gamma$) is the total number of $1$-entries in the parity-check matrix $\mathbf{H}$ of the code, the computational complexities of both weighted MLG and weighted BF decodings are linearly proportional to the total number of $1$-entries in $\mathbf{H}$.

C. The Sum–Product Algorithm

The sum–product algorithm (SPA) [17]–[20], [33] is an iterative decoding algorithm based on belief propagation [10], [11], [20]–[22] which is extremely efficient for decoding LDPC codes. Like the MAP decoding algorithm [50], it is a symbol-by-symbol soft-in/soft-out decoding algorithm. It processes the received symbols iteratively to improve the reliability of each decoded code symbol based on the parity-check sums computed from the hard decisions of the received symbols and the parity-check matrix that specifies the code. The reliability of a decoded symbol can be measured by its marginal a posteriori probability, its log-likelihood ratio (LLR), or the value of its corresponding received symbol. The computed reliability measures of code symbols at the end of each decoding iteration are used as inputs for the next iteration. The decoding iteration process continues until a certain stopping condition is satisfied. Then, based on the computed reliability measures of code symbols, hard decisions are made.

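Again we consider a finite-geometry LDPC code $C$ of length $n$ specified by a parity-check matrix $\mathbf{H}$ with $J$ rows, $\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_J$. For $1 \le j \le J$, define the following index set for $\mathbf{h}_j$:

$$B(\mathbf{h}_j) = \{l : h_{j,l} = 1,\ 0 \le l < n\} \tag{31}$$

which is called the support of $\mathbf{h}_j$.

The implementation of the SPA decoding [10] is based on the computation of marginal a posteriori probabilities $P(v_l \mid \mathbf{y})$’s for $0 \le l < n$. Then the LLR for each code bit is given by

$$L(v_l) = \log \frac{P(v_l = 1 \mid \mathbf{y})}{P(v_l = 0 \mid \mathbf{y})}. \tag{32}$$

Let $p_l^0 = P(v_l = 0)$ and $p_l^1 = P(v_l = 1)$ be the prior probabilities of $v_l = 0$ and $v_l = 1$, respectively.

For $0 \le l < n$, $1 \le j \le J$, and each $\mathbf{h}_j \in A_l$, where $A_l$ is the set of rows of $\mathbf{H}$ that check the code bit $v_l$, let $q_{j,l}^{x,(i)}$ be the conditional probability that the transmitted code bit $v_l$ has value $x$, given the check sums computed based on the check vectors in $A_l \setminus \mathbf{h}_j$ at the $i$th decoding iteration. For $0 \le l < n$, $1 \le j \le J$, and $\mathbf{h}_j \in A_l$, let $\sigma_{j,l}^{x,(i)}$ be the conditional probability that the check sum $s_j$ is satisfied, given $v_l = x$ ($0$ or $1$) and the other code bits in $B(\mathbf{h}_j)$ have a separable distribution $\{q_{j,t}^{v_t,(i)} : t \in B(\mathbf{h}_j) \setminus l\}$, i.e.,

$$\sigma_{j,l}^{x,(i)} = \sum_{\{v_t : t \in B(\mathbf{h}_j) \setminus l\}} P\left(s_j \mid v_l = x, \{v_t : t \in B(\mathbf{h}_j) \setminus l\}\right) \prod_{t \in B(\mathbf{h}_j) \setminus l} q_{j,t}^{v_t,(i)}. \tag{33}$$

The computed values of $\sigma_{j,l}^{x,(i)}$ are then used to update the values of $q_{j,l}^{x,(i+1)}$ as follows:

$$q_{j,l}^{x,(i+1)} = \alpha_{j,l}^{(i+1)}\, p_l^x \prod_{\mathbf{h}_t \in A_l \setminus \mathbf{h}_j} \sigma_{t,l}^{x,(i)} \tag{34}$$

where $\alpha_{j,l}^{(i+1)}$ is chosen such that $q_{j,l}^{0,(i+1)} + q_{j,l}^{1,(i+1)} = 1$.

At the $i$th iteration step, the pseudo-posterior probabilities are given by

$$P^{(i)}(v_l = x \mid \mathbf{y}) = \alpha_l^{(i)}\, p_l^x \prod_{\mathbf{h}_j \in A_l} \sigma_{j,l}^{x,(i-1)} \tag{35}$$

where $\alpha_l^{(i)}$ is chosen such that $P^{(i)}(v_l = 0 \mid \mathbf{y}) + P^{(i)}(v_l = 1 \mid \mathbf{y}) = 1$.

Based on these probabilities, we can form the vector $\mathbf{z}^{(i)} = (z_0^{(i)}, z_1^{(i)}, \ldots, z_{n-1}^{(i)})$ as the decoded candidate with

$$z_l^{(i)} = \begin{cases} 1, & \text{for } P^{(i)}(v_l = 1 \mid \mathbf{y}) > 0.5 \\ 0, & \text{otherwise.} \end{cases}$$

Then compute $\mathbf{z}^{(i)} \cdot \mathbf{H}^T$. If $\mathbf{z}^{(i)} \cdot \mathbf{H}^T = \mathbf{0}$, stop the decoding iteration process and output $\mathbf{z}^{(i)}$ as the decoded codeword.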

The SPA decoding in terms of probability consists of the following steps.

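Initialization: Set $i = 0$ and the maximum number of iterations to $I_{\max}$. For every pair $(j, l)$ such that $h_{j,l} = 1$ with $1 \le j \le J$ and $0 \le l < n$, set $q_{j,l}^{0,(0)} = p_l^0$ and $q_{j,l}^{1,(0)} = p_l^1$.

Step 1: For $0 \le l < n$, $1 \le j \le J$, and each $\mathbf{h}_j \in A_l$, compute the probabilities $\sigma_{j,l}^{0,(i)}$ and $\sigma_{j,l}^{1,(i)}$. Go to Step 2.

Step 2: For $0 \le l < n$, $1 \le j \le J$, and each $\mathbf{h}_j \in A_l$, compute the values of $q_{j,l}^{0,(i+1)}$ and $q_{j,l}^{1,(i+1)}$ and the values of $P^{(i+1)}(v_l = 0 \mid \mathbf{y})$ and $P^{(i+1)}(v_l = 1 \mid \mathbf{y})$. Form $\mathbf{z}^{(i+1)}$ and test $\mathbf{z}^{(i+1)} \cdot \mathbf{H}^T$. If $\mathbf{z}^{(i+1)} \cdot \mathbf{H}^T = \mathbf{0}$ or the maximum iteration number $I_{\max}$ is reached, go to Step 3. Otherwise, set $i := i + 1$ and go to Step 1.

Step 3: Output $\mathbf{z}^{(i+1)}$ as the decoded codeword and stop the decoding process.

Fig. 1. Bit-error probabilities of the type-I 2-D (255, 175) EG-LDPC code and (273, 191) PG-LDPC code based on different decoding algorithms.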
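The probability-domain steps above can be sketched as follows. This is an illustration under several assumptions, not the authors' implementation: the names are hypothetical; the per-bit probabilities are taken from the channel output for BPSK (0 sent as −1, 1 sent as +1) with noise variance `noise_var`; the check-to-bit probabilities use the standard closed form for a single parity check (the probability that an even number of independent bits are ones) in place of the explicit sum; and the 7 × 7 circulant is a toy parity-check matrix.

```python
import math

def circulant(first_row):
    """Square circulant whose row j is first_row cyclically shifted right j times."""
    n = len(first_row)
    return [[first_row[(l - j) % n] for l in range(n)] for j in range(n)]

def spa_decode(H, y, noise_var, max_iters=50):
    J, n = len(H), len(H[0])
    B = [[l for l in range(n) if H[j][l]] for j in range(J)]   # support of row j
    A = [[j for j in range(J) if H[j][l]] for l in range(n)]   # rows checking bit l
    p = []                                                     # (p_l^0, p_l^1)
    for y_l in y:
        p1 = 1.0 / (1.0 + math.exp(-2.0 * y_l / noise_var))
        p.append((1.0 - p1, p1))
    q = {(j, l): p[l] for j in range(J) for l in B[j]}         # initialization
    z = [int(p_l[1] > 0.5) for p_l in p]
    for _ in range(max_iters):
        # Step 1: sigma[(j, l)] = P(check j satisfied | v_l = 0), (| v_l = 1)
        sigma = {}
        for j in range(J):
            for l in B[j]:
                d = 1.0
                for t in B[j]:
                    if t != l:
                        d *= q[(j, t)][0] - q[(j, t)][1]
                sigma[(j, l)] = (0.5 * (1.0 + d), 0.5 * (1.0 - d))
        # Step 2a: unnormalized pseudo-posteriors and tentative hard decisions
        z = []
        for l in range(n):
            w0, w1 = p[l]
            for j in A[l]:
                w0 *= sigma[(j, l)][0]
                w1 *= sigma[(j, l)][1]
            z.append(int(w1 > w0))
        if all(sum(z[l] for l in B[j]) % 2 == 0 for j in range(J)):
            return z, True                                     # z . H^T = 0
        # Step 2b: bit-to-check update, leaving out the target check
        for l in range(n):
            for j in A[l]:
                w0, w1 = p[l]
                for t in A[l]:
                    if t != j:
                        w0 *= sigma[(t, l)][0]
                        w1 *= sigma[(t, l)][1]
                q[(j, l)] = (w0 / (w0 + w1), w1 / (w0 + w1))
    return z, False

H = circulant([1, 1, 0, 1, 0, 0, 0])
y = [-1.0, -0.9, -1.1, -0.8, 0.2, -1.2, -1.0]   # all-zero codeword, bit 4 unreliable
decoded, ok = spa_decode(H, y, noise_var=0.5)
```

A practical decoder would usually work with LLRs and guard against numerical underflow; the sketch keeps the probability form only to stay close to the notation used here.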

In the above SPA decoding, real number addition, subtraction, multiplication, division, exponential, and logarithm operations are needed. In implementation, the last four types of operations are more complex than addition and subtraction. For this reason, we simply ignore the number of additions and subtractions in analyzing the computational complexity. From (33)–(35), we find that the number of multiplications and divisions needed in each iteration of the SPA decoding is of the order $O(2J\rho + 4n\gamma)$ and the number of exponential and logarithm operations needed for each iteration of decoding is of the order $O(n)$. A detailed exposition of the SPA can be found in [10], [17]–[20], [33].

D. Two-Stage Hybrid Decoding

The SPA decoding is computationally expensive. Each decoding iteration requires many real number computations. If decoding of a code with the SPA converges slowly, a large number of iterations is needed to achieve the desired performance. A large number of iterations results in a large number of computations and a long decoding delay, which is not desirable in high-speed communications. However, for finite-geometry LDPC codes, this difficulty can be overcome by using a two-stage hybrid soft/hard decoding scheme. At the first stage, a code is decoded with the SPA with a small fixed number of iterations, say $I$. At the completion of the $I$th iteration, hard decisions of decoded symbols are made based on their LLRs. This results in a binary sequence of estimated code bits. This sequence is then decoded with the simple one-step MLG decoding.

This two-stage hybrid decoding works well for finite-geometry LDPC codes because they have large minimum distances and SPA decoding of these codes converges very fast. Simulation results for many codes show that the performance gap between five iterations and 100 iterations is within 0.2 dB. Therefore, at the first stage, we may set the number of iterations for the SPA decoding to five or less (in many cases, two iterations are enough). The resulting estimated code sequence may still contain a small number of errors. These errors will be corrected by the one-step MLG decoding at the second stage due to the large majority-logic error-correcting capability of the finite-geometry LDPC codes.

The two-stage hybrid soft/hard decoding scheme offers a good tradeoff between error performance and decoding complexity. Furthermore, it reduces decoding delay.

V. PERFORMANCE OF FINITE-GEOMETRY LDPC CODES

To demonstrate the error performance of finite-geometry LDPC codes, we select several EG- and PG-LDPC codes of various lengths and decode them with various decoding methods. Figs. 1–8 show the error probabilities of these codes.

Fig. 1 gives the bit-error performance of the type-I 2-D (255, 175) EG-LDPC code and the type-I 2-D (273, 191) PG-LDPC code given in Tables I and II, respectively. These two codes are equivalent in terms of the geometries based on which they are constructed. They have about the same rate and minimum distance. The EG-LDPC code is decoded with various decoding methods but the PG-LDPC code is only decoded with the SPA decoding. From Fig. 1, we see that these two codes have almost the same error performance with the SPA decoding. We also see that the SPA decoding gives the best error performance at the expense of computational complexity.

Fig. 2. Bit-error probabilities of the (255, 175) EG-LDPC code, (273, 191) PG-LDPC code and two computer-generated (273, 191) Gallager codes with the SPA decoding.

Fig. 3. Bit- and block-error probabilities of the type-I 2-D (1023, 781) EG-LDPC code and (1057, 813) PG-LDPC code based on different decoding algorithms.

Fig. 4. Bit-error probabilities of the (1023, 781) EG-LDPC code, (1057, 813) PG-LDPC code and two computer-generated (1057, 813) Gallager codes with the SPA decoding.

Fig. 5. Bit- and block-error probabilities of the type-I 2-D (4095, 3367) EG-LDPC code and (4161, 3431) PG-LDPC code based on different decoding algorithms.

Fig. 6. Error performances of the type-I 3-D (511, 139) EG-LDPC code and the type-II 3-D (4599, 4227) EG-LDPC code with the SPA decoding.

Fig. 7. Error performance of the type-II five-dimensional (5-D) (86955, 85963) EG-LDPC code with the SPA decoding.

Fig. 8. Convergence of the SPA decoding for the type-I 2-D (4095, 3367) EG-LDPC code.

The hard-decision BF decoding achieves relatively good error performance with much less computational complexity. It outperforms the simple one-step MLG decoding by 0.45 dB at the bit-error rate (BER) of . With some additional computational complexity, the weighted BF decoding achieves 0.75- and 1.20-dB coding gains over the hard-decision BF and MLG decodings at the BER of , respectively, and it is only 1.2 dB away from the performance of the SPA decoding.

It requires much less computational complexity than that of the SPA decoding. Therefore, weighted BF decoding provides a very good tradeoff between the error performance of the SPA decoding and the complexity of the simple one-step MLG decoding. Fig. 2 gives a comparison of the error performance of the two finite-geometry LDPC codes and that of two best computer-generated Gallager’s LDPC codes [10] with equals to and , respectively. All codes are decoded with the SPA decoding. For the two finite-geometry LDPC codes, the maximum number of decoding iterations is set to 50, however, for Gallager’s codes, the maximum number of decoding iterations is set to 200. We see that both finite-geometry LDPC codes outperform their corresponding computer-generated Gallager’s codes. The Gallager’s code with also shows an error floor. This indicates that the code has poor minimum distance.

Fig. 3 shows the bit-error performance of the type-I 2-D (1023, 781) EG-LDPC code and the type-I 2-D (1057, 813)

PG-LDPC code given in Tables I and II, respectively. These two codes are equivalent in terms of the code construction geometries and they have about the same rate and minimum distance. Again, the EG-LDPC code is decoded with various decoding methods and the PG-LDPC code is only decoded with the SPA decoding. The two codes perform almost the same with the SPA decoding. At the BER of , the performance of both codes is only 1.7 dB away from the Shannon limit (with binary-input constraint computed based on the rate of the code). For codes of length and rate , this performance is amazingly good. Again, we see that the weighted BF performs very well and provides a good tradeoff between the error performance of the SPA decoding and the decoding complexity of the simple one-step MLG decoding.

The block error performance of both codes with the SPA decoding is also shown in Fig. 3. They both perform well. Fig. 4 gives a comparison of the error performance of the two finite geometry LDPC codes and that of two best computer-generated Gallager’s LDPC codes with equals to and , respectively. All codes are decoded with the SPA decoding.

We see that the two finite-geometry LDPC codes slightly outperform their corresponding Gallager’s codes.

The next two codes being evaluated are the type-I 2-D (4095, 3367) EG-LDPC code and the type-I 2-D (4161, 3431) PG-LDPC code, the fifth codes given in Tables I and II, respectively. Both codes have rates of about 0.82. Their error performances with various types of decoding are shown in Fig. 5. With the SPA decoding, they perform 1.5 dB from the Shannon limit at the BER of .

Fig. 9. Bit-error probabilities of the type-I 2-D (4095, 3367) EG-LDPC code based on two-stage hybrid decoding.

For $m = s = 3$, the type-I 3-D EG-LDPC code is a (511, 139) code with minimum distance at least 79. Its parity-check matrix $\mathbf{H}$ is a $4599 \times 511$ matrix with row weight $\rho = 8$ and column weight $\gamma = 72$. Then, $\mathbf{H}^T$ is a $511 \times 4599$ matrix with row weight 72 and column weight 8. The null space of $\mathbf{H}^T$ gives the type-II 3-D EG-LDPC code, which is a (4599, 4227) code with minimum distance at least 9 and rate 0.919. The type-I code is a low-rate code but the type-II code is a high-rate code. Both codes have 372 parity-check bits. The bit- and block-error performances of both codes with the SPA decoding are shown in Fig. 6. We see that the type-II EG-LDPC code performs very well. At BER of , its performance is only 1 dB away from the Shannon limit.

For $m = 5$ and $s = 2$, the type-II five-dimensional (5-D) EG-LDPC code constructed based on the lines and points of EG$(5, 2^2)$ is an (86955, 85963) code with rate 0.9886 and minimum distance at least 5. With the SPA decoding, this code performs only 0.4 dB away from the Shannon limit at the BER of as shown in Fig. 7. Its block error performance is also very good.

In decoding the finite-geometry LDPC codes with the SPA decoding, we set the maximum number of decoding iterations to 50. Many codes have been simulated. Simulation results of all these codes show that the SPA decoding converges very fast. For example, consider the type-I 2-D (4095, 3367) EG-LDPC code, the fifth code given in Table I. Fig. 8 shows the convergence of the SPA decoding for this code with

. We see that at BER of , the performance gap between five and 100 iterations is less than 0.2 dB, and the performance between 10 and 100 iterations is less than 0.05 dB. This fast convergence of the SPA decoding for finite-geometry LDPC codes is not shared by the computer-generated Gallager’s codes whose parity-check matrices have small column weights, or .

To demonstrate the effectiveness of the two-stage hybrid soft/hard decoding scheme for finite-geometry LDPC codes, we consider the decoding of the type-I 2-D (4095, 3367) EG-LDPC code. Fig. 8 shows that, in decoding this code with the SPA, the performance gap between two iterations and 100 iterations is about 0.5 dB at the BER of . Therefore, in two-stage hybrid decoding, we may set the first-stage SPA decoding to two iterations and then carry out the second stage with the one-step MLG decoding. The code is capable of correcting 32 or fewer errors with one-step MLG decoding. Fig. 9 shows that the code performs very well with the two-stage hybrid decoding.

The parity-check matrix of a type-I finite-geometry LDPC code in general has more rows than columns. This is because the number of lines is larger than the number of points in either EG or PG, except for the 2-D case. Therefore, the number of rows is larger than the rank of the matrix. In decoding a finite-geometry LDPC code with the SPA (or BF decoding), all the rows of its parity-check matrix are used for computing check sums to achieve good error performance. If we remove some redundant rows from the parity-check matrix, simulation results show that the error performance of the code will be degraded.

Fig. 10. Bit- and block-error probabilities of the extended (65520, 61425) EG-LDPC code with the SPA decoding.

Therefore, finite-geometry LDPC codes in general require more computations than their equivalent computer-generated LDPC codes with small row and column weights (often column weight is or and the row weight is ).

VI. CODE CONSTRUCTION BY COLUMN AND ROW SPLITTING OF THE PARITY-CHECK MATRICES OF FINITE-GEOMETRY LDPC CODES

A finite-geometry (type-I or type-II) LDPC code $C$ of length $n$ can be extended by splitting each column of its parity-check matrix $\mathbf{H}$ into multiple columns. This results in a new parity-check matrix with smaller density and hence a new LDPC code. If the column splitting is done properly, very good extended finite-geometry LDPC codes can be obtained. Some of the extended finite-geometry LDPC codes constructed perform amazingly well with the SPA decoding. They achieve an error performance only a few tenths of a decibel away from the Shannon limit. They are the first known algebraically constructed codes approaching the Shannon limit.

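Let $\mathbf{g}_0, \mathbf{g}_1, \ldots, \mathbf{g}_{n-1}$ denote the columns of the parity-check matrix $\mathbf{H}$. First we consider splitting each column of $\mathbf{H}$ into the same number of columns. All the new columns have the same length as the original column. The weight (or “ones”) of the original column is distributed among the new columns. A regular column weight distribution can be done as follows. Let $q$ be a positive integer such that $2 \le q \le \gamma$. Dividing $\gamma$ by $q$, we have $\gamma = q \times \gamma_{ext} + b$, where $0 \le b < q$. Split each column $\mathbf{g}_i$ of $\mathbf{H}$ into $q$ columns $\mathbf{g}_{i,1}, \mathbf{g}_{i,2}, \ldots, \mathbf{g}_{i,q}$, such that the first $b$ columns, $\mathbf{g}_{i,1}, \ldots, \mathbf{g}_{i,b}$, have weight $\gamma_{ext} + 1$ and the next $q - b$ columns, $\mathbf{g}_{i,b+1}, \ldots, \mathbf{g}_{i,q}$, have weight $\gamma_{ext}$. The distribution of the $\gamma$ “ones” of $\mathbf{g}_i$ into $\mathbf{g}_{i,1}, \ldots, \mathbf{g}_{i,q}$ is carried out in a rotating manner. In the first rotation, the first “1” of $\mathbf{g}_i$ is put in $\mathbf{g}_{i,1}$, the second “1” of $\mathbf{g}_i$ is put in $\mathbf{g}_{i,2}$, and so on. In the second rotation, the $(q+1)$th “one” of $\mathbf{g}_i$ is put in $\mathbf{g}_{i,1}$, the $(q+2)$th “one” of $\mathbf{g}_i$ is put in $\mathbf{g}_{i,2}$, and so on. This rotating distribution of the “ones” of $\mathbf{g}_i$ continues until all the “ones” of $\mathbf{g}_i$ have been distributed into the $q$ new columns.

The above column splitting results in a new parity-check matrix $\mathbf{H}_{ext}$ with $qn$ columns which has the following structural properties: 1) each row has weight $\rho$; 2) each column either has weight $\gamma_{ext}$ or has weight $\gamma_{ext} + 1$; 3) any two columns have at most one “1” in common. If the density of $\mathbf{H}$ is $r$, the density of $\mathbf{H}_{ext}$ is then $r/q$. Therefore, the above column splitting results in a new parity-check matrix with smaller density. The null space of $\mathbf{H}_{ext}$ gives an extended finite-geometry LDPC code $C_{ext}$. If $\gamma$ is not divisible by $q$, then the columns of $\mathbf{H}_{ext}$ have two different weights, $\gamma_{ext}$ and $\gamma_{ext} + 1$. Therefore, a code bit of the extended code is either checked by $\gamma_{ext}$ check sums or by $\gamma_{ext} + 1$ check sums. In this case, the extended LDPC code $C_{ext}$ is an irregular LDPC code.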
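The rotating column-weight distribution described above can be sketched as follows. The function name `split_columns` and the toy 7 × 7 circulant (row and column weight 3) are assumptions made for illustration only; the sketch simply deals the ones of each column, from top to bottom, into the new columns in rotation.

```python
def circulant(first_row):
    """Square circulant whose row j is first_row cyclically shifted right j times."""
    n = len(first_row)
    return [[first_row[(l - j) % n] for l in range(n)] for j in range(n)]

def split_columns(H, q):
    """Split each column of H into q columns, dealing its ones out in rotation."""
    J, n = len(H), len(H[0])
    H_ext = [[0] * (n * q) for _ in range(J)]
    for i in range(n):
        k = 0                                   # how many ones of column i seen so far
        for j in range(J):
            if H[j][i]:
                H_ext[j][i * q + (k % q)] = 1   # k-th one goes to new column k mod q
                k += 1
    return H_ext

H = circulant([1, 1, 0, 1, 0, 0, 0])            # toy matrix with rho = gamma = 3
H_ext = split_columns(H, 2)                     # 3 = 2 * 1 + 1, so weights 2 and 1

row_weights = [sum(row) for row in H_ext]
col_weights = [sum(col) for col in zip(*H_ext)]
max_overlap = max(
    sum(a & b for a, b in zip(c1, c2))
    for x, c1 in enumerate(zip(*H_ext))
    for y, c2 in enumerate(zip(*H_ext))
    if x < y
)
```

For this toy matrix the three structural properties can be checked directly: row weights are unchanged, column weights take two consecutive values, and no two columns share more than one “1”.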

Example 3: For $m = 2$ and $s = 6$, the type-I 2-D EG-LDPC code is a (4095, 3367) code with minimum distance 65, the fifth code given in Table I. The parity-check matrix of this code has row weight $\rho = 64$ and column weight $\gamma = 64$. Its error performance is shown in Fig. 5. At the BER of , the required SNR is 1.5 dB away from the Shannon limit. Suppose we split each column of the parity-check matrix of this code into 16 columns with rotating column weight distribution. This column splitting results in a (65520, 61425) extended type-I EG-LDPC code whose parity-check matrix has row weight $\rho = 64$ and column weight $\gamma_{ext} = 4$. The rate of this new code is 0.9375. This code decoded with the SPA decoding achieves an error performance which is only 0.42 dB away from the Shannon limit at the BER of as shown in Fig. 10. We see that it has a sharp waterfall error performance. In decoding, the maximum number of decoding iterations is set to 50, but the decoding converges very fast. The performance gap between 10 and 50 iterations is less than 0.1 dB.

TABLE III

EXTENDED CODES CONSTRUCTED FROM THE TYPE-I 2-D (4095, 3367) EG-LDPC CODE

Given a base finite-geometry LDPC code , it can be extended into codes of many different lengths. All these extended codes have different rates and behave differently. Consider the type-I 2-D EG-LDPC code discussed in Example 3. Suppose we split each column of its parity-check matrix into various numbers of columns from two to 23. Table III shows the performances of all the extended codes in terms of SNRs required to achieve the BER and the gaps between the required SNRs and their corresponding Shannon limits. We see that splitting each column of the parity-check matrix of the base code into 16 or 17 columns gives the best performance in terms of the Shannon limit gap.

Example 4: For $m = 2$ and $s = 7$, the type-I 2-D EG-LDPC code is a (16383, 14197) code with minimum distance 129, the sixth code in Table I. The column and row weights of its parity-check matrix are both 128. Suppose we split each column of the parity-check matrix of this code into 32 columns. We obtain a (524256, 507873) extended type-I EG-LDPC code with rate 0.9688. The bit-error performances of this extended code and its base code are shown in Fig. 11. At the BER of , the performance of the extended code is 0.3 dB away from the Shannon limit.

Example 5: Let $m = s = 3$. The type-I 3-D EG-LDPC code constructed based on the lines and points of EG$(3, 2^3)$ is a (511, 139) code with minimum distance at least 79 and rate 0.272. It is a low-rate code. Its parity-check matrix is a $4599 \times 511$ matrix with row weight $\rho = 8$ and column weight $\gamma = 72$. Suppose this code is extended by splitting each column of its parity-check matrix into 24 columns. Then the extended code is a (12264, 7665) LDPC code with rate 0.625. The bit-error performances of this extended code and its base code are shown in Fig. 12. The error performance of the extended code is only 1.1 dB away from the Shannon limit at the BER of .

Fig. 11. Error performances of the type-I 2-D (16383, 14197) EG-LDPC code and the extended (524256, 507873) EG-LDPC code with the SPA decoding.

Given a finite-geometry LDPC code specified by a parity-check matrix $\mathbf{H}$, each column of $\mathbf{H}$ can be split in a different manner and into a different number of columns. Consequently, many extended finite-geometry LDPC codes can be obtained by splitting columns of the parity-check matrix $\mathbf{H}$. If the columns are split differently, the resultant extended code is an irregular LDPC code.

Column splitting of the parity-check matrix of a finite-geometry LDPC code may result in an extended code which is neither cyclic nor quasi-cyclic. However, if we arrange the rows of the parity-check matrix into circulant submatrices and then split each column into a fixed number of new columns with column weight distributed in a rotating and circular manner, the resultant extended code can be put in quasi-cyclic form. To see this, we consider a type-I EG-LDPC code of length $n$. Let $\mathbf{H}$ be the parity-check matrix of this code with $J$ rows and $n$ columns. The rows of $\mathbf{H}$ can be grouped into $K$ $n \times n$ circulant submatrices, $\mathbf{H}_1, \mathbf{H}_2, \ldots, \mathbf{H}_K$, where $K = J/n$. Each circulant submatrix $\mathbf{H}_i$ is obtained by cyclically shifting the incidence vector of a line $n$ times. Therefore, $\mathbf{H}$ can be put in the following form:

$$\mathbf{H} = \begin{bmatrix} \mathbf{H}_1 \\ \mathbf{H}_2 \\ \vdots \\ \mathbf{H}_K \end{bmatrix}. \tag{36}$$

Now we split each column of $\mathbf{H}$ into $q$ columns in a similar manner as that described earlier in this section. However, the $1$-components in a column of $\mathbf{H}$ must be labeled in a specific circular order. For $0 \le j < n$, let $\mathbf{g}_j^{(i)}$ be the $j$th column of the $i$th circulant matrix $\mathbf{H}_i$. Then the $j$th column $\mathbf{g}_j$ of $\mathbf{H}$ is obtained by cascading $\mathbf{g}_j^{(1)}, \mathbf{g}_j^{(2)}, \ldots, \mathbf{g}_j^{(K)}$ with one on top of the other. We label the $1$-components of the $j$th column $\mathbf{g}_j$ of $\mathbf{H}$ as follows. The first $1$-component of $\mathbf{g}_j^{(1)}$ on or below the main diagonal line of circulant $\mathbf{H}_1$ and inside $\mathbf{H}_1$ is labeled as the first $1$-component of the $j$th column $\mathbf{g}_j$ of $\mathbf{H}$. The first $1$-component of $\mathbf{g}_j^{(2)}$ on or below the main diagonal line of circulant $\mathbf{H}_2$ and inside $\mathbf{H}_2$ is labeled as the second $1$-component of $\mathbf{g}_j$. Continue this labeling process until we label the first $1$-component of $\mathbf{g}_j^{(K)}$ on or below the main diagonal line of circulant $\mathbf{H}_K$ and inside $\mathbf{H}_K$ as the $K$th $1$-component of column $\mathbf{g}_j$. Then we go back to circulant $\mathbf{H}_1$ and start the second round of the labeling process. The second $1$-component of $\mathbf{g}_j^{(1)}$ below the main diagonal line of $\mathbf{H}_1$ and inside $\mathbf{H}_1$ is labeled as the $(K+1)$th $1$-component of $\mathbf{g}_j$. The second $1$-component of $\mathbf{g}_j^{(2)}$ below the main diagonal line of circulant $\mathbf{H}_2$ is labeled as the $(K+2)$th $1$-component of $\mathbf{g}_j$. Continue the second round of the labeling process until we reach the $K$th circulant $\mathbf{H}_K$ again. Then we loop back to circulant $\mathbf{H}_1$ and continue the labeling process. During the labeling process, whenever we reach down to the bottom of a circulant matrix $\mathbf{H}_i$, we wrap around to the top of the same column $\mathbf{g}_j^{(i)}$ of $\mathbf{H}_i$. The above labeling process continues until all the $1$-components of $\mathbf{g}_j$ are labeled. Once the labeling of the $1$-components of $\mathbf{g}_j$ is completed, we distribute the $1$-components of $\mathbf{g}_j$ into the $q$ new columns in the same rotating manner as described earlier in this section. So the weight of each column of $\mathbf{H}$ is distributed into new columns in a doubly circular and rotating manner. Clearly, the labeling and weight distribution can be carried out at the same time. Let $\mathbf{H}_{ext}$ be the new matrix resulting from the above column splitting. Then $\mathbf{H}_{ext}$ consists of $K$ $n \times nq$ submatrices $\mathbf{H}_{ext,1}, \mathbf{H}_{ext,2}, \ldots, \mathbf{H}_{ext,K}$. For $1 \le i \le K$, the rows of $\mathbf{H}_{ext,i}$ are cyclic shifts of the first row, $q$ bits at a time. As a result, the null space of $\mathbf{H}_{ext}$ gives an extended finite-geometry LDPC code $C_{ext}$ in quasi-cyclic form. Type-II EG-LDPC codes can be extended and put in quasi-cyclic form in a similar manner.

Fig. 12. Error performance of the type-I 3-D (511, 139) EG-LDPC code and the extended (12264, 7665) EG-LDPC code with the SPA decoding.

Fig. 13. Graph decomposition by column/row splitting.

For PG-LDPC codes, $J$ may not be divisible by $n$. In this case, not all the submatrices of the parity-check matrix $\mathbf{H}$ of a type-I PG-LDPC code can be arranged as $n \times n$ square circulant matrices. Some of them are nonsquare circulant matrices, as shown in Example 2. The rows of such a matrix are still cyclic shifts of the first row and the number of rows divides $n$. In regular column splitting, the labeling and distribution of the $1$-components of a column in a nonsquare circulant submatrix still follow the 45° diagonal and wrap-back-to-the-top order. When we reach the last row, we move back to the first row and start to move down from the next column. After column splitting, each extended submatrix is still a circulant matrix and the extended code is in quasi-cyclic form. The columns of the parity-check matrix of a type-II PG-LDPC code can be split in a similar manner.

The last three examples show that splitting each column of the parity-check matrix of a finite-geometry LDPC code into multiple columns properly results in an extended LDPC code which performs very close to the Shannon limit with the SPA decoding. A reason for this is that column splitting reduces the degree of each code-bit vertex in the Tanner graph of the base code and hence reduces the number of cycles in the graph.

Fig. 14. Cycle decomposition.

Fig. 15. Decomposition of a cycle of length 6 by column splitting.

Splitting a column of into columns results in splitting a code-bit vertex of the Tanner graph of the base code into code-bit vertices in the Tanner graph of the extended code . Each code-bit vertex in is connected to a smaller number of check-sum vertices than in . Fig. 13(a) shows that splitting a column in into two columns results in splitting a code-bit vertex in the Tanner graph into two code-bit vertices in the Tanner graph . The original code-bit vertex has a degree of but each code bit after splitting has a degree of . This code-bit splitting breaks some cycles that exist in the Tanner graph of the base code . Figs. 14(a) and 15 show the breaking of cycles of lengths and . Therefore, column splitting of a base finite-geometry LDPC code breaks many cycles of its Tanner graph and results in an extended LDPC code whose Tanner graph has many fewer cycles. This reduction in cycles in the Tanner graph improves the performance of the code with SPA decoding. In fact, breaking cycles with column splitting of the parity-check matrix can be applied to any linear block code. This may result in good LDPC codes.
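As a minimal illustration of this cycle-breaking effect, the following Python sketch splits each column of a small toy parity-check matrix into two columns and counts the cycles of length 4 in its Tanner graph before and after. The function names `split_columns` and `count_4_cycles`, the toy matrix, and the simple rotation rule for distributing the 1-components are assumptions made for illustration only. The rotation rule is not the diagonal labeling used for circulant submatrices, and the toy matrix is not a finite-geometry code.

```python
import numpy as np

def split_columns(H, q):
    """Split every column of H into q columns (hypothetical helper).

    The 1-components of each column are dealt out to the q new columns
    in rotation, so the q new columns add up to the original column.
    """
    m, n = H.shape
    H_ext = np.zeros((m, n * q), dtype=np.uint8)
    for j in range(n):
        ones = np.flatnonzero(H[:, j])      # row positions of the 1s in column j
        for k, i in enumerate(ones):
            H_ext[i, j * q + k % q] = 1     # k-th 1 goes to new column k mod q
    return H_ext

def count_4_cycles(H):
    """Count cycles of length 4 in the Tanner graph of H.

    Two columns sharing c rows with 1s contribute c*(c-1)/2 such cycles.
    """
    A = H.astype(np.int64)
    overlap = A.T @ A                        # overlap[a, b] = rows shared by columns a and b
    c = overlap[np.triu_indices(A.shape[1], k=1)]
    return int(np.sum(c * (c - 1) // 2))

# Toy matrix chosen to contain short cycles (an assumption, not a code from the paper).
H = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1],
              [1, 0, 1]], dtype=np.uint8)

H_ext = split_columns(H, 2)
print("column weights before:", H.sum(axis=0).tolist())
print("column weights after: ", H_ext.sum(axis=0).tolist())
print("4-cycles before:", count_4_cycles(H))
print("4-cycles after: ", count_4_cycles(H_ext))
```

For this toy matrix the number of cycles of length 4 drops from three to zero, while the code length doubles and the number of check sums is unchanged.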

LDPC codes can also be obtained by splitting each row of the parity-check matrix of a base finite-geometry LDPC code into multiple rows. The resultant code has the same length as the base code but has a lower code rate. Furthermore, proper row splitting also preserves the cyclic or quasi-cyclic structure of the code. Clearly, LDPC codes can be obtained by splitting both columns and rows of the parity-check matrix of a base finite-geometry code.

Splitting a row in the matrix is equivalent to splitting a check-sum vertex in the Tanner graph of the code and hence reduces the degree of the vertex, as shown in Fig. 13(b). Therefore, row splitting of the parity-check matrix of a base code can also break many cycles in the Tanner graph of the base code. An example of cycle breaking by check-sum vertex splitting is shown in Fig. 14(b). Clearly, a combination of column and row splitting will break many cycles in the Tanner graph of the base code. This may result in a very good LDPC code.
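The effect of row splitting on the code length, the check-sum degrees, and the rate can be sketched in the same way. In the Python example below, the helper names `split_rows` and `gf2_rank`, the toy matrix, and the rotation rule for distributing the 1-components are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def split_rows(H, q):
    """Split every row of H into q rows (hypothetical helper).

    The 1-components of each row are dealt out to the q new rows in
    rotation, so the q new rows add up to the original row.
    """
    m, n = H.shape
    H_ext = np.zeros((m * q, n), dtype=np.uint8)
    for i in range(m):
        ones = np.flatnonzero(H[i, :])       # column positions of the 1s in row i
        for k, j in enumerate(ones):
            H_ext[i * q + k % q, j] = 1      # k-th 1 goes to new row k mod q
    return H_ext

def gf2_rank(H):
    """Rank of a binary matrix over GF(2) by Gaussian elimination."""
    A = H.astype(np.uint8) % 2               # astype returns a copy
    rows, cols = A.shape
    rank = 0
    for c in range(cols):
        pivots = np.flatnonzero(A[rank:, c])
        if pivots.size == 0:
            continue
        p = rank + pivots[0]
        A[[rank, p]] = A[[p, rank]]          # move the pivot row into place
        below = np.flatnonzero(A[rank + 1:, c]) + rank + 1
        A[below] ^= A[rank]                  # clear the column below the pivot
        rank += 1
        if rank == rows:
            break
    return rank

# Toy matrix (an assumption, not a code from the paper).
H = np.array([[1, 1, 1, 1, 0, 0],
              [0, 0, 1, 1, 1, 1]], dtype=np.uint8)
H_ext = split_rows(H, 2)

n = H.shape[1]
print("row weights before:", H.sum(axis=1).tolist())
print("row weights after: ", H_ext.sum(axis=1).tolist())
print("rate before:", (n - gf2_rank(H)) / n)
print("rate after: ", (n - gf2_rank(H_ext)) / n)
```

In this toy case the length stays at 6, each check-sum degree drops from 4 to 2, and the rank of the parity-check matrix grows from 2 to 4, so the rate falls.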

Example 6: Consider the type-I 2-D EG-LDPC code given in Table I. Its performance is shown in Fig. 1. The column and row weights of the parity-check matrix are both . If each column of is split into five columns and each row of is split into two rows, we obtain a parity-check matrix whose columns have two weights, and , and whose rows have weight . The null space of gives a (1275, 765) LDPC code whose error performance is shown in Fig. 16.

Example 7: Again we consider the type-I 2-D EG-LDPC code given in Table I. If we split each column of the parity-check matrix of this code into 16 columns and each row of into three rows, we obtain a new parity-check matrix with column weight and row weights and . The null space of gives a (65520, 53235) extended LDPC code. This extended code and its base code have about the same rate. Its error performance is shown in Fig. 17, and it is 0.7 dB away from the Shannon limit at the BER of . However, the performance of its base code is 1.5 dB away from the Shannon limit. This example shows that by a proper combination of column and row splitting of the parity-check matrix of a base finite-geometry LDPC code, we may obtain a new LDPC code which has about the same rate but better error performance.
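The statement that the extended code and its base code have about the same rate can be checked directly from the code parameters given in the caption of Fig. 17. The short Python sketch below does this arithmetic. It assumes, as labeled in the comments, that the base parity-check matrix is square (4095 × 4095); this is not stated in this section. The variable names are illustrative.

```python
# Code parameters (n, k) as given in the caption of Fig. 17.
base_n, base_k = 4095, 3367
ext_n, ext_k = 65520, 53235

# Splitting factors stated in Example 7.
col_split, row_split = 16, 3

# Assumption (not stated in this section): the base parity-check
# matrix is square, with 4095 rows and 4095 columns.
base_rows = base_n

ext_cols = base_n * col_split      # columns after column splitting
ext_rows = base_rows * row_split   # rows after row splitting

base_rate = base_k / base_n
ext_rate = ext_k / ext_n

print("extended length:", ext_cols)
print("extended rows:  ", ext_rows)
print(f"base rate:     {base_rate:.4f}")
print(f"extended rate: {ext_rate:.4f}")
```

The two rates are approximately 0.822 and 0.8125. Under the square-matrix assumption, the extended matrix has 12285 rows, which equals 65520 − 53235, so the quoted dimension is consistent with all rows of the split matrix being linearly independent.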

VII. SHORTENED FINITE-GEOMETRY LDPC CODES

Both types of finite-geometry LDPC codes can be shortened to obtain good LDPC codes. This is achieved by deleting properly selected columns from their parity-check matrices. For a type-I code, the columns to be deleted correspond to a properly chosen set of points in the finite geometry based on which the code is constructed. For a type-II code, the columns to be deleted correspond to a properly chosen set of lines in the finite geometry. In this section, several shortening techniques are presented.

First, we consider shortening type-I finite-geometry LDPC codes. We use a type-I EG-LDPC code to explain the shortening techniques. The same techniques can be used to shorten a type-I PG-LDPC code. Consider the type-I EG-LDPC code constructed based on the -dimensional Euclidean geometry EG . Let EG be an -dimensional subspace (also called an -flat) of EG [28], [36]–[38]. If the points in EG are removed from EG , we obtain a system , denoted EG EG , that contains points. Every line (or -flat) contained in EG is deleted from EG . Every line that is completely outside of EG remains in and still contains points. Every line not completely contained in contains only points, since by deleting an EG from EG we also delete a point in EG from each such line.
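The three cases for lines described above can be seen on a very small geometry. The Python sketch below builds the line-point incidence matrix of a two-dimensional Euclidean geometry over the prime field GF(3), deletes the columns for the points of one line (a 1-flat), and lists the resulting row weights. The choice of GF(3), the inclusion of all lines and all points (rather than the restricted set used for a type-I code), and the helper name `shorten` are assumptions made only to keep the illustration short.

```python
import numpy as np
from itertools import product

p = 3                                           # toy prime field GF(3) (assumption)
points = list(product(range(p), repeat=2))      # the 9 points (x, y)
index = {pt: i for i, pt in enumerate(points)}

# Every line is {start + t * d : t in GF(p)} for one of p + 1 directions d.
directions = [(1, a) for a in range(p)] + [(0, 1)]
lines = set()
for d in directions:
    for s in points:
        lines.add(frozenset(((s[0] + t * d[0]) % p, (s[1] + t * d[1]) % p)
                            for t in range(p)))
lines = sorted(lines, key=sorted)               # fixed, reproducible row order

# Incidence matrix: rows are lines, columns are points.
H = np.zeros((len(lines), len(points)), dtype=np.uint8)
for r, line in enumerate(lines):
    for pt in line:
        H[r, index[pt]] = 1

def shorten(H, deleted_cols):
    """Delete the given columns, then drop rows left with no 1s (hypothetical helper)."""
    keep = np.setdiff1d(np.arange(H.shape[1]), deleted_cols)
    H_short = H[:, keep]
    return H_short[H_short.sum(axis=1) > 0]

# The deleted flat: the line x = 0.
flat = [(0, y) for y in range(p)]
H_short = shorten(H, [index[pt] for pt in flat])

print("shape before:", H.shape, "after:", H_short.shape)
print("row weights after:   ", sorted(H_short.sum(axis=1).tolist()))
print("column weights after:", H_short.sum(axis=0).tolist())
```

In this toy geometry the deleted line disappears, the two lines parallel to it keep all three points, and each of the nine remaining lines loses exactly one point, while every surviving point still lies on four lines.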

The columns of that correspond to the points

(21)

Fig. 16. Bit- and block-error probabilities of the extended (1275, 765) LDPC code with SPA decoding.

Fig. 17. Bit-error probabilities of the extended (65520, 53235) EG-LDPC code and the type-I 2-D (4095, 3367) EG-LDPC code with SPA decoding.