Iterative Decoding Algorithms for a Class of Non-Binary Two-Step Majority-Logic Decodable Cyclic Codes


Hsiu-Chi Chang and Hsie-Chia Chang

Abstract—This paper presents two iterative decoding algorithms for a class of non-binary two-step majority-logic (NB-TS-MLG) decodable cyclic codes. A partial parallel decoding scheme is also introduced to provide a balanced trade-off between decoding speed and storage requirements. Unlike non-binary one-step MLG decodable cyclic codes, the Tanner graphs of which are 4-cycle-free, NB-TS-MLG decodable cyclic codes contain a large number of short cycles of length 4, which tend to degrade decoding performance. The proposed algorithms utilize the orthogonal structure of the parity-check matrices of the codes to resolve the degrading effects of the short cycles of length 4. Simulation results demonstrate that the NB-TS-MLG decodable cyclic codes decoded with the proposed algorithms offer coding gains of as much as 2.5 dB over Reed-Solomon codes of the same lengths and rates decoded with either hard-decision or algebraic soft-decision decoding.

Index Terms—Extended min-sum algorithm, majority-logic decoding, non-binary LDPC codes, cyclic codes.

I. INTRODUCTION

Finite geometry codes received considerable attention in the late 1960s and 1970s [1]–[3]. These codes form an important class of cyclic codes, which can be systematically encoded with linear shift registers and decoded with majority-logic decoding (MLGD) [4]. Based on finite geometries, there are two types of cyclic codes: one-step and multi-step MLG decodable. One-step MLG decodable cyclic codes were re-discovered in 2001 [5] as finite geometry low-density parity-check (FG-LDPC) codes with 4-cycle-free Tanner graphs [6]. Long FG-LDPC codes provide error correction performance approaching Shannon's theoretical limit [7] when decoded using belief propagation algorithms, such as the sum-product algorithm [8] and the min-sum algorithm [9]. In contrast, the numerous short cycles of length 4 involved in multi-step MLG decodable cyclic codes limit the effectiveness of the standard belief propagation algorithm [10]. Consequently, only a small amount of coding gain is achieved, at the cost of a considerable increase in decoding complexity. Efforts to overcome this key disadvantage have led to the development of efficient iterative decoding algorithms, which utilize the orthogonal structure of the parity-check matrices of the two-step MLG (TS-MLG) decodable cyclic codes [10], [11].

Manuscript received August 16, 2013; revised December 17, 2013, February 11, 2014, March 16, 2014, and April 4, 2014; accepted April 16, 2014. Date of publication April 25, 2014; date of current version June 18, 2014. This work was supported by the NSC, Taiwan, under Contract NSC 101-2628-E-009-013-MY3. The associate editor coordinating the review of this paper and approving it for publication was L. Dolecek.

The authors are with the Department of Electronics Engineering and Institute of Electronics, National Chiao Tung University, Hsinchu 300, Taiwan (e-mail: [email protected]; [email protected]).

Digital Object Identifier 10.1109/TCOMM.2014.2320508

Binary LDPC codes typically demonstrate weakness in error performance for short and moderate code lengths [12]. In these cases, non-binary LDPC (NB-LDPC) codes in higher order Galois fields provide excellent alternatives. NB-LDPC codes constructed based on finite geometries have been discussed in [13], [14]. These codes are non-binary one-step MLG decodable. The associated Tanner graphs of the parity-check matrices of the codes are 4-cycle-free, which enables NB-LDPC codes to perform very well over the additive white Gaussian noise (AWGN) channel using standard belief propagation algorithms such as the FFT-QSPA [12] or the EMS [15] algorithm. However, the development of an efficient belief propagation algorithm for decoding non-binary multi-step MLG decodable cyclic codes has yet to be achieved. In this paper, a subclass of NB-TS-MLG decodable cyclic codes is presented. Our simulation studies show that the standard belief propagation algorithm is not effective for decoding NB-TS-MLG decodable cyclic codes, due to the large number of short cycles of length 4. These short cycles produce decoding correlations after a few decoding iterations, thereby preventing convergence to maximum-likelihood decoding. As a result, coding gains are marginal and the speed of convergence is slow. To overcome this major drawback, we modify standard belief propagation by introducing the geometric structure of the parity-check matrices of the codes [4]. Two efficient decoding algorithms based on the orthogonal structure of the parity-check matrices of the codes are proposed to reduce or eliminate the degrading effects of short cycles of length 4. Furthermore, the orthogonal structure of NB-TS-MLG decodable cyclic codes allows a decomposition of the parity-check matrices, resulting in a partial parallel decoding scheme.

FFT-QSPA presents the best performance among the belief propagation algorithms developed for decoding NB-LDPC codes; however, complex operations, such as multiplication and division, tend to increase decoding complexity. The EMS algorithm overcomes this issue by utilizing log-domain operations that turn multiplications into log-domain additions and avoid divisions. In this paper, we propose an algorithm called iterative two-step EMS (ITS-EMS) by modifying the standard EMS algorithm. The NB-TS-MLG decodable cyclic codes decoded with the proposed ITS-EMS achieve as much as 2.5 dB coding gain over Reed-Solomon (RS) codes of the same lengths and rates decoded using either the hard-decision Berlekamp-Massey (HD-BM) algorithm [4] or the algebraic soft-decision Koetter-Vardy (ASD-KV) algorithm [16]. Unfortunately, ITS-EMS suffers from high computational complexity because many of its computations involve real numbers. A low-complexity iterative message passing decoding algorithm, called the iterative soft reliability-based MLGD (ISRB-MLGD) algorithm, was developed previously to decode non-binary one-step MLG decodable cyclic codes [17]. We further generalize the ISRB-MLGD algorithm as the iterative reliability two-step MLGD (IRTS-MLGD) algorithm to decode the NB-TS-MLG decodable cyclic codes. The IRTS-MLGD requires far lower computational complexity than the ITS-EMS, because it employs only finite field and integer operations rather than computations in real numbers. Moreover, the decoding process differs between the ISRB-MLGD and the IRTS-MLGD: the ISRB-MLGD uses a fully parallel decoding scheme, whereas the IRTS-MLGD employs a partial parallel decoding scheme. The partial parallel decoding scheme can be generalized for decoding the binary TS-MLG decodable cyclic codes presented in [10], resulting in a more balanced trade-off between decoding speed and memory usage. In addition, we compare the error performances of ITS-EMS decoding of the NB-TS-MLG decodable cyclic codes and standard EMS decoding of the one-step MLG decodable NB-LDPC codes constructed based on Euclidean geometries via matrix dispersion [14], [18]. Simulation results show that, within a small number of decoding iterations, the NB-TS-MLG decodable cyclic codes outperform one-step MLG decodable NB-LDPC codes.

0090-6778 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.

The remainder of this paper is organized as follows. Section II briefly introduces a subclass of NB-TS-MLG decodable cyclic codes and the hard-decision non-binary two-step MLGD (NB-TS-MLGD) algorithm. The proposed ITS-EMS is introduced in Section III, together with a parity-check matrix decomposition for partial parallel decoding. We also discuss the computational complexity of ITS-EMS and investigate its memory requirements. Section IV gives the low-complexity IRTS-MLGD algorithm and evaluates its computational complexity. Section V concludes the paper.

II. CLASS OF NB-TS-MLG DECODABLE CYCLIC CODES

In this section, we consider a special class of TS-MLG decodable cyclic codes, referred to as two-fold Euclidean geometry (EG) codes. This subclass of binary MLG decodable cyclic codes was constructed based on Euclidean geometries by Lin [19] in 1973 under the name multifold Euclidean geometry codes. We generalize the binary two-fold EG codes to the non-binary case, yielding NB-two-fold EG (NB-TF-EG) codes, and then investigate this special case of NB-TS-MLG decodable cyclic codes.

A. Code Construction

Consider a $d$-dimensional Euclidean geometry $\mathrm{EG}(d, q)$ over the field $\mathrm{GF}(q)$, where $q$ is a power of a prime. The field $\mathrm{GF}(q^d)$, as an extension field of $\mathrm{GF}(q)$, is a realization of $\mathrm{EG}(d, q)$. Let $\alpha$ be a primitive element of $\mathrm{GF}(q^d)$. Then the powers of $\alpha$, namely $\alpha^{-\infty} = 0, \alpha^0 = 1, \alpha, \ldots, \alpha^{q^d-2}$, represent the $q^d$ points of $\mathrm{EG}(d, q)$, and $\alpha^{-\infty} = 0$ represents the origin of $\mathrm{EG}(d, q)$. Let $\mathrm{EG}^*(d, q)$ be the subgeometry obtained by removing the origin and all the lines passing through the origin in $\mathrm{EG}(d, q)$. Let $n = q^d - 1$. There are $n$ non-origin points and $J_0 = n(q^{d-1} - 1)/(q - 1)$ lines not passing through the origin in $\mathrm{EG}^*(d, q)$ [4]. Let $L = \{\alpha^{j_1}, \alpha^{j_2}, \ldots, \alpha^{j_q}\}$ be a line in $\mathrm{EG}^*(d, q)$ comprising the points $\alpha^{j_1}, \alpha^{j_2}, \ldots, \alpha^{j_q}$, where $0 \le j_1, j_2, \ldots, j_q < q^d - 1$. Let $\mathbf{v}_L = (v_0, v_1, \ldots, v_{q^d-2})$ be a $(q^d - 1)$-tuple over $\mathrm{GF}(q^d)$. The components of $\mathbf{v}_L$ correspond to the $n$ non-origin points of $\mathrm{EG}^*(d, q)$, where the $j_1$th, $j_2$th, $\ldots$, $j_q$th components are $v_{j_1} = \alpha^{j_1}, v_{j_2} = \alpha^{j_2}, \ldots, v_{j_q} = \alpha^{j_q}$ and all other components are the zero element of $\mathrm{GF}(q^d)$. This $(q^d - 1)$-tuple $\mathbf{v}_L$ is called the $q^d$-ary incidence vector of line $L$. The vector $\mathbf{v}_L$ has $q$ nonzero components, one per point of $L$; each represents both the location and the value of its point by an element of $\mathrm{GF}(q^d)$.

Let $\mathbf{L}$ be the $J_0 \times n$ matrix formed by the $J_0$ lines in $\mathrm{EG}^*(d, q)$, and let $\mathbf{v}_{L_0}, \mathbf{v}_{L_1}, \ldots, \mathbf{v}_{L_{J_0-1}}$ be the rows of $\mathbf{L}$. Let $\mathbf{v}_{L_i} = (v_{i,0}, v_{i,1}, \ldots, v_{i,n-1})$ be the $q^d$-ary incidence vector of line $L_i$, where $0 \le i < J_0$. For $0 \le i < J_0$ and $0 \le j < n$, we define $N_i = \{j : 0 \le j < n, v_{i,j} \ne 0\}$ and $M_j = \{i : 0 \le i < J_0, v_{i,j} \ne 0\}$. The indices in $N_i$ denote the locations of the nonzero components in the $i$th row of $\mathbf{L}$. The indices in $M_j$ denote the locations of the nonzero components in the $j$th column of $\mathbf{L}$. The Tanner graph [6] of matrix $\mathbf{L}$ has two disjoint classes of nodes: variable nodes (VN) and check nodes (CN). The $j$th VN corresponds to the $j$th $q^d$-ary received symbol in $\mathbf{L}$, while the $i$th CN corresponds to the $i$th row of $\mathbf{L}$. If $v_{i,j} \ne 0$, the $j$th VN is connected to the $i$th CN by an edge.

If a point is on a line in $\mathrm{EG}^*(d, q)$, we say that the line passes through the point (or is orthogonal on the point). Every point in $\mathrm{EG}^*(d, q)$ is intersected by $J_1 = n/(q - 1) - 1$ lines. The $i$th line $L_i$ in $\mathrm{EG}^*(d, q)$, where $0 \le i < J_0$, has $J_2 = q^{d-1} - 2$ parallel lines, denoted as $L_{t,i}$, where $0 \le t < J_2$. The pair $\{L_i, L_{t,i}\}$ forms a (1,2)-frame, which consists of $2q$ points in $\mathrm{EG}^*(d, q)$. The corresponding $q^d$-ary incidence vector of $\{L_i, L_{t,i}\}$ is denoted as $\mathbf{v}_{L_i} + \mathbf{v}_{L_{t,i}}$, where $\mathbf{v}_{L_i}$ and $\mathbf{v}_{L_{t,i}}$ are two $(q^d - 1)$-tuples over $\mathrm{GF}(q^d)$ without any points in common. Let $\{L_i, L_{0,i}\}, \{L_i, L_{1,i}\}, \ldots, \{L_i, L_{J_2-1,i}\}$ be the $J_2$ (1,2)-frames that intersect on line $L_i$. We say that these $J_2$ (1,2)-frames are orthogonal on $L_i$. There are a total of $m_r = n(q^{d-1} - 1)(q^{d-1} - 2)/(2(q - 1))$ (1,2)-frames $F$ in $\mathrm{EG}^*(d, q)$. These (1,2)-frames form an $m_r \times n$ matrix $\mathbf{H}$ over $\mathrm{GF}(q^d)$, with each row being the $q^d$-ary incidence vector of a (1,2)-frame in $\mathrm{EG}^*(d, q)$. The null space of $\mathbf{H}$ gives a cyclic code of length $n$, referred to as an NB-TF-EG code. The generator polynomial of an NB-TF-EG code can be derived by the following steps [13]. Each row of $\mathbf{H}$ is represented by a polynomial of degree $q^d - 2$ or less over $\mathrm{GF}(q^d)$. Let $h(X)$ be the greatest common divisor of the row polynomials of $\mathbf{H}$, and let $h^*(X)$ be the reciprocal polynomial of $h(X)$. The generator polynomial of the NB-TF-EG code is then $g(X) = (X^n - 1)/h^*(X)$.
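As a quick numerical check of the counting formulas in this subsection, the following minimal Python sketch evaluates $n$, $J_0$, $J_1$, $J_2$, and $m_r$ for given $d$ and $q$. The function name `eg_parameters` and the dictionary keys are illustrative choices, not notation from the paper.

```python
def eg_parameters(d: int, q: int) -> dict:
    """Evaluate the counts quoted for the subgeometry EG*(d, q)."""
    n = q**d - 1                               # non-origin points = code length
    j0 = n * (q**(d - 1) - 1) // (q - 1)       # lines not passing through the origin
    j1 = n // (q - 1) - 1                      # lines through each non-origin point
    j2 = q**(d - 1) - 2                        # lines parallel to a given line
    # number of (1,2)-frames, i.e. rows of the parity-check matrix H
    m_r = n * (q**(d - 1) - 1) * (q**(d - 1) - 2) // (2 * (q - 1))
    return {"n": n, "J0": j0, "J1": j1, "J2": j2, "m_r": m_r}


print(eg_parameters(d=2, q=8))
# {'n': 63, 'J0': 63, 'J1': 8, 'J2': 6, 'm_r': 189}
```

For $d = 2$ and $q = 8$ this gives a length-63 code with 189 frames, each line having 6 parallel lines and each point lying on 8 lines.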

B. NB-TS-MLGD Algorithm

Consider $\mathrm{GF}(q^d)$ as the field on which to construct the NB Euclidean geometry. For simplicity of illustration, we consider $q^d = 2^r$. Although this paper considers only the case of powers of 2, the codes and the decoding algorithms can be generalized to any prime power. Assume that transmission uses binary phase-shift keying (BPSK) or $m$-QAM over the AWGN channel with two-sided power spectral density $N_0/2$. We use $\mathbb{R}^r$ for BPSK and $\mathbb{R}^2$ for $m$-QAM. Let $\mathbf{u} = (u_0, u_1, u_2, \ldots, u_{n-1})$ be a transmitted $n$-tuple codeword of an NB-TF-EG code over $\mathrm{GF}(q^d)$. For $0 \le j < n$, the $j$th symbol $u_j$ of $\mathbf{u}$ can be converted into a sequence of $r = \log_2(q^d)$ bits, denoted as $u_j = (u_{j,0}, u_{j,1}, \ldots, u_{j,r-1})$ over $\mathrm{GF}(2)$. Let $\mathbf{z} = (z_0, z_1, z_2, \ldots, z_{n-1}) \in Z^n$ be the hard-decision received sequence, where $Z$ is the received alphabet for a single NB-TF-EG symbol. For $0 \le j < n$, each component $z_j$ of $\mathbf{z}$ is an element of $\mathrm{GF}(q^d)$. The hard-decision received sequence is an NB-TF-EG codeword if and only if $\mathbf{H}\mathbf{z}^T = \mathbf{0}$ (or, equivalently, the polynomial representation $z(X)$ of $\mathbf{z}$ is divisible by the generator polynomial $g(X)$).

The NB-TS-MLGD algorithm is generalized from the non-binary one-step MLGD algorithm [17]. Assume that the point $\alpha^j$ in $\mathrm{EG}^*(d, q)$ is being updated. The corresponding received symbol for $\alpha^j$ is $z_j$. Let $z_j$ be the $j$th received symbol in $\mathbf{z}$ participating in $L_i$. Let $S(L_i)$ be the line-sum (or check-sum), which is the inner product of the nonzero elements $v_{i,j}$ of $\mathbf{v}_{L_i}$ and the received symbols $z_j$ of $\mathbf{z}$:

$$S(L_i) = \sum_{j \in N_i} v_{i,j} z_j. \tag{1}$$

Consider a (1,2)-frame $F = \{L_i, L_{t,i}\}$ in $\mathrm{EG}^*(d, q)$. The frame-sum of $F$, denoted by $S(F) = S(L_i) + S(L_{t,i})$, is the inner product of $\mathbf{z}$ and the $q^d$-ary incidence vector of the (1,2)-frame $F$ comprising the two lines $L_i$ and $L_{t,i}$. We omit the subscript $t$ of $L_{t,i}$ when calculating $S(L_{t,i})$ by (1), since $L_{t,i}$ is also a line in $\mathrm{EG}^*(d, q)$. Let $L_u^j$ denote the $J_1$ lines passing through $z_j$, where $0 \le u < J_1$. The line-sum of $L_u^j$ is denoted as $S(L_u^j)$. For $0 \le t < J_2$, the $J_2$ lines parallel to $L_u^j$ are denoted as $L_{t,u}^j$, and the line-sum of $L_{t,u}^j$ is denoted as $S(L_{t,u}^j)$.

The first step of decoding is to decode $S(L_u^j)$ with the $J_2$ (1,2)-frames in $\mathrm{EG}^*(d, q)$ orthogonal on $L_u^j$. Let $F^{j,u,t} = \{L_u^j, L_{t,u}^j\}$ be a (1,2)-frame in $\mathrm{EG}^*(d, q)$ orthogonal on $L_u^j$. The frame-sum of $F^{j,u,t}$ is $S(F^{j,u,t}) = S(L_u^j) + S(L_{t,u}^j)$. Note that $S(L_u^j)$, $S(L_{t,u}^j)$, and $S(F^{j,u,t})$ are elements of $\mathrm{GF}(q^d)$. The line-sum $S(L_{t,u}^j)$ in $S(F^{j,u,t})$ is the extrinsic information for decoding $S(L_u^j)$. A received symbol in $\mathbf{z}$ not contained in $L_u^j$ can appear in at most one $L_{t,u}^j$. Thus, we can correctly decode the value of $S(L_u^j)$ from the $J_2$ frame-sums $S(F^{j,u,t})$ orthogonal on $S(L_u^j)$, provided that there are no more than $\lfloor J_2/2 \rfloor$ symbol errors in $\mathbf{z}$.

The second step is to decode $z_j$ with the $J_1$ line-sums $S(L_u^j)$ orthogonal on $z_j$. Any received symbol of $\mathbf{z}$ other than $z_j$ can appear in at most one of these $J_1$ lines. The symbols orthogonal on $z_j$ are the extrinsic information for $z_j$. Since $J_1 > J_2$, the value of $z_j$ can be correctly determined if there are no more than $\lfloor J_2/2 \rfloor$ symbol errors in $\mathbf{z}$. This completes the decoding process of the NB-TS-MLGD for the NB-TS-MLG decodable cyclic codes.

III. ITERATIVE TWO-STEP EXTENDED MIN-SUM ALGORITHM WITH PARTIAL PARALLEL DECODING

Serial and parallel decoding algorithms have been developed for binary TS-MLG decodable cyclic codes [10]. From the standpoint of hardware implementation, a serial decoding algorithm has the advantage of requiring a simple decoding circuit, at the cost of a large number of decoding cycles. In contrast, parallel decoding has the advantage of fast decoding but requires hardware of greater complexity. A partial parallel decoding scheme can be a good trade-off between serial and parallel decoding with regard to decoding speed and hardware complexity.

A. Parity-Check Matrix Decomposition and Partial Parallel Decoding Scheme

In this subsection, we present a partial parallel decoding scheme via a decomposition of the parity-check matrix. Unlike the traditional method of representing the parity-check matrix of an NB-TF-EG code, with points on the column side and frames on the row side, we decompose the parity-check matrix into two parts. One contains $q^d$-ary incidence vectors representing the relationship between points and lines, and the other comprises binary incidence vectors describing the relationship between lines and frames. In the following, we illustrate the construction of these two matrices.

Consider the $d$-dimensional Euclidean geometry $\mathrm{EG}^*(d, q)$ over $\mathrm{GF}(q^d)$. For $d = 2$, let $\beta = \alpha^{J_1+1}$. Then $\{0, 1, \beta, \beta^2, \beta^3, \ldots, \beta^{J_2}\}$ forms a subfield $\mathrm{GF}(q^{d-1})$ of the field $\mathrm{GF}(q^d)$. Consider a parallel bundle $P$ [4] in $\mathrm{EG}^*(d, q)$ comprising the lines $\{L, \beta^1 L, \ldots, \beta^{J_2} L\}$. The corresponding $q^d$-ary incidence vector is $\mathbf{v}_{P_L} = \{\mathbf{v}_L, \mathbf{v}_{\beta^1 L}, \ldots, \mathbf{v}_{\beta^{J_2} L}\}$. By multiplying $P$ by $\alpha$, we obtain $\alpha P = \{\alpha L, \alpha\beta^1 L, \ldots, \alpha\beta^{J_2} L\}$, where $\mathbf{v}_{\alpha P_L} = \{\mathbf{v}_{\alpha L}, \mathbf{v}_{\alpha\beta^1 L}, \ldots, \mathbf{v}_{\alpha\beta^{J_2} L}\}$ is its $q^d$-ary incidence vector. Each line in $\alpha P$ is the right cyclic shift of the corresponding line in $P$. The $J_0$ lines in $\mathrm{EG}^*(d, q)$ can be divided into $J_3 = J_1 + 1$ groups of parallel bundles [4], denoted as $\{P, \alpha P, \ldots, \alpha^{J_3-1} P\}$. Each group of parallel bundles comprises $J_4 = J_2 + 1$ lines. A $J_0 \times n$ matrix $\mathbf{L}_P$ can be formed from the $q^d$-ary incidence vectors of the parallel bundles of lines via $\{\mathbf{v}_{P_L}, \mathbf{v}_{\alpha P_L}, \ldots, \mathbf{v}_{\alpha^{J_3-1} P_L}\}$. Matrix $\mathbf{L}_P$ represents the relationship between points and lines, with the $q^d$-ary incidence vectors of lines as rows. This completes the first part of the decomposition of the parity-check matrix $\mathbf{H}$. Because the parallel bundle $P$ has the cyclic property, we only need to store the $q^d$-ary incidence vectors in $\mathbf{v}_{P_L}$ as the indices for iterative decoding. The $q^d$-ary incidence vectors of the other parallel bundles of lines can simply be derived by cyclically shifting the elements in $\mathbf{v}_{P_L}$ when the corresponding block is decoded.

Next, we construct the matrix with binary incidence vectors. In each parallel bundle, $J_5 = (q^{d-1} - 1)(q^{d-1} - 2)/2$ different frames are formed by the $J_4$ lines. Let $\mathcal{F} = \{F_0, F_1, \ldots, F_{J_3-1}\}$ be the frames constructed from $\mathcal{P} = \{P_0, P_1, \ldots, P_{J_3-1}\}$, where $P_0 = P$, $P_1 = \alpha P_0$, and so on. Consider the $a$th parallel bundle $P_a$ and its corresponding frames $F_a$, where $0 \le a < J_3$. We express the relationship between frames and lines by defining a $J_5 \times J_4$ matrix, referred to as a double identity matrix (DIM). This matrix can be decomposed vertically into $J_5/J_4$ blocks. For $1 \le k \le J_5/J_4$, the $k$th block is equal to $\mathbf{I}_0 + \mathbf{I}_k$, where $\mathbf{I}_0$ is a $J_4 \times J_4$ identity matrix and $\mathbf{I}_k$ is $\mathbf{I}_0$ cyclically shifted to the right $k$ times. The rows of a DIM represent the (1,2)-frames in $F_a$, while the columns represent the $q^d$-ary incidence vectors of the lines in $P_a$. Each row includes two values of 1, representing the two parallel lines in $P_a$ that participate in the corresponding frame in $F_a$. The column and row weights of the DIM are $J_2$ and 2, respectively. Different frames in $F_a$ and the corresponding parallel bundles $P_a$ share the same DIM, such that the partial parallel decoding scheme can be operated using cyclically shifted $q^d$-ary incidence vectors of lines in $\mathbf{L}_P$ as inputs on the column side to form the corresponding (1,2)-frames on the row side.

Next, we demonstrate the partial parallel decoding scheme. Recall that there are $J_4$ lines in a parallel bundle in $\mathrm{EG}^*(d, q)$. The $b$th line participating in the $a$th parallel bundle $P_a$ is denoted as $L_{a,b}$, where $0 \le b < J_4$. Let $L_{a,b'}$ be another line in $P_a$, where $0 \le b' < J_4$ and $b' \ne b$. For each $L_{a,b}$, there are $J_2$ parallel lines $L_{a,b'}$. We can form $J_4 \times J_2$ pairs of (1,2)-frames in $F_a$, denoted by $F_{a,b} = \{L_{a,b}, L_{a,b'}\}$. For each $P_a$, we need to calculate $J_4 \times J_2$ frames. The decoding process in $P_a$ continues until all of the $J_4 \times q$ symbols participating in these $J_4$ lines in $P_a$ have been updated. We redefine the index sets $N_i$ and $M_j$ of Section II-A so as to represent the partial parallel decoding scheme. For $0 \le a < J_3$ and $0 \le b < J_4$, the $b$th line in the $a$th group is identical to the $i$th line in $\mathrm{EG}^*(d, q)$, where $i = a \times J_4 + b$ and $0 \le i < J_0$. Therefore, we use the notation $(a, b) \equiv i$ to represent the one and only one corresponding index for the $i$th line in $\mathrm{EG}^*(d, q)$. For $0 \le i < J_0$ and $0 \le j < n$, we define the index sets $N_{(a,b)}$ and $M_j$ by replacing $i$ with $(a, b)$ as $N_{(a,b)} = \{j : 0 \le j < n, v_{(a,b),j} \ne 0\}$ and $M_j = \{(a, b) \equiv i : 0 \le (a, b) < J_0, v_{(a,b),j} \ne 0\}$, respectively. In the following, we present an example to illustrate the decomposition of the parity-check matrix of the NB-TF-EG code for partial parallel decoding.

Example 1: Let $d = 2$ and $q = 8 = 2^3$, and consider the two-dimensional Euclidean geometry $\mathrm{EG}(2, 2^3)$ over the field $\mathrm{GF}(2^3)$. The subgeometry $\mathrm{EG}^*(2, 2^3)$ comprises 63 non-origin points and 63 lines not passing through the origin of $\mathrm{EG}(2, 2^3)$. These 63 lines in $\mathrm{EG}^*(2, 2^3)$ form 189 (1,2)-frames. The parity-check matrix $\mathbf{H}$ of this code is a $189 \times 63$ matrix with 189 (1,2)-frames on the row side and 63 64-ary symbols on the column side. The null space over $\mathrm{GF}(2^6)$ of this parity-check matrix gives a 64-ary (63,45) NB-TF-EG code over $\mathrm{GF}(2^6)$. We decompose $\mathbf{H}$ as follows. A $63 \times 63$ matrix $\mathbf{L}_P$ with 64-ary incidence vectors of lines is formed to represent the relationship between points and lines in $\mathrm{EG}^*(2, 2^3)$. We divide the 63 lines into 9 groups of parallel bundles, each containing 7 lines. A $21 \times 7$ DIM is formed to represent the relationship between lines and (1,2)-frames in $\mathrm{EG}^*(2, 2^3)$. The decoding process is accomplished by decoding a $21 \times 7$ DIM 9 times, using the corresponding 64-ary incidence vectors of 7 lines in $\mathbf{L}_P$ on the column side.
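The double identity matrix of Example 1 can be built directly from its definition as stacked blocks $\mathbf{I}_0 + \mathbf{I}_k$. The following minimal Python sketch does so with plain lists; the function name `build_dim` is hypothetical, and the sketch assumes $J_4$ is odd (true when $q$ is a power of 2), so that the number of blocks is $J_5/J_4 = (J_4 - 1)/2$.

```python
def build_dim(j4: int):
    """Stack the blocks I_0 + I_k for k = 1..(j4 - 1)//2 as 0/1 rows.

    Assumes j4 is odd, so that J5 / J4 = (j4 - 1) / 2 is an integer.
    """
    rows = []
    for k in range(1, (j4 - 1) // 2 + 1):
        for r in range(j4):
            row = [0] * j4
            row[r] = 1                # contribution of the identity I_0
            row[(r + k) % j4] = 1     # contribution of I_k (I_0 shifted right k times)
            rows.append(row)
    return rows


dim = build_dim(7)                                 # the 21 x 7 DIM of Example 1
row_weights = {sum(row) for row in dim}
col_weights = {sum(col) for col in zip(*dim)}
print(len(dim), row_weights, col_weights)          # 21 {2} {6}
```

Each row marks the two lines of one (1,2)-frame, and the 21 rows cover every unordered pair of the 7 lines in a bundle exactly once, which is why the same matrix can be reused for all 9 bundles.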

B. Proposed Iterative Two-Step Extended Min-Sum Algorithm

NB-TF-EG codes contain large numbers of short cycles of length 4. There are a total of $\binom{J_2}{2} \times \binom{q}{2}$ short cycles of length 4 in the Tanner graphs of these codes [10]. Under a standard belief propagation algorithm, such a large number of short cycles of length 4 would degrade decoding performance. The proposed ITS-EMS employs the orthogonal structure of NB-TF-EG codes to overcome the performance degradation resulting from short cycles.

Before outlining the decoding algorithm, we define some notation for later use. The superscript $w$ represents the iteration index, and $w_{\max}$ is the maximum number of iterations to be performed. Suppose $V_j$ is the $j$th received symbol, where $0 \le j < n$. A soft message of the $j$th code symbol at the $w$th decoding iteration is a vector comprising $q^d$ sub-messages, $\lambda_j^{(w)} = [\lambda_j^{(w)}(0), \lambda_j^{(w)}(1), \ldots, \lambda_j^{(w)}(q^d - 1)]$. The initial value $\lambda_j^{(0)} = [\lambda_j^{(0)}(0), \lambda_j^{(0)}(1), \ldots, \lambda_j^{(0)}(q^d - 1)]$ is the a priori information of the $j$th code symbol from the channel. The log-likelihood reliability (LLR) of the $x$th sub-message $\lambda_j^{(w)}(x)$ is defined as

$$\lambda_j^{(w)}(x) = \ln \frac{\mathrm{Pb}(V_j = z_j)}{\mathrm{Pb}(V_j = x)} \tag{2}$$

with $\mathrm{Pb}(V_j = x)$ as the probability of $V_j$ being equal to $x \in \mathrm{GF}(q^d)$. We define $z_j = \arg\min_{x \in \mathrm{GF}(q^d)} \lambda_j^{(w)}(x)$ as the most likely symbol for $V_j$, which also represents the hard decision of the $j$th received symbol; by (2), this symbol has LLR 0 and all others have non-negative LLRs. Let $\oplus$ be the elementary CN operation (ECN) [15] with two input messages and one output message. The notation $\oplus$ implies that the equation sums up the input messages using the ECN operation and stores the smallest soft value. Let $\otimes$ be multiplication in $\mathrm{GF}(q^d)$. Let $\delta_{i,j}^{(w)}$ and $\eta_{i,j}^{(w)}$ represent the VN-to-CN (V2C) and CN-to-VN (C2V) soft messages between the $i$th CN and the $j$th VN, respectively. For $x \in \mathrm{GF}(q^d)$, the $x$th LLRs of $\delta_{i,j}^{(w)}$ and $\eta_{i,j}^{(w)}$ are denoted as $\delta_{i,j}^{(w)}(x)$ and $\eta_{i,j}^{(w)}(x)$, respectively. Let $\gamma_{i,j}$ be the symbol with the lowest reliability value. With $v'_{i,j} = v_{i,j} \otimes V_j$, we let $\delta_{i,j}^{(w)}(x) = \ln(\mathrm{Pb}(v'_{i,j} = \gamma_{i,j})/\mathrm{Pb}(v'_{i,j} = x))$ and $\eta_{i,j}^{(w)}(x) = \ln(\mathrm{Pb}(v'_{i,j} = \gamma_{i,j})/\mathrm{Pb}(v'_{i,j} = x))$, where $\delta_{i,j}^{(w)}(\gamma_{i,j}) = 0$ and $\eta_{i,j}^{(w)}(\gamma_{i,j}) = 0$, respectively. To initialize the decoding process, we set $z_j = \arg\min_{x \in \mathrm{GF}(q^d)} \lambda_j^{(0)}(x)$ and $\delta_{i,j}^{(0)}(v_{i,j} \otimes x) = \lambda_j^{(0)}(x)$.
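To make the LLR convention of (2) and the ECN operation concrete, here is a minimal Python sketch. It is not the bubble-check ECN used later in the paper: it is an untruncated min-sum combination, and it assumes symbols of $\mathrm{GF}(2^r)$ are labeled by integers whose bits are polynomial-basis coordinates, so that field addition is bitwise XOR. It also assumes the coefficient multiplications have already been applied to the messages. All function names are illustrative.

```python
import math


def llr_vector(probs):
    """LLRs per (2): ln(P(most likely symbol) / P(x)); probs must be positive."""
    p_max = max(probs)
    return [math.log(p_max / p) for p in probs]


def hard_decision(llr):
    """The most likely symbol is the index with the smallest LLR."""
    return min(range(len(llr)), key=llr.__getitem__)


def ecn(msg_a, msg_b):
    """Untruncated min-sum ECN over GF(2^r).

    out[x] = min over x1 XOR x2 = x of msg_a[x1] + msg_b[x2].
    The message length must be a power of 2.
    """
    size = len(msg_a)
    return [min(msg_a[x1] + msg_b[x1 ^ x] for x1 in range(size))
            for x in range(size)]


llr = llr_vector([0.1, 0.6, 0.2, 0.1])
print(hard_decision(llr))                    # 1
print(ecn([0, 1, 2, 3], [0, 5, 5, 5]))       # [0, 1, 2, 3]
```

In the second call, the second input strongly favors symbol 0, so the output simply reproduces the first message; a practical EMS decoder would additionally truncate each message to its most reliable entries.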

We illustrate the ITS-EMS using partial parallel decoding for the case of $V_j$ participating in the (1,2)-frames $F_{a,b}$ formed by $P_a$, where $0 \le a < J_3$ and $0 \le b < J_4$. In the following, we use the relation $(a, b) \equiv i$ to rewrite the notation $\delta_{i,j}^{(w)}(x)$, $\eta_{i,j}^{(w)}(x)$, $\gamma_{i,j}$, and $v'_{i,j} = v_{i,j} \otimes V_j$ as $\delta_{(a,b),j}^{(w)}(x)$, $\eta_{(a,b),j}^{(w)}(x)$, $\gamma_{(a,b),j}$, and $v'_{(a,b),j} = v_{(a,b),j} \otimes V_j$. The soft messages of the lines $L_{a,b}$ in $P_a$ are calculated first, with scaling factor $c$, by

$$\mathrm{LLR}_{(a,b)}^{(w)}(x) = c \times \bigoplus_{j \in N_{(a,b)}} \delta_{(a,b),j}^{(w)}\left(v'_{(a,b),j}\right), \tag{3}$$

where $0 \le c < 1$. The extrinsic information of $L_{a,b}$ contributed by the other lines $L_{a,b'}$ participating in $P_a$, with scaling factor $\kappa$, is given by

$$E_{(a,b)}^{(w)}(x) = \kappa \times \sum_{L_{a,b'} \in P_a,\, b' \ne b} \mathrm{LLR}_{(a,b')}^{(w)}(x), \tag{4}$$

where $0 \le \kappa < 1$. The extrinsic information of $V_j$ contributed by the other received symbols participating in $L_{a,b}$, excluding $V_j$ itself, is obtained by

$$E_{(a,b),j}^{(w)}(x) = \bigoplus_{j' \in N_{(a,b)} \setminus j} \delta_{(a,b),j'}^{(w)}\left(v'_{(a,b),j'}\right), \tag{5}$$

where $v'_{(a,b),j'} = v_{(a,b),j'} \otimes V_{j'}$. Let $\eta_{(a,b),j}^{(w)}(x)$ be a tentative ECN step with $E_{(a,b),j}^{(w)}(x)$ and $E_{(a,b)}^{(w)}(x)$, which is formulated as

$$\eta_{(a,b),j}^{(w)}(x) = E_{(a,b),j}^{(w)}(x) \oplus E_{(a,b)}^{(w)}(x). \tag{6}$$

After finishing the partial decoding process from (3) to (6) for all symbols participating in the $J_3$ parallel bundles, the post-processing for $V_j$ is executed as

$$\lambda_j^{(w+1)}(x) = \lambda_j^{(w)}(x) + \sum_{(a,b) \in M_j} \eta_{(a,b),j}^{(w)}\left(v_{(a,b),j} \otimes x\right), \tag{7}$$

where $0 \le j < n$. By letting $w \leftarrow w + 1$, we obtain

$$z_j^{(w)} = \arg\min_{x \in \mathrm{GF}(q^d)} \lambda_j^{(w+1)}(x). \tag{8}$$

A new received vector $\mathbf{z}^{(w)}$ is formed from (8) for syndrome calculation. For $0 \le j < n$, we execute the VN processing to derive the new V2C messages $\delta_{(a,b),j}^{(w+1)}(x)$ for the next iteration. First, we compute the primitive V2C messages by

$$\hat{\delta}_{(a,b),j}^{(w+1)}\left(v_{(a,b),j} \otimes x\right) = \lambda_j^{(w+1)}(x) - \eta_{(a,b),j}^{(w)}\left(v_{(a,b),j} \otimes x\right). \tag{9}$$

Thereafter, the $(w + 1)$th V2C messages are derived by normalizing the primitive V2C messages with respect to the most likely symbol $\gamma_{(a,b),j}$ as

$$\delta_{(a,b),j}^{(w+1)}(x) = \hat{\delta}_{(a,b),j}^{(w+1)}(x) - \hat{\delta}_{(a,b),j}^{(w+1)}\left(\gamma_{(a,b),j}\right), \tag{10}$$

where

$$\gamma_{(a,b),j} = \arg\min_{x \in \mathrm{GF}(q^d)} \hat{\delta}_{(a,b),j}^{(w+1)}(x). \tag{11}$$

Based on the above updating process and notation, the proposed ITS-EMS is formulated in Algorithm 1.
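The post-processing and VN update of (7) to (11) for a single symbol can be sketched as follows. This is a simplified illustration rather than the authors' implementation: it assumes the C2V vectors have already been permuted back to the symbol domain, so the coefficient multiplication $v_{(a,b),j} \otimes x$ does not appear explicitly, and the function name `vn_update` is hypothetical.

```python
def vn_update(lam, etas):
    """Post-processing and VN step for one symbol, following (7)-(11).

    lam  : LLR vector lambda_j^(w)(x) of the symbol.
    etas : list of C2V vectors, one per check in M_j, where etas[m][x]
           stands for eta(v * x) (assumed already de-permuted).
    Returns the updated LLRs, the hard decision, and the new V2C vectors.
    """
    size = len(lam)
    lam_new = [lam[x] + sum(eta[x] for eta in etas) for x in range(size)]   # (7)
    z = min(range(size), key=lam_new.__getitem__)                          # (8)
    deltas = []
    for eta in etas:
        d_hat = [lam_new[x] - eta[x] for x in range(size)]                 # (9)
        gamma = min(range(size), key=d_hat.__getitem__)                    # (11)
        deltas.append([value - d_hat[gamma] for value in d_hat])           # (10)
    return lam_new, z, deltas


lam_new, z, deltas = vn_update(
    [0.0, 2.0, 1.0, 3.0],
    [[1.0, 0.0, 2.0, 2.0], [3.0, 0.0, 1.0, 1.0]],
)
print(lam_new)   # [4.0, 2.0, 4.0, 6.0]
print(z)         # 1
print(deltas)    # [[1.0, 0.0, 0.0, 2.0], [0.0, 1.0, 2.0, 4.0]]
```

Subtracting each check's own contribution in (9) keeps the outgoing message extrinsic, and the normalization in (10) restores the convention that the most likely symbol carries LLR 0.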

Algorithm 1 ITS-EMS

1) Initialization: For $0 \le i < J_0$ and $0 \le j < n$, set $z_j = \arg\min_{x \in \mathrm{GF}(q^d)} \lambda_j^{(0)}(x)$ and $\delta_{i,j}^{(0)}(v_{i,j} \otimes x) = \lambda_j^{(0)}(x)$ with $v_{i,j} \ne 0$, set $w = 0$, and set the maximum number of iterations to $w_{\max}$.

2) Let $S^{(w)}(X)$ be the syndrome derived by dividing the received polynomial $z^{(w)}(X)$ by the generator polynomial $g(X)$ of the code. If $S^{(w)}(X) = 0$, then stop the decoding process and output $\mathbf{z}^{(w)}$ as the decoded codeword.

3) If $w = w_{\max}$, then stop the decoding process. If $S^{(w)}(X) \ne 0$, declare a decoding failure.

4) CN processing: For $0 \le a < J_3$, $0 \le b, b' < J_4$, and $i = a \times J_4 + b$,

a) Compute the soft messages for the lines in $P_a$ by (3).

b) Calculate (4) and (5).

c) Update the tentative C2V messages by (6).

5) Post processing: For $0 \le j < n$, execute the post processing for $V_j$ by (7). Let $w \leftarrow w + 1$, and form a new received vector $\mathbf{z}^{(w)}$ by (8).

6) VN processing:

a) Compute the V2C messages by (9).

b) Normalize the V2C messages by (10) and (11).

7) Go to Step 2.
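The control flow of Algorithm 1 (stopping tests first, then CN processing, post processing, and VN processing) can be summarized by the skeleton below. It is only a scheduling sketch: the four callables are hypothetical hooks standing in for the syndrome test and for Steps 4 to 6, and their signatures are assumptions made for illustration.

```python
def its_ems_schedule(z, syndrome_is_zero, cn_processing, post_processing,
                     vn_processing, w_max):
    """Run the iteration schedule of Algorithm 1 with user-supplied hooks.

    Returns (hard-decision vector, success flag, iterations used).
    """
    w = 0
    while True:
        if syndrome_is_zero(z):        # Step 2: valid codeword, stop
            return z, True, w
        if w == w_max:                 # Step 3: iteration budget exhausted
            return z, False, w
        eta = cn_processing()          # Step 4: (3)-(6) over all J3 bundles
        z = post_processing(eta)       # Step 5: (7) and (8)
        w += 1
        vn_processing(eta)             # Step 6: (9)-(11)


# Toy run: the hooks are stubs, and the "decoder" succeeds on the 2nd pass.
calls = {"post": 0}

def fake_post(eta):
    calls["post"] += 1
    return "codeword" if calls["post"] == 2 else "noisy"

print(its_ems_schedule("noisy", lambda z: z == "codeword",
                       lambda: None, fake_post, lambda eta: None, w_max=5))
# ('codeword', True, 2)
```

Checking the syndrome before each CN pass means a received word that is already a codeword costs no message-passing work at all.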

TABLE I

MEMORY REQUIREMENTS FOR FULLY AND PARTIAL PARALLEL DECODING

Next, we analyze the complexity of the ITS-EMS with $q$ as a power of 2. To ensure the best performance for the code, we take all $q^d$ elements of the field $\mathrm{GF}(q^d)$ as the input for each symbol. We also have $d = 2$ for the NB-TF-EG codes constructed using the two-dimensional Euclidean geometry. At Step 2, an $(n - k)$-stage syndrome calculation employs at most $(n - k)$ finite field additions and $(n - k)$ finite field multiplications, where $k$ is the number of information symbols. We use the bubble check [20] to calculate the ECN. Each stage in the ECN requires $2 \times q^2$ additions and $q^3$ comparisons. At Step 4, $q - 1$ ECN steps are required for (3). $2J_4$ multiplications are required for the scaling factors $c$ and $\kappa$ in (3) and (4), and $J_4 J_2$ additions are required for (4). Moreover, to update each symbol in a line in (5) and (6), we need $2q - 4$ and $q$ ECN operations, respectively. Therefore, it takes $J_4(4q - 5)$ ECN operations to calculate all the line-sums of the lines and update each symbol in each line in $P_a$. Since there are $J_3$ blocks, a total of $J_3 J_4 (4q - 5)$ ECN operations, $2 J_3 J_4$ multiplications, and $J_3 J_4 J_2$ additions are needed to perform one iteration. At Step 5, $J_1 n$ additions are needed for (7), and $nq^2$ comparisons are needed for (8). At Step 6, $nq^3$ additions and comparisons are required for (9), and $nq^3$ additions are required for (10). With some manipulation, we summarize the computational complexity in terms of the code length $n$ and $q$. To carry out one iteration of the ITS-EMS algorithm, $(10n^2 + 12n)(q - 1)$ real-number additions, $9(n^2 + n)(q - 1) - nq$ real-number comparisons, and $2n$ real-number multiplications are required. Both the addition and the comparison operations are on the order of $O(n^2 q)$, and the multiplication operations are on the order of $O(n)$.
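The Step 4 operation counts can be evaluated numerically for a given code. The sketch below plugs the parameters of the 64-ary (63,45) code into the expressions $J_3 J_4 (4q - 5)$, $2 J_3 J_4$, and $J_3 J_4 J_2$ stated in the text; the function name `step4_counts` and the dictionary keys are illustrative, and the counts are per iteration and expressed in ECN operations rather than elementary additions.

```python
def step4_counts(q: int, d: int = 2) -> dict:
    """Per-iteration CN-processing counts of Step 4, from the stated formulas."""
    n = q**d - 1
    j2 = q**(d - 1) - 2          # parallel lines per line
    j4 = j2 + 1                  # lines per parallel bundle
    j3 = n // (q - 1)            # number of bundles, J3 = J1 + 1
    return {
        "ecn_ops": j3 * j4 * (4 * q - 5),     # (3), (5), (6)
        "multiplications": 2 * j3 * j4,       # scaling by c and kappa
        "additions": j3 * j4 * j2,            # sum in (4)
    }


print(step4_counts(q=8))
# {'ecn_ops': 1701, 'multiplications': 126, 'additions': 378}
```

For $d = 2$ we have $J_3 J_4 = n$, so the 126 multiplications agree with the $2n$ real-number multiplications quoted for one iteration with $n = 63$.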

Table I presents the memory requirements for fully and partial parallel decoding. Each value in the table has $N$ bits of finite precision and is represented by $U = q^d(N + \log_2 q^d)$. Partial parallel decoding reduces the storage for line-sums, for extrinsic information contributed by other lines, and for extrinsic information contributed by other symbols by a factor of $J_0/J_4$. Thus, partial parallel decoding provides an alternative to fully parallel decoding if memory is limited.

In the following, two examples are presented to demonstrate the frame error rate (FER) performances of the NB-TF-EG codes decoded using the proposed ITS-EMS and various decoding algorithms for short to moderate code lengths. Note that the decoding complexity of the NB-LDPC codes is proportional to the size of the finite field [12], [15]. For constructing NB-TF-EG codes with longer block lengths, the construction needs to be modified as in [18] to decrease the field size of the codes and thus reduce decoding complexity. We also include the error performances of the RS codes with the same lengths and rates decoded using the HD-BM and the ASD-KV algorithms.


TABLE II

NUMBER OF COMPUTATIONS REQUIRED FOR ITERATIVE DECODING OF NB-TF-EG CODES AND ASD-KV DECODING OF RS CODES

The computational complexity of the ASD-KV algorithm is on the order of $O(\lambda^4 N^2)$ (the interpolation step), where $N$ is the length of the code and $\lambda$ is the parameter of multiplicity assignment in the interpolation step. We use $\lambda = \infty$, $\lambda = 9.99$, and $\lambda = 4.99$ for comparison [21]. The scaling factors $c$ and $\kappa$ for decoding the NB-TF-EG codes in BPSK with the ITS-EMS are determined by the points with the lowest signal-to-noise ratio via extensive simulation, as illustrated in Fig. 5. We use the same scaling factors for the higher order modulations. We also examine the performance of one-step MLG decodable NB-LDPC codes with similar lengths and rates constructed based on Euclidean geometries via matrix dispersion [14], [18]. Furthermore, Table II lists the number of computations required by the two proposed iterative decoding algorithms for decoding the NB-TF-EG codes and by the ASD-KV algorithm for decoding RS codes. The numbers for the NB-TF-EG codes are derived by summing up all of the operations of the ITS-EMS algorithm. The major computational complexity of the ASD-KV algorithm comes from the interpolation step [21]; therefore, we only consider this type of calculation for comparison.

Example 2: Let d = 2 and q = 8 = 23_{. Consider the} two-dimensional Euclidean geometry EG(2, 23) over the field GF(23). From Example 1, we know that the null space of the parity-check matrix of this code is the 64-ary (63,45) NB-TF-EG code with J1= 8, J2= 6. By using NB-TS-MLGD, 3 symbol errors can be corrected. The Tanner graph of this code has 79380 cycles of length 4. From Fig. 5, we set c = 0.2 and κ = 0.21. Fig. 1 shows the FER performances of the 64-ary (63,45) NB-TF-EG code over the AWGN channel with BPSK transmission decoded using the proposed ITS-EMS with 3 and 5 iterations, standard EMS with 50 iterations, and NB-TS-MLGD. We also include the error performances of the (63,45) RS code over GF(26_{) decoded using the HD-BM and the} ASD-KV algorithms. In addition, the FER performance of the standard EMS algorithm in decoding one-step MLG decodable NB-LDPC code with same rate and length is also included. This code is a 64-ary (63,45) NB-LDPC code with two different column weights 2 and 3, and row weight 8. At the FER of 10−6, the NB-TF-EG code decoded using the proposed ITS-EMS with 5 iterations achieves a coding gain of 2.2 dB over the RS code decoded using the HD-BM algorithm, as well as a coding gain of 1 dB, 1.3 dB and 1.6 dB over the RS code decoded using the ASD-KV algorithm with λ =∞, λ = 9.99,

Fig. 1. Frame error rates of various decoding algorithms for the 64-ary (63,45) NB-TF-EG code, the 64-ary (63,45) NB-LDPC code, and the (63,45) RS code over GF($2^6$) decoded with the HD-BM and the ASD-KV algorithms using BPSK over the AWGN channel.

and λ = 4.99, respectively. Due to the degrading effect of short cycles of length 4, the NB-TF-EG code decoded using the standard EMS algorithm with 50 iterations gains only 0.5 dB over the RS code decoded using the HD-BM algorithm, and loses 1.6 dB compared to the proposed ITS-EMS with 5 iterations. Moreover, the NB-TF-EG code decoded with 5 iterations of the ITS-EMS outperforms the NB-TS-MLGD by 4.3 dB. At the FER of $10^{-4}$, we find that the low column weights of the one-step MLG decodable 64-ary (63,45) NB-LDPC code decoded with 10 iterations of standard EMS result in an error floor. The 64-ary (63,45) NB-TF-EG code decoded with 5 iterations of the ITS-EMS achieves a coding gain of 1 dB over the 64-ary (63,45) NB-LDPC code decoded with 10 iterations of standard EMS.

Fig. 2 shows the FER versus $E_b/N_0$ performance of the 64-ary (63,45) NB-TF-EG code and the (63,45) RS code over the AWGN channel using 64-QAM. At the FER of $10^{-5}$, the NB-TF-EG code decoded with 5 iterations of the ITS-EMS achieves a coding gain of 2.5 dB over the RS code decoded using the HD-BM, as well as a coding gain of 1.2 dB, 1.6 dB, and 1.9 dB over the RS code decoded using the ASD-KV algorithm with λ = ∞, λ = 9.99, and λ = 4.99, respectively. In addition, the ITS-EMS with 5 iterations outperforms the standard EMS with 50 iterations by 2 dB for decoding the NB-TF-EG code.

In Table II, the number of computations for decoding the 64-ary (63,45) NB-TF-EG code with 5 iterations of the ITS-EMS is on the order of $1.45 \times 10^6$. On the other hand, the number of computations for the (63,45) RS code decoded using the ASD-KV algorithm in the interpolation step with λ = 9.99 is on the order of $2.6 \times 10^7$. From Fig. 1 and Table II, the 64-ary (63,45) NB-TF-EG code decoded with 5 iterations of the ITS-EMS achieves a 1.3 dB coding gain over the (63,45) RS code decoded using the ASD-KV algorithm with λ = 9.99, while requiring an order of magnitude fewer computations.


Fig. 2. Frame error rates of various decoding algorithms for the 64-ary (63,45) NB-TF-EG code and the (63,45) RS code over GF($2^6$) decoded with the HD-BM and the ASD-KV algorithms using 64-QAM over the AWGN channel.

Example 3: Let d = 2 and q = 16 = $2^4$. Consider the two-dimensional Euclidean geometry EG(2, $2^4$) over the field GF($2^4$). The subgeometry EG$^*$(2, $2^4$) consists of 255 non-origin points and 255 lines not passing through the origin of EG(2, $2^4$), which form 1785 (1,2)-frames. The parity-check matrix H of this code is a 1785 × 255 matrix with 1785 (1,2)-frames on the row side and 255 256-ary symbols on the column side. The null space over GF($2^8$) of this parity-check matrix gives a 256-ary (255,191) NB-TF-EG code over GF($2^8$).
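The counts quoted in this example can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the standard facts that EG(2, q) has q + 1 parallel bundles of q lines each, one of which passes through the origin in every bundle, and it assumes that a (1,2)-frame is an unordered pair of parallel lines from the same bundle. The function name `eg_frame_counts` is hypothetical.

```python
from math import comb


def eg_frame_counts(q: int) -> dict:
    """Counts for EG*(2, q), the 2-D Euclidean geometry with the origin removed.

    Assumption: a (1,2)-frame is an unordered pair of parallel lines taken
    from the same parallel bundle.
    """
    points = q * q - 1                  # non-origin points
    bundles = q + 1                     # parallel bundles (one per direction)
    lines_per_bundle = q - 1            # q lines per bundle, minus the one through the origin
    lines = bundles * lines_per_bundle  # lines not passing through the origin
    frames_per_bundle = comb(lines_per_bundle, 2)
    return {
        "points": points,
        "bundles": bundles,
        "lines_per_bundle": lines_per_bundle,
        "lines": lines,
        "frames_per_bundle": frames_per_bundle,
        "frames": bundles * frames_per_bundle,
    }


counts = eg_frame_counts(16)
print(counts["points"], counts["lines"], counts["frames"])
```

Under these assumptions, q = 16 gives 255 points, 255 lines, and 1785 frames, matching the dimensions of H stated above.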

The decomposition of H is as follows. With the 255 lines and the 255 points in EG$^*$(2, $2^4$), a 255 × 255 matrix $L_P$ with 256-ary incidence vectors of lines is formed. The 255 lines in EG$^*$(2, $2^4$) can be divided into 17 groups of parallel bundles, with each of them consisting of 15 lines and forming 105 (1,2)-frames. A 105 × 15 DIM with binary incidence vectors of (1,2)-frames is formed as a unit for partial parallel decoding. The decoding process is accomplished by decoding a 105 × 15 DIM 17 times with the corresponding 256-ary incidence vectors of 15 lines on the column side. The values of $J_1$ and $J_2$ for the 256-ary (255,191) NB-TF-EG code are 16 and 14, respectively. This code can correct up to 7 symbol errors with NB-TS-MLGD. The Tanner graph of this code contains 19,492,200 short cycles of length 4. From Fig. 5, we set c = 0.2 and κ = 0.05. Fig. 3 shows the FER performances of the 256-ary (255,191) NB-TF-EG code over the AWGN channel with BPSK signaling decoded using the proposed ITS-EMS with 3 and 5 iterations, standard EMS with 30 iterations, and NB-TS-MLGD. We also include the error performances of the (255,191) RS code over GF($2^8$) decoded using the HD-BM algorithm and the ASD-KV algorithm with λ = ∞, λ = 9.99, and λ = 4.99. In addition, the FER performance of the standard EMS algorithm in decoding a one-step MLG decodable NB-LDPC code with a similar rate and the same length is also included. The code is a 256-ary (255,193) NB-LDPC code with two different column weights, 5 and 6, and row weight 16. At the FER of $10^{-5}$, we see that the NB-TF-EG code decoded using 5 iterations of the ITS-EMS achieves a

Fig. 3. Frame error rates of various decoding algorithms for the 256-ary (255,191) NB-TF-EG code, the 256-ary (255,193) NB-LDPC code, and the (255,191) RS code over GF($2^8$) decoded with the HD-BM and the ASD-KV algorithms using BPSK over the AWGN channel.

coding gain of 1.3 dB over the RS code decoded using the HD-BM algorithm, and a coding gain of 0.4 dB, 0.6 dB and 0.75 dB over the RS code decoded using the ASD-KV algorithm with λ = ∞, λ = 9.99, and λ = 4.99, respectively. Note that the performance gap between 3 and 5 iterations of the ITS-EMS is less than 0.1 dB. Moreover, the 256-ary (255,191) NB-TF-EG code decoded with 5 iterations of the ITS-EMS outperforms the NB-TS-MLGD by 3.7 dB. We notice that the NB-TF-EG code decoded with the standard EMS algorithm performs poorly due to the degrading effect of short cycles of length 4. The ITS-EMS with 5 iterations achieves a 0.9 dB coding gain over the standard EMS with 30 iterations. At the FER of $10^{-2}$, note that the low column weights of the one-step MLG decodable 256-ary (255,193) NB-LDPC code decoded with 5 iterations of the standard EMS result in an error floor.

In Fig. 4, we demonstrate the FER versus $E_s/N_0$ performance of the 256-ary (255,191) NB-TF-EG code and the (255,191) RS code over the AWGN channel using 256-QAM. At FER = $10^{-5}$, the NB-TF-EG code decoded using 5 iterations of the ITS-EMS achieves a coding gain of 1.5 dB over the RS code decoded using the HD-BM, as well as a coding gain of 0.5 dB, 0.7 dB and 0.8 dB over the RS code decoded using the ASD-KV algorithm with λ = ∞, λ = 9.99 and λ = 4.99, respectively. Also, the 256-ary (255,191) NB-TF-EG code decoded using the ITS-EMS with 5 iterations outperforms the standard EMS with 30 iterations by 0.8 dB.

As shown in Table II, the number of computations for decoding the 256-ary (255,191) NB-TF-EG code with 3 iterations of the ITS-EMS is on the order of $1.7 \times 10^8$. In contrast, the number of computations for decoding the (255,191) RS code with the ASD-KV algorithm in the interpolation step with λ = 9.99 is on the order of $4.26 \times 10^8$. From Fig. 3 and Table II, the 256-ary (255,191) NB-TF-EG code decoded with 3 iterations of the ITS-EMS outperforms the (255,191) RS code decoded with the ASD-KV algorithm with λ = 9.99 by 0.4 dB, while providing a 60% reduction in computational complexity.
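The reduction figures quoted from Table II follow from simple ratios of the operation counts. The snippet below is only an arithmetic check on the numbers stated in the text; the helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(proposed_ops: float, baseline_ops: float) -> float:
    """Fraction of the baseline operation count saved by the proposed decoder."""
    return 1.0 - proposed_ops / baseline_ops


# 256-ary (255,191) code: ITS-EMS with 3 iterations vs. ASD-KV interpolation, lambda = 9.99
saving_256 = relative_reduction(1.7e8, 4.26e8)

# 64-ary (63,45) code: ITS-EMS with 5 iterations vs. ASD-KV interpolation, lambda = 9.99
ratio_64 = 2.6e7 / 1.45e6

print(f"256-ary saving: {saving_256:.1%}")
print(f"64-ary ratio:   {ratio_64:.1f}x")
```

The first value is close to 60%, and the second ratio lies between 10 and 100, which is consistent with the "order of magnitude" statement made for the 64-ary code.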


Fig. 4. Frame error rates of various decoding algorithms for the 256-ary (255,191) NB-TF-EG code and the (255,191) RS code over GF($2^8$) decoded with the HD-BM and the ASD-KV algorithms using 256-QAM over the AWGN channel.

Fig. 5. Scaling factors c and κ for the 64-ary NB-TF-EG (63,45) and the 256-ary NB-TF-EG (255,191) in BPSK decoding with 5 iterations of ITS-EMS with target FER of $10^{-5}$ and $10^{-6}$, respectively.

IV. ITERATIVE RELIABILITY TWO-STEP MLGD ALGORITHM

The computational complexity of the ITS-EMS algorithm is high because a large number of operations are performed using real numbers. In this section, we present a simplified decoding algorithm, called the IRTS-MLGD algorithm. The IRTS-MLGD utilizes only finite field and integer operations, which greatly reduces computational complexity compared to the ITS-EMS with its real-number operations. In addition, whereas the one-pass NB-TS-MLGD employs only hard-decision values of the received symbols, the IRTS-MLGD utilizes the soft information of the received symbols in conjunction with an iterative decoding process. As a result, a considerable coding gain can be achieved.

For practical applications, we devise the algorithm over GF($2^r$). Consider an NB-TF-EG code $C$ over GF($2^r$) of length $n$. Let $\mathbf{y}_s = (y_0, y_1, \ldots, y_{n-1})$ be the soft received sequence at the receiver sampler, where "s" stands for soft information. For $0 \le j < n$, each element of $\mathbf{y}_s$ in GF($2^r$) is represented as an r-tuple $y_j = (y_{j,0}, y_{j,1}, \ldots, y_{j,r-1})$ over GF(2). The hard-decision received sequence $\mathbf{z} = (z_0, z_1, \ldots, z_{n-1})$ over GF($2^r$) is determined by $\mathbf{y}_s$, where $z_j$ is an estimate of the jth transmitted symbol, for $0 \le j < n$. Let $\rho_{j,k}$ be the quantized value of sample $y_{j,k}$, where $0 \le j < n$ and $0 \le k < r$. The quantized value is an integer representation of the $2^p - 1$ quantized intervals symmetric to the origin. All intervals have the same length, and each sample is represented by $p$ bits. Therefore, $\rho_{j,k}$ is in the range $[-(2^{p-1}-1), +(2^{p-1}-1)]$.

For $0 \le j < n$, the jth group $(\rho_{j,0}, \rho_{j,1}, \ldots, \rho_{j,r-1})$ is decoded into an element $a$ in GF($2^r$) $= \{a_0, a_1, \ldots, a_{2^r-1}\}$. For $0 \le l < 2^r$, the binary representation of the lth element $a_l \in$ GF($2^r$) is denoted by an r-tuple $a_l = (a_{l,0}, a_{l,1}, \ldots, a_{l,r-1})$ over GF(2). For each element $a_l \in$ GF($2^r$), we calculate the reliability measure of $a_l$ as

$$\varphi_{j,l} = \sum_{k=0}^{r-1} (1 - 2a_{l,k})\,\rho_{j,k} \tag{12}$$

which is in the range $[-r(2^{p-1}-1), +r(2^{p-1}-1)]$. Let $a$ be the element in GF($2^r$) with the highest reliability; $a$ is selected as $z_j$. For $0 \le j < n$, let $\boldsymbol{\varphi}_j = (\varphi_{j,0}, \varphi_{j,1}, \ldots, \varphi_{j,2^r-1})$, which is called the decision vector of the jth received symbol $z_j$. For $0 \le i < J_0$, the reliability measure of the jth received symbol is given by

$$\phi_{i,j} = \min_{j' \in N_i \setminus j}\, \max_{l}\, \varphi_{j',l} \tag{13}$$

which can be regarded as a reliability measure of the extrinsic information contributed to $z_j$ by the other received symbols in $S(\mathcal{L}_i)$. Consider the jth received symbol $z_j$ participating in $F_{j,u,t}$, which consists of two lines $\mathcal{L}^{j}_{u}$ and $\mathcal{L}^{j}_{t,u}$, where $0 \le u < J_1$ and $0 \le t < J_2$. The frame-sum $S(F_{j,u,t})$ is actually a check-sum in H. There are $J_1 J_2$ check-sums that contain $z_j$.
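The quantization step and the decision vector of (12) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the rounding rule in `quantize`, the name `step` for the interval length, and the convention that the integer `l` indexes the element whose bit k is `(l >> k) & 1` are all assumptions made for the example.

```python
import math
from typing import List, Sequence


def quantize(sample: float, step: float, p: int) -> int:
    """Uniform p-bit quantizer, symmetric about zero.

    Assumption: round to the nearest level, then clamp to
    [-(2^(p-1) - 1), +(2^(p-1) - 1)].
    """
    limit = 2 ** (p - 1) - 1
    level = math.floor(sample / step + 0.5)
    return max(-limit, min(limit, level))


def decision_vector(rho: Sequence[int]) -> List[int]:
    """Reliability of every field element for one symbol, following (12).

    rho holds the r quantized bit samples of the symbol. Entry l of the
    result is sum_k (1 - 2*a_{l,k}) * rho_k, where a_{l,k} is taken to be
    bit k of the integer l (assumed labeling of the field elements).
    """
    r = len(rho)
    phi = []
    for l in range(2 ** r):
        bits = [(l >> k) & 1 for k in range(r)]
        phi.append(sum((1 - 2 * b) * x for b, x in zip(bits, rho)))
    return phi


def hard_decision(phi: Sequence[int]) -> int:
    """Index of the element with the highest reliability (selected as z_j)."""
    return max(range(len(phi)), key=lambda l: phi[l])


rho = [3, -2, 5]                 # r = 3 quantized samples of one symbol
phi = decision_vector(rho)
print(phi, hard_decision(phi))
```

For this input the most reliable element has only bit 1 set, because the second sample is the only negative one. The maximum entry of the decision vector always equals the sum of the absolute values of the samples.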

Assume that $S(F_{j,u,t})$ participates in the ith line of EG$^*$(d, q), where $0 \le i < J_0$, $0 \le j < n$, $0 \le u < J_1$, and $0 \le t < J_2$. $S(F_{j,u,t})$ can be normalized for decoding the jth received symbol as

$$\tilde{S}(F_{j,u,t}) = v_{i,j}^{-1} S(F_{j,u,t}) = v_{i,j}^{-1} S(\mathcal{L}^{j}_{u}) + v_{i,j}^{-1} S(\mathcal{L}^{j}_{t,u}) = z_j + v_{i,j}^{-1} \sum_{l \in N_i \setminus j} v_{i,l} z_l + v_{i,j}^{-1} S(\mathcal{L}^{j}_{t,u}). \tag{14}$$

Next, we consider the partial parallel decoding scheme mentioned in III-A for the proposed IRTS-MLGD algorithm. The bth line of the ath group in which the jth received symbol participates is denoted as $\mathcal{L}_{a,b}$. The extrinsic information of the jth received symbol comprises two parts. The first part is the extrinsic symbol information, which comes from the frame-sum $S(F_{j,u,t})$ without the jth symbol. The other part is the magnitude of the reliability measure, which comes from the reliability measure of the lines parallel to $\mathcal{L}_{a,b}$ and the reliability measure of the symbols participating in the same line as the jth symbol. Recall that $w_{\max}$ is the maximum number of iterations to be performed. At the wth iteration, the jth received symbol is denoted as $z_j^{(w)}$. The extrinsic information of $\mathcal{L}_{a,b}$ can be derived from the line-sum and the reliability measure of $\mathcal{L}_{a,b}$ as

$$S^{(w)}(\mathcal{L}_{a,b}) = \sum_{j \in N_{(a,b)}} v_{(a,b),j}\, z_j^{(w)}, \tag{15}$$

$$\Gamma_{\mathcal{L}_{a,b}} = \min_{j \in N_{(a,b)}}\, \max_{l}\, \varphi_{j,l}. \tag{16}$$

The reliability measure of the jth received symbol participating in $\mathcal{L}_{a,b}$ is calculated as

$$\phi_{(a,b),j} = \min_{j' \in N_{(a,b)} \setminus j}\, \max_{l}\, \varphi_{j',l}. \tag{17}$$
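The reliability measures in (16) and (17) reduce to a minimum over per-symbol peaks, with and without the symbol being decoded. The sketch below is illustrative: the function names are hypothetical, and the decision vectors of the symbols on one line are assumed to be available as plain Python lists. It computes the leave-one-out minima for all symbols on the line in one pass by tracking the two smallest peaks.

```python
from typing import List, Sequence


def line_reliability(decision_vectors: Sequence[Sequence[int]]) -> int:
    """Gamma of (16): the smallest peak reliability among the symbols on a line."""
    return min(max(phi) for phi in decision_vectors)


def extrinsic_reliabilities(decision_vectors: Sequence[Sequence[int]]) -> List[int]:
    """The measure of (17) for every symbol on the line.

    Entry j is the minimum peak over all symbols on the line except j.
    Requires at least two symbols on the line.
    """
    peaks = [max(phi) for phi in decision_vectors]
    order = sorted(range(len(peaks)), key=lambda j: peaks[j])
    lowest, second = order[0], order[1]
    # Excluding the weakest symbol exposes the second weakest; excluding any
    # other symbol leaves the weakest one in place.
    return [peaks[second] if j == lowest else peaks[lowest]
            for j in range(len(peaks))]


vectors = [[9, -9, 1, -1], [4, -4, 2, -2], [7, -7, 3, -3]]
print(line_reliability(vectors), extrinsic_reliabilities(vectors))
```

Here the second symbol is the least reliable one on the line, so it limits the extrinsic reliability seen by the other two symbols, while its own extrinsic reliability is set by the next weakest symbol.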

The extrinsic symbol information of the bth line is contributed by the other lines in the ath group as

$$\xi^{(w)}_{(a,b),j} = \sum_{\mathcal{L}_{a,b'} \in P_a,\; b' \neq b} v^{-1}_{(a,b),j}\, S^{(w)}(\mathcal{L}_{a,b'}). \tag{18}$$

The frame-sum (14) can be rewritten as

$$\tilde{S}^{(w)}(F_{j,u,t}) = z_j^{(w)} + v^{-1}_{(a,b),j} \left( \sum_{l \in N_{(a,b)} \setminus j} v_{(a,b),l}\, z_l^{(w)} + \xi^{(w)}_{(a,b),j} \right). \tag{19}$$

Let

$$\sigma^{(w)}_{(a,b),j} = v^{-1}_{(a,b),j} \left( \sum_{l \in N_{(a,b)} \setminus j} v_{(a,b),l}\, z_l^{(w)} + \xi^{(w)}_{(a,b),j} \right). \tag{20}$$

The normalized check-sum $\tilde{S}^{(w)}(F_{j,u,t})$ can be rewritten as

$$\tilde{S}^{(w)}(F_{j,u,t}) = z_j^{(w)} + \sigma^{(w)}_{(a,b),j}. \tag{21}$$

The extrinsic symbol information of the jth received symbol can be derived by

$$\sigma^{(w)}_{(a,b),j} = \tilde{S}^{(w)}(F_{j,u,t}) - z_j^{(w)}. \tag{22}$$

From (22), we can see that: 1) if $\tilde{S}^{(w)}(F_{j,u,t}) = 0$ and $\sigma^{(w)}_{(a,b),j}$ is error free, then $z_j^{(w)}$ must be error free; 2) if $\tilde{S}^{(w)}(F_{j,u,t}) \neq 0$ and $\sigma^{(w)}_{(a,b),j}$ is error free, then $z_j^{(w)}$ contains an error $e_j$. The value of $z_j^{(w)}$ must be changed to $z_j^{(w)} - e_j = -\sigma^{(w)}_{(a,b),j}$ to make the normalized check-sum $\tilde{S}^{(w)}(F_{j,u,t})$ equal to zero when $e_j = \tilde{S}^{(w)}(F_{j,u,t})$.

Next, we consider updating the magnitude of the reliability measure of the jth received symbol which participates in $\mathcal{L}_{a,b}$. The first step is to calculate the reliability measure contributed by the $J_2$ lines $\mathcal{L}_{a,b'}$ parallel to $\mathcal{L}_{a,b}$, denoted as

$$\beta_{\mathcal{L}_{a,b}} = \min_{\mathcal{L}_{a,b'} \in P_a} \Gamma_{\mathcal{L}_{a,b'}}, \tag{23}$$

where $\Gamma_{\mathcal{L}_{a,b'}}$ is derived as in (16) with $\mathcal{L}_{a,b}$ replaced with $\mathcal{L}_{a,b'}$.
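The correction rule that follows from (22) is easy to illustrate over GF($2^r$), where addition and subtraction coincide. The sketch below assumes that field elements are stored as r-bit integers whose bits are the binary r-tuples used in the text, so that field addition is a bitwise XOR; the function names are hypothetical, and the multiplications by $v^{-1}_{(a,b),j}$ are assumed to have been applied already.

```python
def extrinsic_symbol(normalized_check_sum: int, z_j: int) -> int:
    """sigma = S~ - z_j as in (22); subtraction in GF(2^r) is bitwise XOR."""
    return normalized_check_sum ^ z_j


def corrected_symbol(normalized_check_sum: int, z_j: int) -> int:
    """Value -sigma that zeroes the normalized check-sum.

    In characteristic 2, -sigma equals sigma, so the corrected symbol is
    simply the extrinsic symbol information.
    """
    return extrinsic_symbol(normalized_check_sum, z_j)


# Toy case in GF(2^3): the other terms of the check-sum are error free.
sent = 0b101
error = 0b011
received = sent ^ error            # z_j with an error e_j
others = sent                      # error-free sigma (true check-sum is zero)
check_sum = received ^ others      # normalized check-sum, equal to e_j here

print(bin(check_sum), bin(corrected_symbol(check_sum, received)))
```

In this toy case the check-sum equals the error value and the corrected symbol equals the transmitted one, which is case 2) of the argument above. When the check-sum is zero, the function returns the received symbol unchanged, which is case 1).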

In the second step, we update the reliability measure of the received symbol $z$ participating in each $\mathcal{L}_{a,b}$. Let $\boldsymbol{\psi}_j^{(w)}$ be the decision vector of the magnitude of the reliability measure for the jth received symbol, contributed at the wth iteration by the other symbols participating in the line $\mathcal{L}_{a,b}$ except $z_j^{(w)}$ and by the other lines parallel to $\mathcal{L}_{a,b}$. For $(a,b) \in M_j$ and $0 \le j < n$, the decision vector is denoted as $\boldsymbol{\psi}_j^{(w)} = (\psi_{j,0}^{(w)}, \psi_{j,1}^{(w)}, \ldots, \psi_{j,2^r-1}^{(w)})$, and is derived by summing up the minimum value between $\beta_{\mathcal{L}_{a,b}}$ and $\phi_{(a,b),j}$ when the extrinsic symbol information $-\sigma^{(w)}_{(a,b),j}$ equals $a_l \in$ GF($2^r$). The lth element in $\boldsymbol{\psi}_j^{(w)}$ is calculated by

$$\psi_{j,l}^{(w)} = \sum_{\substack{(a,b) \in M_j \\ -\sigma^{(w)}_{(a,b),j} = a_l}} \min\left\{ \beta_{\mathcal{L}_{a,b}},\, \phi_{(a,b),j} \right\}. \tag{24}$$

Let $\mathbf{R}_j^{(w)} = \{R_{j,0}^{(w)}, R_{j,1}^{(w)}, \ldots, R_{j,2^r-1}^{(w)}\}$ be the reliability measure vector of $z_j^{(w)}$ at the wth iteration, where $R_{j,l}^{(w)}$ is the reliability measure that $a_l$ is taken to be $z_j^{(w)}$. In the (w+1)th iteration, the reliability measure of $z_j^{(w+1)}$ is updated by

$$\mathbf{R}_j^{(w+1)} = \mathbf{R}_j^{(w)} + \boldsymbol{\psi}_j^{(w)}. \tag{25}$$

For $w = 0$ and $0 \le j < n$, we set $R_{j,l}^{(0)} = \varepsilon \varphi_{j,l}$, where the parameter ε is called a scaling factor, which is selected to optimize the performance of a given code.
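The accumulation in (24) and the update in (25) can be sketched with plain integer arithmetic. In this illustration each line through the symbol contributes one "vote": the index `l` of the element equal to its extrinsic symbol information, together with the two reliability values whose minimum is added to that element's entry. The tuple layout and the function names are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# One vote per line (a, b) in M_j: (element index l, beta of the line,
# extrinsic reliability of symbol j on that line).
Vote = Tuple[int, int, int]


def psi_vector(votes: Sequence[Vote], field_size: int) -> List[int]:
    """Accumulate min(beta, reliability) on the element each line votes for, as in (24)."""
    psi = [0] * field_size
    for l, beta, reliability in votes:
        psi[l] += min(beta, reliability)
    return psi


def update_reliability(R: Sequence[int], psi: Sequence[int]) -> List[int]:
    """Element-wise update R^(w+1) = R^(w) + psi^(w), as in (25)."""
    return [x + y for x, y in zip(R, psi)]


votes = [(2, 5, 3), (2, 4, 6), (0, 1, 9)]   # two lines vote for a_2, one for a_0
psi = psi_vector(votes, field_size=4)
R_next = update_reliability([10, -2, 6, -10], psi)
print(psi, R_next)
```

In this toy case the two votes for $a_2$ add 3 + 4 = 7 to its reliability, which is enough to overtake $a_0$ as the most reliable element after the update.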

Algorithm 2 IRTS-MLGD

1) Initialization: For $0 \le j < n$, set $R_{j,l}^{(0)} = \varepsilon \varphi_{j,l}$, $w = 0$, and the maximum number of iterations to $w_{\max}$.

2) Let $S^{(w)}(X)$ be the syndrome derived by dividing the received polynomial $z^{(w)}(X)$ by the generator polynomial $g(X)$ of the code. If $S^{(w)}(X) = 0$, then stop the decoding process and output $\mathbf{z}^{(w)}$ as the decoded codeword.

3) If $w = w_{\max}$, then stop the decoding process. If $S^{(w)}(X) \neq 0$, then declare a decoding failure.

4) For $0 \le a < J_3$, $0 \le b, b' < J_4$, and $i = a \times J_4 + b$: Update the elements in $\boldsymbol{\psi}_j^{(w)}$ using (24) by selecting the minimum value between (17) and (23) when the symbol extrinsic information $-\sigma^{(w)}_{(a,b),j}$ equals $a_l \in$ GF($2^r$).

5) For $0 \le j < n$, update the reliability measure vector $\mathbf{R}_j^{(w+1)}$ using (25). Make the hard decision $z_j^{(w)} = \arg\max_{a_l} R_{j,l}^{(w)}$. Let $w \leftarrow w + 1$, and form a new received vector $\mathbf{z}^{(w)}$.

6) Proceed to Step 2.

Due to the limit of quantization bit widths, we need to bound the range of the reliability for each symbol. Let $\Delta = 2^{p-1} - 1$ be the range of quantization. If $R_{j,\max}^{(w+1)} \triangleq \max_l \left(R_{j,l}^{(w)} + \psi_{j,l}^{(w)}\right)$ is greater than Δ, then $R_{j,\max}^{(w+1)}$ is truncated at Δ. By defining $\pi = R_{j,\max}^{(w+1)} - \Delta$, we obtain

$$R_{j,l}^{(w+1)} = \begin{cases} -\Delta, & \text{if } R_{j,l}^{(w+1)} - \pi < -\Delta; \\ R_{j,l}^{(w+1)} - \pi, & \text{otherwise.} \end{cases}$$

Using the above updating process, the proposed IRTS-MLGD algorithm is formulated in Algorithm 2.
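The truncation rule for keeping the reliabilities within the quantization range can be sketched as below. The function name is hypothetical, and returning the vector unchanged when its peak does not exceed Δ is an assumption, since the text only describes the overflow case.

```python
from typing import List, Sequence


def truncate_reliability(R: Sequence[int], p: int) -> List[int]:
    """Shift a reliability vector so its peak sits at Delta, flooring at -Delta.

    Delta = 2^(p-1) - 1 is the quantization range. If the peak does not
    exceed Delta the vector is returned unchanged (assumed behavior).
    """
    delta = 2 ** (p - 1) - 1
    peak = max(R)
    if peak <= delta:
        return list(R)
    shift = peak - delta                      # the offset pi in the text
    return [max(-delta, x - shift) for x in R]


print(truncate_reliability([11, -2, 13, -10], p=4))
```

With p = 4 the range is Δ = 7, so the peak of 13 is shifted down by 6 and the two entries that would fall below −7 are floored at −7. Subtracting the same offset from every entry preserves the ordering of the elements, so the hard decision is not affected by the truncation.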

The computational complexity of the IRTS-MLGD is analyzed as follows. The initialization of the decoding algorithm needs $n 2^r \log 2^r$ integer additions for (12) and $n 2^r$ integer multiplications by ε to compute the reliability measures of all $\mathbf{R}_j^{(0)}$. In addition, $J_3 J_4 (3q-5)(2^r-1)$ integer comparisons are required to calculate (16) and (17). At Step 2, an (n−k)-stage syndrome calculation must be performed using no more than (n−k) finite field additions and (n−k) finite field multiplications. At Step 4, $J_4(q-1)$ finite field additions and $J_4 q$ finite field multiplications are required for the calculation of the $J_4$ line-sums using (15) in each $P_a$. For two-step decoding, $J_2 J_4$ finite field additions and $J_2 J_4$ finite field multiplications are required for (18). $J_4 q$ finite field additions and $J_4 q$ finite field multiplications are required to calculate (20), and $q$ finite field additions are required for (22) to update the symbols in the corresponding line-sum. Finally, $J_4(q-2)(2^r-1)$ integer comparisons are required to calculate (24). In total, $J_3[J_4(J_2-1+2q)+q]$ finite field additions, $J_3 J_4 (J_2+2q)$ finite field multiplications, and $J_3 J_4 (q-2)(2^r-1)$ integer comparisons are required to complete the $J_3$ blocks for partial parallel decoding. At Step 5, $n J_1$ integer additions are required to update (25). Moreover, a maximum of $n 2^r$ integer additions and $n(2^r-1)$ integer comparisons are required for normalization, and $n(2^r-1)$ integer comparisons are required to make hard decisions.

Expressed in terms of the code length $n$ and $q$, a total of $n(4q-2)$ finite field additions, $n(3q-2)$ finite field multiplications, $nq(q+1)$ integer additions, and $nq(n+1)$ integer comparisons are required to carry out one iteration of the IRTS-MLGD. In the following examples, the bit widths are 10 bits and 12 bits, respectively, and the interval length for both codes is 0.3125. For convenience, the computational complexity of the ITS-EMS and the IRTS-MLGD is evaluated according to the number of operations. The IRTS-MLGD is shown to require considerably fewer computations than the ITS-EMS, which operates on real numbers in the 32-bit floating-point format of IEEE Standard 754 [22]. Moreover, the number of computations for the interpolation step of the ASD-KV algorithm exceeds that of the two proposed iterative decoding algorithms once λ increases to 9.99.
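The per-iteration totals stated above are straightforward to evaluate for the two example codes. The sketch below simply plugs n and q into the four closed-form counts; the function name and the dictionary keys are hypothetical.

```python
def irts_ops_per_iteration(n: int, q: int) -> dict:
    """Operation counts for one IRTS-MLGD iteration, from the closed forms in the text."""
    return {
        "gf_add": n * (4 * q - 2),    # finite field additions
        "gf_mul": n * (3 * q - 2),    # finite field multiplications
        "int_add": n * q * (q + 1),   # integer additions
        "int_cmp": n * q * (n + 1),   # integer comparisons
    }


for n, q in [(63, 8), (255, 16)]:
    ops = irts_ops_per_iteration(n, q)
    integer_ops = ops["int_add"] + ops["int_cmp"]
    print(n, q, ops, integer_ops)
```

The integer operations come to 36,792 per iteration for the (63,45) code and 1,113,840 per iteration for the (255,191) code. Multiplied by 10 iterations, these give roughly $3.7 \times 10^5$ and $1.1 \times 10^7$, in line with the orders of magnitude quoted from Table II in the examples that follow.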

Example 4: Consider the 64-ary (63,45) NB-TF-EG code in Example 1 with ε = 8. Fig. 6 shows the FER performances of the NB-TF-EG code over the AWGN channel with BPSK transmission using the ITS-EMS with 5 iterations, the IRTS-MLGD with 5 and 10 iterations, standard EMS with 50 iterations, and NB-TS-MLGD. The FER performances of the (63,45) RS code over GF($2^6$) decoded using the HD-BM and the ASD-KV algorithms are also included. As shown in Table II, the number of integer operations for the IRTS-MLGD with 10 iterations is on the order of $3.67 \times 10^5$. In contrast, the number of real-number computations for the ITS-EMS with 5 iterations is on the order of $1.45 \times 10^6$. At the FER of $10^{-6}$, the IRTS-MLGD with 10 iterations reduces the number of computations by 75% at a performance loss of 1.1 dB, compared to the ITS-EMS with 5 iterations. Besides, the IRTS-MLGD with 10 iterations outperforms the NB-TS-MLGD by 3 dB and achieves a 1 dB coding gain over the RS code decoded using the HD-BM algorithm. Furthermore, the NB-TF-EG code decoded using the IRTS-MLGD with 10 iterations performs nearly as well as the RS code decoded using the ASD-KV algorithm with λ = ∞, and achieves a 0.5 dB coding gain over the RS code decoded using the ASD-KV algorithm with λ = 4.99.

Fig. 6. Frame error rates of the IRTS-MLGD algorithm and other decoding algorithms for the 64-ary (63,45) NB-TF-EG code, and the (63,45) RS code over GF($2^6$) decoded with the HD-BM and the ASD-KV algorithms using

Fig. 7. Frame error rates of the IRTS-MLGD algorithm and other decoding algorithms for the 256-ary (255,191) NB-TF-EG code, and the (255,191) RS code over GF($2^8$) decoded with the HD-BM and the ASD-KV algorithms using

Example 5: Consider the 256-ary (255,191) NB-TF-EG code given in Example 3 with ε = 16. Fig. 7 presents the FER performances of the NB-TF-EG code over the AWGN channel with BPSK signaling decoded using the ITS-EMS with 5 iterations, the IRTS-MLGD with 5 and 10 iterations, standard EMS with 30 iterations, and NB-TS-MLGD. The FER performances of the (255,191) RS code over GF($2^8$) decoded using the HD-BM and the ASD-KV algorithms are also included.

From Table II, decoding with 5 and 10 iterations of the IRTS-MLGD requires integer operations on the order of $5.55 \times 10^6$ and $1.11 \times 10^7$, respectively. In contrast, the number of real-number computations required by the ITS-EMS with 5 iterations is on the order of $2.83 \times 10^8$. At the FER of $10^{-5}$, the IRTS-MLGD with 5 iterations reduces the number of computations by 99% at a performance loss of 0.5 dB, compared to the ITS-EMS with 5 iterations. In addition, the NB-TF-EG code decoded using the IRTS-MLGD with 10 iterations achieves a 3.2 dB coding gain over the NB-TS-MLGD, and outperforms the RS code decoded using the HD-BM algorithm by 0.7 dB. Moreover, the NB-TF-EG code decoded with 10 iterations of the IRTS-MLGD performs nearly as well as the RS code decoded using the ASD-KV algorithm with λ = ∞, and outperforms the RS code decoded using the ASD-KV algorithm with λ = 4.99 by 0.3 dB.

V. CONCLUSION

This paper extends a subclass of the TS-MLG decodable cyclic codes based on Euclidean geometries to the non-binary case, termed NB-TF-EG codes. We also present two corresponding algorithms for decoding NB-TS-MLG decodable cyclic codes. Our results demonstrate that the proposed iterative decoding algorithms are capable of efficiently decoding NB-TS-MLG decodable cyclic codes whose Tanner graphs include a large number of short cycles of length 4. This is achieved by utilizing the orthogonal structure of the parity-check matrices of the codes to avoid the performance degradation resulting from these short cycles. In addition, the proposed partial parallel decoding scheme strikes a reasonable balance between decoding speed and memory usage by incorporating a decomposition of the parity-check matrices of the codes. Simulation results demonstrate that the NB-TF-EG codes decoded using the proposed ITS-EMS algorithm in a small number of decoding iterations outperform the RS codes with similar lengths and rates decoded using either hard-decision or algebraic soft-decision decoding algorithms. Moreover, the IRTS-MLGD provides an alternative to the ITS-EMS for decoding NB-TF-EG codes with far lower computational complexity.

ACKNOWLEDGMENT

The authors would like to acknowledge the contributions of Prof. Shu Lin to the development of this paper. The authors also thank the anonymous reviewers and the Editor for their helpful suggestions and careful reading which have improved the quality and presentation of the paper.

REFERENCES

[1] L. D. Rudolph, “Geometric configuration and majority logic decodable codes,” M.S. thesis, Univ. Oklahoma, Norman, OK, USA, 1964.

[2] T. Kasami, S. Lin, and W. W. Peterson, “Polynomial codes,” IEEE Trans. Inf. Theory, vol. IT-14, no. 6, pp. 807–814, Nov. 1968.

[3] T. Kasami and S. Lin, “On majority-logic decoding for duals of primitive polynomial codes,” IEEE Trans. Inf. Theory, vol. IT-17, no. 3, pp. 322–331, May 1971.

[4] S. Lin and D. J. Costello, Jr., Error Control Coding: Fundamentals and Applications, 2nd ed. Upper Saddle River, NJ, USA: Prentice-Hall, 2004.

[5] Y. Kou, S. Lin, and M. Fossorier, “Low-density parity-check codes based on finite geometries: A rediscovery and new results,” IEEE Trans. Inf. Theory, vol. 47, no. 7, pp. 2711–2736, Nov. 2001.

[6] R. Tanner, “A recursive approach to low complexity codes,” IEEE Trans. Inf. Theory, vol. IT-27, no. 5, pp. 533–547, Sep. 1981.

[7] H. Tang, J. Xu, S. Lin, and K. Abdel-Ghaffar, “Codes on finite geometries,” IEEE Trans. Inf. Theory, vol. 51, no. 2, pp. 572–596, Feb. 2005.

[8] F. R. Kschischang, B. J. Frey, and H. A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 498–519, Feb. 2001.

[9] J. Chen and M. Fossorier, “Near optimum universal belief propagation based decoding of low-density parity check code,” IEEE Trans. Commun., vol. 50, no. 3, pp. 406–414, Mar. 2002.

[10] L. Zhang, Q. Huang, and S. Lin, “Iterative algorithms for decoding a class of two-step majority logic decodable cyclic codes,” IEEE Trans. Commun., vol. 59, no. 2, pp. 416–427, Feb. 2011.

[11] H.-C. Chang, C.-L. Chen, and H.-C. Chang, “An iterative weighted reliability decoding algorithm for two-step majority-logic decodable cyclic codes,” IEEE Commun. Lett., vol. 17, no. 10, pp. 1980–1983, Oct. 2013.

[12] D. J. MacKay and M. C. Davey, “Evaluation of Gallager codes for short block length and high rate applications,” in Proc. IMA Workshop Codes, Syst. Graph. Models, 1999, pp. 113–130.

[13] L. Zeng et al., “Construction of nonbinary cyclic, quasi-cyclic and regular LDPC codes: A finite geometry approach,” IEEE Trans. Commun., vol. 56, no. 3, pp. 378–387, Mar. 2008.

[14] W. E. Ryan and S. Lin, Channel Codes: Classical and Modern. Cambridge, U.K.: Cambridge Univ. Press, 2009.

[15] D. Declercq and M. Fossorier, “Decoding algorithms for nonbinary LDPC codes over GF(q),” IEEE Trans. Commun., vol. 55, no. 4, pp. 633–643, Apr. 2007.

[16] R. Koetter and A. Vardy, “Algebraic soft-decision decoding of Reed-Solomon codes,” IEEE Trans. Inf. Theory, vol. 49, no. 11, pp. 2809–2825, Nov. 2003.

[17] C.-Y. Chen, Q. Huang, C.-C. Chao, and S. Lin, “Two low-complexity reliability-based message-passing algorithms for decoding non-binary LDPC codes,” IEEE Trans. Commun., vol. 58, no. 11, pp. 3140–3147, Nov. 2010.

[18] Q. Huang, Q. Diao, S. Lin, and K. Abdel-Ghaffar, “Cyclic and quasi-cyclic LDPC codes on constrained parity-check matrices and their trapping sets,” IEEE Trans. Inf. Theory, vol. 58, no. 5, pp. 2648–2671, May 2012.

[19] S. Lin, “Multifold Euclidean geometry codes,” IEEE Trans. Inf. Theory, vol. IT-19, no. 4, pp. 537–548, Jul. 1973.

[20] E. Boutillon and L. Conde-Canencia, “Bubble check: A simplified algorithm for elementary check node processing in extended min-sum non-binary LDPC decoders,” Electron. Lett., vol. 46, no. 9, pp. 633–634, Apr. 2010.

[21] W. Gross, F. Kschischang, R. Koetter, and P. Gulak, “Applications of algebraic soft-decision decoding of Reed-Solomon codes,” IEEE Trans. Commun., vol. 54, no. 7, pp. 1224–1234, Jul. 2006.

[22] IEEE Standard for Binary Floating-Point Arithmetic, IEEE Std. 754, 1985.

Hsiu-Chi Chang received the B.S. and M.S. degrees in electronics engineering from National Chiao Tung University, Hsinchu, Taiwan, in 2005 and 2007, respectively, where he is currently working toward the Ph.D. degree in electronics engineering. From January 2011 to July 2011, he was a visiting scholar in electrical engineering at the University of California, Davis, CA, USA.

His research interests include iterative algorithms, error control codes, and machine learning.

Hsie-Chia Chang received the B.S., M.S., and Ph.D. degrees in electronics engineering from National Chiao Tung University, Hsinchu, Taiwan, in 1995, 1997, and 2002, respectively.

From 2002 to 2003, he was with OSP/DE1 in MediaTek Corporation, working in the area of decoding architectures for Combo single chip. In February 2003, he joined the Faculty of the Electronics Engineering Department, National Chiao Tung University, where he has been a Professor since August 2010. His research interests include algorithms and VLSI architectures in signal processing, especially for error control codes and crypto-systems. Recently, he has also committed himself to designing high code-rate ECC schemes for flash memory and multi-Gb/s chip implementations for wireless communications.

He has served as an Associate Editor of the IEEE Transactions on Circuits and Systems I: Regular Papers since 2012. He also served as a Technical Program Committee (TPC) member of IEEE A-SSCC 2011 and 2012. Dr. Chang was the recipient of the Outstanding Youth Electrical Engineer Award from the Chinese Institute of Electrical Engineering in 2010, and the Outstanding Youth Researcher Award from the Taiwan IC Design Society in 2011.