Thesis Organization - Memory-Based Multi-Standard Forward Error Correction

CHAPTER 1 INTRODUCTION

1.3 Thesis Organization

This thesis is organized as follows. Chapter 2 describes the FEC algorithms for both the encoder and the decoder, covering the scrambler, the interleaver, Reed-Solomon (RS) codes and convolutional codes. Chapter 3 presents the proposed multi-standard FEC algorithms and architectures, chiefly a multi-mode RS decoder that uses memories to store and correct the received data, and a memory-based universal convolutional interleaver and de-interleaver with a simple address generator. Chapter 4 reports the chip implementation and simulation results and compares the proposed design with other reference works. The last chapter gives the conclusion and future work.

Chapter 2 Algorithm of FEC

First, the FEC encoder and decoder are introduced. The FEC schemes in ITU-T J.83 fall into two groups, each composed of three or four processing layers. The first group, shown in figure 2.1(a), covers ITU-T J.83 annexes A, C and D. The second, shown in figure 2.1(b), covers ITU-T J.83 annex B. The following sections define and introduce the algorithm of each FEC layer.

(a) in → Scrambler → RS Encoder → Interleaver → to modulation

(b) in → RS Encoder → Interleaver → Scrambler → Trellis Encoder → to modulation

Figure 2.1: (a) FEC in ITU-T J.83 annexes A, C and D. (b) FEC in ITU-T J.83 annex B

2.1 Scrambler

The basic idea of the scrambler is to randomize the transmitted data so that the symbols are evenly distributed over the constellation and there are enough binary transitions for clock recovery.

[Figure 2.2 panels: (a) 15-stage PRBS shift register with enable, initialization sequence, data input and data output; the example data output is 10111000 = (B8)_HEX. (b) Three Z^-1 delay stages with an α³ multiplier operating on 7-bit symbols between data in and data out. (c) Randomizing byte taps D⁰ to D⁷.]

Figure 2.2: Scrambler in (a) J.83A, C and DVB-T system. (b) J.83B. (c) J.83D

Figure 2.2(a) shows the scrambler in J.83 annexes A, C and DVB-T systems. The scrambler adds a Pseudorandom Noise (PN) sequence to the input symbols. The polynomial for the Pseudo-Random Binary Sequence (PRBS) generator is:

$1 + x^{14} + x^{15}$ (2.1)

At the start of every eight transport packets, the PRBS registers shall be initialized to the sequence “100101010000000”.
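The PRBS of equation (2.1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the output bit is taken as the XOR of stages 14 and 15 and fed back into stage 1, bits are packed MSB first, and sync-byte handling is left out.

```python
def prbs_bits(count, init="100101010000000"):
    """Yield `count` bits of the PRBS 1 + x^14 + x^15 (equation 2.1)."""
    reg = [int(c) for c in init]      # reg[0] is stage 1, reg[14] is stage 15
    for _ in range(count):
        bit = reg[13] ^ reg[14]       # XOR of stages 14 and 15
        reg = [bit] + reg[:-1]        # feed back into stage 1, shift the rest
        yield bit


def prbs_bytes(n):
    """Pack the first 8*n PRBS bits into n bytes, MSB first (assumed order)."""
    bits = list(prbs_bits(8 * n))
    return bytes(
        int("".join(str(b) for b in bits[i:i + 8]), 2)
        for i in range(0, 8 * n, 8)
    )


def scramble(data):
    """XOR the data with the PRBS; the same call also descrambles."""
    return bytes(d ^ k for d, k in zip(data, prbs_bytes(len(data))))
```

Because the operation is a plain XOR with a reproducible sequence, calling `scramble` twice on the same bytes returns the original data.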

Figure 2.2(b) shows the scrambler in J.83B. The scrambler adds a PN sequence of 7-bit symbols over GF(128) to the input symbols to assure a random transmitted sequence.

Initialization is defined as pre-loading to the “all one” state. The scrambler uses a linear feedback shift register specified by a GF(128) polynomial defined as follows:

$f(x) = x^{3} + x + \alpha^{3}$ (2.2)

where:

$\alpha^{7} + \alpha^{3} + 1 = 0$ (2.3)

The scrambler generator polynomial and initialization in J.83D are shown in figure 2.2(c). The PRBS is generated in a 16-bit shift register that has nine feedback taps. Eight of the shift register outputs are selected as the fixed randomizing byte (D⁷ D⁶ D⁵ D⁴ D³ D² D¹ D⁰), where each bit of this byte is XORed with the corresponding input data bit. The random generator polynomial is:

$x^{16} + x^{13} + x^{12} + x^{11} + x^{7} + x^{6} + x^{3} + x + 1$ (2.4)

The initialization is a pre-load to F180h, meaning that the registers of x¹⁶, x¹⁵, x¹⁴, x¹³, x⁹ and x⁸ are loaded with 1 during the field sync interval.

The de-scrambler has the same structure as the scrambler, since the PRBS generator is built from shift registers and all operations are XORs.

2.2 Interleaving

The main purpose of interleaving is to resist burst errors, which are induced in a noisy channel, by rearranging the order of the input data sequence. There are generally two interleaving techniques: block interleaving and convolutional interleaving. Of the two, convolutional interleaving spreads burst errors more effectively.

The structure of the (I, J) convolutional interleaver and deinterleaver, based on the Forney approach [8] and the Ramsey type III approach [5], is shown in figure 2.3. The parameter I is the interleaving depth and is chosen to be larger than the maximum expected burst-error length. It is also the number of branches in the convolutional interleaver.

The parameter J is usually chosen such that I x J is larger than the decoding constraint length for convolutional codes. Branch 0 of the convolutional interleaver has zero delay; there are J shift registers in branch 1, 2J shift registers in branch 2, and so on, up to (I-1) x J shift registers in branch I-1. The convolutional de-interleaver has the inverse arrangement. Hence, the memory requirement is J x I x (I-1) / 2 in both the convolutional interleaver and the deinterleaver, and the total end-to-end delay is J x I x (I-1). This is half the delay and memory required by block interleaving.

Convolutional interleaving operates as follows. At the start of the FEC frame, the input switch is initialized to the top-most branch. The input switch is then cyclically connected to the other branches as valid symbols arrive, and the output switch moves in the same way. The input and output switches shall be synchronized.
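The branch structure described above can be modeled directly with one FIFO per branch. The following Python sketch is an illustration only: the class and function names are hypothetical, and `None` stands in for the don't-care symbols that come out while the FIFOs are filling.

```python
from collections import deque


class ConvBranches:
    """I FIFO branches visited cyclically; branch b delays by delays[b] visits."""

    def __init__(self, delays, fill=None):
        self.fifos = [deque([fill] * d) for d in delays]
        self.branch = 0

    def push(self, symbol):
        fifo = self.fifos[self.branch]
        self.branch = (self.branch + 1) % len(self.fifos)  # both switches advance
        fifo.append(symbol)          # a zero-length FIFO returns the symbol at once
        return fifo.popleft()


def interleaver(I, J):
    # Branch b holds b*J registers (branch 0 has zero delay).
    return ConvBranches([b * J for b in range(I)])


def deinterleaver(I, J):
    # Inverse arrangement: branch b holds (I-1-b)*J registers.
    return ConvBranches([(I - 1 - b) * J for b in range(I)])


def total_memory(I, J):
    return J * I * (I - 1) // 2      # registers in either structure


def end_to_end_delay(I, J):
    return J * I * (I - 1)           # symbols from interleaver input to deinterleaver output
```

For (I, J) = (12, 17), `total_memory` gives 1122 registers and `end_to_end_delay` gives 2244 symbols, and feeding 0, 1, 2, … into `interleaver(12, 17)` yields 0 at time 0, 12 at time 12, 204 at time 204 and 1 at time 205.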


Figure 2.3: Structure of (I, J) convolutional interleaving

Take the (12, 17) convolutional interleaver, which is adopted in J.83 annexes A, C and DVB-T systems, as an example. Assume the input sequence is 0, 1, 2, …, 204, 205, …, where each number is the input timing index. The output sequence is then 0, x, x, …, x, 12, x, x, …, x, 204, 1, x, x, …, x, 2244, 2041, …, 11, and so on, as shown in figure 2.4, where x denotes a “don’t care” symbol at the beginning of transmission. Hence, burst errors are spread out like pseudo noise after deinterleaving, and the data are restored to the original order by the deinterleaver in the receiver.

Figure 2.4: The output symbols in convolutional interleaver with I = 12, J = 17

In J.83 annexes A, C, DVB-T and the ATSC Digital TV system, the convolutional interleaving uses I = 12 and J = 17. In J.83 annex D, it uses I = 52 and J = 4. Each of these systems has a single fixed parameter set. The convolutional interleaving in J.83 annex B, however, has many different operating modes: I can be 128, 64, 32, 16 or 8, and J can be 1, 2, 3 to 7, 8 or 16. The most critical mode is I = 128, J = 8. Detailed specifications are given in [1].

2.3 Reed-Solomon Codes

Reed-Solomon codes have become the most important of the various types of error-control codes owing to their superior burst-error-correcting capability and their feasibility for digital implementation. Hence, RS codes are widely adopted in many data communication applications, such as digital TV systems, compact disc (CD) and digital versatile disc (DVD).

They are also adopted in DVB-T and ITU-T J.83 cable systems. An (N, K) RS code over GF(2^m) contains N coded symbols, of which K are message symbols, and can correct up to t = ⌊(N-K)/2⌋ errors. Note that each symbol over GF(2^m) has m bits and all operations in RS codes are carried out in GF(2^m) [11][12].

2.3.1 Reed-Solomon encoder

Let (M_{K-1}, M_{K-2}, …, M_1, M_0) denote the K message symbols to be transmitted. The message polynomial is:

$M(x) = M_{K-1}x^{K-1} + M_{K-2}x^{K-2} + \cdots + M_{1}x + M_{0}$ (2.5)

and the generator polynomial is:

$g(x) = (x+\alpha^{h})(x+\alpha^{h+1})\cdots(x+\alpha^{h+2t-1})$ (2.6)

where g(x) has degree 2t, h may be 0 or 1, and α is the primitive n-th root over GF(2^m). First, the message polynomial M(x) is multiplied by x^2t and then divided by the generator polynomial g(x) to obtain a remainder polynomial R(x):

$R(x) = M(x)\,x^{2t} \bmod g(x)$

Then, the codeword polynomial C(x) in systematic form can be expressed as:

$C(x) = M(x)\,x^{2t} + R(x)$ (2.9)

The RS encoder described above can be implemented as the systematic feedback shift register encoder shown in figure 2.5 [11][12], where G0, G1, …, Gr-1 are the coefficients of the generator polynomial. In the first K cycles, it outputs the message M(x). In the last N-K cycles, it outputs R(x). Together these form the final codeword C(x).

[Figure 2.5 residue: feedback shift register with adders; the switches are down for the first K ticks and up for the last N - K ticks.]

Figure 2.5: The circuit of the systematic feedback shift register RS encoder

For J.83 annexes A and C, the (204, 188) RS code over GF(2⁸) is used to correct up to 8 errors. The code generator polynomial is:

$g(x) = (x+\alpha^{0})(x+\alpha^{1})\cdots(x+\alpha^{15})$ (2.10)

where α is the primitive element defined by the primitive polynomial:

$p(x) = x^{8} + x^{4} + x^{3} + x^{2} + 1$ (2.11)
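As a rough illustration of the systematic encoding procedure for the (204, 188) code, the following Python sketch divides M(x)·x^2t by g(x) with a shift-register style loop over GF(2⁸) built from p(x) in equation (2.11). The function names are hypothetical, and the choice h = 0 (roots α⁰ to α¹⁵) with α = 02 hex is an assumption of this sketch.

```python
PRIM_POLY = 0x11D  # p(x) = x^8 + x^4 + x^3 + x^2 + 1, equation (2.11)


def gf_mul(a, b):
    """Multiply two GF(2^8) elements modulo PRIM_POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY
        b >>= 1
    return r


def generator_poly(two_t, h=0):
    """Coefficients of g(x), highest power first; h = 0 is assumed here."""
    root = 1
    for _ in range(h):
        root = gf_mul(root, 2)
    g = [1]
    for _ in range(two_t):
        nxt = g + [0]                      # g(x) * x
        for i, c in enumerate(g):
            nxt[i + 1] ^= gf_mul(c, root)  # plus g(x) * root
        g = nxt
        root = gf_mul(root, 2)
    return g


def rs_encode(msg, two_t, h=0):
    """Systematic codeword: message symbols followed by R(x)."""
    g = generator_poly(two_t, h)
    rem = [0] * two_t
    for m in msg:
        fb = m ^ rem[0]                    # feedback symbol of the shift register
        rem = rem[1:] + [0]
        for i in range(two_t):
            rem[i] ^= gf_mul(fb, g[i + 1])
    return list(msg) + rem


def poly_eval(poly, x):
    """Horner evaluation, coefficients highest power first."""
    acc = 0
    for c in poly:
        acc = gf_mul(acc, x) ^ c
    return acc
```

A codeword produced by `rs_encode(msg, 16)` for a 188-symbol message has 204 symbols, and `poly_eval` returns zero at each of the sixteen assumed roots of g(x).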

For J.83 annex B, the RS encoder implements a t = 3, (128, 122) extended RS code over GF(2⁷). The primitive polynomial used to form the field GF(2⁷) is:

$p(x) = x^{7} + x^{3} + 1$ (2.12)

and the generator polynomial is:

$g(x) = (x+\alpha)(x+\alpha^{2})(x+\alpha^{3})(x+\alpha^{4})(x+\alpha^{5})$ (2.13)

After C(x) is generated from equation (2.9), an extended parity symbol C_ is generated by evaluating the codeword at the sixth power of α, that is, C_ = C(α⁶). This extended symbol forms the last symbol of the transmitted RS codeword. The extended codeword polynomial is then:

$\hat{C}(x) = x\,C(x) + C\_$ (2.14)

(207, 187) RS codes with t = 10 over GF(2⁸) are used for J.83 annex D. The generator polynomial g(x) is:

$g(x) = \prod_{i=0}^{19}(x+\alpha^{i})$ (2.15)

2.3.2 Reed-Solomon decoder

Assume the received polynomial is r(x) and the error polynomial is e(x). That is:

$r(x) = C(x) + e(x)$

The Reed-Solomon decoding process can be divided into four steps [4]: (1) syndrome calculator, (2) key equation solver, (3) Chien search, and (4) error value evaluator, as shown in figure 2.6.

The syndrome calculator computes a set of syndromes from the received codeword. The key equation solver produces the error locator polynomial σ(x) and the error value evaluator polynomial from the syndromes. The Chien search and the error value evaluator then give the error locations and the error values, respectively.

Figure 2.6: RS decoding process
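The first decoding step can be illustrated with a short Python sketch that evaluates the received polynomial at the roots of the generator polynomial. The indexing S_i = r(α^(h+i)) for i = 0, …, 2t-1 with h = 0, the field GF(2⁸) with p(x) from equation (2.11), and all function names are assumptions of this sketch, not the notation of the thesis.

```python
PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1, equation (2.11)


def gf_mul(a, b):
    """Multiply two GF(2^8) elements modulo PRIM_POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY
        b >>= 1
    return r


def gf_pow(a, n):
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r


def syndromes(received, two_t, h=0, alpha=2):
    """Return [r(alpha^(h+i)) for i in 0..2t-1]; received is highest power first."""
    result = []
    for i in range(two_t):
        root = gf_pow(alpha, h + i)
        acc = 0
        for coeff in received:        # Horner evaluation of r(x) at the root
            acc = gf_mul(acc, root) ^ coeff
        result.append(acc)
    return result
```

All syndromes are zero for a valid codeword (the all-zero word is one), while a single error of value e at the coefficient of x^p gives S_i = e·α^(p·i) under the assumed indexing.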

In syndrome calculator, the syndromes are calculated as follows:

For the extended RS codes in J.83B, the syndrome should be modified. Recall r(x) =

; there are two cases discussed individually as follows:

)

The decoding procedure is the same as the normal cases.

(2) r0 is in error, meaning r0 = C_ + e_. Then, while v ≤ …, the error value e_ can be calculated so that the discrepancy ∆⁽²ᵗ⁻¹⁾ = 0 when the key equation is solved by the Berlekamp-Massey algorithm, which is introduced later.

The key equation is defined as follows:

$\sigma(x)\,S(x) \equiv \Omega(x) \pmod{x^{2t}}$

The key equation can be solved by Euclidean algorithm or Berlekamp-Massey (BM) algorithm [11][12][19].

An inversionless BM algorithm which is a 2t-step iterative algorithm is shown as follows [4]:

where σ⁽ⁱ⁾(x) is the i-th step error locator polynomial and σ_j⁽ⁱ⁾ is the coefficient of x^j in σ⁽ⁱ⁾(x); ∆⁽ⁱ⁾ is the i-th step discrepancy; and an assisting polynomial and an assisting degree variable are also kept in the i-th step.

And the modified inversionless BM algorithm with some differences in initial conditions can be shown as follows [4]:

Besides, if σ(x) is first obtained, from the key equation and the Newton’s identity we could

These modified inversionless BM equations are adopted in our proposed multi-mode key equation solver because of their regularity.

The alternative key equation solver is the Euclidean algorithm, which can be summarized as follows:

Initial condition:

where σⁱ(x) is the i-th step error locator polynomial, Ωⁱ(x) is the i-th step error evaluator polynomial, and q_i(x) is the i-th quotient polynomial generated in the key equation.

After solving the key equation, we find the roots of σ(x) for error location X1, X2, …,

Then, using the Forney algorithm in the error value evaluator, together with the key equation and equation (2.19), we can get:

The error values can be calculated as follows:

With the error locations and error values obtained from the above algorithms, we can correct the channel-induced errors in the received data and recover the correct codeword.

However, if the number of errors in one codeword exceeds t, the received data cannot be corrected.

For more information about the RS decoding process, see [4][11][12][19].

2.4 Trellis Codes

In an (n, k, m) trellis code (also called a convolutional code), the coded n-bit output block depends not only on the corresponding k-bit input message block but also on the m previous message blocks. It can be implemented with an n-output, k-input linear sequential circuit with an input memory of m words. The advantage of convolutional codes is that the added redundancy improves the threshold signal-to-noise ratio.

Only J.83 annex B contains the trellis code. Its trellis-coded modulator uses a 16-state non-systematic rate 1/2 encoder with the generator:

(G1, G2) = (25, 37) in octal

The puncture matrix proposed in [13] converts the rate 1/2 encoder to rate 4/5: of the eight coded bits produced for every four input bits, five are kept.

The puncture matrix is defined as:

(P1, P2) = (0001, 1111)

The internal structure of the punctured convolutional encoder is illustrated in figure 2.7.

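[Figure 2.7 residue: input u feeds four delay elements D; G1 = (25) has taps 1 0 1 0 1 and G2 = (37) has taps 1 1 1 1 1; the puncture matrix rows are 0 0 0 1 and 1 1 1 1; the output goes to the QAM mapper.]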

Figure 2.7: Punctured binary convolutional codes in ITU-T J.83B
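The punctured encoder can be sketched in Python as follows. This is a minimal illustration: the function names are hypothetical, and the order in which the surviving G1 and G2 bits are emitted within each group of four input bits is an assumption, since the text does not state it.

```python
G1, G2 = 0o25, 0o37        # generator taps from the text (octal)
P1, P2 = "0001", "1111"    # puncture patterns from the text
MEMORY = 4                 # four delay elements, hence 16 states


def parity(x):
    return bin(x).count("1") & 1


def conv_encode(bits):
    """Rate-1/2 encoder: one (c1, c2) pair per input bit."""
    state, pairs = 0, []
    for b in bits:
        reg = (b << MEMORY) | state            # newest bit plus the four stored bits
        pairs.append((parity(reg & G1), parity(reg & G2)))
        state = reg >> 1                       # shift the register by one position
    return pairs


def puncture(pairs):
    """Keep only the coded bits whose pattern entry is 1 (period of four pairs)."""
    out = []
    for i, (c1, c2) in enumerate(pairs):
        if P1[i % 4] == "1":
            out.append(c1)
        if P2[i % 4] == "1":
            out.append(c2)
    return out
```

Every four input bits produce eight coded bits, of which `puncture` keeps five, giving the overall rate 4/5.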

The Viterbi algorithm, proposed in 1967, is a straightforward implementation of the maximum likelihood (ML) decoder and is the most powerful and popular algorithm for decoding convolutional codes [14][15]. It finds the ML path in the following four steps:

(1) According to the current received input datum, calculate the transition metrics (TM) to the next transition states.

(2) Add the previous path metrics (PM) to the calculated TM and compare the two paths that come from different states but merge at the same current state. Then select the path with the smallest distance. This operation is called ACS (Add-Compare-Select), and one ACS unit is used for each state.

(3) Store the output of the selected branch in each state into the memory, which is called the “survivor path”.

(4) Repeat (1), (2) and (3) until the survivor path memory is full; then trace back the survivor path to output the decision with the smallest path metric (the ML path).
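The four steps above can be sketched in Python for the unpunctured rate 1/2 code with generators (25, 37) octal. This is an illustration under stated assumptions: hard-decision Hamming metrics, a block terminated with four zero bits so that trace-back starts from state 0, trace-back storage rather than register exchange, and hypothetical function names.

```python
G = (0o25, 0o37)            # generators from the text (octal)
K = 5                       # constraint length: 4 delay elements
NSTATES = 1 << (K - 1)      # 16 states


def parity(x):
    return bin(x).count("1") & 1


def branch(state, bit):
    """Return (next_state, (c1, c2)) for one input bit."""
    reg = (bit << (K - 1)) | state
    return reg >> 1, tuple(parity(reg & g) for g in G)


def encode(bits):
    state, out = 0, []
    for b in list(bits) + [0] * (K - 1):       # flush the encoder back to state 0
        state, o = branch(state, b)
        out.extend(o)
    return out


def viterbi(received):
    inf = float("inf")
    pm = [0] + [inf] * (NSTATES - 1)           # path metrics, start in state 0
    survivors = []
    for t in range(0, len(received), 2):
        r = received[t:t + 2]
        new_pm = [inf] * NSTATES
        choice = [None] * NSTATES
        for s in range(NSTATES):
            if pm[s] == inf:
                continue
            for b in (0, 1):
                ns, o = branch(s, b)
                tm = (o[0] != r[0]) + (o[1] != r[1])   # step (1): transition metric
                if pm[s] + tm < new_pm[ns]:            # step (2): add-compare-select
                    new_pm[ns] = pm[s] + tm
                    choice[ns] = (s, b)
        survivors.append(choice)                       # step (3): survivor storage
        pm = new_pm
    state, bits = 0, []
    for choice in reversed(survivors):                 # step (4): trace back
        state, b = choice[state]
        bits.append(b)
    bits.reverse()
    return bits[:-(K - 1)]                             # drop the flush bits
```

Decoding the output of `encode` returns the original bits, and the same holds when a single coded bit is flipped in the examples tried here.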

In practice, register exchange and trace-back are the two useful methods for survivor path storage management in a Viterbi decoder architecture. The former takes more area but less time than the latter. We use the register-exchange method because the convolutional code in J.83B has only 16 states, so the number of registers required for this decoder is not large. The detailed architecture of the Viterbi decoder for J.83B is introduced in chapter 3.4.2.

2.5 Summary

This chapter introduced the encoding and decoding algorithms of each FEC section: the scrambler, interleaving, RS codes and convolutional codes. Chapter 2.1 introduced the three kinds of scrambler in J.83. Chapter 2.2 introduced the convolutional interleaver and deinterleaver, which have advantages over block interleaving. Chapter 2.3 introduced the encoding and decoding algorithms of RS codes; for RS decoding, two kinds of key equation solver were presented, the BM algorithm and the Euclidean algorithm. It also introduced the three kinds of RS codes in J.83: one over GF(2⁷) with t = 3, and two over GF(2⁸) with t = 8 and t = 10, respectively. Chapter 2.4 introduced convolutional codes and the Viterbi algorithm. Fortunately, J.83 has only one convolutional code mode: a 16-state non-systematic rate 1/2 encoder with the generator (G1, G2) = (25, 37) in octal.

Chapter 3 Algorithm and Architecture for Multi-Mode FEC Decoder

This chapter proposes the algorithm and architecture of a multi-mode RS decoder that uses memories to store and correct the received data, and of a memory-based universal convolutional interleaver/de-interleaver. These two modules are compatible with ITU-T J.83, DVB-T, ATSC Digital TV systems, etc. The scrambler and the Viterbi decoder are mentioned only briefly, since the scrambler is simple and there is only one kind of convolutional code.

3.1 The proposed multi-mode FEC decoder

Figure 3.1 shows the block diagram of the proposed multi-mode FEC decoder. It integrates all the systems of figure 2.1 into one system. The labels A/B/C/D denote annexes A/B/C/D of ITU-T J.83. The different data paths for J.83 annex B and annexes A/C/D are selected by a multiplexer.

[Figure 3.1 blocks: From De-mapper; Trellis Decoder & Synchronization (B); Descrambler (B); Deinterleaver (A/B/C/D); RS Decoder (A/B/C/D); Descrambler (A/C/D); MUX controlled by mode; out.]

Figure 3.1: The proposed multi-mode FEC decoder

3.2 Memory-based universal convolutional interleaver/de-interleaver

Implementing the many FIFOs of a convolutional interleaver or deinterleaver directly is inefficient: it consumes a lot of power and area and causes routing difficulty in APR (Auto Placement and Route). A better solution is to use SRAM.

The key issue then becomes how to generate the correct SRAM address for each input and output datum. To this end, a novel, low-complexity, highly flexible memory-based method for implementing the multi-mode convolutional interleaver and deinterleaver is proposed, derived from [6][7].

3.2.1 The algorithm and architecture of memory-based universal convolutional interleaving

The idea is to rebuild the FIFO registers of the convolutional deinterleaver as a memory array. Assume the FIFO registers of the first branch are placed somewhere in the memory array, the FIFO registers of the second branch are appended after them, and so on, until the FIFO registers of the last branch are appended. The resulting memory array is shown in figure 3.2.

For writing, after the first symbol is written into the head of the memory array, the next symbol should be written into the head of the second branch, i.e., the address distance between the first and second symbols is (I-1) x J. This value equals the number of FIFO registers in the first branch, so we call it the “branch address”. The address of the first symbol is called the intra-initial address. The address distance between the second and third symbols is (I-2) x J, and so on, until the address distance between the (I-2)-th and (I-1)-th symbols is 2J.

For reading, the first readout symbol should be at the end of the first branch in the memory array and the second readout symbol at the end of the second branch, i.e., the address distance between the first and second readout symbols is (I-2) x J. Similarly, the address distance between the second and third symbols is (I-3) x J, and so on, until the address distance between the (I-2)-th and (I-1)-th symbols is J.

To keep the writing and reading directions consistent, the initial address pointer is decreased by 1 for the next I symbols, and the operation above is repeated. In addition, the memory size must be defined: if an address exceeds the memory size, it is taken modulo the memory size.

[Figure 3.2 residue: de-interleaver branches of lengths (I-1)*J, (I-2)*J, …, 2J, J laid end to end in one memory array, with write and read positions marked at the boundaries of each branch.]

Figure 3.2: The memory array by rebuilding the FIFO registers of deinterleaver

A (12, 17) convolutional deinterleaver, as adopted in ITU-T J.83A, C and DVB-T systems, is taken as an example to show how this works. Assume the received data are 0, x, x, …, x, 12, x, x, …, x, 204, 1, x, x, …, x, 2244, 2041, …, 11, …, as shown in figure 2.4, where each number is the input index at the interleaver and x denotes a “don’t care” symbol at the beginning. When deinterleaving, after 0 is written to memory, the interval between the address of 0 and the next writing address is (I-1) x J = 187, as shown in figure 3.3(a). The interval between that address and the next is (I-2) x J = 170, and so on, down to 2J = 34. These numbers equal the FIFO lengths on the branches of the convolutional deinterleaver. When 12 is written to the memory, the address goes back to “initial writing address - 1” and the operation is repeated. After 202 is written into the memory, the stored data look like figure 3.3(b). The distance between 0 and 1 is then (I-2) x J = 170, the distance between 1 and 2 is (I-3) x J = 153, and so on; the distance between 9 and 10 is J = 17. At this point the memory size in figure 3.3(b) is J x I x (I-1) / 2, exactly the minimum memory requirement of figure 2.3. There is no space left to write 2244 without violating the rules, so the memory must be enlarged; by observation, J more locations are needed. As shown in figure 3.3(c), when 0 is read out from memory, 2244 is written into memory; when 1 is read out, 2041 is written to the original position of 0. The operation is then repeated. When an address exceeds the memory size, it is taken modulo the memory size. Hence, the required memory size is J x I x (I-1) / 2 + J. The maximum size is 65032 bytes for the (128, 8) convolutional deinterleaver in J.83B, only 8 bytes more than the original structure, and the scheme has the advantages of low cost and high flexibility for multi-mode design.
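The addressing scheme described above can be tried out with the following Python sketch. It is one possible reading of the description, not the thesis's exact pseudo code: the class name, the read-before-write ordering and the `interleaved` helper (which models an ideal interleaver output by index arithmetic) are assumptions, while the variable names mirror those used in the text.

```python
class MemoryDeinterleaver:
    """Single-memory (I, J) convolutional deinterleaver, sketch only."""

    def __init__(self, I, J):
        self.I, self.J = I, J
        self.mem_bound = J * I * (I - 1) // 2 + J      # size given in the text
        self.mem = [None] * self.mem_bound
        self.w_ini_addr = 0
        self.w_addr = 0
        self.counter = 1

    def push(self, symbol):
        I, J = self.I, self.J
        if self.counter == I:                          # last branch: pass through
            self.counter = 1
            self.w_ini_addr = (self.w_ini_addr - 1) % self.mem_bound
            self.w_addr = self.w_ini_addr
            return symbol
        branch_addr = (I - self.counter) * J           # FIFO length of this branch
        r_addr = (self.w_addr + branch_addr) % self.mem_bound
        out = self.mem[r_addr]                         # read the oldest symbol first
        self.mem[self.w_addr] = symbol                 # then write the new one
        self.w_addr = r_addr                           # head of the next branch
        self.counter += 1
        return out


def interleaved(n, I, J):
    """Index of the symbol an ideal (I, J) interleaver outputs at time n."""
    src = n - (n % I) * J * I
    return src if src >= 0 else None
```

Feeding `interleaved(n, 12, 17)` for n = 0, 1, 2, … into `MemoryDeinterleaver(12, 17)` returns the indices in their original order after a delay of J x I x (I-1) = 2244 symbols, using 1139 memory locations; for (128, 8) the memory size comes out as 65032.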

[Figure 3.3 residue: (a) write addresses of the first symbols, with intervals 187 = (I-1)xJ and 170 = (I-2)xJ; (b) memory contents once the array is filled, with intervals (I-1)xJ = 187, (I-2)xJ = 170, …, J = 17; (c) memory contents after 2244 is written.]

Figure 3.3: Behavior of the novel algorithm for (12, 17) convolutional deinterleaver

The detailed operations of the universal convolutional deinterleaver are described as pseudo code in figure 3.4, which uses the following 12 parameters:

(1) I: Interleaver depth.

(2) J: The delay difference between neighboring branches.

(3) in: Data input.

(4) out: Data output.

(5) w_addr: The writing address for memory input.

(6) r_addr: The reading address from memory to output.

(7) w_ini_addr: The intra-initial address of w_addr.

(8) r_ini_addr: The intra-initial address of r_addr.

(9) branch_addr: The address distance between two neighboring data.

(10) counter: Determines when to output directly and reset w_addr and r_addr.

(11) mem_bound: Maximum size of memory.

(12) mem[ ]: The memory, whose size is mem_bound.

The convolutional interleaver, which is the inverse of the convolutional deinterleaver, can be formulated just as easily.

Initial condition:

w_addr = w_ini_addr = 0; branch_addr = (I-1)*J;
r_addr = r_ini_addr = (I-1)*J; counter = 1;
mem_bound = J*I*(I-1)/2 + J;

while (in != NULL) {
    if (counter == I) /* In the last branch, the input passes directly to the output */
    {
        out = in;
        branch_addr = (I-1)*J; /* branch_addr goes back to its initial condition */
        counter = 1; /* reset the counter */
        w_ini_addr = w_ini_addr - 1; /* reset the writing and reading address */
