3-4 Viterbi decoder with receiver diversity

In the WLAN environment, multipath propagation can degrade the receiver performance significantly. Diversity techniques are known to mitigate this problem effectively. Space diversity is a popular and effective diversity technique, and it can be easily applied in WLAN systems. In this approach, an antenna array is used at the receiver. Several implementation architectures are possible for antenna diversity. The first one, shown in Figure 3.15, is called antenna selection. In this architecture, only one RF module is needed although L antennas are used. The architecture simply selects the antenna with the strongest output for processing. Baseband processing is not affected at all, and the implementation cost is the lowest. However, its performance enhancement capability is limited. The other structures generally require L RF modules; Figure 3.16 shows the architecture. The outputs from the RF modules are first down-converted, sampled, and transformed using the FFT. Various diversity combining methods can then be applied. We discuss only the optimal one, maximum ratio combining (MRC).
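As a minimal sketch of MRC on a single tone, assuming the channel gains are known at the receiver: each antenna output is weighted by the conjugate of its channel gain, and the sum is normalized by the total channel power. The function name `mrc_combine` and the sample values are illustrative and are not taken from the design.

```python
def mrc_combine(xs, hs):
    """Combine one tone observed on L antennas by maximum ratio combining.

    xs: received samples, one per antenna (complex)
    hs: channel gains, one per antenna (complex), assumed known
    """
    weighted = sum(complex(h).conjugate() * x for x, h in zip(xs, hs))
    power = sum(abs(h) ** 2 for h in hs)
    return weighted / power


# Illustrative values: two antennas, noiseless reception of the symbol -1.
hs = [0.5 + 0.5j, 1.0 - 0.2j]
xs = [h * -1.0 for h in hs]
estimate = mrc_combine(xs, hs)  # close to -1 because no noise was added
```

Antennas with a larger channel gain contribute more to the estimate, which is the property that distinguishes MRC from simple antenna selection.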

[Figure 3.15 block labels: Antenna 1, Antenna 2, ..., Antenna N; Comparator; MUX; RF circuit; DSP]

Figure 3.15 Antenna selection diversity


Simulations indicate that this scheme can improve the performance by 10 dB with two receive antennas in an exponential decay AWGN channel, which is a significant enhancement. Let the received signal for the ith tone of the nth symbol at the jth antenna be X_i^j(n), and let the corresponding transmit signal, channel, and noise be S_i^j(n), H_i^j(n), and V_i^j(n), respectively. Then,

$$X_i^j(n) = H_i^j(n)\, S_i^j(n) + V_i^j(n)$$

Note that the conventional approach performs diversity combining first and then passes the result to the Viterbi decoder. We call this observation MRC (OMRC). We propose

Figure 3.16 Antenna combining diversity

antenna is the kth be z_ij^k(n). We then have a new BM as

$$z_{ij}(n) = \sum_{k=1}^{L} z_{ij}^{k}(n) \qquad (3.16)$$

As we can see, this BM takes the observations from all antennas into account. Since the channel effect is included, it gives MRC-like results within the Viterbi algorithm. We call this the Viterbi MRC (VMRC) algorithm.
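A minimal sketch of the summed branch metric in (3.16) follows. The definition of the per-antenna metric is not preserved in the text, so the sketch assumes a squared Euclidean distance between the observation and the channel-weighted hypothesis. The function names and sample values are hypothetical.

```python
def per_antenna_metric(x, h, s):
    """Assumed metric: squared distance between x and the hypothesis h * s."""
    return abs(x - h * s) ** 2


def vmrc_branch_metric(xs, hs, s):
    """Sum the per-antenna metrics over all L antennas, as in (3.16)."""
    return sum(per_antenna_metric(x, h, s) for x, h in zip(xs, hs))


# Illustrative values: two antennas, noiseless reception of the symbol +1.
hs = [0.5 + 0.5j, 1.0 - 0.2j]
xs = [h * 1.0 for h in hs]
metric_correct = vmrc_branch_metric(xs, hs, 1.0)   # hypothesis matches
metric_wrong = vmrc_branch_metric(xs, hs, -1.0)    # hypothesis is wrong
```

Under this assumption the wrong hypothesis is penalized by four times the total channel power, so antennas with stronger channels dominate the decision, much as they do in MRC.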

Simulation results for an exponential decay AWGN channel are shown in Figure 3.17. Here, two antennas are used for diversity. From the figure, we can see that at BER = 10⁻⁴, the OMRC and the VMRC outperform the decoder without diversity (NORMAL) by 9 dB and 11 dB, respectively. The VMRC therefore outperforms the OMRC by 2 dB, which is a significant gain.

Figure 3.17 Comparison between NORMAL, OMRC and VMRC

We then add more antennas and carry out more simulations; Figure 3.18 shows the results. In the figure, the solid line with symbol “+” indicates the result with the decoder without diversity. Solid lines with symbols “o”, “*”, and “+” indicate the performance of the OMRC with two, three, and four antennas, respectively (denoted as OMRC2, OMRC3, and OMRC4). Dashed lines with symbols “o”, “*”, and “+” indicate the performance of the VMRC with two, three, and four antennas, respectively (denoted as VMRC2, VMRC3, and VMRC4). From these results, we can clearly see that the VMRC is always better than the OMRC. As the number of antennas increases, the performance gap also widens.

Figure 3.18 Comparison between NORMAL, OMRC2, OMRC3, OMRC4, VMRC2, VMRC3 and VMRC4

3-6 Memory management and adaptive tracing back

The register exchange method (REM) is simple, but it is not efficient for the WLAN system since it requires high power consumption and a large chip area. Thus, we employ the trace back method (TBM) as our implementation scheme. This section addresses the memory management scheme for the TBM.

Several trace back algorithms are known: the K-pointer even algorithm, the K-pointer odd algorithm [18], the one-pointer algorithm, and the hybrid algorithm. The one-pointer trace back architecture [6] uses only a single read pointer and needs approximately half as much memory as the other approaches. One disadvantage of the one-pointer algorithm is that separate column counters must be provided for the write and read operations. Another is that the trace back clock must be three times as fast as the write clock if the read region is three times as large as the write region. Since the clock rate in the WLAN application is not particularly high, we adopt this memory management method in our implementation.

Figure 3.19 shows the operation of the one-pointer algorithm. Here, “TB” denotes the trace back read, which finds the previous state according to the data stored in the memory. Two memory banks are used for trace back, which implies that the Viterbi decoder traces back with depth 2Γ, where Γ is the depth of one memory bank. “DC” denotes the decode read. The first decode read (the starting state) in a memory bank is determined by the result of the two preceding trace back banks. The decoder reads the newer data, decodes them, and sends them to the LIFO (last in, first out) register, which reverses the bit order. “WR” denotes the write of new decoder data. Decisions, indicating the previous surviving state, are made by the ACS and written into the locations representing the corresponding states.

We now explain the operations outlined in Figure 3.19. We assume that the interval between t0 and t1 is one unit of time, as is the interval between t1 and t2, and so on.

During the T0 interval (between t0 and t1), the write pointer indicates the write location, and the ACS output data are written into bank0. At the same time, the read pointer starts tracing back and reading data from bank3. The read pointer operates three times as fast as the write pointer. Thus, the write operation has filled only 1/3 of bank0 when “TB” has finished tracing back bank3. Then, at time t2, the trace back operation has terminated at the end of bank2, and the write operation has filled 2/3 of bank0. Finally, the decoder reads bank1 and finishes the decoding operation at t3, at which point the write operation has filled bank0 completely. During the T3 period (t3 to t4), we shift the read and write pointers as shown in Figure 3.19 and repeat the operations. In this way, decoded data are output from the decoder continuously.
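The timing relation described above can be sketched as a small simulation, assuming one column is written per write clock and three columns are read in the same period. The bank depth of 6 and the function name are hypothetical; only the order of the read operations comes from the text.

```python
READ_OPS = ["TB bank3", "TB bank2", "DC bank1"]  # order taken from the text


def one_pointer_schedule(bank_depth):
    """List (columns_written, last_read_op) after every write clock.

    The read pointer performs len(READ_OPS) reads per write clock, so it
    sweeps three full banks while the write pointer fills one.
    """
    speedup = len(READ_OPS)
    timeline = []
    reads_done = 0
    for column in range(1, bank_depth + 1):
        for _ in range(speedup):
            read_op = READ_OPS[reads_done // bank_depth]
            reads_done += 1
        timeline.append((column, read_op))
    return timeline


for written, read_op in one_pointer_schedule(bank_depth=6):
    print(f"WR has filled {written}/6 of bank0, last read: {read_op}")
```

With a depth of 6, the trace back of bank3 ends when two columns (1/3 of bank0) are written, the trace back of bank2 ends at four columns, and the decode read of bank1 ends when bank0 is full.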

Figure 3.19 displays the memory organization for storing the survivor path. Here, the memory depth of each bank is Γ. To find the optimal decode starting state, we have to trace back through two memory banks before decoding; in other words, the trace back length is 2Γ. Each data output from the decoder requires four memory accesses. The first is the “WR” data write. When the memory bank is full (3/3 WR), trace back starts in the next time unit. After two trace back passes, the decode (DC) operation follows. Since the trace back unit contributes almost half of the power dissipation (the trace back length is 2Γ), how to reduce the memory access frequency during trace back is the main concern.

Figure 3.19 One-pointer trace back method

Figure 3.20 also shows the trace back paths in the Viterbi decoder memory. As we can see, if the trace back length is long enough, all the trace back paths merge at some block. Figure 3.21 illustrates the trace back paths within two memory banks; the solid line indicates one trace back path and the dashed lines indicate other trace back paths. It is desirable that all paths merge before the decode block. It means that trace back from


Before the trace back, we can guess a terminating state, namely the state having the minimum path metric (PM) in the previous memory block. During the write operation, we can record all the states (at each time instant) that can reach the terminating state. Then, if the trace back starting state is among these states, we immediately know that the trace back will end at the terminating state we guessed; we call this case a “trace forward hit”.

Figure 3.21 shows an example. In the figure, the state with the minimum PM in the previous block (s35) is written at t=0. At t=1, we find that only one survivor path (s17) comes from s35, and we record this state (s17) by writing a “1” in a 64x1 register array. This means that if trace back starts from state 17 (s17), it will go through state s35. At t=2, there are two survivor paths coming from state s17. We record these two states (s8 and s40) by writing two “1”s in the register array (note that the array is reset before writing). These two “1”s indicate that if trace back starts from state 8 (s8) or state 40 (s40), it will eventually go through state s35. At t=3, there are three survivor paths coming from the previous two states, so we write three “1”s in the 64x1 register array. As the writing process proceeds, the values in the register array are updated. Finally, at t=n, the register array holds four “1”s, indicating states s2, s8, s32, and s63. These four states imply that if trace back starts from state 2 (s2), state 8 (s8), state 32 (s32), or state 63 (s63), it will terminate at s35.
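The register array update in this example can be sketched as follows, assuming the ACS decision for each state is available as the index of its surviving predecessor. The sketch uses a toy 4-state trellis with made-up survivor decisions instead of the 64 states of the decoder, and the function names are hypothetical.

```python
def trace_forward_step(reach, survivor_prev):
    """Advance the trace forward flags by one time instant.

    reach[s] is 1 if tracing back from state s reaches the guessed
    terminating state. survivor_prev[s] is the predecessor that the ACS
    selected for state s at the new time instant.
    """
    return [reach[prev] for prev in survivor_prev]


def is_trace_forward_hit(reach, start_state):
    """True if trace back from start_state is known to end at the guess."""
    return reach[start_state] == 1


# Toy 4-state example with hypothetical survivor decisions.
reach = [0, 0, 0, 1]                              # guess: state 3 at t=0
reach = trace_forward_step(reach, [1, 3, 0, 1])   # t=1 -> [0, 1, 0, 0]
reach = trace_forward_step(reach, [1, 2, 1, 0])   # t=2 -> [1, 0, 1, 0]
```

Building a fresh list at every step corresponds to resetting the register array before writing. A trace back that starts from state 0 or state 2 would be a trace forward hit in this toy example.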

Thus, we can skip the trace back operation in that memory block, which reduces the memory access frequency. Figure 3.22 illustrates this skip scheme. While bank0 is decoded, we write data to bank3 and record the trace forward states. After the decoding of bank0 is finished, we write data to bank1 and start the trace back from bank3. If the state with the minimum PM is among the trace forward states, the trace back operation in bank3 is skipped (see T1). If the trace forward hits once again in bank0, then at T3 we can skip the trace back in both bank0 and bank3.

[Figure labels: decode block, merge block, TB, TB, decode starting state]

Unfortunately, the “trace forward” does not always hit. Figure 3.23 shows the hit rate at different SNRs in an AWGN channel. Clearly, the hit rate is higher in a higher SNR environment. The percentage by which the memory access activity can be reduced is shown in Figure 3.24.

[Figure 3.21 labels: time axis t=0, t=1, t=2, t=3, ..., t=n-1, t=n; previous block minimum path metric state (s35); current block minimum path metric state; states s17, s8, s40, s36, s20, s52, s2, s32, s63]

Figure 3.21 Write data to record all possible trace

[Figure labels: WR, TB1, TB2, and DC operations over Bank0, Bank1, Bank2, and Bank3]

Figure 3.23 Trace forward hit rate with different SNR

Figure 3.24 Memory saving by trace forward scheme

As we can see, when the SNR is 12 dB, the trace forward scheme saves only 45% of the memory access activity even though the hit rate is over 90%. This is because “WR” and “DC” are still needed. To further reduce the memory access frequency, we need to merge these two operations.
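A rough consistency check of these numbers, under the assumption (not stated explicitly in the text) that each decoded output costs one write, two trace back reads, and one decode read, and that a trace forward hit removes both trace back reads:

```python
ACCESSES_PER_OUTPUT = {"WR": 1, "TB": 2, "DC": 1}  # four accesses per output


def access_saving(hit_rate, accesses=ACCESSES_PER_OUTPUT):
    """Fraction of memory accesses saved if every hit skips all TB reads."""
    return hit_rate * accesses["TB"] / sum(accesses.values())


saving_at_90_percent = access_saving(0.9)  # about 0.45
saving_upper_bound = access_saving(1.0)    # 0.5, since WR and DC remain
```

Under this simple model, a 90% hit rate corresponds to a saving of about 45%, and even a perfect hit rate cannot save more than half of the accesses.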

The advance minimum-transition trace back (AMTTB) scheme in [3] can further reduce the memory access activity. Figure 3.25 shows the operation. Here, we assume that the trace back depth is 32 and that the size of each memory bank is 64x32. We need one extra 1x32 predict buffer for storing the data of the most likely path. During the write operation, we record the most likely trace back path in the predict buffer. In the trace back stage, we read data from the memory bank, find the previous state, and compare the read data with the predicted data. If they differ, we modify the data in the predict buffer. Otherwise, the trace back operation is replaced by reading the predict buffer only, because the trace back has merged at this point and the memory bank no longer needs to be read. In this way, we can substantially reduce the memory access frequency in the trace back and decode stages. How many memory accesses can be saved depends on the merge point. Table 3.4 shows some simulation results. They indicate that in most cases the prediction hits in the first trace back stage, and almost all cases hit (merge) before the decode stage. This scheme can therefore reduce the memory access frequency significantly.

However, we have to add an extra trace back circuit.
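The compare-and-modify idea can be sketched as follows. This is not the circuit of [3]: for clarity the sketch stores predicted states rather than single decision bits in the predict buffer, uses a toy 8-state trellis with an assumed predecessor rule, and builds the prediction with a helper that stands in for the write-time recording. All names are hypothetical.

```python
NUM_STATES = 8  # toy trellis; the decoder in the text has 64 states


def prev_state(state, bit):
    """Assumed trellis rule: the stored decision bit selects the predecessor."""
    return ((state << 1) | bit) % NUM_STATES


def record_predicted_path(memory, guess_state):
    """Stand-in for the write-time recording of the most likely path."""
    predict = [0] * len(memory)
    state = guess_state
    for t in range(len(memory) - 1, -1, -1):
        predict[t] = state
        state = prev_state(state, memory[t][state])
    return predict


def trace_back_with_prediction(memory, predict, start_state):
    """Trace back from start_state, correcting predict until the paths merge.

    Returns the number of memory reads that were actually needed.
    """
    state = start_state
    reads = 0
    for t in range(len(memory) - 1, -1, -1):
        if predict[t] == state:
            break  # merged: the remaining buffer entries are already correct
        predict[t] = state                            # modify the buffer
        state = prev_state(state, memory[t][state])   # one memory read
        reads += 1
    return reads


# Toy memory of depth 6 in which every stored decision bit is 0.
memory = [[0] * NUM_STATES for _ in range(6)]
predict = record_predicted_path(memory, guess_state=5)
reads = trace_back_with_prediction(memory, predict, start_state=3)
```

In this toy case the prediction started from the wrong state, yet only two of the six columns had to be read before the path merged with the predicted one. A correct prediction would need no memory reads at all.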

[Figure 3.25 labels: WR, TB1, TB2, DC; compare and modify buffer; predict buffer 1x32]

Figure 3.25 Minimum-transition trace back operation scheme

Table 3.4 Minimum-transition trace back hit rate distribution (prediction hit rate, %)

SNR (dB)        18        19        20        21
Hit 1st stage   79.9286   84.0000   89.5000   91.0714
Hit 2nd stage   97.7857   98.2143   99.2857   99.4286

Chapter 4
