Fixed-Point Implementation of Initial Downlink Synchronization

5.2 Fixed-Point Implementation

Usually, we use floating-point processing to verify the performance of the algorithms. However, fixed-point processing improves power efficiency and speed and reduces hardware cost. Hence large-volume practical implementations normally employ fixed-point processing. In this section, we present the fixed-point implementation of the initial downlink synchronization algorithm on TI's TMS320C6416T DSP. We also exploit coding style and intrinsic functions to reduce the cycle count on the DSP.

[Figures omitted: histograms of ICFO estimation under AWGN and under SUI1 at mobility 10 km/h and 90 km/h, each in 0 dB and 10 dB; histograms of PID detection under the same conditions (x-axis: PID index, y-axis: cumulative amount); histograms of fine timing estimation under AWGN and under SUI1 and PB at mobility 10 km/h and 90 km/h, each in 0 dB, 10 dB and 20 dB (x-axis: estimated timing index, y-axis: cumulative amount).]

As described in Chapter 4, the C6416T CPU has a VLIW architecture with eight parallel 32-bit functional units: two multipliers and six units that perform a variety of arithmetic, logic and memory access operations. The architecture is flexible in that each functional unit can perform dual 16-bit or quad 8-bit operations. In our work, we mostly use the 16-bit data type, because 16-bit computation is accurate enough for most of the functions we implement.

Fig. 5.21 shows the fixed-point data formats used at different places in our algorithm, where Qx.y means that there are x bits before the binary point and y bits after it. In our case, x + y = 15 because the sign takes 1 bit. We choose Q7.8 as the data format at many places because coarse timing estimation needs to accumulate the squared norm of the data, and the Q7.8 format avoids overflow there. We also find that the Q7.8 format is accurate enough for our experiments. In the following subsections, we discuss the details of the blocks in the algorithm.

5.2.1 Coarse Timing Estimation and Removal of Cyclic Prefix

The first step in the procedure is coarse timing estimation, which finds the approximate location of the PA-Preamble. Fig. 3.1 shows our signal structure, over which we compute the signal power in a finite-size window and slide the window. According to the IEEE 802.16m standard, the PA-Preamble magnitude is boosted by a factor of 1.9216, 2.6731 or 4.6511 compared to the regular data signal, so the maximum-power position should be a good indicator of where the PA-Preamble is. After coarse timing estimation, we remove the CP from the 576 points starting at the estimated point to get 512 points of data. Because the point estimated by coarse timing estimation may lie within the CP, what we actually do is take the first 512 points starting from the coarse timing point, which is equivalent to discarding the last 64 points of the 576 points; this is more likely to yield a complete PA-Preamble.

Figure 5.21: Fixed-point data formats used in DSP implementation.
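The sliding-window power search described in this subsection can be sketched as follows. This is an illustration under stated assumptions, not the DSP code itself: the type `cplx16`, the function names, and the use of a 64-bit accumulator are hypothetical choices, and the window length is left as a parameter because the text does not give its exact value.

```c
#include <stdint.h>
#include <stddef.h>

/* Complex sample with Q7.8 real and imaginary parts (assumed layout). */
typedef struct { int16_t re, im; } cplx16;

/* Squared norm of one sample; widened to 64 bits so it cannot overflow. */
static int64_t sample_power(cplx16 s)
{
    return (int64_t)s.re * s.re + (int64_t)s.im * s.im;
}

/* Return the start index of the length-`win` window with the largest
 * accumulated power. Requires n_samples >= win >= 1. */
static size_t coarse_timing(const cplx16 *x, size_t n_samples, size_t win)
{
    int64_t power = 0;
    int64_t best_power;
    size_t best = 0;

    for (size_t n = 0; n < win; n++)          /* power of the first window */
        power += sample_power(x[n]);
    best_power = power;

    for (size_t start = 1; start + win <= n_samples; start++) {
        /* Slide by one sample: add the newest, drop the oldest. */
        power += sample_power(x[start + win - 1]) - sample_power(x[start - 1]);
        if (power > best_power) {
            best_power = power;
            best = start;
        }
    }
    return best;
}
```

Updating the running sum incrementally keeps the cost at one addition and one subtraction of sample powers per window position, instead of recomputing the whole window each time.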

5.2.2 Fractional Carrier Frequency Offset Estimation and Compensation

FCFO estimation is the second step in the procedure. As Fig. 5.22 shows, we correlate the first 256 points of the PA-Preamble with the last 256 points to calculate the FCFO, which is obtained as the arctangent of the correlation. For efficiency in the DSP implementation, we use a lookup table rather than computing the arctan() function directly. For reasons of dynamic range, the table we create is for the arcsin() function, which we use to estimate the FCFO in place of a table of the arctan() function.

The table contains 2048 entries uniformly spanning the range [sin 0, sin 0.25π), and the table entries are normalized with respect to π so that they span the range [0, 0.25).

In frequency offset compensation, we create two lookup tables for the sin() and the cos() functions, each containing 2048 entries uniformly spanning the range [0, π/2). After the FCFO is compensated, the data format becomes Q7.24. We then change the data format from Q7.24 to Q7.8 in order to avoid overflow in ICFO estimation.

Figure 5.22: Calculating the correlation in received PA-Preamble.
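The format change can be illustrated with a short sketch. It assumes that the sin() and cos() table entries are unsigned values with 16 fractional bits, so that multiplying a Q7.8 sample by a table entry yields 8 + 16 = 24 fractional bits in a 32-bit word, which is consistent with the Q7.24 format mentioned above. The table format, the function names, and the rounding step are assumptions rather than details given in the text.

```c
#include <stdint.h>

/* Multiply a Q7.8 sample by an unsigned table value with 16 fractional
 * bits (assumed table format). The 32-bit product is in Q7.24. */
static int32_t mul_q78_by_frac16(int16_t sample_q78, uint16_t coeff_frac16)
{
    return (int32_t)sample_q78 * (int32_t)coeff_frac16;
}

/* Convert Q7.24 back to Q7.8 by rounding away 16 fractional bits.
 * Assumes that >> on a negative int32_t is an arithmetic shift, which
 * holds for common compilers but is implementation-defined in ISO C.
 * For any product returned by mul_q78_by_frac16, adding the rounding
 * constant cannot overflow 32 bits. */
static int16_t q724_to_q78(int32_t v_q724)
{
    return (int16_t)((v_q724 + (1 << 15)) >> 16);
}
```

For example, 1.5 in Q7.8 (384) times 0.5 (32768) gives 12582912, which is 0.75 in Q7.24, and the conversion returns 192, which is 0.75 in Q7.8.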

5.2.3 Integer Carrier Frequency Offset Estimation and PID Detection

The last step of the procedure is ICFO estimation and PID detection, for which we operate in the frequency domain. Since the ICFO is just a shift in the subcarrier indexes in the frequency domain, it is relatively simple to implement in a C program. According to (3.14), we calculate the channel frequency response and transform it to the time domain. Since the CIR length is assumed not to exceed 64 points, we can assume that the correct choice of ICFO and PID should yield the maximum sum of squared values for the resulting CIR. The flow chart is shown in Fig. 5.23.

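From the document: IEEE 802.16m 初始下行同步之數位訊號處理器實現 (Digital Signal Processor Implementation of IEEE 802.16m Initial Downlink Synchronization), pp. 85-99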