
Chapter 3 Symbol Synchronization Algorithms

3.2 Coarse Symbol Synchronization

3.2.4 Proposed Mode/GI and Symbol Boundary Detection Scheme

Comparing the mode/GI detection expression with the boundary detection expression shows that their structures are very similar. The mode/GI detection block diagram is illustrated in Fig. 3.15. The architecture is derived from the NMC architecture by replacing the division with a subtraction, as derived in Eqn. (3.4). As a result, mode/GI detection and symbol boundary detection can share the same hardware, with the correlation and moving sum delay-line lengths selected by the controller. After mode/GI detection, the controller changes the correlation and moving sum delay-line lengths for symbol boundary detection.

Fig. 3.15 Block diagram of mode/GI detection

Fig. 3.16 illustrates the proposed hybrid architecture for mode/GI and boundary detection. In the correlation part, the 2K, 4K and 8K delays for the different transmission modes are realized by a 2K/4K/8K triple-mode reconfigurable delay-line, referred to as the correlation delay-line in this architecture. For the four guard interval lengths of the three transmission modes, there are six possible moving sum lengths in total; they are realized by two 2K/1K/512/256/128/64 reconfigurable delay-lines, referred to as the moving sum (MS) delay-lines.

[Fig. 3.16 block diagram. Blocks: correlator on r(n) with conjugate [ ]* and 2K/4K/8K correlation delay; moving sum with two 2K/1K/512/256/128/64 delay-lines; | |2 blocks; MC2 output; threshold detector; CSS controller; gates passing data to the FFT and to FCFO.]

Fig. 3.16 Architecture of mode/GI and boundary detection scheme

The proposed mode/GI and CSS scheme can be divided into three stages: the first stage is mode detection, the second stage is guard interval length calculation, and the final stage is symbol boundary detection. As Fig. 3.17 shows, the system detects the mode starting from 2K, then 4K and finally 8K. It spends 2K samples filling the correlation delay-line and 64 samples (the minimum guard interval length of the 2K mode) filling the moving sum delay-line, and then enters 2K mode detection. The 2K mode is chosen as the first detected mode because it needs a latency of only 2K samples to fill the correlation delay-line, whereas the other candidates need 4K or 8K samples. Because the delay-line design, based on the twister memory access method, overlaps the filling time with the previous mode detection time, the correlation delay-line does not have to be refilled or replenished when the detection mode changes. As a result, the correlation delay-line filling latency of the proposed scheme is reduced to only 2K samples.

Fig. 3.17 Finite state machine of mode/GI and boundary detection scheme

The pseudo code of the proposed scheme is shown below.

1. Set the detection mode as 2K and fill the correlation delay-line with 2K samples.

2. Fill the moving sum delay-line with (detection mode)/32 samples. Once the moving sum is filled and threshold=0, go to Step 3.

3. Run detection for the current detection mode over a pre-defined detection period. If threshold=1, jump to Step 4. Otherwise, if the detection counter=detection period, set the detection mode to detection mode×2 and jump to Step 2.

4. Count the length of the period during which threshold=1. When threshold=0, set the moving sum length to the detected GI length and go to Step 5.

5. Refill the moving sum delay-line, then find the maximum correlation result to locate the boundary.

6. Derive the next boundaries using the detected mode, GI length and boundary.
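Steps 1 to 4 above can be sketched as a small software model. This is only an illustrative sketch, not the thesis hardware: the callback `threshold_fn(n, mode)` (the threshold detector output at sample `n` when the delay-lines are configured for `mode`), the `max_samples` guard, the use of the minimum stay time (1+1/32)×mode as the detection period, and the snapping of the measured plateau length to the nearest legal GI length are all assumptions made for the example.

```python
def detect_mode_gi(threshold_fn, max_samples=1 << 16):
    """Return (mode, gi_length) in samples, or None if no plateau is seen.

    threshold_fn(n, mode) is a hypothetical stand-in for the threshold
    detector output at sample n under the given detection mode.
    """
    n = 2048  # Step 1: 2K samples fill the correlation delay-line
    for mode in (2048, 4096, 8192):
        # Step 2: fill the moving sum with mode/32 samples, then wait
        # for threshold=0 so counting starts at the head of a plateau.
        n += mode // 32
        while threshold_fn(n, mode):
            n += 1
            if n >= max_samples:
                return None
        # Step 3: watch for a rising threshold during the detection period
        # (assumed here to be the minimum stay time, (1+1/32) x mode).
        for _ in range(mode + mode // 32):
            if threshold_fn(n, mode):
                # Step 4: count how long the threshold stays high.
                start = n
                while threshold_fn(n, mode) and n < max_samples:
                    n += 1
                run = n - start
                # Assumption: snap the measured length to the nearest GI.
                legal = [mode // d for d in (32, 16, 8, 4)]
                return mode, min(legal, key=lambda g: abs(g - run))
            n += 1
        # No plateau: the loop moves on to the next mode (mode x 2).
    return None
```

For instance, an idealized 4K signal with a 1/8 guard interval can be modeled as `lambda n, mode: mode == 4096 and n % 4608 < 512`; the model then passes through the 2K stage without a hit and reports the plateau in the 4K stage.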

After a detection period of (1+1/32)×2K ~ 2×(1+1/4)×2K samples without a threshold event, the system concludes that the transmission mode is not 2K and enters the “4K Dummy State” to perform 4K mode detection. Otherwise, if the threshold changes from low to high during the 2K mode detection period, the transmission mode is 2K; the same applies to the other detection modes. The system jumps to the “GI Detection State” as soon as the threshold goes high and measures how long it stays high. This period length is taken as the guard interval length. If the moving sum window initially happens to fall inside a guard interval period, the threshold goes high before the system leaves the “Dummy State”, which could lead to an incorrect guard interval length. To prevent this, the system does not enter the “Mode Detection State” until the threshold has returned to low. Thus, the system only starts to measure the guard interval length at the beginning of a plateau.

After mode/GI detection, the system enters the “Dummy Find Boundary State” and uses the detected guard interval length to refill the moving sum delay-line. Once the moving sum delay-line is refilled, the system obtains the best MC2 result. Observing the MC2 results during the plateau, the location of the maximum MC2 result minus half of the guard interval length gives the rough symbol boundary. Finally, the system predicts the successive boundaries from the detected mode, the GI length and the first boundary.
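The boundary rule described above (peak position minus half the guard interval, then one boundary every mode+GI samples) can be written as a minimal sketch. The function names and the plain list used for the MC2 values are assumptions for illustration only.

```python
def rough_boundary(mc2, gi_len):
    """Index of the MC2 peak minus half the guard interval length."""
    peak = max(range(len(mc2)), key=mc2.__getitem__)
    return peak - gi_len // 2


def next_boundaries(first, mode, gi_len, count):
    """Predict later boundaries: one OFDM symbol spans mode + gi_len samples."""
    return [first + i * (mode + gi_len) for i in range(1, count + 1)]


# Toy MC2 trace with a single peak at sample 300 (made-up data).
mc2 = [0.0] * 500
mc2[300] = 5.0
first = rough_boundary(mc2, gi_len=64)
print(first, next_boundaries(first, mode=2048, gi_len=64, count=2))
```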

The time required by the proposed scheme is analyzed below. For the 2K transmission mode, it needs 2K samples to fill the correlation delay-line, 64 samples to fill the moving sum delay-line, 1 ~ (1+1/32)×2K samples for mode detection, 64~512 samples to calculate the guard interval length and (1+1/32)×2K ~ (1+1/4)×2K samples for boundary detection. In theory, the worst case for completing mode/GI and symbol boundary detection is 7296 samples. The system stays in the “2K Mode Detection State” for at least (1+1/32)×2K samples to guarantee that a possible peak/plateau is detected; with a shorter stay, the detector may miss the peak/plateau. After this, the system jumps to the “4K Dummy State”, switches the correlation delay-line to 4K mode and refills the moving sum delay-line. When the mode changes to 4K, the correlation delay-line does not need to be refilled or replenished, because the signal r(n-4K) is already in the correlation delay-line. The remaining procedures are similar to those of the 2K mode, and the same holds for the 8K mode. Table 3-3 lists the required samples for the different transmission modes.

Table 3-3 Timing for mode/GI and boundary detection

Mode  Modedet_t                          GIdet_t  Boundarydet_t  Totaldet_t
2K    2K+64+(1~2K+64)                    64~512   2K+2K_GIdet_t  2K_Modedet_t+2K_GIdet_t+2K_Boundarydet_t
4K    Max(2K_Modedet_t)+128+(1~4K+128)   128~1K   4K+4K_GIdet_t  4K_Modedet_t+4K_GIdet_t+4K_Boundarydet_t
8K    Max(4K_Modedet_t)+256+(1~8K+256)   256~2K   8K+8K_GIdet_t  8K_Modedet_t+8K_GIdet_t+8K_Boundarydet_t
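As a cross-check of the worst-case figure quoted in the text, the upper bounds of Table 3-3 can be evaluated numerically. The sketch below simply plugs the largest value of each range into the table expressions; the function name `worst_case` and the dictionary layout are assumptions for the example.

```python
K = 1024


def worst_case(mode, prev_modedet_max=None):
    """Upper bounds of the Table 3-3 entries for one mode, in samples."""
    ms_fill = mode // 32             # moving sum fill (shortest GI)
    detect = mode + mode // 32       # longest mode detection stay
    if prev_modedet_max is None:
        # 2K: the correlation delay-line must be filled first.
        modedet = mode + ms_fill + detect
    else:
        # 4K/8K: start from the worst case of the previous mode.
        modedet = prev_modedet_max + ms_fill + detect
    gidet = mode // 4                # longest guard interval (1/4)
    boundary = mode + gidet
    return modedet, gidet, boundary, modedet + gidet + boundary


table = {}
prev = None
for mode in (2 * K, 4 * K, 8 * K):
    table[mode] = worst_case(mode, prev)
    prev = table[mode][0]
    print(mode, table[mode])
```

For the 2K mode the total evaluates to 7296 samples, matching the worst case stated above.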

Compared with the parallel mode detection shown in Fig. 3.18 (a), (b) and (c), the proposed scheme, shown in Fig. 3.18 (f), has the same latency for the 2K transmission mode and, for the other modes, an extra delay that depends on the mode detection period. However, only a single set of hardware is needed to detect all modes. Compared with the sequential mode detection shown in Fig. 3.18 (d) and (e), the proposed scheme saves up to 12K samples by using the twister memory access based delay-line. In summary, the proposed scheme takes only 384 samples (2.27%) more than parallel 8K mode detection, 12K samples (41.56%) fewer than sequential mode detection with refill, and 6K samples (26.23%) fewer than sequential mode detection with replenish.

[Fig. 3.18 timing diagram. Segments along the time axis t, in samples: correlation delay-line filling, moving sum delay-line filling, and 2K/4K/8K mode detection. Totals in parentheses.]

(a) 2K, 64, 2K+64 (4224)

(b) 4K, 128, 4K+128 (8448)

(c) 8K, 256, 8K+256 (16896)

(d) 2K, 64, 2K+64, 4K, 128, 4K+128, 8K, 256, 8K+256 (29568)

(e) 2K, 64, 2K+64, 2K, 128, 4K+128, 4K, 256, 8K+256 (23424)

(f) 2K, 64, 2K+64, 128, 4K+128, 256, 8K+256 (17280)

Fig. 3.18 Timing of parallel (a) 2K, (b) 4K, (c) 8K, sequential (d) refill, (e) replenish and (f) proposed mode detection schemes
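The totals and percentages quoted for Fig. 3.18 can be reproduced with a few lines of arithmetic. The sketch below assumes, as in the figure, that each detection stage costs a correlation delay-line fill, a moving sum fill of mode/32 samples and a detection period of mode+mode/32 samples; the helper name `stage` is hypothetical.

```python
K = 1024
MODES = (2 * K, 4 * K, 8 * K)


def stage(corr_fill, mode):
    """Samples spent in one stage: fills plus the mode detection period."""
    return corr_fill + mode // 32 + (mode + mode // 32)


parallel_8k = stage(8 * K, 8 * K)                       # panel (c)
refill = sum(stage(m, m) for m in MODES)                # panel (d)
replenish = (stage(2 * K, 2 * K) + stage(2 * K, 4 * K)
             + stage(4 * K, 8 * K))                     # panel (e)
proposed = (stage(2 * K, 2 * K) + stage(0, 4 * K)
            + stage(0, 8 * K))                          # panel (f)

for name, ref in (("parallel 8K", parallel_8k),
                  ("sequential refill", refill),
                  ("sequential replenish", replenish)):
    diff = proposed - ref
    print(f"{name}: {ref} samples, proposed differs by {diff} "
          f"({100 * abs(diff) / ref:.2f}%)")
```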

Fig. 3.19 (a), (b), (c) and (d) show the simulation results of the proposed scheme for the 2K and 8K transmission modes in MC2 and NMC, using the same pattern as Fig. 3.3. The aliasing peak occurs because the moving sum has not yet accumulated enough values, and it only appears in the “Dummy States”. As Fig. 3.19 shows, without the assistance of NMC the correlation values during the GI detection period are much smaller than those during the boundary detection period, so the system cannot easily delimit the GI detection period.

With the assistance of NMC, delimiting the GI detection period and computing the period length become much easier. Furthermore, to improve reliability, the threshold detector is realized with an 8-state confidence counter. The simulation environment is 2K/8K transmission symbols with 1/4 GI, a CFO of 23.33, 12 dB SNR and a Rayleigh channel.

[Fig. 3.19 plots. Horizontal axis: samples, 0 to 7000. Annotated regions: 2K/8K Mode Detection, GI Detection, Boundary, Aliasing Peak.]

Fig. 3.19 (a) 2K MC2, (b) 2K NMC, (c) 8K MC2 and (d) 8K NMC simulation of the proposed blind mode/GI boundary detection scheme

In summary, this chapter proposes an efficient sequential mode/GI and boundary detection scheme that uses the normalized plateau to compute the guard interval length and shares a single set of hardware between mode/GI detection and boundary detection.

Chapter 4

Channel Estimation Algorithms

This chapter focuses on channel estimation issues. First, scattered pilot synchronization (SPS), also called scattered pilot mode detection, is discussed; its purpose is to extract the correct scattered pilots for channel estimation. After the scattered pilot mode is detected, the channel estimator collects the scattered pilots and uses their values to estimate the channel response. Finally, a hybrid architecture combining a division-free equalizer and three-stage demapping is described.

4.1 Scattered Pilot Synchronization

The inserted scattered pilots are used to estimate the channel response. Unfortunately, unlike the continual pilots, the scattered pilots are not placed at fixed locations. As Eqn. (2.2) shows, there are four scattered pilot distribution modes. Typically, the scattered pilot distribution is known after TPS synchronization, but decoding the TPS pilots takes 17~68 symbols, which is too long. For DVB-H applications, the synchronization time is strictly limited. Therefore, a fast scattered pilot synchronization technique is needed to shorten the synchronization time. This section introduces two fast scattered pilot detection algorithms and compares their simulated performance and hardware complexity. Furthermore, a two-stage scattered pilot synchronization scheme is described, together with pre-filling of the extracted scattered pilots that channel estimation requires to estimate the channel response.

4.1.1 Fast Scattered Pilot Synchronization Algorithms

In order to detect the scattered pilot distribution mode as soon as possible, two fast scattered pilot synchronization algorithms were proposed in [22] [23] and will be discussed below.

a) Power-Based Scattered Pilots Mode Detection

The first algorithm is the power-based scattered pilot mode detection algorithm (PB), given in Eqn. (4.1), which is evaluated on the sub-carriers of the nth OFDM symbol. The power-based algorithm exploits the fact that the scattered pilots are boosted by 4/3. Because of this boost, only the correct scattered pilot positions accumulate the largest power. By comparing the power accumulated over the four possible scattered pilot distributions, the scattered pilot mode can be detected.

b) Correlation-Based Scattered Pilots Mode Detection

The second algorithm is the correlation-based scattered pilot mode detection algorithm (CB). It uses an approach similar to Eqn. (4.1), but multiplies each sub-carrier by the same sub-carrier located four OFDM symbols earlier instead of by itself. This method is more robust against noise. However, if the Doppler effect changes the channel response significantly within four symbol times, the performance of this algorithm degrades. Eqn. (4.2) expresses the correlation-based scattered pilot mode detection algorithm mathematically.

[Eqn. (4.2): correlation-based mode decision over the candidate modes {1,2,3,4}]

Compared with the second method, the first one is more robust against the Doppler effect and, because each sub-carrier is multiplied by itself, does not incur a latency of four symbols. This implies that the power-based algorithm is four symbol times faster than the correlation-based one and requires no sub-carrier storage memory to hold the scattered pilots of the four distributions from four symbols earlier. Although the scattered pilots could be stored in the channel estimation storage memories without additional hardware overhead, its robustness against the Doppler effect and its faster result make the power-based algorithm the choice for scattered pilot mode detection.
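The two decision rules described above can be illustrated with a small sketch. It is not a reproduction of Eqn. (4.1) or Eqn. (4.2), which are not legible here; it assumes the DVB-T scattered pilot comb (candidate m occupies carriers k = 3m + 12p), a pilot amplitude of 4/3 relative to unit-power data, pilot values that repeat on the same carrier four symbols later, and a noiseless flat channel. The helper `make_symbol`, the alternating pilot sign (a stand-in for the reference sequence) and the mode indices 0 to 3 are assumptions for illustration.

```python
import random

N_CARRIERS = 1705   # active carriers in the 2K mode
BOOST = 4 / 3       # assumed pilot amplitude relative to unit-power data


def make_symbol(mode, rng):
    """Synthetic symbol: unit-power QPSK data plus boosted pilots on comb `mode`."""
    qpsk = (1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j)
    sym = [rng.choice(qpsk) / abs(1 + 1j) for _ in range(N_CARRIERS)]
    for k in range(3 * mode, N_CARRIERS, 12):
        # Alternating sign stands in for the pilot reference sequence.
        sym[k] = complex(BOOST if (k // 3) % 2 else -BOOST)
    return sym


def pb_mode(symbol):
    """Power-based: pick the comb with the largest accumulated power."""
    power = [sum(abs(symbol[k]) ** 2 for k in range(3 * m, len(symbol), 12))
             for m in range(4)]
    return max(range(4), key=power.__getitem__)


def cb_mode(symbol, symbol_4_back):
    """Correlation-based: correlate with the symbol four symbols earlier."""
    corr = [abs(sum(symbol[k] * symbol_4_back[k].conjugate()
                    for k in range(3 * m, len(symbol), 12)))
            for m in range(4)]
    return max(range(4), key=corr.__getitem__)


rng = random.Random(0)
current = make_symbol(2, rng)
earlier = make_symbol(2, rng)   # same pilot comb four symbols earlier
print(pb_mode(current), cb_mode(current, earlier))
```

In this noiseless setting both rules pick the transmitted comb; the trade-off discussed in the text (noise robustness against Doppler sensitivity and latency) only appears once noise and channel variation are added.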

4.1.2 Performance Simulation and Comparisons

a) Performance Simulation

Fig. 4.1 shows the error rate of the two algorithms above under different Signal-to-Noise Ratios (SNR). The environment is 1000 symbols in 2K transmission mode with 1/4 guard interval length, a Rayleigh channel, and a symbol boundary 10 samples earlier than the correct one. The residual CFOs after fractional and integer CFO compensation are (a) zero, (b) 0.03 sub-carrier and (c) 0.33 sub-carrier. As shown in Fig. 4.1, the error rate of the power-based algorithm starts to increase when the SNR falls below 2 dB, and that of the correlation-based algorithm starts to increase when the SNR falls below -6 dB. Generally, as defined in [1], the SNR is larger than 3 dB, so only occasional scattered pilot mode detection errors need to be considered.

[Fig. 4.1 plots, panels (a), (b), (c). Horizontal axis: SNR (dB), -21 to 6 (to 12 in panel (c)). Vertical axis: Error Rate (%), logarithmic, 10^-1 to 10^2. Curves: Power-Based, Correlation-Based.]

Fig. 4.1 Error rate of two SPS algorithms versus SNR with (a) 0 (b) 0.03 (c) 0.33 CFO

b) Hardware Complexity

The architecture of power-based and correlation-based scattered pilot mode detection is shown in Fig. 4.2.

[Fig. 4.2 block diagrams. (a) FFT output, conjugate [ ]*, multiplier, 4-to-1 mux, four register groups D for Mode 0 to Mode 3, SPS controller. (b) Same structure with an added delay (4N) before the conjugate and a | |2 block.]

Fig. 4.2 Architecture of (a) power-based (b) correlation-based SPS

The power-based algorithm needs a complex multiplier to correlate each value with its own conjugate, an adder for the summation and four register groups to hold the values. The correlation-based algorithm needs two adders because the output of the correlation expression is a complex number. This also doubles the size of the register groups. In addition, a second complex multiplier is required for the squaring operation.

Table 4-1 summarizes the two algorithms. The correlation-based algorithm needs more area and more time to reach a decision than the power-based algorithm. Therefore, the power-based algorithm, with its low area cost and fast detection, is adopted as the scattered pilot mode detection algorithm in this thesis.

Table 4-1 Statistics of power-based and correlation-based architectures

Name               Hardware                                                                   Time       Other
Power-Based        Complex Multiplier×1, Adder×1, Register Groups×4                           1 Symbol   Better for Doppler
Correlation-Based  Complex Multiplier×2, Adder×2, Register Groups×8, Delay Element×1          5 Symbols  Better for noise

4.1.3 Proposed Two-Stage Scattered Pilot Synchronization Scheme

Generally, under the specified SNR, both the power-based (PB) and correlation-based (CB) scattered pilot synchronization algorithms seldom fail. However, an unexpected failure may still occur. Therefore, the two-stage scattered pilot synchronization scheme illustrated in Fig. 4.3 is proposed to improve reliability.

The two-stage scheme performs the scattered pilot synchronization process twice: the first pass detects the scattered pilot mode and the second pass confirms the result of the first. If the mode detected in the second pass differs from the mode predicted from the first pass, the system repeats the two-stage scattered pilot synchronization process until they agree.
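The retry loop of the two-stage scheme can be sketched as follows for the PB-PB case, where the two passes run on consecutive symbols. The sketch assumes the scattered pilot mode index advances by one every OFDM symbol, modulo 4, and takes a list of per-symbol single-stage decisions as input; the function name and the return convention are assumptions for illustration.

```python
def two_stage_sps(detected):
    """Confirm a mode with two consecutive single-stage decisions.

    detected[i] is the mode (0..3) reported for the i-th symbol.
    Returns (index of the confirming symbol, confirmed mode) or None.
    """
    i = 0
    while i + 1 < len(detected):
        predicted = (detected[i] + 1) % 4   # mode expected one symbol later
        if detected[i + 1] == predicted:
            return i + 1, predicted
        i += 2                              # mismatch: repeat both stages
    return None


print(two_stage_sps([0, 1, 2, 3]))   # agreement on the second symbol
print(two_stage_sps([0, 3, 2, 3]))   # one error, confirmed on the fourth symbol
```

With error-free decisions the mode is confirmed on the second symbol; a single wrong decision pushes the confirmation to the fourth symbol, which corresponds to the PB-PB latencies of two and four symbols.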

Fig. 4.3 Two-stage scattered pilot synchronization scheme

The first and second scattered pilot synchronization processes can be built from any of the combinations PB-PB, PB-CB, CB-PB or CB-CB. Table 4-2 lists the number of symbol times required for scattered pilot synchronization by PB, CB and the four combinations above.

Table 4-2 Latency of different combinations (in symbols)

            One stage     Two stage
            PB    CB      PB-PB  PB-CB  CB-PB  CB-CB
No Error    1     5       2      5      6      6
An Error    1     5       4      10     12     12

PB-PB performs the first SPS process at the 1st symbol and the second at the 2nd symbol. PB-CB performs the first SPS process at the 1st or the 4th symbol and the second at the 5th symbol. CB-PB performs the first SPS process at the 5th symbol and the second at the 6th symbol. Finally, CB-CB performs the first SPS process at the 5th symbol and the second at the 6th symbol. As shown in Table 4-2, the penalty of an error is that the two-stage scattered pilot synchronization process must be repeated, which delays the channel estimation process by at least one (PB-PB) to six (CB-PB or CB-CB) symbols.

To show that the proposed two-stage scattered pilot synchronization scheme improves the reliability of the original scattered pilot synchronization algorithms, a scattered pilot detection error rate simulation is shown in Fig. 4.4. The test environment is the same as for Fig. 4.1. For the two-stage scheme, the second stage tests all 1000 symbols and the symbols tested by the first stage are inferred backward from it. Therefore, the two-stage scheme tests all 1000 symbols overall. Because the two-stage scheme discards doubtful answers, the error rate is derived from the ratio (number of correct detection results that pass the two-stage scattered pilot synchronization scheme) / (number of detection results that pass the two-stage scattered pilot synchronization scheme). Moreover, a three-stage PB3 SPS scheme is added to the simulation. As the simulation shows, the three-stage PB3 scheme performs very close to single-stage CB. Therefore, using PB schemes with three or more stages to approximate the performance of single-stage CB is worth considering.

[Fig. 4.4 plots, panels (a), (b), (c). Horizontal axis: SNR (dB), -21 to 6 (to 12 in panel (c)). Vertical axis: Error Rate (%), logarithmic, 10^-1 to 10^2. Curves: PB, CB, PB-PB, PB-CB, CB-PB, CB-CB, PB-PB-PB.]

Fig. 4.4 Error rate of SPS algorithms versus SNR with (a) 0 (b) 0.03 (c) 0.33 CFO

The simulation results in Fig. 4.4 show that two-stage PB-PB scattered pilot synchronization performs better than single-stage PB scattered pilot synchronization. Since CB is more robust against noise, the two-stage curves that include at least one CB stage show a lower error rate than single-stage CB or PB. Therefore, the two-stage scattered pilot synchronization scheme does improve reliability.

4.1.4 Proposed Scattered Pilots Pre-Filling Scheme

As the latencies in Table 4-2 show, the two-stage scattered pilot synchronization scheme improves reliability at the cost of one to six additional symbols of latency before channel estimation can start. To reduce or eliminate the latency listed in Table 4-2, a pre-filling scheme for the channel estimation storage elements is proposed below:

1. 1st SPS: Fill the scattered pilots of four possible locations into four pilot storage elements individually.

2. Set predicted mode = mod(1st SPS detected mode+1,4).

3. Fill the predicted mode pilots into the storage element next to previous one.

4. Set predicted mode = mod(predicted mode+1,4).

5. If the state is 2nd SPS, go to Step 6; otherwise go to Step 3.

6. If the predicted mode = 2nd SPS detected mode, go to Step 7; otherwise go to Step 1.

7. Fill the remaining storage elements and start channel estimation.

The principle of the scattered pilots pre-filling scheme is to store the pilots of all four possible modes during the first scattered pilot synchronization process, and then to use the detected mode to predict and store the scattered pilots of the following symbols until the second scattered pilot synchronization process confirms whether the mode detected in the first process is correct. Fig. 4.5 shows an example of the two-stage PB-PB scattered pilot synchronization process with scattered pilots pre-filling for 2-D channel estimation, which requires the scattered pilots of at least six symbols to be stored and will be discussed later. In this example an error occurs, and SP(n,m) represents the mode m scattered pilot distribution of symbol n.

Fig. 4.5 Example of pre-filling scheme with one error occurring

Overall, Table 4-3 lists the required timing of channel estimation pre-processing (timing for SPS and collecting scattered pilots) and different schemes in 2-D channel estimation. Although the error probability under specified SNR is a visual zero from