
CHAPTER 4. A HIGH SPEED AND LOW COMPLEXITY FREQUENCY

4.3.4 Threshold search

The choice of threshold depends on the required system performance and on the CFO environment. In the FV environment, the fine estimation must be turned on with higher probability to maintain performance, so the threshold should be decreased. In the SV environment, the threshold can be increased, which lowers the turn-on probability of the fine estimation and therefore the complexity. We simulated the PER with different thresholds in two CFO environments: SV, shown in Figure 4.12, and FV, shown in Figure 4.13. According to Table 4.2, the SNR loss at a threshold of 10ppm relative to perfect synchronization is 0.6dB for 8% PER. When the threshold is raised from 10ppm to 12ppm, however, the SNR loss exceeds 1dB. A threshold of 10ppm is therefore chosen for the SV environment to achieve low design complexity with an acceptable performance loss.

Figure 4.12 Threshold search in 110Mb/s SV environment
(Plot of PER versus SNR in dB. Curves: SV perfect sync., SV 10ppm, SV 12ppm. Reference line: PER = 8%.)

Table 4.2 Performance of different thresholds in SV environment

| Threshold | SNR for 8% PER (dB) | SNR loss (dB) |
|---|---|---|
| Perfect synchronization | 6.4 | 0 |
| 10ppm | 7.0 | 0.6 |
| 12ppm | 7.8 | 1.4 |

According to Table 4.3, the SNR loss at a threshold of 2ppm relative to perfect synchronization is 0.1dB for 8% PER. When the threshold is raised from 2ppm to 4ppm, however, the SNR loss reaches 1dB. A threshold of 2ppm is therefore chosen for the FV environment to keep the performance loss acceptable. Because of this lower threshold, the complexity of the CFO estimation cannot be reduced as much in the FV environment as in the SV environment. In this sense the proposed design is power-aware: the estimation effort adapts to the CFO environment.

Figure 4.13 Threshold search in 110Mb/s FV environment
(Plot of PER versus SNR in dB. Curves: FV perfect sync., FV 2ppm, FV 4ppm. Reference line: PER = 8%.)

Table 4.3 Performance of different thresholds in FV environment

| Threshold | SNR for 8% PER (dB) | SNR loss (dB) |
|---|---|---|
| Perfect synchronization | 7.0 | 0 |
| 2ppm | 7.1 | 0.1 |
| 4ppm | 8.0 | 1.0 |
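The threshold choice in Tables 4.2 and 4.3 can be restated as a small selection rule. The sketch below is only an illustration: the SNR values are copied from the two tables, while the function names and the loss budget of "strictly below 1dB" are assumptions made here to reproduce the choices described in the text.

```python
# SNR (dB) required for 8% PER, copied from Tables 4.2 and 4.3.
PERFECT_SYNC_DB = {"SV": 6.4, "FV": 7.0}
SNR_AT_8PCT_PER_DB = {
    "SV": {10: 7.0, 12: 7.8},  # threshold in ppm -> required SNR in dB
    "FV": {2: 7.1, 4: 8.0},
}


def snr_loss_db(env, threshold_ppm):
    """SNR loss relative to perfect synchronization for one threshold."""
    return SNR_AT_8PCT_PER_DB[env][threshold_ppm] - PERFECT_SYNC_DB[env]


def pick_threshold(env, max_loss_db=1.0):
    """Largest threshold whose SNR loss stays strictly below the budget.

    The 1 dB budget is an assumption for this sketch. A larger threshold is
    preferred because it turns the fine estimation on less often.
    """
    ok = [t for t in SNR_AT_8PCT_PER_DB[env] if snr_loss_db(env, t) < max_loss_db]
    return max(ok) if ok else None


for env in ("SV", "FV"):
    print(env, pick_threshold(env), "ppm")
```

With the tabulated values this rule selects 10ppm for SV and 2ppm for FV, matching the thresholds chosen in the text.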

Chapter 5.

Simulation Results and Performance Analysis

To verify the proposed design, complete system platforms for IEEE 802.11a and the UWB proposals were built in Matlab. These platforms were introduced in Chapter 2. In the following analysis, the performance of the proposed design is simulated and compared with conventional approaches.

5.1 Performance Analysis of the Proposed Frequency Synchronizer for OFDM WLAN Systems

The proposed frequency synchronizer for OFDM-based wireless systems is simulated on a system platform compliant with the IEEE 802.11a PHY. The PER analysis focuses on 10% PER, which is the requirement in the IEEE 802.11a standard.

5.1.1 CFO Estimation Accuracy Analysis

RMSE Analysis

To analyze the CFO estimation accuracy of the proposed frequency synchronizer, the root-mean-square error (RMSE) between the estimated CFO and the real CFO is measured, as shown in Figure 5.1. Because AGC and packet detection consume several short training symbols before frequency synchronization, we use three short symbols for coarse CFO estimation and one short symbol for sample-power detection. We first simulated four cases with a known sample-power distribution and no multipath effect, for which sample-power detection is not needed. These four curves show that, at SNR below 5dB, using even samples in the short symbol and odd samples in the long symbol (the square-mark curve) is more accurate than the opposite choice (the triangle-up-mark curve), with the 100% memory case (the circle-mark curve) as the reference. Under multipath, however, the known sample-power distribution is disturbed and the estimation accuracy degrades, as the diamond-mark curve shows. To overcome this problem, sample-power detection can be applied to improve the estimation accuracy, as the proposed curve shows. The proposed curve does not reach the square-mark curve because only one short symbol is used for power detection, so the probability of a correct decision cannot reach 100%.
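As a rough illustration of the kind of measurement behind this RMSE analysis, the sketch below uses a generic delay-and-correlate CFO estimator on a periodic preamble and optionally keeps only every second correlation product. It is a textbook-style estimator, not the thesis's own equations, which are not reproduced in this section; the function names, the toy preamble and the 200kHz offset are assumptions. The 16-sample short symbol and the 20MS/s rate follow IEEE 802.11a.

```python
import numpy as np


def estimate_cfo_hz(rx, delay, fs_hz, step=1, offset=0):
    """Delay-and-correlate CFO estimate over a periodic preamble.

    step=2 keeps every second correlation product (offset=0 for even
    samples, offset=1 for odd samples), which halves the number of delayed
    samples that must be stored.
    """
    products = np.conj(rx[:-delay]) * rx[delay:]
    corr = np.sum(products[offset::step])
    return np.angle(corr) * fs_hz / (2.0 * np.pi * delay)


def rmse(estimates, truth):
    """Root-mean-square error of a list of estimates against the true value."""
    err = np.asarray(estimates, dtype=float) - truth
    return float(np.sqrt(np.mean(err ** 2)))


# Toy preamble: 4 repetitions of a random 16-sample symbol at 20 MS/s,
# rotated by a 200 kHz CFO (all illustrative values).
rng = np.random.default_rng(0)
symbol = rng.standard_normal(16) + 1j * rng.standard_normal(16)
fs_hz, cfo_hz = 20e6, 200e3
n = np.arange(64)
clean = np.tile(symbol, 4) * np.exp(2j * np.pi * cfo_hz * n / fs_hz)


def trial_rmse(step, snr_db, trials=200):
    """Monte Carlo RMSE (Hz) of the estimator in white Gaussian noise."""
    noise_power = np.mean(np.abs(clean) ** 2) / 10 ** (snr_db / 10)
    sigma = np.sqrt(noise_power / 2)
    estimates = []
    for _ in range(trials):
        noise = sigma * (rng.standard_normal(64) + 1j * rng.standard_normal(64))
        estimates.append(estimate_cfo_hz(clean + noise, 16, fs_hz, step=step))
    return rmse(estimates, cfo_hz)


full = estimate_cfo_hz(clean, 16, fs_hz)          # all samples
even = estimate_cfo_hz(clean, 16, fs_hz, step=2)  # even samples only
```

Without noise both variants recover the offset. Calling `trial_rmse(1, 5.0)` and `trial_rmse(2, 5.0)` gives a feel for how much accuracy is traded for the smaller storage at a given SNR.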

Figure 5.1 RMSE analysis of the proposed design
(Plot of RMSE versus SNR in dB. Simulated packets per SNR: 5000, CFO: 40ppm. Curves:
short (100% memory), long (100% memory), RMS=0, without decision;
short-even (50% memory), long-odd (50% memory), RMS=0, without decision;
short (50% memory), long (50% memory), RMS=150ns, with decision (proposed);
short-even (25% memory), long-odd (25% memory), RMS=0, without decision;
short-even (50% memory), long-odd (50% memory), RMS=150ns, without decision;
short-odd (50% memory), long-even (50% memory), RMS=0, without decision.)

PER Analysis

To analyze the performance of the proposed design, PER is simulated with a typical indoor wireless channel model with 50ns multipath RMS delay spread, 40ppm CFO and 40ppm SCO. The PER curves at 6Mb/s and 54Mb/s for perfect synchronization (CFO estimation error = 0.0ppm), the 100% memory approach and the proposed design are shown in Figure 5.2. The SNR loss of the proposed design (50% memory) relative to the 100% memory approach is only 0.1dB at 6Mb/s and 0.13dB at 54Mb/s for 10% PER. The memory size and computational complexity can thus be reduced from 100% to 50% with very low SNR loss.

Figure 5.2 PER of the proposed design in (a) 6Mb/s (b) 54Mb/s data rates
(Two plots of PER versus SNR in dB. Curves in each panel: perfect sync., 100% memory, proposed with 50% memory. Reference line: PER = 10%.)

5.1.2 System Performance

The PER curves of the OFDM-based WLAN system at data rates from 6Mb/s to 54Mb/s are shown in Figure 5.3, and the SNR required by the design for 10% PER is listed in Table 5.1. Compared with perfect synchronization, the SNR loss of the proposed design is 0.15 ~ 0.38dB for 10% PER. Figure 5.4 shows the root-mean-square error (RMSE) of the proposed CFO estimation. Within the ±100ppm estimation range, the estimation RMSE is ≤ 1ppm when the SNR is ≥ 5dB. The simulation results show that the proposed design achieves low SNR loss and meets the estimation range requirement of OFDM WLAN systems.

Figure 5.3 System PER performance
(Plot of PER versus SNR in dB. Curves: 54Mb/s, 48Mb/s, 36Mb/s, 24Mb/s, 18Mb/s, 12Mb/s, 9Mb/s, 6Mb/s. Reference line: PER = 10%. Conditions: 40ppm CFO and 40ppm sampling-clock offset; simulated packets per SNR: 1000; data bytes per packet: 1000.)

Table 5.1 Required SNR for 10% PER

| Data rate (Mbits/s) | Proposed design (dB) | Perfect sync. (dB) | SNR loss (dB) | IEEE 802.11a requirement |
|---|---|---|---|---|
| 6 | 3.13 | 2.76 | 0.37 | 9.7 |
| 9 | 4.92 | 4.7 | 0.22 | 10.7 |
| 12 | 6.10 | 5.88 | 0.22 | 12.7 |
| 18 | 8.85 | 8.48 | 0.37 | 14.7 |
| 24 | 11.35 | 11.10 | 0.25 | 17.7 |
| 36 | 14.88 | 14.73 | 0.15 | 21.7 |
| 48 | 18.94 | 18.62 | 0.32 | 25.7 |
| 54 | 20.62 | 20.24 | 0.38 | 26.7 |
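The SNR-loss column of Table 5.1 is simply the difference between the proposed-design and perfect-synchronization columns. The short sketch below recomputes it from the tabulated values and also derives the gap to the requirement column. The "margin" quantity and all variable names are introduced here for illustration only and do not appear in the original analysis.

```python
# (rate Mb/s, proposed dB, perfect-sync dB, IEEE 802.11a requirement) from Table 5.1
TABLE_5_1 = [
    (6, 3.13, 2.76, 9.7),
    (9, 4.92, 4.70, 10.7),
    (12, 6.10, 5.88, 12.7),
    (18, 8.85, 8.48, 14.7),
    (24, 11.35, 11.10, 17.7),
    (36, 14.88, 14.73, 21.7),
    (48, 18.94, 18.62, 25.7),
    (54, 20.62, 20.24, 26.7),
]

# SNR loss of the proposed design relative to perfect synchronization.
losses = {rate: round(prop - perfect, 2) for rate, prop, perfect, _ in TABLE_5_1}

# Illustrative margin: requirement column minus the proposed-design SNR.
margins = {rate: round(req - prop, 2) for rate, prop, _, req in TABLE_5_1}

print("SNR loss range (dB):", min(losses.values()), "to", max(losses.values()))
print("smallest margin (dB):", min(margins.values()))
```

The recomputed losses span the 0.15 ~ 0.38dB range quoted in the text.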

Figure 5.4 The CFO estimation performance in different CFO environment
(Plot of root-mean-square error in ppm versus original CFO in ppm, from -200ppm to 200ppm, with the -100ppm and 100ppm limits marked. Curves: proposed design at SNR = 5dB, 10dB and 20dB. Simulated packets per SNR and CFO: 5000, RMS = 150ns.)

5.2 Performance Analysis of the Proposed Frequency Synchronizer for UWB Systems

In this section, the simulated system PER with proposed (λ=4) data-partition-based, power-aware CFO estimation and approximate CFO compensation is analyzed in LDPC-COFDM and MB-OFDM UWB systems. Further, the simulated complexity and power reduction are also shown.

5.2.1 LDPC-COFDM UWB System Performance

PER curves of the LDPC-COFDM UWB system at data rates from 120Mb/s to 480Mb/s are shown in Figure 5.5. Compared with perfect synchronization, the SNR loss of the proposed design is 0.04 ~ 0.07dB for 8% PER. Figure 5.6 shows the root-mean-square error (RMSE) of the proposed CFO estimation, with the estimation range of ±45ppm of 10.6GHz marked. The results show that the proposed design, which has only 25% of the design complexity, achieves low SNR loss and meets the estimation range requirement. The computational complexity of the proposed design is listed in Table 5.3, where N is the number of samples in each repeated symbol and m is the number of auto-correlations. Compared with the conventional design with λ = 1, the proposed design saves about 75% of the memory capacity and computational requirements.

Figure 5.5 LDPC-COFDM UWB system PER
(Plot of PER versus SNR in dB. Curves: 120Mb/s, 240Mb/s and 480Mb/s, each with the proposed design and with perfect sync. Reference line: PER = 8%. Conditions: 40ppm CFO and 40ppm sampling-clock offset; simulated packets per SNR: 1500; data bytes per packet: 1024.)

Table 5.2 Required SNR for 8% PER

| Data rate (Mb/s) | Perfect sync. SNR (dB) | Proposed design (dB) | System required (dB) | SNR loss (dB) |
|---|---|---|---|---|
| 120 | 3.22 | 3.27 | 7.6 | 0.05 |
| 240 | 5.31 | 5.38 | 16.0 | 0.07 |
| 480 | 7.49 | 7.53 | 21.1 | 0.04 |

Figure 5.6 The CFO estimation performance in different CFO environment
(Plot of root-mean-square error in ppm versus channel CFO in ppm, from -80ppm to 80ppm, with the -45ppm and 45ppm limits marked. Curves: proposed design at SNR = 3dB, 6dB and 10dB. AWGN channel, 5000 packets per SNR and CFO.)

Table 5.3 Design complexity

| Item | Design with λ = 1 | The proposed design |
|---|---|---|
| Memory size of CFO estimation | N | N/4 |
| Multiplications of CFO estimation | N × m | (N/4) × m |
| Average phasor computations for each received sample | 1 | 1/4 |
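The entries of Table 5.3 all scale as 1/λ, which is where the roughly 75% saving for λ = 4 comes from. The sketch below spells out that arithmetic. The function names and the example values N = 164 and m = 2 are placeholders chosen for illustration, not figures taken from the design.

```python
from fractions import Fraction


def cfo_estimator_cost(n_samples, m_correlations, lam):
    """Cost entries of Table 5.3 for a data-partition factor lam."""
    return {
        "memory_samples": Fraction(n_samples, lam),
        "multiplications": Fraction(n_samples, lam) * m_correlations,
        "phasors_per_sample": Fraction(1, lam),
    }


def saving(conventional, proposed):
    """Fraction of each cost removed by the proposed design."""
    return {key: 1 - proposed[key] / conventional[key] for key in conventional}


# Hypothetical example values, only to exercise the formulas.
conventional = cfo_estimator_cost(164, 2, lam=1)
proposed = cfo_estimator_cost(164, 2, lam=4)

for key, value in saving(conventional, proposed).items():
    print(key, f"{float(value):.0%}")
```

Because every entry carries the same 1/λ factor, the saving is 1 - 1/λ regardless of the particular N and m.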

5.2.2 MB-OFDM UWB system performance

In this section, the simulated PER with the data-partition-based, power-aware CFO estimation and approximate CFO compensation scheme is analyzed in the MB-OFDM UWB system. To show that the power-aware CFO estimation still works, the CFO environment is slow-variant (SV) with phase noise, as described in section 2.3.5. The threshold in the SV environment is 10ppm (100kHz), as discussed in section 4.3.5.

System Performance

PER curves of the MB-OFDM UWB system at data rates from 110Mb/s to 480Mb/s are shown in Figure 5.7. Compared with perfect synchronization, the SNR loss of the proposed design is 0.5 ~ 0.6dB for 8% PER in the worst CM channel at 110Mb/s, 200Mb/s and 480Mb/s, as shown in Table 5.4. The design SNRs for 8% PER are 7dB, 15.1dB and 20.5dB, which meet the system requirements of 7.1dB, 15.2dB and 21.1dB.

Figure 5.7 MB-OFDM UWB system PER
(Plot of PER versus SNR in dB. Curves: 110Mb/s, 200Mb/s and 480Mb/s, each with perfect sync. and with data-partition + power-aware. Reference line: PER = 8%. Simulation condition: CM channel, SV CFO, 40ppm SCO.)

Table 5.4 Required SNR for 8% PER

| Data rate (Mb/s) | CM channel | Perfect sync. SNR (dB) | Proposed design (dB) | System required (dB) | SNR loss (dB) |
|---|---|---|---|---|---|
| 110 | CM4 | 6.4 | 7.0 | 7.1 | 0.6 |
| 200 | CM4 | 14.5 | 15.1 | 15.2 | 0.6 |
| 480 | CM2 | 20.0 | 20.5 | 21.1 | 0.5 |

PER vs. distance

Figure 5.8 shows the PER as a function of distance and information data rate in the worst CM channel environment under the SV-CFO condition. According to Table 5.5, the distances achieved by the proposed design for 8% PER in the worst CM channel are 10.2 meters, 4.1 meters and 2.2 meters at 110Mb/s, 200Mb/s and 480Mb/s. These distances still meet the system requirements of 10 meters, 4 meters and 2 meters.

Figure 5.8 PER vs. distance
(Plot of PER versus distance in meters. Curves: 480Mb/s CM2, 200Mb/s CM4, 110Mb/s CM4. Reference line: PER = 8%. Simulation condition: CM channel, SV CFO, 40ppm SCO.)

Table 5.5 Required distance for 8% PER

| Data rate (Mb/s) | CM channel | Required (meter) | Proposed (meter) |
|---|---|---|---|
| 110 | CM4 | 10 | 10.2 |
| 200 | CM4 | 4 | 4.1 |
| 480 | CM2 | 2 | 2.2 |

5.2.3 Complexity and power reduction summary

Figure 5.9 shows the complexity reduction of the CFO estimation and the power reduction of the proposed frequency synchronizer. With the proposed data partition, the complexity is reduced to 25% of that of the conventional estimation approach. Combined with the power-aware method, the complexity is reduced by a further 15% at low SNR and 23% at high SNR.

Likewise, data partition reduces the power to 40% of that of the conventional frequency synchronizer. Combined with the power-aware method, the power is reduced by a further 10.3% at low SNR and 16.5% at high SNR.

Figure 5.9 Complexity and power reduction
(Two plots versus SNR in dB: complexity of CFO estimation in % and proposed design power in %. Curves in each panel: data-partition, data-partition + power-aware. Annotated gaps: 15% and 23% for complexity, 10.3% and 16.5% for power.)

Chapter 6.

Hardware Implementation

In this chapter, we introduce the platform-based design flow. The architecture of the proposed design, the hardware synthesis results and the chip summary are presented in the following sections.

6.1 Design Methodology

IC technology is trending toward System-on-Chip (SoC), and system-level simulation has become very important in today's design flow. Our design methodology, from system simulation to hardware implementation, is shown in Figure 6.1.

Figure 6.1 Platform-based design methodology
(Flow diagram. Blocks: Matlab platform, system built-up, channel model, algorithm verification, fixed-point design, wordlength analysis, Verilog HDL coding, Verilog test bench, gate-level synthesis, circuit-level implementation, circuit test bench.)

First, the system platform with channel models is established according to the system specification, which ensures that the design is evaluated under practical conditions. The algorithm and architecture of each function block are then verified on the system platform to ensure the performance of the whole system. Fixed-point simulation is applied before hardware implementation to trade off system performance against hardware cost. An example of the word-length distribution analysis is shown in Figure 6.2. Based on the signal distribution analysis and the PER simulation, a reasonable word length can be chosen for each signal. In hardware implementation, the HDL modules are verified with test benches dumped from the equivalent Matlab blocks to ensure correctness.

Figure 6.2 (a) Signal distribution analysis (b) PER analysis of different word-length
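The word-length trade-off described above can be explored with a simple fixed-point model. The sketch below is a generic illustration, not the Matlab flow used for the design: it rounds a signal to a signed two's-complement grid with saturation and reports the signal-to-quantization-noise ratio for a few word lengths. The Gaussian test signal, its scaling and the function names are assumptions.

```python
import numpy as np


def quantize(x, total_bits, frac_bits):
    """Round to a signed two's-complement grid and saturate at the limits."""
    scale = 2 ** frac_bits
    lo = -(2 ** (total_bits - 1))
    hi = 2 ** (total_bits - 1) - 1
    codes = np.clip(np.round(np.asarray(x, dtype=float) * scale), lo, hi)
    return codes / scale


def sqnr_db(x, total_bits, frac_bits):
    """Signal-to-quantization-noise ratio in dB for one word length."""
    err = x - quantize(x, total_bits, frac_bits)
    return 10 * np.log10(np.mean(x ** 2) / np.mean(err ** 2))


# Hypothetical test signal: zero-mean Gaussian scaled to sit well inside [-1, 1).
rng = np.random.default_rng(0)
signal = 0.25 * rng.standard_normal(10_000)

for bits in (4, 6, 8):
    print(bits, "bits:", round(float(sqnr_db(signal, bits, bits - 1)), 1), "dB")
```

In a full flow, the quantized signals would be fed back into the PER simulation, and the shortest word length that keeps the PER loss acceptable would be chosen.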

6.2 The High-Speed and Low-Complexity Frequency Synchronizer for UWB Systems

6.2.1 Architecture of the Proposed Frequency Synchronizer

The architecture of the proposed frequency synchronizer is shown in Figure 6.3. It is based on the proposed algorithms with λ = 4. Since the computation rates required by equations (4.4) and (4.6) are 1/λ of those of equations (4.3) and (4.5) respectively, the proposed design can run at a low clock frequency of 528MHz/4 = 132MHz. In front of the CFO estimator, the data-partition controller selects one input sample from the four data paths in each clock cycle. To keep burst noise or serious interference from causing CFO estimation failure, the CFO estimation is performed twice, and a memory stores 2 × N/λ = 82 samples. The CFO is then obtained from the arc-tangent circuit. Since equation (4.6) reduces the rate at which the compensating phasor is calculated, the proposed CFO compensator needs only a single phase accumulator (ACC) and a single phase-to-I/Q lookup table (LUT) at the 132MHz clock frequency. The only part that must be parallel is the set of complex multipliers that compensate the received FFT symbols of the preamble and data signal. Based on the data partition and the approximate phasor compensation scheme, the proposed design achieves 528MS/s throughput with a single architecture and parallel multipliers at a 132MHz clock frequency.
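The idea of sharing one compensating phasor among parallel multipliers can be sketched numerically. Equation (4.6) is not reproduced in this section, so the code below assumes the simplest form of the approximation: the phasor is updated once per group of λ = 4 samples and held constant within the group, using the phase of the group's first sample. That choice, the function names and the 40ppm example offset are all assumptions for illustration.

```python
import numpy as np


def compensate_exact(rx, eps):
    """Reference: one compensating phasor per received sample."""
    n = np.arange(len(rx))
    return rx * np.exp(-2j * np.pi * eps * n)


def compensate_approx(rx, eps, lam=4):
    """One phasor per group of lam samples, shared by lam parallel multipliers.

    Assumption: the phase of the first sample in each group is reused for
    the whole group, so the phase accumulator and LUT run at 1/lam of the
    sample rate.
    """
    n = np.arange(len(rx))
    group_start = (n // lam) * lam
    return rx * np.exp(-2j * np.pi * eps * group_start)


# Illustrative offset: 40 ppm of an assumed 10.6 GHz carrier at 528 MS/s,
# expressed in cycles per sample.
eps = 40e-6 * 10.6e9 / 528e6
n = np.arange(1024)
rx = np.exp(2j * np.pi * eps * n)  # unit-amplitude tone rotated by the CFO

residual = np.angle(compensate_approx(rx, eps))
bound = 2 * np.pi * eps * (4 - 1)  # worst case: last sample of each group
print("max residual phase (rad):", float(np.max(np.abs(residual))))
```

Under this assumption the residual phase error never exceeds 2π·ε·(λ - 1) radians, which is the quantity a designer would weigh against the PER loss.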

The detailed architecture of the proposed CFO estimator is shown in Figure 6.4. Register files are used instead of a memory because only 82 samples, that is 656 bits (4 bits × 82 samples × 2 for I/Q), need to be stored. Storing the samples in a memory would require more area and power than register files. In addition, because of dynamic power consumption, we chose not to store the samples in shift registers, even though their area is smaller than that of register files.

Figure 6.3 Architecture of proposed frequency synchronizer
(Block diagram. Legible block labels: data partition, coarse est./fine est., control.)

Figure 6.4 Detail architecture of proposed CFO estimator
(Block diagram. Legible block label: MUX.)

The detailed architecture of the proposed CFO compensator is shown in Figure 6.5(a), and its sine and cosine mapping diagram is shown in Figure 6.5(b). The sine and cosine generator is built from a lookup table. The two functions share the same property: values in the first quadrant can be mapped onto the other quadrants by a simple sign transformation. In addition, sine and cosine values can be transformed into each other. In general, it is therefore enough to build a first-quadrant table of the sine function and obtain the cosine values through the mapping function. In this system, however, the generator produces complex values, so the sine and cosine of an angle must be available at the same time. If the table contained only sine values, two accesses would be needed for each complex value. In our design, the table contains both sine and cosine values, so a complex value is produced with a single access. The input range of the table is half of the previous one: if the input angle is larger than 45°, the sine and cosine outputs are exchanged. Furthermore, the approximate CFO compensation scheme reduces the complexity of the lookup table.
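The table organization described above can be modeled in a few lines. The sketch below is a software illustration under assumed parameters, not the hardware design itself: the table stores cosine and sine pairs for 0° to 45° only, angles between 45° and 90° read the mirrored entry and swap the two outputs, and the remaining quadrants are reached by sign changes and swaps. The table depth of 64 steps and the function name are assumptions.

```python
import math

TABLE_STEPS = 64  # assumed resolution of the 0..45 degree table

# Each entry holds (cos, sin), so one access yields a full complex phasor.
TABLE = [
    (math.cos(math.radians(45.0 * i / TABLE_STEPS)),
     math.sin(math.radians(45.0 * i / TABLE_STEPS)))
    for i in range(TABLE_STEPS + 1)
]


def cos_sin(angle_deg):
    """Return (cos, sin) of an angle using only the 0..45 degree table."""
    a = angle_deg % 360.0
    q = int(a // 90)
    r = a - 90.0 * q          # angle folded into [0, 90)
    q %= 4                    # guards the rare case where a rounds to 360.0
    swap = r > 45.0
    if swap:
        r = 90.0 - r          # mirror into [0, 45)
    c, s = TABLE[round(r / 45.0 * TABLE_STEPS)]
    if swap:
        c, s = s, c           # above 45 degrees: exchange sine and cosine
    if q == 0:
        return c, s
    if q == 1:
        return -s, c          # cos(90+r) = -sin r, sin(90+r) = cos r
    if q == 2:
        return -c, -s
    return s, -c              # cos(270+r) = sin r, sin(270+r) = -cos r


print(cos_sin(30.0), cos_sin(120.0))
```

Storing both values per entry doubles the entry width but halves the address range, and the single access per phasor is what lets one LUT serve the compensator.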

A conventional approach based on equations (4.3) and (4.5) is shown in Figure 6.6. It uses a parallel-4 architecture to achieve 528MS/s throughput at a 132MHz clock frequency. Compared with this parallel approach, the proposed design reduces the memory size and the number of complex multiplications by 75%. The implementation results show that the proposed design efficiently reduces hardware cost and power consumption.

Figure 6.5 (a) Detail architecture of proposed CFO compensator (b) Mapping diagram of sine and cosine

Figure 6.6 Conventional parallel architecture

6.2.2 Hardware Synthesis

The equivalent gate count of the proposed design and the power consumption measured by post-layout simulation are listed in Table 6.1 and Table 6.2 respectively. Compared with the conventional parallel approach shown in Figure 6.6, the proposed design, which combines the data-partition-based, power-aware CFO estimation with the approximate CFO compensation scheme, reduces the gate count by 59% and the power consumption by 69.4 ~ 75.6%.

Table 6.1 Equivalent gate-count of UWB frequency synchronizer

| Gate-count | CFO estimator | CFO compensator | Total |
|---|---|---|---|
| Conventional parallel design | 41K | 20K | 61K |
| Proposed design | 11K | 14K | 25K |
| Reduction percentage | 49.2% | 9.8% | 59% |
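The reduction percentages in Table 6.1 are consistent with expressing each block's saving relative to the total gate count of the conventional design, which is why the two block entries add up to the total. The short check below recomputes them from the tabulated gate counts; the variable and function names are illustrative.

```python
# Equivalent gate counts in thousands, copied from Table 6.1.
CONVENTIONAL_K = {"estimator": 41, "compensator": 20}
PROPOSED_K = {"estimator": 11, "compensator": 14}

conv_total = sum(CONVENTIONAL_K.values())  # 61K
prop_total = sum(PROPOSED_K.values())      # 25K


def reduction_pct(block):
    """Gates saved in one block, as a percentage of the conventional total."""
    return 100.0 * (CONVENTIONAL_K[block] - PROPOSED_K[block]) / conv_total


total_reduction_pct = 100.0 * (conv_total - prop_total) / conv_total

for block in CONVENTIONAL_K:
    print(block, round(reduction_pct(block), 1), "%")
print("total", round(total_reduction_pct, 1), "%")
```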

Table 6.2 Power of UWB frequency synchronizer (528MS/s)

| Power (mW) | CFO estimator | CFO compensator | Total |
|---|---|---|---|
| Conventional parallel design | 41.3 | 16.2 | 57.5 |
| Proposed design | 0.9 ~ 4.2 | 13.1 | 23.5 |
| Reduction percentage | 64 ~ 70.2% (data-partition + power-aware) | 5.4% | 69.4 ~ 75.6% |

6.3 UWB Baseband Processor

Figure 6.7 shows the micro-photo of the LDPC-COFDM UWB baseband processor, which integrates the proposed design in a standard 0.18µm CMOS process. Its features are listed in Table 6.3. The measured results show that the proposed 528MS/s frequency synchronizer consumes 21.4mW.

Figure 6.7 LDPC-COFDM UWB baseband processor
(Chip micro-photo. Dimension labels in the photo: 1.16 mm, 0.79 mm, 1.21 mm, 1.07 mm, 0.93 mm.)

Table 6.3 LDPC-COFDM UWB PHY baseband feature

| Feature | Value |
|---|---|
| Technology | 0.18µm CMOS 1P6M |
| Package | 208 CQFP |
| Die area | 42.25 mm² (6.5 mm × 6.5 mm) |
| Max. working frequency | 264 MHz |
| Core power at 480Mb/s (TX/RX) | 523 mW / 575 mW |
| Supply voltage | 1.8V core, 3.3V I/O |
| Design area | 2.17 mm² (5.1%) |
| Design power | 21.4 mW (3.7%) @ 480Mb/s RX |

Chapter 7.

Conclusion and Future Work

Following the design description, performance analysis and hardware comparison, a novel frequency synchronizer has been proposed that achieves high throughput, low power and satisfactory performance for OFDM-based WLAN and UWB systems. By combining the data-partition-based, power-aware CFO estimation with the approximate CFO compensation scheme, our proposal reduces power consumption by 69.4% ~ 75.6% with an SNR loss of 0.04 ~ 0.6dB for 10% PER in the IEEE 802.11a WLAN system and 8% PER in the LDPC-COFDM and MB-OFDM UWB systems. Furthermore, the CFO estimation ranges of ±100ppm in WLAN and ±45ppm in UWB meet the system requirements. The proposed design achieves a high throughput of 528MSamples/s in both standard 0.13µm and 0.18µm CMOS processes.

To simulate the power-aware CFO estimation, we established a pseudo time-variant mean-CFO model with phase noise that includes TIV (time-invariant), SV (slow-variant) and FV (fast-variant) cases. A practical CFO model is still needed, however. In the future, we will survey more practical models to verify the design, even though our pseudo model already covers the worst CFO condition. We therefore expect greater power reduction in the general simulation case, in which the mean CFO is time-invariant.

