ModelSim Simulation - SC-FDE System Realization

Chapter 4 SC-FDE System Realization

4.4 ModelSim Simulation

When developing an FPGA system, ModelSim simulation helps designers work efficiently and accurately. It exposes all internal signals and simulates their behavior simultaneously, without being limited by the number of debugging pins.

Designers can therefore save much of the time otherwise spent downloading the design to the FPGA, and can directly examine the changes and interactions between signals. Figures 4.29 and 4.30 show the data flows at the transmitter and receiver of the SC-FDE system, respectively. The six data symbols are clearly visible. The figures also reveal certain design principles, such as parallel processing. In Figure 4.31 the output waveform of the transmitter is enlarged to show the actual transmitted baseband signal. The IEEE 802.11a-like output waveform, with ten short preambles, two long preambles, and six UW-appended SC-FDE data frames, is clearly visible; the preamble part is BPSK modulated.

Figure 4.29: SC-FDE transmitter ModelSim simulation result (annotated regions: Preamble, Data Frame)

Figure 4.30: SC-FDE receiver ModelSim simulation result (annotation: Parallelism)

Figure 4.31: Transmitted waveform of SC-FDE system (annotated intervals: 16 us and 24 us)

4.5 Experimental Results

In the self-designed platform, we attempt to establish a real wireless environment in which the adopted algorithm can be tested. Figure 4.32 shows the experimental environment, which was introduced in Chapter 3. First, source data are stored in a ROM in the FPGA and, after being processed by the transmitter algorithm on the FPGA, passed to the DA converter. Next, the data are transmitted by the RF module, with a receive antenna placed near the RF module. The data picked up by the receive antenna are then passed to the spectrum analyzer E443A and the vector signal analyzer 89600S. Finally, the received data are analyzed and displayed on a PC.

Figure 4.32: Self-designed platform development environment

Figure 4.33 shows the source data stream in the transmitter, the transmitted data stream, and the detected data stream in the receiver, with the source and detected data streams expanded below. Comparing the source data stream with the detected data stream shows that they are exactly the same, which confirms that our algorithm works as intended.

Figure 4.33: Self-designed platform experimental result: source data and detected data waveform on LA

Synthesis, mapping, and place and route are necessary steps in FPGA circuit design, and also the most time-consuming ones. The insufficiency of the FPGA gate count is another problem worth noting. The goal of our design is to reach a compromise between the hardware resource requirement and the time consumed. Tables 4-1 and 4-2 show the relative resource consumption of the transmitter and receiver. It can be seen that, at the transmitter side, the upsampler and RRC filter form the only part that involves multiplication, while at the receiver side the FFT/IFFT are responsible for most of the complicated calculations. Finally, Table 4-3 shows the time consumption in our development flow; the whole design flow covers both the transmitter and the receiver.

Table 4-1: Relative Resource consumption of the SC-FDE system at the transmitter

50% 100%

40%

12%

19%

Upsampler & RRC 11%

0% 0%

15%

Preamble 25%

50% 0%

Convolutional Encoder 4%

0% 0%

Selected Device : 2v6000ff1152-6

Table 4-2: Relative Resource consumption of the SC-FDE system at the receiver

Unique Word Based Phase Tracking

Frequency Estimation & Compensation

47%

Symbol Timing Recovery (DLL)

Selected Device : 2v6000ff1152-6

Table 4-3: Time consumption of Synthesis and P&R in SC-FDE system

                        RX               TX
Synthesis Time          7 min 2 sec      2 min 37 sec
Place and Route Time    19 min 33 sec    3 min 58 sec

4.6 Summary

In this chapter, a complete communication system design flow is presented, including MATLAB verification, FPGA realization, ModelSim simulation, and experimental results. Through this design flow, we developed a UW-based SC-FDE system on two FPGA-based platforms, where real wireless channel effects are introduced by means of the RF module. We also introduced some RF debugging instruments, which bring our system much closer to a real communication system. The design principles we follow are described, and the design concept of each function block is detailed. In particular, we show that the proposed UW-based phase tracking algorithm is not only theoretically suitable for the SC-FDE system but also practically applicable in hardware design. In Chapter 5, we give some other UW-based applications which are also usable in the SC-FDE system.

Chapter 5 Other Applications on Unique Word Structure in SC-FDE System

In this thesis we have considered and implemented phase tracking algorithms for single carrier systems with frequency domain equalizers, based on block-by-block Unique Word (UW) insertion similar to that adopted in IEEE 802.11a OFDM systems. We focus on the design and performance of the algorithms, and show that they provide almost optimum performance in SC-FDE systems. Moreover, we compare the results with pilot-carrier-based phase tracking algorithms in OFDM systems.

The Unique Word structure can be exploited in more ways than the one described above. The overall baseband processing performance of SC-FDE systems depends largely on the design of the channel estimation and synchronization algorithms. The deterministic properties of the UW make it well suited to various synchronization tasks as well as to channel estimation, especially in a mobile environment. Moreover, with UW-based algorithms, SC systems can employ frequency-domain equalization at the receiver and benefit from low complexity, which makes them suitable for hardware implementation. Therefore, in this chapter we investigate the use of the UW and elaborate on the advantages it provides for equalization, channel estimation, and synchronization. A comparison between the UW-based and CP-based SC-FDE systems is also given, in terms of both BER behavior and bandwidth efficiency.

5.1 Cyclic Prefix versus Unique Word

The frequency-domain equalization for single carrier systems is based on the equivalence between the convolution of two sequences in the time domain and the product of their Fourier transforms. The use of FFT operations also implies that signals have to be processed blockwise: not only is the received signal processed block by block, but the transmission itself is blockwise, with a cyclic prefix inserted between successive transmitted blocks. The content of the CP is not known in advance and varies with every block. With a slight modification, namely implementing the cyclic prefix as a training sequence (the Unique Word in this thesis), it can play two important roles: avoiding inter-block interference (IBI) and supporting synchronization and channel estimation. Channel estimation is especially important in fast fading environments. In the following section we show that the UW-based SC scheme (SC-UW) offers these advantages at the expense of only a small fraction of a dB, while in other respects it has hardly any drawback compared to the CP-based SC scheme (SC-CP).

Before introducing the algorithms that take advantage of the UW sequence, some properties can be expected a priori:

- From a performance point of view, the SC-UW scheme inherits the properties of the SC-CP scheme: it offers performance similar to that of OFDM, with more robustness to nonlinear distortion and phase noise.

Moreover, the UW sequence does not contain data. Hence, it can be optimized to obtain appropriate properties (e.g., autocorrelation), and its symbols could even be chosen from a separate alphabet. This avoids the accidental presence of the UW sequence in the useful data.

- From a synchronization point of view, SC-UW acquisition is essentially the same as for SC-CP: data-aided algorithms are known to perform better than their non-data-aided counterparts. They avoid decision-directed algorithms and the associated problem of feeding decisions back, which would introduce a delay of one frame.

- For channel estimation, the concept of the UW is most useful when the channel varies rather rapidly, as in mobile communications. The extension of SC-UW to the multiuser case (in a spatial division multiple access scheme) is easier than for SC-CP, as the users can be distinguished on the basis of their different UWs.

5.1.1 Comparison of CP and UW in Terms of Bandwidth Efficiency and BER Behaviour

Figure 5.1 shows the structure of Cyclic Prefix and Unique Word. Two main differences are obvious when comparing the two concepts:

- Unlike the CP, the UW is not random.

- Instead of being discarded like the cyclic prefix, the UW is always processed: it is not removed at the receiver but remains available after equalization in the time domain. Hence, there is no longer a gap between two FFTs.

In practical situations, the FFT is usually taken in the middle of the UW to allow small timing synchronization errors. Moreover, as the UW is always present on both edges of the data block, the transformation from linear convolution to cyclic convolution is kept, and the performance of the original SC-CP is also kept.

Figure 5.1: Single Carrier with (a) Cyclic Prefix and (b) Unique Word (diagram labels: CP1, CP2, UW, FFT-1, FFT-2, Data, Payload, TS, Overhead)

The bandwidth efficiency of an SC-FDE system is reduced by the guard period. Recall that $T_{FFT}$ and $T_G$ denote the FFT period and the guard interval of a frame, respectively.

The bandwidth efficiency of the described SC-CP and SC-UW systems, without taking coding into account, can be given as:

\[ \eta_{CP} = \frac{T_{FFT}}{T_{FFT} + T_G} = 0.8, \qquad \eta_{UW} = \frac{T_{FFT} - T_G}{T_{FFT}} = 0.75 \tag{5.1} \]

The result in Eq.(5.1) corresponds to an additional degradation of 5% in bandwidth efficiency for SC-UW, assuming $T_G$ to be 25% of $T_{FFT}$ (in this thesis, the frame length is 64 symbols and the UW length is 16 symbols). Furthermore, a loss in BER performance is expected, as is a loss due to the additional overhead compared to a single carrier system with time domain equalization.
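The arithmetic behind Eq.(5.1) can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the durations are expressed in symbols (FFT period 64, guard interval 16) as stated for this thesis.

```python
def efficiency_cp(t_fft: float, t_g: float) -> float:
    """SC-CP: the guard interval is transmitted in addition to the FFT period."""
    return t_fft / (t_fft + t_g)


def efficiency_uw(t_fft: float, t_g: float) -> float:
    """SC-UW: the UW lies inside the FFT period and displaces data symbols."""
    return (t_fft - t_g) / t_fft


# Values stated in the text: frame (FFT) length 64 symbols, UW/guard length 16 symbols.
eta_cp = efficiency_cp(64, 16)
eta_uw = efficiency_uw(64, 16)
loss = eta_cp - eta_uw  # additional degradation of SC-UW relative to SC-CP
print(eta_cp, eta_uw, loss)
```

With these inputs the two efficiencies evaluate to 0.8 and 0.75, which is where the 5% figure comes from.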

5.2 Application of the Unique Word Structure

Transmission over multipath channels makes channel estimation and synchronization indispensable. Because the UW is known, it can be used for equalization, channel estimation, or synchronization purposes. In the following sections, some algorithms and results are given for these applications.

5.2.1 Synchronization

Synchronization is an indispensable requirement for high data rate wireless transmission. In a time-invariant environment, initial channel estimation and block synchronization can be done with a preamble at the beginning of every burst. In time-varying channels, however, clock-frequency offsets or carrier-frequency offsets make tracking necessary. Tracking is mainly based on the insertion of pilot symbols; with the UW structure, pilot sequences are available automatically.

The variation of the sampling time between transmitter and receiver caused by the clock-frequency offset leads to a growing displacement of the FFT window. To solve this problem, the autocorrelation in Eq.(5.2) of two consecutive received UWs, denoted by $u_k$ and $u_{k+N}$ and separated by N symbols, yields distinctive correlation peaks if the symbols of the UW are chosen to have good correlation properties (e.g., pseudo-noise sequences, Barker sequences) [15].

\[ \phi(k) = \sum_{k \in \{G\}} u_k^{*} \cdot u_{k+N} \tag{5.2} \]

where $u_k^{*}$ indicates the complex conjugate of $u_k$.

With the UW structure and a suitably selected symbol sequence, this method shows distinct correlation peaks. Figure 5.2 shows the result of the autocorrelation, which indicates the beginning of every FFT window very precisely. The simulation is performed for SNR = 25 dB under multipath conditions; the UW is a PN sequence.

It should be noted, however, that if the UW is corrupted by the channel and noise, or if the UW is too short, the correlation of two successive UWs may not produce sufficiently reliable peaks, because the autocorrelation properties of the chosen sequence are partly lost.
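The correlation of Eq.(5.2) can be sketched in Python as a sliding metric over a received stream. This is a minimal, noise-free illustration and not the simulation behind Figure 5.2: the 16-symbol UW below is a hypothetical BPSK sequence, the data are random QPSK symbols, the channel is ideal, and the function name `delayed_correlation` is an assumption made for this example.

```python
import random


def delayed_correlation(r, n_sep, g_len):
    """phi(d) = sum_{k=0}^{G-1} conj(r[d+k]) * r[d+k+N], in the spirit of Eq. (5.2)."""
    last = len(r) - n_sep - g_len
    return [sum(r[d + k].conjugate() * r[d + k + n_sep] for k in range(g_len))
            for d in range(last + 1)]


rng = random.Random(0)
QPSK = [complex(a, b) / 2 ** 0.5 for a in (1, -1) for b in (1, -1)]
# Hypothetical 16-symbol BPSK unique word (not the sequence used in the thesis).
UW = [complex(v) for v in (1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, -1, -1, -1, 1, -1)]
P, Q = 48, len(UW)   # data and UW lengths per block
N = P + Q            # separation between consecutive UWs (64 symbols)

# Stream layout: UW | data | UW | data | UW | ...
stream = list(UW)
for _ in range(4):
    stream += [rng.choice(QPSK) for _ in range(P)] + UW

phi = delayed_correlation(stream, N, Q)
uw_starts = [b * N for b in range(4)]      # UWs that have a successor N symbols later
peak = max(abs(v) for v in phi)
print([round(abs(phi[s]), 3) for s in uw_starts], round(peak, 3))
```

In this ideal setting the metric reaches its largest possible magnitude, Q, at every UW start. Lags next to a UW start share most of the UW samples and can come close to that value, so a practical detector would have to cope with noise, multipath, and such near-ties.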

Figure 5.2: Synchronization and tracking of the FFT-window (plot of the autocorrelation function with the FFT window marked)

5.2.2 Channel Estimation

The equalization of the received message in systems employing single-carrier transmission is a fundamental problem in high data rate wireless communication, and performing the equalization requires knowledge of the channel. The channel parameters can be estimated relatively easily prior to data transmission in a stationary environment; in non-stationary channels, however, they vary with time and must be tracked by some means.

The deterministic properties of the UW can be exploited in channel estimation algorithms for SC-FDE systems to track a temporally fading channel. The algorithm introduced here relies on ensemble averaging of the received signal to recover the channel state information (CSI). Although any time-domain or frequency-domain equalization technique can be employed once the channel has been estimated, this algorithm lends itself to FDE systems since the use of the UW gives the system a cyclic nature.

System Model

Consider a system employing SC block transmissions with a UW extension. The i-th length-K block x(i) of transmitted symbols is partitioned into a length-P vector s(i) of data symbols and a length-Q vector u representing the UW. An illustration of this block structure is depicted in Figure 5.3.

Figure 5.3: Basic UW block structure (sequence u, s(i), u, s(i+1), u; data length P, UW length Q, block length K)

In order to alleviate inter-block interference, we assume that Q ≥ L, where L is the memory order of the channel impulse response (CIR). This condition also induces circularity in the system, which allows us to express the i-th length-K block of received symbols y(i) as

\[ \mathbf{y}(i) = \mathbf{h}(i)\,\mathbf{x}(i) + \mathbf{n}(i) \tag{5.3} \]

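where h(i) is a K×K circulant matrix representing the channel at time i and n(i) is a length-K vector of uncorrelated, zero-mean, complex Gaussian noise samples, each with a variance of $\sigma_n^2/2$ per dimension. Specifically,

\[
\mathbf{h}(i) =
\begin{pmatrix}
h_0(i) & 0 & \cdots & 0 & h_L(i) & \cdots & h_1(i) \\
\vdots & \ddots & \ddots & & \ddots & \ddots & \vdots \\
h_{L-1}(i) & & \ddots & \ddots & & \ddots & h_L(i) \\
h_L(i) & & & \ddots & \ddots & & 0 \\
0 & \ddots & & & \ddots & \ddots & \vdots \\
\vdots & \ddots & \ddots & & & \ddots & 0 \\
0 & \cdots & 0 & h_L(i) & \cdots & h_1(i) & h_0(i)
\end{pmatrix}
\tag{5.4}
\]

where $h_m(i)$ is the m-th complex tap coefficient of the CIR at time i.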
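The circularity claimed for Eq.(5.3) can be checked numerically. The sketch below, under illustrative assumptions (small block sizes, a hypothetical 3-tap CIR and UW, no noise), builds the circulant matrix of Eq.(5.4) from its first column and compares the product h·x with the corresponding window of an ordinary linear convolution of the UW-framed stream. All function names are hypothetical.

```python
import random


def circulant_channel(h, K):
    """K x K circulant matrix whose first column is [h_0, ..., h_L, 0, ..., 0]."""
    col = list(h) + [0.0] * (K - len(h))
    return [[col[(r - c) % K] for c in range(K)] for r in range(K)]


def matvec(A, x):
    """Matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def linear_convolution(h, x):
    """Plain (non-cyclic) convolution, as produced by a multipath channel."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for m, hm in enumerate(h):
            y[n + m] += hm * xn
    return y


rng = random.Random(0)
P, Q = 5, 3                 # illustrative data and UW lengths
K = P + Q
h = [1.0, 0.5, 0.25]        # hypothetical CIR with memory order L = 2 <= Q
u = [1.0, 1.0, -1.0]        # hypothetical UW
s0 = [rng.choice((-1.0, 1.0)) for _ in range(P)]
s1 = [rng.choice((-1.0, 1.0)) for _ in range(P)]

# Transmitted stream u | s(0) | u | s(1) | u, as in Figure 5.3.
stream = u + s0 + u + s1 + u
received = linear_convolution(h, stream)

# Block 0 occupies stream positions Q .. Q+K-1 and equals x(0) = [s(0); u].
x0 = s0 + u
y_windowed = received[Q:Q + K]
y_circulant = matvec(circulant_channel(h, K), x0)
max_diff = max(abs(a - b) for a, b in zip(y_windowed, y_circulant))
print(max_diff)
```

The two results agree because the UW that precedes the block supplies exactly the samples that a cyclic wrap-around of x(0) would supply, which holds as long as Q ≥ L.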

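Denote by F the normalized K×K DFT matrix, whose (k,i)-th element is given by $F_{k,i} = K^{-1/2} \exp(-j 2\pi k i / K)$ for $k, i = 0, \ldots, K-1$. Let $\mathbf{F}_m$ be the first m columns of F and $\mathbf{F}'_m$ the last m columns of F. Referring to Eq.(5.3), we consider the transformation of the received symbol vector y(i) into the frequency domain, which is given by

\[ \mathbf{Y}(i) = \mathbf{H}(i)\,\mathbf{X}(i) + \mathbf{N}(i) \tag{5.5} \]

where $\mathbf{Y}(i) = \mathbf{F}\mathbf{y}(i)$, $\mathbf{N}(i) = \mathbf{F}\mathbf{n}(i)$, $\mathbf{X}(i) = \mathbf{F}\mathbf{x}(i)$, and $\mathbf{H}(i) = \mathbf{F}\mathbf{h}(i)\mathbf{F}^H$ is a diagonal matrix with the channel frequency response coefficients on the diagonal. In addition, the transmitted vector X(i) can be partitioned into a data part and a UW part as given by

\[ \mathbf{X}(i) = \begin{bmatrix} \mathbf{F}_P & \mathbf{F}'_Q \end{bmatrix} \begin{bmatrix} \mathbf{s}(i) \\ \mathbf{u} \end{bmatrix} = \mathbf{S}(i) + \mathbf{U} \tag{5.6} \]

where $\mathbf{S}(i) = \mathbf{F}_P \mathbf{s}(i)$ and $\mathbf{U} = \mathbf{F}'_Q \mathbf{u}$. Therefore,

\[ \mathbf{Y}(i) = \mathbf{H}(i)\,\mathbf{S}(i) + \mathbf{H}(i)\,\mathbf{U} + \mathbf{N}(i) \tag{5.7} \]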
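The two structural facts used in Eqs.(5.5) to (5.7), namely that the normalized DFT diagonalizes a circulant channel matrix and that X(i) splits into a data part and a UW part, can be verified on a small example. The sketch assumes a unitary DFT matrix (entries scaled by K^{-1/2}), a hypothetical 3-tap CIR, and illustrative sizes K = 8, P = 5, Q = 3; the helper names are not from the thesis.

```python
import cmath


def dft_matrix(K):
    """Normalized DFT matrix with entries K^{-1/2} exp(-j 2 pi k i / K)."""
    return [[cmath.exp(-2j * cmath.pi * k * i / K) / K ** 0.5 for i in range(K)]
            for k in range(K)]


def matmul(A, B):
    return [[sum(A[r][m] * B[m][c] for m in range(len(B))) for c in range(len(B[0]))]
            for r in range(len(A))]


def hermitian(A):
    """Conjugate transpose."""
    return [[A[r][c].conjugate() for r in range(len(A))] for c in range(len(A[0]))]


def circulant_channel(h, K):
    """K x K circulant matrix whose first column is [h_0, ..., h_L, 0, ..., 0]."""
    col = list(h) + [0] * (K - len(h))
    return [[col[(r - c) % K] for c in range(K)] for r in range(K)]


K, P, Q = 8, 5, 3
h = [1.0, 0.5j, -0.25]                      # hypothetical CIR
F = dft_matrix(K)

# H = F h F^H should be diagonal, with the channel frequency response on the diagonal.
H = matmul(matmul(F, circulant_channel(h, K)), hermitian(F))
freq_response = [sum(hm * cmath.exp(-2j * cmath.pi * k * m / K) for m, hm in enumerate(h))
                 for k in range(K)]
off_diag = max(abs(H[r][c]) for r in range(K) for c in range(K) if r != c)

# Eq. (5.6): X = F x = F_P s + F'_Q u, using the first P and last Q columns of F.
s = [1, -1, 1, 1, -1]
u = [1, 1, -1]
S = [sum(F[k][c] * s[c] for c in range(P)) for k in range(K)]
U = [sum(F[k][P + c] * u[c] for c in range(Q)) for k in range(K)]
X = [sum(F[k][c] * v for c, v in enumerate(s + u)) for k in range(K)]
print(off_diag, max(abs(X[k] - S[k] - U[k]) for k in range(K)))
```

Both printed values should be at the level of floating-point rounding, which is what allows the per-tone (one-tap) processing used in the rest of the section.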

Channel Estimation Algorithm by UW

If the channel is time-invariant (i.e., $\mathbf{H}(i) = \mathbf{H}$ for all i), the received frequency-domain vectors $\{\mathbf{Y}(i)\}_{i=1}^{N}$ can be treated as a sample set of a random process ψ, where the mean of the process is given by

\[ \bar{\psi} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{Y}(i) \tag{5.8} \]

If a symmetric constellation such as QPSK is employed for data transmission and no a priori knowledge of the transmitted message is assumed, then $E\{\mathbf{S}(i)\} = \mathbf{0}$, and evaluating the expectation in Eq.(5.8) yields

\[ \lim_{N \to \infty} \bar{\psi} = E\{\mathbf{Y}\} = \mathbf{H}\mathbf{U} \tag{5.9} \]
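The ensemble-averaging idea of Eqs.(5.8) and (5.9) can be illustrated with a small Monte Carlo sketch: averaging many received frequency-domain blocks over a static channel leaves approximately H·U, so dividing by the known U tone by tone recovers the channel. Everything below is an assumption made for illustration: the block sizes, the 3-tap CIR, the UW (chosen so that no tone of U is zero), the noise level, and the number of blocks.

```python
import cmath
import random

K, P, Q = 8, 5, 3
h = [1.0, 0.5, 0.25]            # hypothetical static 3-tap CIR
u = [1, 1, -1]                  # hypothetical UW with no spectral null on the K tones
rng = random.Random(1)
QPSK = [complex(a, b) / 2 ** 0.5 for a in (1, -1) for b in (1, -1)]


def dft_cols(x, first_col):
    """Multiply x by columns first_col, first_col+1, ... of the normalized DFT matrix."""
    return [sum(v * cmath.exp(-2j * cmath.pi * k * (first_col + c) / K)
                for c, v in enumerate(x)) / K ** 0.5 for k in range(K)]


# Diagonal of H (channel frequency response) and the UW spectrum U = F'_Q u.
H = [sum(hm * cmath.exp(-2j * cmath.pi * k * m / K) for m, hm in enumerate(h))
     for k in range(K)]
U = dft_cols(u, P)

N = 4000                        # number of averaged blocks
acc = [0j] * K
for _ in range(N):
    S = dft_cols([rng.choice(QPSK) for _ in range(P)], 0)   # S(i) = F_P s(i)
    for k in range(K):
        noise = complex(rng.gauss(0, 0.05), rng.gauss(0, 0.05))
        acc[k] += H[k] * (S[k] + U[k]) + noise               # Eq. (5.7), tone by tone

psi = [a / N for a in acc]                                   # sample mean, Eq. (5.8)
H_est = [psi[k] / U[k] for k in range(K)]                    # from Eq. (5.9): mean -> H U
max_err = max(abs(H_est[k] - H[k]) for k in range(K))
print(max_err)
```

The residual error shrinks roughly with the square root of the number of averaged blocks, since the zero-mean data and noise terms only average out gradually.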

When the channel varies with time, as is the case in mobile environments, the sample size N must be limited in some way so as to include only those blocks received within the last $T_c$ seconds, where $T_c$ is the coherence time of the channel. In this case, the recursive least squares (RLS) algorithm can be employed with the cost function

\[ \phi(i) = \sum_{k=1}^{i} \rho^{\,i-k} \left\| \mathbf{e}(k,i) \right\|^2 \tag{5.10} \]

where ρ is the standard RLS forgetting factor, which is usually close to, but less than, one. The error term e(k,i) in Eq.(5.10) is defined as

\[ \mathbf{e}(k,i) = \mathbf{Y}(k) - \tilde{\mathbf{U}}\,\hat{\mathbf{H}}(i) \tag{5.11} \]

where $\tilde{\mathbf{U}} = D\{\mathbf{U}\}$ is the diagonal matrix with the elements of U on the diagonal and $\hat{\mathbf{H}}(i)$ is a length-K vector of the i-th estimated channel frequency response coefficients.

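Our goal is to find the channel estimate that minimizes the cost function in Eq.(5.10). Following the result derived in [20], taking the gradient of Eq.(5.10) with respect to $\hat{\mathbf{H}}(i)$ and setting the result equal to zero, the channel estimate vector is given by

\[ \hat{\mathbf{H}}(i) = \mathbf{P}^{-1}(i)\,\mathbf{r}(i) \tag{5.12} \]

where $\mathbf{P}(i) = \sum_{k=1}^{i} \rho^{\,i-k}\, \tilde{\mathbf{U}}^H \tilde{\mathbf{U}}$ and $\mathbf{r}(i) = \sum_{k=1}^{i} \rho^{\,i-k}\, \tilde{\mathbf{U}}^H \mathbf{Y}(k)$.

The channel can therefore be updated with the i-th received block by noting that

\[ \mathbf{P}(i) = \rho\,\mathbf{P}(i-1) + \tilde{\mathbf{U}}^H \tilde{\mathbf{U}} \tag{5.13} \]

\[ \mathbf{r}(i) = \rho\,\mathbf{r}(i-1) + \tilde{\mathbf{U}}^H \mathbf{Y}(i) \tag{5.14} \]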

Note that Eq.(5.12) requires the inverse of P(i) to compute the updated channel estimate. Since P(i) is a diagonal matrix, however, it is easy to invert. Consequently, this method of mobile channel estimation benefits from very low complexity, since only three complex multiplications are required to update the channel estimate on a given frequency tone. As with all applications of the RLS algorithm, this application requires the vector r and the matrix P to be initialized. If a reliable initial channel estimate $\hat{\mathbf{H}}(0)$ is available (e.g., from the preamble-based channel estimation), r and P can be initialized to

\[ \mathbf{r}(0) = \beta\, \tilde{\mathbf{U}}^H \tilde{\mathbf{U}}\, \hat{\mathbf{H}}(0) \tag{5.15} \]

\[ \mathbf{P}(0) = \beta\, \tilde{\mathbf{U}}^H \tilde{\mathbf{U}} \tag{5.16} \]

Consequently, we may choose β to be

\[ \beta = \frac{1}{1 - \rho} \]

Defining β as above is equivalent to initializing the channel estimation by transmitting an infinite number of blocks containing only the UW over a static channel, which is denoted here by $\hat{\mathbf{H}}(0)$, and computing $\mathbf{r}(\infty)$ and $\mathbf{P}(\infty)$. This definition of β produces good convergence results, as shown in the simulation results later.
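Because P is diagonal, the RLS recursion decouples into independent scalar updates per frequency tone. The sketch below is one possible reading of Eqs.(5.12) to (5.14) as the least-squares solution of the cost in Eq.(5.10), with the initialization r(0) = P(0)·Ĥ(0) and β = 1/(1 − ρ) taken as assumptions from the description above. The class name, the tone values, and the test channel are hypothetical, and the demonstration is noise-free with UW-only blocks.

```python
class UwRlsEstimator:
    """Per-tone RLS channel tracker driven by the known UW spectrum U."""

    def __init__(self, U, H0, rho=0.96):
        self.U = [complex(v) for v in U]
        self.rho = rho
        beta = 1.0 / (1.0 - rho)                                  # assumed choice of beta
        self.P = [beta * abs(v) ** 2 for v in self.U]             # diagonal of P(0)
        self.r = [p * complex(h0) for p, h0 in zip(self.P, H0)]   # r(0) = P(0) H(0)

    def update(self, Y):
        """Fold in one received frequency-domain block and return the new estimate."""
        for k, (Uk, Yk) in enumerate(zip(self.U, Y)):
            self.P[k] = self.rho * self.P[k] + abs(Uk) ** 2       # P(i), one tone
            self.r[k] = self.rho * self.r[k] + Uk.conjugate() * Yk  # r(i), one tone
        return [rk / Pk for rk, Pk in zip(self.r, self.P)]        # H(i) = P^{-1}(i) r(i)


# Hypothetical UW spectrum and channel values on four tones.
U = [1 + 0j, -1j, 0.5 + 0.5j, -1 + 0j]
H_initial = [1 + 0j, 0.8 - 0.2j, 0.5 + 0.5j, 0.3j]
H_true = [0.9 + 0.1j, 0.7 - 0.3j, 0.4 + 0.6j, 0.1 + 0.2j]

tracker = UwRlsEstimator(U, H_initial, rho=0.96)

# A block that matches the initial estimate must leave the estimate unchanged.
unchanged = tracker.update([Hk * Uk for Hk, Uk in zip(H_initial, U)])

# Feeding blocks generated by a different channel pulls the estimate towards it.
for _ in range(300):
    estimate = tracker.update([Hk * Uk for Hk, Uk in zip(H_true, U)])
print(max(abs(a - b) for a, b in zip(estimate, H_true)))
```

With this initialization the estimate behaves like an exponentially weighted average: after n blocks the old channel value is weighted by ρ^n, so a forgetting factor closer to one gives smoother but slower tracking.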

A more complex version of the stochastic algorithm can also be implemented. In this version, the received symbols y(i) are first equalized and the data symbols are detected. Using the length-P vector of detected data symbols $\hat{\mathbf{s}}(i)$ and the previous channel estimate $\hat{\mathbf{H}}(i-1)$, the contribution of the data to the received message is subtracted from the original received vector in the frequency domain to give

\[ \mathbf{Y}_u(i) = \mathbf{Y}(i) - D\{\mathbf{F}_P\, \hat{\mathbf{s}}(i)\}\, \hat{\mathbf{H}}(i-1) \tag{5.19} \]

Replacing $\mathbf{Y}(i)$ with $\mathbf{Y}_u(i)$ in Eq.(5.10) through (5.12), the new channel estimate can be obtained.

5.3 Simulation Results

The algorithm described in Section 5.2 was implemented in computer simulations in order to observe its performance relative to other techniques. Two systems were simulated. In each of these systems, UWs were appended to every frame of QPSK data symbols to form blocks of K = 64 symbols. These blocks were transmitted over a 3-tap, exponentially decaying channel with a normalized Doppler spread of $f_D = 1.5 \times 10^{-6}$. The channel realizations were generated with a Rayleigh fading profile from burst to burst, and Jakes' model was used to simulate temporal fading within each burst. At the receiver, each system used its own knowledge of the channel to equalize the received message with a linear FDE. Each equalized symbol was then mapped to the nearest QPSK symbol.

The first system used an initial channel estimate, obtained from a preamble, to construct a linear FDE, and only one channel estimate was obtained for each burst. The second system employed the stochastic channel estimation method with feedback detailed in Section 5.2.2. This system initialized the metrics r and P according to Eq.(5.15) and (5.16), where the initial channel estimate was obtained through a preamble. A forgetting factor of 0.96 was used. The two systems are summarized in Table 5-1.

Table 5-1: Summary of simulated SC-FDE systems

System    Channel Knowledge
1         Estimated by preamble only
2         Estimated by preamble and updated by UW

Figure 5.4 depicts the bit error probability of each of the systems described above. The system that employs stochastic channel tracking performs better than the preamble-only system. Indeed, the system relying on preamble-only channel estimation suffers greatly even in this slow-fading environment.

Figure 5.4: Comparison of the proposed preamble based channel estimation and UW-update channel estimation

5.4 Circuit Design of Proposed Methods

In this section, circuit designs for the algorithms presented in the previous sections are proposed. Our purpose is to show that the UW-based synchronization and channel estimation algorithms not only provide better performance but also have low complexity and are suitable for hardware design. The design principles again follow the rules given in Section 4.3.1.

(1) UW-based Synchronizer

The design of the UW-based synchronizer is quite simple. As shown in
