# A System Design Example

## Chapter 2 Overview of OFDM

### 2.4.3 A System Design Example

As an example, suppose we want to design a system with the following requirements: the bit rate is 20 Mbps, the tolerable delay spread is 200 ns, and the maximum bandwidth is 15 MHz operating at fc = 5 GHz.

A delay-spread tolerance of 200 ns suggests that 800 ns is a safe value for the guard time. By choosing the OFDM symbol duration to be six times the guard time (4.8 µs), the guard-time loss is kept smaller than 1 dB. The subcarrier spacing is then the inverse of 4.8 − 0.8 = 4 µs, which gives 250 kHz. To determine the number of subcarriers needed, we can look at the ratio of the required bit rate to the OFDM symbol rate. To achieve 20 Mbps, each OFDM symbol has to carry 96 bits of information (96/4.8 µs = 20 Mbps). There are several options for doing this. One is to use 16-QAM together with rate-1/2 coding to get 2 information bits per symbol per subcarrier; in this case, 48 subcarriers are needed to reach the required 96 bits per symbol. Another option is to use QPSK with rate-3/4 coding, which gives 1.5 bits per symbol per subcarrier and therefore requires 64 subcarriers. However, 64 subcarriers mean a bandwidth of 64 × 250 kHz = 16 MHz, which is larger than the target bandwidth. To achieve a bandwidth smaller than 15 MHz, the first option, with 48 subcarriers and 16-QAM, fulfills all the requirements.

It has the additional advantage that an efficient 64-point radix-4 FFT/IFFT can be used, leaving 16 zero subcarriers to provide the oversampling necessary to avoid aliasing. If we assume that the speed v of the mobile is no more than 100 km/hr, (2.13) is satisfied.
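The design calculation above can be reproduced in a few lines. This is only a sketch of the arithmetic in this section; the variable names are our own.

```python
# Design calculation of Section 2.4.3, step by step.

delay_spread = 200e-9             # tolerable delay spread [s]
guard_time = 4 * delay_spread     # 800 ns: safety factor of 4
symbol_time = 6 * guard_time      # 4.8 us: keeps guard-time loss < 1 dB
useful_time = symbol_time - guard_time   # 4.0 us
subcarrier_spacing = 1 / useful_time     # 250 kHz

bit_rate = 20e6
bits_per_symbol = bit_rate * symbol_time  # 96 bits per OFDM symbol

# Option 1: 16-QAM with rate-1/2 coding -> 2 information bits per subcarrier
n_sub_16qam = bits_per_symbol / 2
bw_16qam = n_sub_16qam * subcarrier_spacing

# Option 2: QPSK with rate-3/4 coding -> 1.5 information bits per subcarrier
n_sub_qpsk = bits_per_symbol / 1.5
bw_qpsk = n_sub_qpsk * subcarrier_spacing
```

Only the first option stays under the 15 MHz bandwidth budget, as the text concludes.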

### Frame Synchronization Techniques

This chapter is organized as follows. We will introduce the OFDM system model and the synchronization task in Section 3.1. Several typical CP-based frame synchronization techniques are described in Sections 3.2 and 3.3, and our proposed modified techniques are presented in Section 3.4.

### 3.1.1 System Description

Figure 3.1 OFDM system, transmitting subsequent blocks of N complex data symbols.

Figure 3.1 illustrates the baseband, discrete-time OFDM system model we investigate. The complex data subsymbols are modulated by means of an IDFT (IFFT) onto N parallel subcarriers. The resulting OFDM symbol is serially transmitted over a discrete-time channel, whose impulse response we assume to be shorter than L samples. At the receiver, the data are retrieved by means of a DFT (FFT).

An accepted means of avoiding ISI and preserving orthogonality between subcarriers is to copy the last L samples of the body of the OFDM symbol, called the cyclic prefix (CP), to the front to form the complete OFDM symbol, as mentioned in Subsection 2.2.1. The effective length of the OFDM symbol as transmitted is this CP plus the body (L + N samples long). The insertion of the CP can be shown to result in an equivalent parallel orthogonal channel structure that allows for simple channel estimation and equalization, as mentioned in Subsection 2.2.2. In spite of the loss of transmission power and bandwidth associated with the CP, these properties generally motivate its use.

Consider two uncertainties in the receiver of the OFDM symbol: the uncertainty in the arrival time of the OFDM symbol and the uncertainty in carrier frequency.

The first uncertainty, also called the frame error, is modeled as a delay in the channel impulse response, δ(k − θ), where θ is the integer-valued unknown arrival time of a symbol. The latter is modeled as a complex multiplicative distortion of the received data in the time domain, e^{j2πεk/N}, where ε denotes the difference between the transmitter and receiver oscillators as a fraction of the subcarrier spacing. Notice that all subcarriers experience the same shift ε. These two uncertainties and the AWGN thus yield the received signal

r(k) = s(k − θ) e^{j2πεk/N} + n(k)    (3.1)
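The received-signal model (3.1) can be sketched numerically. In the snippet below, the values of N, L, θ, ε and the noise level are illustrative choices, not values from the text.

```python
import numpy as np

# Minimal sketch of (3.1): one OFDM symbol with a cyclic prefix, delayed by
# theta samples, rotated by a relative carrier offset epsilon, plus AWGN,
# observed in a window of 2N+L samples.

rng = np.random.default_rng(0)
N, L = 64, 16                  # subcarriers and CP length (illustrative)
theta, epsilon = 23, 0.1       # unknown delay [samples] and frequency offset

X = (rng.choice([-1.0, 1.0], N) + 1j * rng.choice([-1.0, 1.0], N)) / np.sqrt(2)
body = np.fft.ifft(X) * np.sqrt(N)       # unit-average-power OFDM body
s = np.concatenate([body[-L:], body])    # CP + body: N + L samples

r = np.zeros(2 * N + L, dtype=complex)   # observation window
k = np.arange(theta, theta + len(s))
r[theta:theta + len(s)] = s * np.exp(2j * np.pi * epsilon * k / N)
sigma_n = 0.1                            # AWGN term n(k), ~20 dB SNR
r += sigma_n * (rng.standard_normal(len(r))
                + 1j * rng.standard_normal(len(r))) / np.sqrt(2)
```

Note that, within the CP, r(k) r*(k + N) has phase −2πε regardless of k, which is the property the estimators of this chapter exploit.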

Two other synchronization parameters are not accounted for here. First, an offset in the carrier phase may affect the symbol error rate in coherent modulation. If the data are differentially encoded, however, this effect is eliminated. An offset in the sampling frequency will also affect the system performance. We assume that such an offset is negligible. The effect of non-synchronized sampling is investigated in .

Now, consider the transmitted signal s(k). This is the IDFT of the data symbols x_k, which we assume are independent. Hence, s(k) is a linear combination of independent and identically distributed (i.i.d.) random variables. If the number of subcarriers is sufficiently large, we know from the central limit theorem that s(k) approximates a complex Gaussian process whose real and imaginary parts are independent. This process, however, is not white, since the presence of a CP yields a correlation between some pairs of samples that are spaced N samples apart. Hence, r(k) is not a white process either; but because of its probabilistic structure, it contains information about the time offset θ and the carrier frequency offset ε. This is the crucial observation that offers the opportunity for joint estimation of these parameters based on r(k).

Next, we investigate the influence of frame errors on the FFT output symbols when an AWGN channel is used. If the estimated start position of the frame is located within the guard interval, each FFT output symbol within the frame will be rotated by a different angle; from subcarrier to subcarrier, the angle increases linearly with the subcarrier frequency. If the estimated start position of the frame is located within the data interval, the sampled OFDM frame will contain some samples that belong to other OFDM frames. Therefore, each symbol at the FFT output is rotated and dispersed due to the ISI from other OFDM frames. The phase rotation imposed by a frame synchronization error can thus be corrected by appropriately rotating the received signal, but the dispersion of the signal constellation caused by ISI forms a BER floor. Another effect that we must take into account is the channel impairment.

The OFDM symbols are dispersed along the time axis due to the multipath effect. Consequently, the guard interval used to estimate the frame location is interfered with by the previous symbol.

A synchronizer cannot distinguish between phase shifts introduced by the channel and those introduced by symbol time delays. Time error requirements may range from the order of one sample (wireless applications, where the channel phase is tracked and corrected by the channel equalizer) to a fraction of a sample (in, e.g., high bit-rate xDSL, where the channel is static and essentially estimated only during startup). The effect of a frequency offset is a loss of orthogonality between the tones.

The resulting ICI has been investigated in .

In the following sections, we assume that the channel is non-dispersive and that the transmitted signal is only affected by AWGN. We will evaluate our techniques for both the AWGN channel and a time-dispersive channel by computer simulation in Chapter 5.

### 3.2.1 ML Estimation Based on Received Signal 

Figure 3.2 Structure of the OFDM signal s(k) with CP.

Assume that we observe 2N+L consecutive samples of r(k), as shown in Figure 3.2, and that these samples contain one complete (N+L)-sample OFDM symbol. The position of this symbol within the observed block of samples, however, is unknown because the channel delay θ is unknown to the receiver. Define the index sets

Define the index sets I = {θ, …, θ + L − 1} and I′ = {θ + N, …, θ + N + L − 1}; the set I contains the indices of the CP samples and I′ the indices of their copies. Consider the probability density function (pdf) of the 2N + L observed samples in r̂ given the arrival time θ and the carrier frequency offset ε. In the following, we will drop all additive and positive multiplicative constants that show up in the expression of the log-likelihood function, since they do not affect the maximizing argument. Moreover, we drop the conditioning on (θ, ε) for notational clarity. Using the correlation properties of the observations r̂, the log-likelihood function can be written as

Λ(θ, ε) = log f(r̂ | θ, ε) = log( ∏_{k∈I} f(r(k), r(k + N)) · ∏_{k∉I∪I′} f(r(k)) )    (3.3)

where f(·) denotes the pdf, for both 1-D and 2-D distributions. The second product ∏ f(r(k)) in (3.3) is independent of θ (since the product is over all k) and of ε (since the density f(r(k)) is rotationally invariant). Since the ML estimation of θ and ε is the argument maximizing Λ(θ, ε), we may omit this factor. Under the assumption that r̂ is a jointly Gaussian vector, (3.3) is shown in Appendix A to be

Λ(θ, ε) = |γ(θ)| cos(2πε + ∠γ(θ)) − ρ Φ(θ)    (3.4)

where ∠ denotes the argument of a complex number,

γ(m) = Σ_{k=m}^{m+L−1} r(k) r*(k + N)    (3.5)

Φ(m) = (1/2) Σ_{k=m}^{m+L−1} ( |r(k)|² + |r(k + N)|² )    (3.6)

and

ρ = SNR / (SNR + 1)    (3.7)

is the magnitude of the correlation coefficient between r(k) and r(k + N); the asterisk * indicates the conjugate of a complex value, and SNR = σ_s²/σ_n². The first term in (3.4) is the weighted magnitude of γ(θ), which is a sum of L consecutive correlations between pairs of samples spaced N samples apart. The weighting factor depends on the frequency offset. The term Φ(θ) is an energy term, independent of the frequency offset ε. Notice that its contribution depends on the SNR (through the weighting factor ρ). The maximization of the log-likelihood function can be performed in two steps:

max_{θ,ε} Λ(θ, ε) = max_θ max_ε Λ(θ, ε) = max_θ Λ(θ, ε̂_ML(θ)).    (3.8)

The maximum with respect to the frequency offset ε is obtained when the cosine term in (3.4) equals one. This yields the ML estimation of ε

ε̂_ML(θ) = −(1/2π) ∠γ(θ) + n    (3.9)

where n is an integer. Notice that, by the periodicity of the cosine function, several maxima are found. We assume that an acquisition, or rough estimate, of the frequency offset has been performed and that |ε| < 1/2; thus, n = 0. Since cos(2πε̂_ML(θ) + ∠γ(θ)) = 1, the log-likelihood function of θ (which is the compressed log-likelihood function with respect to ε) becomes

Λ(θ, ε̂_ML(θ)) = |γ(θ)| − ρ Φ(θ)    (3.10)

and the joint ML estimator of θ and ε given r(k) becomes

θ̂_ML = arg max_θ { |γ(θ)| − ρ Φ(θ) }    (3.11)

ε̂_ML = −(1/2π) ∠γ(θ̂_ML).    (3.12)

Notice that only two quantities affect the log-likelihood function (and thus the performance of the estimator): the number of CP samples L and the correlation coefficient ρ given by the SNR. The former is known at the receiver, and the latter can be fixed. Basically, the quantity γ(θ) provides the estimates of θ and ε. The structure of the estimator in an OFDM receiver is shown in Figure 3.3.

Figure 3.3 Structure of the ML estimator.
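The estimator of Figure 3.3 can be sketched directly from (3.5), (3.6) and (3.10)-(3.12). This is a straightforward reference implementation, not an optimized receiver structure; the SNR (and hence ρ) is assumed known.

```python
import numpy as np

# Sketch of the joint ML time/frequency estimator (3.11)-(3.12).
# gamma and phi follow (3.5)-(3.6); rho = SNR/(SNR+1) as in (3.7).

def ml_sync(r, N, L, snr):
    rho = snr / (snr + 1.0)
    n_pos = len(r) - N - L                    # candidate start positions
    gamma = np.empty(n_pos, dtype=complex)
    phi = np.empty(n_pos)
    for m in range(n_pos):
        a, b = r[m:m + L], r[m + N:m + N + L]
        gamma[m] = np.sum(a * np.conj(b))                   # (3.5)
        phi[m] = 0.5 * np.sum(np.abs(a)**2 + np.abs(b)**2)  # (3.6)
    metric = np.abs(gamma) - rho * phi                      # (3.10)
    theta_hat = int(np.argmax(metric))                      # (3.11)
    eps_hat = -np.angle(gamma[theta_hat]) / (2 * np.pi)     # (3.12)
    return theta_hat, eps_hat
```

Note that the frequency estimate is read off from the phase of γ at the timing estimate, exactly as the two-step maximization (3.8) prescribes.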

### 3.2.2 Peak-Picking Algorithm 

The peak-picking (PP) algorithm we introduce in this subsection is based on the ML estimation described in 3.2.1. We can see from (3.10) that the first term (the correlation part) |γ(θ)| dominates the log-likelihood function, because the second term (the energy part) is almost the same for different θ. We can therefore reformulate the ML estimator in terms of a correlation function G(n), which is given by

G(n) = Σ_{k=0}^{L−1} r(n + k) r*(n + k + N)    (3.13)

The correlation function G(n) is used for both frequency synchronization and frame timing synchronization. It represents the correlation of two sequences of L samples in length, separated by N samples, in the received sample sequence, as shown in Figure 3.4. The maximum-magnitude sample of G(n) is expected to coincide with the first sample of the current OFDM symbol. At this position, the samples of the CP and their copies in the current OFDM symbol are perfectly aligned in the summation window.

Therefore, the estimate θ̂_m of the frame timing for the m-th OFDM symbol can be given as

θ̂_m = arg max_{θ∈Θ} |G_m(θ)|    (3.14)

The maximum value of the correlation function is found over a window Θ = {θ | 1 ≤ θ ≤ N + L} for each OFDM symbol (window boundaries are not normally aligned with those of the OFDM symbols) in the receiver. The frequency error ε̂_m is estimated using the phase of the correlation function at θ = θ̂_m, analogously to (3.12).

Figure 3.4 Computation of correlation function G(n) using an L-length shift register.
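The shift-register computation of Figure 3.4 can be sketched as a recursive update: after the first window, each new point of G(n) needs only one new complex multiplication, since one product enters the running sum and one leaves. The index convention (a window of L products starting at n) is our reading of (3.13).

```python
import numpy as np

# Sketch of the Figure 3.4 computation of G(n): a running sum over L products
# r(k) r*(k+N), updated recursively point by point.

def sliding_correlation(r, N, L):
    n_pts = len(r) - N - L + 1
    G = np.empty(n_pts, dtype=complex)
    G[0] = np.sum(r[:L] * np.conj(r[N:N + L]))        # first window: L products
    for n in range(1, n_pts):
        G[n] = (G[n - 1]
                - r[n - 1] * np.conj(r[n - 1 + N])             # product leaving
                + r[n + L - 1] * np.conj(r[n + L - 1 + N]))    # product entering
    return G

# Frame timing as in (3.14): theta_hat = argmax |G(n)|
```

This is exactly the buffer-of-size-L trick that later gives the PP algorithm its L + N multiplications-per-symbol complexity.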

### 3.2.3 Averaging and Peak-Picking Algorithm 

The choice of M, the number of windows (symbols) to average over, in the averaging and peak-picking (APP) algorithm mainly depends on the following two factors:

(i) Time interval (number of OFDM symbols) over which the arrival time θ and frequency offset ε can be considered to be constant,

(ii) Restriction on computational complexity.

Factor (i) is tightly constrained in time-fading channels because of the time-variant channel delay θ and frequency offset ε. However, in a non-time-fading channel (still with multipath and frequency-selective fading), this constraint can be significantly relaxed, and θ and ε can be assumed to be constant over significantly long periods. Although this scenario allows large values of M for averaging, the computational complexity becomes a major problem. We will introduce several low-complexity solutions to this problem in the following sections.

### 3.3.1 Complex-Quantization Algorithm 

In the complex-quantization (CQ) algorithm, we quantize the in-phase and quadrature components of r(k) to form the complex sequence c(k) = Q[r(k)], k = 1, …, 2N + L.

The signal c(k) is a complex bitstream, i.e., c(k) can only take one of the four different values in the alphabet

A = {a_0, a_1, a_2, a_3} = {1 + j, −1 + j, −1 − j, 1 − j},    (3.19)

see Figure 3.5. The sequence c(k) can thus be represented by 2 bits, one for its real and one for its imaginary part. In spite of this quantization, c(k) still contains information about θ. A sample c(k), k ∈ I, is correlated with c(k + N), while all samples c(k), k ∉ I ∪ I′, are independent.

Figure 3.5 Geometric representation of the signal set A, and the quadrants Q_i, i = 0, 1, 2, 3, of the complex plane.

The probability of all 2N + L samples of c(k) being observed simultaneously, given a certain value of θ, can be separated into the marginal probabilities of each sample being observed, except for those samples c(k), k ∈ I ∪ I′, which are pairwise correlated.

The ML estimator of θ given c(k), θ̂_c, maximizes this function with respect to θ.

For k ∉ I ∪ I′, p₂(c(k)) = 1/4, since r(k) is a zero-mean Gaussian process with independent real and imaginary parts. Hence, the second product of (3.20) is a constant, which can be omitted. The ML estimate θ̂_c becomes

and ∗ denotes convolution. To obtain the log-likelihood function, we thus filter the resulting sequence by means of a moving sum of length L; see Figure 3.6. The ML estimation of θ selects the peaks of this function.

Figure 3.6 Look-up table implementation of the complex-quantization ML estimator.

In Appendix B, the complex-quantization ML estimator based on c(k) is determined by calculating p₁(g(k)). Moreover, it is shown that taking the real part of the correlation between c(k) and c(k − N), instead of applying the non-linearity g(k), yields an equivalent and attractive structure for the ML estimator, as illustrated in Figure 3.7. The ML estimate θ̂_c becomes

In most applications the arrival time θ is approximately constant over several, say M, received frames. This essentially means that instead of just one frame r(k), M frames are observed simultaneously, each containing information about the unknown θ. Generalizing the preceding discussion, it can be shown that the log-likelihood function for θ given c_i(k), i = 1, …, M, becomes

We call this method the averaging and complex-quantization (ACQ) algorithm.

Figure 3.7 Equivalent implementation of the complex-quantization ML estimator.
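The equivalent structure of Figure 3.7 is simple enough to sketch end to end: quantize to the sign alphabet (3.19), form Re{c(k) c*(k − N)}, and smooth with a moving sum of length L. The indexing of the output metric (peaking at the CP start position) is our convention.

```python
import numpy as np

# Sketch of the equivalent CQ estimator of Figure 3.7: a 2-bit quantizer,
# the real correlation Re{c(k) c*(k-N)}, and a length-L moving sum.

def quantize(r):
    # Q[.]: keep only the signs of the I and Q components -> alphabet (3.19)
    return np.sign(r.real) + 1j * np.sign(r.imag)

def cq_metric(r, N, L):
    c = quantize(r)
    corr = (c[N:] * np.conj(c[:-N])).real   # Re{c(k) c*(k-N)}, values in {-2, 0, 2}
    return np.convolve(corr, np.ones(L), mode="valid")   # moving sum of length L
```

Because c(k) takes only four values, the per-sample correlation can be realized as a small look-up table in hardware, which is the attraction of this structure.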

### 3.3.2 Smoothing Complex-Quantization Algorithms 

As mentioned in Subsection 3.1.2, if the estimated start position of the frame is located within the data interval or a multipath effect exists, the sampled OFDM frame will contain some samples that belong to other OFDM frames. Therefore, each symbol at the FFT output is rotated and dispersed due to the ISI from other frames.

A solution to remedy this problem is to use different smoothing algorithms in place of the moving-sum scheme shown in Figure 3.7. Instead of using the moving sum of Figure 3.7, which weights Re{c(k) c*(k − N)} equally, an exponentially decaying weighting function is applied to Re{c(k) c*(k − N)}. We consider four smoothing algorithms here. Let L be the number of samples in a guard interval; the log-likelihood functions at time instant θ for these algorithms are given as follows.

Moving average (MA):

Note that the moving average (MA) scheme is identical to the CQ algorithm presented in 3.3.1. The MA, SMA, and EWMA algorithms can be realized as FIR filters, and the EWA algorithm can be realized as an IIR filter. The weighting factor w for both the EWMA and EWA schemes can be intentionally chosen such that w = 1 − 2^{−M}, where M is a positive integer. By appropriately choosing the weighting factor, the multiplication operation within the summing scheme can be replaced by an adder and a shifter.

### 3.3.3 Global Search Algorithm 

A low-complexity frame synchronization technique based on discrete stochastic approximation algorithms, which can be considered a modification of the PP algorithm, is described in this subsection. This technique avoids evaluating the correlation function G(θ) for all samples within a window, as the PP algorithm does, thereby achieving a significant reduction in computational cost. It applies the idea of discrete stochastic optimization to reduce the number of complex multiplications while achieving synchronization. This iterative method is given below in detail. Here, after m iterations, θ_m is the current point, W_m(θ), for all θ ∈ Θ, represents the number of times the synchronization algorithm has visited the point θ so far, and θ*_m is the point that the algorithm has visited most often so far.

Algorithm: Global Search (GS) Algorithm

Step 0. Select a starting point θ₀ ∈ Θ. Let W₀(θ₀) = 1 and W₀(θ) = 0 for all θ ∈ Θ, θ ≠ θ₀. Let m = 0 and θ*_m = θ₀. Go to Step 1.

The above algorithm resembles an adaptive filtering (LMS) algorithm in the sense that it generates a sequence of parameter values, where each new parameter value is obtained from the old one by moving in a good direction, and in the sense that it converges to the global optimizer of the objective function.

For the above procedure, iterations can be performed for m = 1, 2, …, M_GS, where M_GS is the number of OFDM symbols over which θ and ε can be assumed to be constant. In Step 0, the initial value θ = θ₀ can be made equal to the peak position of G(θ) (obtained using a full search) for the first window. Note that, in order to ensure real-time demodulation of OFDM symbols, the estimation of θ and ε should be performed in real time for each symbol. Accordingly, the GS algorithm takes as its estimate the most-visited point θ*_m, rather than (3.14) evaluated for all points within the window of size L + N. According to (3.13), evaluating G(θ) for a given θ involves L complex multiplications. A brute-force computation therefore requires L(L + N) complex multiplications to obtain the complete correlation function for a window of L + N sample points.

However, when G(θ) is evaluated for consecutive points θ within the window 1 ≤ θ ≤ L + N, the computation can be performed as depicted in Figure 3.4 using a buffer of size L, reducing the per-point cost to one complex multiplication. Therefore, the complexity C_PP of the PP algorithm becomes L + N multiplications per symbol. In the GS algorithm, the correlation function G(θ) has to be evaluated for only two points within a window of 1 ≤ θ ≤ L + N. This is associated with the evaluation of

R_m = |G(θ′_m)| − |G(θ_m)|

in Step 2 of the GS algorithm. However, since these two points can generally be far apart within the window, the evaluation of G(θ) costs L multiplications per point. Therefore, the complexity C_GS of the GS algorithm becomes 2L multiplications per symbol. Thus, the percentage reduction in computational cost E_GS of the GS algorithm can be given as

E_GS = (1 − C_GS / C_PP) × 100% = (1 − 2L/(L + N)) × 100%.    (3.33)
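The GS iteration itself can be sketched as below. The text spells out only Step 0 and the comparison R_m of Step 2, so the accept rule and visit-count bookkeeping here follow standard discrete stochastic search and are our assumptions; precomputing all of |G(θ)| just keeps the sketch short, since the point of GS is that only two entries are needed per iteration.

```python
import numpy as np

# Hedged sketch of a GS iteration over a window Theta of |G(theta)| values.
# G_abs[theta] = |G(theta)|; visits[] plays the role of W_m(theta).

def gs_search(G_abs, n_iter, theta0, rng):
    visits = np.zeros(len(G_abs), dtype=int)   # W_m(theta)
    theta = theta0                             # current point theta_m
    visits[theta0] = 1
    best = theta0                              # theta*_m: most-visited so far
    for _ in range(n_iter):
        cand = int(rng.integers(len(G_abs)))   # Step 1: uniform candidate
        if cand != theta and G_abs[cand] - G_abs[theta] > 0:   # Step 2: R_m > 0
            theta = cand                       # move to the better point
        visits[theta] += 1
        if visits[theta] > visits[best]:
            best = theta                       # update the estimate
    return best
```

Once the true peak is visited, no candidate can displace it, so its visit count grows every iteration and θ*_m locks onto it.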

### 3.4.1 Local Search Algorithm

Consider the GS algorithm in 3.3.3: the accuracy of the GS algorithm heavily depends on the initial estimate, for the reason shown in 3.3.1. Assume that the estimate θ_m is incorrect but close to the correct sample point. If we select the new sample point θ′_m, different from θ_m, in [θ_m − R, θ_m + R], where the variable R is an integer depending on the desired search range, then the probability that the correct sample point is selected will be higher than in the GS algorithm.

Besides, W_m(θ_m) will not tend to be a large value by the time the correct sample point is selected. Consequently, it takes a shorter time for the incorrect estimate to be replaced by the correct sample point than in the GS algorithm. This local search (LS) algorithm is given below in detail. Note that only Step 1 of the LS algorithm differs from the GS algorithm.

Algorithm: Local Search (LS) Algorithm

Step 0. Select a starting point θ₀ ∈ Θ. Let W₀(θ₀) = 1 and W₀(θ) = 0 for all θ ∈ Θ, θ ≠ θ₀. Let m = 0 and θ*_m = θ₀. Go to Step 1.

Step 1. Given the value of θ_m, generate a uniform random variable θ′_m, independently of the past, in [θ_m − R, θ_m + R], where R is an integer, so that for all θ′_m ∈ [θ_m − R, θ_m + R], θ′_m ≠ θ_m, each point is selected with probability 1/2R.
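Step 1 can be sketched as below. How candidates falling outside the window Θ are handled is not specified in the text, so the clipping to [lo, hi] here is our assumption.

```python
import numpy as np

# Sketch of Step 1 of the LS algorithm: draw theta'_m uniformly from the 2R
# points of [theta_m - R, theta_m + R] excluding theta_m itself (each with
# probability 1/2R). Clipping to the window edges is our assumption.

def ls_candidate(theta_m, R, lo, hi, rng):
    cand = theta_m
    while cand == theta_m:                  # rejection loop: exclude theta_m
        cand = int(rng.integers(theta_m - R, theta_m + R + 1))
    return min(max(cand, lo), hi)
```

The remaining steps (the comparison R_m and the visit-count update) are identical to the GS algorithm.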

As mentioned in 3.3.3, the GS algorithm, compared with the PP algorithm, saves a great deal of computational cost. However, the accuracy of the GS algorithm heavily depends on the initial estimate. If the initial estimate is incorrect, it is likely to take a long time to achieve accurate synchronization. This can be seen as follows. Assume that the initial estimate θ₀ is incorrect. Since a new sample point is selected randomly in each iteration, the probability that the correct sample point is selected is 1/(L + N − 1), which implies that the expected number of iterations needed to select the correct sample point is L + N − 1. Besides, W_m(θ_m) tends to be a large value by the time the correct sample point is selected. Consequently, it takes a long time for the incorrect estimate to be replaced by the correct sample point. This phenomenon may make the algorithm unacceptable for time-varying channels. In this subsection, we present a modified global search (MGS) algorithm that yields better performance than the PP algorithm and requires only slightly more computational cost.

The basic idea of MGS algorithm is to maintain a few good candidates, in
