Blind Maximum-Likelihood Carrier-Frequency-Offset Estimation for Interleaved OFDMA Uplink Systems


Hung-Tao Hsieh and Wen Rong Wu, Member, IEEE

Abstract—Blind maximum-likelihood (ML) carrier-frequency-offset (CFO) estimation is considered to be difficult in interleaved orthogonal frequency-division multiple-access (OFDMA) uplink systems. This is because multiple CFOs have to be simultaneously estimated (each corresponding to a user's carrier), and an exhaustive multidimensional search is required. The computational complexity of the search may be prohibitively high. Methods such as multiple signal classification (MUSIC) and estimation of signal parameters via rotational invariance techniques (ESPRIT) have been proposed as alternatives. However, these methods do not maximize the likelihood function, and their performance is not optimal. In this paper, we propose a new method to solve the problem. With our formulation, the likelihood function can be maximized, and the optimum solution can be obtained by finding the roots of a polynomial. Compared with the exhaustive search, the computational complexity is reduced dramatically. Simulations show that the performance of the proposed method approaches the Cramér–Rao lower bound.

Index Terms—Carrier-frequency offset (CFO), multiuser system, orthogonal frequency-division multiple access (OFDMA).

I. INTRODUCTION

Emerging as a promising technology for next-generation broadband wireless networks, orthogonal frequency-division multiple access (OFDMA) has received a considerable amount of research interest recently [3]–[10]. An appealing feature of OFDMA is that the transmission signals of different users are orthogonal, so that multiple-access interference (MAI) can be avoided. However, if the carrier-frequency offsets (CFOs) between the transmitters and the receiver are not properly estimated and compensated, then the orthogonality will be destroyed, and intercarrier interference and MAI will arise. In OFDMA downlink systems, the signals for different users are multiplexed by the same transmitter, and the receiver of each user can estimate and compensate its own CFO easily. In such a scenario, methods for CFO estimation in single-user orthogonal frequency-division-multiplexing (OFDM) systems [1], [2] can directly be applied. However, in OFDMA uplink systems, all of the users' CFOs have to be simultaneously estimated at the base-station (BS) receiver, which is considered to be a more challenging problem. Many methods have been developed for CFO estimation in OFDMA uplink systems [3]–[10]. Those methods can roughly be classified into two categories: methods 1) using and 2) not using training sequences. Methods using training sequences insert a known preamble in front of each data packet, facilitating CFO estimation at the BS receiver [3]–[6]. Methods not using training sequences, which are also referred to as blind methods, manipulate the subcarrier assignment scheme such that the CFO for each user can individually or jointly be estimated [7]–[10] at the BS receiver.

Manuscript received April 2, 2010; revised August 9, 2010 and October 12, 2010; accepted October 14, 2010. Date of publication November 9, 2010; date of current version January 20, 2011. This work was supported in part by the National Chiao Tung University-MediaTek joint research program. The review of this paper was coordinated by Dr. H. Lin.

The authors are with the Department of Communication Engineering, National Chiao Tung University, Hsinchu 300, Taiwan (e-mail: dow.cm93g@nctu.edu.tw; wrwu@faculty.nctu.edu.tw).

Digital Object Identifier 10.1109/TVT.2010.2090179

Maximum-likelihood (ML) methods were proposed to estimate the CFOs in [3]–[6] for training-based systems. It turns out that a multidimensional (MD) exhaustive search is required to obtain the ML solution, and the computational complexity can be high. As a result, low-complexity ML CFO estimators were then developed. In [3], the CFOs were estimated by maximizing the mean likelihood function, and an importance sampling method [13] originally proposed for single-carrier systems was applied. In [4], an iterative scheme, which is referred to as the alternating projection frequency estimator (APFE), is employed to conduct 1-D searches for the ML solution. Although these solutions can approach the Cramér–Rao bound (CRB), the required computational complexity is still high. To solve the problem, a simplified method, i.e., the approximate APFE (AAPFE), was then proposed [4]. However, AAPFE suffers considerable performance degradation. To improve the performance of AAPFE, a divide-and-update frequency estimator [5] was later developed. Another suboptimal approach with complexity lower than APFE was reported in [6]. This scheme achieves complexity reduction by approximating the inverse of a CFO-dependent matrix with that of a predetermined matrix. To achieve better results, the system must have a large number of subcarriers.

When training sequences are not available, the CFO for each user cannot easily be estimated. Fortunately, this problem can be overcome with a proper subcarrier assignment scheme, which leads to subband-based or interleaved-based estimators. The subband-based CFO estimators [7], [8] require that each user be assigned a set of consecutive subcarriers and that the subcarrier sets for different users be well separated in the frequency domain. With this scheme, a filter bank can be used at the BS to extract each user's signal, and then, conventional CFO estimation methods can be applied. In interleaved OFDMA systems, each user's time-domain signal is periodic. With this property, the CFOs for all users can jointly be estimated by the multiple signal classification (MUSIC) [9] or the estimation of signal parameters via rotational invariance technique (ESPRIT) methods [10]. Although the computational complexity of these methods is low, the CRB cannot be achieved, and the solutions are not optimal.

In this paper, we investigate the blind CFO estimation problem in the interleaved OFDMA uplink system. Our objective is to develop a low-complexity ML CFO estimation method. The main obstacle in the ML method is that there is an inverted correlation matrix in the likelihood function. The CFOs in the likelihood function become intractable after matrix inversion. Using the matrix inversion lemma, we first transform the correlation matrix into a matrix with smaller size. Then, we express the matrix with a series expansion. By properly truncating the expansion, we can obtain a closed-form expression and solve for the optimum CFOs with a root-finding method. Simulations show that the performance of the proposed method can approach the CRB. The computational complexity of the proposed algorithm is of the same order as that of ESPRIT.

The rest of this paper is organized as follows: Section II briefly describes the system model and derives the proposed CFO estimation method. Section III shows the performance analysis for the proposed method. Section IV provides simulation results and complexity analysis. Finally, Section V draws the conclusions. For convenience, the notations used in this paper are defined as follows: $(\cdot)^H$ and $(\cdot)^*$ denote the Hermitian transpose and complex conjugate operations of a matrix, respectively; $\mathrm{mod}(\cdot, C)$ denotes modulo $C$; $\mathrm{Im}(\cdot)$ and $\mathrm{Re}(\cdot)$ denote the imaginary-part-taking and the real-part-taking operations, respectively; $E\{\cdot\}$ denotes the expectation operation; $\delta(\cdot)$ denotes a Dirac delta function; $(\cdot)_p$ denotes the $p$th element of a vector; $(\cdot)_{p,q}$ denotes the $(p, q)$th element of a matrix; $\mathbf{0}_l$ denotes an $l$-by-1 column zero vector; $\mathbf{I}$ denotes an identity matrix; $\mathrm{diag}(\mathbf{x})$ denotes a diagonal matrix with the diagonal entries of $\mathbf{x}$; $\det(\cdot)$ denotes the determinant of a matrix; $\|\cdot\|$ denotes the Frobenius norm of a matrix; and $\mathrm{tr}(\cdot)$ denotes the trace of a matrix.

II. PROPOSED CARRIER-FREQUENCY-OFFSET ESTIMATION METHOD

A. Signal Model for Interleaved OFDMA Uplink System

In an OFDMA system, let $M$ users share the $N_s$ subcarriers of an OFDM symbol, and let the $M$ users simultaneously transmit their data streams. The subcarriers are divided into $Q$ subchannels, and each subchannel has $N = N_s/Q$ subcarriers. Each user occupies a specific subchannel, and the subcarriers assigned to user $m$ are denoted as $s_k^m$, where $k \in \Upsilon_m$. Here, $\Upsilon_m$ denotes a subset of subcarrier indices. For an interleaved OFDMA system [9], [10], the subset for the $m$th user is defined as $\Upsilon_m = \{q_m, Q + q_m, \ldots, q_m + (N-1)Q\}$, where $q_m$ is the subchannel index, and $q_m \in \{0, 1, \ldots, Q-1\}$. In the system, it is assumed that $\Upsilon_m \cap \Upsilon_k = \phi$ for $m \neq k$, where $\phi$ denotes the empty set. In our system, we assume that the sequence each user transmits is unknown to the BS and that $M < Q$.

Consider a specific OFDMA symbol and denote the frequency-domain signal that user $m$ transmits as an $N_s \times 1$ vector $\mathbf{u}_m$. Note that the elements of $\mathbf{u}_m$ are nonzero only in the designated subcarriers, i.e., $\Upsilon_m$. Taking the inverse discrete Fourier transform (IDFT) of $\mathbf{u}_m$, we obtain the time-domain signal for user $m$, which is denoted as $\bar{\mathbf{s}}_m = [\bar{s}_m(0), \ldots, \bar{s}_m(N_s-1)]^T$. After inserting a cyclic prefix (CP) of length $L$ at the beginning of the symbol, user $m$ serially transmits the resultant signal through a wireless channel. Let the channel response from user $m$ to the BS receiver be denoted as $h_m(l)$, $l = 0, \ldots, L_m-1$, where $L_m$ is the channel length, and $L_m \le L$. In addition, let the normalized CFO for user $m$ be denoted as $\varepsilon_m$. Then, the CP-removed received OFDMA symbol at the BS can be expressed as

$$y(k) = \sum_{m=1}^{M} \exp(j2\pi\varepsilon_m k/N_s) \sum_{l=0}^{L_m-1} h_m(l)\,\bar{s}_m(k-l) + \eta(k) \tag{1}$$

where $k = 0, \ldots, N_s-1$, and $\eta(k)$ represents the additive white Gaussian noise (AWGN) with a variance of $\sigma_\eta^2$.

As mentioned, subchannel $q_m$ is assigned to user $m$ in the interleaved OFDMA system. This is equivalent to saying that user $m$ is assigned to subchannel zero and that a CFO of $q_m$ is introduced. Therefore, the received noiseless symbol from user $m$ can be rewritten as

$$\bar{x}_m(k) = \exp(j2\pi\varepsilon_m k/N_s) \sum_{l=0}^{L_m-1} h_m(l)\,\bar{s}_m(k-l) = \exp\big(j2\pi(\varepsilon_m + q_m)k/N_s\big) \sum_{l=0}^{L_m-1} h_m(l)\,s_m(k-l) = w^{\varepsilon_{e,m} k}\, x_m(k) \tag{2}$$

where $w = \exp(j2\pi/N_s)$, $\varepsilon_{e,m} = \varepsilon_m + q_m$, $x_m(k) = \sum_{l=0}^{L_m-1} h_m(l)\,s_m(k-l)$, and $s_m(k)$ is the transmitted signal of user $m$ if subchannel zero is assigned. The term $\varepsilon_{e,m}$ denotes the effective CFO for user $m$. It includes the virtual CFO caused by the subchannel $q_m$. Note that the periodicity of the transmitted sequence still remains after it is passed through the channel. Since the time-domain signal has a period of $N$, we can make an index transformation by letting $k = (p-1)N + n$, where $p = 1, \ldots, Q$, and $n = 0, \ldots, N-1$. With the transformation, we can convert the $k$th sample of a signal into the $n$th sample in the $p$th period. The $n$th sample in each period of a signal can then be extracted to form a vector. Then, we have

$$\mathbf{y}(n) = \mathbf{U}\mathbf{D}(n)\mathbf{x}(n) + \boldsymbol{\eta}(n) \tag{3}$$

where $\mathbf{y}(n) = [y(n), y(n+N), \ldots, y(n+(Q-1)N)]^T = [y_1(n), y_2(n), \ldots, y_Q(n)]^T$, $\mathbf{U}$ is a $Q$-by-$M$ matrix with $(\mathbf{U})_{p,q} = w^{\varepsilon_{e,q}(p-1)N}$, $\mathbf{D}(n) = \mathrm{diag}([w^{n\varepsilon_{e,1}}, \ldots, w^{n\varepsilon_{e,M}}]^T)$, $\mathbf{x}(n) = [x_1(n), \ldots, x_M(n)]^T$, and $\boldsymbol{\eta}(n) = [\eta(n), \eta(n+N), \ldots, \eta(n+(Q-1)N)]^T = [\eta_1(n), \eta_2(n), \ldots, \eta_Q(n)]^T$. We will use (3) as our signal model in the derivation of the ML CFO estimate.
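The stacking behind (3) can be checked numerically. The following is a minimal sketch, assuming NumPy and purely illustrative values for the subchannel indices and CFOs (none of these numbers come from the paper). It builds noiseless periodic signals $x_m(k)$, applies the effective CFOs as in (2), and confirms that the $n$th samples of all periods equal $\mathbf{U}\mathbf{D}(n)\mathbf{x}(n)$.

```python
import numpy as np

rng = np.random.default_rng(0)
Ns, N = 128, 8                 # total subcarriers, subcarriers per subchannel
Q = Ns // N                    # number of subchannels (periods per symbol)
q = np.array([2, 6])           # hypothetical subchannel indices q_m
eps = np.array([0.1, -0.3])    # hypothetical normalized CFOs
eps_e = eps + q                # effective CFOs, as defined below (2)
M = len(q)
w = np.exp(2j * np.pi / Ns)

# x_m(k): noiseless channel output for subchannel zero, periodic with period N
x_period = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
x_full = np.tile(x_period, (1, Q))                       # shape (M, Ns)
k = np.arange(Ns)
y = np.sum(w ** (eps_e[:, None] * k) * x_full, axis=0)   # noiseless sum of (2)

# Row p-1 of Y holds samples k = (p-1)N + n, so column n is the vector y(n)
Y = y.reshape(Q, N)

# U has entries w^{eps_e,q (p-1) N}; the row index here runs from 0 to Q-1
U = w ** (eps_e[None, :] * np.arange(Q)[:, None] * N)

def model(n):
    """Right-hand side of (3) without noise."""
    D = np.diag(w ** (n * eps_e))
    return U @ D @ x_period[:, n]

max_err = max(np.abs(Y[:, n] - model(n)).max() for n in range(N))
```

In this sketch `max_err` stays at the level of floating-point rounding, which is what the index transformation $k = (p-1)N + n$ predicts.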

B. Proposed Method

To the best of our knowledge, blind ML CFO estimation has not been studied before in OFDMA uplink systems. Here, we propose a method to solve the problem. For interleaved OFDMA uplink systems, the transmitted time-domain signal is obtained from the IDFT of its frequency-domain signal. From the central limit theorem, we know that if the number of subcarriers is reasonably large, then the corresponding time-domain signal can be approximated as a white Gaussian sequence [2]. Similar to [3], we assume that each user is under perfect power control; therefore, signals arrive at the BS with equal average power. If we further assume that each channel tap independently experiences Rayleigh fading and all users' signals are white and independent of each other, then the received sequence $y(k)$ in (1) can also be approximated as a Gaussian sequence [2] with a variance of $M\sigma_x^2 + \sigma_\eta^2$, where $\sigma_x^2 = E\{|x_m(n)|^2\}$. Let $-0.5 < \varepsilon_m < 0.5$ and $f(\cdot)$ be a probability density function. Then, we can explicitly write the log-likelihood function, as shown in [2], as

$$\Lambda(\boldsymbol{\varepsilon}) = \ln \prod_{n=0}^{N-1} f(\mathbf{y}(n)). \tag{4}$$

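Define $\mathbf{R}_y = E\{\mathbf{y}(n)\mathbf{y}^H(n)\}$, and

$$(\mathbf{R}_y)_{p,q} = \sigma_\eta^2\,\delta(p-q) + \sigma_x^2\,\Gamma(p,q) \tag{5}$$

where

$$\Gamma(p,q) = \sum_{m=1}^{M} w^{\varepsilon_{e,m} N (p-q)}. \tag{6}$$

Thus, we can express $f(\mathbf{y}(n))$ as [19], [20]

$$f(\mathbf{y}(n)) = \big(\pi^Q \det(\mathbf{R}_y)\big)^{-1} \exp\big(-\mathbf{y}(n)^H \mathbf{R}_y^{-1} \mathbf{y}(n)\big). \tag{7}$$

The log-likelihood function can be expressed as

$$\Lambda(\boldsymbol{\varepsilon}) = \sum_{n=0}^{N-1} \Big[-Q\ln(\pi) - \ln\big(\det(\mathbf{R}_y)\big) - \mathbf{y}(n)^H \mathbf{R}_y^{-1} \mathbf{y}(n)\Big]. \tag{8}$$

Let $\mathbf{u}(n) = \mathbf{U}\mathbf{D}(n)\mathbf{x}(n)$. Then, $\mathbf{y}(n) = \mathbf{u}(n) + \boldsymbol{\eta}(n)$. As assumed, the transmitted sequences are independent of each other; therefore, $\mathbf{R}_y = \sigma_x^2\mathbf{U}\mathbf{U}^H + \sigma_\eta^2\mathbf{I}$. Note that $\mathbf{U}$ is a $Q$-by-$M$ matrix. To use (8) and solve for the $M$ unknown CFOs, $\mathbf{U}$ must be a full-rank tall matrix. From (3), we see that $\mathbf{U}$ is a Vandermonde matrix ($\varepsilon_{e,m} \neq \varepsilon_{e,n}$ if $m \neq n$) [17]. Since we assume that $M < Q$, the full-rank property then holds. As a result, (8) can be applied.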

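To find the maximum of the log-likelihood function for the $i$th user, we take a derivative with respect to $\varepsilon_{e,i}$ [16], i.e.,

$$\frac{\partial}{\partial\varepsilon_{e,i}}\Lambda(\boldsymbol{\varepsilon}) = -N\cdot\mathrm{tr}\left(\mathbf{R}_y^{-1}\frac{\partial}{\partial\varepsilon_{e,i}}\mathbf{R}_y\right) - \sum_{n=0}^{N-1}\mathbf{y}(n)^H\left(\frac{\partial}{\partial\varepsilon_{e,i}}\mathbf{R}_y^{-1}\right)\mathbf{y}(n). \tag{9}$$

We use the matrix inversion lemma [18] to write the inverse of $\mathbf{R}_y$ as

$$\mathbf{R}_y^{-1} = \sigma_\eta^{-2}\mathbf{I} - \sigma_\eta^{-4}\mathbf{U}\big(\sigma_x^{-2}\mathbf{I} + \sigma_\eta^{-2}\mathbf{U}^H\mathbf{U}\big)^{-1}\mathbf{U}^H = \sigma_\eta^{-2}\mathbf{I} - \sigma_\eta^{-4}\mathbf{U}(\mathbf{R}_s)^{-1}\mathbf{U}^H \tag{10}$$

where $\mathbf{R}_s = \sigma_x^{-2}\mathbf{I} + \sigma_\eta^{-2}\mathbf{U}^H\mathbf{U}$. With (10), we only need the inverse of an $M$-by-$M$ matrix $\mathbf{R}_s$ rather than a $Q$-by-$Q$ matrix $\mathbf{R}_y$.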
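The identity in (10) is easy to confirm numerically. The sketch below assumes NumPy and uses illustrative values for the effective CFOs and the signal and noise powers (chosen here, not taken from the paper). It compares the $M$-by-$M$ route through $\mathbf{R}_s$ with a direct $Q$-by-$Q$ inversion of $\mathbf{R}_y$.

```python
import numpy as np

Ns, N, Q, M = 128, 8, 16, 3
sigma_x2, sigma_eta2 = 1.0, 0.1          # hypothetical signal and noise powers
eps_e = np.array([0.2, 4.1, 9.7])        # hypothetical effective CFOs
w = np.exp(2j * np.pi / Ns)
U = w ** (np.arange(Q)[:, None] * N * eps_e[None, :])   # Q-by-M, as in (3)

# Correlation matrix Ry = sigma_x^2 U U^H + sigma_eta^2 I
Ry = sigma_x2 * U @ U.conj().T + sigma_eta2 * np.eye(Q)

# Right-hand side of (10): only the M-by-M matrix Rs is inverted
Rs = np.eye(M) / sigma_x2 + U.conj().T @ U / sigma_eta2
Ry_inv_lemma = (np.eye(Q) / sigma_eta2
                - U @ np.linalg.inv(Rs) @ U.conj().T / sigma_eta2**2)

lemma_err = np.abs(Ry_inv_lemma - np.linalg.inv(Ry)).max()
```

Here `lemma_err` is at rounding level, so the two expressions agree for this example.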

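However, $(\mathbf{R}_s)^{-1}$ is difficult to obtain in closed form. Even if it can be obtained, the relationship between the likelihood function and the CFOs may not be tractable after the inversion. To solve the problem, we propose using the Neumann series to expand $(\mathbf{R}_s)^{-1}$ [14]. Let $\mathbf{S}$ be a nonsingular matrix and $\beta(\mathbf{S})$ be its maximum absolute eigenvalue. Then, the series $\sum_{k=0}^{\infty}\mathbf{S}^k$ will converge to $(\mathbf{I}-\mathbf{S})^{-1}$ [17] if $\beta(\mathbf{S}) < 1$ [15]. However, the condition $\beta(\mathbf{S}) < 1$ is not always satisfied for a nonsingular $\mathbf{S}$. This problem can be overcome by dividing $\mathbf{S}$ by a real parameter $\lambda > 0$ and expanding the resultant matrix. It is simple to show that there always exists a $\lambda$ such that $\beta(\mathbf{R}_s/\lambda) < 1$. Now, we can rewrite $\mathbf{R}_s$ as

$$\mathbf{R}_s = \sigma_x^{-2}\mathbf{I} + \sigma_\eta^{-2}\mathbf{U}^H\mathbf{U} = \lambda(\mathbf{I} + \mathbf{B}) \tag{11}$$

where $\mathbf{B}$ is obtained as $(1/\lambda)\mathbf{R}_s - \mathbf{I}$, and its $(p,q)$th element is

$$(\mathbf{B})_{p,q} = \left(\frac{1}{\lambda\sigma_x^2} - 1\right)\delta(p-q) + \frac{1}{\lambda\sigma_\eta^2}\sum_{k=1}^{Q} w^{(-\varepsilon_{e,p}+\varepsilon_{e,q})N(k-1)}. \tag{12}$$

From the Neumann series shown, the inverse of $\mathbf{R}_s$ can be expanded as

$$\left(\frac{1}{\lambda}\mathbf{R}_s\right)^{-1} = (\mathbf{I} + \mathbf{B})^{-1} = \sum_{k=0}^{\infty}(-1)^k\mathbf{B}^k. \tag{13}$$

For simplicity, we retain the first three terms and truncate the higher order terms, i.e.,

$$\mathbf{R}_s^{-1} \approx \frac{1}{\lambda}\sum_{k=0}^{2}(-1)^k\mathbf{B}^k. \tag{14}$$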
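To give a feel for the three-term truncation in (14), here is a minimal sketch assuming NumPy. The helper name `neumann_inverse`, the example CFOs, and the power levels are illustrative assumptions. The scaling $\lambda$ is set to the midpoint of the extreme eigenvalues of $\mathbf{R}_s$, which is the choice analyzed in Section III.

```python
import numpy as np

def neumann_inverse(Rs, lam, terms=3):
    """Return (1/lam) * sum_{k=0}^{terms-1} (-1)^k B^k with B = Rs/lam - I."""
    I = np.eye(Rs.shape[0], dtype=complex)
    B = Rs / lam - I
    acc = np.zeros_like(I)
    Bk = I.copy()                      # holds B^k, starting from B^0 = I
    for k in range(terms):
        acc += (-1) ** k * Bk
        Bk = Bk @ B
    return acc / lam

# Hypothetical three-user example (values chosen only for illustration)
Ns, N, Q = 128, 8, 16
sigma_x2, sigma_eta2 = 1.0, 0.1
eps_e = np.array([0.2, 4.1, 9.7])
w = np.exp(2j * np.pi / Ns)
U = w ** (np.arange(Q)[:, None] * N * eps_e[None, :])
Rs = np.eye(3) / sigma_x2 + U.conj().T @ U / sigma_eta2

g = np.linalg.eigvalsh(Rs)             # eigenvalues in ascending order
lam = (g[0] + g[-1]) / 2               # midpoint of extreme eigenvalues
exact = np.linalg.inv(Rs)
rel_err_3 = np.linalg.norm(neumann_inverse(Rs, lam, 3) - exact) / np.linalg.norm(exact)
rel_err_5 = np.linalg.norm(neumann_inverse(Rs, lam, 5) - exact) / np.linalg.norm(exact)
```

For this well-separated assignment the three-term error is already small, and keeping five terms reduces it further, as expected for a convergent series.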

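The determination of the optimum $\lambda$ and the analysis of the truncation error will be discussed in the next section. From (12), we can find the $(p,q)$th element of $\mathbf{B}^2$ as

$$(\mathbf{B}^2)_{p,q} = \left(\frac{1}{\lambda\sigma_x^2} - 1\right)^2\delta(p-q) + \frac{2}{\lambda\sigma_\eta^2}\left(\frac{1}{\lambda\sigma_x^2} - 1\right)\Gamma_0(p,q) + \left(\frac{1}{\lambda\sigma_\eta^2}\right)^2\sum_{k=1}^{M}\Gamma_0(p,k)\Gamma_0(k,q) \tag{15}$$

where $\Gamma_0(p,q) = \sum_{n=1}^{Q} w^{(-\varepsilon_{e,p}+\varepsilon_{e,q})N(n-1)}$. Substituting (12) and (15) into (14), we then obtain

$$\big(\mathbf{R}_s^{-1}\big)_{p,q} = \frac{1}{\lambda}\left\{\left[2 - \frac{1}{\sigma_x^2\lambda} + \left(\frac{1}{\sigma_x^2\lambda} - 1\right)^2\right]\delta(p-q) + \frac{1}{\sigma_\eta^2\lambda}\left(\frac{2}{\sigma_x^2\lambda} - 3\right)\Gamma_0(p,q) + \left(\frac{1}{\sigma_\eta^2\lambda}\right)^2\sum_{k=1}^{M}\Gamma_0(p,k)\Gamma_0(k,q)\right\}. \tag{16}$$

Using (16) in (10), we can approximate the inverse of $\mathbf{R}_y$ as

$$\big(\mathbf{R}_y^{-1}\big)_{p,q} = \sigma_\eta^{-2}\delta(p-q) - C_0\Gamma(p,q) - C_1\sum_{a=1}^{M} w^{\varepsilon_{e,a}N(p-1)}\cdot\sum_{b=1}^{M}\Gamma_0(a,b)\,w^{-\varepsilon_{e,b}N(q-1)} - C_2\sum_{a=1}^{M} w^{-\varepsilon_{e,a}N(q-1)}\sum_{b=1}^{M} w^{\varepsilon_{e,b}N(p-1)}\cdot\sum_{k=1}^{M}\Gamma_0(b,k)\Gamma_0(k,a) \tag{17}$$

where

$$C_0 = \frac{1}{\sigma_\eta^4\lambda}\left[2 - \frac{1}{\sigma_x^2\lambda} + \left(\frac{1}{\sigma_x^2\lambda} - 1\right)^2\right], \qquad C_1 = \frac{1}{\sigma_\eta^6\lambda^2}\left(\frac{2}{\sigma_x^2\lambda} - 3\right), \qquad C_2 = \frac{1}{\sigma_\eta^8\lambda^3}. \tag{18}$$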

Note that $\Gamma_0(\cdot,\cdot)$ cannot directly be estimated. However, it can be combined with some variables in (17) and converted to $\Gamma(\cdot,\cdot)$, as defined in (6). The value of $\Gamma(p,q)$ can be estimated from the $(p-q)$th diagonal of $\mathbf{R}_y$ as

$$\Gamma(p,q) = \begin{cases} \dfrac{\sum_{p=1}^{Q-m}[\mathbf{R}_y]_{p,p+m}}{(Q-m)\sigma_x^2}, & \text{if } m = q-p \ge 0 \\[2ex] \dfrac{\sum_{q=1}^{Q-m}[\mathbf{R}_y]_{q+m,q}}{(Q-m)\sigma_x^2}, & \text{if } m = p-q > 0. \end{cases} \tag{19}$$
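The diagonal averaging in (19) can be sketched as follows, assuming NumPy. The function name `gamma_from_Ry` and the example parameters are illustrative assumptions. The sketch is applied to an exact $\mathbf{R}_y$, for which the off-diagonal estimates coincide with (6); with a sample correlation matrix they would only be approximate.

```python
import numpy as np

def gamma_from_Ry(Ry, sigma_x2):
    """Estimate Gamma(p, q) by averaging each diagonal of Ry, following (19)."""
    Q = Ry.shape[0]
    G = np.empty((Q, Q), dtype=complex)
    for m in range(Q):
        idx = np.arange(Q - m)
        # m = q - p >= 0: mean of the m-th superdiagonal
        G[idx, idx + m] = np.diagonal(Ry, offset=m).mean() / sigma_x2
        if m > 0:
            # m = p - q > 0: mean of the m-th subdiagonal
            G[idx + m, idx] = np.diagonal(Ry, offset=-m).mean() / sigma_x2
    return G

# Hypothetical two-user example with an exact correlation matrix
Ns, N, Q, M = 128, 8, 16, 2
sigma_x2, sigma_eta2 = 1.0, 0.1
eps_e = np.array([0.2, 4.1])
w = np.exp(2j * np.pi / Ns)
U = w ** (np.arange(Q)[:, None] * N * eps_e[None, :])
Ry = sigma_x2 * U @ U.conj().T + sigma_eta2 * np.eye(Q)

G_est = gamma_from_Ry(Ry, sigma_x2)
G_true = U @ U.conj().T                 # entries equal Gamma(p, q) in (6)
off_diag = ~np.eye(Q, dtype=bool)
```

Note that, implemented literally, the $m = 0$ case returns $M + \sigma_\eta^2/\sigma_x^2$ because the main diagonal of $\mathbf{R}_y$ also carries the noise power in (5).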

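The sums multiplying $C_1$ and $C_2$ in (17) can be rewritten as

$$\sum_{a=1}^{M}\sum_{b=1}^{M}\Gamma_0(a,b)\,w^{\varepsilon_{e,a}N(p-1)}\,w^{-\varepsilon_{e,b}N(q-1)} = \sum_{n=1}^{Q}\sum_{a=1}^{M}\sum_{b=1}^{M} w^{(-\varepsilon_{e,a}+\varepsilon_{e,b})N(n-1)}\cdot w^{\varepsilon_{e,a}N(p-1)}\,w^{-\varepsilon_{e,b}N(q-1)} = \sum_{n=1}^{Q}\Gamma(p,n)\Gamma(n,q) \tag{20}$$

$$\sum_{a=1}^{M} w^{-\varepsilon_{e,a}N(q-1)}\sum_{b=1}^{M} w^{\varepsilon_{e,b}N(p-1)}\sum_{k=1}^{M}\Gamma_0(b,k)\Gamma_0(k,a) = \sum_{m=1}^{Q}\sum_{n=1}^{Q}\Gamma(p,m)\Gamma(m,n)\Gamma(n,q). \tag{21}$$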
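In matrix form, (20) and (21) are compact. The following restatement is a sketch that introduces the shorthand $\boldsymbol{\Gamma}$ and $\boldsymbol{\Gamma}_0$ for the matrices whose entries are $\Gamma(p,q)$ and $\Gamma_0(p,q)$; the shorthand is an assumption made here for illustration and relies only on (6), (12), and the definition of $\mathbf{U}$ in (3).

```latex
% Shorthand (introduced here): \Gamma(p,q) = (U U^H)_{p,q}, \Gamma_0(p,q) = (U^H U)_{p,q}
\begin{align*}
\boldsymbol{\Gamma} &= \mathbf{U}\mathbf{U}^H, &
\boldsymbol{\Gamma}_0 &= \mathbf{U}^H\mathbf{U}, \\
\mathbf{U}\boldsymbol{\Gamma}_0\mathbf{U}^H &= \boldsymbol{\Gamma}^2
  && \text{(entrywise, this is (20))}, \\
\mathbf{U}\boldsymbol{\Gamma}_0^2\mathbf{U}^H &= \boldsymbol{\Gamma}^3
  && \text{(entrywise, this is (21))}, \\
\mathbf{R}_y^{-1} &\approx \sigma_\eta^{-2}\mathbf{I}
  - C_0\boldsymbol{\Gamma} - C_1\boldsymbol{\Gamma}^2 - C_2\boldsymbol{\Gamma}^3
  && \text{(matrix form of (17))}.
\end{align*}
```

Written this way, the approximate inverse depends on the CFOs only through $\boldsymbol{\Gamma}$, whose entries can be estimated from the received data.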

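Substituting (17)–(21) into (9), we obtain

$$
\begin{aligned}
\frac{\partial}{\partial\varepsilon_{e,i}}\Lambda(\boldsymbol{\varepsilon}) = \frac{j2\pi N}{N_s}\sum_{p=1}^{Q}\sum_{q=1}^{Q}\Bigg\{ & N\sigma_x^2(q-p)x^{q-p}\Bigg[C_0\Gamma(p,q) + C_1\sum_{n=1}^{Q}\Gamma(p,n)\Gamma(n,q) + C_2\sum_{m=1}^{Q}\sum_{n=1}^{Q}\Gamma(p,m)\Gamma(m,n)\Gamma(n,q)\Bigg] \\
& + \gamma(p,q)\Bigg[C_0(p-q)x^{p-q} + C_1\sum_{n=1}^{Q}\Big((n-q)x^{n-q}\Gamma(p,n) + (p-n)x^{p-n}\Gamma(n,q)\Big) \\
& \quad + C_2\sum_{m=1}^{Q}\Bigg(\Gamma(p,m)\sum_{n=1}^{Q}\Gamma(n,q)(m-n)x^{m-n} + \sum_{n=1}^{Q}\Big((n-q)x^{n-q}\Gamma(p,m)\Gamma(m,n) + (p-m)x^{p-m}\Gamma(m,n)\Gamma(n,q)\Big)\Bigg)\Bigg]\Bigg\}
\end{aligned}
\tag{22}
$$

where

$$\gamma(p,q) = \sum_{n=0}^{N-1} y^*(n+(p-1)N)\,y(n+(q-1)N) \tag{23}$$

$$x = \exp\left(\frac{j2\pi N\varepsilon_{e,i}}{N_s}\right). \tag{24}$$

The detailed derivation of (22) is provided in Appendix I. Setting (22) to zero, we can solve for all $2(Q-1)$ possible roots $\hat{x}$. The effective CFO can then be obtained by

$$\hat{\varepsilon}_{e,i} = \frac{N_s}{N}\,\frac{\ln(\hat{x})}{j2\pi}. \tag{25}$$

As defined in (2), the true CFO for the $i$th user is then given by

$$\hat{\varepsilon}_i = -q_i + \frac{N_s}{N}\,\frac{\ln(\hat{x})}{j2\pi} \tag{26}$$

where $q_i$ is the subchannel index for user $i$. It is apparent that after adding $-q_i$, there will only be one root falling into the range of subchannel 0, and this root is the estimated CFO for user $i$.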

A direct method for finding the roots of (22) is an exhaustive grid search over the interval spanned by $\varepsilon_{e,i}$. However, the computational complexity is high. Taking a closer look at (22), we find that (22) is a polynomial function of $x$, i.e.,

$$\frac{\partial}{\partial\varepsilon_{e,i}}\Lambda(\boldsymbol{\varepsilon}) = \sum_{k=1}^{Q-1}\alpha_p(k)x^k + \sum_{k=1}^{Q-1}\alpha_n(k)x^{-k} = 0. \tag{27}$$

The detailed derivation for $\alpha_p(k)$ and $\alpha_n(k)$ is provided in Appendix II. Using (27), we can then use a more efficient root-finding method to obtain the roots.
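The root-finding step can be sketched as follows, assuming NumPy and assuming that the coefficients $\alpha_p(k)$ and $\alpha_n(k)$ are already available (their computation is in Appendix II and is not reproduced here). The function name `cfo_candidates`, the use of the root's phase in place of the complex logarithm in (25), and the toy coefficients are assumptions made for illustration; the paper does not prescribe a particular root finder.

```python
import numpy as np

def cfo_candidates(alpha_p, alpha_n, Q, q_i):
    """Roots of (27) mapped to CFO candidates for user i via (25) and (26).

    alpha_p[k-1] multiplies x^k and alpha_n[k-1] multiplies x^(-k), k = 1..Q-1.
    """
    alpha_p = np.asarray(alpha_p, dtype=complex)
    alpha_n = np.asarray(alpha_n, dtype=complex)
    # Multiply (27) by x^(Q-1): ordinary polynomial, highest power first.
    # The x^(Q-1) coefficient is zero because (27) has no constant term.
    coeffs = np.concatenate([alpha_p[::-1], [0.0], alpha_n])
    roots = np.roots(coeffs)
    # (25) with Ns/N = Q; the phase of the root replaces ln(x)/j (assumption)
    eps_e = Q * np.angle(roots) / (2 * np.pi)
    # (26): remove the subchannel index and wrap into [-Q/2, Q/2)
    eps = (eps_e - q_i + Q / 2) % Q - Q / 2
    # Keep only candidates inside the admissible CFO range (-0.5, 0.5)
    return eps[np.abs(eps) < 0.5]

# Toy check with Q = 2: alpha_p(1) x + alpha_n(1) / x = 0 has roots +/- x0
x0 = np.exp(1j * 2 * np.pi * 0.2 / 2)        # root for an effective CFO of 0.2
estimates = cfo_candidates([1.0], [-x0**2], Q=2, q_i=0)
```

In the toy case, only one of the two roots maps into $(-0.5, 0.5)$, which mirrors the root-selection argument given after (26).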

III. PERFORMANCE ANALYSIS

A. Truncation Error in (14)

As we can see, the series in (13) is infinite, and truncation has to be conducted. In the previous section, we retain the first three terms in the series. One may be curious about how large the error will be. In this section, we analyze the truncation error in (14).

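For a positive-definite Hermitian matrix $\mathbf{R}$ with rank $K$, we can write its eigen-decomposition as

$$\mathbf{R} = \mathbf{V}\mathbf{G}\mathbf{V}^H \tag{28}$$

where $\mathbf{G} = \mathrm{diag}[g_1, \ldots, g_K]$ is a diagonal matrix whose positive entries $g_i$ are the eigenvalues of $\mathbf{R}$ in descending order, i.e., $g_1 > \cdots > g_K$, and $\mathbf{V}$ is a unitary matrix consisting of the eigenvectors. As shown in Section II, the inverse of $\mathbf{R}/\lambda$ can be expanded as

$$\left(\frac{\mathbf{R}}{\lambda}\right)^{-1} = (\mathbf{I} - \mathbf{A})^{-1} = \sum_{k=0}^{\infty}\mathbf{A}^k \tag{29}$$

where $\lambda$ is a real number ensuring that the maximum absolute eigenvalue of $\mathbf{A}$ is smaller than one, and $\mathbf{A}$ is a matrix to be determined. Substituting (28) and $\mathbf{V}\mathbf{V}^H = \mathbf{I}$ into (29), we obtain $\mathbf{A} = \mathbf{V}(\mathbf{I} - \mathbf{G}/\lambda)\mathbf{V}^H$ and

$$\mathbf{A}^k = \mathbf{V}\left(\mathbf{I} - \frac{\mathbf{G}}{\lambda}\right)^k\mathbf{V}^H. \tag{30}$$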

From (30), it is simple to see that for the convergence of (29), $|1 - g_i/\lambda|$, $i = 1, 2, \ldots, K$, has to be smaller than 1. In addition, the smaller the value of $|1 - g_i/\lambda|$, the faster the convergence. Since the values of the $g_i$ may be different, the convergence rate of each $|1 - g_i/\lambda|$ (which is referred to as a mode) may be different. As a result, the overall convergence is dominated by the mode with the maximum $|1 - g_i/\lambda|$. To have the fastest convergence, we then want to find a $\lambda$ minimizing the maximum $|1 - g_i/\lambda|$ $(1 \le i \le K)$. This yields a min–max optimization problem as

$$\min_{\lambda}\ \max_{i=1,\ldots,K}\ |1 - g_i/\lambda| \tag{31}$$

subject to the constraints

$$|1 - g_i/\lambda| < 1 \tag{32}$$

where $i = 1, 2, \ldots, K$. The optimum value of $\lambda$ has been shown to be [22]

$$\lambda = \frac{g_1 + g_K}{2}. \tag{33}$$

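Substituting (33) into $|1 - g_i/\lambda|$, we find that $g_1$ and $g_K$ yield the same maximum value. Denote this value as the slowest convergence rate (SCR) of $\mathbf{R}$, i.e.,

$$\mathcal{M}(\mathbf{R}) = |1 - g_1/\lambda| = |1 - g_K/\lambda| = \frac{g_1 - g_K}{g_1 + g_K} = \frac{\mathcal{S}(\mathbf{R}) - 1}{\mathcal{S}(\mathbf{R}) + 1} \tag{34}$$

where $\mathcal{S}(\mathbf{R}) = g_1/g_K$ is the eigenvalue spread (EVS) of $\mathbf{R}$. It is obvious that a smaller EVS yields a smaller SCR. Furthermore, if the SCR is smaller, then the convergence of the series in (29) will be faster, and the truncation error will be smaller. However, a closed-form expression for the truncation error is difficult to obtain. Instead of the exact value of the error, we derive an upper bound. Let the number of terms retained in (29) be $\mathcal{L}$ and the power of the truncation error be $\mathcal{E}$. Then, we have

$$\mathcal{E} = \left\|\sum_{k=0}^{\infty}\mathbf{A}^k - \sum_{k=0}^{\mathcal{L}-1}\mathbf{A}^k\right\| = \left\|\sum_{k=\mathcal{L}}^{\infty}\mathbf{A}^k\right\| \le \sum_{k=\mathcal{L}}^{\infty}\left\|\mathbf{A}^k\right\| \le \sum_{k=\mathcal{L}}^{\infty}\|\mathbf{A}\|^k \le \sum_{k=\mathcal{L}}^{\infty}\mathcal{M}^k(\mathbf{R}) \tag{35}$$

where $\mathcal{M}(\mathbf{R})$ is the maximum absolute diagonal value of $\mathbf{I} - (\mathbf{G}/\lambda)$ in (30). If $\mathcal{M}(\mathbf{R}) \neq 0$, then

$$\mathcal{E} \le \frac{\mathcal{M}(\mathbf{R})^{\mathcal{L}}}{1 - \mathcal{M}(\mathbf{R})} = \frac{(\mathcal{S}(\mathbf{R}) - 1)^{\mathcal{L}}}{2\,(\mathcal{S}(\mathbf{R}) + 1)^{\mathcal{L}-1}}. \tag{36}$$

It is simple to see that when $\mathcal{S}(\mathbf{R}) = 1$ and $\mathcal{M}(\mathbf{R}) = 0$, $\mathcal{E} = 0$, giving the fastest convergence of (29). In this case, $\mathbf{R}$ is diagonal, and only one term is required in (29).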
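The relations (33), (34), and (36) can be checked with a few lines of plain Python. The function names and the example eigenvalues below are illustrative assumptions; the sketch confirms that the midpoint $\lambda$ attains the SCR in (34) and that the two forms of the bound in (36) agree.

```python
def slowest_mode(lam, eigs):
    """max_i |1 - g_i / lam|, the objective of the min-max problem (31)."""
    return max(abs(1 - g / lam) for g in eigs)

def scr_from_evs(evs):
    """Slowest convergence rate (34) as a function of the eigenvalue spread."""
    return (evs - 1) / (evs + 1)

def bound_geometric(evs, terms):
    """First form of (36): M^L / (1 - M)."""
    m = scr_from_evs(evs)
    return m ** terms / (1 - m)

def bound_closed_form(evs, terms):
    """Second form of (36): (S - 1)^L / (2 (S + 1)^(L - 1))."""
    return (evs - 1) ** terms / (2 * (evs + 1) ** (terms - 1))

eigs = [4.0, 2.0, 1.0]                  # hypothetical eigenvalues, descending
lam_opt = (eigs[0] + eigs[-1]) / 2      # optimum lambda from (33)
scr = slowest_mode(lam_opt, eigs)       # equals (4 - 1) / (4 + 1)
bound_3_terms = bound_geometric(1.5, 3) # bound for an EVS of 1.5, three terms
```

For an EVS of 1.5 and three retained terms, the bound evaluates to approximately 0.01, which illustrates how quickly the series converges when the EVS is close to 1.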

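We now compare the EVSs of $\mathbf{R}_y$ and $\mathbf{R}_s$ in (10) and show that $\mathcal{S}(\mathbf{R}_s) < \mathcal{S}(\mathbf{R}_y)$. As shown in Section II, $\mathbf{R}_y = \sigma_x^2\mathbf{U}\mathbf{U}^H + \sigma_\eta^2\mathbf{I}$, and $\mathbf{R}_s = \sigma_x^{-2}\mathbf{I} + \sigma_\eta^{-2}\mathbf{U}^H\mathbf{U} = \sigma_x^{-2}\mathbf{I} + \sigma_\eta^{-2}\mathbf{R}_u$. Let $\{g_{u,1}, \ldots, g_{u,Q}\}$ be the eigenvalues of $\mathbf{U}\mathbf{U}^H$ with $g_{u,1} \ge \cdots \ge g_{u,Q}$. Since the rank of $\mathbf{U}$, which is a $Q$-by-$M$ matrix, is $M$, the smallest $Q-M$ eigenvalues of $\mathbf{U}\mathbf{U}^H$ are zero, i.e., $g_{u,M+1} = \cdots = g_{u,Q} = 0$. In addition, the nonzero eigenvalues of $\mathbf{U}\mathbf{U}^H$ and $\mathbf{U}^H\mathbf{U}$ are the same. This indicates that we can obtain the eigenvalues of $\mathbf{R}_s$ from $\mathbf{R}_y$ as

$$\mathrm{eig}(\mathbf{R}_s) = \sigma_x^{-2} + \sigma_\eta^{-2}\,\mathrm{eig}(\mathbf{R}_u) \tag{37}$$

where $\mathrm{eig}(\mathbf{R}_u)$ denotes the first $M$ eigenvalues of $\mathbf{R}_u$. Therefore, the EVSs of $\mathbf{R}_y$ and $\mathbf{R}_s$ can easily be obtained as

$$\mathcal{S}(\mathbf{R}_y) = \big(\sigma_x^2 g_{u,1} + \sigma_\eta^2\big)/\sigma_\eta^2 = \rho\cdot g_{u,1} + 1 \tag{38}$$

$$\mathcal{S}(\mathbf{R}_s) = \big(\sigma_x^{-2} + \sigma_\eta^{-2}g_{u,1}\big)/\big(\sigma_x^{-2} + \sigma_\eta^{-2}g_{u,M}\big) = (\rho\cdot g_{u,1} + 1)/(\rho\cdot g_{u,M} + 1) \tag{39}$$

where $\rho = \sigma_x^2/\sigma_\eta^2$. Note that the received SNR is defined as $\mathrm{SNR} = M\sigma_x^2/\sigma_\eta^2 = M\rho$. It is easy to see that the EVS of $\mathbf{R}_s$ is smaller than that of $\mathbf{R}_y$. Therefore, the SCR of $\mathbf{R}_s$ is smaller than that of $\mathbf{R}_y$. Thus, the matrix inversion lemma used in (10) not only reduces the computational complexity but also reduces the truncation error in (14).

Fig. 1. EVS of $\mathbf{R}_u$ for $M_{\max} = 2$ ($1 \le \Delta q \le 15$).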
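The expressions (38) and (39) can be compared against a direct eigenvalue computation. The sketch below assumes NumPy and reuses an illustrative three-user configuration; the CFOs and power levels are assumptions made here.

```python
import numpy as np

def evs(R):
    """Eigenvalue spread g_max / g_min of a Hermitian positive-definite matrix."""
    g = np.linalg.eigvalsh(R)            # ascending order
    return g[-1] / g[0]

Ns, N, Q, M = 128, 8, 16, 3
sigma_x2, sigma_eta2 = 1.0, 0.1          # hypothetical powers
rho = sigma_x2 / sigma_eta2
eps_e = np.array([0.2, 4.1, 9.7])        # hypothetical effective CFOs
w = np.exp(2j * np.pi / Ns)
U = w ** (np.arange(Q)[:, None] * N * eps_e[None, :])

Ry = sigma_x2 * U @ U.conj().T + sigma_eta2 * np.eye(Q)
Rs = np.eye(M) / sigma_x2 + U.conj().T @ U / sigma_eta2

g_u = np.linalg.eigvalsh(U.conj().T @ U)     # the M nonzero eigenvalues
evs_Ry_formula = rho * g_u[-1] + 1                          # (38)
evs_Rs_formula = (rho * g_u[-1] + 1) / (rho * g_u[0] + 1)   # (39)
evs_Ry, evs_Rs = evs(Ry), evs(Rs)
```

In this example the spread of $\mathbf{R}_s$ is close to 1 while that of $\mathbf{R}_y$ is larger by roughly two orders of magnitude, which is why the series is applied to $\mathbf{R}_s$ rather than to $\mathbf{R}_y$.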

From (39), it is also simple to see that for low SNR, the EVS of $\mathbf{R}_s$ approaches 1, and therefore, the truncation error in (14) can be ignored. For high SNR, the EVS of $\mathbf{R}_s$ approaches that of $\mathbf{R}_u$. The EVS of $\mathbf{R}_u = \mathbf{U}^H\mathbf{U}$ depends on the subchannel assignment since the $(a,b)$th entry of $\mathbf{U}$ is $w^{\varepsilon_{e,b}(a-1)N}$ and $\varepsilon_{e,b} = \varepsilon_b + q_b$. To analyze the truncation error, we first have to analyze the EVS of $\mathbf{U}^H\mathbf{U}$. Unfortunately, a general closed-form expression for the EVS is difficult to obtain. Here, we study two special cases to show that the EVS of $\mathbf{U}^H\mathbf{U}$ is low and that the truncation error in (14) can be small in our applications. Define an $M_{\max}$-user system as a system that can simultaneously handle at most $M_{\max}$ users. The first case we consider is a two-user system. For this system, the EVS of $\mathbf{U}^H\mathbf{U}$ can be obtained from (3) in closed form as

$$\mathcal{S}(\mathbf{U}^H\mathbf{U}) = \frac{Q + \vartheta_{1,2}}{Q - \vartheta_{1,2}} \tag{40}$$

where $\vartheta_{1,2} = \big[1 - \cos(2\pi\delta_\varepsilon) - \cos(2\pi\delta_\varepsilon/Q) + \cos(2\pi\delta_\varepsilon(Q-1)/Q)\big]^{0.5}\cdot\big[1 - \cos(2\pi\delta_\varepsilon/Q)\big]^{-0.5}$.

As we can see, the EVS varies with $\delta_\varepsilon = |\varepsilon_{e,1} - \varepsilon_{e,2}| = |q_1 - q_2 + \varepsilon_1 - \varepsilon_2|$. Therefore, $\mathcal{S}(\mathbf{U}^H\mathbf{U})$ depends on $\Delta\varepsilon = |\varepsilon_1 - \varepsilon_2|$ and $\Delta q = |q_1 - q_2|$. Note that $\Delta q$ is the difference between neighboring subchannel indices. We now use an example to examine the EVS of $\mathbf{U}^H\mathbf{U}$. Let $N_s = 128$ and $N = 8$. Then, there are $N_s/N = 16$ subchannels, and $1 \le \Delta q \le 15$. Fig. 1 shows the result. From the figure, we can see that the EVSs for $\Delta q = 1$ and $\Delta q = 15$ are much larger than those for $2 \le \Delta q \le 14$. Fig. 2 shows the EVSs for $2 \le \Delta q \le 14$. It is clear that all the values are smaller than 1.5. This indicates that the truncation error will be small as long as adjacent subchannels are not used simultaneously, i.e., $|\mathrm{mod}(\Delta q - i\cdot Q, Q)| \neq 1$, where $i$ is an integer.
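The dependence of the EVS on the subchannel spacing can also be explored numerically. The sketch below assumes NumPy and computes the EVS of $\mathbf{U}^H\mathbf{U}$ directly from the definition of $\mathbf{U}$ in (3) by eigen-decomposition; it does not evaluate (40) and is not intended to reproduce Figs. 1 and 2. The function name `evs_two_user` and the CFO values are illustrative assumptions.

```python
import numpy as np

def evs_two_user(dq, eps1, eps2, Ns=128, N=8):
    """EVS of U^H U for two users whose subchannel indices differ by dq."""
    Q = Ns // N
    w = np.exp(2j * np.pi / Ns)
    # User 1 is placed on subchannel 0 and user 2 on subchannel dq (assumption)
    eps_e = np.array([eps1, dq + eps2])
    U = w ** (np.arange(Q)[:, None] * N * eps_e[None, :])
    g = np.linalg.eigvalsh(U.conj().T @ U)   # ascending order
    return g[-1] / g[0]

# Hypothetical CFO pair, evaluated for adjacent and well-separated subchannels
evs_adjacent = evs_two_user(1, 0.1, -0.2)
evs_separated = evs_two_user(8, 0.1, -0.2)
```

For this CFO pair, the adjacent assignment gives a noticeably larger spread than the well-separated one, in line with the qualitative observation that adjacent subchannels should be avoided.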

The second case we consider is a four-user system. For this system, a closed-form solution for the EVS is not obtainable. Simulations are then conducted to obtain numerical results. Note that in this case, the EVS is a function of $\{q_1, \ldots, q_4, \varepsilon_1, \ldots, \varepsilon_4\}$. It is difficult to examine the behavior of the EVS in terms of all these variables. For convenience, we still define two variables: 1) $\Delta q = |q_1 - q_2| = |q_2 - q_3| = |q_3 - q_4|$ and 2) $\Delta\varepsilon = \sum_{i=1}^{M-1}|\varepsilon_{i+1} - \varepsilon_i|$. Note that the definition and the implication of $\Delta\varepsilon$ are different from those in the two-user case. Using the same simulation setting as that in the two-user system, we obtain the EVS versus $\Delta\varepsilon$ in Fig. 3. Here, we assume that adjacent subchannels are not used. From the figure, we can find that all the EVSs are smaller than 1.7. We can therefore expect the truncation error in (14) to be small. From Figs. 2 and 3, we can also find that the smallest truncation error is obtained when $\Delta q = 8$ and $\Delta q = 4$ for the two-user and four-user systems, respectively. However, a large $\Delta q$ will result in a smaller $M_{\max}$, which is the maximum number of users. Thus, the selection of $\Delta q$ is a tradeoff between $M_{\max}$ and the SCR of $\mathbf{R}_u$. In this paper, we let the smallest $\Delta q$ be 2.

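Using (33) and (37), we can see that the optimum $\lambda$ in (11) is equal to $\sigma_x^{-2} + \sigma_\eta^{-2}(g_{u,1} + g_{u,M})/2$. Note that the eigenvalues of $\mathbf{R}_u$ can be estimated as $\mathrm{eig}(\sigma_x^{-2}(\mathbf{R}_y - \sigma_\eta^2\mathbf{I}))$, and $\mathbf{R}_y$ can be estimated as $\sum_{n=0}^{N-1}\mathbf{y}(n)\mathbf{y}^H(n)$. Therefore, the optimum $\lambda$ can then be calculated. To evaluate the performance of the proposed expansion, we define a normalized truncation error (with the optimum $\lambda$) as

$$\mathcal{E}_n = E_{\boldsymbol{\varepsilon}}\left\{\left\|\frac{\mathbf{R}_s}{\lambda}\cdot\left[\left(\frac{\mathbf{R}_s}{\lambda}\right)^{-1} - \sum_{k=0}^{2}(-1)^k\mathbf{B}^k\right]\right\|\right\} \tag{41}$$

where $\boldsymbol{\varepsilon}$ is the set of all possible $\varepsilon_{e,m}$ $(m = 1, \ldots, M)$. Using the result in (36), we can also define a normalized upper bound as

$$\mathcal{E}_n \le E_{\boldsymbol{\varepsilon}}\left\{\left\|\frac{\mathbf{R}_s}{\lambda}\right\|\cdot\left\|\left(\frac{\mathbf{R}_s}{\lambda}\right)^{-1} - \sum_{k=0}^{2}(-1)^k\mathbf{B}^k\right\|\right\} \le E_{\boldsymbol{\varepsilon}}\left\{\left\|\frac{\mathbf{R}_s}{\lambda}\right\|\frac{(\mathcal{S}(\mathbf{R}_s/\lambda) - 1)^3}{2\,(\mathcal{S}(\mathbf{R}_s/\lambda) + 1)^2}\right\}. \tag{42}$$

We now use some examples to evaluate the normalized truncation error and its upper bound in (42). The result is shown in Table I. It is simple to see that the normalized truncation error increases as $\Delta q$ decreases and as the SNR increases. In addition, the deviation of the upper bound from the actual error is small; the upper bound overestimates the error by 1–2 dB. From Table I, we can also see that even with $\Delta q = 2$, the truncation error is still quite small, i.e., $-20$ dB. Section IV gives more results to show this property.

TABLE I

NORMALIZED TRUNCATION ERROR VERSUS SNR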

B. CRB Analysis

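For the training-based method, only the AWGN is considered as a random variable, and the CRB for CFO estimation can then be derived [4]. In the blind method, the transmit symbol is treated as an additional random variable, and the CRB for blind CFO estimation can also be derived [2], [11]. Here, we generalize the result in [2] and [11] (for single-user OFDM systems) to derive the CRB in OFDMA systems. From (4), the $(p,q)$th entry of the Fisher information matrix $\mathbf{F}$ is given by

$$(\mathbf{F})_{p,q} = -E\left\{\frac{\partial^2\Lambda(\boldsymbol{\varepsilon})}{\partial\varepsilon_p\,\partial\varepsilon_q}\right\} = -E\left\{\frac{\partial^2\Lambda(\boldsymbol{\varepsilon})}{\partial\varepsilon_{e,p}\,\partial\varepsilon_{e,q}}\right\} \tag{43}$$

where $1 \le p, q \le M$, and $\Lambda(\boldsymbol{\varepsilon})$ is the log-likelihood function in (4). Substituting (8) into (43) yields

$$(\mathbf{F})_{p,q} = N\cdot\mathrm{tr}\left(-\mathbf{R}_y^{-1}\frac{\partial\mathbf{R}_y}{\partial\varepsilon_p}\mathbf{R}_y^{-1}\frac{\partial\mathbf{R}_y}{\partial\varepsilon_q} + \mathbf{R}_y^{-1}\frac{\partial^2\mathbf{R}_y}{\partial\varepsilon_p\,\partial\varepsilon_q}\right) - E\left\{\sum_{n=0}^{N-1}\mathbf{y}^H(n)\mathbf{J}\mathbf{y}(n)\right\} = N\cdot\mathrm{tr}\left(\mathbf{R}_y^{-1}\frac{\partial\mathbf{R}_y}{\partial\varepsilon_p}\mathbf{R}_y^{-1}\frac{\partial\mathbf{R}_y}{\partial\varepsilon_q}\right) \tag{44}$$

where

$$\mathbf{J} = -\mathbf{R}_y^{-1}\frac{\partial\mathbf{R}_y}{\partial\varepsilon_p}\mathbf{R}_y^{-1}\frac{\partial\mathbf{R}_y}{\partial\varepsilon_q}\mathbf{R}_y^{-1} + \mathbf{R}_y^{-1}\frac{\partial^2\mathbf{R}_y}{\partial\varepsilon_p\,\partial\varepsilon_q}\mathbf{R}_y^{-1} - \mathbf{R}_y^{-1}\frac{\partial\mathbf{R}_y}{\partial\varepsilon_p}\mathbf{R}_y^{-1}\frac{\partial\mathbf{R}_y}{\partial\varepsilon_q}\mathbf{R}_y^{-1}. \tag{45}$$

From [16], we see that

$$E\left\{\sum_{n=0}^{N-1}\mathbf{y}^H(n)\mathbf{J}\mathbf{y}(n)\right\} = \sum_{n=0}^{N-1}E\left\{\mathbf{y}^H(n)\mathbf{J}\mathbf{y}(n)\right\} = \sum_{n=0}^{N-1}E\left\{\mathrm{tr}\big(\mathbf{J}\mathbf{y}(n)\mathbf{y}^H(n)\big)\right\} = N\cdot\mathrm{tr}(\mathbf{J}\mathbf{R}_y). \tag{46}$$

Finally, the CRB for the $\varepsilon_i$ estimate is obtained as

$$\mathrm{CRB}(\varepsilon_i) = (\mathbf{F}^{-1})_{i,i}. \tag{47}$$

We average the diagonal terms of (47) to have a single index for performance comparison.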
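The final expression in (44) together with (47) can be evaluated numerically. The sketch below assumes NumPy, builds $\mathbf{R}_y$ from (5) and (6), and differentiates (6) with respect to each effective CFO; the function name `crb` and the example parameters are illustrative assumptions, not values from the paper.

```python
import numpy as np

def crb(eps_e, Q, N, sigma_x2, sigma_eta2):
    """CRB of each CFO from F_{a,b} = N tr(Ry^-1 dRy/de_a Ry^-1 dRy/de_b)."""
    Ns = Q * N
    M = len(eps_e)
    w = np.exp(2j * np.pi / Ns)
    idx = np.arange(Q)
    diff = idx[:, None] - idx[None, :]            # (p - q) for every entry
    Ry = sigma_eta2 * np.eye(Q, dtype=complex)
    dR = []
    for e in eps_e:
        G = w ** (e * N * diff)                   # this user's term in (6)
        Ry = Ry + sigma_x2 * G
        # Derivative of sigma_x^2 * w^{e N (p-q)} with respect to e
        dR.append(sigma_x2 * (2j * np.pi * N / Ns) * diff * G)
    Ry_inv = np.linalg.inv(Ry)
    F = np.empty((M, M))
    for a in range(M):
        for b in range(M):
            F[a, b] = N * np.trace(Ry_inv @ dR[a] @ Ry_inv @ dR[b]).real
    return np.diag(np.linalg.inv(F))              # (47)

# Hypothetical two-user example at two noise levels
eps_e = np.array([0.2, 4.1])
crb_high_noise = crb(eps_e, Q=16, N=8, sigma_x2=1.0, sigma_eta2=0.1)
crb_low_noise = crb(eps_e, Q=16, N=8, sigma_x2=1.0, sigma_eta2=0.01)
average_crb = crb_high_noise.mean()               # single index for comparison
```

Averaging the entries, as in the last line, gives the single performance index mentioned after (47).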

C. Computational Complexity

Here, the computational complexity of the proposed method is assessed and compared with that of the existing schemes. For the proposed method, there are three operation steps. In the first step, we need to calculate the autocorrelation matrix in (5) and its eigen-decomposition to obtain (37). The computational complexity for this step is $O(Q^3 + Q^2N)$. In the second step, we need to calculate the coefficients in (27). From the received signal and the known autocorrelation matrix, we can obtain $\gamma(p,q)$ in (23) and $\Gamma(p,q)$ in (19). Let $\Gamma(p,q)$ and $\gamma(p,q)$ be the $(p,q)$th entries of two matrices $\boldsymbol{\Gamma}$ and $\boldsymbol{\gamma}$, respectively. The computational complexity for constructing these matrices is $O(Q^2N)$. Note that not all the coefficients in (27) (see (57) and (58) in Appendix II) are required for the calculation. Some coefficient pairs in (57) and (58) are complex conjugates of each other. In addition, some terms in (57) and (58) appear repeatedly. For example, $F_1(p,q)$ in (57), the sixth term of (57), and the seventh term of (57) all include $\sum_m\Gamma(p,m)\Gamma(m,n)$. Without the redundant computations, the complexity for calculating the coefficients in (57) or (58) is found to be $O(6Q^3)$. This gives the computational complexity for calculating all the coefficients of the polynomial in (27). The last step is the root-searching process in (27) and the sorting of the CFOs for each user in (26). Since there are $2Q-1$ terms in (27), the roots can be solved with a complexity of $O(8Q^3)$ [2]. Compared with the root-searching process, the complexity of calculating (26) is small and can be ignored. Adding all steps together and taking only the dominant terms, we find that the overall complexity of the proposed method is $O(15Q^3 + 2Q^2N)$. For the ESPRIT frequency estimator, the complexity is shown to be $O(5Q^3 + Q^2N)$ [10]. Therefore, the computational complexity of the proposed method is of the same order as that of ESPRIT.

Next, we evaluate the computational complexity of the training-based schemes. For APFE, the total complexity has been shown to be $O(MN_cN_w(L^3 + LN_s^2))$ [3], [4], where $N_c$ denotes the number of iterations, and $N_w$ denotes the number of grid points used for each iteration. For the simplified AAPFE, the computational complexity is $O(MN_cN_wN_s^2K)$ [4], [5]. Since the computational complexity of the APFE algorithm is high, suboptimum training-based schemes were then proposed [3], [5], [6]. For the method in [3], the computational complexity is $O(2MTN_s^2 + TN_s(MN)^2)$, where $T$ is the number of Monte Carlo runs used to find a mean likelihood [13]. The computational complexity of the method in [5] is $O(N_c(M^2N^2 + \tfrac{1}{2}N_s\log_2(N_s) + \tfrac{3}{2}M^3N^3 + \tfrac{3}{8}M^2N^2N_s))$, whereas that in [6] is $O(N_c(N_s(MN)^2 + M^3))$ for $\mathcal{L} = 1$. Here, $\mathcal{L}$ is the number of terms retained in an infinite series [6]. Note that for $\mathcal{L} = 1$, the method has the worst performance but the lowest computational complexity. In addition, note that the simulation results in [6] indicate that $\mathcal{L}$ should be at least 3 for acceptable performance. However, as shown in [6], the computational complexity order for $\mathcal{L} = 3$ is difficult to evaluate.
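As a rough numerical illustration of the two blind estimators, the operation-count expressions quoted above can be evaluated for the setting used later in the simulations ($N_s = 128$, $N = 8$, so $Q = 16$). The function names are illustrative, and the counts keep the leading constants from the text, so they indicate relative orders only, not measured run times.

```python
def proposed_ops(Q, N):
    """Dominant operation count quoted for the proposed method: 15Q^3 + 2Q^2 N."""
    return 15 * Q**3 + 2 * Q**2 * N

def esprit_ops(Q, N):
    """Dominant operation count quoted for ESPRIT: 5Q^3 + Q^2 N."""
    return 5 * Q**3 + Q**2 * N

Q, N = 16, 8
ops_proposed = proposed_ops(Q, N)      # 15*4096 + 2*256*8 = 65536
ops_esprit = esprit_ops(Q, N)          # 5*4096 + 256*8 = 22528
ratio = ops_proposed / ops_esprit      # a small constant factor
```

Both counts grow as $Q^3$, so the ratio remains a small constant, which is the sense in which the two methods are of the same complexity order.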

IV. SIMULATIONS

In this section, we report simulation results demonstrating the effectiveness of the proposed method. In the first set of simulations, we compare the performance of the proposed method with existing blind methods. Note that ESPRIT is known to be better than the MUSIC algorithm [10]. Thus, we only conduct simulations for ESPRIT [10]. In the second set of simulations, we compare the performance of the proposed method with existing ML methods, such as APFE and AAPFE. Note that existing ML schemes require training sequences. Finally, we compare the computational complexity of all schemes.

A. System Setup

In our simulations, the channel response used for each user is generated according to the HIPERLAN/2 channel model [12]. The channel response, having six taps, follows an exponen-tial power decay profile. Each tap coefficient is modeled as an independent complex Gaussian random variable with zero mean. The CFO of each user is generated with a uniform distribution in the interval (−0.5, 0.5). The symbols used for CFO estimation are modulated with a binary phase-shift-keying scheme, whereas those for data transmission are modulated with a 16-QAM scheme. The interleaved OFDMA system used in our simulations has Ns= 128 subcarriers. Since there are

multiple CFOs to be estimated, the mean square error (MSE) is used as the performance index, which is defined as

M SE =

M

m=1

(ˆεm− εm)2. (48)

All the simulation results are obtained by averaging 1000 Monte Carlo runs.

As shown in Section III, Δq, which is the difference between neighboring subchannel indices, influences both the truncation error in (14) and Mmax, which is the maximum number of users: the larger the Δq, the smaller the truncation error and the smaller the Mmax. We compare the results for Δq = 2 and Δq = 4 in the following.
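The channel draw described in the system setup of this subsection can be sketched as below. The six taps, the exponential power decay, and the zero-mean complex Gaussian coefficients come from the text; the decay rate `decay` and the normalization to unit total power are assumptions made for illustration, and the sketch does not reproduce the actual HIPERLAN/2 delay profile of [12].

```python
import numpy as np

def draw_channel(rng, taps=6, decay=1.0):
    """Draw one channel realization with an exponential power decay profile.

    `decay` (hypothetical) sets how fast the average tap power falls off;
    the profile is normalized so that the average total power is 1.
    """
    power = np.exp(-decay * np.arange(taps))
    power /= power.sum()
    # Independent zero-mean complex Gaussian taps; real and imaginary
    # parts each carry half of the tap power.
    g = rng.standard_normal(taps) + 1j * rng.standard_normal(taps)
    return np.sqrt(power / 2.0) * g

rng = np.random.default_rng(0)
h = draw_channel(rng)  # one six-tap realization for one user
```

One realization would be drawn per user and per Monte Carlo run.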

B. Performance Assessment for Δq = 2

First, we let Δq = 2, the smallest Δq we use. In this case, the SCR is maximal, and Mmax is also maximal. This corresponds to the worst case for the proposed method. An important design parameter for the proposed method is the number of subcarriers for each user, N. The number of subchannels is then Q = Ns/N. Observing (27) and (47), we see that a higher Q gives better performance but requires higher complexity. To see the impact of N, we let M = 4 and observe the CRB for different N. The result is shown in Fig. 4. We can see that the CRB is almost the same for N = 4 and N = 8. To reduce computations, we choose N = 8 for the simulations we conducted. Without loss of generality, we assume that the CP length is N, which is larger than the channel length.

For the first set of simulations, we compare the performance of the proposed algorithm with that of ESPRIT. Fig. 5 shows the result for M = 2, whereas Fig. 6 shows the result for M = 4. As expected, the performance of the proposed algorithm is significantly better than that of ESPRIT since the proposed method conducts the ML estimation. We can also see that the proposed method can approach the CRB. In the high-SNR region, the performance of the proposed algorithm slightly deviates from the CRB. This is due to the approximation used in (14). When the number of users is larger, the deviation is also larger.


Fig. 4. CRB comparison for various N .

Fig. 5. Performance comparison for ESPRIT and the proposed algorithm (M = 2).

Fig. 6. Performance comparison for ESPRIT and the proposed algorithm (M = 4).

Fig. 7. Performance comparison for training-based algorithms and the proposed algorithm (M = 2).

Fig. 8. Performance comparison for training-based algorithms and the proposed algorithm (M = 4).

In the second set of simulations, we compare the performance of the proposed algorithm with that of other ML algorithms. The simulation setup is the same as that of the first set of simulations. The ML problem can directly be solved by an exhaustive grid search over the MD space spanned by {εe,1, . . . , εe,M}. To reduce the computational complexity, we use the APFE and AAPFE schemes. The AAPFE is a suboptimum version of the APFE, and it also truncates a Neumann series (with an order K) to approximate an inverse matrix [21]. In each iteration, only one user’s CFO is updated, whereas the other users’ CFOs remain unchanged. The CFO update is conducted by a grid-search method. Note that the purpose of the expansion is different from ours. In addition, the AAPFE does not use the optimum λ to achieve the best result. In our simulations, we let Nc = 2 and Nw = 100. We have tried K = 1 and K = 2. Fig. 7 shows the result for M = 2, and Fig. 8 shows the result for M = 4. We can see that the APFE performs the best, and the AAPFE with K = 1 performs the worst. Note that the conventional APFE and AAPFE have to use training sequences. From the figures, we see that the performance gaps between the APFE, the AAPFE with K = 2, and the proposed blind algorithm are very small. In addition, note that all these algorithms tend to deviate from the CRB when the SNR is high.
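To illustrate what truncating a Neumann series at order K means, the following sketch approximates a matrix inverse by a finite sum of powers. It is a generic textbook construction, not the AAPFE algorithm itself; the convention that order K keeps powers 0 through K is an assumption, and the series only converges when the spectral radius of I − A is below 1.

```python
import numpy as np

def neumann_inverse(A, K):
    """Approximate inv(A) by sum_{k=0}^{K} (I - A)^k.

    Valid only when the spectral radius of (I - A) is less than 1.
    The meaning of the order K (highest retained power) is an
    illustrative convention, not taken from the AAPFE literature.
    """
    A = np.asarray(A, dtype=complex)
    I = np.eye(A.shape[0], dtype=complex)
    E = I - A
    term = I.copy()
    approx = I.copy()
    for _ in range(K):
        term = term @ E          # (I - A)^k
        approx = approx + term   # running partial sum
    return approx

A = np.array([[1.0, 0.1], [0.2, 1.0]])
exact = np.linalg.inv(A)
# Approximation error for orders K = 0, 1, 2; it shrinks as K grows.
errors = [np.linalg.norm(neumann_inverse(A, K) - exact) for K in (0, 1, 2)]
```

A low order is cheap but leaves a larger residual, which is in line with the gap between K = 1 and K = 2 reported above.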

C. Performance Assessment for Δq = 4

Fig. 9 shows the performance comparison between Δq = 2 and Δq = 4. The simulation setup is the same as that in the previous section. The performance of the proposed method is clearly better with Δq = 4. In Section III, we have shown the relationship between the SCR and the truncation error (see Table I). Fig. 9 shows a similar trend: the smaller the SCR, the better the estimation performance. The performance gap between the APFE and the proposed algorithm also becomes smaller when Δq = 4.


Fig. 9. Performance comparison for all algorithms with Δq = 2 and Δq = 4 (M = 4).

Fig. 10. Computational complexity comparison for the proposed algorithm and algorithms in [3]–[6] and [10].

D. Complexity Comparison

Fig. 10 shows the computational complexity of the schemes we consider. Fig. 10(a) shows the complexity versus the number of subcarriers for the two-user case. Fig. 10(b) shows the complexity versus the number of users for a fixed total number of subcarriers (Ns = 128 and Q = 8). From the figures, we find that the computational complexity of the proposed method is similar to that of the conventional blind methods. However, the proposed method outperforms the conventional methods by 10 dB (see Figs. 5 and 6). Compared with the training-based methods, the proposed blind method can have similar performance (see Figs. 7 and 8) but much lower computational complexity (see Fig. 10).

V. CONCLUSION

In this paper, we have developed a new algorithm for blind ML CFO estimation in interleaved OFDMA uplink systems. Conventional methods for this problem require MD exhaustive searches, and the computational complexity can be very high. The distinct feature of the proposed algorithm is that it only requires a root-searching procedure. The main idea is to use a series expansion when evaluating the ML function. The performance of the expansion is also analyzed. The operations of the proposed method are simple, and the computational complexity is low. Simulations show that the proposed method can approach the CRB. As shown in Fig. 1, a large EVS will be induced in the fully loaded scenario (Δq = 1), and the performance of the proposed method will be seriously affected. The problem can be solved by an expectation-maximization (EM) algorithm referred to as iterative space alternating generalized EM (SAGE) [23], [24]. However, the complexity of the SAGE algorithm can be very high for large Ns. Note that in real-world applications, only a limited number of users will be activated at a specific time [24]. Thus, only the CFOs of the newly activated users have to be estimated, and the knowledge of the previously estimated CFOs can be exploited in each new estimation. It would be interesting to incorporate the SAGE algorithm into the proposed method, which may serve as a topic for further research.

APPENDIX I
DERIVATION OF (22)

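Taking the derivative of $(R_y)_{p,q}$ with respect to $\varepsilon_{e,i}$, we have

$$
\frac{\partial}{\partial \varepsilon_{e,i}} \left(R_y\right)_{p,q} = \frac{j2\pi N \sigma_x^2}{N_s}\,(p-q)\, w(\varepsilon_{e,i})^{N(p-q)}. \tag{49}
$$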
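The exponential factor in (49) can be checked numerically with a finite difference. The sketch below assumes w(ε) = exp(j2πε/Ns), which is inferred here from (51) together with the substitution x = w(εe,i)^N used in (50); it is not a definition quoted from the paper. The scalar σx² is omitted because it does not affect the check.

```python
import numpy as np

def w(eps, Ns):
    # Assumed form of w, inferred from (51) via x = w^N.
    return np.exp(2j * np.pi * eps / Ns)

def d_entry(eps, p, q, N, Ns):
    """Closed-form derivative of w(eps)^(N(p-q)) with respect to eps."""
    return (2j * np.pi * N / Ns) * (p - q) * w(eps, Ns) ** (N * (p - q))

def d_entry_numeric(eps, p, q, N, Ns, h=1e-6):
    """Central finite-difference approximation of the same derivative."""
    f = lambda e: w(e, Ns) ** (N * (p - q))
    return (f(eps + h) - f(eps - h)) / (2 * h)

# Example parameters matching the simulation setup (N = 8, Ns = 128).
analytic = d_entry(0.2, p=5, q=2, N=8, Ns=128)
numeric = d_entry_numeric(0.2, p=5, q=2, N=8, Ns=128)
```

The two values agree to within the finite-difference error, and the derivative vanishes on the diagonal p = q, as the factor (p − q) requires.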

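Then, the first term on the right-hand side of (9) can be derived as

$$
\begin{aligned}
\left[R_y^{-1}\frac{\partial}{\partial \varepsilon_{e,i}} R_y\right]_{p,q}
&= \sum_{k=1}^{Q} \left(R_y^{-1}\right)_{p,k}\, \frac{j2\pi N \sigma_x^2}{N_s}\,(k-q)\, w(\varepsilon_{e,i})^{N(k-q)} \\
&= \frac{j2\pi N \sigma_x^2}{N_s} \Bigg[ \sigma_\eta^{-2}\,(p-q)\, w(\varepsilon_{e,i})^{N(p-q)} - C_0 \sum_{k=1}^{Q} (k-q)\,x^{k-q}\,\Gamma(p,k) \\
&\qquad - C_1 \sum_{k=1}^{Q}\sum_{b=1}^{Q} (k-q)\,x^{k-q}\,\Gamma(p,b)\Gamma(b,k) \\
&\qquad - C_2 \sum_{k=1}^{Q} (k-q)\,x^{k-q} \sum_{a=1}^{Q}\sum_{b=1}^{Q} \Gamma(p,a)\Gamma(a,b)\Gamma(b,k) \Bigg]
\end{aligned}
\tag{50}
$$

where

$$
x = \exp\!\left(\frac{j2\pi N \varepsilon_{e,i}}{N_s}\right). \tag{51}
$$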


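To obtain the second term in (9), the derivatives of the third and fourth terms on the right-hand side of (17) can first be found as

$$
\frac{\partial}{\partial \varepsilon_{e,i}} \sum_{n=1}^{Q} \Gamma(n,q)\Gamma(p,n)
= \frac{j2\pi N}{N_s} \sum_{n=1}^{Q} \Big[ (n-q)\,\Gamma(p,n)\, w(\varepsilon_{e,i})^{N(n-q)} + (p-n)\,\Gamma(n,q)\, w(\varepsilon_{e,i})^{N(p-n)} \Big] \tag{52}
$$

$$
\begin{aligned}
\frac{\partial}{\partial \varepsilon_{e,i}} \sum_{m=1}^{Q}\sum_{n=1}^{Q} \Gamma(m,n)\Gamma(n,q)\Gamma(p,m)
= \frac{j2\pi N}{N_s} \sum_{m=1}^{Q}\sum_{n=1}^{Q} \Big[ & (m-n)\,\Gamma(p,m)\Gamma(n,q)\, w(\varepsilon_{e,i})^{N(m-n)} \\
& + (n-q)\,\Gamma(p,m)\Gamma(m,n)\, w(\varepsilon_{e,i})^{N(n-q)} \\
& + (p-m)\,\Gamma(m,n)\Gamma(n,q)\, w(\varepsilon_{e,i})^{N(p-m)} \Big].
\end{aligned}
\tag{53}
$$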

Thus, the second term in (9) is given by (54).

Substituting (3), (5), (50), and (54) into (9), we can rewrite the derivative of the log-likelihood function as (55).

APPENDIX II
DERIVATION OF (27)

Rewrite (22) as (56), where $F_1(p,q) = N\sigma_x^2 (q-p)\big[C_0\Gamma(p,q) + C_1\sum_{n=1}^{Q}\Gamma(p,n)\Gamma(n,q) + C_2\sum_{m=1}^{Q}\sum_{n=1}^{Q}\Gamma(p,m)\Gamma(m,n)\Gamma(n,q)\big]$ and $F_2(p,q) = C_0 (p-q)\gamma(p,q)$. With a change of variables, we can express the exponent of x in terms of a single variable k. We can then collect all the terms with positive k into one expression and obtain (57). Similarly, we can collect all the terms with negative k into another expression and obtain (58).
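The regrouping by exponent can be illustrated on the simplest term. For a double sum of the form Σp Σq F(p, q) x^(q−p), the coefficient of x^k is the sum of the entries of F on its k-th diagonal, so positive k corresponds to superdiagonals and negative k to subdiagonals. The sketch below uses an arbitrary random matrix in place of F1; it illustrates the bookkeeping only and does not compute the paper's αp(k) or αn(k).

```python
import numpy as np

def diagonal_coefficients(F):
    """Map each exponent k = q - p to the sum of F[p, q] over that diagonal."""
    Q = F.shape[0]
    return {k: np.trace(F, offset=k) for k in range(-(Q - 1), Q)}

def double_sum(F, x):
    """Direct evaluation of sum_p sum_q F[p, q] * x**(q - p)."""
    Q = F.shape[0]
    return sum(F[p, q] * x ** (q - p) for p in range(Q) for q in range(Q))

rng = np.random.default_rng(1)
F = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))  # stand-in for F1
x = np.exp(0.3j)

coeffs = diagonal_coefficients(F)
regrouped = sum(c * x ** k for k, c in coeffs.items())
```

Here `regrouped` equals `double_sum(F, x)` up to rounding. The indices are 0-based in the code and 1-based in the text, which leaves the difference q − p unchanged.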

Therefore, the derivative of the log-likelihood function (22) can then be reexpressed as

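$$
\frac{\partial}{\partial \varepsilon_{e,i}} \Lambda(\varepsilon) = \sum_{k=1}^{Q-1} \alpha_p(k)\,x^{k} + \sum_{k=1}^{Q-1} \alpha_n(k)\,x^{-k}. \tag{59}
$$

Equations (54) and (55), referenced in Appendix I, are as follows:

$$
\begin{aligned}
\frac{\partial}{\partial \varepsilon_{e,i}} \left(R_y^{-1}\right)_{p,q}
= \frac{j2\pi N}{N_s} \Bigg\{ & -C_0\,(p-q)\, w(\varepsilon_{e,i})^{N(p-q)}
- C_1 \sum_{n=1}^{Q} (n-q)\, w(\varepsilon_{e,i})^{N(n-q)}\,\Gamma(p,n) \\
& - C_1 \sum_{n=1}^{Q} (p-n)\, w(\varepsilon_{e,i})^{N(p-n)}\,\Gamma(n,q) \\
& - C_2 \sum_{m=1}^{Q}\sum_{n=1}^{Q} \Big[ (m-n)\, w(\varepsilon_{e,i})^{N(m-n)}\,\Gamma(p,m)\Gamma(n,q) \\
& \qquad + (n-q)\, w(\varepsilon_{e,i})^{N(n-q)}\,\Gamma(p,m)\Gamma(m,n)
+ (p-m)\, w(\varepsilon_{e,i})^{N(p-m)}\,\Gamma(m,n)\Gamma(n,q) \Big] \Bigg\}
\end{aligned}
\tag{54}
$$

$$
\begin{aligned}
\frac{\partial}{\partial \varepsilon_{e,i}} \Lambda(\varepsilon)
= \frac{j2\pi N}{N_s} \sum_{p=1}^{Q}\sum_{q=1}^{Q} \Bigg\{ & N\sigma_x^2\,(q-p)\,x^{q-p} \Bigg[ C_0\Gamma(p,q) + C_1 \sum_{n=1}^{Q} \Gamma(p,n)\Gamma(n,q) + C_2 \sum_{m=1}^{Q}\sum_{n=1}^{Q} \Gamma(p,m)\Gamma(m,n)\Gamma(n,q) \Bigg] \\
& + \gamma(p,q) \Bigg[ C_0\,(p-q)\,x^{p-q} + C_1 \sum_{n=1}^{Q} \Big( (n-q)\,x^{n-q}\,\Gamma(p,n) + (p-n)\,x^{p-n}\,\Gamma(n,q) \Big) \\
& \qquad + C_2 \sum_{m=1}^{Q}\sum_{n=1}^{Q} \Big( (m-n)\,x^{m-n}\,\Gamma(p,m)\Gamma(n,q) + (n-q)\,x^{n-q}\,\Gamma(p,m)\Gamma(m,n) \\
& \qquad\qquad + (p-m)\,x^{p-m}\,\Gamma(m,n)\Gamma(n,q) \Big) \Bigg] \Bigg\}
\end{aligned}
\tag{55}
$$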
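Setting the two-sided sum in (59) to zero and multiplying by x^(Q−1) gives an ordinary polynomial of degree 2(Q − 1) in x, which a standard root finder can solve. The sketch below shows this step with `numpy.roots`. The coefficient arrays are taken as given inputs, and the two post-processing choices (keeping roots within a tolerance `tol` of the unit circle, and inverting (51) on the principal branch of the phase) are assumptions for illustration; the paper's own root-selection rule is not reproduced here.

```python
import numpy as np

def laurent_roots(alpha_p, alpha_n):
    """Roots of sum_k alpha_p[k-1] x^k + sum_k alpha_n[k-1] x^(-k), k = 1..Q-1."""
    alpha_p = np.asarray(alpha_p, dtype=complex)
    alpha_n = np.asarray(alpha_n, dtype=complex)
    # After multiplying by x^(Q-1), list coefficients from the highest
    # power to the lowest; the x^(Q-1) coefficient is zero because
    # (59) has no k = 0 term.
    coeffs = np.concatenate([alpha_p[::-1], [0.0], alpha_n])
    return np.roots(coeffs)

def candidate_cfos(alpha_p, alpha_n, N, Ns, tol=1e-6):
    """Map roots near the unit circle back to CFO candidates via (51).

    `tol` and the principal-branch phase inversion are illustrative
    assumptions, not the selection rule used in the paper.
    """
    roots = laurent_roots(alpha_p, alpha_n)
    on_circle = roots[np.abs(np.abs(roots) - 1.0) < tol]
    return np.angle(on_circle) * Ns / (2 * np.pi * N)

# Toy check with Q = 2: x - x0^2 / x = 0 has roots +x0 and -x0.
N, Ns = 8, 128
x0 = np.exp(2j * np.pi * N * 0.3 / Ns)
eps = candidate_cfos([1.0], [-x0 ** 2], N, Ns)
```

In the toy case, one of the returned candidates corresponds to a CFO of 0.3. With noisy coefficients the roots move off the unit circle, so a looser tolerance or a comparison of likelihood values at the candidates would be needed in practice.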


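$$
\begin{aligned}
\frac{\partial}{\partial \varepsilon_{e,i}} \Lambda(\varepsilon)
&= \sum_{p=1}^{Q}\sum_{q=1}^{Q} \Bigg\{ N\sigma_x^2\,(q-p)\,x^{q-p} \Bigg[ C_0\Gamma(p,q) + C_1 \sum_{n=1}^{Q} \Gamma(p,n)\Gamma(n,q) + C_2 \sum_{m=1}^{Q}\sum_{n=1}^{Q} \Gamma(p,m)\Gamma(m,n)\Gamma(n,q) \Bigg] \\
&\qquad + \gamma(p,q) \Bigg[ C_0\,(p-q)\,x^{p-q} + C_1 \sum_{n=1}^{Q} \Big( (n-q)\,x^{n-q}\,\Gamma(p,n) + (p-n)\,x^{p-n}\,\Gamma(n,q) \Big) \\
&\qquad\quad + C_2 \sum_{m=1}^{Q}\sum_{n=1}^{Q} \Big( (m-n)\,x^{m-n}\,\Gamma(p,m)\Gamma(n,q) + (n-q)\,x^{n-q}\,\Gamma(p,m)\Gamma(m,n) + (p-m)\,x^{p-m}\,\Gamma(m,n)\Gamma(n,q) \Big) \Bigg] \Bigg\} \\
&= \sum_{p=1}^{Q}\sum_{q=1}^{Q} F_1(p,q)\,x^{q-p} + \sum_{p=1}^{Q}\sum_{q=1}^{Q} F_2(p,q)\,x^{p-q} \\
&\quad + \sum_{p=1}^{Q}\sum_{q=1}^{Q} \gamma(p,q) \Bigg[ C_1 \sum_{n=1}^{Q} \Big( (n-q)\,x^{n-q}\,\Gamma(p,n) + (p-n)\,x^{p-n}\,\Gamma(n,q) \Big) \\
&\qquad + C_2 \sum_{m=1}^{Q}\sum_{n=1}^{Q} \Big( (m-n)\,x^{m-n}\,\Gamma(p,m)\Gamma(n,q) + (n-q)\,x^{n-q}\,\Gamma(p,m)\Gamma(m,n) + (p-m)\,x^{p-m}\,\Gamma(m,n)\Gamma(n,q) \Big) \Bigg] \\
&= 0
\end{aligned}
\tag{56}
$$

$$
\begin{aligned}
\left[\frac{\partial}{\partial \varepsilon_{e,i}} \Lambda(\varepsilon)\right]^{+}
&= \sum_{k=1}^{Q-1} \Bigg[ \sum_{\substack{p=1 \\ q=p+k}}^{Q-k} F_1(p,q)\,x^{k} + \sum_{\substack{q=1 \\ p=q+k}}^{Q-k} F_2(p,q)\,x^{k}
+ C_1 \sum_{p=1}^{Q} \sum_{\substack{q=1 \\ n=q+k}}^{Q-k} \gamma(p,q)\Gamma(p,n)\,k\,x^{k} \\
&\qquad + C_1 \sum_{q=1}^{Q} \sum_{\substack{n=1 \\ p=n+k}}^{Q-k} \gamma(p,q)\Gamma(n,q)\,k\,x^{k}
+ C_2 \sum_{p=1}^{Q}\sum_{q=1}^{Q} \sum_{\substack{n=1 \\ m=n+k}}^{Q-k} \gamma(p,q)\Gamma(p,m)\Gamma(n,q)\,k\,x^{k} \\
&\qquad + C_2 \sum_{p=1}^{Q}\sum_{m=1}^{Q} \sum_{\substack{q=1 \\ n=q+k}}^{Q-k} \gamma(p,q)\Gamma(p,m)\Gamma(m,n)\,k\,x^{k}
+ C_2 \sum_{n=1}^{Q}\sum_{q=1}^{Q} \sum_{\substack{m=1 \\ p=m+k}}^{Q-k} \gamma(p,q)\Gamma(m,n)\Gamma(n,q)\,k\,x^{k} \Bigg] \\
&= \sum_{k=1}^{Q-1} \alpha_p(k)\,x^{k}
\end{aligned}
\tag{57}
$$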


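$$
\begin{aligned}
\left[\frac{\partial}{\partial \varepsilon_{e,i}} \Lambda(\varepsilon)\right]^{-}
&= \sum_{k=-1}^{1-Q} \Bigg[ \sum_{\substack{p=1-k \\ q=p+k}}^{Q} F_1(p,q)\,x^{k} + \sum_{\substack{q=1-k \\ p=q+k}}^{Q} F_2(p,q)\,x^{k}
+ C_1 \sum_{p=1}^{Q} \sum_{\substack{q=1-k \\ n=q+k}}^{Q} \gamma(p,q)\Gamma(p,n)\,k\,x^{k} \\
&\qquad + C_1 \sum_{q=1}^{Q} \sum_{\substack{n=1-k \\ p=n+k}}^{Q} \gamma(p,q)\Gamma(n,q)\,k\,x^{k}
+ C_2 \sum_{p=1}^{Q}\sum_{q=1}^{Q} \sum_{\substack{n=1-k \\ m=n+k}}^{Q} \gamma(p,q)\Gamma(p,m)\Gamma(n,q)\,k\,x^{k} \\
&\qquad + C_2 \sum_{p=1}^{Q}\sum_{m=1}^{Q} \sum_{\substack{q=1-k \\ n=q+k}}^{Q} \gamma(p,q)\Gamma(p,m)\Gamma(m,n)\,k\,x^{k}
+ C_2 \sum_{n=1}^{Q}\sum_{q=1}^{Q} \sum_{\substack{m=1-k \\ p=m+k}}^{Q} \gamma(p,q)\Gamma(m,n)\Gamma(n,q)\,k\,x^{k} \Bigg] \\
&= \sum_{k=1}^{Q-1} \Bigg[ \sum_{\substack{q=1 \\ p=q+k}}^{Q-k} F_1(p,q)\,x^{-k} + \sum_{\substack{p=1 \\ q=p+k}}^{Q-k} F_2(p,q)\,x^{-k}
- C_1 \sum_{p=1}^{Q} \sum_{\substack{n=1 \\ q=n+k}}^{Q-k} \gamma(p,q)\Gamma(p,n)\,k\,x^{-k} \\
&\qquad - C_1 \sum_{q=1}^{Q} \sum_{\substack{p=1 \\ n=p+k}}^{Q-k} \gamma(p,q)\Gamma(n,q)\,k\,x^{-k}
- C_2 \sum_{p=1}^{Q}\sum_{q=1}^{Q} \sum_{\substack{m=1 \\ n=m+k}}^{Q-k} \gamma(p,q)\Gamma(p,m)\Gamma(n,q)\,k\,x^{-k} \\
&\qquad - C_2 \sum_{p=1}^{Q}\sum_{m=1}^{Q} \sum_{\substack{n=1 \\ q=n+k}}^{Q-k} \gamma(p,q)\Gamma(p,m)\Gamma(m,n)\,k\,x^{-k}
- C_2 \sum_{n=1}^{Q}\sum_{q=1}^{Q} \sum_{\substack{p=1 \\ m=p+k}}^{Q-k} \gamma(p,q)\Gamma(m,n)\Gamma(n,q)\,k\,x^{-k} \Bigg] \\
&= \sum_{k=1}^{Q-1} \alpha_n(k)\,x^{-k}
\end{aligned}
\tag{58}
$$

REFERENCES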

[1] J. J. van de Beek, M. Sandell, and P. O. Borjesson, “ML estimation of time and frequency offset in OFDM systems,” IEEE Trans. Signal Process., vol. 45, no. 7, pp. 1800–1805, Jul. 1997.

[2] H. T. Hsieh and W. R. Wu, “Maximum likelihood timing and carrier frequency offset estimation for OFDM systems with periodic preamble,” IEEE Trans. Veh. Technol., vol. 58, no. 8, pp. 4224–4237, Oct. 2009.

[3] J. Chen, Y. C. Wu, S. C. Chan, and T. S. Ng, “Joint maximum-likelihood CFO and channel estimation for OFDMA uplink using importance sampling,” IEEE Trans. Veh. Technol., vol. 57, no. 6, pp. 3462–3470, Nov. 2008.

[4] M. O. Pun, M. Morelli, and C.-C. J. Kuo, “Maximum-likelihood synchronization and channel estimation for OFDMA uplink transmissions,” IEEE Trans. Commun., vol. 54, no. 4, pp. 726–736, Apr. 2006.

[5] Z. Wang, Y. Xin, and G. Mathew, “Iterative carrier-frequency offset estimation for generalized OFDMA uplink transmission,” IEEE Trans. Wireless Commun., vol. 8, no. 3, pp. 1373–1383, Mar. 2009.

[6] S. Sezginer and P. Bianchi, “Asymptotically efficient reduced complexity frequency offset and channel estimators for uplink MIMO-OFDMA systems,” IEEE Trans. Signal Process., vol. 56, no. 3, pp. 964–979, Mar. 2008.

[7] J. J. van de Beek, P. O. Borjesson, M. L. Boucheret, D. Landstrom, J. M. Arenas, O. Odling, M. Wahlqvist, and S. K. Wilson, “A time and frequency synchronization scheme for multiuser OFDM,” IEEE J. Sel. Areas Commun., vol. 17, no. 11, pp. 1900–1914, Nov. 1999.

[8] S. Barbarossa, M. Pompili, and G. B. Giannakis, “Channel-independent synchronization of orthogonal frequency division multiple access systems,” IEEE J. Sel. Areas Commun., vol. 20, no. 2, pp. 474–486, Feb. 2002.

[9] Z. Cao, U. Tureli, and Y. D. Yao, “Deterministic multiuser carrier frequency offset estimation for interleaved OFDMA uplink,” IEEE Trans. Commun., vol. 52, no. 9, pp. 1585–1594, Sep. 2004.

[10] J. Lee, S. Lee, K. J. Bang, S. Cha, and D. Hong, “Carrier frequency offset estimation using ESPRIT for interleaved OFDMA uplink systems,” IEEE Trans. Veh. Technol., vol. 56, no. 5, pp. 3227–3231, Sep. 2007.

[11] T. J. Lv and J. Chen, “ML estimation of timing and frequency offset using multiple OFDM symbols in OFDM systems,” in Proc. IEEE Global Telecommun. Conf., Dec. 2003, vol. 4, pp. 2280–2284.

[12] ETSI, BRAN; HIPERLAN Type 2; Physical (PHY) Layer Specification, Tech. Spec. 101 475, 2001.

[13] S. Kay and S. Saha, “Mean likelihood frequency estimation,” IEEE Trans. Signal Process., vol. 48, no. 7, pp. 1937–1946, Jul. 2000.

[14] C. D. Meyer, Matrix Analysis and Applied Linear Algebra. Philadelphia, PA: SIAM, 2000.

[15] K. Berberidis, S. Rantos, and J. Palicot, “A step-by-step quasi-Newton algorithm in the frequency domain and its application to adaptive channel equalization,” IEEE Trans. Signal Process., vol. 52, no. 12, pp. 3335–3344, Dec. 2004.

[16] H. L. Van Trees, Detection, Estimation, and Modulation Theory, Part IV, Optimum Array Processing. Hoboken, NJ: Wiley, 2002.

[17] D. S. Bernstein, Matrix Mathematics: Theory, Facts, and Formulas With Application to Linear Systems Theory. Princeton, NJ: Princeton Univ. Press, 2005.

[18] S. Haykin, Adaptive Filter Theory, 4th ed. Englewood Cliffs, NJ: Prentice-Hall, 2002.

[19] Y. S. Choi, P. J. Voltz, and F. A. Cassara, “ML estimation of carrier frequency offset for multicarrier signals in Rayleigh fading channels,” IEEE Trans. Veh. Technol., vol. 50, no. 2, pp. 644–655, Mar. 2001.

[20] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory. Englewood Cliffs, NJ: Prentice-Hall, 1993.

[21] D. Kincaid and W. Cheney, Mathematics of Scientific Computing, 2nd ed. Pacific Grove, CA: Brooks/Cole, 1996.

[22] A. H. Sayed, Fundamentals of Adaptive Filtering. Hoboken, NJ: Wiley, 2003.


[23] M.-O. Pun, M. Morelli, and C.-C. J. Kuo, “Iterative detection and frequency synchronization for OFDMA uplink transmissions,” IEEE Trans. Wireless Commun., vol. 6, no. 2, pp. 629–639, Feb. 2007.

[24] M. Guenach, F. Simoens, H. Wymeersch, and M. Moeneclaey, “Uplink acquisition of a new user accessing a fixed wireless DS-CDMA system,” in Proc. IEEE 6th Workshop SPAWC, New York, Jun. 2005, pp. 353–357.

Hung-Tao Hsieh received the B.S. degree in physics in 2000 from the National Central University, Jhongli, Taiwan, and the M.S. degree in electrophysics in 2002 from the National Chiao Tung University, Hsinchu, Taiwan, where he is currently working toward the Ph.D. degree.

His research interests include detection/estimation theories and communication signal processing.

Wen-Rong Wu (M’89) received the B.S. degree in mechanical engineering from Tatung Institute of Technology, Taipei, Taiwan, in 1980 and the M.S. degree in mechanical and electrical engineering and the Ph.D. degree in electrical engineering from the State University of New York at Buffalo in 1985, 1986, and 1989, respectively.

Since August 1989, he has been a faculty member with the Department of Communication Engineering, National Chiao Tung University, Hsinchu, Taiwan. His research interests include statistical signal processing and digital communication.