Linear coherent distributed estimation over unknown channels


Chien-Hsien Wu, Ching-An Lin

Department of Electrical and Control Engineering, National Chiao Tung University, 1001 University Road, Hsinchu 300, Taiwan

Article info

Article history: Received 29 September 2009; received in revised form 5 July 2010; accepted 1 October 2010; available online 8 October 2010.

Keywords: Distributed estimation; Channel estimation; Coherent multiple access; Power allocation; Sensor networks

Abstract

We study linear distributed estimation with a coherent multiple access channel model and an MMSE fusion rule. The flat fading channels are assumed unknown at the fusion center and need to be estimated. We adopt a two-phase approach, which first estimates the channels and then estimates the source signal, to minimize the MSE of the estimated signal. We study optimal power allocation under a total network power constraint. We consider the optimal power allocation scheme, in which the training power and the data power for each sensor are optimized, and the equal power allocation scheme, in which the training power is optimized while the data power is set equal across sensors. In both schemes, the problem is formulated as a constrained optimization problem and an analytical closed-form solution is obtained. The analytical results reveal that (i) with estimated channels, the MSE approaches a finite nonzero value as the number of sensors increases; (ii) the optimal training powers are the same in both schemes; (iii) compared with the case when channels are known, the penalty caused by channel estimation becomes worse as the number of sensors increases. Simulation results verify our findings.

1. Introduction

A wireless sensor network (WSN) is composed of a large number of signal processing sensors, each capable of simple local computation and short-range, low data-rate communication, and a fusion center (FC) that has more powerful communication and processing capability. The fusion center receives signals transmitted from the sensors over the wireless channels and combines the signals for a specific processing purpose. One example of such a distributed signal processing scheme is distributed estimation: a certain parameter or variable is measured by the sensors, the measurements are sent to the fusion center, and the goal is to estimate the parameter based on the distributed sensor measurements [1,2].

In the distributed estimation scenario, the sensors can transmit measurements to the fusion center using a quantized or an unquantized strategy. In the quantized strategy, the measurements sent from the sensors are quantized, encoded, and transmitted via digital modulation. Due to practical limitations, it is important to make efficient use of energy and bandwidth. Some research works attempt to minimize transmitted power via bit length assignment under a predefined MSE constraint [3,4], while others focus on the search for quantization thresholds for a fixed bit length [5,6]. In the unquantized strategy, the sensors send raw measurements directly through the channels without quantization, and thus analog transmission, such as the amplify-and-forward approach, is used. It is asserted in [11] that the amplify-and-forward approach is optimal over additive white Gaussian noise channels. Along this line, many papers study the minimization of the mean squared estimation error under a total network power constraint by optimally allocating the transmitted power for each sensor [7–10], and others analyze the asymptotic behavior as the network power or the number of sensors increases [12,13].

Signal Processing; journal homepage: www.elsevier.com/locate/sigpro

Research sponsored by National Science Council under Grant NSC 97-2221-E009-046-MY3.

Corresponding author. Tel.: +886 3 5712121x54403.

E-mail addresses: [email protected] (C.-H. Wu), [email protected] (C.-A. Lin).

In the amplify-and-forward approach, two types of channel models are used: the orthogonal multiple access channel (MAC) [7–10] and the coherent MAC [7]. The linear minimum mean squared error (LMMSE) estimator for the coherent MAC model and the orthogonal MAC model with channel knowledge at the FC was discussed and its performance analyzed in [7]. The results indicate that the MSE for the orthogonal MAC model reaches a finite nonzero value as the number of sensors increases without bound. Şenol and Tepedelenlioğlu [8] consider the orthogonal MAC model with unknown flat Rayleigh fading channels. A two-phase approach, which first estimates the channels and then estimates the source signal, is proposed. The result shows that with unknown fading channels, increasing the number of sensors may eventually lead to a degradation in performance if the total network power is fixed.

In this paper, we consider the coherent MAC model with unknown flat fading channels. We derive a training-based LMMSE channel estimator. The channel estimate is then used to obtain the LMMSE estimate of the source signal. We consider the allocation of power to each sensor for training and for data transmission, under a total network power constraint, so as to minimize the MSE of the estimated source signal. We consider two schemes: (i) the optimal power allocation scheme and (ii) the equal power allocation scheme. In (i), the training power and the data power for each sensor are optimized, and the power gain (for data) of each sensor is computed based on the respective channel estimate and sent to the sensor from the FC. In (ii), the training power is optimized, but the data power for each sensor is set equal, and only the phase of the estimated channel is fed back to each sensor from the FC. In both schemes, the problem is formulated as a constrained optimization problem and an analytical closed-form solution is obtained. We compare the performance of the distributed estimation scheme with estimated channels to that with actual known channels. The main results of this paper are as follows: (i) the MSE with estimated channels at the FC approaches a finite nonzero value as the number of sensors increases; (ii) the optimal training powers are the same in both schemes; (iii) compared with the case when channels are known, the penalty caused by the channel estimation error becomes worse as the number of sensors increases.

The rest of this paper is organized as follows. Section 2 describes the system model. Section 3 derives the results of the two-phase approach, namely the LMMSE estimation of the channels and the source. Section 4 formulates the optimal power allocation problem as an optimization problem. The problem is solved for two cases: when channels are known and when channels are estimated. A comparison of the performance of the two cases is given. Section 5 describes the equal power allocation scheme, in which the training power is optimized. A performance analysis of the scheme is also given. In Section 6, simulation results are given to verify the analytical results obtained in Sections 4 and 5. Section 7 is a brief conclusion.

2. System model

We consider a wireless sensor network with $K$ sensors for estimating a random source signal $\theta$, as depicted in Fig. 1. The measurement at the $k$th sensor is corrupted by an additive noise $n_k$ and amplified by a factor $\alpha_k$ before it is transmitted to the fusion center (FC) through a flat fading channel $h_k$. The signal $y$ received at the FC can be expressed as

$$y = \sum_{k=1}^{K} h_k \alpha_k (\theta + n_k) + \nu \tag{1}$$

where $\nu$ is the additive noise at the receiver. We assume (i) $E[\theta] = 0$ and $E[|\theta|^2] = \sigma_\theta^2$, where $|x|$ is the magnitude of $x$; (ii) the measurement noises are independent and $n_k \sim \mathcal{CN}(0, \sigma_n^2)$ for $k = 1, 2, \ldots, K$, that is, the $n_k$'s are independent and circular Gaussian with zero mean and variance $\sigma_n^2$; (iii) the channels are independent and $h_k \sim \mathcal{CN}(0, \sigma_h^2)$; (iv) $\nu \sim \mathcal{CN}(0, \sigma_\nu^2)$; and (v) the source signal, the channels, the measurement noises, and the receiver noise are uncorrelated. Specifically, for $1 \le k, l \le K$, $E[\theta^* n_k] = 0$, $E[n_k^* h_l] = 0$, $E[\theta^* \nu] = 0$, $E[\theta^* h_k] = 0$, and $E[h_k^* \nu] = 0$, where $x^*$ denotes the complex conjugate of $x$.
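To make the signal model in (1) concrete, the following minimal sketch draws one realization of the source, noises, and channels and forms the received signal. The helper names (`crandn`, `draw_scenario`, `received_signal`) are illustrative assumptions and do not come from the paper; only the standard library is used.

```python
import math
import random


def crandn(variance, rng):
    """Draw one zero-mean circular complex Gaussian sample with the given variance."""
    s = math.sqrt(variance / 2.0)  # real and imaginary parts each carry half the variance
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def draw_scenario(K, var_theta, var_n, var_h, var_nu, rng):
    """Draw theta, the K sensor noises n_k, the K channels h_k, and the receiver noise nu."""
    theta = crandn(var_theta, rng)
    n = [crandn(var_n, rng) for _ in range(K)]
    h = [crandn(var_h, rng) for _ in range(K)]
    nu = crandn(var_nu, rng)
    return theta, n, h, nu


def received_signal(theta, alphas, channels, sensor_noise, receiver_noise):
    """Evaluate y = sum_k h_k * alpha_k * (theta + n_k) + nu, as in (1)."""
    y = sum(h * a * (theta + n) for h, a, n in zip(channels, alphas, sensor_noise))
    return y + receiver_noise
```

A call such as `received_signal(theta, [1.0] * K, h, n, nu)` then corresponds to unit amplification at every sensor.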

The problem is to estimate the parameter $\theta$ based on the received signal $y$ at the FC. The fading channels are assumed unknown. We consider a two-phase approach similar to that proposed in [8]: first estimate the channels using training symbols sent from the sensors, and then estimate $\theta$ based on the estimated channels and $y$. In both phases, we seek the linear minimum mean squared error (LMMSE) estimator.

3. LMMSE estimation

3.1. Channel estimation

During the training phase, the sensors send training symbols in sequence: the training period is divided into $K$ time intervals and only the $k$th sensor sends a training symbol $t_k$ over the $k$th time interval. Thus, the received signal at the $k$th time interval can be expressed as $y_k = h_k t_k + \nu_k$, $k = 1, 2, \ldots, K$, where $\nu_k \sim \mathcal{CN}(0, \sigma_\nu^2)$ and $E[\nu_i^* \nu_j] = 0$ for $i \ne j$. For a given training sequence $t_k$, the LMMSE estimator of $h_k$ is given by [14, p. 382]

$$\hat h_k = \frac{\sigma_h^2}{|t_k|^2 \sigma_h^2 + \sigma_\nu^2}\, t_k^* y_k \tag{2}$$

and the corresponding mean squared error (MSE) is

$$\delta_k^2 = E[|h_k - \hat h_k|^2] = \frac{\sigma_h^2 \sigma_\nu^2}{|t_k|^2 \sigma_h^2 + \sigma_\nu^2}, \quad k = 1, \ldots, K \tag{3}$$

The MSE of $\hat h_k$ decreases as the power of the training symbol $|t_k|^2$ increases. The LMMSE problem under the training power constraint $\sum_{k=1}^{K} |t_k|^2 \le P_t$ can be formulated as

$$\min_{p_k:\, 1 \le k \le K} \ \frac{1}{K} \sum_{k=1}^{K} \delta_k^2 \quad \text{subject to} \quad \sum_{k=1}^{K} p_k \le P_t \ \text{ and } \ p_k \ge 0, \ k = 1, \ldots, K$$

where $p_k = |t_k|^2$ is the training power of the $k$th sensor. The problem can be solved using the standard Karush–Kuhn–Tucker (KKT) conditions [8], and the solution is $|t_k|^2 = P_t/K$ for all $k$, as expected since the channels are independent and identically distributed. In particular, we choose the training symbol to be real and positive, that is, $t_k = \sqrt{P_t/K}$, and the resulting channel estimate is

$$\hat h_k = \frac{\sigma_h^2 \sqrt{K P_t}}{\sigma_h^2 P_t + K \sigma_\nu^2}\, y_k, \quad k = 1, 2, \ldots, K \tag{4}$$

with the corresponding MSE

$$\delta_k^2 = \frac{K \sigma_h^2 \sigma_\nu^2}{\sigma_h^2 P_t + K \sigma_\nu^2}, \quad k = 1, \ldots, K \tag{5}$$

We note that with such choices of training symbols, both the received signal $y_k$ and the channel estimate $\hat h_k$ are circular Gaussian.

[Fig. 1: block diagram of the sensor network, showing the sensor noises $n_1, n_2, \ldots, n_K$, the channels $h_1, h_2, \ldots, h_K$, and the FC producing $\hat\theta$.]
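As a sanity check on (4) and (5), the sketch below compares the closed-form channel estimation MSE with a Monte Carlo estimate for a single sensor. The function names and the default trial count are assumptions made for illustration; the formulas are those of (4) and (5) with $t_k = \sqrt{P_t/K}$.

```python
import math
import random


def lmmse_gain(var_h, var_nu, Pt, K):
    """Coefficient multiplying y_k in (4) when t_k = sqrt(Pt / K)."""
    return var_h * math.sqrt(K * Pt) / (var_h * Pt + K * var_nu)


def channel_mse(var_h, var_nu, Pt, K):
    """Channel estimation MSE delta_k^2 from (5)."""
    return K * var_h * var_nu / (var_h * Pt + K * var_nu)


def crandn(variance, rng):
    """Zero-mean circular complex Gaussian sample with the given variance."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def empirical_channel_mse(var_h, var_nu, Pt, K, trials=20000, seed=0):
    """Monte Carlo estimate of E|h_k - h_hat_k|^2 for one sensor."""
    rng = random.Random(seed)
    t = math.sqrt(Pt / K)          # real, positive training symbol
    c = lmmse_gain(var_h, var_nu, Pt, K)
    total = 0.0
    for _ in range(trials):
        h = crandn(var_h, rng)
        y = h * t + crandn(var_nu, rng)   # y_k = h_k t_k + nu_k
        total += abs(h - c * y) ** 2
    return total / trials
```

With no training power the estimate carries no information, so `channel_mse` returns the prior variance of the channel; increasing `Pt` drives it toward zero.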

3.2. Source estimation

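During the second phase, the channel estimates $\hat h_k$ are available at the FC, although the actual channels are unknown. We express the received signal $y$ in (1) in terms of $\hat h_k$ as

$$y = \sum_{k=1}^{K} \hat h_k \alpha_k \theta + \sum_{k=1}^{K} \hat h_k \alpha_k n_k + \varepsilon + \nu \tag{6}$$

where $\varepsilon = \sum_{k=1}^{K} (h_k - \hat h_k)\alpha_k(\theta + n_k)$ is contributed by the channel estimation error. Let $\hat{\mathbf h} = [\hat h_1\ \hat h_2\ \cdots\ \hat h_K]^T$ be the vector of channel estimates. The LMMSE estimate of $\theta$ given $\hat{\mathbf h}$ is

$$\hat\theta = a y \quad \text{where} \quad a = \frac{E[\theta y^* \mid \hat{\mathbf h}]}{E[|y|^2 \mid \hat{\mathbf h}]} \tag{7}$$

From (6) it follows that

$$E[\theta y^* \mid \hat{\mathbf h}] = E\left[\theta\left(\sum_{k=1}^{K} \hat h_k^* \alpha_k^* \theta^* + \sum_{k=1}^{K} \hat h_k^* \alpha_k^* n_k^* + \varepsilon^* + \nu^*\right) \,\middle|\, \hat{\mathbf h}\right] = \sum_{k=1}^{K} \hat h_k^* \alpha_k^* \sigma_\theta^2 \tag{8}$$

where the last equality follows from the assumptions that the source signal is uncorrelated with the measurement noise and the receiver noise, and that $\hat h_k = E[h_k \mid y_k] = E[h_k \mid \hat h_k]$ since $\hat h_k$ is a linear function of $y_k$. It is derived in Appendix A that

$$E[|y|^2 \mid \hat{\mathbf h}] = \left|\sum_{k=1}^{K} \hat h_k \alpha_k\right|^2 \sigma_\theta^2 + \sum_{k=1}^{K} |\hat h_k|^2 |\alpha_k|^2 \sigma_n^2 + \sigma_\nu^2 + (\sigma_\theta^2 + \sigma_n^2)\,\delta_1^2 \sum_{k=1}^{K} |\alpha_k|^2 \tag{9}$$

The MSE incurred by (7) is

$$\begin{aligned} J &= E[|\theta - \hat\theta|^2 \mid \hat{\mathbf h}] = \sigma_\theta^2 - a E[\theta^* y \mid \hat{\mathbf h}] - a^* E[\theta y^* \mid \hat{\mathbf h}] + |a|^2 E[|y|^2 \mid \hat{\mathbf h}] \\ &= \left(\frac{1}{\sigma_\theta^2} + \frac{\left|\sum_{k=1}^{K} \hat h_k \alpha_k\right|^2}{\sum_{k=1}^{K} |\hat h_k|^2 |\alpha_k|^2 \sigma_n^2 + \sigma_\nu^2 + (\sigma_\theta^2 + \sigma_n^2)\,\delta_1^2 \sum_{k=1}^{K} |\alpha_k|^2}\right)^{-1} \end{aligned} \tag{10}$$

When the channel $h_k$ is available at the FC, we can set $\hat h_k = h_k$ and $\delta_k^2 = 0$ in (10), and the corresponding MSE becomes

$$J_o = \left(\frac{1}{\sigma_\theta^2} + \frac{\left|\sum_{k=1}^{K} h_k \alpha_k\right|^2}{\sum_{k=1}^{K} |h_k|^2 |\alpha_k|^2 \sigma_n^2 + \sigma_\nu^2}\right)^{-1} \tag{11}$$

The MSE $J_o$ is a lower bound of $J$ in (10) and can serve as a benchmark against which the performance of the estimator (7) can be compared.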
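The conditional MSE expressions (10) and (11) are straightforward to evaluate numerically. The sketch below is an illustrative helper, with function and argument names chosen here rather than taken from the paper; `delta2` stands for the common channel estimation MSE $\delta_1^2$ of (5).

```python
def mse_estimated(h_hat, alphas, var_theta, var_n, var_nu, delta2):
    """Conditional MSE J of (10), given channel estimates h_hat and gains alphas."""
    signal = abs(sum(h * a for h, a in zip(h_hat, alphas))) ** 2
    sensor_noise = var_n * sum(abs(h) ** 2 * abs(a) ** 2 for h, a in zip(h_hat, alphas))
    # Extra noise caused by the channel estimation error term epsilon in (6).
    est_error = (var_theta + var_n) * delta2 * sum(abs(a) ** 2 for a in alphas)
    return 1.0 / (1.0 / var_theta + signal / (sensor_noise + var_nu + est_error))


def mse_known(h, alphas, var_theta, var_n, var_nu):
    """Benchmark MSE J_o of (11): set h_hat = h and delta2 = 0 in (10)."""
    return mse_estimated(h, alphas, var_theta, var_n, var_nu, 0.0)
```

For the same channel values and gains, any positive `delta2` enlarges the denominator of the signal-to-noise term, so `mse_estimated` is never smaller than `mse_known`.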

4. Optimal power allocation

During the training phase, each sensor uses the same training symbol and thus consumes the same amount of training power $P_t/K$, where $P_t$ is the total allocated training power. From (5), it is clear that as $P_t$ increases, the MSE in channel estimation decreases. In a sensor network, there is likely a total power constraint, that is, an upper bound imposed on the sum of the training power and the power used to transmit data. Hence, when more power is allocated for training, less power is available for data transmission and vice versa. Under the total power constraint, the minimum MSE of $\hat\theta$, that is, $J$ in (10), depends on the training power $P_t$ and on how the remaining network power is allocated to each sensor for data transmission. In the following, we consider the optimal power allocation problem, that is, choosing $P_t$ and the data power for each sensor to minimize $J$ under a total power constraint. For comparison, we also consider the case when channel information is available, so that there is no training and no channel error, and all power is used for data transmission. The comparison of the two cases shows the penalty incurred because the channel is unknown.

4.1. When channels are known

If the channels are known at the FC, the phase of $\alpha_k$ is chosen as $\angle\alpha_k = -\angle h_k$, so that $h_k \alpha_k = |h_k||\alpha_k|$ and the MSE $J_o$ in (11) becomes

$$J_o = \left(\frac{1}{\sigma_\theta^2} + \frac{\zeta\left(\sum_{k=1}^{K} g_k |\alpha_k|\right)^2}{\zeta \sigma_n^2 \left(\sum_{k=1}^{K} g_k^2 |\alpha_k|^2\right) + 1}\right)^{-1} \tag{12}$$

where $\zeta = \sigma_h^2/\sigma_\nu^2$ is the channel SNR, and $g_k = |h_k|/\sigma_h$ is the normalized channel gain for the $k$th sensor. Such choices of phases make $J_o$ smallest among $\alpha_k$'s of the same magnitude. Note that $g_k$ has a Rayleigh distribution with density function $f_g(x) = 2x\exp(-x^2)$, $x \ge 0$, and $E[g_k] = \sqrt{\pi/4}$; $g_k^2$ has an exponential distribution with density function $f_{g^2}(x) = \exp(-x)$, $x \ge 0$, and $E[g_k^2] = 1$ [15, p. 51]. The signal transmitted from the $k$th sensor is $\alpha_k(\theta + n_k)$ with power $P_k = E[|\alpha_k(\theta + n_k)|^2] = |\alpha_k|^2(\sigma_\theta^2 + \sigma_n^2)$. From (12), the optimal power allocation problem with the total network power constrained to $P > 0$ can be formulated as the following optimization problem:

$$\begin{aligned} \min_{|\alpha_k|:\, 1 \le k \le K} \quad & \left(\frac{1}{\sigma_\theta^2} + \frac{\zeta\left(\sum_{k=1}^{K} g_k |\alpha_k|\right)^2}{\zeta \sigma_n^2 \sum_{k=1}^{K} g_k^2 |\alpha_k|^2 + 1}\right)^{-1} \\ \text{subject to} \quad & \sum_{k=1}^{K} |\alpha_k|^2 (\sigma_\theta^2 + \sigma_n^2) \le P. \end{aligned} \tag{13}$$

From (13), we make the following observations:

(i) If the inequality sign in the constraint of problem (13) is replaced by the equality sign, the solution does not change. Hence we can consider the optimization problem with an equality constraint. The argument is as follows. Since the constraint function is quadratic in $|\alpha_k|$, if a set of $|\alpha_k|$ is such that strict inequality holds, we can equally scale up each $|\alpha_k|$ so that equality holds. And if we equally scale up each $|\alpha_k|$, we get a lower function value of $J_o$ because in (12) the second term inside the parentheses becomes larger. Consequently, with optimal $|\alpha_k|$, the inequality constraint must be active.

(ii) Consider the optimal MSE in (13), say $J_o^*$, as a function of the power $P$. Then $J_o^*$ is a strictly decreasing function of $P$, that is, if $P_2 > P_1$, then $J_o^*(P_2) < J_o^*(P_1)$. The argument is similar: if the power level increases, we can equally scale up $|\alpha_k|$ to obtain a lower value of $J_o$, and thus a lower value of the optimal MSE $J_o^*$ can be obtained.

(iii) Since the function $J_o^*(P)$ is one-to-one and decreasing, the inverse function $P(J_o^*)$ is also one-to-one and decreasing. Hence instead of finding $|\alpha_k|$ that minimize $J_o$ in (12) under an equality constraint on the power level, we can find $|\alpha_k|$ that minimize the power level subject to an equality constraint on the MSE. If the constraint value on the MSE is such that the resulting minimum power level matches the given value $P$ in (13), the corresponding $|\alpha_k|$ are the optimal ones we set out to find. We thus consider the following optimization problem:

$$\begin{aligned} \min_{|\alpha_k|:\, 1 \le k \le K} \quad & \sum_{k=1}^{K} |\alpha_k|^2 (\sigma_\theta^2 + \sigma_n^2) \\ \text{subject to} \quad & \left(\frac{1}{\sigma_\theta^2} + \frac{\zeta\left(\sum_{k=1}^{K} g_k |\alpha_k|\right)^2}{\zeta \sigma_n^2 \sum_{k=1}^{K} g_k^2 |\alpha_k|^2 + 1}\right)^{-1} = J_o \end{aligned} \tag{14}$$

where $0 < J_o \le \sigma_\theta^2$.

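The solution of (14) is derived in Appendix B and given by

$$|\alpha_k|^2 = \mu \left(\sum_{k=1}^{K} \frac{g_k^2 (\sigma_\theta^2 + \sigma_n^2)}{[(\sigma_\theta^2 + \sigma_n^2) + \sigma_n^2 \zeta g_k^2 \mu]^2}\right)^{-1} \frac{g_k^2}{[(\sigma_\theta^2 + \sigma_n^2) + \sigma_n^2 \zeta g_k^2 \mu]^2} \tag{15}$$

where $\mu$ satisfies

$$\sum_{k=1}^{K} \frac{\zeta g_k^2}{\zeta \sigma_n^2 g_k^2 + (\sigma_\theta^2 + \sigma_n^2)/\mu} = \frac{1}{J_o} - \frac{1}{\sigma_\theta^2} \tag{16}$$

The multiplier $\mu$ is the total network power, since from (15) we have $\sum_{k=1}^{K} |\alpha_k|^2 (\sigma_\theta^2 + \sigma_n^2) = \mu$. It is also clear from (16) that there is a one-to-one correspondence between the total power $\mu$ and the constraint $J_o$, and that $\mu$ increases as $J_o$ decreases and vice versa. Hence the original problem (13) is solved if we choose $J_o$ so that $\mu$ is equal to $P$, and the choice is

$$J_o = \left(\frac{1}{\sigma_\theta^2} + \sum_{k=1}^{K} \frac{g_k^2}{\sigma_n^2 g_k^2 + (\sigma_\theta^2 + \sigma_n^2)\zeta^{-1} P^{-1}}\right)^{-1} \tag{17}$$

which is the minimum MSE of (13). Accordingly, the optimal power allocation is, for $1 \le k \le K$,

$$P_k = |\alpha_k|^2 (\sigma_\theta^2 + \sigma_n^2) = \left(\sum_{k=1}^{K} \frac{g_k^2 (\sigma_\theta^2 + \sigma_n^2)}{[(\sigma_\theta^2 + \sigma_n^2) + \sigma_n^2 \zeta g_k^2 P]^2}\right)^{-1} \frac{g_k^2 (\sigma_\theta^2 + \sigma_n^2)}{[(\sigma_\theta^2 + \sigma_n^2) + \sigma_n^2 \zeta g_k^2 P]^2}\, P \tag{18}$$

and $|\alpha_k|^2 = P_k/(\sigma_\theta^2 + \sigma_n^2)$. Since the minimum MSE depends on the total network power $P$ and the number of sensors $K$, we hereafter write the MSE $J_o$ in (17) as $J_o(P, K)$.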
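The allocation (18) and the minimum MSE (17) can be cross-checked numerically: plugging the powers from (18) into the general MSE expression (12) should reproduce (17). The sketch below does this; the function names and the example gains are illustrative assumptions.

```python
import math


def optimal_allocation(g, P, var_theta, var_n, zeta):
    """Per-sensor data powers P_k from (18) for normalized gains g and total power P."""
    s = var_theta + var_n
    weights = [gk ** 2 * s / (s + var_n * zeta * gk ** 2 * P) ** 2 for gk in g]
    total = sum(weights)
    return [w / total * P for w in weights]


def min_mse_known(g, P, var_theta, var_n, zeta):
    """Minimum MSE J_o(P, K) from (17)."""
    s = var_theta + var_n
    acc = sum(gk ** 2 / (var_n * gk ** 2 + s / (zeta * P)) for gk in g)
    return 1.0 / (1.0 / var_theta + acc)


def mse_from_powers(g, powers, var_theta, var_n, zeta):
    """MSE (12) for arbitrary powers P_k = |alpha_k|^2 (var_theta + var_n)."""
    s = var_theta + var_n
    amp = [math.sqrt(p / s) for p in powers]          # |alpha_k|
    num = zeta * sum(gk * a for gk, a in zip(g, amp)) ** 2
    den = zeta * var_n * sum((gk * a) ** 2 for gk, a in zip(g, amp)) + 1.0
    return 1.0 / (1.0 / var_theta + num / den)
```

For example, with `g = [0.5, 1.0, 1.5]` the powers returned by `optimal_allocation` sum to `P`, and an equal split of the same total power gives an MSE that is no smaller than `min_mse_known`.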

As the power $P$ increases, we expect $J_o$ to decrease, which is easy to see from (17). For a fixed $K$, as $P \to \infty$, we have

$$\lim_{P \to \infty} J_o(P, K) = \frac{\sigma_\theta^2}{1 + K\beta} \tag{19}$$

where $\beta = \sigma_\theta^2/\sigma_n^2$ is the observation SNR. The limit does not go to zero but is roughly proportional to $1/K$, as we would expect. On the other hand, for a fixed $P > 0$, as $K$ increases, we have

$$\begin{aligned} \lim_{K \to \infty} J_o(P, K) &= \left(\frac{1}{\sigma_\theta^2} + \lim_{K \to \infty} K\, E\left[\frac{g_k^2}{\sigma_n^2 g_k^2 + (\sigma_\theta^2 + \sigma_n^2)\zeta^{-1} P^{-1}}\right]\right)^{-1} \\ &= \lim_{K \to \infty} \frac{1}{K} \left\{E\left[\frac{\zeta g_k^2 P}{(\sigma_\theta^2 + \sigma_n^2) + \sigma_n^2 \zeta g_k^2 P}\right]\right\}^{-1} = 0 \end{aligned} \tag{20}$$

where in the first equality we used the law of large numbers [17]. From (20), we conclude that in the coherent MAC model, the MSE decreases in the order of $1/K$ as $K$ goes to infinity even though the total network power $P$ is finite. A similar conclusion for the unit variance case, $\sigma_\theta^2 = \sigma_n^2 = \sigma_\nu^2 = 1$, appeared in [7].

4.2. When channels are estimated

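Suppose training for channel estimation consumes power $P_t$; then the remaining power for data transmission is $P - P_t$. The power allocation problem now is to optimally choose the training power $P_t$ and the data power for each sensor. The phase of $\alpha_k$ is chosen to match that of $\hat h_k$, i.e., $\angle\alpha_k = -\angle\hat h_k$. Since the LMMSE estimate $\hat h_k$ and the estimation error $h_k - \hat h_k$ are uncorrelated, we have $\sigma_h^2 = \sigma_{\hat h}^2 + \delta_k^2$, where $\delta_k^2 = E[|h_k - \hat h_k|^2]$. Using (5) and $\sigma_{\hat h}^2 = \sigma_h^2 - \delta_k^2$, we can express the MSE in (10) as

$$J = \left(\frac{1}{\sigma_\theta^2} + \frac{\zeta^2 \left(\sum_{k=1}^{K} \hat g_k |\alpha_k|\right)^2 P_t}{\zeta^2 \sigma_n^2 \left(\sum_{k=1}^{K} \hat g_k^2 |\alpha_k|^2\right) P_t + \zeta P_t + K\zeta(\sigma_\theta^2 + \sigma_n^2) \sum_{k=1}^{K} |\alpha_k|^2 + K}\right)^{-1} \tag{21}$$

where $\hat g_k = |\hat h_k|/\sigma_{\hat h}$ is the normalized estimated channel gain for the $k$th sensor. Since $\hat h_k$ is circular Gaussian, $\hat g_k$ and $g_k$ have identical distributions. From (21), the MMSE optimization problem under a total network power constraint can be formulated as problem (22).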

Again instead of solving problem (22) directly, we consider a problem in which the roles of objective function and constraint are interchanged. The solution to problem (22) is given in the following proposition, the proof of which is given in Appendix C.

Proposition 1. For $K > 1$, the solution to (22) gives the optimal training power

$$P_t^{\mathrm{opt}} = \frac{K(\zeta P + 1) - \sqrt{K(\zeta P + 1)(\zeta P + K)}}{\zeta (K - 1)} \tag{23}$$

where $\zeta = \sigma_h^2/\sigma_\nu^2$ is the channel SNR, and the associated optimal data power for the $k$th sensor is

$$P_k^{\mathrm{opt}} = \left(\sum_{k=1}^{K} \frac{\hat g_k^2}{\hat\phi_k^2}\right)^{-1} \frac{\hat g_k^2}{\hat\phi_k^2}\, (P - P_t^{\mathrm{opt}}) \tag{24}$$

where $\hat\phi_k = (\sigma_\theta^2 + \sigma_n^2) + \mu \zeta^2 \sigma_n^2 \hat g_k^2 P_t^{\mathrm{opt}} + \mu K \zeta (\sigma_\theta^2 + \sigma_n^2)$ and $\mu = P_t^{\mathrm{opt}}/(K + K\zeta(P - P_t^{\mathrm{opt}}))$. The incurred MSE is

$$J(P, K) = \left(\frac{1}{\sigma_\theta^2} + \sum_{k=1}^{K} \frac{\hat g_k^2}{\sigma_n^2 \hat g_k^2 + (\sigma_\theta^2 + \sigma_n^2) \left(\dfrac{K - 1}{\sqrt{K(\zeta P + 1)} - \sqrt{\zeta P + K}}\right)^2}\right)^{-1} \tag{25}$$

Note that the optimal training power in (23) depends on the number of sensors $K$, the channel SNR $\zeta$, and the total network power $P$. From (25), we see that the MSE decreases as the power $P$ increases. For a fixed $K$, as $P \to \infty$, we obtain

$$\lim_{P \to \infty} J(P, K) = \frac{\sigma_\theta^2}{1 + K\beta} \tag{26}$$

which is the same as (19). This makes sense since $P \to \infty$ implies $P_t^{\mathrm{opt}} \to \infty$ and thus the MSE of channel estimation in (5) approaches zero, that is, $\hat h_k \to h_k$ as $P \to \infty$ in the mean square sense. It is shown in Appendix D that, for a fixed $P$,

$$\lim_{K \to \infty} J(P, K) = \sigma_\theta^2 \left(1 + \frac{\beta}{1 + \beta} \left(\sqrt{\zeta P + 1} - 1\right)^2\right)^{-1} \tag{27}$$

The MSE does not approach zero. The reason is that the order-$1/K$ decrease in MSE in (20) is offset by the order-$K$ increase in the power of the error term $E[|\varepsilon|^2 \mid \hat{\mathbf h}]$ in (A.3) in Appendix A. Therefore, in the presence of the channel estimation error, the MSE reaches a finite nonzero value as $K$ goes to infinity.
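The closed-form training power (23) can be checked against a brute-force search. The sketch below relies on an observation that is an assumption of this example rather than a statement from the paper: dividing the numerator and denominator of (21) by $\zeta P_t + K\zeta(P - P_t) + K$, with all remaining power $P - P_t$ spent on data, shows that $P_t$ enters the MSE only through the term $(\zeta P_t + K\zeta(P - P_t) + K)/(\zeta^2 P_t (P - P_t))$, which should be as small as possible. At $P_t^{\mathrm{opt}}$ this term coincides with the deterministic factor appearing in (25). Function names are illustrative.

```python
import math


def optimal_training_power(P, K, zeta):
    """Closed-form P_t^opt from (23); requires K > 1."""
    root = math.sqrt(K * (zeta * P + 1.0) * (zeta * P + K))
    return (K * (zeta * P + 1.0) - root) / (zeta * (K - 1))


def training_penalty(Pt, P, K, zeta):
    """(zeta Pt + K zeta (P - Pt) + K) / (zeta^2 Pt (P - Pt)); smaller is better."""
    Pd = P - Pt                      # total power left for data transmission
    return (zeta * Pt + K * zeta * Pd + K) / (zeta ** 2 * Pt * Pd)


def closed_form_penalty(P, K, zeta):
    """Deterministic factor ((K - 1) / (sqrt(K(zeta P + 1)) - sqrt(zeta P + K)))^2 in (25)."""
    diff = math.sqrt(K * (zeta * P + 1.0)) - math.sqrt(zeta * P + K)
    return ((K - 1) / diff) ** 2


def grid_search_training_power(P, K, zeta, steps=1000):
    """Brute-force minimizer of training_penalty over an interior grid of Pt values."""
    grid = [P * i / steps for i in range(1, steps)]
    return min(grid, key=lambda pt: training_penalty(pt, P, K, zeta))
```

For large $P$ the closed form tends to $P\sqrt{K}/(\sqrt{K} + 1)$, so a growing share of the budget goes to training as the number of sensors increases.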

4.3. Comparison of two cases

If the total network power and the number of sensors are fixed, the estimation performance with estimated channels is worse than when channel information is available, due to the presence of channel estimation error. To quantitatively compare the two cases, we set the same MSE objective, use optimal power allocation for both cases, and determine the respective total network power that would be required. Suppose that, to achieve the selected MSE, total network power $P^a$ is required when channel information is available and total network power $P^e$ is required when channels are estimated. The ratio $P^a/P^e$ gives an indication of the penalty incurred by the consumption of training power and the presence of channel estimation error. A small ratio implies a heavy penalty. Since the MSE expressions in (17) and (25) are random variables, we instead derive the condition on $P^a$ and $P^e$ under which the distributions of the MSEs are identical. This is possible because the random variables $g_k$ and $\hat g_k$ have identical Rayleigh distributions. From (17) and (25), the distributions of the MSE expressions are identical if the deterministic terms in the denominator are equal, that is,

$$\left(\frac{K - 1}{\sqrt{K(\zeta P^e + 1)} - \sqrt{\zeta P^e + K}}\right)^2 = \frac{1}{\zeta P^a} \tag{28}$$

Rearranging (28), we get

$$\frac{P^a}{P^e} = \left(\frac{\sqrt{K(1 + 1/(\zeta P^e))} - \sqrt{1 + K/(\zeta P^e)}}{K - 1}\right)^2 \tag{29}$$

Note that the ratio in (29) is less than one, and for $P^e$ large

$$\frac{P^a}{P^e} \approx \frac{1}{(\sqrt{K} + 1)^2}. \tag{30}$$

The ratio decreases as the number of sensors $K$ increases. This means that the penalty caused by channel estimation becomes worse as the number of sensors increases.

The optimization problem (22) referred to in Section 4.2 is

$$\begin{aligned} \min_{P_t;\ |\alpha_k|:\, 1 \le k \le K} \quad & \left(\frac{1}{\sigma_\theta^2} + \frac{\zeta^2 \left(\sum_{k=1}^{K} \hat g_k |\alpha_k|\right)^2 P_t}{\zeta^2 \sigma_n^2 \left(\sum_{k=1}^{K} \hat g_k^2 |\alpha_k|^2\right) P_t + \zeta P_t + K\zeta(\sigma_\theta^2 + \sigma_n^2) \sum_{k=1}^{K} |\alpha_k|^2 + K}\right)^{-1} \\ \text{subject to} \quad & \sum_{k=1}^{K} |\alpha_k|^2 (\sigma_\theta^2 + \sigma_n^2) + P_t \le P. \end{aligned} \tag{22}$$

5. Equal power allocation

The optimal power allocation scheme discussed in the previous section requires that the complex gain $\alpha_k$ be computed based on the channel estimate $\hat h_k$ (or $h_k$) and sent to the $k$th sensor through a feedback channel from the FC. If the gains are not computed and fed back, in order to reduce computations and save (feedback) bandwidth, a reasonable strategy is to allocate equal power to each sensor for data transmission. In the following, we study the performance of the equal power allocation scheme. We again consider two cases: (i) channels are known at the FC and (ii) channels are estimated. In the latter case, we consider the optimal choice of training power $P_t$ to achieve the smallest MSE. We compare the performance of the two cases in terms of the power ratio $P^a/P^e$ as in the previous section.

5.1. When channels are known

We set the phase of $\alpha_k$ as $\angle\alpha_k = -\angle h_k$. This requires feedback of a real number from the FC. With equal power allocation, we have $|\alpha_k|^2 = P/(K(\sigma_\theta^2 + \sigma_n^2))$ for $k = 1, \ldots, K$, and the MSE in (12) can be rewritten as

$$J_o(P, K) = \sigma_\theta^2 \left(1 + \frac{\dfrac{\beta}{1 + \beta} \left(\dfrac{1}{K} \sum_{k=1}^{K} g_k\right)^2}{\dfrac{1}{K} \dfrac{1}{1 + \beta} \left(\dfrac{1}{K} \sum_{k=1}^{K} g_k^2\right) + \dfrac{1}{K \zeta P}}\right)^{-1} \tag{31}$$

It is easy to see from (31) that $J_o$ decreases as $P$ increases. For a fixed $K$, as $P \to \infty$, we have

$$\lim_{P \to \infty} \frac{1}{J_o(P, K)} = \sigma_\theta^{-2} \left(1 + \beta\, \frac{\left(\sum_{k=1}^{K} g_k\right)^2}{\sum_{k=1}^{K} g_k^2}\right) \le \sigma_\theta^{-2} (1 + K\beta)$$

where the last inequality uses the Cauchy–Schwarz inequality and equality holds if and only if $g_1 = \cdots = g_K$. Therefore, as $P \to \infty$, we have an MSE lower bound as follows:

$$\lim_{P \to \infty} J_o(P, K) \ge \frac{\sigma_\theta^2}{1 + K\beta} \tag{32}$$

Since equality holds in (19), we see that the performance of the equal power scheme is usually worse than that of the optimal power scheme as $P \to \infty$. On the other hand, for a fixed $P$, as $K \to \infty$, we have $(1/K)\sum_{k=1}^{K} g_k \to E[g_k] = \sqrt{\pi/4}$ and $(1/K)\sum_{k=1}^{K} g_k^2 \to E[g_k^2] = 1$, thus (31) becomes

$$\lim_{K \to \infty} J_o(P, K) = \lim_{K \to \infty} \sigma_\theta^2 \left(K\, \frac{\dfrac{\beta}{1 + \beta} \dfrac{\pi}{4}}{\dfrac{1}{1 + \beta} + \dfrac{1}{\zeta P}}\right)^{-1} = 0 \tag{33}$$

Hence, the MSE decreases in the order of $1/K$ and approaches zero as $K \to \infty$ even though the total power $P$ is finite. A similar conclusion appeared in [7] for the unit variance case.

5.2. When channels are estimated

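If the power $P_t$ is used for channel estimation, the transmitted data power for the $k$th sensor is $P_k = (P - P_t)/K$, or equivalently, $|\alpha_k|^2 = (P - P_t)/(K(\sigma_\theta^2 + \sigma_n^2))$. Again the phase of $\alpha_k$ is chosen as $\angle\alpha_k = -\angle\hat h_k$, and the MSE derived from (21) is

$$J(P, K) = \sigma_\theta^2 \left(1 + \frac{\zeta^2 K \dfrac{\beta}{1 + \beta} \left(\dfrac{1}{K} \sum_{k=1}^{K} \hat g_k\right)^2 (P - P_t) P_t}{\zeta^2 \dfrac{1}{1 + \beta} \left(\dfrac{1}{K} \sum_{k=1}^{K} \hat g_k^2\right) (P - P_t) P_t + \zeta P_t + \zeta K (P - P_t) + K}\right)^{-1} \tag{34}$$

From (34), the optimization problem becomes that of choosing $P_t$ so that the MSE $J$ is minimized under the total network power constraint. Minimizing $J$ is equivalent to maximizing the fraction inside the parentheses of (34), so the MMSE optimization problem can be formulated equivalently as

$$\begin{aligned} \min_{P_t} \quad & -\frac{\zeta^2 K \dfrac{\beta}{1 + \beta} \left(\dfrac{1}{K} \sum_{k=1}^{K} \hat g_k\right)^2 (P - P_t) P_t}{\zeta^2 \dfrac{1}{1 + \beta} \left(\dfrac{1}{K} \sum_{k=1}^{K} \hat g_k^2\right) (P - P_t) P_t + \zeta P_t + \zeta K (P - P_t) + K} \\ \text{subject to} \quad & 0 \le P_t \le P. \end{aligned} \tag{35}$$

It can be shown that the second derivative of the objective function in (35) with respect to $P_t$ is positive. Hence, the optimization problem (35) is convex, since the objective function is convex and the constraint is linear. The following proposition gives the optimal training power and the corresponding MSE.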

Proposition 2. For $K > 1$, the solution to (35) gives the optimal training power

$$P_t^{\mathrm{opt}} = \frac{K(\zeta P + 1) - \sqrt{K(\zeta P + 1)(\zeta P + K)}}{\zeta (K - 1)} \tag{36}$$

where $\zeta = \sigma_h^2/\sigma_\nu^2$, and the incurred MSE is

$$J(P, K) = \sigma_\theta^2 \left(1 + \frac{\dfrac{\beta}{1 + \beta} \left(\dfrac{1}{K} \sum_{k=1}^{K} \hat g_k\right)^2}{\dfrac{1}{K} \dfrac{1}{1 + \beta} \left(\dfrac{1}{K} \sum_{k=1}^{K} \hat g_k^2\right) + \dfrac{(K - 1)^2}{K^2 \left(\sqrt{\zeta P + 1} - \sqrt{1 + (\zeta P / K)}\right)^2}}\right)^{-1} \tag{37}$$

Proof. Please see Appendix E.

Note that the optimal training powers for the equal and the optimal power allocation schemes are the same. From (37) with fixed $K$, as $P \to \infty$, we obtain $\lim_{P \to \infty} (1/J(P, K)) \le \sigma_\theta^{-2}(1 + K\beta)$ and thus

$$\lim_{P \to \infty} J(P, K) \ge \frac{\sigma_\theta^2}{1 + K\beta} \tag{38}$$

which is the same as (32). On the other hand, for a fixed $P$, when $K \to \infty$, we obtain

$$\lim_{K \to \infty} J(P, K) = \sigma_\theta^2 \left(1 + \frac{\pi}{4}\, \frac{\beta}{1 + \beta} \left(\sqrt{\zeta P + 1} - 1\right)^2\right)^{-1} \tag{39}$$

which is worse than (27). Note that the MSE in (39) also approaches a finite nonzero value as the number of sensors goes to infinity, for the same reason as stated in Section 4.2.
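The relation between (34), (36), and (37) can be verified numerically: evaluating the equal-power MSE (34) at the training power (36) should reproduce (37), and no other training power on a grid should do better. The sketch below assumes illustrative function names and an arbitrary set of normalized gains.

```python
import math


def optimal_training_power(P, K, zeta):
    """Closed-form P_t^opt from (36); requires K > 1."""
    root = math.sqrt(K * (zeta * P + 1.0) * (zeta * P + K))
    return (K * (zeta * P + 1.0) - root) / (zeta * (K - 1))


def equal_power_mse(Pt, P, g_hat, var_theta, beta, zeta):
    """MSE (34) for training power Pt and equal data power (P - Pt) / K per sensor."""
    K = len(g_hat)
    m1 = sum(g_hat) / K                      # (1/K) sum g_hat_k
    m2 = sum(g ** 2 for g in g_hat) / K      # (1/K) sum g_hat_k^2
    Pd = P - Pt
    num = zeta ** 2 * K * beta / (1.0 + beta) * m1 ** 2 * Pd * Pt
    den = zeta ** 2 * m2 / (1.0 + beta) * Pd * Pt + zeta * Pt + zeta * K * Pd + K
    return var_theta / (1.0 + num / den)


def equal_power_mse_opt(P, g_hat, var_theta, beta, zeta):
    """MSE (37), i.e. (34) evaluated at the optimal training power."""
    K = len(g_hat)
    m1 = sum(g_hat) / K
    m2 = sum(g ** 2 for g in g_hat) / K
    gap = math.sqrt(zeta * P + 1.0) - math.sqrt(1.0 + zeta * P / K)
    det = (K - 1) ** 2 / (K ** 2 * gap ** 2)
    return var_theta / (1.0 + beta / (1.0 + beta) * m1 ** 2 / (m2 / ((1.0 + beta) * K) + det))
```

Sweeping `Pt` over `(0, P)` with `equal_power_mse` gives a curve whose minimum sits at `optimal_training_power(P, K, zeta)`, which is independent of the channel gains.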


5.3. Comparison of two cases

To compare the performance of the two cases, we set $P^a$ and $P^e$ so that the MSE expressions in (31) and (37) have the same distribution, as in Section 4.3. From (31) and (37), the distributions of the MSE are identical if the deterministic terms in the denominator are equal, that is,

$$\left(\frac{K - 1}{\sqrt{K(\zeta P^e + 1)} - \sqrt{\zeta P^e + K}}\right)^2 = \frac{1}{\zeta P^a} \tag{40}$$

This equation is the same as (28) and thus we have the same ratio of penalty incurred by the training power consumption and the channel estimation error as shown in (29) and (30).
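The penalty ratio (29) and its large-power approximation (30), which apply to both schemes, are easy to tabulate. The following sketch uses illustrative function names; `zeta` and `Pe` are in linear units, not dB.

```python
import math


def power_ratio(Pe, K, zeta):
    """Ratio P^a / P^e from (29); requires K > 1 and zeta * Pe > 0."""
    x = zeta * Pe
    return ((math.sqrt(K * (1.0 + 1.0 / x)) - math.sqrt(1.0 + K / x)) / (K - 1)) ** 2


def power_ratio_limit(K):
    """Large-P^e approximation 1 / (sqrt(K) + 1)^2 from (30)."""
    return 1.0 / (math.sqrt(K) + 1.0) ** 2


def deterministic_term(Pe, K, zeta):
    """Left-hand side of (28) and (40)."""
    diff = math.sqrt(K * (zeta * Pe + 1.0)) - math.sqrt(zeta * Pe + K)
    return ((K - 1) / diff) ** 2
```

For $K = 9, 16, 25, 36$ the approximation (30) evaluates to $1/16$, $1/25$, $1/36$, and $1/49$, so the equivalent known-channel power shrinks steadily relative to $P^e$ as sensors are added.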

6. Numerical results

In this section, we use a number of numerical simulations to verify the analytical results obtained in the previous sections. All random parameters, $\theta$, $n_k$, $h_k$, and $\nu$, are set as zero-mean circular Gaussian. The parameter $\theta$ and the channel $h_k$ are assumed to have unit variance, that is, we set $\sigma_\theta^2 = \sigma_h^2 = 1$ (0 dB). The observation noise variance is $\sigma_n^2 = -10$ dB and the receiver noise variance is $\sigma_\nu^2 = -1$ dB, so that the observation SNR $\beta = \sigma_\theta^2/\sigma_n^2$ and the channel SNR $\zeta = \sigma_h^2/\sigma_\nu^2$ are 10 and 1 dB, respectively. We first compute the average MSE of the optimal power allocation scheme. The average MSE is the average of $10^5$ independent runs. The theoretical MSE is given in (25), where only the normalized channel gains $\hat g_k$ are random. To obtain the simulation MSE, we use the LMMSE estimators in (2) and (7) with all random variables independently generated, and take the average MSE of $\hat\theta$. It is clear from Fig. 2 that the theoretical and simulation values of the MSE are very close. For a fixed $P$, we see that the MSE decreases as the number of sensors $K$ increases and approaches the lower bound (27). The results for $P = 14$ and 17 dB show that the 3 dB difference in total power leads to about a 3 dB difference in MSE for $K \ge 20$.

The comparison between the theoretical and simulation average MSEs for the equal power scheme is shown in Fig. 3, where the theoretical result averages the MSE in (37). Again the figure shows that the theoretical and simulation values are very close. As $K$ increases, the average MSE decreases and approaches the lower bound in (39). The results for $P = 14$ and 17 dB also show roughly a 3 dB difference in MSE for $K \ge 20$.

Fig. 4 compares the MSEs of the equal and optimal power schemes for a fixed P = 16 dB. The optimal power scheme performs better than the equal power scheme. For $K \ge 20$, the difference in MSE between the two schemes approaches a constant value of 0.007 (approximately a 20% difference), which is about the difference between the respective lower bounds.

Fig. 2. Mean square error (MSE) with optimal power allocation. [Plot of average MSE versus K; simulation and theoretical curves for P = 14 dB and P = 17 dB.]

Fig. 3. Mean square error (MSE) with equal power allocation. [Plot of average MSE versus K; simulation and theoretical curves for P = 14 dB and P = 17 dB.]

Fig. 4. Comparison between equal and optimal power allocation schemes. [Plot of average MSE versus K for P = 16 dB.]

For comparison, we also simulate the two-phase approach proposed in [8] based on the orthogonal model, where the kth sensor transmits the measured signal $\alpha_k(\theta + n_k)$ to the kth receiver through an unknown fading channel $h_k$, $k = 1,\ldots,K$. The kth receiver is corrupted by an additive noise $\nu_k \sim \mathcal{CN}(0,\sigma_\nu^2)$, where $\sigma_\nu^2 = 1$ dB and $E[\nu_i\nu_j] = 0$ for $i \ne j$; the K received data are then collected by the FC, which uses the LMMSE fusion rule to estimate the signal. The coherent model is shown in (1), where the received signal is a linear combination of the K transmitted data corrupted by a single noise term. Fig. 5 shows that, with a fixed P = 17 dB, the MSE of the orthogonal model degrades conspicuously for $K > 40$, while the MSE of the coherent model approaches a constant value. The coherent model also has a lower average MSE than the orthogonal model regardless of the number of sensors used. The reason is that the orthogonal model introduces K different receiver noises $\nu_k$ at the FC, so increasing K does not reduce the effect of receiver noise, whereas in the coherent model only one receiver noise is generated at the FC, which leads to an increased signal-to-noise ratio as K increases. The figure also shows that, as K increases, the MSE of the coherent model is less sensitive to the channel estimation error than that of the orthogonal model.
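The signal-to-noise argument for the coherent model can be illustrated with a small sketch. It is not the simulation used for Fig. 5: it assumes known channels with unit gains $g_k = 1$ and equal sensor power, and it reuses the known-channel relation that appears as the constraint in Appendix B, $1/J_o - 1/\sigma_\theta^2 = \zeta t^2 / (1 + \zeta\sigma_n^2\sum_k g_k^2|\alpha_k|^2)$ with $t = \sum_k g_k|\alpha_k|$. The function name and the numerical values are hypothetical, and $\zeta$ is treated simply as the positive constant in that relation.

```python
def coherent_mse_known_channels(K, P, var_theta=1.0, var_n=0.1, zeta=1.0):
    """Known-channel MSE J_o of the coherent model (illustrative assumptions:
    unit channel gains g_k = 1 and total data power P split equally)."""
    alpha_sq = P / (K * (var_theta + var_n))    # |alpha_k|^2, identical for all sensors
    t_sq = K ** 2 * alpha_sq                    # (sum_k g_k |alpha_k|)^2 with g_k = 1
    noise = 1.0 + zeta * var_n * K * alpha_sq   # 1 + zeta * sigma_n^2 * sum_k g_k^2 |alpha_k|^2
    return 1.0 / (1.0 / var_theta + zeta * t_sq / noise)

# The noise term does not depend on K, while the signal term t^2 grows linearly in K.
for K in (1, 10, 100):
    print(K, coherent_mse_known_channels(K, P=10.0))
```

Under these assumptions the MSE keeps decreasing with K because the channels are known; with estimated channels the paper reports that the MSE instead levels off at a nonzero value.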

Fig. 6 shows the ratio of average MSE $E[J_o]/E[J]$ versus the ratio $P_a/P_e$ for the optimal power allocation scheme with a fixed K = 16 sensors. The curves in the figure correspond to total network power $P_e$ = 20, 23, 27, and 30 dB, respectively. The curves all cross the horizontal line $E[J_o]/E[J] = 1$ at about 0.04, very close to the value $1/(\sqrt{K}+1)^2$ predicted in (30). Fig. 7 shows the ratio of average MSE $E[J_o]/E[J]$ versus the ratio $P_a/P_e$ for the equal power scheme. The total network power is fixed at $P_e$ = 30 dB and the number of sensors is K = 9, 25, and 36. The curve for K = 9 crosses $E[J_o]/E[J] = 1$ at about 0.06, the curve for K = 25 at about 0.03, and the curve for K = 36 at about 0.02. The curves show that the penalty caused by channel estimation becomes worse as the number of sensors K increases.
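As a quick arithmetic check of the crossing points quoted above, the following sketch evaluates $1/(\sqrt{K}+1)^2$ for the sensor counts used in Figs. 6 and 7. The function name is hypothetical, and applying the same expression to the equal power scheme of Fig. 7 is an assumption made here for comparison; the text states it explicitly only for the optimal scheme.

```python
import math

def predicted_crossing(K):
    """Power ratio P_a/P_e at which E[J_o]/E[J] = 1, using 1/(sqrt(K)+1)^2."""
    return 1.0 / (math.sqrt(K) + 1.0) ** 2

for K in (9, 16, 25, 36):
    print(K, round(predicted_crossing(K), 4))
```

The values 0.0625, 0.04, 0.0278, and 0.0204 agree with the crossings read from the figures (about 0.06, 0.04, 0.03, and 0.02).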

7. Conclusion

We study distributed estimation with the coherent multiple access channel model and the MMSE fusion rule. We use a two-phase approach for channel and source signal estimation; in both phases, the MMSE criterion is used. We study the optimal power allocation problem under a total network power constraint. We obtain expressions for the optimal training power, the optimal data power for each sensor, and the resulting MSE as a function of the total network power P and the number of sensors K when channel estimates are used to compute the power gains $\alpha_k$, which are fed back to the sensors. For the equal power scheme, we obtain an expression for the optimal training power and the resulting MSE. The optimal training powers are the same in both schemes. Our results show that, with estimated channels, the MSEs approach finite nonzero values as the number of sensors increases. This is in contrast with the result obtained for the orthogonal MAC model [8], which shows that the MSE performance eventually deteriorates as the number of sensors increases. Comparing the MSE performance with the case of known channels shows that the penalty caused by channel estimation becomes worse as the number of sensors increases.

Fig. 5. Performance of optimal power allocation scheme: coherent model and orthogonal model. [Plot of average MSE versus K for P = 17 dB; orthogonal and coherent curves.]

Fig. 6. MSE ratio versus total power ratio: optimal power scheme with different $P_e$. [Plot of $E[J_o]/E[J]$ versus $P_a/P_e$ for K = 16; curves for $P_e$ = 20, 23, 27, and 30 dB.]

Fig. 7. MSE ratio versus total power ratio: equal power scheme with different K. [Plot of $E[J_o]/E[J]$ versus $P_a/P_e$ for $P_e$ = 30 dB; curves for K = 9, 25, and 36.]

Acknowledgment

We thank the reviewers, whose comments improved the paper.

Appendix A. Derivation of (9)

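We first show that, given $\hat h$, $\varepsilon$ is uncorrelated with $\nu$, $\sum_{k=1}^{K}\hat h_k\alpha_k n_k$, and $\sum_{k=1}^{K}\hat h_k\alpha_k\theta$. Since $E[\nu^*\theta]=0$, $E[h_k^*\nu]=0$, and $E[\nu^* n_l]=0$, we have $E[\nu^*\varepsilon \mid \hat h]=0$. We show that $\varepsilon$ and $\sum_{k=1}^{K}\hat h_k\alpha_k n_k$ are uncorrelated as follows:

$$
E\left[\left(\sum_{k=1}^{K}\hat h_k\alpha_k n_k\right)^{*}\varepsilon \,\middle|\, \hat h\right]
= E\left[\left(\sum_{k=1}^{K}\hat h_k^{*}(h_k-\hat h_k)|\alpha_k|^2|n_k|^2\right) \,\middle|\, \hat h\right]
= \sum_{k=1}^{K}\hat h_k^{*}\left(E[h_k \mid \hat h]-\hat h_k\right)|\alpha_k|^2\sigma_n^2 = 0 \tag{A.1}
$$

where the first equality uses $E[n_k^*\theta]=0$ and $E[n_k^* n_l]=0$ for $k \ne l$. The last equality follows because $\hat h_k = E[h_k \mid y_k] = E[h_k \mid \hat h_k]$.

Similarly,

$$
E\left[\left(\sum_{k=1}^{K}\hat h_k\alpha_k\theta\right)^{*}\varepsilon \,\middle|\, \hat h\right]
= E\left[\left(\sum_{k=1}^{K}\sum_{l=1}^{K}\hat h_k^{*}(h_l-\hat h_l)\alpha_k^{*}\alpha_l|\theta|^2\right) \,\middle|\, \hat h\right]
= \sum_{k=1}^{K}\sum_{l=1}^{K}\hat h_k^{*}\left(E[h_l \mid \hat h]-\hat h_l\right)\alpha_k^{*}\alpha_l\sigma_\theta^2 = 0 \tag{A.2}
$$

Finally, the conditional variance is

$$
\begin{aligned}
E\left[|\varepsilon|^2 \mid \hat h\right]
&= E\left[\sum_{k=1}^{K}(h_k-\hat h_k)^{*}(h_k-\hat h_k)|\alpha_k|^2(\sigma_\theta^2+\sigma_n^2) \,\middle|\, \hat h\right] \\
&= (\sigma_\theta^2+\sigma_n^2)\sum_{k=1}^{K}E\left[(h_k-\hat h_k)^{*}(h_k-\hat h_k) \mid \hat h\right]|\alpha_k|^2 \\
&= (\sigma_\theta^2+\sigma_n^2)\sum_{k=1}^{K}\delta_k^2|\alpha_k|^2
= (\sigma_\theta^2+\sigma_n^2)\,\delta_1^2\sum_{k=1}^{K}|\alpha_k|^2
\end{aligned} \tag{A.3}
$$

where the last equality uses $\delta_1 = \cdots = \delta_K$ in (5). By (A.1) to (A.3) and (6), equality (9) follows.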

Appendix B. Derivation of (15)

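We introduce a slack variable $t = \sum_{k=1}^{K} g_k|\alpha_k|$ and rewrite the problem (14) as

$$
\begin{aligned}
\min_{|\alpha_k|,\,t}\quad & \sum_{k=1}^{K}|\alpha_k|^2(\sigma_\theta^2+\sigma_n^2) \\
\text{subject to}\quad & \sum_{k=1}^{K} g_k|\alpha_k| - t = 0, \\
& 1+\zeta\sigma_n^2\sum_{k=1}^{K} g_k^2|\alpha_k|^2 = \left(\frac{1}{J_o}-\frac{1}{\sigma_\theta^2}\right)^{-1}\zeta t^2 .
\end{aligned}
$$

The Lagrangian is

$$
L(|\alpha_k|,t,\lambda,\mu) = \sum_{k=1}^{K}|\alpha_k|^2(\sigma_\theta^2+\sigma_n^2)
+ \lambda\left(\sum_{k=1}^{K} g_k|\alpha_k| - t\right)
+ \mu\left[1+\zeta\sigma_n^2\sum_{k=1}^{K} g_k^2|\alpha_k|^2 - \left(\frac{1}{J_o}-\frac{1}{\sigma_\theta^2}\right)^{-1}\zeta t^2\right]
$$

where $\lambda, \mu \in \mathbb{R}$, and the associated necessary conditions [16] for optimality are

$$
\frac{\partial L}{\partial|\alpha_k|} = 2(\sigma_\theta^2+\sigma_n^2)|\alpha_k| + \lambda g_k + 2\mu\zeta\sigma_n^2 g_k^2|\alpha_k| = 0 \tag{B.1}
$$

$$
\frac{\partial L}{\partial t} = -\lambda - 2\mu\left(\frac{1}{J_o}-\frac{1}{\sigma_\theta^2}\right)^{-1}\zeta t = 0 \tag{B.2}
$$

$$
\frac{\partial L}{\partial\lambda} = \sum_{k=1}^{K} g_k|\alpha_k| - t = 0 \tag{B.3}
$$

$$
\frac{\partial L}{\partial\mu} = 1+\zeta\sigma_n^2\sum_{k=1}^{K} g_k^2|\alpha_k|^2 - \left(\frac{1}{J_o}-\frac{1}{\sigma_\theta^2}\right)^{-1}\zeta t^2 = 0 \tag{B.4}
$$

From (B.1), $|\alpha_k| = -\lambda g_k/(2\phi_k)$, where $\phi_k = (\sigma_\theta^2+\sigma_n^2)+\mu\zeta\sigma_n^2 g_k^2$, and thus from (B.3) we have $t = -\sum_{k=1}^{K}\lambda g_k^2/(2\phi_k)$. Then, from (B.2), we have

$$
\frac{1}{J_o}-\frac{1}{\sigma_\theta^2} = \sum_{k=1}^{K}\frac{\mu\zeta g_k^2}{\phi_k}
= \sum_{k=1}^{K}\frac{\zeta g_k^2}{\zeta\sigma_n^2 g_k^2 + (\sigma_\theta^2+\sigma_n^2)/\mu} \tag{B.5}
$$

Hence, we obtain (16). Finally, use $t = -\lambda(1/J_o - 1/\sigma_\theta^2)/(2\mu\zeta)$ from (B.2) and $|\alpha_k| = -\lambda g_k/(2\phi_k)$ in (B.4) to get

$$
\frac{\lambda^2}{4} = \left(\frac{\dfrac{1}{J_o}-\dfrac{1}{\sigma_\theta^2}}{\mu^2\zeta} - \sum_{k=1}^{K}\frac{\zeta\sigma_n^2 g_k^4}{\phi_k^2}\right)^{-1}
= \left(\sum_{k=1}^{K}\frac{(\sigma_\theta^2+\sigma_n^2) g_k^2}{\phi_k^2}\right)^{-1}\mu \tag{B.6}
$$

where the last equality follows from (B.5). Therefore, we have $|\alpha_k|^2 = (\lambda^2/4)(g_k^2/\phi_k^2)$ and (15) is established.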
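The closed-form expressions in (B.5) and (B.6) can be sanity-checked numerically. The sketch below picks arbitrary illustrative values for $g_k$, $\mu$, $\sigma_\theta^2$, $\sigma_n^2$, and $\zeta$ (all hypothetical, as is the function name), builds $|\alpha_k|$ from (B.6), and confirms that the constraint (B.4) is satisfied when $J_o$ is taken from (B.5). It is a consistency check of the algebra, not the authors' code.

```python
import math

def appendix_b_solution(g, mu, var_theta, var_n, zeta):
    """Return (|alpha_k| list, 1/J_o - 1/var_theta) from (B.5)-(B.6) for a given mu > 0."""
    phi = [(var_theta + var_n) + mu * zeta * var_n * gk ** 2 for gk in g]       # phi_k
    inv_gap = sum(mu * zeta * gk ** 2 / pk for gk, pk in zip(g, phi))           # (B.5)
    lam_sq_over_4 = mu / sum((var_theta + var_n) * gk ** 2 / pk ** 2
                             for gk, pk in zip(g, phi))                         # (B.6)
    alpha_abs = [math.sqrt(lam_sq_over_4) * gk / pk for gk, pk in zip(g, phi)]  # |alpha_k|
    return alpha_abs, inv_gap

# Hypothetical example values
g = [0.5, 1.0, 1.5, 2.0]
var_theta, var_n, zeta, mu = 1.0, 0.2, 2.0, 0.7
alpha_abs, inv_gap = appendix_b_solution(g, mu, var_theta, var_n, zeta)

t = sum(gk * ak for gk, ak in zip(g, alpha_abs))                                # slack variable t
lhs = 1.0 + zeta * var_n * sum(gk ** 2 * ak ** 2 for gk, ak in zip(g, alpha_abs))
rhs = zeta * t ** 2 / inv_gap                                                   # right side of (B.4)
print(lhs, rhs)
```

The two printed numbers should coincide up to floating-point error, which is what (B.4) requires.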

Appendix C. Proof of Proposition 1

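Instead of solving (22) directly, we consider the following problem:

$$
\begin{aligned}
\min_{P_t,\,|\alpha_k|}\quad & (\sigma_\theta^2+\sigma_n^2)\sum_{k=1}^{K}|\alpha_k|^2 + P_t \\
\text{subject to}\quad & \left(\frac{1}{\sigma_\theta^2} + \frac{\zeta^2\left(\sum_{k=1}^{K}\hat g_k|\alpha_k|\right)^2 P_t}{\zeta^2\sigma_n^2\left(\sum_{k=1}^{K}\hat g_k^2|\alpha_k|^2\right)P_t + \zeta P_t + K\zeta(\sigma_\theta^2+\sigma_n^2)\sum_{k=1}^{K}|\alpha_k|^2 + K}\right)^{-1} = J
\end{aligned}
$$

where $0 < J \le \sigma_\theta^2$. Let $t = \sum_{k=1}^{K}\hat g_k|\alpha_k|$; the optimization problem becomes

$$
\begin{aligned}
\min_{P_t,\,|\alpha_k|,\,t}\quad & (\sigma_\theta^2+\sigma_n^2)\sum_{k=1}^{K}|\alpha_k|^2 + P_t \\
\text{subject to}\quad & \sum_{k=1}^{K}\hat g_k|\alpha_k| - t = 0, \\
& \zeta^2\sigma_n^2\left(\sum_{k=1}^{K}\hat g_k^2|\alpha_k|^2\right)P_t + \zeta P_t + K\zeta(\sigma_\theta^2+\sigma_n^2)\sum_{k=1}^{K}|\alpha_k|^2 + K = \left(\frac{1}{J}-\frac{1}{\sigma_\theta^2}\right)^{-1}\zeta^2 t^2 P_t .
\end{aligned}
$$

The Lagrangian is

$$
\begin{aligned}
L(|\alpha_k|,P_t,t,\lambda,\mu) ={}& (\sigma_\theta^2+\sigma_n^2)\sum_{k=1}^{K}|\alpha_k|^2 + P_t + \lambda\left(\sum_{k=1}^{K}\hat g_k|\alpha_k| - t\right) \\
&+ \mu\left[\zeta^2\sigma_n^2\left(\sum_{k=1}^{K}\hat g_k^2|\alpha_k|^2\right)P_t + \zeta P_t + K\zeta(\sigma_\theta^2+\sigma_n^2)\left(\sum_{k=1}^{K}|\alpha_k|^2\right) + K - \left(\frac{1}{J}-\frac{1}{\sigma_\theta^2}\right)^{-1}\zeta^2 t^2 P_t\right]
\end{aligned}
$$

where $\lambda, \mu \in \mathbb{R}$, and the associated necessary conditions for optimality are

$$
\frac{\partial L}{\partial|\alpha_k|} = 2(\sigma_\theta^2+\sigma_n^2)|\alpha_k| + \lambda\hat g_k + \mu\left[2\zeta^2\sigma_n^2\hat g_k^2 P_t|\alpha_k| + 2K\zeta(\sigma_\theta^2+\sigma_n^2)|\alpha_k|\right] = 0 \tag{C.1}
$$

$$
\frac{\partial L}{\partial P_t} = 1 + \mu\left[\zeta^2\sigma_n^2\left(\sum_{k=1}^{K}\hat g_k^2|\alpha_k|^2\right) + \zeta - \left(\frac{1}{J}-\frac{1}{\sigma_\theta^2}\right)^{-1}\zeta^2 t^2\right] = 0 \tag{C.2}
$$

$$
\frac{\partial L}{\partial t} = -\lambda - 2\mu\left(\frac{1}{J}-\frac{1}{\sigma_\theta^2}\right)^{-1}\zeta^2 P_t t = 0 \tag{C.3}
$$

$$
\frac{\partial L}{\partial\lambda} = \sum_{k=1}^{K}\hat g_k|\alpha_k| - t = 0 \tag{C.4}
$$

$$
\frac{\partial L}{\partial\mu} = \zeta^2\sigma_n^2\left(\sum_{k=1}^{K}\hat g_k^2|\alpha_k|^2\right)P_t + \zeta P_t + K\zeta(\sigma_\theta^2+\sigma_n^2)\left(\sum_{k=1}^{K}|\alpha_k|^2\right) + K - \left(\frac{1}{J}-\frac{1}{\sigma_\theta^2}\right)^{-1}\zeta^2 t^2 P_t = 0 \tag{C.5}
$$

From (C.1), $|\alpha_k| = -\lambda\hat g_k/(2\hat\phi_k)$, where $\hat\phi_k = (\sigma_\theta^2+\sigma_n^2) + \mu\zeta^2\sigma_n^2\hat g_k^2 P_t + \mu K\zeta(\sigma_\theta^2+\sigma_n^2)$. It then follows from (C.4) that $t = -(\lambda/2)\sum_{k=1}^{K}(\hat g_k^2/\hat\phi_k)$, and from (C.3) we have

$$
\frac{1}{J}-\frac{1}{\sigma_\theta^2} = \mu\zeta^2 P_t\sum_{k=1}^{K}\frac{\hat g_k^2}{\hat\phi_k}
= \zeta^2 P_t\sum_{k=1}^{K}\frac{\hat g_k^2}{(\sigma_\theta^2+\sigma_n^2)/\mu + \zeta^2\sigma_n^2\hat g_k^2 P_t + K\zeta(\sigma_\theta^2+\sigma_n^2)} \tag{C.6}
$$

Use $t = -\lambda(1/J - 1/\sigma_\theta^2)/(2\mu\zeta^2 P_t)$ from (C.3) and $|\alpha_k| = -\lambda\hat g_k/(2\hat\phi_k)$ in (C.2) to get

$$
\frac{\lambda^2}{4} = \frac{\dfrac{1}{\mu}+\zeta}{\dfrac{\frac{1}{J}-\frac{1}{\sigma_\theta^2}}{\mu^2\zeta^2 P_t^2} - \zeta^2\sigma_n^2\sum_{k=1}^{K}\hat g_k^4/\hat\phi_k^2}
= \frac{(1+\mu\zeta)/(1+K\mu\zeta)}{\sum_{k=1}^{K}\hat g_k^2(\sigma_\theta^2+\sigma_n^2)/\hat\phi_k^2}\,P_t \tag{C.7}
$$

where the last equality follows from (C.6). Since the data power for the kth sensor is $P_k = |\alpha_k|^2(\sigma_\theta^2+\sigma_n^2) = (\lambda^2/4)\left(\hat g_k^2(\sigma_\theta^2+\sigma_n^2)/\hat\phi_k^2\right)$, the total power for data transmission is $\sum_{k=1}^{K}P_k = (\lambda^2/4)\sum_{k=1}^{K}\left(\hat g_k^2(\sigma_\theta^2+\sigma_n^2)/\hat\phi_k^2\right) = (1+\mu\zeta)P_t/(1+K\mu\zeta)$. With the total network power constraint P, it follows from (C.2) and (C.5) that

$$
\mu = \frac{P_t}{K + K\zeta(P-P_t)} \tag{C.8}
$$

where we use $\sum_{k=1}^{K}P_k = P - P_t$. Moreover, since $\sum_{k=1}^{K}P_k + P_t = P$, we have

$$
\frac{2+(K+1)\zeta\mu}{1+K\zeta\mu}\,P_t = P \tag{C.9}
$$

Substituting (C.8) into (C.9), we get the optimal training power in (23). With $P_t^{\mathrm{opt}}$ and $\mu$, we get $P_k^{\mathrm{opt}}$ in (24), and the MSE in (25) follows from (C.6).

Appendix D. Derivation of (27)

Rewrite (25) as

$$
J(P,K) = \left(\frac{1}{\sigma_\theta^2} + \frac{1}{\sigma_\theta^2+\sigma_n^2}\,b(K)\sum_{k=1}^{K}\frac{\hat g_k^2}{\gamma b(K)\hat g_k^2 + K}\right)^{-1}
$$

where $b(K) = \left[K\left(\sqrt{\zeta P+1}-\sqrt{1+\zeta P/K}\right)/(K-1)\right]^2$ and $\gamma = \sigma_n^2/(\sigma_\theta^2+\sigma_n^2)$. Note that $\lim_{K\to\infty} b(K) = \left(\sqrt{\zeta P+1}-1\right)^2$. We will show that the sum inside the parentheses converges to $E[\hat g_k^2] = 1$ as $K\to\infty$. Since

$$
\frac{\hat g_k^2}{K} - \frac{\hat g_k^2}{\gamma b(K)\hat g_k^2 + K} = \frac{\gamma b(K)\hat g_k^4}{K\left[\gamma b(K)\hat g_k^2 + K\right]} \le \frac{\gamma b(K)\hat g_k^4}{K^2}
$$

we have $\hat g_k^2/K - \gamma b(K)\hat g_k^4/K^2 \le \hat g_k^2/\left[\gamma b(K)\hat g_k^2 + K\right] \le \hat g_k^2/K$ and thus

$$
\sum_{k=1}^{K}\frac{\hat g_k^2}{K} - \sum_{k=1}^{K}\frac{\gamma b(K)\hat g_k^4}{K^2} \le \sum_{k=1}^{K}\frac{\hat g_k^2}{\gamma b(K)\hat g_k^2 + K} \le \sum_{k=1}^{K}\frac{\hat g_k^2}{K}
$$

It follows from the law of large numbers that, as $K\to\infty$, we have $\sum_{k=1}^{K}\hat g_k^2/K = E[\hat g_k^2] = 1$,

$$
\sum_{k=1}^{K}\frac{\gamma b(K)\hat g_k^4}{K} = \gamma\left(\sqrt{\zeta P+1}-1\right)^2 E[\hat g_k^4]
\quad\text{and}\quad
\sum_{k=1}^{K}\frac{\gamma b(K)\hat g_k^4}{K^2} = 0
$$

because $E[\hat g_k^4]$ is finite. Therefore,

$$
\sum_{k=1}^{K}\frac{\hat g_k^2}{\gamma b(K)\hat g_k^2 + K} = 1, \quad\text{as } K\to\infty
$$

and (27) follows.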
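The sandwich argument of Appendix D can be checked numerically. The sketch below is illustrative only: it assumes $\hat g_k^2$ is exponentially distributed with unit mean (Rayleigh fading normalized so that $E[\hat g_k^2] = 1$), which is an assumption made here and not taken from the derivation, and the function names, the value of $\gamma$, and the product $\zeta P$ are hypothetical.

```python
import math
import random

def b(K, zeta_P):
    """b(K) as defined in Appendix D; zeta_P stands for the product zeta * P."""
    return (K * (math.sqrt(zeta_P + 1.0) - math.sqrt(1.0 + zeta_P / K)) / (K - 1)) ** 2

def sandwiched_sum(g_sq, gamma, zeta_P):
    """Return (lower bound, sum of interest, upper bound) for samples g_sq of g_hat_k^2."""
    K = len(g_sq)
    bK = b(K, zeta_P)
    upper = sum(g_sq) / K
    lower = upper - sum(gamma * bK * x * x for x in g_sq) / K ** 2
    middle = sum(x / (gamma * bK * x + K) for x in g_sq)
    return lower, middle, upper

random.seed(0)
# Assumption: g_hat_k^2 ~ Exponential(1), so that E[g_hat_k^2] = 1
g_sq = [random.expovariate(1.0) for _ in range(20000)]
lower, middle, upper = sandwiched_sum(g_sq, gamma=0.2, zeta_P=50.0)
print(lower, middle, upper)

# b(K) for large K against its stated limit (sqrt(zeta P + 1) - 1)^2
print(b(10 ** 6, 50.0), (math.sqrt(51.0) - 1.0) ** 2)
```

For large K the three printed bounds are all close to 1, and $b(K)$ is close to its limit, in line with the steps leading to (27).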

Appendix E. Proof of Proposition 2

Let

$$
c_1 = \frac{K\beta}{1+\beta}\left(\frac{1}{K}\sum_{k=1}^{K}\hat g_k\right)^2
\quad\text{and}\quad
c_2 = \frac{1}{1+\beta}\left(\frac{1}{K}\sum_{k=1}^{K}\hat g_k^2\right)
$$

then the Lagrangian of the problem (35) is

$$
L(P_t,\mu_1,\mu_2) = -\frac{\zeta^2 c_1(P-P_t)P_t}{\zeta^2 c_2(P-P_t)P_t + \zeta P_t + \zeta K(P-P_t) + K} + \mu_1(P_t-P) - \mu_2 P_t
$$

and the associated KKT conditions are

$$
-\frac{\zeta^2 c_1\left[\zeta(K-1)P_t^2 - 2K(\zeta P+1)P_t + KP(\zeta P+1)\right]}{\left(\zeta^2 c_2(P-P_t)P_t + \zeta P_t + \zeta K(P-P_t) + K\right)^2} + \mu_1 - \mu_2 = 0 \tag{E.1}
$$

$$
\mu_1(P_t-P) = 0, \quad \mu_1 \ge 0 \tag{E.2}
$$

$$
\mu_2 P_t = 0, \quad \mu_2 \ge 0 \tag{E.3}
$$

Since the training power has to be greater than 0, we have $\mu_2 = 0$. If $\mu_1 > 0$, then $P_t = P$, but then (E.1) leads to $\mu_1 < 0$, a contradiction. Therefore, we have $\mu_1 = \mu_2 = 0$ and $P > P_t > 0$. From (E.1), we have

$$
\zeta(K-1)P_t^2 - 2K(\zeta P+1)P_t + KP(\zeta P+1) = 0
\;\Rightarrow\;
P_t = \frac{K(\zeta P+1) \pm \sqrt{K(\zeta P+1)(\zeta P+K)}}{\zeta(K-1)}
$$

where we take the negative sign, since the root with the positive sign cannot satisfy the constraint $P \ge P_t$. Let $a = K(\zeta P+1)$ and $b = \zeta P + K$; then we have $P_t^{\mathrm{opt}} = (a-\sqrt{ab})/[\zeta(K-1)]$ and $P - P_t^{\mathrm{opt}} = (\sqrt{ab}-b)/[\zeta(K-1)]$, and from (34), (37) follows:

$$
J(P,K) = \sigma_\theta^2\left(1 + \frac{c_1\left[(a+b)\sqrt{ab}-2ab\right]}{c_2\left[(a+b)\sqrt{ab}-2ab\right] + (K-1)^2\sqrt{ab}}\right)^{-1}
= \sigma_\theta^2\left(1 + \frac{c_1}{c_2 + \dfrac{(K-1)^2}{(\sqrt{a}-\sqrt{b})^2}}\right)^{-1}
$$

where the first equality uses

$$
(K-1)(a-\sqrt{ab}) + K(K-1)(\sqrt{ab}-b) + K(K-1)^2
= (K-1)\Big[(K-1)\sqrt{ab} + \underbrace{a - Kb + K^2 - K}_{=0}\Big]
= (K-1)^2\sqrt{ab}.
$$
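The closed-form training power can be checked against both the quadratic obtained from (E.1) and the pair (C.8) and (C.9) of Appendix C, which is one way to see that the two schemes share the same optimal training power. The following sketch uses hypothetical values for P, K, and $\zeta$, and the function name is illustrative.

```python
import math

def optimal_training_power(P, K, zeta):
    """Smaller root of zeta(K-1)Pt^2 - 2K(zeta P + 1)Pt + K P (zeta P + 1) = 0; requires K > 1."""
    a = K * (zeta * P + 1.0)
    b = zeta * P + K
    return (a - math.sqrt(a * b)) / (zeta * (K - 1))

# Hypothetical example values
P, K, zeta = 50.0, 16, 1.0
Pt = optimal_training_power(P, K, zeta)

# Residual of the quadratic that follows from (E.1)
residual = zeta * (K - 1) * Pt ** 2 - 2 * K * (zeta * P + 1) * Pt + K * P * (zeta * P + 1)

# Consistency with Appendix C: mu from (C.8), then the left side of (C.9) should equal P
mu = Pt / (K + K * zeta * (P - Pt))
total = (2 + (K + 1) * zeta * mu) / (1 + K * zeta * mu) * Pt
print(Pt, residual, total)
```

The residual should vanish up to rounding, the training power should lie strictly between 0 and P, and the left side of (C.9) should reproduce P.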

References

[1] J.J. Xiao, A. Ribeiro, Z.Q. Luo, G.B. Giannakis, Distributed compression-estimation using wireless sensor networks, IEEE Signal Process. Mag. 23 (4) (2006) 27–41.

[2] Z. Zhao, A. Swami, L. Tong, The interplay between signal processing and networking in sensor networks, IEEE Signal Process. Mag. 23 (4) (2006) 84–93.

[3] J.J. Xiao, S. Cui, Z.Q. Luo, A.J. Goldsmith, Power scheduling of universal decentralized estimation in sensor networks, IEEE Trans. Signal Process. 54 (2) (2006) 413–422.

[4] J.Y. Wu, Q.Z. Huang, T.S. Lee, Minimal energy decentralized estimation via exploiting the statistical knowledge of sensor noise variance, IEEE Trans. Signal Process. 56 (5) (2008) 2171–2176.

[5] A. Ribeiro, G.B. Giannakis, Bandwidth-constrained distributed estimation for wireless sensor networks, part I: Gaussian case, IEEE Trans. Signal Process. 54 (3) (2006) 1131–1143.

[6] T.C. Aysal, K.E. Barner, Constrained decentralized estimation over noisy channels for sensor networks, IEEE Trans. Signal Process. 56 (4) (2008) 1398–1410.

[7] J.J. Xiao, S. Cui, Z.Q. Luo, A.J. Goldsmith, Linear coherent decentralized estimation, IEEE Trans. Signal Process. 56 (2) (2008) 757–770.

[8] H. Şenol, C. Tepedelenlioğlu, Performance of distributed estimation over unknown parallel fading channels, IEEE Trans. Signal Process. 56 (12) (2008) 6057–6068.

[9] S. Cui, J.J. Xiao, A.J. Goldsmith, Z.Q. Luo, H.V. Poor, Estimation diversity and energy efficiency in distributed sensing, IEEE Trans. Signal Process. 55 (9) (2007) 4683–4695.

[10] I. Bahceci, A.J. Khandani, Linear estimation of correlated data in wireless sensor networks with optimum power allocation and analog modulation, IEEE Trans. Commun. 56 (7) (2008) 1146–1156.

[11] M. Gastpar, B. Rimoldi, M. Vetterli, To code, or not to code: lossy source channel communication revisited, IEEE Trans. Inf. Theory 49 (5) (2003) 1147–1158.

[12] M.K. Banavar, C. Tepedelenlioğlu, A. Spanias, Estimation over fading channels with limited feedback using distributed sensing, IEEE Trans. Signal Process. 58 (1) (2010) 414–425.

[13] K. Liu, H. El-Gamal, A. Sayeed, On optimal parametric field estimation in sensor networks, in: Proceedings of IEEE/SP 13th Workshop on Statistics and Signal Processing, July 2005, pp. 1170–1175.

[14] S.M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory, Prentice-Hall PTR, 1993.

[15] G.L. Stüber, Principles of Mobile Communication, Kluwer Academic Press, Norwell, MA, 2001.

[16] S. Boyd, L. Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, U.K., 2003.

[17] H. Stark, J.W. Woods, Probability and Random Processes with Application to Signal Processing, Prentice-Hall, Inc, 2002.