The fading number of multiple-input multiple-output fading channels with memory


Stefan M. Moser

Department of Communication Engineering
National Chiao Tung University (NCTU)
Hsinchu, Taiwan
Email: stefan.moser@ieee.org

Abstract-The fading number of a general (not necessarily Gaussian) regular multiple-input multiple-output (MIMO) fading channel with arbitrary temporal and spatial memory is derived. The channel is assumed to be non-coherent, i.e., neither receiver nor transmitter has knowledge of the channel state; they only know the probability law of the fading process. The fading number is the second term in the asymptotic expansion of channel capacity when the signal-to-noise ratio (SNR) tends to infinity.

It is shown that the fading number can be achieved by an input that is the product of two independent processes: a stationary and circularly symmetric direction- (or unit-) vector process whose distribution needs to be chosen such that it maximizes the fading number, and a non-negative magnitude process that is independent and identically distributed (IID) and that escapes to infinity.

Additionally, in the more general context of an arbitrary stationary channel model satisfying some weak conditions on the channel law, it is shown that the optimal input distribution is stationary apart from some edge effects.

I. INTRODUCTION

In recent years there has been an ever increasing interest in the fundamental theoretical understanding of wireless mobile communication systems, and in particular in the channel capacity, which gives an ultimate limit on the information rate that can be transmitted reliably over these channels if we do not constrain delay and computing complexity.

Unfortunately, it turns out that the capacity, especially in the high signal-to-noise ratio (SNR) regime, is highly sensitive to some of the basic assumptions made in the modeling of the channel. For example, there is a tremendous difference in the high-SNR capacity depending on the assumptions made about the channel state information that is directly or indirectly available to the receiver. If the channel state is perfectly known to the receiver (coherent detection), the capacity grows logarithmically in the SNR, similar to the situation without fading [1]. If the channel state is not available directly, but needs to be estimated by the receiver based on the received sequence of channel output symbols (non-coherent detection), the capacity depends highly on the assumptions made about the fading process: for regular fading¹ the capacity grows only double-logarithmically in the SNR [2], [3], i.e., at high SNR these channels become extremely power-inefficient in the sense that for every additional bit of capacity the SNR needs to be squared or, respectively, on a dB-scale the SNR needs to be doubled! For non-regular Gaussian fading the high-SNR behavior of capacity depends on the specific power spectral density and can be anything between the logarithmic and the double-logarithmic growth [4].

¹ For a mathematical definition of regularity see Section II.

In an attempt to specify the threshold between the efficient low- to medium-SNR regime and the highly inefficient high-SNR regime of regular fading channels, [3], [5] define the fading number $\chi$ as the second term in the high-SNR asymptotic expansion of capacity, i.e., the capacity at high SNR can be written as

$$C(\mathrm{SNR}) = \log\log\mathrm{SNR} + \chi + o(1) \tag{1}$$

where $o(1)$ denotes some terms that tend to zero as $\mathrm{SNR} \to \infty$. We define high-SNR to be the region where the $o(1)$-terms in (1) are negligible. Note that due to the extremely slow growth of $\log\log\mathrm{SNR}$, the fading number is usually the dominant term in the lower range of the high-SNR regime. Hence, it is of great practical interest to have a system with a large fading number.
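To make the slow double-logarithmic growth concrete, the following minimal Python sketch evaluates the approximation (1) with the $o(1)$ term dropped. The function name `high_snr_capacity` and the numerical value used for the fading number are illustrative assumptions and are not taken from the paper.

```python
import math


def high_snr_capacity(snr: float, chi: float) -> float:
    """Approximation (1) without the o(1) term, in nats per channel use."""
    return math.log(math.log(snr)) + chi


chi = -1.0   # hypothetical fading number, chosen only for illustration
snr = 1e3    # 30 dB

# Squaring the SNR (doubling it on a dB scale) adds log(2) nats, i.e. one bit.
gain_nats = high_snr_capacity(snr ** 2, chi) - high_snr_capacity(snr, chi)
gain_bits = gain_nats / math.log(2)
print(f"capacity gain from squaring the SNR: {gain_bits:.6f} bit")
```

In this approximation the gain does not depend on the value of `chi`, which cancels in the difference.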

So far, the fading number has been successfully derived in some special cases only: the case of single-input multiple-output (SIMO) fading channels with memory has been solved in [6], [5], the fading number of memoryless multiple-input single-output (MISO) fading channels has been derived in [3], [5], and very recently the memoryless multiple-input multiple-output (MIMO) case was solved in [7].

In this paper we present the fading number for the remaining cases of MISO and MIMO fading channels with memory. The rest of this paper is structured as follows: after some remarks about notation we will define the channel model in the following section. In Section III we give some auxiliary results that are interesting also in a more general context. Section IV contains the main result and an outline of the proof. Before we conclude in Section VI we specialize the main result to some interesting cases in Section V.

We will often split a complex vector $\mathbf{v} \in \mathbb{C}^m$ up into its magnitude $\|\mathbf{v}\|$ and its direction $\hat{\mathbf{v}} \triangleq \mathbf{v}/\|\mathbf{v}\|$, where we reserve this notation exclusively for unit vectors, i.e., throughout the paper every vector carrying a hat, $\hat{\mathbf{v}}$ or $\hat{\mathbf{V}}$, denotes a (deterministic or random, respectively) vector of unit length.

For such unit vectors we shall need a differential entropy-like quantity for random vectors that take value on the unit sphere in $\mathbb{C}^m$: let $\lambda$ denote the area measure on the unit sphere in $\mathbb{C}^m$. If a random vector $\hat{\mathbf{V}}$ takes value in the unit sphere and has the density $p^{\lambda}_{\hat{\mathbf{V}}}(\hat{\mathbf{v}})$ with respect to $\lambda$, then we shall let

$$h_\lambda(\hat{\mathbf{V}}) \triangleq -E\big[\log p^{\lambda}_{\hat{\mathbf{V}}}(\hat{\mathbf{V}})\big] \tag{2}$$

if the expectation is defined.

All rates specified in this paper are in nats per channel use, i.e., $\log(\cdot)$ denotes the natural logarithmic function.
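The split of a vector into magnitude and direction can be illustrated with a few lines of Python. This is only a sketch: the helper name `split_magnitude_direction` and the example vector are assumptions made for illustration.

```python
import math


def split_magnitude_direction(v):
    """Return (||v||, v / ||v||) for a nonzero complex vector given as a list."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in v))
    if norm == 0.0:
        raise ValueError("the zero vector has no direction")
    return norm, [c / norm for c in v]


v = [3 + 4j, 0j, 12 + 0j]
magnitude, direction = split_magnitude_direction(v)

# The direction is a unit vector, and magnitude * direction recovers v.
unit_norm = math.sqrt(sum(abs(c) ** 2 for c in direction))
recovered = [magnitude * c for c in direction]
```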

II. THE CHANNEL MODEL

We consider a channel with $n_T$ transmit antennas and $n_R$ receive antennas whose time-$k$ output $\mathbf{Y}_k \in \mathbb{C}^{n_R}$ is given by

$$\mathbf{Y}_k = \mathbb{H}_k \mathbf{x}_k + \mathbf{Z}_k \tag{3}$$

where $\mathbf{x}_k \in \mathbb{C}^{n_T}$ denotes the time-$k$ channel input vector; the random matrix $\mathbb{H}_k \in \mathbb{C}^{n_R \times n_T}$ denotes the time-$k$ fading matrix; and the random vector $\mathbf{Z}_k \in \mathbb{C}^{n_R}$ denotes additive noise. We assume that the random vectors $\{\mathbf{Z}_k\}$ are spatially and temporally white, zero-mean, circularly symmetric, complex Gaussian random vectors, i.e., $\{\mathbf{Z}_k\}$ are independent and identically distributed (IID) $\mathcal{N}_{\mathbb{C}}(\mathbf{0}, \sigma^2 \mathsf{I})$ for some $\sigma^2 > 0$. Here $\mathsf{I}$ denotes the identity matrix.

As for the multivariate fading process $\{\mathbb{H}_k\}$, we shall only assume that it is stationary, ergodic, of finite second moment $E[\|\mathbb{H}_k\|_F^2] < \infty$ (where $\|\cdot\|_F$ denotes the Frobenius norm), and of finite differential entropy rate $h(\{\mathbb{H}_k\}) > -\infty$ (the regularity assumption). Hence, we do not necessarily assume a fading process that is Gaussian distributed. Furthermore, we assume that the fading process $\{\mathbb{H}_k\}$ and the additive noise process $\{\mathbf{Z}_k\}$ are independent and of a joint law that does not depend on the channel input $\{\mathbf{x}_k\}$.
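A single use of the channel (3) can be simulated as in the following sketch. It assumes, purely for illustration, a memoryless fading matrix with IID unit-variance complex Gaussian entries; the model above allows a far more general (non-Gaussian, correlated) fading law. The helper names `complex_gaussian` and `channel_output` are hypothetical.

```python
import math
import random


def complex_gaussian(rng, variance=1.0):
    """Zero-mean circularly symmetric complex Gaussian sample."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def channel_output(H, x, sigma2, rng):
    """One channel use Y = H x + Z with Z ~ N_C(0, sigma2 * I)."""
    return [
        sum(h * xi for h, xi in zip(row, x)) + complex_gaussian(rng, sigma2)
        for row in H
    ]


rng = random.Random(0)
n_R, n_T = 3, 2
# Illustrative assumption: IID Gaussian fading entries without memory.
H = [[complex_gaussian(rng) for _ in range(n_T)] for _ in range(n_R)]
x = [1 + 0j, 0 + 1j]
y = channel_output(H, x, sigma2=0.5, rng=rng)
```

The receiver in the non-coherent setting would observe only `y`, not `H`.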

As for the input, we consider two different constraints: a peak-power constraint and an average-power constraint. We use $S$ to denote the maximal allowed instantaneous power in the former case, and to denote the allowed average power in the latter case. For both cases we set

$$\mathrm{SNR} \triangleq \frac{S}{\sigma^2}.$$

The capacity $C(\mathrm{SNR})$ of the channel (3) is given by

$$C(\mathrm{SNR}) = \lim_{n \to \infty} \frac{1}{n} \sup I(\mathbf{X}_1^n; \mathbf{Y}_1^n) \tag{4}$$

where the supremum is over the set of all probability distributions on $\mathbf{X}_1^n$ satisfying the constraints, i.e.,

$$\|\mathbf{X}_k\|^2 \le S, \quad \text{almost surely}, \quad k = 1, 2, \ldots, n \tag{5}$$

for a peak-power constraint, or

$$\frac{1}{n} \sum_{k=1}^{n} E\big[\|\mathbf{X}_k\|^2\big] \le S \tag{6}$$

for an average-power constraint.

From [3, Theorem 4.2], [5, Theorem 6.10] we have

$$\varlimsup_{\mathrm{SNR} \uparrow \infty} \big\{ C(\mathrm{SNR}) - \log\log\mathrm{SNR} \big\} < \infty. \tag{7}$$

The fading number $\chi$ is now defined as in [3, Definition 4.6], [5, Definition 6.13] by

$$\chi(\{\mathbb{H}_k\}) \triangleq \varlimsup_{\mathrm{SNR} \uparrow \infty} \big\{ C(\mathrm{SNR}) - \log\log\mathrm{SNR} \big\}. \tag{8}$$

Prima facie the fading number depends on whether a peak-power constraint (5) or an average-power constraint (6) is imposed on the input. However, it will turn out that the MIMO fading number with memory is identical for both cases.

III. PRELIMINARY RESULTS

The proof of the main result relies on some observations that hold in a more general context and are therefore interesting by themselves. We state here two of these observations without proof.

A. Capacity-Achieving Input Distributions and Stationarity

One of the main assumptions about our channel model is that the fading process and the additive noise are stationary. From an intuitive point of view it is clear that a stationary channel model should have a capacity-achieving input distribution that is also stationary. Unfortunately, we are not aware of a rigorous proof of this claim. However, we give here a slightly less strong statement that basically says that capacity can be approached up to an $\epsilon > 0$ by a distribution that looks stationary apart from edge effects:

Theorem 1: Assume some general channel model with input $\mathbf{x}_k \in \mathbb{C}^{n_T}$ and output $\mathbf{Y}_k \in \mathbb{C}^{n_R}$. Let the channel model be stationary, i.e., for every choice of $n \in \mathbb{N}$ and distribution $Q \in \mathcal{P}(\mathbb{C}^{n_T \times n})$ on $\mathbf{X}_1^n$ the mutual information $I(\mathbf{X}_1^n; \mathbf{Y}_1^n)$ does not change when shifting the input block over time. Assume an average-power constraint (6) and let the channel model be such that a zero input yields a zero mutual information: $I(\mathbf{0}; \mathbf{Y}_k) = 0$.

Now fix some non-negative integer $\kappa$ and some power $S$ with corresponding $\mathrm{SNR} \triangleq S/\sigma^2$. Then for every fixed $\epsilon > 0$ there corresponds some positive integer $\eta = \eta(S, \epsilon)$ and some joint distribution $Q^{\kappa+1}_{S,\epsilon} \in \mathcal{P}(\mathbb{C}^{n_T \times (\kappa+1)})$ such that for a blocklength $n$ sufficiently large there exists some input $\mathbf{X}_1^n$ satisfying the following:

1) The input $\mathbf{X}_1^n$ nearly achieves capacity in the sense that

$$\frac{1}{n} I(\mathbf{X}_1^n; \mathbf{Y}_1^n) \ge C(\mathrm{SNR}) - \epsilon. \tag{9}$$

2) For every integer $\mu$ with $0 \le \mu \le \kappa$, every length-$(\mu+1)$ block of adjacent vectors $(\mathbf{X}_\ell, \ldots, \mathbf{X}_{\ell+\mu})$ taken from

$$\mathbf{X}_\eta, \mathbf{X}_{\eta+1}, \ldots, \mathbf{X}_{n-2\eta+2} \tag{10}$$

has the same joint distribution $Q^{\mu+1}_{S,\epsilon}$, where this distribution $Q^{\mu+1}_{S,\epsilon}$ is given as the corresponding marginal distribution of $Q^{\kappa+1}_{S,\epsilon}$.

3) In particular, all vectors in (10) have the same marginal $Q^{1}_{S,\epsilon}$.

4) The marginal distribution $Q^{1}_{S,\epsilon}$ gives rise to a second moment $S$:

$$E\big[\|\mathbf{X}_\ell\|^2\big] = S, \quad \ell \in \{\eta, \ldots, n-2\eta+2\}. \tag{11}$$

5) The first $\eta - 1$ vectors and the last $2(\eta - 1)$ vectors satisfy the power constraint possibly strictly:

$$E\big[\|\mathbf{X}_\ell\|^2\big] \le S, \quad \ell \in \{1, \ldots, \eta-1\} \cup \{n-2\eta+3, \ldots, n\}. \tag{12}$$

Proof: The proof is based on a shift-and-mix argument similar to a proof given in [6], using the fact that a deterministic zero at the input yields zero information. □

Remark 2: Neglecting the edge effects for the moment, Theorem 1 basically says that, for every $\mu \le \kappa$, every block of $\mu + 1$ adjacent vectors has the same distribution independent of the time shift. From this it immediately follows that the distribution of every subset of (not necessarily adjacent) vectors of a $\mu + 1$ block does not change when the vectors are shifted in time (simply marginalize those vectors out that are not members of the subset). Hence, Theorem 1 almost proves that the capacity-achieving input distribution is stationary: the only problems are the edge effects and the fixed (but freely selectable) value of $\kappa$.²

B. Capacity-Achieving Input Distributions and Circular Symmetry

The second preliminary remark concerns circular symmetry. We say that a vector random process $\{\mathbf{W}_k\}$ is circularly symmetric if

$$\{\mathbf{W}_k\} \overset{\mathcal{L}}{=} \{\mathbf{W}_k e^{i\Theta_k}\} \tag{13}$$

where $\overset{\mathcal{L}}{=}$ stands for "equal in law" and where the process $\{\Theta_k\}$ is IID $\sim \mathcal{U}([0, 2\pi])$ and independent of $\{\mathbf{W}_k\}$. Note that this is not to be confused with isotropically distributed, which means that a vector has equal probability to point in every direction.

Remark 3: Note an important subtlety of this definition: being circularly symmetric does not only imply that for every time $k$ the corresponding random vector $\mathbf{W}_k$ is circularly symmetric, but also that from past vectors $\mathbf{W}_1^{k-1}$ one cannot gain any knowledge about the present phase, i.e., the phase is IID.

Lemma 4: Assume a channel as given in (3). Then the capacity-achieving input process can be assumed to be circularly symmetric, i.e., the input $\{\mathbf{X}_k\}$ can be replaced by $\{\mathbf{X}_k e^{i\Theta_k}\}$, where $\{\Theta_k\}$ is IID $\sim \mathcal{U}([0, 2\pi])$ and independent of every other random quantity.
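The replacement of $\{\mathbf{X}_k\}$ by $\{\mathbf{X}_k e^{i\Theta_k}\}$ in Lemma 4 can be sketched as follows. The function name `randomize_phases` and the example process are assumptions for illustration only; the point is that each time step receives one independent phase that is common to all components of the vector, so magnitudes and relative phases within a vector are preserved.

```python
import cmath
import math
import random


def randomize_phases(process, rng):
    """Multiply each vector W_k by exp(i*Theta_k), Theta_k IID uniform on [0, 2*pi]."""
    rotated = []
    for w in process:
        theta = rng.uniform(0.0, 2.0 * math.pi)  # one fresh phase per time step
        rotation = cmath.exp(1j * theta)
        rotated.append([rotation * c for c in w])
    return rotated


rng = random.Random(1)
process = [[1 + 1j, 2 - 1j], [0.5j, -3 + 0j]]  # two time steps, two antennas
rotated = randomize_phases(process, rng)
```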

Remark 5: The proof of Lemma 4 relies only on the fact that the additive noise is assumed to be circularly symmetric. Hence, for the lemma to hold the noise need not be Gaussian distributed and may even have memory as long as it is circularly symmetric.

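² As a matter of fact one can choose $\kappa$ arbitrarily large; however, note that the size of the edges where the theorem does not hold depends on $\kappa$!

IV. THE FADING NUMBER OF MIMO FADING CHANNELS WITH MEMORY

Theorem 6: Consider a MIMO fading channel with memory (3) where the stationary and ergodic fading process $\{\mathbb{H}_k\}$ takes value in $\mathbb{C}^{n_R \times n_T}$ and satisfies $h(\{\mathbb{H}_k\}) > -\infty$ and $E[\|\mathbb{H}_k\|_F^2] < \infty$. Then, irrespective of whether a peak-power constraint (5) or an average-power constraint (6) is imposed on the input, the fading number $\chi(\{\mathbb{H}_k\})$ is given by

$$\chi(\{\mathbb{H}_k\}) = \sup_{\substack{Q_{\{\hat{\mathbf{X}}_k\}} \\ \text{stationary} \\ \text{circ. sym.}}} \Bigg\{ h_\lambda\Bigg( \frac{\mathbb{H}_0 \hat{\mathbf{X}}_0}{\|\mathbb{H}_0 \hat{\mathbf{X}}_0\|} \,\Bigg|\, \bigg\{ \frac{\mathbb{H}_\ell \hat{\mathbf{X}}_\ell}{\|\mathbb{H}_\ell \hat{\mathbf{X}}_\ell\|} \bigg\}_{\ell=-\infty}^{-1} \Bigg) + n_R E\big[\log \|\mathbb{H}_0 \hat{\mathbf{X}}_0\|^2\big] - \log 2 - h\Big( \mathbb{H}_0 \hat{\mathbf{X}}_0 \,\Big|\, \{\mathbb{H}_\ell \hat{\mathbf{X}}_\ell\}_{\ell=-\infty}^{-1}, \hat{\mathbf{X}}_{-\infty}^{0} \Big) \Bigg\} \tag{14}$$

where the maximization is over all stochastic unit-vector processes $\{\hat{\mathbf{X}}_k\}$ that are stationary and circularly symmetric. Moreover, the fading number is achievable by a stationary input that can be expressed as a product of two independent processes:

$$\mathbf{X}_k = R_k \cdot \hat{\mathbf{X}}_k, \tag{15}$$

where $\{\hat{\mathbf{X}}_k\} \in \mathbb{C}^{n_T}$ is a stationary and circularly symmetric unit-vector process with the probability distribution that maximizes (14), and where $\{R_k\} \in \mathbb{R}_0^+$ is a scalar non-negative IID random process such that

$$\log R_k^2 \sim \mathcal{U}\big([\log\log S, \log S]\big). \tag{16}$$

Note that this input satisfies the peak-power constraint (5) (and therefore also the average-power constraint (6)).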
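The magnitude distribution (16) is easy to sample, and doing so makes two of its properties visible: every sample obeys the peak-power constraint (5), and the smallest possible value of $R_k^2$ is $\log S$, which grows without bound as $S \to \infty$ (the "escaping to infinity" behavior). The sketch below is illustrative only; the function name `sample_magnitude` and the chosen value of $S$ are assumptions.

```python
import math
import random


def sample_magnitude(S, rng):
    """Draw R >= 0 such that log(R^2) is uniform on [log log S, log S]."""
    if S <= 1.0:
        raise ValueError("S must exceed 1 so that log log S is defined")
    log_r2 = rng.uniform(math.log(math.log(S)), math.log(S))
    return math.exp(0.5 * log_r2)


rng = random.Random(16)
S = 1e6
samples = [sample_magnitude(S, rng) for _ in range(1000)]

# By construction log S <= R^2 <= S for every sample (up to rounding).
smallest_power = min(r * r for r in samples)
largest_power = max(r * r for r in samples)
```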

Proof: The proof is rather long and technical. We will give here only an outline. The proof consists of two parts: in a first part we derive an upper bound on the fading number assuming an average-power constraint (6) on the channel input. In a second part we then derive a lower bound on the fading number by assuming one particular input distribution that satisfies the peak-power constraint (5). We then show that both bounds coincide. Since a peak-power constraint is more restrictive than the corresponding average-power constraint the theorem follows.

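a) Outline of Upper Bound: Similarly to the proof of the SIMO fading number [6], [5] we use the chain rule to write

$$\frac{1}{n} I(\mathbf{X}_1^n; \mathbf{Y}_1^n) = \frac{1}{n} \sum_{k=1}^{n} I(\mathbf{X}_1^n; \mathbf{Y}_k \mid \mathbf{Y}_1^{k-1}) \tag{17}$$

and then split each term on the RHS of the above into terms that are memoryless and terms that take care of the memory:

$$I(\mathbf{X}_1^n; \mathbf{Y}_k \mid \mathbf{Y}_1^{k-1}) \le I(\mathbf{X}_k; \mathbf{Y}_k) - I\Bigg( \frac{\mathbb{H}_k \hat{\mathbf{X}}_k}{\|\mathbb{H}_k \hat{\mathbf{X}}_k\|}; \bigg\{ \frac{\mathbb{H}_\ell \hat{\mathbf{X}}_\ell}{\|\mathbb{H}_\ell \hat{\mathbf{X}}_\ell\|} \bigg\}_{\ell=1}^{k-1} \Bigg) + I\Big( \mathbb{H}_k \hat{\mathbf{X}}_k; \{\mathbb{H}_\ell \hat{\mathbf{X}}_\ell\}_{\ell=1}^{k-1} \,\Big|\, \hat{\mathbf{X}}_1^{k} \Big). \tag{18}$$

Note that in the situation of multiple antennas at both transmitter and receiver it is not possible to gain full knowledge about all fading coefficients even if both $\mathbf{X}$ and $\mathbf{Y}$ are known!

Note further that, in order to be able to discard the noise, we rely on the observation that the capacity-achieving input distribution escapes to infinity [6], [5].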

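In a next step we now could use the bounding techniques known from [3], [5] to get a bound on the memoryless MIMO fading channel. Unfortunately, it turns out that this will lead to a non-tight bound. Instead we split the term $I(\mathbf{X}_k; \mathbf{Y}_k)$ up into a magnitude term and a term that takes care of the direction:

$$I(\mathbf{X}_k; \mathbf{Y}_k) \le I\big(\mathbf{X}_k; \|\mathbf{Y}_k\|\big) + I\bigg( \mathbf{X}_k; \frac{\mathbf{Y}_k}{\|\mathbf{Y}_k\|} \,\bigg|\, \|\mathbf{Y}_k\| \bigg) \tag{19}$$

and show that

$$I\bigg( \mathbf{X}_k; \frac{\mathbf{Y}_k}{\|\mathbf{Y}_k\|} \,\bigg|\, \|\mathbf{Y}_k\| \bigg) \le I\bigg( \|\mathbb{H}_k \hat{\mathbf{X}}_k\|, \hat{\mathbf{X}}_k; \frac{\mathbb{H}_k \hat{\mathbf{X}}_k}{\|\mathbb{H}_k \hat{\mathbf{X}}_k\|} \bigg). \tag{20}$$

The first term in (19) now (almost) looks like a MISO fading channel where we can fix the fact that the output is non-negative by multiplying $\|\mathbf{Y}_k\|$ by an independent circularly symmetric phase (note that this does not change the mutual information).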

In order to have a bound that is independent of the unknown capacity-achieving input distribution, in a final step we maximize the bound over all input distributions. Here we can rely on Theorem 1 which says that we can restrict ourselves to stationary input distributions.

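Hence, the upper bound basically looks like

$$\chi(\{\mathbb{H}_k\}) \le \chi_{\text{MISO,IID}}\big( \|\mathbb{H}_k \hat{\mathbf{X}}_k\| e^{i\Theta_k} \big) + I\bigg( \|\mathbb{H}_k \hat{\mathbf{X}}_k\|, \hat{\mathbf{X}}_k; \frac{\mathbb{H}_k \hat{\mathbf{X}}_k}{\|\mathbb{H}_k \hat{\mathbf{X}}_k\|} \bigg) - I\Bigg( \frac{\mathbb{H}_k \hat{\mathbf{X}}_k}{\|\mathbb{H}_k \hat{\mathbf{X}}_k\|}; \bigg\{ \frac{\mathbb{H}_\ell \hat{\mathbf{X}}_\ell}{\|\mathbb{H}_\ell \hat{\mathbf{X}}_\ell\|} \bigg\}_{\ell=-\infty}^{k-1} \Bigg) + I\Big( \mathbb{H}_k \hat{\mathbf{X}}_k; \{\mathbb{H}_\ell \hat{\mathbf{X}}_\ell\}_{\ell=-\infty}^{k-1} \,\Big|\, \hat{\mathbf{X}}_{-\infty}^{k} \Big). \tag{21}$$

Here the first term corresponds to an expression that is similar to the memoryless fading number, the second term takes care of the contribution of the direction of the input, and the last two terms take care of the contribution of the memory.

Note that we have skipped over a lot of problems like, e.g., the edge effects of Theorem 1, the order of the limits of $n \to \infty$ and $S \to \infty$, the fact that the escaping-to-infinity property only comes into play in the limit when $S$ tends to infinity, or the care that is needed when dealing with the "almost MISO fading number."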

b) Outline of Lower Bound: To derive a lower bound we choose a specific input distribution which naturally yields a lower bound to channel capacity. Let $\{\mathbf{X}_k\}$ be of the form

$$\mathbf{X}_k = R_k \cdot \hat{\mathbf{X}}_k. \tag{22}$$

Here $\{\hat{\mathbf{X}}_k\}$ is a sequence of random unit-vectors forming a stochastic process that is stationary, circularly symmetric, and of a distribution that achieves the maximum in (14). The random variables $R_k \in \mathbb{R}_0^+$ are IID and satisfy (16). Finally we assume that $\{R_k\}$ and $\{\hat{\mathbf{X}}_k\}$ are independent.

We then again start using the chain rule to write

$$\frac{1}{n} I(\mathbf{X}_1^n; \mathbf{Y}_1^n) = \frac{1}{n} \sum_{k=1}^{n} I(\mathbf{X}_k; \mathbf{Y}_1^n \mid \mathbf{X}_1^{k-1}) \tag{23}$$

and then treat each term separately:

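$$I(\mathbf{X}_k; \mathbf{Y}_1^n \mid \mathbf{X}_1^{k-1}) \ge I(\mathbf{X}_k; \mathbf{Y}_k \mid \mathbf{Y}_1^{k-1}, \mathbf{Y}_{k+1}^{n}, \mathbf{X}_1^{k-1}) + I(\mathbf{X}_k; \mathbf{Y}_1^{k-1}, \mathbf{Y}_{k+1}^{n} \mid \mathbf{X}_1^{k-1}). \tag{24}$$

Note that the first term basically is the memoryless situation based on the side-information of past and future terms. To simplify the notation let's call this side-information $S_k$:

$$S_k \triangleq \big( \mathbf{Y}_{k+1}^{n}, \mathbf{Y}_1^{k-1}, \mathbf{X}_1^{k-1} \big). \tag{25}$$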

Contrary to the derivation of the upper bound that has been based on the memoryless MISO case, we will base the derivation of the lower bound on memoryless SIMO, i.e., we split the first term in (24) into two parts:

$$I(\mathbf{X}_k; \mathbf{Y}_k \mid S_k) = I(\hat{\mathbf{X}}_k; \mathbf{Y}_k \mid S_k) + I(R_k; \mathbf{Y}_k \mid \hat{\mathbf{X}}_k, S_k). \tag{26}$$

Now we have the problem that the second term does not correspond exactly to the SIMO situation since the input of the channel is real instead of complex. This is fixed by various arithmetic changes which at the end yield the following bound:

$$I(\mathbf{X}_k; \mathbf{Y}_k \mid S_k) \ge I\big( R_k e^{i\Theta_k}; \mathbf{Y}_k e^{i\Theta_k} \,\big|\, \hat{\mathbf{X}}_k, S_k \big) + h_\lambda\big( \hat{\mathbf{Y}}_k \,\big|\, S_k \big) - h_\lambda\big( \hat{\mathbf{Y}}_k e^{i\Theta_k} \,\big|\, \hat{\mathbf{X}}_k, S_k \big). \tag{27}$$

Note that our choice of $R_k$ guarantees that $R_k e^{i\Theta_k}$ achieves the fading number of memoryless SIMO fading with side-information. Hence, we get

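$$\begin{aligned} \chi(\{\mathbb{H}_k\}) &\ge \chi_{\text{IID}}\big( \mathbb{H}_0 \hat{\mathbf{X}}_0 \,\big|\, \hat{\mathbf{X}}_0, S_0 \big) + h_\lambda\big( \hat{\mathbf{Y}}_0 \,\big|\, S_0 \big) - h_\lambda\big( \hat{\mathbf{Y}}_0 e^{i\Theta_0} \,\big|\, \hat{\mathbf{X}}_0, S_0 \big) + I\big( \mathbf{X}_0; \mathbf{Y}_1^{\infty}, \mathbf{Y}_{-\infty}^{-1} \,\big|\, \mathbf{X}_{-\infty}^{-1} \big) \\ &= h_\lambda\bigg( \frac{\mathbb{H}_0 \hat{\mathbf{X}}_0}{\|\mathbb{H}_0 \hat{\mathbf{X}}_0\|} e^{i\Theta_0} \,\bigg|\, \hat{\mathbf{X}}_0, S_0 \bigg) + n_R E\big[ \log \|\mathbb{H}_0 \hat{\mathbf{X}}_0\|^2 \big] - \log 2 - h\big( \mathbb{H}_0 \hat{\mathbf{X}}_0 \,\big|\, \hat{\mathbf{X}}_0, S_0 \big) \\ &\quad + h_\lambda\big( \hat{\mathbf{Y}}_0 \,\big|\, S_0 \big) - h_\lambda\big( \hat{\mathbf{Y}}_0 e^{i\Theta_0} \,\big|\, \hat{\mathbf{X}}_0, S_0 \big) + I\big( \mathbf{X}_0; \mathbf{Y}_1^{\infty}, \mathbf{Y}_{-\infty}^{-1} \,\big|\, \mathbf{X}_{-\infty}^{-1} \big). \end{aligned} \tag{28}$$

In this outline we have again simplified things considerably, e.g., we have interchanged the order of the limits of $n \to \infty$ and $S \to \infty$, and we have neglected the influence of the noise in various places.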

The result now follows by showing that the lower bound is equivalent to the upper bound. This follows from some


V. SOME SPECIAL CASES

A. MISO Fading With Memory

Next we are going to study the fading number of MISO fading with memory, which has been unknown so far. If we specialize Theorem 6 to the situation of only one antenna at the receiver we get the following:

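Corollary 7: Consider a MISO fading channel with memory where the stationary and ergodic fading process $\{\mathbf{H}_k\}$ takes value in $\mathbb{C}^{n_T}$ and satisfies $h(\{\mathbf{H}_k\}) > -\infty$ and $E[\|\mathbf{H}_k\|^2] < \infty$. Then, irrespective of whether a peak-power constraint (5) or an average-power constraint (6) is imposed on the input, the fading number $\chi(\{\mathbf{H}_k^{\mathsf{T}}\})$ is given by

$$\chi(\{\mathbf{H}_k^{\mathsf{T}}\}) = \sup_{\substack{Q_{\{\hat{\mathbf{X}}_k\}} \\ \text{stationary}}} \Big\{ \log \pi + E\big[ \log |\mathbf{H}_0^{\mathsf{T}} \hat{\mathbf{X}}_0|^2 \big] - h\Big( \mathbf{H}_0^{\mathsf{T}} \hat{\mathbf{X}}_0 \,\Big|\, \{\mathbf{H}_\ell^{\mathsf{T}} \hat{\mathbf{X}}_\ell\}_{\ell=-\infty}^{-1}, \hat{\mathbf{X}}_{-\infty}^{0} \Big) \Big\} \tag{29}$$

where the maximization is over all stochastic unit-vector processes $\{\hat{\mathbf{X}}_k\}$ that are stationary.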
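As a numerical sanity check of the structure of (29), consider the simplest special case, which is an assumption added here and is not treated in this section: a memoryless single-antenna channel ($n_T = 1$) whose fading is zero-mean, unit-variance, circularly symmetric Gaussian. The unit "vector" is then only a phase, the conditioning on the past drops out, and the expression reduces to $\log\pi + E[\log|H|^2] - h(H)$. With $E[\log|H|^2] = -\gamma$ (Euler's constant) and $h(H) = \log(\pi e)$ this evaluates to $-1 - \gamma$. The Monte Carlo sketch below, with the hypothetical helper `estimate_rayleigh_fading_number`, estimates the expectation by simulation.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def estimate_rayleigh_fading_number(num_samples, rng):
    """Monte Carlo estimate of log(pi) + E[log |H|^2] - h(H) for H ~ N_C(0, 1)."""
    s = math.sqrt(0.5)  # per-dimension standard deviation of a unit-variance H
    total = 0.0
    for _ in range(num_samples):
        h = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
        total += math.log(abs(h) ** 2)
    mean_log_gain = total / num_samples
    entropy = math.log(math.pi * math.e)  # differential entropy of N_C(0, 1)
    return math.log(math.pi) + mean_log_gain - entropy


rng = random.Random(2007)
estimate = estimate_rayleigh_fading_number(200_000, rng)
closed_form = -1.0 - EULER_GAMMA
```

The estimate and the closed form should agree up to Monte Carlo error; the negative value reflects how early the double-logarithmic regime sets in for this channel.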

Remark 8: Note that in contrast to the situation without memory, where the optimal input is a beam-forming input using a deterministic direction that maximizes the fading number, here beam-forming is in general not optimal anymore.

B. Spatially IID Gaussian MIMO Fading Channels with Temporal Memory

Assume a fading process $\mathbb{H}_k = \mathsf{D} + \tilde{\mathbb{H}}_k$ where all components of the matrix process $\{\tilde{\mathbb{H}}_k\}$ are independent and identically distributed zero-mean, unit-variance Gaussian random processes with spectral distribution function $F(\cdot)$, i.e., the fading components are spatially IID, but have temporal memory. Note that for some constant unitary $n_R \times n_R$ matrix $\mathsf{U}$ and some constant unitary $n_T \times n_T$ matrix $\mathsf{V}$ the law of $\mathsf{U} \tilde{\mathbb{H}}_k \mathsf{V}$ is identical to the law of $\tilde{\mathbb{H}}_k$. Therefore, without loss of generality, we may restrict ourselves to matrices $\mathsf{D}$ that are "diagonal" with the singular values of $\mathsf{D}$, $d_1 \ge d_2 \ge \ldots \ge d_{\min\{n_R, n_T\}} \ge 0$, on its diagonal.

Proposition 9: The fading number of a spatially IID Gaussian MIMO fading channel with temporal memory is upper-bounded as follows:

X({HIk}) < min{rnR,t}

m.1

log uR

+/lRlmingRnR,

nT}I

+nR

log

nR -nR-l10

F(nR)

nRlogEpred where 62 ((d 2.dmn{---} 2)1/minf{nR,nT} (i + d 2 .. +dRn) (31)

and where $\epsilon^2_{\text{pred}}$ denotes the prediction error when predicting the value of one component of $\mathbb{H}_0$ after having observed the infinite past.

VI. DISCUSSION & CONCLUSION

We have derived the fading number in the most general situation of MIMO fading with memory where the fading process is not limited to a Gaussian distribution, but may be any stationary, ergodic, and regular distribution of finite energy. In particular we allow both temporal and spatial memory. The MIMO fading number is achievable by an input process that can be written as a product of two independent processes: an IID non-negative "magnitude" process and a stationary and circularly symmetric "direction" process. The former has the same logarithmically uniform distribution (16) that has been used in previous publications about the fading number [3], [5], [6]. The "direction" process depends on the particular law of the fading process, i.e., it needs to be chosen such as to maximize the fading number. The expression of the fading number is therefore not given in a completely closed form but still contains a maximization. However, one has to be aware that we have not specified the fading process in closed form either, i.e., we do not believe it possible to further simplify (14) without making more detailed assumptions about $\{\mathbb{H}_k\}$. We are still working on a fully closed-form expression for the important special situation of Gaussian fading processes.

The proof of the main result is strongly based on a new theorem showing that the capacity-achieving input distribution of a stationary channel model can (almost) be assumed to be stationary. Even though this result is very intuitive, we are not aware of any proof in the literature. We believe this result to be of importance also in many other situations.

We also have specialized the result to MISO fading with memory and shown that in contrast to the memoryless situation this fading number is in general not achieved by beam-forming.

ACKNOWLEDGMENTS

This work was supported by the Industrial Technology Research Institute (ITRI), Zhudong, Taiwan, under JRC NCTU-ITRI and by the National Science Council under NSC 95-2221-E-009-046.

REFERENCES

[1] I. E. Telatar, "Capacity of multi-antenna Gaussian channels," Europ. Trans. Telecommun., vol. 10, no. 6, pp. 585-595, Nov.-Dec. 1999.

[2] G. Taricco and M. Elia, "Capacity of fading channel with no side information," Electron. Lett., vol. 33, no. 16, pp. 1368-1370, Jul. 31, 1997.

[3] A. Lapidoth and S. M. Moser, "Capacity bounds via duality with applications to multiple-antenna systems on flat fading channels," IEEE Trans. Inform. Theory, vol. 49, no. 10, pp. 2426-2467, Oct. 2003.

[4] A. Lapidoth, "On the asymptotic capacity of stationary Gaussian fading channels," IEEE Trans. Inform. Theory, vol. 51, no. 2, pp. 437-446, Feb. 2005.

[5] S. M. Moser, "Duality-based bounds on channel capacity," Ph.D. dissertation, Swiss Fed. Inst. of Techn., Zurich, Oct. 2004, Diss. ETH No. 15769. [Online]. Available: http://moser.cm.nctu.edu.tw/

[6] A. Lapidoth and S. M. Moser, "The fading number of single-input multiple-output fading channels with memory," IEEE Trans. Inform. Theory, vol. 52, no. 2, pp. 437-453, Feb. 2006.

[7] S. M. Moser, "The fading number of memoryless multiple-input multiple-output fading channels," Apr. 2007, to app. in IEEE Trans. Inform. Theory. [Online]. Available: http://moser.cm.nctu.edu.tw/publications.html