

REFERENCES

[1] D. Blackwell, “The entropy of functions of finite-state Markov chains,” in Trans. 1st Prague Conf. Information Thoery, Statistical Decision Functions, Random Processes, Prague, Czechoslovakia, 1957, pp. 13–20.

[2] G. Constantine and T. Savits, “A multivariate Faa Di Bruno formula with applications,” Trans. Amer. Math. Soc., vol. 348, no. 2, pp. 503–520, Feb. 1996.

[3] R. Gharavi and V. Anantharam, “An upper bound for the largest Lyapunov exponent of a Markovian product of nonnegative matrices,” Theor. Comp. Sci., vol. 332, no. 1–3, pp. 543–557, Feb. 2005. [4] G. Han and B. Marcus, “Analyticity of entropy rate of hidden Markov

chains,” IEEE Trans. Inf. Theory, vol. 52, no. 12, pp. 5251–5266, Dec. 2006.

[5] T. Holliday, A. Goldsmith, and P. Glynn, “Capacity of finite state chan-nels based on Lyapunov exponents of random matrices,” IEEE Trans. Inf. Theory, vol. 52, no. 8, pp. 3509–3532, Aug. 2006.

[6] P. Jacquet, G. Seroussi, and W. Szpankowski, “On the entropy of a hidden Markov process,” in Proc. Data Compression Conf., Snowbird, UT, Mar. 2004, pp. 362–371.

[7] J. Kemeny and J. Snell, Finite Markov Chains. Princeton, N.J.: Van Nostrand, 1960.

[8] R. Leipnik and T. Reid, “Multivariable Faa Di Bruno formulas,” in Electronic Proc 9th Annu. Int. Conf. Technology in Collegiate Mathe-matics [Online]. Available: http://archives.math.utk.edu/ICTCM/EP-9. html#C23

[9] D. Lind and B. Marcus, An Introduction to Symbolic Dynamics and Coding. Cambridge, U.K.: Cambridge Univ. Press, 1995.

[10] B. Marcus, K. Petersen, and S. Williams, “Transmission rates and fac-tors of Markov chains,” Contemp. Math., vol. 26, pp. 279–294, 1984. [11] E. Ordentlich and T. Weissman, “On the optimality of symbol by

symbol filtering and denoising,” IEEE Trans. Inf. Theory, vol. 52, no. 1, pp. 19–40, Jan. 2006.

[12] E. Ordentlich and T. Weissman, “New bounds on the entropy rate of hidden Markov process,” in Proc. Information Theory Workshop, San Antonio, TX, Oct. 2004, pp. 117–122.

[13] E. Ordentlich and T. Weissman, Personal Communication.

[14] Y. Peres, “Analytic dependence of Lyapunov exponents on transition probabilities,” in Lyapunov’s Exponents, Proceedings of a Workshop (Lecture Notes in Mathematics). Berlin, Germany: Springer-Verlag, 1990, vol. 1486.

[15] Y. Peres, “Domains of analytic continuation for the top Lyapunov ex-ponent,” Ann. Inst. H. Poincaré Probab. Statist., vol. 28, no. 1, pp. 131–148, 1992.

[16] O. Zuk, I. Kanter, and E. Domany, “Asymptotics of the entropy rate for a hidden Markov process,” J. Statist. Phys., vol. 121, no. 3–4, pp. 343–360, 2005.

[17] O. Zuk, E. Domany, I. Kanter, and M. Aizenman, “From finite-system entropy to entropy rate for a hidden Markov process,” IEEE Signal Process. Lett., , vol. 13, no. 9, pp. 517–520, Sep. 2006.

The Fading Number of Memoryless Multiple-Input Multiple-Output Fading Channels

Stefan M. Moser, Member, IEEE

Abstract: In this correspondence, we derive the fading number of multiple-input multiple-output (MIMO) flat-fading channels of general (not necessarily Gaussian) regular law without temporal memory. The channel is assumed to be noncoherent, i.e., neither the receiver nor the transmitter has knowledge of the channel state; both know only the probability law of the fading process. The fading number is the second term, after the double-logarithmic term, of the high signal-to-noise ratio (SNR) expansion of channel capacity. Hence, the asymptotic channel capacity of memoryless MIMO fading channels is derived exactly. The result is then specialized to the known cases of single-input multiple-output (SIMO), multiple-input single-output (MISO), and single-input single-output (SISO) fading channels, as well as to the situation of Gaussian fading.

Index Terms: Channel capacity, fading number, Gaussian fading, general flat fading, high signal-to-noise ratio (SNR), multiple antenna, multiple-input multiple-output (MIMO), noncoherent.

I. INTRODUCTION

It has recently been shown in [1], [2] that, whenever the matrix-valued fading process is of finite differential entropy rate (a so-called regular process), the capacity of noncoherent multiple-input multiple-output (MIMO) fading channels typically grows only double-logarithmically in the signal-to-noise ratio (SNR).

This is in stark contrast both to the coherent fading channel, where the receiver has perfect knowledge of the channel state, and to the noncoherent fading channel with nonregular channel law, i.e., where the differential entropy rate of the fading process is not finite. In the former case the capacity grows logarithmically in the SNR with a factor in front of the logarithm that is related to the number of receive and transmit antennas [3].

Manuscript received June 1, 2006; revised March 12, 2007. This work was supported by the Industrial Technology Research Institute (ITRI), Zhudong, Taiwan, under Contract G1-95003.

The author is with the Department of Communication Engineering, National Chiao Tung University (NCTU), Hsinchu, Taiwan (e-mail: stefan.moser@ieee.org).

Communicated by K. Kobayashi, Associate Editor for Shannon Theory.

Digital Object Identifier 10.1109/TIT.2007.899512

In the latter case, the asymptotic growth rate of the capacity depends highly on the specific details of the fading process. In the case of Gaussian fading, nonregularity means that the present fading realization can be predicted precisely from the past realizations. However, in every noncoherent system the past realizations are not known a priori, but need to be estimated either from known past channel inputs and outputs or by means of special training signals. Depending on the spectral distribution of the fading process, the dependence of such estimates on the available power can vary widely, which gives rise to a large variety of possible high-SNR capacity behaviors: it is shown in [4], [5], and [6] that, depending on the spectrum of the nonregular Gaussian fading process, the capacity can exhibit very slow double-logarithmic growth, fast logarithmic growth, or even exotic situations where it grows proportionally to a fractional power of $\log \mathrm{SNR}$.

Similarly, Liang and Veeravalli show in [7] that the capacity of a Gaussian block-fading channel depends critically on the assumptions one makes about the time-correlation of the fading process: if the correlation matrix is rank deficient, the capacity grows logarithmically in the SNR, otherwise double-logarithmically.

In this correspondence we will only consider noncoherent channels with regular fading processes, i.e., the capacity at high SNR grows double-logarithmically. To quantify the rates at which this poor power efficiency begins, [1], [2] introduce the fading number as the second term in the high-SNR asymptotic expansion of channel capacity. Hence, the capacity can be written as

$$C(\mathrm{SNR}) = \log\bigl(1 + \log(1 + \mathrm{SNR})\bigr) + \chi + o(1) \tag{1}$$

where $o(1)$ tends to zero as the SNR tends to infinity, and where $\chi$ is a constant, called the fading number, that does not depend on the SNR.
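To get a feeling for how slowly the expansion (1) grows, the following minimal sketch evaluates its first two terms while ignoring the $o(1)$ term. The function name `high_snr_capacity` and the value `CHI_EXAMPLE` are illustrative assumptions, not quantities taken from the correspondence.

```python
import math


def high_snr_capacity(snr: float, chi: float) -> float:
    """First two terms of expansion (1), in nats per channel use.

    The o(1) term is dropped, so this is only meaningful at high SNR.
    """
    return math.log(1.0 + math.log(1.0 + snr)) + chi


# Hypothetical fading number, chosen only to make the example concrete.
CHI_EXAMPLE = -1.0

for snr_db in (20, 40, 80):
    snr = 10.0 ** (snr_db / 10.0)
    approx = high_snr_capacity(snr, CHI_EXAMPLE)
    print(f"{snr_db} dB -> {approx:.3f} nats per channel use")
```

Under this approximation, squaring the SNR (doubling it in decibels) adds less than $\log 2$ nats, which is the poor power efficiency the text refers to.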

Explicit expressions of the fading number are known for a number of fading models. For channels with memory, the fading number of single-input–single-output (SISO) fading channels is derived in [1], [2] and the single-input–multiple-output (SIMO) case is derived in [8] and [2].

For memoryless fading channels, the fading number is known in the situation of only one antenna at transmitter and receiver (SISO)

$$\chi(H) = \log \pi + \mathsf{E}\bigl[\log |H|^2\bigr] - h(H) \tag{2}$$

in the situation of a SIMO fading channel¹

$$\chi(\mathbf{H}) = h_\lambda\bigl(\hat{\mathbf{H}} e^{i\Theta}\bigr) + n_{\mathrm{R}}\, \mathsf{E}\bigl[\log \|\mathbf{H}\|^2\bigr] - \log 2 - h(\mathbf{H}) \tag{3}$$

(both are special cases of the corresponding situation with memory), and also in the case of a multiple-input single-output (MISO) fading channel [1], [2]

$$\chi(\mathbf{H}^{\mathsf{T}}) = \sup_{\hat{\mathbf{x}}} \Bigl\{ \log \pi + \mathsf{E}\bigl[\log |\mathbf{H}^{\mathsf{T}} \hat{\mathbf{x}}|^2\bigr] - h(\mathbf{H}^{\mathsf{T}} \hat{\mathbf{x}}) \Bigr\}. \tag{4}$$

The most general situation of multiple antennas at both transmitter and receiver, however, has been solved so far only in the special case of a particular rotational symmetry of the fading process: if every rotation of the input vector of the channel can be "undone" by a corresponding rotation of the output vector, and vice versa, then the fading number has been shown in [1], [2] to be

$$\chi(\mathbb{H}) = \log \frac{\pi^{n_{\mathrm{R}}}}{\Gamma(n_{\mathrm{R}})} + n_{\mathrm{R}}\, \mathsf{E}\bigl[\log \|\mathbb{H} \hat{\mathbf{e}}\|^2\bigr] - h(\mathbb{H} \hat{\mathbf{e}}) \tag{5}$$

where $\hat{\mathbf{e}} \in \mathbb{C}^{n_{\mathrm{T}}}$ is an arbitrary constant vector of unit length, and where $n_{\mathrm{R}}$ denotes the number of receive antennas. Such fading channels are called rotation-commutative in the generalized sense (for a detailed definition see Section V).

¹For a precise definition of the notation used in this correspondence, we refer to Section II.

In this correspondence, we will extend these results and derive the fading number of general memoryless MIMO fading channels.

The remainder of this correspondence is structured as follows. Before we proceed in Section III to introduce the channel model in detail, the following section will clarify our notation. We will then present the main result, i.e., the fading number of the general memoryless MIMO fading channel, in Section IV. The corresponding proof is found in Section VII.

In Section V, the known fading numbers of SISO, SIMO, MISO, and rotation-commutative MIMO fading channels are derived once more as special cases of the new general result from Section IV. In Section VI, we investigate the situation of Gaussian fading processes. We will conclude in Section VIII.

II. NOTATION

We try to use uppercase letters for random quantities and lowercase letters for their realizations. This rule, however, is broken when dealing with matrices and some constants. To better differentiate between scalars, vectors, and matrices we have resorted to using different fonts for the different quantities. Uppercase letters such as $X$ are used to denote scalar random variables taking value in the reals $\mathbb{R}$ or in the complex plane $\mathbb{C}$. Their realizations are typically written in lowercase, e.g., $x$. For random vectors we use boldface capitals, e.g., $\mathbf{X}$, and bold lowercase for their realizations, e.g., $\mathbf{x}$. Deterministic matrices are denoted by uppercase letters of a special font, e.g., $\mathsf{H}$; and random matrices are denoted using another special uppercase font, e.g., $\mathbb{H}$. The capacity is denoted by $C$, the energy per symbol by $\mathcal{E}$, and the signal-to-noise ratio by SNR.

We use the shorthand $H_a^b$ for $(H_a, H_{a+1}, \ldots, H_b)$. For more complicated expressions, such as $\bigl(\mathbf{H}_a^{\mathsf{T}} \hat{\mathbf{x}}_a, \mathbf{H}_{a+1}^{\mathsf{T}} \hat{\mathbf{x}}_{a+1}, \ldots, \mathbf{H}_b^{\mathsf{T}} \hat{\mathbf{x}}_b\bigr)$, we use the dummy variable $\ell$ to clarify notation: $\bigl\{\mathbf{H}_\ell^{\mathsf{T}} \hat{\mathbf{x}}_\ell\bigr\}_{\ell=a}^{b}$. Hermitian conjugation is denoted by $(\cdot)^\dagger$, and $(\cdot)^{\mathsf{T}}$ stands for the transpose (without conjugation) of a matrix or vector. The trace of a matrix is denoted by $\operatorname{tr}(\cdot)$.

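We use $\|\cdot\|$ to denote the Euclidean norm of vectors or the Euclidean operator norm of matrices. That is

$$\|\mathbf{x}\| \triangleq \sqrt{\sum_{t=1}^{m} |x^{(t)}|^2}, \quad \mathbf{x} \in \mathbb{C}^m \tag{6}$$

$$\|\mathsf{A}\| \triangleq \max_{\|\hat{\mathbf{w}}\| = 1} \|\mathsf{A} \hat{\mathbf{w}}\|. \tag{7}$$

Thus, $\|\mathsf{A}\|$ is the maximal singular value of the matrix $\mathsf{A}$.

The Frobenius norm of matrices is denoted by $\|\cdot\|_{\mathrm{F}}$ and is given by the square root of the sum of the squared magnitudes of the elements of the matrix, i.e.,

$$\|\mathsf{A}\|_{\mathrm{F}} \triangleq \sqrt{\operatorname{tr}\bigl(\mathsf{A}^\dagger \mathsf{A}\bigr)}. \tag{8}$$

Note that for every matrix $\mathsf{A}$

$$\|\mathsf{A}\| \le \|\mathsf{A}\|_{\mathrm{F}} \tag{9}$$

as can be verified by upper bounding the squared magnitude of each of the components of $\mathsf{A}\hat{\mathbf{w}}$ using the Cauchy–Schwarz inequality.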
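The relation between the operator norm and the Frobenius norm can be checked numerically. The sketch below uses only the Python standard library; the helper names and the example matrix are illustrative assumptions, and the operator norm is approximated by power iteration on $\mathsf{A}^\dagger\mathsf{A}$ (a swapped-in numerical technique, not something the correspondence uses), which assumes the all-ones start vector is not orthogonal to the dominant right singular vector.

```python
import math


def matvec(a, w):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * w_j for a_ij, w_j in zip(row, w)) for row in a]


def conj_transpose(a):
    """Hermitian conjugate of a matrix stored as a list of rows."""
    return [[a[i][j].conjugate() for i in range(len(a))] for j in range(len(a[0]))]


def euclid_norm(x):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(c) ** 2 for c in x))


def frobenius_norm(a):
    """Square root of the sum of squared magnitudes of all entries."""
    return math.sqrt(sum(abs(c) ** 2 for row in a for c in row))


def operator_norm(a, iters=500):
    """Approximate the largest singular value by power iteration."""
    a_dag = conj_transpose(a)
    w = [complex(1.0, 0.0)] * len(a[0])
    for _ in range(iters):
        v = matvec(a_dag, matvec(a, w))
        n = euclid_norm(v)
        if n == 0.0:
            return 0.0
        w = [c / n for c in v]
    return euclid_norm(matvec(a, w))


# Arbitrary 2x3 complex example matrix (hypothetical values).
A = [[1 + 2j, 0.5j, -1.0], [0.0, 3.0 - 1j, 2j]]
print(operator_norm(A), frobenius_norm(A))
```

For a rank-one matrix the two norms coincide, so the inequality is tight in that case.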

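We will often split a complex vector $\mathbf{v} \in \mathbb{C}^m$ up into its magnitude $\|\mathbf{v}\|$ and its direction

$$\hat{\mathbf{v}} \triangleq \frac{\mathbf{v}}{\|\mathbf{v}\|} \tag{10}$$

where we reserve this notation exclusively for unit vectors, i.e., throughout the correspondence every vector carrying a hat, $\hat{\mathbf{v}}$ or $\hat{\mathbf{V}}$, denotes a (deterministic or random, respectively) vector of unit length

$$\|\hat{\mathbf{v}}\| = \|\hat{\mathbf{V}}\| = 1. \tag{11}$$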

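To be able to work with such direction vectors we shall need a differential entropy-like quantity for random vectors that take value on the unit sphere in $\mathbb{C}^m$: let $\lambda$ denote the area measure on the unit sphere in $\mathbb{C}^m$. If a random vector $\hat{\mathbf{V}}$ takes value in the unit sphere and has the density $p^{\lambda}_{\hat{\mathbf{V}}}(\hat{\mathbf{v}})$ with respect to $\lambda$, then we shall let

$$h_\lambda(\hat{\mathbf{V}}) \triangleq -\mathsf{E}\bigl[\log p^{\lambda}_{\hat{\mathbf{V}}}(\hat{\mathbf{V}})\bigr] \tag{12}$$

if the expectation is defined.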

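We note that just as ordinary differential entropy is invariant under translation, so is $h_\lambda(\hat{\mathbf{V}})$ invariant under rotation. That is, if $\mathsf{U}$ is a deterministic unitary matrix, then

$$h_\lambda(\mathsf{U}\hat{\mathbf{V}}) = h_\lambda(\hat{\mathbf{V}}). \tag{13}$$

Also note that $h_\lambda(\hat{\mathbf{V}})$ is maximized if $\hat{\mathbf{V}}$ is uniformly distributed on the unit sphere, in which case

$$h_\lambda(\hat{\mathbf{V}}) = \log c_m \tag{14}$$

where $c_m$ denotes the surface area of the unit sphere in $\mathbb{C}^m$

$$c_m = \frac{2\pi^m}{\Gamma(m)}. \tag{15}$$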

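The definition (12) can be easily extended to conditional entropies: if $\mathbf{W}$ is some random vector, and if conditional on $\mathbf{W} = \mathbf{w}$ the random vector $\hat{\mathbf{V}}$ has density $p^{\lambda}_{\hat{\mathbf{V}}|\mathbf{W}}(\hat{\mathbf{v}}|\mathbf{w})$, then we can define

$$h_\lambda(\hat{\mathbf{V}} \mid \mathbf{W} = \mathbf{w}) \triangleq -\mathsf{E}\Bigl[\log p^{\lambda}_{\hat{\mathbf{V}}|\mathbf{W}}(\hat{\mathbf{V}}|\mathbf{W}) \Bigm| \mathbf{W} = \mathbf{w}\Bigr] \tag{16}$$

and we can define $h_\lambda(\hat{\mathbf{V}} \mid \mathbf{W})$ as the expectation (with respect to $\mathbf{W}$) of $h_\lambda(\hat{\mathbf{V}} \mid \mathbf{W} = \mathbf{w})$.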

Based on these definitions, we have the following lemma.

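Lemma 1: Let $\mathbf{V}$ be a complex random vector taking value in $\mathbb{C}^m$ and having differential entropy $h(\mathbf{V})$. Let $\|\mathbf{V}\|$ denote its norm and $\hat{\mathbf{V}}$ denote its direction as in (10). Then

$$h(\mathbf{V}) = h(\|\mathbf{V}\|) + h_\lambda(\hat{\mathbf{V}} \mid \|\mathbf{V}\|) + (2m - 1)\, \mathsf{E}\bigl[\log \|\mathbf{V}\|\bigr] \tag{17}$$

$$\phantom{h(\mathbf{V})} = h_\lambda(\hat{\mathbf{V}}) + h(\|\mathbf{V}\| \mid \hat{\mathbf{V}}) + (2m - 1)\, \mathsf{E}\bigl[\log \|\mathbf{V}\|\bigr] \tag{18}$$

whenever all the quantities in (17) and (18), respectively, are defined. Here $h(\|\mathbf{V}\|)$ is the differential entropy of $\|\mathbf{V}\|$ when viewed as a real (scalar) random variable.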

Proof: Omitted.
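Although the proof is omitted, the decomposition can be sanity-checked in closed form for one concrete law. The sketch below assumes (for illustration only) a vector with IID circularly symmetric unit-variance complex Gaussian entries: its direction is then uniform on the unit sphere and independent of its norm, its entropy is $m \log(\pi e)$, and its squared norm is Gamma distributed with shape $m$ and unit scale. The function names are hypothetical, and the Gamma entropy and mean-log formulas are standard facts rather than results from this correspondence.

```python
import math

EULER_GAMMA = 0.5772156649015329


def psi(m: int) -> float:
    """Euler's psi function at a positive integer: -gamma + sum_{j<m} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, m))


def lemma1_sides(m: int):
    """Return (left side, right side) of the decomposition for a standard
    complex Gaussian vector in C^m (an assumed example law)."""
    lhs = m * math.log(math.pi * math.e)  # h(V) of the Gaussian vector

    # Direction is isotropic: h_lambda equals log of the sphere area 2 pi^m / Gamma(m).
    h_direction = math.log(2.0) + m * math.log(math.pi) - math.lgamma(m)

    # G = ||V||^2 is Gamma(m, 1): h(G) = m + log Gamma(m) + (1 - m) psi(m).
    h_gamma = m + math.lgamma(m) + (1 - m) * psi(m)

    # Change of variables R = sqrt(G): h(R) = h(G) - E[log(2 sqrt(G))],
    # with E[log G] = psi(m).
    h_norm = h_gamma - math.log(2.0) - 0.5 * psi(m)

    mean_log_norm = 0.5 * psi(m)  # E[log ||V||]
    rhs = h_direction + h_norm + (2 * m - 1) * mean_log_norm
    return lhs, rhs


for m in range(1, 5):
    left, right = lemma1_sides(m)
    print(m, left, right)
```

Because norm and direction are independent in this example, the two forms of the decomposition coincide, so a single check covers both.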

We shall write $\mathbf{X} \sim \mathcal{N}_{\mathbb{C}}(\boldsymbol{\mu}, \mathsf{K})$ if $\mathbf{X} - \boldsymbol{\mu}$ is a circularly symmetric, zero-mean, Gaussian random vector of covariance matrix $\mathsf{E}\bigl[(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})^\dagger\bigr] = \mathsf{K}$. By $X \sim \mathcal{U}([a, b])$ we denote a random variable that is uniformly distributed on the interval $[a, b]$. The probability distribution of a random variable $X$ or random vector $\mathbf{X}$ is denoted by $Q_X$ or $Q_{\mathbf{X}}$, respectively.

Throughout the correspondence $e^{i\Theta}$ denotes a complex random variable that is uniformly distributed over the unit circle

$$e^{i\Theta} \sim \text{Uniform on } \{z \in \mathbb{C} : |z| = 1\}. \tag{19}$$

When it appears in formulas with other random variables, $e^{i\Theta}$ is always assumed to be independent of these other variables.

All rates specified in this correspondence are in nats per channel use, i.e., $\log(\cdot)$ denotes the natural logarithmic function.

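III. THE CHANNEL MODEL

We consider a channel with $n_{\mathrm{T}}$ transmit antennas and $n_{\mathrm{R}}$ receive antennas whose time-$k$ output $\mathbf{Y}_k \in \mathbb{C}^{n_{\mathrm{R}}}$ is given by

$$\mathbf{Y}_k = \mathbb{H}_k \mathbf{x}_k + \mathbf{Z}_k. \tag{20}$$

Here $\mathbf{x}_k \in \mathbb{C}^{n_{\mathrm{T}}}$ denotes the time-$k$ input vector; the random matrix $\mathbb{H}_k \in \mathbb{C}^{n_{\mathrm{R}} \times n_{\mathrm{T}}}$ denotes the time-$k$ fading matrix; and the random vector $\mathbf{Z}_k \in \mathbb{C}^{n_{\mathrm{R}}}$ denotes the time-$k$ additive noise vector.

We assume that the random vectors $\{\mathbf{Z}_k\}$ are spatially and temporally white, zero-mean, circularly symmetric, complex Gaussian random vectors, i.e., $\{\mathbf{Z}_k\} \sim \text{IID } \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \sigma^2 \mathsf{I})$ for some $\sigma^2 > 0$. Here $\mathsf{I}$ denotes the identity matrix.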

As for the matrix-valued fading process $\{\mathbb{H}_k\}$ we will not specify a particular distribution, but shall only assume that it is stationary, ergodic, of a finite-energy fading gain, i.e.,

$$\mathsf{E}\bigl[\|\mathbb{H}_k\|_{\mathrm{F}}^2\bigr] < \infty \tag{21}$$

and regular, i.e., its differential entropy rate is finite

$$h(\{\mathbb{H}_k\}) \triangleq \lim_{n \uparrow \infty} \frac{1}{n} h(\mathbb{H}_1, \ldots, \mathbb{H}_n) > -\infty. \tag{22}$$

Furthermore, we will restrict ourselves to the memoryless case, i.e., we assume that $\{\mathbb{H}_k\}$ is independent and identically distributed (IID) with respect to time $k$. Since there is no memory in the channel, an IID input process $\{\mathbf{X}_k\}$ will be sufficient to achieve capacity and we will therefore drop the time index $k$ hereafter, i.e., (20) simplifies to

$$\mathbf{Y} = \mathbb{H} \mathbf{x} + \mathbf{Z}. \tag{23}$$

Note that while we assume that there is no temporal memory in the channel, we do not restrict the spatial memory, i.e., the different fading components $H^{(i,j)}$ of the fading matrix $\mathbb{H}$ may be dependent.

We assume that the fading $\mathbb{H}$ and the additive noise $\mathbf{Z}$ are independent and of a joint law that does not depend on the channel input $\mathbf{x}$.
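A single use of the memoryless channel can be simulated as follows. This is only a sketch under an assumed fading law: the correspondence allows any regular law, whereas the code picks IID unit-variance complex Gaussian entries plus an optional constant line-of-sight matrix. The names `complex_gaussian`, `channel_use`, and `los` are hypothetical.

```python
import math
import random


def complex_gaussian(rng, variance=1.0):
    """One sample of a circularly symmetric complex Gaussian variable."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def channel_use(x, n_r, sigma2, rng, los=None):
    """Return one output vector Y = H x + Z.

    A fresh fading matrix H is drawn on every call (no temporal memory).
    The Gaussian fading law and the matrix `los` are illustrative choices.
    """
    n_t = len(x)
    y = []
    for i in range(n_r):
        # Row i of the fading matrix: line-of-sight part plus Gaussian part.
        row = [complex_gaussian(rng) + (los[i][j] if los else 0.0)
               for j in range(n_t)]
        noise = complex_gaussian(rng, sigma2)
        y.append(sum(h * xj for h, xj in zip(row, x)) + noise)
    return y


rng = random.Random(0)
y = channel_use([1.0 + 0j, 0.5j], n_r=3, sigma2=0.1, rng=rng)
print(y)
```

Neither the transmitter nor the receiver sees the drawn fading matrix in this sketch, which mirrors the noncoherent setting: only the law used inside `channel_use` is known.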

As for the input, we consider two different constraints: a peak-power constraint and an average-power constraint. We use $\mathcal{E}$ to denote the maximal allowed instantaneous power in the former case, and to denote the allowed average power in the latter case. For both cases we set

$$\mathrm{SNR} \triangleq \frac{\mathcal{E}}{\sigma^2}. \tag{24}$$

The capacity $C(\mathrm{SNR})$ of the channel (23) is given by

$$C(\mathrm{SNR}) = \sup_{Q_{\mathbf{X}}} I(\mathbf{X}; \mathbf{Y}) \tag{25}$$

where the supremum is over the set of all probability distributions on $\mathbf{X}$ satisfying the constraints, i.e.,

$$\|\mathbf{X}\|^2 \le \mathcal{E}, \quad \text{almost surely} \tag{26}$$

for a peak-power constraint, or

$$\mathsf{E}\bigl[\|\mathbf{X}\|^2\bigr] \le \mathcal{E} \tag{27}$$

for an average-power constraint.

Specializing [1, Theorem 4.2], [2, Theorem 6.10] to memoryless MIMO fading, we have

$$\varlimsup_{\mathrm{SNR} \uparrow \infty} \bigl\{ C(\mathrm{SNR}) - \log \log \mathrm{SNR} \bigr\} < \infty. \tag{28}$$

Note that [1, Theorem 4.2], [2, Theorem 6.10] is stated under the assumption of an average-power constraint only. However, since a peak-power constraint is more stringent than an average-power constraint, (28) also holds in the situation of a peak-power constraint.

The fading number $\chi$ is now defined as in [1, Definition 4.6], [2, Definition 6.13] by

$$\chi(\mathbb{H}) \triangleq \varlimsup_{\mathrm{SNR} \uparrow \infty} \bigl\{ C(\mathrm{SNR}) - \log \log \mathrm{SNR} \bigr\}. \tag{29}$$

Prima facie the fading number depends on whether a peak-power constraint (26) or an average-power constraint (27) is imposed on the input. However, it will turn out that the memoryless MIMO fading number is identical for both cases.

IV. MAIN RESULT

A. Preliminaries

Before we can state our main result, we need to introduce three concepts: The first concerns probability distributions that escape to infinity, the second a technique of upper bounding mutual information, and the third concept concerns circular symmetry.

1) Escaping to Infinity: We start with a discussion about the concept of capacity-achieving input distributions that escape to infinity.

A sequence of input distributions parameterized by the allowed cost (in our case of fading channels the cost is the available power or SNR) is said to escape to infinity if it assigns to every fixed compact set a probability that tends to zero as the allowed cost tends to infinity. In other words this means that in the limit—when the allowed cost tends to infinity—such a distribution does not use finite-cost symbols.

This notion is important because the asymptotic capacity of many channels of interest can only be achieved by input distributions that escape to infinity. As a matter of fact one can show that every input distribution that only achieves a mutual information of identical asymptotic growth rate as the capacity must escape to infinity. Loosely speaking, for many channels it is not favorable to use finite-cost input symbols whenever the cost constraint is loosened completely.

In the following we will only state this result specialized to the situation at hand. For a more general description and for all proofs we refer to [8, Sec. VII.C.3], [2, Sec. 2.6].

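Definition 2: Let $\{Q_{\mathbf{X},\mathcal{E}}\}_{\mathcal{E} \ge 0}$ be a family of input distributions for the memoryless fading channel (23), where this family is parameterized by the available average power $\mathcal{E}$ such that

$$\mathsf{E}_{Q_{\mathbf{X},\mathcal{E}}}\bigl[\|\mathbf{X}\|^2\bigr] \le \mathcal{E}, \quad \mathcal{E} \ge 0. \tag{30}$$

We say that the input distributions $\{Q_{\mathbf{X},\mathcal{E}}\}_{\mathcal{E} \ge 0}$ escape to infinity if for every $\mathcal{E}_0 > 0$

$$\lim_{\mathcal{E} \uparrow \infty} Q_{\mathbf{X},\mathcal{E}}\bigl(\|\mathbf{X}\|^2 \le \mathcal{E}_0\bigr) = 0. \tag{31}$$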

We now have the following lemma.

Lemma 3: Let the memoryless MIMO fading channel be given as in (23) and let $\{Q_{\mathbf{X},\mathcal{E}}\}_{\mathcal{E} \ge 0}$ be a family of distributions on the channel input that satisfy the power constraint (30). Let $I(Q_{\mathbf{X},\mathcal{E}})$ denote the mutual information between input and output of channel (23) when the input is distributed according to the law $Q_{\mathbf{X},\mathcal{E}}$. Assume that the family of input distributions $\{Q_{\mathbf{X},\mathcal{E}}\}_{\mathcal{E} \ge 0}$ is such that the following condition is satisfied:

$$\lim_{\mathcal{E} \uparrow \infty} \frac{I(Q_{\mathbf{X},\mathcal{E}})}{\log \log \mathcal{E}} = 1. \tag{32}$$

Then $\{Q_{\mathbf{X},\mathcal{E}}\}_{\mathcal{E} \ge 0}$ must escape to infinity.

Proof: A proof can be found in [8, Theorem 8, Remark 9], [2, Corollary 2.8].

2) An Upper Bound on Channel Capacity: In [1] and [2] a new approach to deriving upper bounds on channel capacity has been introduced. Since capacity is by definition a maximization of mutual information, it is inherently difficult to find upper bounds on it. The proposed technique is based on a dual expression of mutual information that leads to an expression of capacity as a minimization instead of a maximization. This way it becomes much easier to find upper bounds.

Again, here we only state the upper bound in a form needed in the derivation of Theorem 7. For a more general form, for more mathematical details, and for all proofs we refer to [1, Sec. IV], [2, Sec. 2.4].

Lemma 4: Consider a memoryless channel with input $\mathbf{s} \in \mathbb{C}^{n_{\mathrm{T}}}$ and output $T \in \mathbb{C}$. Then for an arbitrary distribution on the input $\mathbf{S}$ the mutual information between input and output of the channel is upper bounded as follows:

$$I(\mathbf{S}; T) \le -h(T \mid \mathbf{S}) + \log \pi + \alpha \log \beta + \log \Gamma\Bigl(\alpha, \frac{\nu}{\beta}\Bigr) + (1 - \alpha)\, \mathsf{E}\bigl[\log(|T|^2 + \nu)\bigr] + \frac{1}{\beta} \mathsf{E}\bigl[|T|^2\bigr] + \frac{\nu}{\beta} \tag{33}$$

where $\alpha, \beta > 0$ and $\nu \ge 0$ are parameters that can be chosen freely, but must not depend on the distribution of $\mathbf{S}$.

Proof: A proof can be found in [1, Sec. IV], [2, Sec. 2.4].

3) Capacity-Achieving Input Distributions and Circular Symmetry: The final preliminary remark concerns circular symmetry. We say that a random vector $\mathbf{W}$ is circularly symmetric if

$$\mathbf{W} \stackrel{\mathscr{L}}{=} \mathbf{W} \cdot e^{i\Theta} \tag{34}$$

where $\Theta \sim \mathcal{U}([0, 2\pi])$ is independent of $\mathbf{W}$ and where $\stackrel{\mathscr{L}}{=}$ stands for "equal in law". Note that this is not to be confused with isotropically distributed, which means that a vector has equal probability to point in every direction. Circular symmetry only concerns the phase of the components of a vector, not the vector's direction.

The following lemma says that for our channel model an optimal input can be assumed to be circularly symmetric.

Lemma 5: Assume a channel as given in (23). Then the capacity-achieving input distribution can be assumed to be circularly symmetric, i.e., the input vector $\mathbf{X}$ can be replaced by $\mathbf{X} e^{i\Theta}$, where $\Theta \sim \mathcal{U}([0, 2\pi])$ is independent of every other random quantity.

Proof: A proof is given in Appendix A.

Remark 6: Note that the proof of Lemma 5 relies only on the fact that the additive noise is assumed to be circularly symmetric.

B. Fading Number of General Memoryless MIMO Fading Channels

We are now ready for the main result, i.e., the fading number of a memoryless MIMO fading channel.

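Theorem 7: Consider a memoryless MIMO fading channel (23) where the random fading matrix $\mathbb{H}$ takes value in $\mathbb{C}^{n_{\mathrm{R}} \times n_{\mathrm{T}}}$ and satisfies

$$h(\mathbb{H}) > -\infty \tag{35}$$

and

$$\mathsf{E}\bigl[\|\mathbb{H}\|_{\mathrm{F}}^2\bigr] < \infty. \tag{36}$$

Then, irrespective of whether a peak-power constraint (26) or an average-power constraint (27) is imposed on the input, the fading number $\chi(\mathbb{H})$ is given by

$$\chi(\mathbb{H}) = \sup_{Q_{\hat{\mathbf{X}}}} \Bigl\{ h_\lambda\Bigl(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\Bigr) + n_{\mathrm{R}}\, \mathsf{E}\bigl[\log \|\mathbb{H}\hat{\mathbf{X}}\|^2\bigr] - \log 2 - h(\mathbb{H}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}) \Bigr\}. \tag{37}$$

Here $\hat{\mathbf{X}}$ denotes a random vector of unit length and $Q_{\hat{\mathbf{X}}}$ denotes its probability law, i.e., the supremum is taken over all distributions of the random unit vector $\hat{\mathbf{X}}$. Note that the expectation in the second term is understood jointly over $\mathbb{H}$ and $\hat{\mathbf{X}}$.

Moreover, this fading number is achievable by a random vector $\mathbf{X} = \hat{\mathbf{X}} \cdot R$ where $\hat{\mathbf{X}}$ is distributed according to the distribution that achieves the fading number in (37) and where $R$ is a nonnegative random variable independent of $\hat{\mathbf{X}}$ such that

$$\log R^2 \sim \mathcal{U}\bigl([\log \log \mathcal{E}, \log \mathcal{E}]\bigr). \tag{38}$$

Proof: A proof is given in Section VII.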
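The magnitude distribution in (38) is easy to sample, and doing so makes two of its properties concrete: every sample satisfies the peak-power constraint (26), and the family escapes to infinity in the sense of Definition 2. The sketch below covers only the radius; the function names are hypothetical, and the available power must exceed 1 so that its double logarithm is defined.

```python
import math
import random


def sample_radius_squared(energy: float, rng: random.Random) -> float:
    """Draw R^2 with log R^2 uniform on [log log E, log E]; requires E > 1."""
    lo, hi = math.log(math.log(energy)), math.log(energy)
    return math.exp(rng.uniform(lo, hi))


def prob_power_at_most(e0: float, energy: float) -> float:
    """Probability that R^2 <= e0 when log R^2 is uniform as above."""
    if e0 <= 0.0:
        return 0.0
    lo, hi = math.log(math.log(energy)), math.log(energy)
    frac = (math.log(e0) - lo) / (hi - lo)
    return min(1.0, max(0.0, frac))


rng = random.Random(1)
samples = [sample_radius_squared(1e6, rng) for _ in range(1000)]
print(min(samples), max(samples))

# The mass on any fixed power level shrinks as the available power grows.
for energy in (1e3, 1e6, 1e50):
    print(energy, prob_power_at_most(100.0, energy))
```

Since the support of $R^2$ is the interval from $\log \mathcal{E}$ to $\mathcal{E}$, the probability of any fixed compact set becomes exactly zero once $\log \mathcal{E}$ exceeds the set's largest power.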

Note that—even if it might not be obvious at first sight—it is not hard to show that the distributionQX^ that achieves the supremum in (37) is circularly symmetric. This is in agreement with Lemma 5.

The evaluation of (37) can be rather cumbersome, mainly due to the first term, i.e., the differential entropy with respect to the surface area measure $\lambda$. We therefore will derive next an upper bound on the fading number that is easier to evaluate.

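To that end, first note that for an arbitrary constant nonsingular $n_{\mathrm{R}} \times n_{\mathrm{R}}$ matrix $\mathsf{A}$ and an arbitrary constant nonsingular $n_{\mathrm{T}} \times n_{\mathrm{T}}$ matrix $\mathsf{B}$

$$\chi(\mathsf{A} \mathbb{H} \mathsf{B}) = \chi(\mathbb{H}); \tag{39}$$

see [1, Lemma 4.7], [2, Lemma 6.14]. Second, note that for an arbitrary random unit vector $\hat{\mathbf{Y}} \in \mathbb{C}^{n_{\mathrm{R}}}$

$$h_\lambda(\hat{\mathbf{Y}}) \le \log c_{n_{\mathrm{R}}} = \log \frac{2\pi^{n_{\mathrm{R}}}}{\Gamma(n_{\mathrm{R}})} \tag{40}$$

where $c_{n_{\mathrm{R}}}$ denotes the surface area of the unit sphere in $\mathbb{C}^{n_{\mathrm{R}}}$ as defined in (15) and where the upper bound is achieved with equality only if $\hat{\mathbf{Y}}$ is uniformly distributed on the sphere, i.e., $\hat{\mathbf{Y}}$ is isotropically distributed. Using these two observations we get the following upper bound on the fading number.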

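Corollary 8: The fading number of a memoryless MIMO fading channel as given in Theorem 7 can be upper bounded as follows:

$$\chi(\mathbb{H}) \le n_{\mathrm{R}} \log \pi - \log \Gamma(n_{\mathrm{R}}) + \inf_{\mathsf{A}, \mathsf{B}} \sup_{\hat{\mathbf{x}}} \Bigl\{ n_{\mathrm{R}}\, \mathsf{E}\bigl[\log \|\mathsf{A} \mathbb{H} \mathsf{B} \hat{\mathbf{x}}\|^2\bigr] - h(\mathsf{A} \mathbb{H} \mathsf{B} \hat{\mathbf{x}}) \Bigr\} \tag{41}$$

where the infimum is over all nonsingular $n_{\mathrm{R}} \times n_{\mathrm{R}}$ complex matrices $\mathsf{A}$ and nonsingular $n_{\mathrm{T}} \times n_{\mathrm{T}}$ complex matrices $\mathsf{B}$.

Proof: Using the two observations (39) and (40), we immediately get from Theorem 7

$$\chi(\mathbb{H}) \le \inf_{\mathsf{A}, \mathsf{B}} \sup_{Q_{\hat{\mathbf{X}}}} \Bigl\{ n_{\mathrm{R}} \log \pi - \log \Gamma(n_{\mathrm{R}}) + \mathsf{E}_{Q_{\hat{\mathbf{X}}}}\Bigl[ n_{\mathrm{R}}\, \mathsf{E}\bigl[\log \|\mathsf{A} \mathbb{H} \mathsf{B} \hat{\mathbf{X}}\|^2 \bigm| \hat{\mathbf{X}} = \hat{\mathbf{x}}\bigr] - h(\mathsf{A} \mathbb{H} \mathsf{B} \hat{\mathbf{X}} \mid \hat{\mathbf{X}} = \hat{\mathbf{x}}) \Bigr] \Bigr\}. \tag{42}$$

The result now follows by noting that (41) can always be achieved by choosing $Q_{\hat{\mathbf{X}}}$ in (42) to be the distribution which with probability 1 takes on the value $\hat{\mathbf{x}}$ that achieves the maximum in (41).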

This upper bound is possibly tighter than the upper bound given in [1, Lemma 4.14], [2, Lemma 6.16] because of the additional infimum over $\mathsf{B}$.

V. SOME KNOWN SPECIAL CASES

In this section we will briefly show how some already known results of various fading numbers can be derived as special cases from this new more general result.

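We start with the situation of a fading matrix that is rotation-commutative in the generalized sense, i.e., the fading matrix $\mathbb{H}$ is such that for every constant unitary $n_{\mathrm{T}} \times n_{\mathrm{T}}$ matrix $\mathsf{U}_{\mathrm{t}}$ there exists an $n_{\mathrm{R}} \times n_{\mathrm{R}}$ constant unitary matrix $\mathsf{U}_{\mathrm{r}}$ such that

$$\mathsf{U}_{\mathrm{r}} \mathbb{H} \stackrel{\mathscr{L}}{=} \mathbb{H} \mathsf{U}_{\mathrm{t}} \tag{43}$$

where $\stackrel{\mathscr{L}}{=}$ stands for "has the same law"; and for every constant unitary $n_{\mathrm{R}} \times n_{\mathrm{R}}$ matrix $\mathsf{U}_{\mathrm{r}}$ there exists a constant unitary $n_{\mathrm{T}} \times n_{\mathrm{T}}$ matrix $\mathsf{U}_{\mathrm{t}}$ such that (43) holds [1, Definition 4.37], [2, Definition 6.37].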

The property of rotation-commutativity for random matrices is a generalization of the isotropic distribution of random vectors, i.e., we have the following lemma.

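Lemma 9: Let $\mathbb{H}$ be rotation-commutative in the generalized sense. Then the following two statements hold.

• If $\hat{\mathbf{X}} \in \mathbb{C}^{n_{\mathrm{T}}}$ is an isotropically distributed random vector that is independent of $\mathbb{H}$, then $\mathbb{H}\hat{\mathbf{X}} \in \mathbb{C}^{n_{\mathrm{R}}}$ is isotropically distributed.

• If $\hat{\mathbf{e}}, \hat{\mathbf{e}}' \in \mathbb{C}^{n_{\mathrm{T}}}$ are two constant unit vectors, then

$$\|\mathbb{H}\hat{\mathbf{e}}\| \stackrel{\mathscr{L}}{=} \|\mathbb{H}\hat{\mathbf{e}}'\|, \quad \|\hat{\mathbf{e}}\| = \|\hat{\mathbf{e}}'\| = 1 \tag{44}$$

$$h(\mathbb{H}\hat{\mathbf{e}}) = h(\mathbb{H}\hat{\mathbf{e}}'), \quad \|\hat{\mathbf{e}}\| = \|\hat{\mathbf{e}}'\| = 1. \tag{45}$$

Proof: For a proof see, e.g., [1, Lemma 4.38], [2, Lemma 6.38].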

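From Lemma 9 it immediately follows that in the situation of rotation-commutative fading the only term in the expression of the fading number (37) that depends on $Q_{\hat{\mathbf{X}}}$ is

$$h_\lambda\Bigl(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\Bigr).$$

This entropy is maximized if $\mathbb{H}\hat{\mathbf{X}} / \|\mathbb{H}\hat{\mathbf{X}}\|$ is uniformly distributed on the surface of the $n_{\mathrm{R}}$-dimensional complex unit sphere, which can be achieved according to Lemma 9 by the choice of an isotropic distribution for $Q_{\hat{\mathbf{X}}}$. Then according to (14) and (15)

$$h_\lambda\Bigl(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\Bigr) = \log \frac{2\pi^{n_{\mathrm{R}}}}{\Gamma(n_{\mathrm{R}})}. \tag{46}$$

The expression of the fading number

$$\chi(\mathbb{H}) = \sup_{Q_{\hat{\mathbf{X}}}} \Bigl\{ h_\lambda\Bigl(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\Bigr) + n_{\mathrm{R}}\, \mathsf{E}\bigl[\log \|\mathbb{H}\hat{\mathbf{X}}\|^2\bigr] - \log 2 - h(\mathbb{H}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}) \Bigr\} \tag{37}$$

then reduces to (5):

$$\chi(\mathbb{H}) = \log \frac{2\pi^{n_{\mathrm{R}}}}{\Gamma(n_{\mathrm{R}})} - \log 2 + n_{\mathrm{R}}\, \mathsf{E}\bigl[\log \|\mathbb{H}\hat{\mathbf{e}}\|^2\bigr] - h(\mathbb{H}\hat{\mathbf{e}}) \tag{47}$$

where $\hat{\mathbf{e}}$ is an arbitrary constant unit vector in $\mathbb{C}^{n_{\mathrm{T}}}$.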

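In case of a SIMO fading channel, the direction vector $\hat{\mathbf{X}}$ reduces to a phase term $e^{i\Phi}$. From Lemma 5 we know that an optimal choice of $e^{i\Phi}$ is circularly symmetric, such that (37) becomes

$$\chi(\mathbf{H}) = h_\lambda\bigl(\hat{\mathbf{H}} e^{i\Theta}\bigr) + n_{\mathrm{R}}\, \mathsf{E}\bigl[\log \|\mathbf{H}\|^2\bigr] - \log 2 - h(\mathbf{H}). \tag{48}$$

Before we continue with the MISO case, we would like to remark that the only term in (37) that depends on the distribution of the phase of each component of $\mathbf{X}$ is

$$h_\lambda\Bigl(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\Bigr).$$

Since we know from Lemma 5 that $\hat{\mathbf{X}}$ is circularly symmetric, we can therefore equivalently write

$$h_\lambda\Bigl(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\Bigr) = h_\lambda\Bigl(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{i\Theta}\Bigr). \tag{49}$$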

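Turning to the MISO case, note that the distribution of

$$\frac{\mathbf{H}^{\mathsf{T}}\hat{\mathbf{X}}}{|\mathbf{H}^{\mathsf{T}}\hat{\mathbf{X}}|} e^{i\Theta}$$

is identical to the distribution of $e^{i\Theta}$, independently of the distribution of $\mathbf{H}$ and $\hat{\mathbf{X}}$. Hence,

$$h_\lambda\Bigl(\frac{\mathbf{H}^{\mathsf{T}}\hat{\mathbf{X}}}{|\mathbf{H}^{\mathsf{T}}\hat{\mathbf{X}}|} e^{i\Theta}\Bigr) = h_\lambda\bigl(e^{i\Theta}\bigr) = \log 2\pi. \tag{50}$$

The fading number (37) then becomes

$$\chi(\mathbf{H}^{\mathsf{T}}) = \sup_{Q_{\hat{\mathbf{X}}}} \Bigl\{ \log 2\pi + \mathsf{E}\bigl[\log |\mathbf{H}^{\mathsf{T}}\hat{\mathbf{X}}|^2\bigr] - \log 2 - h(\mathbf{H}^{\mathsf{T}}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}) \Bigr\} \tag{51}$$

$$\phantom{\chi(\mathbf{H}^{\mathsf{T}})} = \sup_{Q_{\hat{\mathbf{X}}}} \mathsf{E}_{Q_{\hat{\mathbf{X}}}}\Bigl[ \log \pi + \mathsf{E}\bigl[\log |\mathbf{H}^{\mathsf{T}}\hat{\mathbf{x}}|^2 \bigm| \hat{\mathbf{X}} = \hat{\mathbf{x}}\bigr] - h(\mathbf{H}^{\mathsf{T}}\hat{\mathbf{x}} \mid \hat{\mathbf{X}} = \hat{\mathbf{x}}) \Bigr] \tag{52}$$

$$\phantom{\chi(\mathbf{H}^{\mathsf{T}})} \le \sup_{\hat{\mathbf{x}}} \Bigl\{ \log \pi + \mathsf{E}\bigl[\log |\mathbf{H}^{\mathsf{T}}\hat{\mathbf{x}}|^2\bigr] - h(\mathbf{H}^{\mathsf{T}}\hat{\mathbf{x}}) \Bigr\} \tag{53}$$

which can be achieved by a distribution $Q_{\hat{\mathbf{X}}}$ that with probability 1 takes on the value $\hat{\mathbf{x}}$ that achieves the fading number in (53).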

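Finally, the SISO case is a combination of the arguments of the SIMO and MISO cases, i.e., using

$$h_\lambda\bigl(e^{i\Theta}\bigr) = \log 2\pi \tag{54}$$

we get

$$\chi(H) = \log 2\pi + \mathsf{E}\bigl[\log |H|^2\bigr] - \log 2 - h(H) \tag{55}$$

$$\phantom{\chi(H)} = \log \pi + \mathsf{E}\bigl[\log |H|^2\bigr] - h(H). \tag{56}$$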
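As a quick worked example of the SISO expression (56), assume (purely for illustration) Rayleigh fading, i.e., $H \sim \mathcal{N}_{\mathbb{C}}(0, 1)$. Then $|H|^2$ is exponentially distributed with unit mean, so that $\mathsf{E}[\log |H|^2] = -\gamma$ with $\gamma \approx 0.577$ denoting Euler's constant, and the differential entropy of a unit-variance circularly symmetric complex Gaussian is $h(H) = \log \pi e$. Substituting both standard facts gives:

```latex
\begin{align*}
\chi(H) &= \log \pi + \mathsf{E}\bigl[\log |H|^2\bigr] - h(H) \\
        &= \log \pi - \gamma - \log (\pi e) \\
        &= -1 - \gamma \approx -1.577 \text{ nats.}
\end{align*}
```

The negative value is not a contradiction: the fading number is only the constant offset next to the double-logarithmic term in (1), and the expansion is accurate only at high SNR.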

VI. GAUSSIAN FADING

The evaluation of the fading number is rather difficult even for the usually simpler situation of Gaussian fading processes. However, we are able to give the exact value for some important special cases, and we will give bounds on some others.

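Throughout this section we assume that the fading matrix $\mathbb{H}$ can be written as

$$\mathbb{H} = \mathsf{D} + \tilde{\mathbb{H}} \tag{57}$$

where all components of $\tilde{\mathbb{H}}$ are independent of each other and zero-mean, unit-variance Gaussian distributed, and where $\mathsf{D}$ denotes a constant line-of-sight matrix.

Note that for any constant unitary $n_{\mathrm{R}} \times n_{\mathrm{R}}$ matrix $\mathsf{U}$ and any constant unitary $n_{\mathrm{T}} \times n_{\mathrm{T}}$ matrix $\mathsf{V}$ the law of $\mathsf{U}\tilde{\mathbb{H}}\mathsf{V}$ is identical to the law of $\tilde{\mathbb{H}}$. Therefore, without loss of generality, we may restrict ourselves to matrices $\mathsf{D}$ that are "diagonal," i.e., for $n_{\mathrm{R}} \le n_{\mathrm{T}}$

$$\mathsf{D} = \bigl( \tilde{\mathsf{D}} \;\; \mathsf{0}_{n_{\mathrm{R}} \times (n_{\mathrm{T}} - n_{\mathrm{R}})} \bigr) \tag{58}$$

or, for $n_{\mathrm{R}} > n_{\mathrm{T}}$

$$\mathsf{D} = \begin{pmatrix} \tilde{\mathsf{D}} \\ \mathsf{0}_{(n_{\mathrm{R}} - n_{\mathrm{T}}) \times n_{\mathrm{T}}} \end{pmatrix} \tag{59}$$

where $\tilde{\mathsf{D}}$ is a $\min\{n_{\mathrm{R}}, n_{\mathrm{T}}\} \times \min\{n_{\mathrm{R}}, n_{\mathrm{T}}\}$ diagonal matrix with the singular values of $\mathsf{D}$ on the diagonal.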

A. Scalar Line-of-Sight Matrix

We start with a scalar line-of-sight matrix, i.e., we assume $\tilde{\mathsf{D}} = d\,\mathsf{I}$ where $\mathsf{I}$ denotes the identity matrix.

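Under these assumptions the fading number has been known already for $n_{\mathrm{R}} = n_{\mathrm{T}} = m$, in which case the fading matrix is rotation-commutative [1], [2]:

$$\chi(\mathbb{H}) = m\, g_m(|d|^2) - m - \log \Gamma(m). \tag{60}$$

Here $g_m(\cdot)$ is a continuous, monotonically increasing, concave function defined for $m \in \mathbb{N}$ as

$$g_m(\xi) \triangleq \begin{cases} \displaystyle \log(\xi) - \operatorname{Ei}(-\xi) + \sum_{j=1}^{m-1} (-1)^j \Bigl[ e^{-\xi} (j-1)! - \frac{(m-1)!}{j\,(m-1-j)!} \Bigr] \Bigl(\frac{1}{\xi}\Bigr)^{j}, & \xi > 0 \\[2ex] \psi(m), & \xi = 0 \end{cases} \tag{61}$$

where $\operatorname{Ei}(\cdot)$ denotes the exponential integral function defined as

$$\operatorname{Ei}(-x) \triangleq -\int_{x}^{\infty} \frac{e^{-t}}{t}\, dt, \quad x > 0 \tag{62}$$

and $\psi(\cdot)$ is Euler's psi function given by

$$\psi(m) \triangleq -\gamma + \sum_{j=1}^{m-1} \frac{1}{j} \tag{63}$$

with $\gamma \approx 0.577$ denoting Euler's constant. This function $g_m(\cdot)$ is the expected value of the logarithm of a noncentral chi-square random variable, i.e., for some Gaussian random variables $\{U_j\}_{j=1}^{m} \sim \text{IID } \mathcal{N}_{\mathbb{C}}(0, 1)$ and for some complex constants $\{\mu_j\}_{j=1}^{m}$ we have

$$\mathsf{E}\Bigl[\log \sum_{j=1}^{m} |U_j + \mu_j|^2\Bigr] = g_m(s^2) \tag{64}$$

where

$$s^2 \triangleq \sum_{j=1}^{m} |\mu_j|^2 \tag{65}$$

(see [9], [1, Lemma 10.1], [2, Lemma A.6] for more details and a proof). We would like to emphasize that $g_m(\xi)$ is continuous for all $\xi \ge 0$, i.e., in particular

$$\lim_{\xi \downarrow 0} \Bigl\{ \log(\xi) - \operatorname{Ei}(-\xi) + \sum_{j=1}^{m-1} (-1)^j \Bigl[ e^{-\xi} (j-1)! - \frac{(m-1)!}{j\,(m-1-j)!} \Bigr] \Bigl(\frac{1}{\xi}\Bigr)^{j} \Bigr\} = \psi(m) \tag{66}$$

for all $m \in \mathbb{N}$. Moreover, for all $m \in \mathbb{N}$ and $\xi \ge 0$

$$\log \xi - \operatorname{Ei}(-\xi) \le g_m(\xi) \le \log(m + \xi). \tag{67}$$

A derivation of (67) can be found in Appendix B.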
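The function $g_m(\cdot)$ and the fading number (60) can be evaluated with the Python standard library alone. The sketch below is a minimal numerical illustration, not the author's implementation: the names `expint_e1`, `g`, and `chi_scalar_los` are hypothetical, and the exponential integral is computed from its standard power series, which is an assumption that is adequate only for moderate arguments (roughly up to 10) because of cancellation.

```python
import math

EULER_GAMMA = 0.5772156649015329


def expint_e1(x: float, terms: int = 200) -> float:
    """E1(x) = -Ei(-x) for x > 0 from the power series
    E1(x) = -gamma - log(x) - sum_{k>=1} (-x)^k / (k * k!)."""
    total, power_over_fact = 0.0, 1.0
    for k in range(1, terms + 1):
        power_over_fact *= -x / k        # now equals (-x)^k / k!
        total -= power_over_fact / k     # subtract (-x)^k / (k * k!)
    return -EULER_GAMMA - math.log(x) + total


def psi(m: int) -> float:
    """Euler's psi function at a positive integer, as in (63)."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, m))


def g(m: int, xi: float) -> float:
    """g_m(xi) as in (61), for a positive integer m and xi >= 0."""
    if xi == 0.0:
        return psi(m)
    value = math.log(xi) + expint_e1(xi)  # log(xi) - Ei(-xi)
    for j in range(1, m):
        bracket = (math.exp(-xi) * math.factorial(j - 1)
                   - math.factorial(m - 1) / (j * math.factorial(m - 1 - j)))
        value += (-1) ** j * bracket / xi ** j
    return value


def chi_scalar_los(m: int, d: complex) -> float:
    """Fading number (60) for m transmit and m receive antennas."""
    return m * g(m, abs(d) ** 2) - m - math.lgamma(m)


for m in (1, 2, 4):
    print(m, chi_scalar_los(m, 0.0), chi_scalar_los(m, 2.0))
```

For very small positive arguments the alternating sum in (61) suffers from cancellation in floating point, so a production implementation would likely switch to a series expansion around zero; that refinement is left out here.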

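We now consider the case where $n_{\mathrm{R}} \le n_{\mathrm{T}}$:

Corollary 10: Assume $n_{\mathrm{R}} \le n_{\mathrm{T}}$ and a Gaussian fading matrix as given in (57). Let the line-of-sight matrix $\mathsf{D}$ be given as

$$\mathsf{D} = d \bigl( \mathsf{I}_{n_{\mathrm{R}}} \;\; \mathsf{0}_{n_{\mathrm{R}} \times (n_{\mathrm{T}} - n_{\mathrm{R}})} \bigr). \tag{68}$$

Then

$$\chi(\mathbb{H}) = n_{\mathrm{R}}\, g_{n_{\mathrm{R}}}(|d|^2) - n_{\mathrm{R}} - \log \Gamma(n_{\mathrm{R}}) \tag{69}$$

where $g_m(\cdot)$ is defined in (61).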

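Proof: We write for the unit vector $\hat{\mathbf{X}}$

$$\hat{\mathbf{X}} = \begin{pmatrix} \boldsymbol{\Xi} \\ \boldsymbol{\Xi}' \end{pmatrix} \tag{70}$$

where $\boldsymbol{\Xi} \in \mathbb{C}^{n_{\mathrm{R}}}$ and $\boldsymbol{\Xi}' \in \mathbb{C}^{n_{\mathrm{T}} - n_{\mathrm{R}}}$. Then

$$\mathbb{H}\hat{\mathbf{X}} = \mathsf{D}\hat{\mathbf{X}} + \tilde{\mathbb{H}}\hat{\mathbf{X}} = d\,\boldsymbol{\Xi} + \tilde{\mathbf{H}} \tag{71}$$

where $\tilde{\mathbf{H}} \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \mathsf{I}_{n_{\mathrm{R}}})$. Hence

$$h(\mathbb{H}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}) = h(\tilde{\mathbf{H}}) = n_{\mathrm{R}} \log \pi e \tag{72}$$

$$n_{\mathrm{R}}\, \mathsf{E}\bigl[\log \|\mathbb{H}\hat{\mathbf{X}}\|^2\bigr] = n_{\mathrm{R}}\, \mathsf{E}\bigl[g_{n_{\mathrm{R}}}(|d|^2 \|\boldsymbol{\Xi}\|^2)\bigr] \le n_{\mathrm{R}}\, g_{n_{\mathrm{R}}}(|d|^2) \tag{73}$$

$$h_\lambda\Bigl(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\Bigr) \le \log \frac{2\pi^{n_{\mathrm{R}}}}{\Gamma(n_{\mathrm{R}})}. \tag{74}$$

Here, the equality in (73) follows from the fact that $\|d\,\boldsymbol{\Xi} + \tilde{\mathbf{H}}\|^2$ is noncentral chi-square distributed and from (64); the inequality in (73) follows from the monotonicity of $g_m(\cdot)$ and is tight if $\|\boldsymbol{\Xi}\| = 1$, i.e., $\boldsymbol{\Xi}' = \mathbf{0}$; and the inequality in (74) follows from (14) and (15) and is tight if $\boldsymbol{\Xi}$ is uniformly distributed on the unit sphere in $\mathbb{C}^{n_{\mathrm{R}}}$ so that $\mathbb{H}\hat{\mathbf{X}}$ is isotropically distributed. The result now follows from Theorem 7.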

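The case $n_{\mathrm{R}} > n_{\mathrm{T}}$ is more difficult since then (74) is in general not tight. We will only state an upper bound.

Proposition 11: Assume $n_{\mathrm{R}} > n_{\mathrm{T}}$ and a Gaussian fading matrix as given in (57). Let the line-of-sight matrix $\mathsf{D}$ be given as

$$\mathsf{D} = d \begin{pmatrix} \mathsf{I}_{n_{\mathrm{T}}} \\ \mathsf{0}_{(n_{\mathrm{R}} - n_{\mathrm{T}}) \times n_{\mathrm{T}}} \end{pmatrix}. \tag{75}$$

Then

$$\chi(\mathbb{H}) \le n_{\mathrm{T}} \log\Bigl(1 + \frac{|d|^2}{n_{\mathrm{T}}}\Bigr) + n_{\mathrm{R}} \log n_{\mathrm{R}} - n_{\mathrm{R}} - \log \Gamma(n_{\mathrm{R}}). \tag{76}$$

Proof: This result is a special case of Proposition 13 and has been published before in [1, Eq. (128)], [2, Eq. (6.224)].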

B. General Line-of-Sight Matrix

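Next we assume Gaussian fading as defined in (57) with a general line-of-sight matrix $\mathsf{D}$ having singular values $d_1, \ldots, d_{\min\{n_{\mathrm{R}}, n_{\mathrm{T}}\}}$. Hence, $\tilde{\mathsf{D}}$, defined in (58) and (59), is given as

$$\tilde{\mathsf{D}} = \operatorname{diag}\left(d_1, \ldots, d_{\min\{n_{\mathrm{R}}, n_{\mathrm{T}}\}}\right) \tag{77}$$

where $|d_1| \ge |d_2| \ge \cdots \ge |d_{\min\{n_{\mathrm{R}}, n_{\mathrm{T}}\}}| > 0$.

We again start with the case $n_{\mathrm{R}} \le n_{\mathrm{T}}$.

Corollary 12: Assume $n_{\mathrm{R}} \le n_{\mathrm{T}}$ and a Gaussian fading matrix as given in (57). Let the line-of-sight matrix $\mathsf{D}$ have singular values $d_1, \ldots, d_{n_{\mathrm{R}}}$, where $|d_1| \ge |d_2| \ge \cdots \ge |d_{n_{\mathrm{R}}}| > 0$. Then

$$\chi(\mathbb{H}) \le n_{\mathrm{R}}\, g_{n_{\mathrm{R}}}\left(\|\mathsf{D}\|^2\right) - n_{\mathrm{R}} - \log \Gamma(n_{\mathrm{R}}) \tag{78}$$

where $g_m(\cdot)$ is given in (61) and where $\|\mathsf{D}\|^2 = |d_1|^2$.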

Proof: A proof is given in Appendix C.

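The situation $n_{\mathrm{R}} > n_{\mathrm{T}}$ is again more complicated. We include this case in a new upper bound based on (41) which holds independently of the particular relation between $n_{\mathrm{R}}$ and $n_{\mathrm{T}}$.

Proposition 13: Assume a Gaussian fading matrix as given in (57) and let the line-of-sight matrix $\mathsf{D}$ be general with singular values $d_1, \ldots, d_{\min\{n_{\mathrm{R}}, n_{\mathrm{T}}\}}$, where $|d_1| \ge |d_2| \ge \cdots \ge |d_{\min\{n_{\mathrm{R}}, n_{\mathrm{T}}\}}| > 0$. Then the fading number is upper bounded as follows:

$$\chi(\mathbb{H}) \le \min\{n_{\mathrm{R}}, n_{\mathrm{T}}\} \log\left(\frac{\delta^2}{\min\{n_{\mathrm{R}}, n_{\mathrm{T}}\}}\right) + n_{\mathrm{R}} \log n_{\mathrm{R}} - n_{\mathrm{R}} - \log \Gamma(n_{\mathrm{R}}) \tag{79}$$

where

$$\delta^2 \triangleq \left(|d_1|^2 \cdots |d_{\min\{n_{\mathrm{R}}, n_{\mathrm{T}}\}}|^2\right)^{1/\min\{n_{\mathrm{R}}, n_{\mathrm{T}}\}} \cdot \left(1 + \frac{1}{|d_1|^2} + \cdots + \frac{1}{|d_{\min\{n_{\mathrm{R}}, n_{\mathrm{T}}\}}|^2}\right). \tag{80}$$

Proof: A proof is given in Appendix D.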
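To make the bound of Proposition 13 concrete, the following sketch evaluates $\delta^2$ from (80) and the right-hand side of (79) in nats. It is an illustration only: the function names `delta_squared` and `fading_number_upper_bound` are hypothetical, and the singular values are assumed to be given as a plain list of nonzero numbers.

```python
import math


def delta_squared(singular_values):
    """delta^2 from (80): geometric mean of the |d_i|^2 times (1 + sum of 1/|d_i|^2)."""
    n = len(singular_values)
    squares = [abs(d) ** 2 for d in singular_values]
    geometric_mean = math.exp(sum(math.log(s) for s in squares) / n)
    return geometric_mean * (1.0 + sum(1.0 / s for s in squares))


def fading_number_upper_bound(singular_values, n_r, n_t):
    """Right-hand side of (79), in nats."""
    n_min = min(n_r, n_t)
    if len(singular_values) != n_min:
        raise ValueError("expected min(n_r, n_t) nonzero singular values")
    d2 = delta_squared(singular_values)
    return (n_min * math.log(d2 / n_min)
            + n_r * math.log(n_r) - n_r - math.lgamma(n_r))


if __name__ == "__main__":
    # Scalar line-of-sight matrix with d = 2, n_R = 2, n_T = 3.
    print(delta_squared([2.0, 2.0]))
    print(fading_number_upper_bound([2.0, 2.0], n_r=2, n_t=3))
```

For equal singular values $d_1 = \cdots = d$ the expression (80) reduces to $\delta^2 = |d|^2 + \min\{n_{\mathrm{R}}, n_{\mathrm{T}}\}$, so for $n_{\mathrm{R}} \le n_{\mathrm{T}}$ the bound (79) coincides with (69) after replacing $g_{n_{\mathrm{R}}}(|d|^2)$ by its upper bound $\log(n_{\mathrm{R}} + |d|^2)$ from (67).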

VII. PROOF OF THE MAIN RESULT

The proof of Theorem 7 consists of two parts. First, we derive an upper bound to the fading number assuming an average-power constraint (27) on the input. The key ingredients here are the preliminary results from Section IV-A.

In a second part we then show that this upper bound can actually be achieved by an input that satisfies the peak-power constraint (26). Since a peak-power constraint is more restrictive than the corresponding average-power constraint, the theorem follows.

Because the proof is rather technical, we will give a short overview to clarify the main ideas.

The upper bound relies strongly on Lemma 3 which says that the input can be assumed to take on large values only, i.e., at high SNR the additive noise will become negligible so that we can bound

$$I(\mathbf{X}; \mathbf{Y}) \lesssim I(\mathbf{X}; \mathbb{H}\mathbf{X}). \tag{81}$$

This term is then split into a term that only considers the magnitude of $\mathbb{H}\mathbf{X}$ and a term that takes into account the direction:

$$I(\mathbf{X}; \mathbb{H}\mathbf{X}) = I(\mathbf{X}; \|\mathbb{H}\mathbf{X}\|) + I\left(\mathbf{X}; \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, \|\mathbb{H}\mathbf{X}\|\right). \tag{82}$$

For the first term, which is related to MISO fading, we then use the bounding technique of Lemma 4.

Because Lemma 3 only holds in the limit when $\mathcal{E}$ tends to infinity, we introduce an event $\|\mathbf{X}\|^2 > \mathcal{E}_0$ for some fixed $\mathcal{E}_0 \ge 0$ and condition everything on this event.

To derive a lower bound on capacity we choose a specific input distribution of the form

$$\mathbf{X} = R \cdot \hat{\mathbf{X}} \tag{83}$$

where the distribution of $R$ is such that it achieves the fading number of an SIMO fading channel and where the distribution of $\hat{\mathbf{X}}$ is independent of $R$ and will be only specified at the very end of the derivation (it will be chosen to maximize the fading number). We then split the mutual information into two terms:

$$I(\mathbf{X}; \mathbf{Y}) = I(R; \mathbf{Y} \mid \hat{\mathbf{X}}) + I(\hat{\mathbf{X}}; \mathbf{Y}). \tag{84}$$

The first term (almost) corresponds to an SIMO fading channel with side-information for which the fading number is known. The second term is treated separately.

A. Derivation of an Upper Bound

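In the following we will use the notation $R \triangleq \|\mathbf{X}\|$ to denote the magnitude of the input vector $\mathbf{X}$, i.e., we have $\mathbf{X} = R \cdot \hat{\mathbf{X}}$. Note that in this section we are not allowed to assume that $R$ is independent of $\hat{\mathbf{X}}$. From Lemma 3, we know that the capacity-achieving input distribution must escape to infinity. Hence, we fix an arbitrary finite $\mathcal{E}_0 \ge 0$ and define an indicator random variable as follows:

$$E \triangleq \begin{cases} 1, & \text{if } \|\mathbf{X}\|^2 \ge \mathcal{E}_0 \\ 0, & \text{otherwise.} \end{cases} \tag{85}$$

Let

$$p \triangleq \Pr[E = 1] = \Pr\left[\|\mathbf{X}\|^2 \ge \mathcal{E}_0\right] \tag{86}$$

where we know from Lemma 3 that

$$\lim_{\mathcal{E} \uparrow \infty} p = 1. \tag{87}$$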

We now bound as follows:

\begin{align}
I(\mathbf{X}; \mathbf{Y}) &\le I(\mathbf{X}, E; \mathbf{Y}) \tag{88}\\
&= I(E; \mathbf{Y}) + I(\mathbf{X}; \mathbf{Y} \mid E) \tag{89}\\
&= H(E) - H(E \mid \mathbf{Y}) + I(\mathbf{X}; \mathbf{Y} \mid E) \tag{90}\\
&\le H(E) + I(\mathbf{X}; \mathbf{Y} \mid E) \tag{91}\\
&= H_b(p) + p\, I(\mathbf{X}; \mathbf{Y} \mid E = 1) + (1 - p)\, I(\mathbf{X}; \mathbf{Y} \mid E = 0) \tag{92}\\
&\le H_b(p) + I(\mathbf{X}; \mathbf{Y} \mid E = 1) + (1 - p)\, C(\mathcal{E}_0) \tag{93}
\end{align}

where

$$H_b(\xi) \triangleq -\xi \log \xi - (1 - \xi) \log(1 - \xi) \tag{94}$$

is the binary entropy function. Here, (88) follows from adding an additional random variable to mutual information; the subsequent two equalities follow from the chain rule and from the definition of mutual information (notice that we use entropy and not differential entropy because $E$ is a binary random variable); in the subsequent inequality we rely on the nonnegativity of entropy; and the last inequality follows from bounding $p \le 1$ and from upper bounding the mutual information term by the capacity $C$ for the available power which, conditional on $E = 0$, is $\mathcal{E}_0$.

We remark that even though $C(\mathcal{E}_0)$ is unknown, we know that it is finite and independent of $\mathcal{E}$ so that from (87) we have

$$\lim_{\mathcal{E} \uparrow \infty} \left\{ H_b(p) + (1 - p)\, C(\mathcal{E}_0) \right\} = 0. \tag{95}$$

We continue with the second term of (93) as follows:

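\begin{align}
I(\mathbf{X}; \mathbf{Y} \mid E = 1) &= I(\mathbf{X}; \mathbb{H}\mathbf{X} + \mathbf{Z} \mid E = 1) \tag{96}\\
&\le I(\mathbf{X}; \mathbb{H}\mathbf{X} + \mathbf{Z}, \mathbf{Z} \mid E = 1) \tag{97}\\
&= I(\mathbf{X}; \mathbb{H}\mathbf{X}, \mathbf{Z} \mid E = 1) \tag{98}\\
&= I(\mathbf{X}; \mathbb{H}\mathbf{X} \mid E = 1) + I(\mathbf{X}; \mathbf{Z} \mid \mathbb{H}\mathbf{X}, E = 1) \tag{99}\\
&= I(\mathbf{X}; \mathbb{H}\mathbf{X} \mid E = 1) \tag{100}\\
&= I\left(\mathbf{X}; \|\mathbb{H}\mathbf{X}\|, \frac{\mathbb{H}\mathbf{X}}{\|\mathbb{H}\mathbf{X}\|} \,\middle|\, E = 1\right) \tag{101}\\
&= I\left(\mathbf{X}; \|\mathbb{H}\mathbf{X}\|, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, E = 1\right) \tag{102}\\
&= I(\mathbf{X}; \|\mathbb{H}\mathbf{X}\| \mid E = 1) + I\left(\mathbf{X}; \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, \|\mathbb{H}\mathbf{X}\|, E = 1\right) \tag{103}\\
&\le I\left(\mathbf{X}; \|\mathbb{H}\mathbf{X}\|, e^{\mathrm{i}\Theta} \mid E = 1\right) + I\left(\mathbf{X}; \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, \|\mathbb{H}\mathbf{X}\|, E = 1\right) \tag{104}\\
&= I\left(\mathbf{X}; \|\mathbb{H}\mathbf{X}\| e^{\mathrm{i}\Theta} \mid E = 1\right) + I\left(\mathbf{X}; \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, \|\mathbb{H}\mathbf{X}\|, E = 1\right). \tag{105}
\end{align}

Here, (97) follows from adding an additional random vector $\mathbf{Z}$ to the argument of the mutual information; the subsequent equality from subtracting the known vector $\mathbf{Z}$ from $\mathbf{Y}$; the subsequent two equalities follow from the chain rule and the independence between the noise and all other random quantities; then we split $\mathbb{H}\mathbf{X}$ into magnitude and direction vector and use the chain rule again; (104) follows from adding a random variable to mutual information: we introduce $e^{\mathrm{i}\Theta}$ that is independent of all the other random quantities and that is uniformly distributed on the complex unit circle; and the last equality holds because from $\|\mathbb{H}\mathbf{X}\| e^{\mathrm{i}\Theta}$ we can easily get back $\|\mathbb{H}\mathbf{X}\|$ and $e^{\mathrm{i}\Theta}$.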

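We next apply Lemma 4 to the first term in (105), i.e., we choose $S = \mathbf{X}$ and $T = \|\mathbb{H}\mathbf{X}\| e^{\mathrm{i}\Theta}$. Note that we need to condition everything on the event $E = 1$. We get

\begin{align}
I\left(\mathbf{X}; \|\mathbb{H}\mathbf{X}\| e^{\mathrm{i}\Theta} \mid E = 1\right) &\le -h\left(\|\mathbb{H}\mathbf{X}\| e^{\mathrm{i}\Theta} \mid \mathbf{X}, E = 1\right) + \log \pi + \alpha \log \beta + \log \Gamma\left(\alpha, \frac{\nu}{\beta}\right) \notag\\
&\quad + (1 - \alpha)\, \mathsf{E}\left[\log\left(\|\mathbb{H}\mathbf{X}\|^2 + \nu\right) \,\middle|\, E = 1\right] + \frac{1}{\beta}\, \mathsf{E}\left[\|\mathbb{H}\mathbf{X}\|^2 \,\middle|\, E = 1\right] + \frac{\nu}{\beta} \tag{106}
\end{align}

where $\alpha, \beta > 0$, and $\nu \ge 0$ can be chosen freely, but must not depend on $\mathbf{X}$.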

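Notice that from a conditional version of Lemma 1 with $m = 1$ it follows that

\begin{align}
h\left(\|\mathbb{H}\mathbf{X}\| e^{\mathrm{i}\Theta} \mid \mathbf{X} = \mathbf{x}, E = 1\right) &= h_\lambda\left(e^{\mathrm{i}\Theta} \mid \mathbf{X} = \mathbf{x}, E = 1\right) + h\left(\|\mathbb{H}\mathbf{X}\| \mid e^{\mathrm{i}\Theta}, \mathbf{X} = \mathbf{x}, E = 1\right) \notag\\
&\quad + \mathsf{E}\left[\log \|\mathbb{H}\mathbf{X}\| \mid \mathbf{X} = \mathbf{x}, E = 1\right] \tag{107}\\
&= \log 2\pi + h\left(\|\mathbb{H}\mathbf{X}\| \mid \mathbf{X} = \mathbf{x}, E = 1\right) + \mathsf{E}\left[\log \|\mathbb{H}\mathbf{X}\| \mid \mathbf{X} = \mathbf{x}, E = 1\right] \tag{108}
\end{align}

where we have used that $e^{\mathrm{i}\Theta}$ is independent of all other random quantities and uniformly distributed on the unit circle. Taking the expectation over $\mathbf{X}$ conditional on $E = 1$ we then obtain

\begin{align}
h\left(\|\mathbb{H}\mathbf{X}\| e^{\mathrm{i}\Theta} \mid \mathbf{X}, E = 1\right) &= \log 2\pi + h\left(\|\mathbb{H}\mathbf{X}\| \mid \mathbf{X}, E = 1\right) + \mathsf{E}\left[\log \|\mathbb{H}\mathbf{X}\| \mid E = 1\right] \tag{109}\\
&= \log 2\pi + h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \mid \hat{\mathbf{X}}, R, E = 1\right) + \mathsf{E}\left[\log\left(\|\mathbb{H}\hat{\mathbf{X}}\| \cdot R\right) \mid E = 1\right] \tag{110}\\
&= \log 2\pi + h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \mid \hat{\mathbf{X}}, R, E = 1\right) + \mathsf{E}\left[\log R \mid E = 1\right] \notag\\
&\quad + \mathsf{E}\left[\log \|\mathbb{H}\hat{\mathbf{X}}\| \mid E = 1\right] + \mathsf{E}\left[\log R \mid E = 1\right] \tag{111}\\
&= \log 2\pi + h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \mid \hat{\mathbf{X}}, E = 1\right) + 2\, \mathsf{E}\left[\log R \mid E = 1\right] + \mathsf{E}\left[\log \|\mathbb{H}\hat{\mathbf{X}}\| \mid E = 1\right] \tag{112}
\end{align}

where the second equality follows from the definition of $R \triangleq \|\mathbf{X}\|$; where the third equality follows from the scaling property of entropy with a real argument; and where the last equality follows because given $\hat{\mathbf{X}}$, $\|\mathbb{H}\hat{\mathbf{X}}\|$ is independent of $R$.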

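Next we assume $0 < \alpha < 1$ such that $1 - \alpha > 0$. Then we define

$$\epsilon_\nu \triangleq \sup_{\|\mathbf{x}\|^2 \ge \mathcal{E}_0} \left\{ \mathsf{E}\left[\log\left(\|\mathbb{H}\mathbf{x}\|^2 + \nu\right)\right] - \mathsf{E}\left[\log \|\mathbb{H}\mathbf{x}\|^2\right] \right\} \tag{113}$$

such that

\begin{align}
(1 - \alpha)\, \mathsf{E}\left[\log\left(\|\mathbb{H}\mathbf{X}\|^2 + \nu\right) \mid E = 1\right] &= (1 - \alpha)\, \mathsf{E}\left[\log\left(\|\mathbb{H}\mathbf{X}\|^2 + \nu\right) - \log \|\mathbb{H}\mathbf{X}\|^2 \mid E = 1\right] \notag\\
&\quad + (1 - \alpha)\, \mathsf{E}\left[\log \|\mathbb{H}\mathbf{X}\|^2 \mid E = 1\right] \tag{114}\\
&\le (1 - \alpha) \sup_{\|\mathbf{x}\|^2 \ge \mathcal{E}_0} \mathsf{E}\left[\log\left(\|\mathbb{H}\mathbf{x}\|^2 + \nu\right) - \log \|\mathbb{H}\mathbf{x}\|^2\right] \notag\\
&\quad + (1 - \alpha)\, \mathsf{E}\left[\log \|\mathbb{H}\mathbf{X}\|^2 \mid E = 1\right] \tag{115}\\
&= (1 - \alpha)\, \epsilon_\nu + (1 - \alpha)\, \mathsf{E}\left[\log \|\mathbb{H}\mathbf{X}\|^2 \mid E = 1\right] \tag{116}\\
&\le \epsilon_\nu + (1 - \alpha)\, \mathsf{E}\left[\log \|\mathbb{H}\mathbf{X}\|^2 \mid E = 1\right]. \tag{117}
\end{align}

Note that in the first inequality we have made use of the fact that $E = 1$, i.e., that $\|\mathbf{X}\|^2 \ge \mathcal{E}_0$. Finally, we bound

\begin{align}
\frac{1}{\beta}\, \mathsf{E}\left[\|\mathbb{H}\mathbf{X}\|^2 \mid E = 1\right] &= \frac{1}{\beta}\, \mathsf{E}\left[\|\mathbb{H}\hat{\mathbf{X}}\|^2 \cdot R^2 \mid E = 1\right] \tag{118}\\
&\le \frac{1}{\beta}\, \mathsf{E}\left[\sup_{\hat{\mathbf{x}}} \mathsf{E}\left[\|\mathbb{H}\hat{\mathbf{x}}\|^2\right] \cdot R^2 \,\middle|\, E = 1\right] \tag{119}\\
&= \frac{1}{\beta} \sup_{\hat{\mathbf{x}}} \mathsf{E}\left[\|\mathbb{H}\hat{\mathbf{x}}\|^2\right] \cdot \mathsf{E}\left[R^2 \mid E = 1\right] \tag{120}\\
&\le \frac{1}{\beta} \sup_{\hat{\mathbf{x}}} \mathsf{E}\left[\|\mathbb{H}\hat{\mathbf{x}}\|^2\right] \cdot \frac{\mathcal{E}}{p} \tag{121}
\end{align}

where we have used the fact that $R$ needs to satisfy the average-power constraint (27) to get the following bound:

\begin{align}
\mathcal{E} &\ge \mathsf{E}\left[R^2\right] \tag{122}\\
&= p\, \mathsf{E}\left[R^2 \mid E = 1\right] + (1 - p)\, \mathsf{E}\left[R^2 \mid E = 0\right] \tag{123}\\
&\ge p\, \mathsf{E}\left[R^2 \mid E = 1\right]. \tag{124}
\end{align}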

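Plugging (112), (117), and (121) into (106) we obtain

\begin{align}
I\left(\mathbf{X}; \|\mathbb{H}\mathbf{X}\| e^{\mathrm{i}\Theta} \mid E = 1\right) &\le -\log 2\pi - h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \mid \hat{\mathbf{X}}, E = 1\right) - 2\, \mathsf{E}\left[\log R \mid E = 1\right] - \mathsf{E}\left[\log \|\mathbb{H}\hat{\mathbf{X}}\| \mid E = 1\right] \notag\\
&\quad + \log \pi + \alpha \log \beta + \log \Gamma\left(\alpha, \frac{\nu}{\beta}\right) + (1 - \alpha)\, \mathsf{E}\left[\log \|\mathbb{H}\mathbf{X}\|^2 \mid E = 1\right] + \epsilon_\nu \notag\\
&\quad + \frac{1}{\beta} \sup_{\hat{\mathbf{x}}} \mathsf{E}\left[\|\mathbb{H}\hat{\mathbf{x}}\|^2\right] \frac{\mathcal{E}}{p} + \frac{\nu}{\beta}. \tag{125}
\end{align}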

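Next, we continue with the second term in (105):

\begin{align}
I\left(\mathbf{X}; \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, \|\mathbb{H}\mathbf{X}\|, E = 1\right) &= h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, \|\mathbb{H}\mathbf{X}\|, E = 1\right) - h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, \|\mathbb{H}\hat{\mathbf{X}}\| \cdot R, \hat{\mathbf{X}}, R, E = 1\right) \tag{126}\\
&= h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, \|\mathbb{H}\mathbf{X}\|, E = 1\right) - h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, \|\mathbb{H}\hat{\mathbf{X}}\|, \hat{\mathbf{X}}, R, E = 1\right) \tag{127}\\
&\le h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, E = 1\right) - h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, \|\mathbb{H}\hat{\mathbf{X}}\|, \hat{\mathbf{X}}, E = 1\right). \tag{128}
\end{align}

Here, the last inequality follows because conditioning cannot increase entropy and because given $\hat{\mathbf{X}}$ and $\|\mathbb{H}\hat{\mathbf{X}}\|$, the term $\mathbb{H}\hat{\mathbf{X}} / \|\mathbb{H}\hat{\mathbf{X}}\|$ does not depend on $R$.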

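Hence, using (128), (125), and (105) in (93), we get

\begin{align}
I(\mathbf{X}; \mathbf{Y}) &\le H_b(p) + (1 - p)\, C(\mathcal{E}_0) - \log 2\pi - h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \mid \hat{\mathbf{X}}, E = 1\right) - 2\, \mathsf{E}\left[\log R \mid E = 1\right] \notag\\
&\quad - \mathsf{E}\left[\log \|\mathbb{H}\hat{\mathbf{X}}\| \mid E = 1\right] + \log \pi + \alpha \log \beta + \log \Gamma\left(\alpha, \frac{\nu}{\beta}\right) \notag\\
&\quad + (1 - \alpha)\, \mathsf{E}\left[\log \|\mathbb{H}\mathbf{X}\|^2 \mid E = 1\right] + \epsilon_\nu + \frac{1}{\beta} \sup_{\hat{\mathbf{x}}} \mathsf{E}\left[\|\mathbb{H}\hat{\mathbf{x}}\|^2\right] \frac{\mathcal{E}}{p} + \frac{\nu}{\beta} \notag\\
&\quad + h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, E = 1\right) - h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, \|\mathbb{H}\hat{\mathbf{X}}\|, \hat{\mathbf{X}}, E = 1\right) \tag{129}\\
&= H_b(p) + (1 - p)\, C(\mathcal{E}_0) - \log 2 - h\left(\mathbb{H}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}, E = 1\right) + (2 n_{\mathrm{R}} - 1)\, \mathsf{E}\left[\log \|\mathbb{H}\hat{\mathbf{X}}\| \mid E = 1\right] \notag\\
&\quad - 2\, \mathsf{E}\left[\log R \mid E = 1\right] - \mathsf{E}\left[\log \|\mathbb{H}\hat{\mathbf{X}}\| \mid E = 1\right] + \alpha \log \beta + \log \Gamma\left(\alpha, \frac{\nu}{\beta}\right) \notag\\
&\quad + 2\, \mathsf{E}\left[\log \|\mathbb{H}\mathbf{X}\| \mid E = 1\right] - \alpha\, \mathsf{E}\left[\log \|\mathbb{H}\mathbf{X}\|^2 \mid E = 1\right] + \epsilon_\nu \notag\\
&\quad + \frac{1}{\beta} \sup_{\hat{\mathbf{x}}} \mathsf{E}\left[\|\mathbb{H}\hat{\mathbf{x}}\|^2\right] \frac{\mathcal{E}}{p} + \frac{\nu}{\beta} + h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, E = 1\right) \tag{130}\\
&= h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, E = 1\right) - h\left(\mathbb{H}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}, E = 1\right) + n_{\mathrm{R}}\, \mathsf{E}\left[\log \|\mathbb{H}\hat{\mathbf{X}}\|^2 \mid E = 1\right] - \log 2 \notag\\
&\quad + \log \Gamma\left(\alpha, \frac{\nu}{\beta}\right) + \frac{1}{\beta} \sup_{\hat{\mathbf{x}}} \mathsf{E}\left[\|\mathbb{H}\hat{\mathbf{x}}\|^2\right] \frac{\mathcal{E}}{p} + \frac{\nu}{\beta} + \epsilon_\nu + H_b(p) + (1 - p)\, C(\mathcal{E}_0) \notag\\
&\quad + \alpha \left( \log \beta - \mathsf{E}\left[\log \|\mathbb{H}\mathbf{X}\|^2 \mid E = 1\right] \right) \tag{131}\\
&\le h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, E = 1\right) - h\left(\mathbb{H}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}, E = 1\right) + n_{\mathrm{R}}\, \mathsf{E}\left[\log \|\mathbb{H}\hat{\mathbf{X}}\|^2 \mid E = 1\right] - \log 2 \notag\\
&\quad + \log \Gamma\left(\alpha, \frac{\nu}{\beta}\right) + \frac{1}{\beta} \sup_{\hat{\mathbf{x}}} \mathsf{E}\left[\|\mathbb{H}\hat{\mathbf{x}}\|^2\right] \frac{\mathcal{E}}{p} + \frac{\nu}{\beta} + \epsilon_\nu + H_b(p) + (1 - p)\, C(\mathcal{E}_0) \notag\\
&\quad + \alpha \left( \log \beta - \log \mathcal{E}_0 - \xi \right). \tag{132}
\end{align}

Here, (130) follows again from a conditional version of Lemma 1 similar to (107)–(112) which allows us to combine the fourth and the last term in (129); in the subsequent equality we arithmetically rearrange the terms; and the final inequality follows from the following bound:

\begin{align}
\mathsf{E}\left[\log \|\mathbb{H}\mathbf{X}\|^2 \mid E = 1\right] &\ge \inf_{\|\mathbf{x}\|^2 \ge \mathcal{E}_0} \mathsf{E}\left[\log \|\mathbb{H}\mathbf{x}\|^2\right] \tag{133}\\
&= \log \mathcal{E}_0 + \inf_{\hat{\mathbf{x}}} \mathsf{E}\left[\log \|\mathbb{H}\hat{\mathbf{x}}\|^2\right] \tag{134}\\
&\triangleq \log \mathcal{E}_0 + \xi \tag{135}
\end{align}

where the last line should be taken as a definition for $\xi$. Notice that

$$-\infty < \xi < \infty \tag{136}$$

as can be argued as follows: the lower bound on $\xi$ follows from [1, Lemma 6.7f)], [2, Lemma A.15f)] because $h(\mathbb{H}) > -\infty$ and $\mathsf{E}\left[\|\mathbb{H}\|_{\mathrm{F}}^2\right] < \infty$. The upper bound on $\xi$ can be verified using the concavity of the logarithm function and Jensen's inequality.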

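Note that (132) does not depend on the distribution of $R$ anymore, but only on $\hat{\mathbf{X}}$. Hence, we can get an upper bound on capacity by taking the supremum over all possible distributions $Q_{\hat{\mathbf{X}}}$. This then gives us the following upper bound on the fading number:

\begin{align}
\chi(\mathbb{H}) &= \lim_{\mathcal{E} \uparrow \infty} \left\{ C(\mathcal{E}) - \log\left(1 + \log\left(1 + \frac{\mathcal{E}}{\sigma^2}\right)\right) \right\} \tag{137}\\
&= \lim_{\mathcal{E} \uparrow \infty} \left\{ \sup_{Q_{\mathbf{X}}} I(\mathbf{X}; \mathbf{Y}) - \log\left(1 + \log\left(1 + \frac{\mathcal{E}}{\sigma^2}\right)\right) \right\} \tag{138}\\
&\le \lim_{\mathcal{E} \uparrow \infty} \Bigg\{ \sup_{Q_{\hat{\mathbf{X}}}} \Bigg\{ h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h\left(\mathbb{H}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}\right) - \log 2 + n_{\mathrm{R}}\, \mathsf{E}\left[\log \|\mathbb{H}\hat{\mathbf{X}}\|^2\right] \notag\\
&\qquad + \log \Gamma\left(\alpha, \frac{\nu}{\beta}\right) + \frac{1}{\beta} \sup_{\hat{\mathbf{x}}} \mathsf{E}\left[\|\mathbb{H}\hat{\mathbf{x}}\|^2\right] \frac{\mathcal{E}}{p} + \frac{\nu}{\beta} + \epsilon_\nu + H_b(p) + (1 - p)\, C(\mathcal{E}_0) \notag\\
&\qquad + \alpha \left( \log \beta - \log \mathcal{E}_0 - \xi \right) \Bigg\} - \log\left(1 + \log\left(1 + \frac{\mathcal{E}}{\sigma^2}\right)\right) \Bigg\} \tag{139}\\
&= \lim_{\mathcal{E} \uparrow \infty} \Bigg\{ \sup_{Q_{\hat{\mathbf{X}}}} \left\{ h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h\left(\mathbb{H}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}\right) - \log 2 + n_{\mathrm{R}}\, \mathsf{E}\left[\log \|\mathbb{H}\hat{\mathbf{X}}\|^2\right] \right\} \notag\\
&\qquad + \log \Gamma\left(\alpha, \frac{\nu}{\beta}\right) + \frac{1}{\beta} \sup_{\hat{\mathbf{x}}} \mathsf{E}\left[\|\mathbb{H}\hat{\mathbf{x}}\|^2\right] \frac{\mathcal{E}}{p} + \frac{\nu}{\beta} + \epsilon_\nu + H_b(p) + (1 - p)\, C(\mathcal{E}_0) \notag\\
&\qquad + \alpha \left( \log \beta - \log \mathcal{E}_0 - \xi \right) - \log\left(1 + \log\left(1 + \frac{\mathcal{E}}{\sigma^2}\right)\right) \Bigg\} \tag{140}\\
&= \sup_{Q_{\hat{\mathbf{X}}}} \left\{ h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h\left(\mathbb{H}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}\right) - \log 2 + n_{\mathrm{R}}\, \mathsf{E}\left[\log \|\mathbb{H}\hat{\mathbf{X}}\|^2\right] \right\} \notag\\
&\quad + \lim_{\mathcal{E} \uparrow \infty} \Bigg\{ \log \Gamma\left(\alpha, \frac{\nu}{\beta}\right) - \log \frac{1}{\alpha} + \frac{1}{\beta} \sup_{\hat{\mathbf{x}}} \mathsf{E}\left[\|\mathbb{H}\hat{\mathbf{x}}\|^2\right] \frac{\mathcal{E}}{p} + \frac{\nu}{\beta} + \epsilon_\nu + H_b(p) + (1 - p)\, C(\mathcal{E}_0) \notag\\
&\qquad + \alpha \left( \log \beta - \log \mathcal{E}_0 - \xi \right) + \log \frac{1}{\alpha} - \log\left(1 + \log\left(1 + \frac{\mathcal{E}}{\sigma^2}\right)\right) \Bigg\} \tag{141}\\
&= \sup_{Q_{\hat{\mathbf{X}}}} \left\{ h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h\left(\mathbb{H}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}\right) + n_{\mathrm{R}}\, \mathsf{E}\left[\log \|\mathbb{H}\hat{\mathbf{X}}\|^2\right] - \log 2 \right\} \notag\\
&\quad + \log\left(1 - e^{-\nu}\right) + \nu + \epsilon_\nu - \log \nu. \tag{142}
\end{align}

Here, the first two equalities follow from the definition of the fading number (29); the subsequent inequality from (132); (140) follows because the parameters $\alpha$, $\beta$, and $\nu$ must not depend on the input distribution $Q_{\hat{\mathbf{X}}}$ (however, note that we are allowed to let them depend on $\mathcal{E}$); the subsequent equality follows since the first four terms do not depend on $\mathcal{E}$; and in the last equality we have used (95) and made the following choices on the free parameters $\alpha$ and $\beta$:

\begin{align}
\alpha(\mathcal{E}) &\triangleq \frac{\nu}{\log \mathcal{E} + \log \sup_{\hat{\mathbf{x}}} \mathsf{E}\left[\|\mathbb{H}\hat{\mathbf{x}}\|^2\right]} \tag{143}\\
\beta(\mathcal{E}) &\triangleq \frac{1}{\alpha(\mathcal{E})}\, e^{\nu / \alpha(\mathcal{E})} \tag{144}
\end{align}

for some constant $\nu \ge 0$. For this choice, note that

\begin{align}
\lim_{\mathcal{E} \uparrow \infty} \left\{ \log \Gamma\left(\alpha, \frac{\nu}{\beta}\right) - \log \frac{1}{\alpha} \right\} &= \log\left(1 - e^{-\nu}\right) \tag{145}\\
\lim_{\mathcal{E} \uparrow \infty} \alpha \left( \log \beta - \log \mathcal{E}_0 - \xi \right) &= \nu \tag{146}\\
\lim_{\mathcal{E} \uparrow \infty} \left\{ \frac{1}{\beta} \sup_{\hat{\mathbf{x}}} \mathsf{E}\left[\|\mathbb{H}\hat{\mathbf{x}}\|^2\right] \frac{\mathcal{E}}{p} + \frac{\nu}{\beta} \right\} &= 0 \tag{147}\\
\lim_{\mathcal{E} \uparrow \infty} \left\{ \log \frac{1}{\alpha} - \log\left(1 + \log\left(1 + \frac{\mathcal{E}}{\sigma^2}\right)\right) \right\} &= -\log \nu. \tag{148}
\end{align}

(Compare with [1, App. VII], [2, Sec. B.5.9].)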

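To finish the derivation of the upper bound, we let $\nu$ go to zero. Note that $\epsilon_\nu \to 0$ as $\nu \downarrow 0$ as can be seen from (113). Note further that

$$\lim_{\nu \downarrow 0} \left\{ \log\left(1 - e^{-\nu}\right) - \log \nu \right\} = 0. \tag{149}$$

Therefore, we get

$$\chi(\mathbb{H}) \le \sup_{Q_{\hat{\mathbf{X}}}} \left\{ h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h\left(\mathbb{H}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}\right) + n_{\mathrm{R}}\, \mathsf{E}\left[\log \|\mathbb{H}\hat{\mathbf{X}}\|^2\right] - \log 2 \right\}. \tag{150}$$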
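The limit (149) is stated without an intermediate step. One way to see it, written out here as a short worked expansion rather than as part of the original argument, is to expand the exponential around $\nu = 0$:

```latex
\begin{align*}
1 - e^{-\nu} &= \nu - \frac{\nu^2}{2} + O(\nu^3), \\
\log\left(1 - e^{-\nu}\right) - \log \nu
  &= \log \frac{1 - e^{-\nu}}{\nu}
   = \log\left(1 - \frac{\nu}{2} + O(\nu^2)\right)
   \;\longrightarrow\; \log 1 = 0
   \quad \text{as } \nu \downarrow 0 .
\end{align*}
```

Together with $\nu \to 0$ and $\epsilon_\nu \to 0$, this shows that all four correction terms in (142) vanish in the limit.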

B. Derivation of a Lower Bound

To derive a lower bound on capacity (or the fading number, respectively) we choose a specific input distribution. Let $\mathbf{X}$ be of the form

$$\mathbf{X} = R \cdot \hat{\mathbf{X}}. \tag{151}$$

Here $\hat{\mathbf{X}} \in \mathbb{C}^{n_{\mathrm{T}}}$ is assumed to be a random unit-vector that is circularly symmetric, but whose exact distribution will be specified later. The random variable $R \in \mathbb{R}_0^+$ is chosen to be independent of $\hat{\mathbf{X}}$ and such that

$$\log R^2 \sim \mathcal{U}\left(\left[\log x_{\min}^2, \log \mathcal{E}\right]\right) \tag{152}$$

where we choose $x_{\min}^2$ as

$$x_{\min}^2 \triangleq \log \mathcal{E}. \tag{153}$$

Note that this choice of $R$ satisfies the peak-power constraint (26) and therefore also the average-power constraint (27).
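The magnitude distribution (152)–(153) is easy to sample, which makes the peak-power claim tangible. The sketch below is illustrative only: the function name `sample_magnitude_squared` is hypothetical, the power is treated as a dimensionless number larger than 1 (so that $x_{\min}^2 = \log \mathcal{E}$ is positive), and the final clamp only guards against floating-point rounding at the interval ends.

```python
import math
import random


def sample_magnitude_squared(power, rng):
    """Draw R^2 such that log R^2 is uniform on [log x_min^2, log power],
    with x_min^2 = log(power) as in (153)."""
    if power <= 1.0:
        raise ValueError("power must exceed 1 so that x_min^2 = log(power) > 0")
    x_min_sq = math.log(power)
    log_r_sq = rng.uniform(math.log(x_min_sq), math.log(power))
    r_sq = math.exp(log_r_sq)
    # Clamp to absorb rounding in exp(log(.)) at the boundaries.
    return min(max(r_sq, x_min_sq), power)


if __name__ == "__main__":
    rng = random.Random(0)
    power = 1e4
    samples = [sample_magnitude_squared(power, rng) for _ in range(10000)]
    print("min R^2:", min(samples), "max R^2:", max(samples))
    print("empirical E[R^2]:", sum(samples) / len(samples))
```

Every sample satisfies $x_{\min}^2 \le R^2 \le \mathcal{E}$, so the peak-power constraint holds with probability one, and since $x_{\min}^2 = \log \mathcal{E}$ grows without bound, the input also escapes to infinity as $\mathcal{E} \uparrow \infty$.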

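Using such an input to our MIMO fading channel we get the following lower bound to channel capacity:

\begin{align}
C(\mathcal{E}) &\ge I(\mathbf{X}; \mathbf{Y}) \tag{154}\\
&= I(R, \hat{\mathbf{X}}; \mathbf{Y}) \tag{155}\\
&= I(\hat{\mathbf{X}}; \mathbf{Y}) + I(R; \mathbf{Y} \mid \hat{\mathbf{X}}) \tag{156}\\
&= I(\hat{\mathbf{X}}; \mathbf{Y}) + I\left(R; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) - I\left(R; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) + I(R; \mathbf{Y} \mid \hat{\mathbf{X}}) \tag{157}\\
&= I(\hat{\mathbf{X}}; \mathbf{Y}) + I\left(R, e^{\mathrm{i}\Theta}; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) - I\left(e^{\mathrm{i}\Theta}; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}, R\right) \notag\\
&\quad - I\left(R; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) + I(R; \mathbf{Y} \mid \hat{\mathbf{X}}). \tag{158}
\end{align}

Here we have introduced a new random variable $\Theta \sim \mathcal{U}([0, 2\pi])$ which is assumed to be independent of every other random quantity.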

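The last two terms can be rearranged as follows:

\begin{align}
-I\left(R; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) + I(R; \mathbf{Y} \mid \hat{\mathbf{X}}) &= -h\left(\mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) + h\left(\mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}, R\right) + h(\mathbf{Y} \mid \hat{\mathbf{X}}) - h(\mathbf{Y} \mid \hat{\mathbf{X}}, R) \tag{159}\\
&= -h\left(\mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) + h\left(\mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}, R\right) + h\left(\mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}, e^{\mathrm{i}\Theta}\right) \notag\\
&\quad - h\left(\mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}, R, e^{\mathrm{i}\Theta}\right) \tag{160}\\
&= -I\left(e^{\mathrm{i}\Theta}; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) + I\left(e^{\mathrm{i}\Theta}; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}, R\right). \tag{161}
\end{align}

Here the second equality follows because $e^{\mathrm{i}\Theta}$ is independent of everything else so that we can add it to the conditioning part of the entropy without changing its values, and because differential entropy remains unchanged if its argument is multiplied by a constant complex number of magnitude 1.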

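Combining this with (158), we obtain

\begin{align}
C(\mathcal{E}) &\ge I(\hat{\mathbf{X}}; \mathbf{Y}) + I\left(R, e^{\mathrm{i}\Theta}; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) - I\left(e^{\mathrm{i}\Theta}; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) \tag{162}\\
&= I(\hat{\mathbf{X}}; \mathbf{Y}) + I\left(R e^{\mathrm{i}\Theta}; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) - I\left(e^{\mathrm{i}\Theta}; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) \tag{163}
\end{align}

where the last equality follows because from $R e^{\mathrm{i}\Theta}$ the random variables $R$ and $e^{\mathrm{i}\Theta}$ can be recovered.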

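We continue with bounding the first term in (163)

\begin{align}
I(\hat{\mathbf{X}}; \mathbf{Y}) &= I(\hat{\mathbf{X}}; \mathbf{Y}, \mathbf{Z}) - \underbrace{I(\hat{\mathbf{X}}; \mathbf{Z} \mid \mathbf{Y})}_{\le\, \epsilon(x_{\min})} \tag{164}\\
&\ge I(\hat{\mathbf{X}}; \mathbf{Y}, \mathbf{Z}) - \epsilon(x_{\min}) \tag{165}\\
&= I(\hat{\mathbf{X}}; \mathbb{H}\hat{\mathbf{X}} R) - \epsilon(x_{\min}) \tag{166}\\
&= I\left(\hat{\mathbf{X}}; \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}, \|\mathbb{H}\hat{\mathbf{X}}\| \cdot R\right) - \epsilon(x_{\min}) \tag{167}\\
&= I\left(\hat{\mathbf{X}}; \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) + I\left(\hat{\mathbf{X}}; \|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - \epsilon(x_{\min}). \tag{168}
\end{align}

Here the first equality follows from the chain rule; in the subsequent inequality we lower bound the second term by $-\epsilon(x_{\min})$ which is defined in Appendix E and is shown there to be independent of the input distribution $Q_{\mathbf{X}}$ and to tend to zero as $x_{\min} \uparrow \infty$; in the subsequent equality we use $\mathbf{Z}$ in order to extract $\mathbb{H}\hat{\mathbf{X}} R$ from $\mathbf{Y}$ and then drop $(\mathbf{Y}, \mathbf{Z})$ since given $\mathbb{H}\hat{\mathbf{X}} R$ it is independent of the other random variables; and the last equality follows again from the chain rule.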

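Similarly, we bound the third term in (163)

\begin{align}
I\left(e^{\mathrm{i}\Theta}; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) &\le I\left(e^{\mathrm{i}\Theta}; \mathbf{Y} e^{\mathrm{i}\Theta}, \mathbf{Z} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) \tag{169}\\
&= I\left(e^{\mathrm{i}\Theta}; \mathbb{H}\mathbf{X} e^{\mathrm{i}\Theta}, \mathbf{Z} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) \tag{170}\\
&= I\left(e^{\mathrm{i}\Theta}; \mathbb{H}\mathbf{X} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) + I\left(e^{\mathrm{i}\Theta}; \mathbf{Z} e^{\mathrm{i}\Theta} \mid \mathbb{H}\mathbf{X} e^{\mathrm{i}\Theta}, \hat{\mathbf{X}}\right) \tag{171}\\
&= I\left(e^{\mathrm{i}\Theta}; \mathbb{H}\mathbf{X} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) \tag{172}\\
&= I\left(e^{\mathrm{i}\Theta}; \|\mathbb{H}\hat{\mathbf{X}}\| \cdot R, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta} \,\middle|\, \hat{\mathbf{X}}\right) \tag{173}\\
&= I\left(e^{\mathrm{i}\Theta}; \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta} \,\middle|\, \hat{\mathbf{X}}\right) + I\left(e^{\mathrm{i}\Theta}; \|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta}, \hat{\mathbf{X}}\right). \tag{174}
\end{align}

Hence, plugging these results into (163), we get

\begin{align}
C(\mathcal{E}) &\ge I\left(R e^{\mathrm{i}\Theta}; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) + I\left(\hat{\mathbf{X}}; \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) + I\left(\hat{\mathbf{X}}; \|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) \notag\\
&\quad - I\left(e^{\mathrm{i}\Theta}; \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta} \,\middle|\, \hat{\mathbf{X}}\right) - I\left(e^{\mathrm{i}\Theta}; \|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta}, \hat{\mathbf{X}}\right) - \epsilon(x_{\min}). \tag{175}
\end{align}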

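We next bound the third and fifth mutual information term in (175)

\begin{align}
&I\left(\hat{\mathbf{X}}; \|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - I\left(e^{\mathrm{i}\Theta}; \|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}} e^{\mathrm{i}\Theta}}{\|\mathbb{H}\hat{\mathbf{X}}\|}, \hat{\mathbf{X}}\right) \notag\\
&\quad = h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}, \hat{\mathbf{X}}\right) \notag\\
&\qquad - h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta}, \hat{\mathbf{X}}\right) + h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta}, \hat{\mathbf{X}}, e^{\mathrm{i}\Theta}\right) \tag{176}\\
&\quad = h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}, \hat{\mathbf{X}}\right) \notag\\
&\qquad - h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta}, \hat{\mathbf{X}}\right) + h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}, \hat{\mathbf{X}}\right) \tag{177}\\
&\quad = h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta}, \hat{\mathbf{X}}\right) \tag{178}\\
&\quad \ge h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}} e^{\mathrm{i}\Theta}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) \tag{179}\\
&\quad = h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h\left(\|\mathbb{H}\hat{\mathbf{X}}\| \cdot R \,\middle|\, \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) \tag{180}\\
&\quad = 0. \tag{181}
\end{align}

Here, the inequality follows because conditioning reduces entropy; and the second-to-last equality holds because we have assumed $\hat{\mathbf{X}}$ to be circularly symmetric, i.e., $\hat{\mathbf{X}}$ “destroys” the random phase shift of $e^{\mathrm{i}\Theta}$.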

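Therefore, we are left with the following bound:

$$C(\mathcal{E}) \ge I\left(R e^{\mathrm{i}\Theta}; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) + I\left(\hat{\mathbf{X}}; \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - I\left(e^{\mathrm{i}\Theta}; \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta} \,\middle|\, \hat{\mathbf{X}}\right) - \epsilon(x_{\min}). \tag{182}$$

Now, we rewrite the second and third term as follows:

\begin{align}
&I\left(\hat{\mathbf{X}}; \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - I\left(e^{\mathrm{i}\Theta}; \frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta} \,\middle|\, \hat{\mathbf{X}}\right) \notag\\
&\quad = h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, \hat{\mathbf{X}}\right) - h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta} \,\middle|\, \hat{\mathbf{X}}\right) + h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta} \,\middle|\, \hat{\mathbf{X}}, e^{\mathrm{i}\Theta}\right) \tag{183}\\
&\quad = h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, \hat{\mathbf{X}}\right) - h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta} \,\middle|\, \hat{\mathbf{X}}\right) + h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} \,\middle|\, \hat{\mathbf{X}}\right) \tag{184}\\
&\quad = h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta} \,\middle|\, \hat{\mathbf{X}}\right) \tag{185}
\end{align}

where the second equality follows from (13) with the choice of the unitary matrix $e^{-\mathrm{i}\Theta}\, \mathsf{I}_{n_{\mathrm{R}}}$ and from the fact that $e^{\mathrm{i}\Theta}$ is independent of all other random quantities.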

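This leaves us with

$$C(\mathcal{E}) \ge I\left(R e^{\mathrm{i}\Theta}; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) + h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta} \,\middle|\, \hat{\mathbf{X}}\right) - \epsilon(x_{\min}). \tag{186}$$

Next, we let the power grow to infinity, $\mathcal{E} \to \infty$, and use the definition of the fading number (29). Since $R e^{\mathrm{i}\Theta}$ is circularly symmetric with a magnitude distributed according to (152), we know from [1, Eq. (108) and Theorem 4.8], [2, Eq. (6.194) and Theorem 6.15], that $R e^{\mathrm{i}\Theta}$ achieves the fading number of a memoryless SIMO fading channel with partial side-information. In our situation we have

\begin{align}
I\left(R e^{\mathrm{i}\Theta}; \mathbf{Y} e^{\mathrm{i}\Theta} \mid \hat{\mathbf{X}}\right) &= I\left(R e^{\mathrm{i}\Theta}; \mathbb{H}\hat{\mathbf{X}} R e^{\mathrm{i}\Theta} + \mathbf{Z} \mid \hat{\mathbf{X}}\right) \tag{187}\\
&= I\left(R e^{\mathrm{i}\Theta}; \mathbb{H}\hat{\mathbf{X}} R e^{\mathrm{i}\Theta} + \mathbf{Z}, \hat{\mathbf{X}}\right) \tag{188}
\end{align}

where $\hat{\mathbf{X}}$ serves as partial receiver side-information (that is independent of the SIMO input $R e^{\mathrm{i}\Theta}$). Note that a random vector $\mathbf{A}$ is said to contain only partial side-information about $\mathbf{B}$ if $h(\mathbf{B} \mid \mathbf{A}) > -\infty$, i.e., in our case we need

$$h\left(\mathbb{H}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}\right) > -\infty \tag{189}$$

which is satisfied since we assume that $h(\mathbb{H}) > -\infty$ and $\mathsf{E}\left[\|\mathbb{H}\|_{\mathrm{F}}^2\right] < \infty$ (see [1, Lemma 6.6], [2, Lemma A.14]).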

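Hence

\begin{align}
\chi(\mathbb{H}) &\ge \lim_{\mathcal{E} \uparrow \infty} \Bigg\{ I\left(R e^{\mathrm{i}\Theta}; \mathbb{H}\hat{\mathbf{X}} R e^{\mathrm{i}\Theta} + \mathbf{Z} \mid \hat{\mathbf{X}}\right) + h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta} \,\middle|\, \hat{\mathbf{X}}\right) \notag\\
&\qquad - \epsilon(x_{\min}) - \log\left(1 + \log\left(1 + \frac{\mathcal{E}}{\sigma^2}\right)\right) \Bigg\} \tag{190}\\
&= \lim_{\mathcal{E} \uparrow \infty} \left\{ I\left(R e^{\mathrm{i}\Theta}; \mathbb{H}\hat{\mathbf{X}} R e^{\mathrm{i}\Theta} + \mathbf{Z} \mid \hat{\mathbf{X}}\right) - \log\left(1 + \log\left(1 + \frac{\mathcal{E}}{\sigma^2}\right)\right) - \epsilon(x_{\min}) \right\} \notag\\
&\quad + h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta} \,\middle|\, \hat{\mathbf{X}}\right) \tag{191}\\
&= \chi\left(\mathbb{H}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}\right) + h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta} \,\middle|\, \hat{\mathbf{X}}\right) \tag{192}\\
&= h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta} \,\middle|\, \hat{\mathbf{X}}\right) + n_{\mathrm{R}}\, \mathsf{E}\left[\log \|\mathbb{H}\hat{\mathbf{X}}\|^2\right] - \log 2 - h\left(\mathbb{H}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}\right) \notag\\
&\quad + h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) - h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|} e^{\mathrm{i}\Theta} \,\middle|\, \hat{\mathbf{X}}\right) \tag{193}\\
&= h_\lambda\left(\frac{\mathbb{H}\hat{\mathbf{X}}}{\|\mathbb{H}\hat{\mathbf{X}}\|}\right) + n_{\mathrm{R}}\, \mathsf{E}\left[\log \|\mathbb{H}\hat{\mathbf{X}}\|^2\right] - \log 2 - h\left(\mathbb{H}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}\right). \tag{194}
\end{align}

Here in (192), we have used the fact that our choice (153) guarantees that $\epsilon(x_{\min})$ tends to zero as $\mathcal{E} \to \infty$ (see Appendix E) and that we achieve the SIMO fading number for a channel with input $R e^{\mathrm{i}\Theta}$ and output $\mathbb{H}\hat{\mathbf{X}} R e^{\mathrm{i}\Theta} + \mathbf{Z}$; the subsequent equality follows from the fading number of a memoryless SIMO fading channel where the receiver has access to some partial side-information [1, Eq. (108)], [2, Eq. (6.194)]:

$$\chi(\mathbf{H} \mid \mathbf{S}) = h_\lambda\left(\hat{\mathbf{H}} e^{\mathrm{i}\Theta} \mid \mathbf{S}\right) + n_{\mathrm{R}}\, \mathsf{E}\left[\log \|\mathbf{H}\|^2\right] - \log 2 - h(\mathbf{H} \mid \mathbf{S}). \tag{195}$$

The result now follows by choosing the distribution $Q_{\hat{\mathbf{X}}}$ such as to maximize the lower bound (194) to the fading number.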

VIII. CONCLUSION

We have derived the fading number of a MIMO fading channel of general fading law including spatial, but without temporal memory. Since the fading number is the second term after the double-logarithmic term of the high-SNR expansion of channel capacity, this means that we have precisely specified the behavior of the channel capacity asymptotically when the power grows to infinity. The result shows that the asymptotic capacity can be achieved by an input that consists of the product of two independent random quantities: a circularly symmetric random unit vector (the direction) and a nonnegative (i.e., real) random variable (the magnitude). The distribution of the random direction is chosen such as to maximize the fading number and therefore depends on the particular law of the fading process. The nonnegative random variable is such that (38) is satisfied. This is the well-known choice that also achieves the fading number in the SISO and SIMO case and is also used in the MISO case where it is multiplied by a constant beam-direction $\hat{\mathbf{x}}$. All these special cases follow nicely from this new result.

We have then derived some new results for the important special situation of Gaussian fading. For the case of a scalar line-of-sight matrix (68) assuming at least as many transmit as receive antennas, $n_{\mathrm{R}} \le n_{\mathrm{T}}$, we have been able to state the fading number precisely

$$\chi = n_{\mathrm{R}}\, g_{n_{\mathrm{R}}}(|d|^2) - n_{\mathrm{R}} - \log \Gamma(n_{\mathrm{R}}) \tag{196}$$

where $g_m(\cdot)$ denotes the expected value of the logarithm of a noncentral chi-square random variable (see (61)). We see that the asymptotic capacity only depends on the number of receive antennas and is growing proportionally to $n_{\mathrm{R}} \log |d|^2$.

For a general line-of-sight matrix, we have shown an upper bound that grows like $\min\{n_{\mathrm{R}}, n_{\mathrm{T}}\} \log \delta^2$ where $\delta^2$ is a certain kind of average of all singular values of the line-of-sight matrix (see (79) and (80)).

We would like to emphasize that even though all results on the fading number are asymptotic results for the theoretical situation of infinite power, they are still of relevance for finite SNR values: it has been shown that the approximation

$$C(\mathrm{SNR}) \approx \log(1 + \log(1 + \mathrm{SNR})) + \chi \tag{197}$$

holds already for moderate values of the SNR. Actually, pulling ourselves by our bootstraps, let us consider for the moment that (197) starts to be valid for an SNR somewhere in the range of 30 to 80 dB. In this case $\log(1 + \log(1 + \mathrm{SNR}))$ will have a value between 2 and 3 nats. Hence, once the capacity is appreciably above $\chi + 2$ nats, the approximation (197) is likely to be valid [10], [11].
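The "between 2 and 3 nats" figure can be reproduced directly from the double-logarithmic term in (197). The following sketch is only a numerical illustration; the function name `double_log_term` is hypothetical, and natural logarithms are assumed so that the result is in nats.

```python
import math


def double_log_term(snr_db):
    """log(1 + log(1 + SNR)) in nats, for an SNR given in decibels."""
    snr = 10.0 ** (snr_db / 10.0)
    return math.log(1.0 + math.log1p(snr))


if __name__ == "__main__":
    for snr_db in (30, 40, 50, 60, 70, 80):
        print(snr_db, "dB ->", round(double_log_term(snr_db), 3), "nats")
```

The term grows by less than one nat while the SNR increases by five orders of magnitude, which illustrates why the fading number dominates the capacity in this regime.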

Therefore, the fading number can be seen as an indicator of the maximal rate at which power efficient communication is possible on the channel. For a further discussion about the practical relevance of the fading number we refer to [10] and [12].

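APPENDIX A
PROOF OF LEMMA 5

Assume that $\Theta \sim \mathcal{U}([0, 2\pi])$, independent of every other random quantity. Then

\begin{align}
I(\mathbf{X}; \mathbf{Y}) &= I\left(\mathbf{X}; \mathbf{Y} \mid e^{\mathrm{i}\Theta}\right) \tag{198}\\
&= I\left(\mathbf{X} e^{\mathrm{i}\Theta}; \mathbf{Y} e^{\mathrm{i}\Theta} \mid e^{\mathrm{i}\Theta}\right) \tag{199}\\
&= I\left(\mathbf{X} e^{\mathrm{i}\Theta}; \mathbb{H}\mathbf{X} e^{\mathrm{i}\Theta} + \mathbf{Z} \mid e^{\mathrm{i}\Theta}\right) \tag{200}\\
&= I\left(\tilde{\mathbf{X}}; \mathbb{H}\tilde{\mathbf{X}} + \mathbf{Z} \mid e^{\mathrm{i}\Theta}\right) \tag{201}\\
&= h\left(\mathbb{H}\tilde{\mathbf{X}} + \mathbf{Z} \mid e^{\mathrm{i}\Theta}\right) - h\left(\mathbb{H}\tilde{\mathbf{X}} + \mathbf{Z} \mid \tilde{\mathbf{X}}, e^{\mathrm{i}\Theta}\right) \tag{202}\\
&= h\left(\mathbb{H}\tilde{\mathbf{X}} + \mathbf{Z} \mid e^{\mathrm{i}\Theta}\right) - h\left(\mathbb{H}\tilde{\mathbf{X}} + \mathbf{Z} \mid \tilde{\mathbf{X}}\right) \tag{203}\\
&\le h\left(\mathbb{H}\tilde{\mathbf{X}} + \mathbf{Z}\right) - h\left(\mathbb{H}\tilde{\mathbf{X}} + \mathbf{Z} \mid \tilde{\mathbf{X}}\right) \tag{204}\\
&= I\left(\tilde{\mathbf{X}}; \mathbb{H}\tilde{\mathbf{X}} + \mathbf{Z}\right). \tag{205}
\end{align}

Here the first equality follows because $\Theta$ is independent of every other random quantity; the third equality follows because $\mathbf{Z}$ is circularly symmetric; in the subsequent equality we substitute $\tilde{\mathbf{X}} = \mathbf{X} e^{\mathrm{i}\Theta}$; and the inequality follows since conditioning reduces entropy.

Hence, a circularly symmetric input achieves a mutual information that is at least as big as the original mutual information.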

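APPENDIX B
DERIVATION OF BOUNDS (67)

In this appendix we will derive the bounds (67) on $g_m(\cdot)$. We start with the upper bound which follows directly from (64) and (65) and from Jensen's inequality:

\begin{align}
g_m(s^2) &= \mathsf{E}\left[\log \sum_{j=1}^{m} |U_j + \mu_j|^2\right] \tag{206}\\
&\le \log \mathsf{E}\left[\sum_{j=1}^{m} |U_j + \mu_j|^2\right] \tag{207}\\
&= \log \sum_{j=1}^{m} \left(1 + |\mu_j|^2\right) \tag{208}\\
&= \log(m + s^2). \tag{209}
\end{align}

For the lower bound we also start with (64) and choose $\mu_1 = s$ and $\mu_2 = \cdots = \mu_m = 0$. Then we get

\begin{align}
g_m(s^2) &= \mathsf{E}\left[\log \sum_{j=1}^{m} |U_j + \mu_j|^2\right] \tag{210}\\
&\ge \mathsf{E}\left[\log |U_1 + \mu_1|^2\right] \tag{211}\\
&= g_1(s^2) \tag{212}\\
&= \log s^2 - \operatorname{Ei}\left(-s^2\right). \tag{213}
\end{align}

Here, (211) follows from dropping some nonnegative terms in the sum; and in the subsequent two equalities we use the definition of $g_1(\cdot)$.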

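APPENDIX C
PROOF OF COROLLARY 12

We choose a constant $n_{\mathrm{T}} \times n_{\mathrm{T}}$ matrix $\mathsf{B}$ as follows:

$$\mathsf{B} \triangleq \operatorname{diag}\left(\frac{1}{d_1}, \ldots, \frac{1}{d_{n_{\mathrm{R}}}}, \frac{1}{d_1}, \ldots, \frac{1}{d_1}\right) \tag{214}$$

and then we note that for a unit vector $\hat{\mathbf{x}} = \left(\hat{x}^{(1)}, \ldots, \hat{x}^{(n_{\mathrm{T}})}\right)^{\mathsf{T}}$

$$\mathbb{H}\mathsf{B}\hat{\mathbf{x}} = \mathsf{D}\mathsf{B}\hat{\mathbf{x}} + \tilde{\mathbb{H}}\mathsf{B}\hat{\mathbf{x}} = \begin{pmatrix} \hat{x}^{(1)} \\ \vdots \\ \hat{x}^{(n_{\mathrm{R}})} \end{pmatrix} + \tilde{\mathbb{H}}\mathsf{B}\hat{\mathbf{x}} \triangleq \boldsymbol{\mu} + \tilde{\mathbf{H}} \tag{215}$$

where $\tilde{\mathbf{H}} \sim \mathcal{N}_{\mathbb{C}}\left(\mathbf{0}, \sigma^2(\hat{\mathbf{x}})\, \mathsf{I}_{n_{\mathrm{R}}}\right)$ with

$$\sigma^2(\hat{\mathbf{x}}) \triangleq \frac{|\hat{x}^{(1)}|^2}{|d_1|^2} + \cdots + \frac{|\hat{x}^{(n_{\mathrm{R}})}|^2}{|d_{n_{\mathrm{R}}}|^2} + \frac{|\hat{x}^{(n_{\mathrm{R}}+1)}|^2}{|d_1|^2} + \cdots + \frac{|\hat{x}^{(n_{\mathrm{T}})}|^2}{|d_1|^2} \tag{216}$$

and where $\boldsymbol{\mu} \in \mathbb{C}^{n_{\mathrm{R}}}$ with $\|\boldsymbol{\mu}\| \le 1$. Therefore

\begin{align}
h\left(\mathbb{H}\mathsf{B}\hat{\mathbf{X}} \mid \hat{\mathbf{X}} = \hat{\mathbf{x}}\right) &= n_{\mathrm{R}} \log\left(\pi e\, \sigma^2(\hat{\mathbf{x}})\right) \tag{217}\\
\mathsf{E}\left[\log \|\mathbb{H}\mathsf{B}\hat{\mathbf{x}}\|^2\right] &= \log \sigma^2(\hat{\mathbf{x}}) + g_{n_{\mathrm{R}}}\left(\frac{\|\boldsymbol{\mu}\|^2}{\sigma^2(\hat{\mathbf{x}})}\right) \tag{218}
\end{align}

(where the last equality follows from (64)) and hence

$$n_{\mathrm{R}}\, \mathsf{E}\left[\log \|\mathbb{H}\mathsf{B}\hat{\mathbf{X}}\|^2\right] - h\left(\mathbb{H}\mathsf{B}\hat{\mathbf{X}} \mid \hat{\mathbf{X}}\right) = n_{\mathrm{R}}\, \mathsf{E}\left[g_{n_{\mathrm{R}}}\left(\frac{|\hat{X}^{(1)}|^2 + \cdots + |\hat{X}^{(n_{\mathrm{R}})}|^2}{\sigma^2(\hat{\mathbf{X}})}\right)\right] - n_{\mathrm{R}} \log \pi e. \tag{219}$$

The upper bound on the fading number now follows from (39); from Theorem 7 by upper bounding the $h_\lambda$-term by $\log c_{n_{\mathrm{R}}}$; and from the additional observations that $g_m(\cdot)$ is a monotonically increasing function, that

$$|\hat{X}^{(1)}|^2 + \cdots + |\hat{X}^{(n_{\mathrm{R}})}|^2 \le 1 \tag{220}$$

and that

\begin{align}
\sigma^2(\hat{\mathbf{X}}) &= \frac{|\hat{X}^{(1)}|^2}{|d_1|^2} + \cdots + \frac{|\hat{X}^{(n_{\mathrm{R}})}|^2}{|d_{n_{\mathrm{R}}}|^2} + \frac{|\hat{X}^{(n_{\mathrm{R}}+1)}|^2}{|d_1|^2} + \cdots + \frac{|\hat{X}^{(n_{\mathrm{T}})}|^2}{|d_1|^2} \tag{221}\\
&\ge \frac{|\hat{X}^{(1)}|^2}{|d_1|^2} + \cdots + \frac{|\hat{X}^{(n_{\mathrm{T}})}|^2}{|d_1|^2} \tag{222}\\
&= \frac{1}{|d_1|^2} \left(|\hat{X}^{(1)}|^2 + \cdots + |\hat{X}^{(n_{\mathrm{T}})}|^2\right) \tag{223}\\
&= \frac{1}{|d_1|^2} = \frac{1}{\|\mathsf{D}\|^2} \tag{224}
\end{align}

where the inequality follows since $|d_1| \ge |d_2| \ge \cdots \ge |d_{n_{\mathrm{R}}}|$.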

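APPENDIX D
PROOF OF PROPOSITION 13

This upper bound is based on the upper bound given in Corollary 8 for a choice of $\mathsf{B} = \mathsf{I}_{n_{\mathrm{T}}}$. If $n_{\mathrm{R}} > n_{\mathrm{T}}$ we choose for $\mathsf{A}$

$$\mathsf{A} \triangleq \operatorname{diag}\left(\frac{a}{d_1}, \ldots, \frac{a}{d_{n_{\mathrm{T}}}}, b, \ldots, b\right) \tag{225}$$

with

$$b \triangleq \left(\frac{\delta^2}{n_{\mathrm{T}}}\right)^{\frac{n_{\mathrm{T}}}{2 n_{\mathrm{R}}}} \tag{226}$$

for $\delta$ as given in (80), and with $a$ such that $\det \mathsf{A} = 1$, i.e.,

$$a \triangleq \left(d_1 \cdots d_{n_{\mathrm{T}}}\right)^{\frac{1}{n_{\mathrm{T}}}} \left(\frac{1}{b}\right)^{\frac{n_{\mathrm{R}} - n_{\mathrm{T}}}{n_{\mathrm{T}}}}. \tag{227}$$

For such a choice we note that

$$\mathsf{A}\mathbb{H}\hat{\mathbf{x}} = a \begin{pmatrix} \hat{\mathbf{x}} \\ \mathbf{0} \end{pmatrix} + \left( \mathcal{N}_{\mathbb{C}}\left(0, \frac{|a|^2}{|d_1|^2}\right), \ldots, \mathcal{N}_{\mathbb{C}}\left(0, \frac{|a|^2}{|d_{n_{\mathrm{T}}}|^2}\right), \mathcal{N}_{\mathbb{C}}\left(0, b^2\right), \ldots, \mathcal{N}_{\mathbb{C}}\left(0, b^2\right) \right)^{\mathsf{T}} \tag{228}$$

so that

\begin{align}
\mathsf{E}\left[\|\mathsf{A}\mathbb{H}\hat{\mathbf{x}}\|^2\right] &= \frac{\delta^2}{b^{2(n_{\mathrm{R}} - n_{\mathrm{T}})/n_{\mathrm{T}}}} + (n_{\mathrm{R}} - n_{\mathrm{T}})\, b^2 \tag{229}\\
&= n_{\mathrm{R}} \left(\frac{\delta^2}{n_{\mathrm{T}}}\right)^{n_{\mathrm{T}} / n_{\mathrm{R}}}. \tag{230}
\end{align}

Hence, using Jensen's inequality and the fact that $\det \mathsf{A} = 1$ we get

\begin{align}
n_{\mathrm{R}}\, \mathsf{E}\left[\log \|\mathsf{A}\mathbb{H}\hat{\mathbf{x}}\|^2\right] - h(\mathsf{A}\mathbb{H}\hat{\mathbf{x}}) &\le n_{\mathrm{R}} \log \mathsf{E}\left[\|\mathsf{A}\mathbb{H}\hat{\mathbf{x}}\|^2\right] - \log |\det \mathsf{A}|^2 - h(\mathbb{H}\hat{\mathbf{x}}) \tag{231}\\
&= n_{\mathrm{R}} \log\left( n_{\mathrm{R}} \left(\frac{\delta^2}{n_{\mathrm{T}}}\right)^{n_{\mathrm{T}} / n_{\mathrm{R}}} \right) - n_{\mathrm{R}} \log \pi e. \tag{232}
\end{align}

Plugging this into the upper bound (41) of Corollary 8, we obtain

\begin{align}
\chi &\le n_{\mathrm{R}} \log \pi - \log \Gamma(n_{\mathrm{R}}) + n_{\mathrm{R}} \log n_{\mathrm{R}} + n_{\mathrm{T}} \log \frac{\delta^2}{n_{\mathrm{T}}} - n_{\mathrm{R}} \log \pi e \tag{233}\\
&= n_{\mathrm{T}} \log \frac{\delta^2}{n_{\mathrm{T}}} + n_{\mathrm{R}} \log n_{\mathrm{R}} - \log \Gamma(n_{\mathrm{R}}) - n_{\mathrm{R}}. \tag{234}
\end{align}

If $n_{\mathrm{R}} \le n_{\mathrm{T}}$ we choose for $\mathsf{A}$

$$\mathsf{A} = \operatorname{diag}\left(\frac{a}{d_1}, \ldots, \frac{a}{d_{n_{\mathrm{R}}}}\right) \tag{235}$$

with $a$ such that $\det \mathsf{A} = 1$, i.e.,

$$a \triangleq \left(d_1 \cdots d_{n_{\mathrm{R}}}\right)^{\frac{1}{n_{\mathrm{R}}}}. \tag{236}$$

For such a choice we note that

$$\mathsf{A}\mathbb{H}\hat{\mathbf{x}} = a \left(\hat{x}^{(1)}, \ldots, \hat{x}^{(n_{\mathrm{R}})}\right)^{\mathsf{T}} + \left( \mathcal{N}_{\mathbb{C}}\left(0, \frac{|a|^2}{|d_1|^2}\right), \ldots, \mathcal{N}_{\mathbb{C}}\left(0, \frac{|a|^2}{|d_{n_{\mathrm{R}}}|^2}\right) \right)^{\mathsf{T}} \tag{237}$$

so that

\begin{align}
\mathsf{E}\left[\|\mathsf{A}\mathbb{H}\hat{\mathbf{x}}\|^2\right] &= |a|^2 \left(|\hat{x}^{(1)}|^2 + \cdots + |\hat{x}^{(n_{\mathrm{R}})}|^2\right) + \frac{|a|^2}{|d_1|^2} + \cdots + \frac{|a|^2}{|d_{n_{\mathrm{R}}}|^2} \tag{238}\\
&\le \delta^2 \tag{239}
\end{align}

where we have bounded $|\hat{x}^{(1)}|^2 + \cdots + |\hat{x}^{(n_{\mathrm{R}})}|^2 \le 1$. Hence, using Jensen's inequality and the fact that $\det \mathsf{A} = 1$ we get

\begin{align}
n_{\mathrm{R}}\, \mathsf{E}\left[\log \|\mathsf{A}\mathbb{H}\hat{\mathbf{x}}\|^2\right] - h(\mathsf{A}\mathbb{H}\hat{\mathbf{x}}) &\le n_{\mathrm{R}} \log \mathsf{E}\left[\|\mathsf{A}\mathbb{H}\hat{\mathbf{x}}\|^2\right] - \log |\det \mathsf{A}|^2 - h(\mathbb{H}\hat{\mathbf{x}}) \tag{240}\\
&\le n_{\mathrm{R}} \log \delta^2 - n_{\mathrm{R}} \log \pi e. \tag{241}
\end{align}

Plugging this into the upper bound (41) of Corollary 8, we obtain

\begin{align}
\chi &\le n_{\mathrm{R}} \log \pi - \log \Gamma(n_{\mathrm{R}}) + n_{\mathrm{R}} \log \delta^2 - n_{\mathrm{R}} \log \pi e \tag{242}\\
&= n_{\mathrm{R}} \log \frac{\delta^2}{n_{\mathrm{R}}} + n_{\mathrm{R}} \log n_{\mathrm{R}} - \log \Gamma(n_{\mathrm{R}}) - n_{\mathrm{R}}. \tag{243}
\end{align}

The result now follows by combining (234) and (243).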

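APPENDIX E
ADDITIONAL DERIVATION FOR THE PROOF OF THE LOWER BOUND

In the derivation of the lower bound to the fading number we need to find the following upper bound

$$I(\hat{\mathbf{X}}; \mathbf{Z} \mid \mathbf{Y}) \le \epsilon(x_{\min}) \tag{244}$$

and to show that $\epsilon(x_{\min})$ does not depend on the input distribution $Q_{\mathbf{X}}$ and tends to zero as $x_{\min}$ tends to infinity.

Such a bound can be found as follows:

\begin{align}
I(\hat{\mathbf{X}}; \mathbf{Z} \mid \mathbf{Y}) &= h(\mathbf{Z} \mid \mathbf{Y}) - h(\mathbf{Z} \mid \mathbf{Y}, \hat{\mathbf{X}}) \tag{245}\\
&\le h(\mathbf{Z}) - h(\mathbf{Z} \mid \mathbf{Y}, \hat{\mathbf{X}}, R) \tag{246}\\
&= h(\mathbf{Z}) - h(\mathbf{Z} \mid \mathbb{H}\hat{\mathbf{X}} R + \mathbf{Z}, \hat{\mathbf{X}}, R) \tag{247}\\
&\le h(\mathbf{Z}) - \inf_{\hat{\mathbf{x}}} \inf_{r \ge x_{\min}} h(\mathbf{Z} \mid \mathbb{H}\hat{\mathbf{x}} r + \mathbf{Z}) \tag{248}\\
&= h(\mathbf{Z}) - \inf_{\hat{\mathbf{x}}} h(\mathbf{Z} \mid \mathbb{H}\hat{\mathbf{x}} x_{\min} + \mathbf{Z}) \tag{249}\\
&= \sup_{\hat{\mathbf{x}}} I(\mathbf{Z}; \mathbb{H}\hat{\mathbf{x}} x_{\min} + \mathbf{Z}) \tag{250}\\
&= \sup_{\hat{\mathbf{x}}} I\left(\frac{\mathbf{Z}}{x_{\min}}; \mathbb{H}\hat{\mathbf{x}} + \frac{\mathbf{Z}}{x_{\min}}\right) \tag{251}\\
&= \sup_{\hat{\mathbf{x}}} \left\{ h\left(\mathbb{H}\hat{\mathbf{x}} + \frac{\mathbf{Z}}{x_{\min}}\right) - h(\mathbb{H}\hat{\mathbf{x}}) \right\} \tag{252}\\
&\triangleq \epsilon(x_{\min}) \tag{253}
\end{align}

where we have used the fact that we have chosen $R$ such that $R \ge x_{\min}$. Note that (252) does not depend on the input $\mathbf{X}$ anymore. The convergence

$$\lim_{x_{\min} \uparrow \infty} \epsilon(x_{\min}) = 0 \tag{254}$$

follows from [1, Lemma 6.11], [2, Lemma A.19].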

ACKNOWLEDGMENT

The author would like to thank Amos Lapidoth for long, fruitful discussions and Tobias Koch for giving the right hint to fix a serious bug in an earlier version of the proof. Moreover, the author thanks the anonymous reviewer for his detailed and very helpful comments.

REFERENCES

[1] A. Lapidoth and S. M. Moser, “Capacity bounds via duality with applications to multiple-antenna systems on flat fading channels,” IEEE Trans. Inf. Theory, vol. 49, pp. 2426–2467, Oct. 2003.

[2] S. M. Moser, “Duality-based bounds on channel capacity,” Ph.D. dissertation, Swiss Fed. Inst. Technol., Zurich, Switzerland, 2004.

[3] E. İ. Telatar, “Capacity of multi-antenna Gaussian channels,” Europ. Trans. Telecommun., vol. 10, no. 6, pp. 585–595, Nov.–Dec. 1999.

[4] A. Lapidoth, “On the high SNR capacity of stationary Gaussian fading channels,” in Proc. 41st Allerton Conf. Commun., Contr. Comput., Monticello, IL, Oct. 1–3, 2003, pp. 410–419.

[5] T. Koch, “On the asymptotic capacity of multiple-input single-output fading channels with memory,” Master’s thesis, Signal and Inf. Proc. Lab., ETH Zurich, Zurich, Switzerland, 2004.

[6] A. Lapidoth, “On the asymptotic capacity of stationary Gaussian fading channels,” IEEE Trans. Inf. Theory, vol. 51, pp. 437–446, Feb. 2005.

[7] Y. Liang and V. V. Veeravalli, “Capacity of noncoherent time-selective Rayleigh-fading channels,” IEEE Trans. Inf. Theory, vol. 50, pp. 3095–3110, Dec. 2004.

[8] A. Lapidoth and S. M. Moser, “The fading number of single-input multiple-output fading channels with memory,” IEEE Trans. Inf. Theory, vol. 52, pp. 437–453, Feb. 2006.

[9] A. Lapidoth and S. M. Moser, “The expected logarithm of a noncentral chi-square random variable,” [Online]. Available: http://moser.cm.nctu.edu.tw/explog.html

[10] T. Koch and A. Lapidoth, “The fading number and degrees of freedom in non-coherent MIMO fading channels: a peace pipe,” in Proc. IEEE Int. Symp. Information Theory, Adelaide, Australia, Sep. 2005, pp. 661–665.

[11] A. Lapidoth, “The Fading Number and Degrees of Freedom: A Peace Pipe,” presented at the Shushan Purim, Israel, Mar. 27, 2005.

[12] T. Koch and A. Lapidoth, “Degrees of freedom in non-coherent stationary MIMO fading channels,” in Proc. Winter School Coding Inf. Theory, Bratislava, Slovakia, Feb. 20–25, 2005, pp. 91–97.
