
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 37, NO. 6, NOVEMBER 1991

Information Rates for a Discrete-Time Gaussian Channel with Intersymbol Interference and Stationary Inputs

Shlomo Shamai (Shitz), Senior Member, IEEE, Lawrence H. Ozarow, Member, IEEE, and Aaron D. Wyner, Fellow, IEEE

Abstract: Bounds are presented on $I_{\text{i.i.d.}}$, the achievable information rate for a discrete Gaussian channel with intersymbol interference (ISI) present and i.i.d. channel input symbols governed by an arbitrary predetermined distribution $p_x(x)$. Upper bounds on $I$, the achievable information rate with the symbol independence demand relaxed, are given as well. The bounds are formulated in terms of the average mutual information of a memoryless Gaussian channel with scaled i.i.d. input symbols governed by the same symbol distribution $p_x(x)$, where the scaling value is interpreted as an enhancement (upper bounds) or degradation (lower bounds) factor. The bounds apply for channel symbols with an arbitrary symbol distribution $p_x(x)$, discrete as well as continuous, and thus facilitate bounding the capacity of the ISI (dispersive) Gaussian channel under a variety of constraints imposed on the identically distributed channel symbols.

The use of the bounds is demonstrated for binary (two-level) i.i.d. symmetric symbols and a channel with causal ISI. In particular, a channel with two and three ISI coefficients, that is, ISI memory of degree one and two, respectively, is examined.

The bounds on $I_{\text{i.i.d.}}$ are compared to the approximated (by Monte Carlo methods) known value of $I_{\text{i.i.d.}}$, and their tightness is considered. An application of the new lower bound on $I_{\text{i.i.d.}}$ yields an improvement on previously reported lower bounds for the capacity of the continuous-time strictly bandlimited (or bandpass) Gaussian channel with either peak power or simultaneously peak power and bandlimiting constraints imposed on the channel's input waveform.

Index Terms: ISI, additive Gaussian channel, capacity, average mutual information.

I. INTRODUCTION

Consider the discrete-time Gaussian channel (DTGC) with intersymbol interference (ISI) described by

$$y_k = \sum_{l} h_l\, x_{k-l} + n_k, \qquad (1)$$

where $\{x_k\}$ are stationary identically distributed real-valued channel input symbols, $\{y_k\}$ are the corresponding channel output observables, $\{h_k\}$ are real ISI coefficients¹, and $\{n_k\}$ are independent identically distributed (i.i.d.) zero-mean Gaussian noise samples with variance $E(n_k^2) = \sigma^2$.

Manuscript received July 19, 1990; revised February 18, 1991. This work was done at AT&T Bell Laboratories, Murray Hill, NJ.

S. Shamai (Shitz) is with the Electrical Engineering Department, Technion-Israel Institute of Technology, Haifa 32000, Israel.

L. H. Ozarow is with the General Electric Corporate Research and Development Center, Room KWC 611, P.O. Box 8, Schenectady, NY 12301.

A. D. Wyner is with AT&T Bell Laboratories, Room 2C-365, 600 Mountain Ave., Murray Hill, NJ 07974.

IEEE Log Number 9101891.

A convenient way to describe the channel (1) using matrix notation is

$$y^N = H^N x^N + n^N, \qquad (2)$$

and it resides on the notion of the N-block DTGC [1], [2]. Here, $y^N = (y_0, y_1, \ldots, y_{N-1})^T$, $x^N = (x_0, x_1, \ldots, x_{N-1})^T$, and $n^N = (n_0, n_1, \ldots, n_{N-1})^T$ are column vectors with N components standing, respectively, for the output samples, channel symbols, and noise samples, and superscript T denotes the transpose operation. The equivalence between (2) and (1) is evident for $N \to \infty$ [1], and in this case, which is of interest here, "end effects" are suppressed [1] and the rows of $H = H^N$ are specified by circular shifts of the ISI coefficients $\{h_i\}$. We assume throughout finite energy $\|h\|^2 < \infty$, where $h$ stands for the ISI vector $(h_0, h_1, h_2, \ldots)$ and $\|\cdot\|$ denotes the $l_2$ norm. Note that, as far as information is concerned, the model in (2) can also be used when the stationary noise samples are correlated with an invertible correlation matrix $\Gamma^N = E[n^N (n^N)^T]$. The conclusion follows by employing an information-lossless linear orthogonalizing transformation on the channel output vector $y^N$.
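To make the N-block model concrete, the following sketch (an illustration added here, not part of the paper; the coefficient values, noise level, and block size are arbitrary) simulates (1) and checks it against the circulant matrix form (2):

```python
import numpy as np

# Hypothetical ISI coefficients and noise level; any real h with finite
# energy ||h||^2 would do.
h = np.array([0.5, np.sqrt(0.5), 0.5])
sigma = 0.5
N = 8
rng = np.random.default_rng(0)

x = rng.choice([-1.0, 1.0], size=N)        # i.i.d. binary channel symbols
n = rng.normal(0.0, sigma, size=N)         # i.i.d. Gaussian noise samples

# Direct form (1), with indices taken mod N to mimic the circular N-block
# model in which "end effects" are suppressed.
y_direct = np.array([sum(h[l] * x[(k - l) % N] for l in range(len(h)))
                     for k in range(N)]) + n

# Matrix form (2): the rows of H^N are circular shifts of the ISI coefficients.
H = np.zeros((N, N))
for k in range(N):
    for l in range(len(h)):
        H[k, (k - l) % N] = h[l]
y_matrix = H @ x + n

assert np.allclose(y_direct, y_matrix)     # (1) and (2) agree on the N-block
```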

This classic model has been extensively used in the information-theoretic and communications literature [1], [3]-[6] (and references therein), and thoroughly examined from a variety of aspects. The DTGC is well adapted to describe pulse amplitude modulated (PAM) signaling through a dispersive Gaussian channel, encountered often in telephone lines [5] and magnetic recording [6], when optimal matched filters [4], sample-whitened matched filters [7], or mean-square whitened matched filters [8], as well as linear suboptimal prefilters [9], are used by the detector.

¹For the sake of brevity we refer to $h_0$ as the null ISI coefficient. If there are only M nonzero ISI coefficients, say $h_0, h_1, \ldots, h_{M-1}$, we refer to the channel as having ISI memory of order M - 1.

0018-9448/91$01.00 © 1991 IEEE


The average mutual information (in nats unless otherwise stated) per channel use,

$$I = \lim_{N\to\infty} \frac{1}{N}\, I(y^N; x^N), \qquad (3)$$

evaluated for a given input distribution $p_{x^N}(a^N)$, is of interest since it determines the achievable rate of reliable communication through this channel with this specific input distribution [10]-[12].² We are also interested in $I_{\text{i.i.d.}}$, the resultant average mutual information where i.i.d. input symbols are used, that is, $p_{x^N}(a^N) = \prod_{l=1}^{N} p_x(a_l)$. Capacity,

$$C = \lim_{N\to\infty} \frac{1}{N} \sup_{p_{x^N}(a^N)} I(y^N; x^N), \qquad (4)$$

where the supremum is taken over all admissible distributions $p_{x^N}(a^N)$ satisfying certain constraints, is well known only in the case of block average power [1], [2], [10]-[12] or symbol average power [1] constraints imposed on the channel symbols $\{x_i\}$. In this case the capacity is achieved by letting $\{x_i\}$ be dependent Gaussian random variables [1], [10]. The average mutual information $I_{\text{i.i.d.}}$ for i.i.d. Gaussian symbols $\{x_i\}$ is also well known [1] and of interest in a variety of cases [1], [13].

In several problems of primary theoretical importance, the constraints imposed on $x_i$ preclude the use of Gaussian channel symbols. Peak limited [9], or both simultaneously peak and band limited channels [14], [15], are such specific examples. Furthermore, in most practical cases where coding over the ISI channel is employed [16]-[20], the channel symbols are discrete, taking on values from a finite alphabet. It is therefore important to estimate either the capacity C or the average mutual information $I$ and $I_{\text{i.i.d.}}$ for non-Gaussian symbols as well.

The main lower bounding technique on C, $I$, or $I_{\text{i.i.d.}}$ dates back to Shannon [14] and was extensively applied in numerous interesting channels with various constraints imposed on the channel input waveform [9], [15], [21]. This technique relies heavily on the convolutional inequality of entropy powers [22] and the asymptotic properties of log determinants of Toeplitz matrices [23]. The use of the convolutional entropy power inequality precludes the application of these techniques to discretely distributed channel symbols $\{x_i\}$. Other lower bounds on $I_{\text{i.i.d.}}$ based on the cut-off rate $R_0$ [24] for these channels are also adapted to continuous channel symbol alphabets.

Binary i.i.d. symbols are considered in [25]-[27], and even for this special case no general analytical methods for computing $I_{\text{i.i.d.}}$ are known; the difficulties in undertaking this task are pointed out in [25], where Monte Carlo techniques were applied to approximate $I_{\text{i.i.d.}}$ for certain channels with relatively few nonzero ISI coefficients. The cut-off rate for binary i.i.d. channel symbols, however, is determined in terms of the maximum eigenvalue of an ISI related matrix [4], [25], the evaluation of which is formidable for channels with large memory. Closed form bounds on this cut-off rate, which can evidently be employed as lower bounds on the information rate $I_{\text{i.i.d.}}$, were recently reported in [27]. Unfortunately, the results of [27] do not extend to other discrete alphabets, and the bounds are not always tight.

²The proof of the direct part of the coding theorem requires in certain cases more stringent assumptions [11].
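For reference, the following sketch shows one way such a Monte Carlo approximation of $I_{\text{i.i.d.}}$ can be organized for binary symbols and short ISI: a forward sum-product recursion over the $2^{L-1}$ ISI states, in the spirit of later simulation-based methods (an illustration under stated assumptions, not the specific procedure of [25]; the helper name is ours). It estimates the differential entropy rate of the output as $-(1/N)\ln p(y^N)$ on a long simulated record and subtracts the noise entropy rate:

```python
import itertools
import math
import numpy as np

def mc_rate_iid_binary(h, sigma, N=100_000, seed=1):
    """Monte Carlo estimate (nats/symbol) of I_iid for i.i.d. equiprobable
    +/-1 inputs over the causal ISI channel (1) with L = len(h) taps.
    Complexity grows as 2^(L-1), so this is practical only for short ISI."""
    rng = np.random.default_rng(seed)
    h = np.asarray(h, dtype=float)
    L = len(h)
    x = rng.choice([-1.0, 1.0], size=N + L - 1)
    y = np.convolve(x, h, mode="valid") + rng.normal(0.0, sigma, size=N)

    states = list(itertools.product([-1.0, 1.0], repeat=L - 1))  # past symbols
    alpha = {s: 1.0 / len(states) for s in states}               # i.i.d. prior
    log_p = 0.0
    c = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    for yk in y:
        nxt = dict.fromkeys(states, 0.0)
        for s, a in alpha.items():
            for xk in (-1.0, 1.0):
                mean = h[0] * xk + float(np.dot(h[1:], s))
                lik = c * math.exp(-(yk - mean) ** 2 / (2.0 * sigma ** 2))
                nxt[((xk,) + s)[:L - 1]] += 0.5 * a * lik
        z = sum(nxt.values())                  # p(y_k | y^{k-1})
        log_p += math.log(z)
        alpha = {s: v / z for s, v in nxt.items()}
    h_y = -log_p / N                           # output entropy rate estimate
    h_n = 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)
    return h_y - h_n                           # I = h(y) - h(n) per symbol

# Duobinary example at P_M / sigma^2 = 0 dB:
# print(mc_rate_iid_binary([2 ** -0.5, 2 ** -0.5], 1.0))
```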

Orthogonalizing transformations [10], [20], [28] are applicable only in cases where the constraints imposed on the channel inputs $\{x_l\}$ can be translated into another set of constraints imposed on $\{\tilde{x}_l\}$, where $x^N = \Psi \tilde{x}^N$ and where $\Psi$ is an $N \times N$ shaping matrix which orthogonalizes the channel. This is easily done for block average power constraints [1], but is a subtle problem for other sets of constraints (i.e., peak limit, equispaced discrete symbols). The Tomlinson precoding approach [17], [19], [29] introduces similar obstacles, since it is in general unknown how to characterize the inputs of the Tomlinson precoder such that the outputs $\{x_l\}$ (which form the channel inputs) will satisfy a given set of constraints.

Upper bounds on either C, $I$, or $I_{\text{i.i.d.}}$ are found by replacing the actual channel symbols with Gaussian symbols having the same second-order moments and using "water-pouring" arguments whenever needed [8], [10]. In certain cases where continuous-time constraints of the peak power [15] or constant magnitude (envelope) [30]³ type are imposed, improvement on the Gaussian based upper bounds was achieved. Unfortunately, it seems that no other general bounding techniques, applicable for arbitrary symbol distributions, either continuous or discrete, are available.

The information rate $I_{\text{i.i.d.}}$, which corresponds to i.i.d. channel symbols $\{x_l\}$ with a given arbitrary symbol distribution $p_x(x)$, evidently forms a lower bound on C in the case when i.i.d. symbols, the distribution of which is governed by $p_x(a)$, are permissible but are not necessarily the optimal, capacity achieving selection. Nevertheless, the information rate $I_{\text{i.i.d.}}$ also deserves attention for its own sake [1], [25], mainly since most practical coding schemes [31] approximate closely the statistics of random codes, that is, i.i.d. inputs $\{x_l\}$ which are not necessarily uniformly distributed [4], [10], [31]. In a variety of interesting cases, however, specializing to i.i.d. input symbols does limit generality; therefore the information rate $I$ with the independence restriction relaxed is also addressed. Upper bounds on $I$, maximized over all individual symbol distributions $p_x(a)$ under the relevant constraints, yield corresponding upper bounds on the capacity C.

In this paper, we derive lower and upper bounds on $I_{\text{i.i.d.}}$ and upper bounds on $I$, formulated in terms of the average mutual information for ISI-less (memoryless) scalar channels with i.i.d. inputs governed by the same symbol distribution $p_x(a)$. These are easily evaluated

³Also see S. Shamai (Shitz) and I. Bar-David, "On the capacity penalty due to input-bandwidth restrictions with application to rate-limited binary signalling," IEEE Trans. Inform. Theory, vol. 36, pp. 623-627, May 1990.


either numerically [32] or bounded once again to give closed form expressions, using techniques which are mainly applicable for scalar memoryless channels (see [33] for an example). The simple upper bounds based on Gaussian input symbols are also mentioned.

Though we have specialized here to the real valued ISI channel, the main results reported carry over, with mainly notational changes⁴, to the complex ISI channel for which $\{x_i\}$, $\{n_i\}$, and $\{h_i\}$ are complex valued. The complex representation is adapted to describe passband systems with quadrature amplitude modulation (QAM).

⁴Conjugate transpose and complex norms are introduced wherever needed.

The lower and upper bounds on $I_{\text{i.i.d.}}$ and $I$ are formulated in the next section. In Section III, the bounds on $I_{\text{i.i.d.}}$ are calculated for independent equiprobable binary channel symbols and for causal channels with ISI memory of degree one and two ($h_i \neq 0$ for $i = 0,1$ and $i = 0,1,2$, respectively). The bounds are compared with the approximated value of $I_{\text{i.i.d.}}$ calculated in [25] using Monte Carlo techniques, and their tightness is addressed. The lower bound on $I_{\text{i.i.d.}}$ is also applied to the continuous-time strictly bandlimited and bandpass channels with inputs constrained to be peak power limited (PPL) [9] or simultaneously band and peak power limited (BPPL) [15]. Improved lower bounds, especially at low regions of the signal-to-noise ratio, over those previously reported [9], [14], [15], are found by incorporating the optimized discrete symbol distribution in the lower bound derived here. The paper concludes with a discussion. Appendix A includes the proofs, and in Appendix B some upper bounds on $I_{\text{i.i.d.}}$ are presented in addition to those appearing in Section II.

II. BOUNDS ON THE INFORMATION RATES

In this section, we present lower and upper bounds on $I_{\text{i.i.d.}} = I$ (3) evaluated for i.i.d. channel input symbols, that is, $p_{x^N}(a^N) = \prod_{l=1}^{N} p_x(a_l)$, where $p_x(a)$ is an arbitrary, either discrete or continuous, known probability function. An upper bound on $I$ (3) for identically distributed symbols (any individual symbol is governed by the probability function $p_x(a)$), not necessarily independent, is also derived. The bounds are formulated in terms of average mutual information values for scalar memoryless Gaussian channels with i.i.d. inputs.

A. Lower Bounds

The following theorem, proven in Appendix A, establishes a lower bound on $I_{\text{i.i.d.}}$ (3).

Theorem 1: A lower bound on $I_{\text{i.i.d.}}$, denoted by $I_L$, is specified by

$$I_{\text{i.i.d.}} \geq I_L, \qquad (5)$$

where

$$I_L = I(\rho x + \nu; x), \qquad (6)$$

with x being a random variable with the probability function $p_x(a)$ and ν being a zero-mean Gaussian random variable with the same distribution as that of $n_k$ in (1) (variance $\sigma^2$). The degradation factor ρ equals

$$\rho = \exp\left\{\frac{1}{2\pi}\int_0^{\pi} \ln |H(\lambda)|^2 \, d\lambda\right\}, \qquad (7)$$

where

$$H(\lambda) = \sum_{l=-\infty}^{\infty} h_l \exp(-il\lambda), \qquad i = \sqrt{-1}, \qquad (8)$$

is the ISI "transfer" function having a 2π period.

The lower bound $I_L$ is given in terms of the average mutual information of a scalar memoryless channel with input x having the same probability function as the original $x_i$ and output $\rho x + \nu$, where ν is a Gaussian random variable with the same distribution as that of $n_k$ in (1) (variance $\sigma^2$). The factor $\rho^2$ is, therefore, interpreted as a power degradation factor that arises due to the memory introduced by the ISI coefficients $\{h_i\}$.

It is realized, using a classical result of spectral factorization theory [3, Section 8.6], [20], that ρ equals exactly the leading (zero) coefficient of the discrete-time ISI channel at the output of the feed-forward filter that yields an equivalent representation of the channel in (1) having only causal ISI coefficients.⁵ If this is already the case, that is, $h_l = 0$ for $l < 0$, and the discrete ISI channel is minimum phase, i.e., the channel in (1) can be interpreted as modeling the output of a sample-whitened matched filter [7], then $\rho = |h_0|$. The lower bound (6) is interpreted, therefore, as the average mutual information that corresponds to the ideal decision feedback equalizer (DFE) [3] with errorless past decisions, which are used to fully neutralize the causal ISI effect [19], [20], [29], [34]. Note, however, that no assumptions of errorless past decisions were incorporated in the derivation of the lower bound $I_L$ (see Appendix A).

For no ISI, that is, only $h_0 \neq 0$, $\rho = |h_0|$ as it should be; in this case the bound is exact, $I_L = I_{\text{i.i.d.}}$. Note that no restriction whatsoever was imposed on $p_x(a)$, making the results applicable to a wide class of problems, as is further discussed and demonstrated in the next section.
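As a numerical illustration (added here; the channel below is a hypothetical minimum-phase example, not one from the paper, and the helper name is ours), (7) can be evaluated on a grid and checked against $\rho = |h_0|$:

```python
import numpy as np

def degradation_factor(h, n=200_000):
    """Evaluate (7), rho = exp{(1/2pi) * int_0^pi ln|H(lam)|^2 dlam}, by the
    midpoint rule (which also tolerates integrable log-singularities at
    isolated zeros of H)."""
    lam = (np.arange(n) + 0.5) * np.pi / n
    H = sum(hl * np.exp(-1j * l * lam) for l, hl in enumerate(h))
    return float(np.exp(np.mean(np.log(np.abs(H) ** 2)) / 2.0))

# For a causal minimum-phase channel rho = |h_0|; the zeros of this example's
# transfer polynomial lie inside the unit circle.
h = [0.9, 0.3, 0.1]
print(degradation_factor(h), abs(h[0]))    # ~0.9  0.9
```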

Tightening of the bound $I_L$ and a comparison with the "interleaved" straightforward lower bound are discussed in Appendix A.

B. Upper Bounds - i.i.d. Symbols

Several upper bounds on $I_{\text{i.i.d.}}$, where the input symbols are governed by the probability function $p_x(a)$, are summarized in Theorem 2 and Lemma 1, which are proved in Appendix A. Additional upper bounds for this case appear in Appendix B.

⁵It is assumed that the Z-transform of the causal ISI coefficients at the output of the feed-forward filter has no zeros at the origin. In this case ρ is interpreted also as the exponent of the null coefficient of the complex cepstrum associated with these causal ISI coefficients; see A. V. Oppenheim and R. W. Schafer, Digital Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1975, ch. 10.

Theorem 2 (Matched Filter Bound): For i.i.d. symbols, $p_{x^N}(a^N) = \prod_{l=1}^{N} p_x(a_l)$,

$$I_{\text{i.i.d.}} \leq I_{UM} = I(\|h\|\, x + \nu; x), \qquad (9)$$

where x is a random variable with the probability function $p_x(a)$, ν is a zero-mean Gaussian variable with the same distribution as that of $n_k$ in (1) (variance $\sigma^2$), and the power enhancement factor is the squared norm

$$\|h\|^2 = \sum_{l} h_l^2. \qquad (10)$$

The notion "matched filter bound" stems from the fact, evidenced in Appendix A, that $I_{UM}$ corresponds to a single shot transmission, meaning that only one symbol is transmitted. For uncoded communication this assumption leads to the matched filter lower bound on error probability [3]. Again, the upper bound $I_{UM}$ (9) is formulated in terms of the mutual information of a memoryless channel with i.i.d. inputs, where $\|h\|^2$ (10) takes on the interpretation of a power enhancement factor, as opposed to the power degradation factor $\rho^2 \leq \|h\|^2$ of (7) appearing in the lower bound $I_L$.

The Gaussian upper bound $I_{UG}$, to follow, results immediately by invoking standard arguments (see Appendix A), and it is stated in the following lemma.

Lemma 1 (Gaussian Bound):

$$I_{\text{i.i.d.}} \leq I_{UG} = \frac{1}{2\pi} \int_0^{\pi} \ln\left(1 + (P_A/\sigma^2)\,|H(\lambda)|^2\right) d\lambda, \qquad (11)$$

where $P_A = E|x|^2$. This upper bound on $I_{\text{i.i.d.}}$ equals the mutual information $I_{UG} = \lim_{N\to\infty} (1/N)\, I(y^N; x^N)$ in the case where $\{x_i\}$ are assumed to be i.i.d. Gaussian random variables with variance (average power) $P_A$. For symmetric binary symbols this kind of bound was mentioned and used in [25].
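A minimal numerical sketch of (11) (an added illustration; any finite ISI vector and SNR may be substituted, and the helper name is ours):

```python
import numpy as np

def gaussian_upper_bound(h, snr, n=100_000):
    """Evaluate (11): I_UG = (1/2pi) int_0^pi ln(1 + snr |H(lam)|^2) dlam,
    in nats per channel use, with snr = P_A / sigma^2 (midpoint rule)."""
    lam = (np.arange(n) + 0.5) * np.pi / n
    H2 = np.abs(sum(hl * np.exp(-1j * l * lam) for l, hl in enumerate(h))) ** 2
    return float(np.mean(np.log1p(snr * H2)) / 2.0)

# Duobinary channel at P_A / sigma^2 = 1:
print(gaussian_upper_bound([2 ** -0.5, 2 ** -0.5], 1.0))   # ~0.31 nats
```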

Two additional upper bounds, $I_{U1(\text{i.i.d.})}$ and $I_{U2(\text{i.i.d.})}$, are stated, respectively, in Lemmas B1 and B2 in Appendix B.

C. Upper Bounds - Identically Distributed Symbols

We now relax the independence demand and assume that the symbols $\{x_i\}$, which are not necessarily i.i.d., are identically distributed, where each symbol is governed by the probability function $p_x(a)$. The upper bounds stated in Theorem 3 and Lemma 2 are proved in Appendix A.

Theorem 3 (Maximal Gain Bound):

$$I = \lim_{N\to\infty}\frac{1}{N}\, I(y^N; x^N) \leq I_{U\xi} = I(\xi x + \nu; x), \qquad (12)$$

where x is a random variable governed by the probability function $p_x(a)$, ν is a zero-mean Gaussian variable with the same distribution as that of $n_k$ in (1) (variance $\sigma^2$), and the power enhancement factor is

$$\xi = \max_{0 \leq \lambda \leq \pi} |H(\lambda)|, \qquad (13)$$

where $H(\lambda)$ is given by (8). The enhancement factor ξ is interpreted as the maximal gain of the ISI "transfer function" $H(\lambda)$. Note that $\|h\| \leq \xi \leq \sum_{l=-\infty}^{\infty} |h_l|$.
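A quick grid evaluation of (13) (illustration only, same hypothetical channel as before) also confirms the ordering $\|h\| \leq \xi \leq \sum_l |h_l|$ noted above:

```python
import numpy as np

h = np.array([0.9, 0.3, 0.1])                    # hypothetical ISI vector
lam = np.linspace(0.0, np.pi, 100_001)
xi = np.abs(sum(hl * np.exp(-1j * l * lam) for l, hl in enumerate(h))).max()
# Here all taps are positive, so the maximal gain (13) is attained at lam = 0
# and equals sum|h_l|; in general xi <= sum|h_l|.
print(np.linalg.norm(h), xi, np.sum(np.abs(h)))  # ||h|| <= xi <= sum|h_l|
```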

The Gaussian-based upper bound $I_{UGC}$, stated next, is specified by the average mutual information over this channel, taking $\{x_l\}$ to be Gaussian symbols with the same correlation as that corresponding to the actual symbols.

Lemma 2 (Gaussian Information Bound):

$$I = \lim_{N\to\infty}\frac{1}{N}\, I(y^N; x^N) \leq I_{UGC} = \frac{1}{2\pi}\int_0^{\pi} \ln\left(1 + \frac{S_x(\lambda)}{\sigma^2}\,|H(\lambda)|^2\right) d\lambda, \qquad (14)$$

where $H(\lambda)$ is given by (8) and where

$$S_x(\lambda) = \sum_{l=-\infty}^{\infty} r_x(l)\, e^{il\lambda}, \qquad i = \sqrt{-1}, \qquad (15)$$

stands for the discrete power spectral density of the sequence $\{x_l\}$, for which $r_x(l) = E(x_{k+l}\, x_k)$ denotes the correlation coefficients. For i.i.d. symbols the bound $I_{UGC}$ (14) reduces to $I_{UG}$ (11). Clearly, $I_{UGC} \leq C_A$, with

$$C_A = \frac{1}{2\pi}\int_0^{\pi} \ln\left[\max\left(\Theta\,|H(\lambda)|^2,\, 1\right)\right] d\lambda, \qquad (16)$$

where Θ is the solution of

$$\int_0^{\pi} \max\left(\Theta - |H(\lambda)|^{-2},\, 0\right) d\lambda = \pi P_A/\sigma^2. \qquad (17)$$

The value $C_A$ is interpreted as the capacity under average power constraints [1] that results by maximizing (14) over all $S_x(\lambda)$ that satisfy a symbol average power constraint, that is, $r_x(0) = E(x^2) = (1/\pi)\int_0^{\pi} S_x(\lambda)\, d\lambda = P_A$.

First, note that for no ISI, $\|h\| = |h_0| = \xi$, and i.i.d. symbols, we have that $I = I_{UM} = I_{U\xi}$, while $I \leq I_{UG} = I_{UGC} = C_A$. For Gaussian symbols $\{x_l\}$ with correlation coefficients $r_x(l)$, and ISI present, we have $I = I_{UGC}$.

Assume now that $\{x_l\}$ are i.i.d. and each symbol x is a discrete symmetrically distributed random variable that takes on N possible values and satisfies $E(|x|^2) = P_A$. It is clear that for $P_A/\sigma^2 \to \infty$ both $I_{\text{i.i.d.}}$ and $I_{UM}$, $I_{U\xi}$ approach the entropy of the discrete random variable x, $\mathcal{H}(x)$, where $\mathcal{H}$ stands for the standard entropy function [10], resembling thus the correct behaviour of $I_{\text{i.i.d.}}$, while $I_{UG} \to \infty$. For low SNR ($P_A/\sigma^2 \to 0$), it can be shown, in a similar way to that used in [25] for binary symbols, that $I_{\text{i.i.d.}},\, I_{UM} \to (1/2)\|h\|^2 P_A/\sigma^2$; see also [35] for similar arguments.

In Appendix B, another two upper bounds on $I_{\text{i.i.d.}}$, denoted by $I_{U1(\text{i.i.d.})}$ (B.1) and $I_{U2(\text{i.i.d.})}$ (B.4), are derived. These bounds may turn out, for certain cases, to be tighter than $I_{UM}$ (9) and $I_{UG}$ (11) presented here; see further discussion in Appendix B.

III. APPLICATIONS

We apply here several of the bounds presented in the previous section to some interesting examples. In Section III-A, we address the binary symmetric case, that is, $\{x_l\}$ are i.i.d. binary symmetrically distributed [25] symbols with causal minimum phase ISI, the memory order of which is L - 1, that is, $h_l = 0$ for $l < 0$ and $l \geq L$. In particular, we examine the cases of L = 2 and L = 3.

In Section III-B, we specialize on lower bounds for the continuous-time bandlimited baseband channel with either a peak power limit (PPL) [9] or simultaneous bandwidth limit and PPL (BPPL) [14], [15] constraints imposed on the continuous-time channel input signal. The relevant results for the bandpass case are also mentioned.

A. Binary Symmetric Symbols with Causal Finite Minimum Phase ISI

Consider the binary symmetric case, that is, $x_l$ are i.i.d. binary symbols taking on the values $\pm\sqrt{P_M}$ with equal probability 1/2. This is an interesting application, since in several communication problems the transmitter is restricted to use only binary alphabets [9], [10], [16], [18], [25], [27], [32]. We specialize here to the causal minimum phase ISI representation, as is the case at the output of the sample-whitened matched filter (or the feed-forward part of the DFE equalizer [3]), and assume that the ISI memory is of degree L - 1, that is, $h_l = 0$ for $l < 0$ and $l \geq L$.

The lower bound

$$I_L = C_b\left(h_0^2\, P_M/\sigma^2\right), \qquad (18a)$$

and the upper bound

$$I_{UM} = C_b\left(\|h\|^2\, P_M/\sigma^2\right), \qquad (18b)$$

are given in terms of $C_b(R) = I(\sqrt{R}\,a + \beta; a)$, the capacity of a Gaussian scalar channel with binary inputs, where a is a binary random variable taking on the values ±1 with equal probability 1/2 and β is a normalized Gaussian random variable. The argument R is, therefore, interpreted as the signal-to-noise ratio. The notation $C_b(R)$ is used since it is actually the capacity of the memoryless Gaussian channel with binary inputs, and it is determined by a single integral [10, Problem 4.22], [25, (4.14)],

$$C_b(R) = R - \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-y^2/2}\,\ln\cosh\left(R + y\sqrt{R}\right) dy. \qquad (19)$$

Equivalent forms appear in [4, p. 153] and [32, p. 274]. The function $C_b(R)$ has been evaluated numerically, for example, in [25] and [32], while closed form bounds [33] are further discussed in the next section.
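For instance, $C_b(R)$ is readily evaluated by Gauss-Hermite quadrature (a sketch added here, assuming the single-integral form of (19) given above; the function name is ours):

```python
import numpy as np

def C_b(R, nodes=200):
    """Capacity (nats) of the memoryless Gaussian channel with equiprobable
    +/-1 inputs at SNR R, via (19) and Gauss-Hermite quadrature."""
    t, w = np.polynomial.hermite.hermgauss(nodes)
    u = R + np.sqrt(R) * np.sqrt(2.0) * t          # R + sqrt(R) y, y ~ N(0,1)
    lncosh = np.logaddexp(u, -u) - np.log(2.0)     # numerically stable ln cosh
    return R - float(np.dot(w, lncosh)) / np.sqrt(np.pi)

# Sanity checks: C_b(R) ~ R/2 as R -> 0 and -> ln 2 as R -> infinity.
print(C_b(0.01), C_b(100.0), np.log(2.0))
```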

For the sake of simplicity, we turn now to the case of only two nonzero ISI coefficients (ISI memory of degree 1), having the values $h_0 = (1+\alpha^2)^{-1/2}$, $h_1 = \alpha(1+\alpha^2)^{-1/2}$, and $h_i = 0$ for $i \neq 0, 1$. We use here the convenient normalization [25] $\|h\|^2 = h_0^2 + h_1^2 = 1$. The parameter $-1 \leq \alpha \leq 1$ determines the amount of ISI present; α = 0 corresponds to no ISI, while α = +1 (-1) corresponds to the duobinary (dicode) case [3], [5] that yields the maximum ISI possible with memory of degree 1. This very simple model is important in some practical cases encountered in magnetic recording [6], [16], [18], [25], [27].

For this case,

$$I_L = C_b\left(\frac{P_M/\sigma^2}{1+\alpha^2}\right), \qquad (20a)$$

$$I_{UM} = C_b\left(P_M/\sigma^2\right), \qquad (20b)$$

and

$$I_{UG} = \frac{1}{2}\ln\left(1 + P_M/\sigma^2\right) + \frac{1}{2}\ln\left[\frac{1}{2} + \frac{1}{2}\sqrt{1-b^2}\right], \qquad b = \frac{2\alpha}{1+\alpha^2}\cdot\frac{P_M/\sigma^2}{1+P_M/\sigma^2}, \qquad (20c)$$

where $I_{UG}$ follows from (11) by substituting in (8)

$$|H(\lambda)|^2 = h_0^2 + h_1^2 + 2h_0 h_1\cos\lambda = 1 + \frac{2\alpha}{1+\alpha^2}\cos\lambda$$

and using the integral [36, Section 4.224, p. 329]. The upper bounds $I_{U1(\text{i.i.d.})}$ and $I_{U2(\text{i.i.d.})}$, specialized for this binary case (given, respectively, by equations (B.5a) and (B.5b) in Appendix B), were found to be less tight as compared to $\min(I_{UM}, I_{UG})$.

The bounds $I_L$ (20a), $I_{UM}$ (20b), and $I_{UG}$ (20c), in bits/channel use, are shown in Fig. 1 for α = 1 (the duobinary case) versus the signal-to-noise ratio $P_M/\sigma^2$, and are compared to $\hat{I}_{\text{i.i.d.}}$, the approximated value of $I_{\text{i.i.d.}}$ calculated in [25, Fig. 4.4] using Monte Carlo techniques. The bounds $I_L$ and $I_{UM}$ are 3 dB apart, and the matched-filter upper bound $I_{UM}$ (20b) is tighter for low and moderate values of the signal-to-noise ratio $P_M/\sigma^2$, while the lower bound $I_L$ (20a) is found to be tighter for high values of the signal-to-noise ratio. Note that the Gaussian upper bound $I_{UG}$ is remarkably tight for small values of the signal-to-noise ratio, $P_M/\sigma^2 \leq 0$ dB, and it is the preferred upper bound in the region $P_M/\sigma^2 \leq 2.5$ (4 dB).

Fig. 1. Bounds on the information rate $I_{\text{i.i.d.}}$ (bits/channel symbol) for symmetric i.i.d. binary symbols and two equal ISI coefficients $h_0 = h_1 = 1/\sqrt{2}$ (ISI memory of degree one) versus signal-to-noise ratio $P_M/\sigma^2$ (dB). $I_L$: lower bound (20a). $I_{UM}$: matched-filter upper bound (20b). $I_{UG}$: Gaussian upper bound (20c). $\hat{I}_{\text{i.i.d.}}$: approximated value of $I_{\text{i.i.d.}}$ by Monte Carlo techniques [25].
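Using the $C_b$ sketch above, the three duobinary bounds (20a)-(20c) can be tabulated directly (an added illustration; values in bits per channel use):

```python
import numpy as np

for snr_db in (-5.0, 0.0, 5.0, 10.0):
    snr = 10.0 ** (snr_db / 10.0)                  # P_M / sigma^2
    IL = C_b(0.5 * snr)                            # (20a): h_0^2 = 1/2
    IUM = C_b(snr)                                 # (20b): ||h||^2 = 1
    b = snr / (1.0 + snr)                          # 2*alpha/(1+alpha^2) = 1
    IUG = 0.5 * np.log1p(snr) + 0.5 * np.log(0.5 + 0.5 * np.sqrt(1.0 - b * b))
    print(f"{snr_db:5.1f} dB: {IL / np.log(2):.3f} <= I_iid <= "
          f"{min(IUM, IUG) / np.log(2):.3f} bits")
```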

We turn now to examine the ISI case where L = 3 and $h_0 = 1/2$, $h_1 = \sqrt{1/2}$, $h_2 = 1/2$ ($h_0^2 + h_1^2 + h_2^2 = 1$), which was considered also in [25, see Figs. 2.3 and 3.1]. For this channel, $|H(\lambda)|^2 = \left(\cos\lambda + \sqrt{1/2}\right)^2$, and the corresponding bounds

$$I_L = C_b\left(\frac{1}{4}\, P_M/\sigma^2\right), \qquad (21a)$$

$$I_{UM} = C_b\left(P_M/\sigma^2\right), \qquad (21b)$$

$$I_{UG} = \frac{1}{2\pi}\int_0^{\pi}\ln\left[1 + \frac{P_M}{\sigma^2}\left(\cos\lambda + \sqrt{1/2}\right)^2\right] d\lambda, \qquad (21c)$$

are shown in Fig. 2 (in bits/channel use) along with $\hat{I}_{\text{i.i.d.}}$, the Monte Carlo approximated value of $I_{\text{i.i.d.}}$ [25, Fig. 4.6].

Indeed, the bounds are looser in this case with a larger ISI memory (L - 1 = 2) when compared to the previous case with unit (L - 1 = 1) ISI memory. However, the lower bound $I_L$ seems to capture the behavior of $I_{\text{i.i.d.}}$ for large signal-to-noise ratio $P_M/\sigma^2$ values (which is determined basically by $h_0$), while the upper bound is tight for asymptotically low values of $P_M/\sigma^2$, for which $I_{UM}$ (as well as $I_{UG}$) $\to (1/2)\, P_M/\sigma^2$, in agreement with the exact asymptotic behavior of [25, Corollary 4.2]. In midrange values of the signal-to-noise ratio, the bounds $I_L$ and $I_{UM}$ seem not to be tight. This observation is believed to hold in general, and it is further supported by the Gaussian case (i.e., $x_l$ i.i.d. Gaussian), for which $I_{UG} \to (1/2)\, P_M/\sigma^2$ for $P_M/\sigma^2 \to 0$ and $C_A,\, I_L \to (1/2)\ln\left(\rho^2 P_M/\sigma^2\right)$ for $P_M/\sigma^2 \to \infty$. Note that for the Gaussian case and for asymptotically high signal-to-noise ratio $P_M/\sigma^2 \to \infty$, $C_A \to I_L$ [20], [34], evidencing that no loss in capacity under a symbol average power constraint is incurred by using i.i.d. Gaussian inputs. We conjecture that the same holds for non-Gaussian continuous symbols as well.

Fig. 2. Bounds on the information rate $I_{\text{i.i.d.}}$ (bits/channel symbol) for symmetric i.i.d. binary symbols and three ISI coefficients $h_0 = h_2 = 1/2$, $h_1 = \sqrt{1/2}$ (ISI memory of degree two) versus signal-to-noise ratio $P_M/\sigma^2$ (dB). $I_L$: lower bound (21a). $I_{UM}$: matched-filter upper bound (21b). $I_{UG}$: Gaussian upper bound (21c). $\hat{I}_{\text{i.i.d.}}$: approximated value of $I_{\text{i.i.d.}}$ by Monte Carlo techniques [25].

B. Lower Bounds on the Capacity of the Bandlimited Continuous-Time Channel with PPL or BPPL Constraints

We turn our attention to the strictly bandlimited continuous-time channel, for which the channel filter's transfer function D(f) = 1 for $|f| \leq W$ and 0 otherwise. The transmitted channel input, $s(t) = \sum_k x_k\, g(t - kT)$, is taken to be a PAM signal, where g(t) stands for the pulse shape and T is the symbol duration. The signal s(t) is constrained either to be peak power limited to $P_M$ [9] (abbreviated here as the PPL constraint) or to satisfy both a PPL constraint and a strict bandwidth constraint [14], [15], that is, s(t) is of bandwidth no larger than W (these joint constraints are abbreviated by BPPL). We specialize here on lower bounds on the capacity of this channel under the PPL and BPPL constraints.⁶ Following [9], [15], we restrict the signal to the PAM class for the baseband case considered here. The channel symbols are chosen to be i.i.d. digits $x_l$ satisfying the peak constraint $|x_l| \leq \sqrt{P_M}$ (where subscript M stands for maximum). The pulse shape g(t) is rectangular, $g(t) = 1$ for $|t| \leq T/2$ [9], for the PPL constraint, and spectral cosine,⁷ $g(t) = (\pi^2/4)\left[1 - (2t/T)^2\right]^{-1}\cos(\pi t/T)$, for the BPPL case, respectively, while the symbol duration is $T = (2W)^{-1}$ [9], [15]. It has been verified that the signals s(t) so constructed satisfy the respective PPL [9] and BPPL [15] constraints.

⁶For upper bounds on capacity under these constraints, see [30] and [15].

⁷In [14], a spectral triangular pulse $g(t) = \left[(\pi t/T)^{-1}\sin(\pi t/T)\right]^2$ was selected.

The frequency response of the receiver filter is chosen to match $d(t) = \mathcal{F}^{-1}\{D(f)\}$, where $\mathcal{F}$ and $\mathcal{F}^{-1}$ stand for the Fourier transform pair. Evaluating ρ using the calculations reported in [9] and [15] gives $\rho \triangleq \rho_{\text{PPL}} = e/\pi$ and $\rho \triangleq \rho_{\text{BPPL}} = \pi/8$ for the PPL and BPPL cases, respectively. Thus, by Theorem 1 and with proper scaling, the lower bound on the capacity C (per channel symbol of duration $T = (2W)^{-1}$) is

$$C \geq I_L = I(\rho x + \nu; x), \qquad (22)$$

where ρ equals $\rho_{\text{PPL}}$ or $\rho_{\text{BPPL}}$ for the PPL and BPPL constraints, respectively, and where $\sigma^2 = N_0 W$, with $N_0/2$ standing for the two-sided spectral density of the additive white Gaussian noise. The random variable x may take any probability function satisfying $|x| \leq \sqrt{P_M}$; the random variable ν stands, as usual, for a zero-mean Gaussian variable, here of variance $\sigma^2$. In [9], [14], [15], x had to be chosen continuously distributed; otherwise the convolutional inequality of entropy powers [22], upon which the derivations of [9], [14], [15] rely, collapses. Here, free from such restrictions, we chose the channel symbol distribution to maximize the bound in (22). This maximizing distribution is well known and reported in [37]. Denote by $C_\infty(R)$ the capacity derived in [37], that is,

$$C_\infty(R) = \sup_{|a| \leq \sqrt{R}} I(a + \beta; a), \qquad (23)$$

where the supremum is taken over all distributions of the real random variable a satisfying $|a| \leq \sqrt{R}$ and where β is a zero-mean, real, unit-variance ($E(\beta^2) = 1$) Gaussian random variable. The optimized bound $I_{LO}$ for this channel equals the optimized $I_L$ multiplied by 2W (measured in nats per second), and thus takes the form

$$I_{LO} = \begin{cases} 2W\,C_\infty\left[(e/\pi)^2\,P_M/(N_0 W)\right], & \text{PPL constraint},\\ 2W\,C_\infty\left[(\pi/8)^2\,P_M/(N_0 W)\right], & \text{BPPL constraint}. \end{cases} \qquad (24)$$

It has been shown [37] that the distribution of the random variable a in (23) achieving $C_\infty(R)$ is discrete and, further, for $R \lesssim 6.25$ [37, Fig. 3] it is binary symmetric, while for $R \to \infty$ it approaches a uniform distribution. It follows that [37] $C_\infty(R) \to R/2$ for $R \to 0$, while $C_\infty(R) \to \frac{1}{2}\ln\left[2R/(\pi e)\right]$ for $R \to \infty$.

The lower bounds reported here (24) are strictly tighter than those reported in [9], [15], since in [9] and [15] a uniform distribution for x in $[-\sqrt{P_M}, \sqrt{P_M}]$ was applied, and here the optimizing distribution is used. However, the improvement, measured with respect to the signal-to-noise ratio $P_M/(N_0 W)$, decreases from $10\log_{10}(e\pi/2) = 6.3$ dB, achieved for asymptotically low signal-to-noise ratios $P_M/(N_0 W) \to 0$, until it completely vanishes for asymptotically high signal-to-noise ratios $P_M/(N_0 W) \to \infty$.

The bandpass case with either the PPL [9] or BPPL [15] constraints can be treated in a similar manner, since Theorem 1 applies also for the complex case (see Appendix A, Part F). In this case, where QAM signalling is employed, the channel symbols $\{x_l\}$ are i.i.d. complex, satisfying $|x_l| \leq \sqrt{P_M}$, and $\{n_k\}$ stand for i.i.d. complex Gaussian random variables with independent, zero-mean real and imaginary components, each of variance $\sigma^2 = 2N_0 W$. The analysis yields

$$I_{LO} = \begin{cases} 2W\,C_{c\infty}\left[(e/\pi)^2\,P_M/(N_0 W)\right], & \text{bandpass and PPL constraint},\\ 2W\,C_{c\infty}\left[(\pi/8)^2\,P_M/(N_0 W)\right], & \text{bandpass and BPPL constraint}, \end{cases} \qquad (25)$$

where $C_{c\infty}(R)$ stands for the capacity found in [38], which is also defined by (23); however, a is now a complex random variable and β is a zero-mean complex Gaussian random variable with normalized i.i.d. components [$E(\operatorname{Re}\beta)^2 = E(\operatorname{Im}\beta)^2 = 1$, $E((\operatorname{Re}\beta)(\operatorname{Im}\beta)) = 0$].

In [38], it has been proved that the distribution of the complex random variable a achieving $C_{c\infty}(R)$ is uniform in $\arg(a)$ and independently discrete in $|a|$. For $R \lesssim 6$, the constant envelope distribution [38], that is, $|a|$ constant with probability 1, is optimal, while for $R \to \infty$ the optimal distribution approaches the one that is uniform over a disk with radius $\sqrt{R}$. This observation yields, therefore,

$$C_{c\infty}(R) = C_{ce}(R), \quad R \lesssim 6; \qquad C_{c\infty}(R) \to R, \quad R \to 0; \qquad C_{c\infty}(R) \to \ln\left[(2e)^{-1}R + 1\right], \quad R \to \infty,$$

where $C_{ce}(R)$, the constant envelope capacity, is given [39] by

$$C_{ce}(R) = -\int_0^{\infty} \psi(\tau)\,\ln\left(\frac{\psi(\tau)}{\tau}\right) d\tau + \ln\left(\frac{2R}{e}\right),$$

with $\psi(\tau) = 2R\tau\exp\left[-R(1+\tau^2)\right] I_0(2R\tau)$ and where $I_0(\cdot)$ stands for the zero order modified Bessel function. The improvement of the lower bounds $I_{LO}$ (25) on the bounds reported in [9] and [15], which were derived for complex input symbols uniformly distributed over a disk of radius $\sqrt{P_M}$, measured with respect to the signal-to-noise ratio $P_M/(N_0 W)$, decreases from $10\log_{10} 2e = 7.35$ dB, achieved for asymptotically low signal-to-noise ratios $P_M/(N_0 W) \to 0$, until it completely vanishes for asymptotically high signal-to-noise ratios $P_M/(N_0 W) \to \infty$.
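At low signal-to-noise ratios, the baseband bounds (24) are directly computable from the $C_b$ sketch given earlier, since for arguments below roughly 6.25 the optimal peak-limited scalar distribution quoted from [37] is binary symmetric, so $C_\infty(R) = C_b(R)$ there (an added illustration; W is the bandwidth in Hz, here arbitrary):

```python
import numpy as np

W = 1.0                                            # bandwidth (Hz), arbitrary
for snr_db in (-10.0, -5.0, 0.0):
    R = 10.0 ** (snr_db / 10.0)                    # P_M / (N0 W)
    ilo_ppl = 2.0 * W * C_b((np.e / np.pi) ** 2 * R)     # rho_PPL  = e / pi
    ilo_bppl = 2.0 * W * C_b((np.pi / 8.0) ** 2 * R)     # rho_BPPL = pi / 8
    print(f"{snr_db:5.1f} dB: I_LO = {ilo_ppl:.4f} (PPL), "
          f"{ilo_bppl:.4f} (BPPL) nats/s")
```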

IV. DISCUSSION AND CONCLUSION

We focus here on the achievable information rates for the classical discrete-time Gaussian channel with ISI present and with identically distributed, not necessarily


Gaussian, symbols. Lower and upper bounds on $I_{\text{i.i.d.}}$ (the information rate for i.i.d. but otherwise arbitrary channel input symbols) as well as upper bounds on $I$ (the information rate for identically distributed input symbols, not necessarily independent) are derived. The bounds are formulated in terms of the average mutual information between the output and input of a scalar memoryless Gaussian channel. This formulation enables a unified treatment of discrete as well as continuous channel symbol distributions using the same underlying framework. These bounds are therefore easily calculated, either analytically or bounded again by applying the extensive results and techniques developed for memoryless channels.

To demonstrate this, we turn back to Section III-A, where binary symbols are considered, and note that $C_b(R)$, which is expressed in an integral form (19), can be further lower bounded as in (26a)-(26c), where $h_b(a)$ is the binary entropy function [10]. Equations (26a) and (26b) were taken from [33] (where N = 2 in the notation of [33] was substituted), while (26c) is the cut-off rate for the binary case [4, p. 142],

$$C_b(R) \geq \ln 2 - \ln\left(1 + e^{-R/2}\right), \qquad (26c)$$

which clearly lower bounds $C_b(R)$. Evident upper bounds on $I_{\text{i.i.d.}}$ and $I$ based on a Gaussian assumption are also mentioned.

The lower bound on $I_{\text{i.i.d.}}$ (6) can be used to lower bound the capacity of the dispersive (ISI) Gaussian channel under a variety of constraints imposed on the input symbols which do not preclude the use of i.i.d. symbols. Upper bounds on capacity are constructed by supremizing the upper bounds on $I$ over the relevant constraints induced on an individual input symbol.

Incorporating the convolutional inequality of entropy powers [22]⁸ with the lower bound $I_L$ (6) reduces exactly to the lower bounds derived using the standard technique described in detail in [9].

Assuming a causal ISI channel (as observed, for example, at the output of a sample-whitened matched filter or a feedforward equalizer), the lower bound in Theorem 1 is interpreted as the average mutual information of a zero-forcing decision-feedback equalizer having ideal errorless feedback decisions. Note, however, that the errorless past decision assumption has not been employed here to derive this lower bound. We conclude, therefore, that as far as the average mutual information $I_{\text{i.i.d.}}$ is concerned, ignoring the information carried by the rest of the ISI coefficients $\{h_i,\, i > 0\}$ overcompensates for the optimistic assumption of errorless past decisions, yielding thus an overall lower bound $I_L$ on $I_{\text{i.i.d.}}$. Indeed, this lower bound seems to capture the exact asymptotic behavior of $I_{\text{i.i.d.}}$ for high signal-to-noise ratio values and continuous inputs, as has been demonstrated by the examples in Section III and by the Gaussian case, for which $I_{\text{i.i.d.}}$ is given by (11).

Other approaches which lead to memoryless channel representations, such as the orthogonalization method [10], [20], [28] and the Tomlinson nonlinear filter [29], cannot be directly used, due to the difficulties in translating the constraints imposed on $\{x_l\}$, the channel inputs, into a corresponding set of constraints imposed on $\{\tilde{x}_l\}$, the inputs of the resultant memoryless channels. This translation is straightforward for block average power constraints. For example, if $\{x_l\}$, the outputs of a Tomlinson filter, are demanded to be i.i.d. with a given probability function, it is not at all clear how to restrict $\{\tilde{x}_l\}$, the inputs of the Tomlinson filter, to satisfy this demand. For the special case of uniformly distributed (within the extreme levels) i.i.d. $\{\tilde{x}_l\}$, it is readily verified that the outputs $\{x_l\}$ are also uniformly distributed i.i.d. random variables. Also, in this special case, the bound in Theorem 1 is superior to the Tomlinson based bound, and that is due to the information destroying modulo operation at the Tomlinson receiver. The information loss incurred by the modulo operation diminishes with increasing signal-to-noise ratio.

⁸Whenever the channel symbols are continuous random variables.

The matched filter upper bound $I_{UM}$ (9) shows that, under a given average power constraint at the channel output, that is, with $\|h\|^2$ kept constant, ISI cannot improve the information rate $I_{\text{i.i.d.}}$ over that of an ISI-less channel (that is, $h = h_0$). This is attributed mainly to the fact that the symbols $\{x_i\}$ were chosen i.i.d., as is also concluded in [25] for the binary and Gaussian cases. This feature is not necessarily true if optimal statistical dependence (induced by the capacity achieving statistics) is introduced into the channel symbols, as has been demonstrated for Gaussian symbols in [1]. This is clearly evidenced by the upper bound $I_{U\xi}$ (12) on $I$, which shows that the increase in the information rate cannot exceed the corresponding information rate for an ISI-less channel with $h_0$ taken to be the maximal value of the ISI "transfer" function, that is, $h_0 = \max_{0\leq\lambda\leq\pi}|H(\lambda)|$ (13). It was concluded (see also [25] for the binary case) that $I_{UM}$ is an asymptotically tight bound on $I_{\text{i.i.d.}}$ for signal-to-noise ratios approaching zero.

Since $I_{UM}$, $I_{U\xi}$, $I_{UG}$ are formulated in terms of the mutual information of an ISI-less (memoryless) scalar channel, with a power enhancement factor for $I_{UM}$, $I_{U\xi}$ and a power degradation factor for $I_L$, we conclude that, if i.i.d. channel symbols are permissible, the introduction of ISI does not drastically modify the underlying functional dependence of $I$ on a properly defined measure of signal-to-noise ratio.

The application of the bounds was demonstrated for i.i.d. binary symmetric symbols and channels with ISI memory of degree 1 and 2, that is, a two or three component ISI vector. The bounds were compared to $\hat{I}_{\text{i.i.d.}}$, the Monte Carlo based [25] approximation of the exact value of $I_{\text{i.i.d.}}$, and the asymptotic tightness of $I_L$ and of $I_{UM}$, $I_{UG}$, for respectively high and low values of the signal-to-noise ratio, has been verified.

By using the lower bounding technique $I_L$ (6) in Theorem 1, we have been able to improve on the previously known lower bounds [9], [14], [15] for the capacity of a continuous-time strictly bandlimited or bandpass Gaussian channel with either peak power limiting (PPL) or simultaneously band and peak power limiting (BPPL) imposed on the channel input waveform. This relative improvement, which increases as the signal-to-noise ratio diminishes, is attributed to the possibility of incorporating here the optimized discrete symbol distribution that maximizes the lower bound $I_L$ (6). This lower bound (6) has recently been used to derive lower bounds on the capacity of the peak- and slope-limited magnetization model with binary signalling [40].

For the sake of simplicity, the results were specialized to real Gaussian channels; however, the techniques used here can be extended to account for complex Gaussian channels describing passband systems. The same basic structure of the bounds, as compared to those appearing in Section II, is maintained.

We have specialized here to discrete-time ISI channels and mentioned that these are well adapted to characterize PAM and QAM signaling in additive Gaussian noise. The process of translating the continuous waveform channel to the discrete-time channel has not been explicitly addressed; rather, a few alternatives, such as the matched filter [4] or the sample-whitened matched filter [7], were mentioned. Other alternatives of linear prefiltering, such as the minimum mean-square linear equalizer combined with matched sampling [8], which are also modeled by the discrete-time ISI channel (1), may in certain cases prove advantageous.

ACKNOWLEDGMENT

The authors are grateful to A. Dembo for interesting discussions and to an anonymous reviewer for his careful reading of the manuscript and useful suggestions.

APPENDIX A
PROOFS

In this appendix, we prove Theorems 1, 2, and 3 and Lemmas 1 and 2, which appear in Section II. We assume here that the symbols $x_l$ and the Gaussian noise samples $n_l$ are i.i.d. real random variables. Extension to the complex case is briefly discussed at the end of this appendix. We further assume a nonsingular channel, that is, $H^N$ is a nonsingular matrix. The assumption incurs no loss of generality when optimal filters (i.e., the sample-whitened matched filter) are employed, since the resultant $H^N$ is lower triangular, being invertible as $h_0 > 0$. Nevertheless, in the context of this paper, it is only a technical assumption that permits simple proofs. All the results are still valid provided $|H(\lambda)|$ is integrable, which is guaranteed since $|H(\lambda)|^2$ was assumed integrable (finite power). Note, however, that if $H(\lambda)$ equals zero over a region (not isolated zeros), then ρ = 0 (7), yielding thus a trivial lower bound in Theorem 1.

In the proofs of Theorems 1 and 2 and Lemma 1, it is assumed that $\{x_l\}$ is an i.i.d. sequence, while this assumption is relaxed in the proofs of Theorem 3 and Lemma 2.

Proof of Theorem 1: Since $H^N$ is nonsingular,

$$I(y^N; x^N) = I(z^N; x^N), \qquad (A.1)$$

where

$$z^N = x^N + m^N, \qquad (A.2)$$

with $z^N = (H^N)^{-1} y^N$, and $m^N = (H^N)^{-1} n^N$ is a Gaussian vector with correlation matrix $\Gamma_m^N = E[m^N (m^N)^T] = \sigma^2 (H^N)^{-1}(H^N)^{-T}$. The function $\mathcal{H}_d(\cdot)$ stands here for the differential entropy [10]. The chain rule [10, ch. 2] yields

$$\frac{1}{N}\, I(z^N; x^N) = \frac{1}{N}\sum_{l=0}^{N-1}\left[\mathcal{H}_d(z_l \mid z^{l-1}) - \mathcal{H}_d(m_l \mid m^{l-1})\right]. \qquad (A.3)$$

Conditioning, which does not increase the differential entropy [10, ch. 2], gives

$$\mathcal{H}_d(x_l + m_l \mid z^{l-1}) \geq \mathcal{H}_d(x_l + m_l \mid z^{l-1}, x^{l-1}) = \mathcal{H}_d(x_l + m_l \mid x^{l-1}, m^{l-1}), \qquad (A.4)$$

where the right-hand-side equality in (A.4) follows since $z^{l-1} = x^{l-1} + m^{l-1}$ (A.2). Express $m_l$ by

$$m_l = E(m_l \mid m^{l-1}) + \beta_l, \qquad (A.5)$$

where $\beta_l$ is an innovation Gaussian random variable statistically independent of the Gaussian vector $m^{l-1}$. The function $E(m_l \mid m^{l-1})$, denoting conditional expectation, is a linear function of $m^{l-1}$, since the random variables involved are jointly Gaussian. Now, since $x_l$, $m^{l-1}$, $x^{l-1}$, and $\beta_l$ are all statistically independent, by (A.4), (A.5),

$$\mathcal{H}_d(x_l + m_l \mid z^{l-1}) \geq \mathcal{H}_d(x_l + m_l \mid x^{l-1}, m^{l-1}) = \mathcal{H}_d(x_l + \beta_l \mid m^{l-1}) = \mathcal{H}_d(x_l + \beta_l). \qquad (A.6)$$

Using the entropy chain rule and (A.5) yields

$$\mathcal{H}_d(m^N) = \sum_{l=0}^{N-1} \mathcal{H}_d(m_l \mid m^{l-1}) = \sum_{l=0}^{N-1} \mathcal{H}_d(\beta_l). \qquad (A.7)$$

Inserting now (A.7) and (A.6) into (A.3) and using (A.1) gives

$$\frac{1}{N}\, I(z^N; x^N) \geq \frac{1}{N}\sum_{l=0}^{N-1} I(x_l + \beta_l;\, x_l). \qquad (A.8)$$

The limit innovation variable $\beta = \lim_{l\to\infty}\beta_l$, which takes also the interpretation of the stationary innovation variable in the noise prediction process [34], is a well-defined Gaussian random variable with variance

$$\sigma_\beta^2 = E(\beta^2) = \sigma^2\left[\lim_{N\to\infty}\left(\det H^N\right)^{2/N}\right]^{-1}. \qquad (A.9)$$

Invoking the asymptotic properties of log determinants of Toeplitz matrices [23] specifies $\sigma_\beta^2$ in terms of the ISI "transfer function" $H(\lambda)$ (8), which is given by the discrete Fourier transform of $\{h_l\}$: by (7), $\sigma_\beta^2 = \sigma^2/\rho^2$. The proof of Theorem 1 is completed by following standard arguments [10] (see also [1] for more details) to show that

$$\lim_{N\to\infty}\frac{1}{N}\sum_{l=0}^{N-1} I(x_l + \beta_l;\, x_l) = I(x + \beta;\, x),$$

and then rescaling, $I(x + \beta; x) = I(\rho x + \rho\beta; x)$, letting ρβ, whose variance is $\rho^2\sigma_\beta^2 = \sigma^2$, play the role of ν in (6). □
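The limit in (A.9) is easy to check numerically (added illustration, hypothetical minimum-phase taps): the eigenvalues of the circulant $H^N$ of (2) are $H(2\pi k/N)$, so $(\det H^N)^{2/N}$ converges to $\rho^2$ of (7):

```python
import numpy as np

h = np.array([0.9, 0.3, 0.1])                  # hypothetical minimum-phase taps
sigma = 1.0
for N in (16, 64, 256, 1024):
    lam = 2.0 * np.pi * np.arange(N) / N
    eig = sum(hl * np.exp(-1j * l * lam) for l, hl in enumerate(h))
    det_pow = np.exp(np.mean(np.log(np.abs(eig) ** 2)))   # (det H^N)^(2/N)
    print(N, sigma ** 2 / det_pow)             # -> sigma^2 / rho^2 = 1/0.81
```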

The bound can be further tightened by relaxing the conditioning in (A.4), that is, conditioning on $x^{l-k}$, $k = 2, 3, \ldots$, rather than on $x^{l-1}$ (k = 1). This yields an improvement on the bound $I_L$ by a factor, expressed as a conditional mutual information, the evaluation of which is given in terms of a (k - 1)-fold integral.

Another straightforward lower bound for i.i.d. symbols $\{x_l\}$ results by applying the inequality $\frac{1}{N}\, I(z^N; x^N) \geq I(z; x) = I(x + m; x)$ [10], which is interpreted also as the mutual information achieved by employing ideal interleaving in (A.2), which effectively cancels the correlation present among the components of the vector $m^N$. The resultant bound is given by (6), where ρ in (7) is replaced here by

$$\tilde{\rho} = \left(\sigma_{m}/\sigma\right)^{-1},$$

which follows by noting that

$$\sigma_{m}^2 = E(m^2) = \frac{\sigma^2}{\pi}\int_0^{\pi} |H(\lambda)|^{-2}\, d\lambda.$$

This lower bound is found to be inferior when compared to the one given in Theorem 1, as is evidenced by the Jensen inequality⁹

$$\rho = \left(\exp\left\{\frac{1}{\pi}\int_0^{\pi} \ln|H(\lambda)|^{-2}\, d\lambda\right\}\right)^{-1/2} \geq \left(\frac{1}{\pi}\int_0^{\pi} |H(\lambda)|^{-2}\, d\lambda\right)^{-1/2} = \tilde{\rho}.$$

⁹See [3, ch. 8], where this inequality is stated in the context of the output signal-to-noise ratio superiority of the zero-forcing DFE over the linear zero-forcing equalizer.
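Numerically, the gap between ρ and $\tilde{\rho}$ is easy to see (added illustration, same hypothetical channel as before):

```python
import numpy as np

h = np.array([0.9, 0.3, 0.1])
lam = (np.arange(200_000) + 0.5) * np.pi / 200_000
H2 = np.abs(sum(hl * np.exp(-1j * l * lam) for l, hl in enumerate(h))) ** 2
rho = np.exp(np.mean(np.log(H2)) / 2.0)        # (7), midpoint rule
rho_tilde = float(np.mean(1.0 / H2)) ** -0.5   # interleaved factor
print(rho, rho_tilde)                          # Jensen: rho >= rho_tilde
```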

Proof of Theorem 2: By the chain rule,

$$\frac{1}{N}\, I(y^N; x^N) = \frac{1}{N}\sum_{l=0}^{N-1} I(y^N; x_l \mid x^{l-1}). \qquad (A.10)$$

Since the symbols are i.i.d., conditioning on the future symbols as well can only increase each term, so

$$\frac{1}{N}\, I(y^N; x^N) \leq \frac{1}{N}\sum_{l=0}^{N-1} I(y^N; x_l \mid x^{l-1}, x_{l+1}, \ldots, x_N). \qquad (A.11)$$

For i.i.d. symbols, $I(x^{l-1}, x_{l+1}, \ldots, x_N;\, x_l) = 0$ and, therefore, the conditional mutual information in (A.11) equals $I(y^N, x^{l-1}, x_{l+1}, \ldots, x_N;\, x_l)$, which is evaluated by a straightforward calculation, since the ISI effect of $x^{l-1}, x_{l+1}, \ldots, x_N$ is fully neutralized.¹⁰ In other words, to evaluate $I(y^N; x_l \mid x^{l-1}, x_{l+1}, \ldots, x_N)$ we may use $x_k = 0$ for $k = 0, 1, \ldots, l-1, l+1, l+2, \ldots, N$ in (2), which leaves us with the formulation

$$y_k = h_{k-l}\, x_l + n_k, \qquad k = 0, 1, \ldots, N \gg 1. \qquad (A.12)$$

It can readily be verified that

$$I(y^N; x_l) = I(\hat{y}_l; x_l), \qquad (A.13)$$

where $\hat{y}_l = \sum_{k=0}^{N-1} h_{k-l}\, y_k$ is the maximal ratio combining that arises from the maximum likelihood rule when applied to (A.12). It is clearly seen, using power rescaling, that

$$I(\hat{y}_l; x_l) = I(\|h\|\, x + \nu; x), \qquad (A.14)$$

which, upon substitution in (A.11) and taking the limit for $N \to \infty$, yields Theorem 2. □

¹⁰This is identified as the mutual information with errorless past and future "decisions" that are provided as side information.

Proof of Lemma 1: It is well known that the average mutual information $\frac{1}{N}\, I(y^N; x^N)$ is upper bounded, under a given correlation matrix $E[x^N (x^N)^T]$ constraint, by the value corresponding to a Gaussian distribution of the vector $x^N$ [10]. Thus, in our case, letting $x_i$ be i.i.d. Gaussian random variables with $E(x_i^2) = P_A$ yields $I_{\text{i.i.d.}} = I_{UG}$ (11) [1], which sets the upper bound stated in Lemma 1. □

Proof of Theorem 3: We use (A.1) and (A.2), which stay valid also for non-i.i.d. $\{x_l\}$, the case examined here. Let now

$$m^N = \psi^N + \mu^N, \qquad (A.15)$$

where $\psi^N$ and $\mu^N$ are independent zero-mean Gaussian vectors, with $\psi^N$ having the covariance matrix

$$\Gamma_\psi^N = \Gamma_m^N - \delta I^N.$$

In the above expression, $I^N$ stands for the $N \times N$ unit matrix and δ is a nonnegative scaling factor to be determined. This representation (A.15) is possible if $\Gamma_m^N - \delta I^N$ is a valid (nonnegative definite) covariance matrix.
