
The fundamental Toeplitz eigenvalue distribution theory has an interesting application for characterizing the limiting behavior of determinants. Suppose now that $T_n(f)$ is a sequence of Hermitian Toeplitz matrices such that $f(\lambda) \geq m_f > 0$. Let $C_n = C_n(f)$ denote the sequence of circulant matrices constructed from $f$ as in (4.26). Then from (4.28) the eigenvalues of $C_n$ are $f(2\pi m/n)$ for $m = 0, 1, \ldots, n-1$, and hence $\det C_n = \prod_{m=0}^{n-1} f(2\pi m/n)$. This in turn implies that

$$\ln \left( \det(C_n) \right)^{1/n} = \frac{1}{n} \ln \det C_n = \frac{1}{n} \sum_{m=0}^{n-1} \ln f(2\pi m/n).$$
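To make the eigenvalue and determinant identities concrete, the following sketch (an illustration of mine, not from the text) builds the circulant matrix for an assumed example symbol $f(\lambda) = 5 + 4\cos\lambda \geq 1 > 0$, whose only nonzero Fourier coefficients are $r_0 = 5$ and $r_{\pm 1} = 2$, and checks that its eigenvalues and determinant match the values $f(2\pi m/n)$:

```python
import numpy as np

# Assumed example symbol (not from the text): f(lam) = 5 + 4 cos(lam) >= 1 > 0,
# with Fourier coefficients r_0 = 5, r_{+-1} = 2, all others zero.
def f(lam):
    return 5.0 + 4.0 * np.cos(lam)

n = 64
first_col = np.zeros(n)
first_col[0], first_col[1], first_col[-1] = 5.0, 2.0, 2.0

# circulant matrix C_n with (i, j) entry first_col[(i - j) mod n]
C = np.array([[first_col[(i - j) % n] for j in range(n)] for i in range(n)])

eigs = np.linalg.eigvalsh(C)                 # C is real symmetric here
grid = f(2.0 * np.pi * np.arange(n) / n)     # f(2 pi m / n), m = 0, ..., n-1

# eigenvalues of C_n are exactly the f(2 pi m / n)
assert np.allclose(np.sort(eigs), np.sort(grid))

# hence det C_n = prod_m f(2 pi m / n); compare log-determinants for stability
sign, logdet = np.linalg.slogdet(C)
assert sign > 0 and np.isclose(logdet, np.sum(np.log(grid)))
```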

These sums are the Riemann approximations to the limiting integral, whence

$$\lim_{n \to \infty} \ln \left( \det(C_n) \right)^{1/n} = \int_0^1 \ln f(2\pi\lambda) \, d\lambda.$$

Exponentiating, using the continuity of the logarithm for strictly positive arguments, and changing the variables of integration yields

$$\lim_{n \to \infty} \left( \det(C_n) \right)^{1/n} = e^{\frac{1}{2\pi} \int_0^{2\pi} \ln f(\lambda) \, d\lambda}.$$

This integral, the asymptotic equivalence of $C_n$ and $T_n(f)$ (Lemma 4.6), and Corollary 2.5 together yield the following result ([13], p. 65).

Theorem 4.5 Let $T_n(f)$ be a sequence of Hermitian Toeplitz matrices such that $\ln f(\lambda)$ is Riemann integrable and $f(\lambda) \geq m_f > 0$. Then

$$\lim_{n \to \infty} \left( \det(T_n(f)) \right)^{1/n} = e^{\frac{1}{2\pi} \int_0^{2\pi} \ln f(\lambda) \, d\lambda}. \tag{4.64}$$
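Theorem 4.5 can be checked numerically. The sketch below (my own illustration, not from the text) uses the hypothetical symbol $f(\lambda) = 5 + 4\cos\lambda = |2 + e^{i\lambda}|^2$, for which the right-hand side of (4.64) evaluates analytically to $4$, and compares it with $(\det T_n(f))^{1/n}$ for a moderately large $n$:

```python
import numpy as np

# Assumed example symbol (not from the text):
# f(lam) = 5 + 4 cos(lam) = |2 + e^{i lam}|^2, with t_0 = 5, t_{+-1} = 2.
def f(lam):
    return 5.0 + 4.0 * np.cos(lam)

# right-hand side of (4.64) via a Riemann sum; analytically it equals 4
N = 100000
lam = 2.0 * np.pi * np.arange(N) / N
limit = np.exp(np.mean(np.log(f(lam))))

# (det T_n(f))^{1/n} for the tridiagonal Hermitian Toeplitz matrix T_n(f)
n = 200
T = np.array([[{0: 5.0, 1: 2.0, -1: 2.0}.get(i - j, 0.0) for j in range(n)]
              for i in range(n)])
sign, logdet = np.linalg.slogdet(T)
approx = np.exp(logdet / n)

assert abs(limit - 4.0) < 1e-6
assert abs(approx - limit) < 0.05   # convergence is slow, but the value is clearly near 4
```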

Chapter 5

Applications to Stochastic Time Series

Toeplitz matrices arise quite naturally in the study of discrete time random processes. Covariance matrices of weakly stationary processes are Toeplitz, and triangular Toeplitz matrices provide a matrix representation of causal linear time invariant filters. As is well known and as we shall show, these two types of Toeplitz matrices are intimately related. We shall take two viewpoints in the first section of this chapter to show how they are related. In the first part we consider two common linear models of random time series and study the asymptotic behavior of the covariance matrix, its inverse, and its eigenvalues. The well known equivalence of moving average processes and weakly stationary processes will be pointed out. The lesser known fact that we can define something like a power spectral density for autoregressive processes even if they are nonstationary is discussed. In the second part of the first section we take the opposite tack: we start with a Toeplitz covariance matrix and consider the asymptotic behavior of its triangular factors. This simple result provides some insight into the asymptotic behavior of system identification algorithms and Wiener-Hopf factorization.

The second section provides another application of the Toeplitz distribution theorem to stationary random processes by deriving the Shannon information rate of a stationary Gaussian random process.

Let {Xk; k ∈ I} be a discrete time random process. Generally we take I = Z, the space of all integers, in which case we say that the process is two-sided, or I = Z+, the space of all nonnegative integers, in which case we say that the process is one-sided. We will be interested in vector


representations of the process so we define the column vector ($n$-tuple) $X^n = (X_0, X_1, \ldots, X_{n-1})^t$; that is, $X^n$ is an $n$-dimensional column vector. The mean vector is defined by $m^n = E(X^n)$, which we usually assume is zero for convenience. The $n \times n$ covariance matrix $R_n = \{r_{j,k}\}$ is defined by

$$R_n = E[(X^n - m^n)(X^n - m^n)^*]. \tag{5.1}$$

This is the autocorrelation matrix when the mean vector is zero. Subscripts will be dropped when they are clear from context. If the matrix $R_n$ is Toeplitz, say $R_n = T_n(f)$, then $r_{k,j} = r_{k-j}$ and the process is said to be weakly stationary. In this case we can define

$$f(\lambda) = \sum_{k=-\infty}^{\infty} r_k e^{ik\lambda}$$

as the power spectral density of the process. If the matrix $R_n$ is not Toeplitz but is asymptotically Toeplitz, i.e., $R_n \sim T_n(f)$, then we say that the process is asymptotically weakly stationary and once again define $f(\lambda)$ as the power spectral density. The latter situation arises, for example, if an otherwise stationary process is initialized with $X_k = 0$, $k \leq 0$. This will cause a transient and hence the process is, strictly speaking, nonstationary. The transient dies out, however, and the statistics of the process approach those of a weakly stationary process as $n$ grows.
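As a small illustration of mine (not from the text), assume the autocovariance sequence $r_k = \rho^{|k|}$. Its spectral density has the closed form $f(\lambda) = (1-\rho^2)/(1 - 2\rho\cos\lambda + \rho^2)$, and the eigenvalues of the resulting Hermitian Toeplitz covariance matrix must lie between the minimum and maximum of $f$:

```python
import numpy as np

# Assumed example (not from the text): weakly stationary autocovariance r_k = rho^{|k|}
rho = 0.5
n = 128
idx = np.arange(n)
R = rho ** np.abs(np.subtract.outer(idx, idx))   # Toeplitz covariance R_n = T_n(f)

# the power spectral density f(lam) = sum_k rho^{|k|} e^{i k lam} has the
# closed form (1 - rho^2) / (1 - 2 rho cos(lam) + rho^2), with extreme values:
f_min = (1.0 - rho) / (1.0 + rho)   # attained at lam = pi
f_max = (1.0 + rho) / (1.0 - rho)   # attained at lam = 0

# eigenvalues of a Hermitian Toeplitz matrix T_n(f) lie in [min f, max f]
eigs = np.linalg.eigvalsh(R)
assert f_min - 1e-10 <= eigs.min()
assert eigs.max() <= f_max + 1e-10
```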

The results derived herein are essentially trivial if one begins and deals only with doubly infinite matrices. As might be hoped, the results for asymptotic behavior of finite matrices are consistent with this case. The problem is of interest since one often has finite order equations and wishes to know the asymptotic behavior of solutions, or one has some function defined as a limit of solutions of finite equations. These results are useful both for finding theoretical limiting solutions and for finding reasonable approximations for finite order solutions. So much for philosophy. We now proceed to investigate the behavior of two common linear models. For simplicity we will assume the process means are zero.

5.1 Moving Average Processes

By a linear model of a random process we mean a model wherein we pass a zero mean, independent identically distributed (iid) sequence of random variables $W_k$ with variance $\sigma^2$ through a linear time invariant discrete time filter to obtain the desired process. The process $W_k$ is discrete time “white” noise. The most common such model is called a moving average process and is defined by the difference equation

$$U_n = \sum_{k=0}^{n} b_k W_{n-k} = \sum_{k=0}^{n} b_{n-k} W_k \tag{5.2}$$

$$U_n = 0; \quad n < 0.$$

We assume that $b_0 = 1$ with no loss of generality since otherwise we can incorporate $b_0$ into $\sigma^2$. Note that (5.2) is a discrete time convolution, i.e., $U_n$ is the output of a filter with “impulse response” (actually Kronecker $\delta$ response) $b_k$ and input $W_k$. We could be more general by allowing the filter $b_k$ to be noncausal and hence act on future $W_k$'s. We could also allow the $W_k$'s and $U_k$'s to extend into the infinite past rather than being initialized. This would lead to replacing (5.2) by

$$U_n = \sum_{k=-\infty}^{\infty} b_k W_{n-k} = \sum_{k=-\infty}^{\infty} b_{n-k} W_k. \tag{5.3}$$

We will restrict ourselves to causal filters for simplicity and keep the initial conditions since we are interested in limiting behavior. In addition, since stationary distributions may not exist for some models it would be difficult to handle them unless we start at some fixed time. For these reasons we take (5.2) as the definition of a moving average.
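Since (5.2) is just a causal convolution, the model is easy to simulate directly. The filter taps below ($b_k = 2^{-k}$) are a hypothetical choice of mine satisfying the absolute summability condition (5.5), not an example from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical causal, absolutely summable filter with b_0 = 1 (satisfies (5.5))
b = 0.5 ** np.arange(20)
sigma2 = 1.0

n = 1000
W = rng.normal(0.0, np.sqrt(sigma2), n)   # iid zero-mean "white" input W_k

# U_n = sum_{k=0}^{n} b_k W_{n-k}: truncate the full convolution to the first n outputs
U = np.convolve(W, b)[:n]

# spot-check one output index directly against the defining sum in (5.2)
n0 = 50
direct = sum(b[k] * W[n0 - k] for k in range(min(n0 + 1, len(b))))
assert np.isclose(U[n0], direct)
```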

Since we will be studying the statistical behavior of $U_n$ as $n$ gets arbitrarily large, some assumption must be placed on the sequence $b_k$ to ensure that (5.2) converges in the mean-squared sense. The weakest possible assumption that will guarantee convergence of (5.2) is that

$$\sum_{k=0}^{\infty} |b_k|^2 < \infty. \tag{5.4}$$

In keeping with the previous sections, however, we will make the stronger assumption

$$\sum_{k=0}^{\infty} |b_k| < \infty. \tag{5.5}$$

As previously this will result in simpler mathematics.

Equation (5.2) can be rewritten as a matrix equation by defining the lower triangular Toeplitz matrix

$$B_n = \begin{pmatrix}
b_0 & & & & 0 \\
b_1 & b_0 & & & \\
b_2 & b_1 & b_0 & & \\
\vdots & & & \ddots & \\
b_{n-1} & \cdots & b_2 & b_1 & b_0
\end{pmatrix} \tag{5.6}$$

so that (5.2) becomes

$$U^n = B_n W^n. \tag{5.7}$$

If the filter $b_n$ were not causal, then $B_n$ would not be triangular. If in addition (5.3) held, i.e., we looked at the entire process at each time instant, then (5.7) would require infinite vectors and matrices as in Grenander and Rosenblatt [12]. Since the covariance matrix of $W_k$ is simply $\sigma^2 I_n$, where $I_n$ is the $n \times n$ identity matrix, we have for the covariance of $U_n$:

$$R_U(n) = E[U^n (U^n)^*] = E[B_n W^n (W^n)^* B_n^*] = \sigma^2 B_n B_n^* \tag{5.8}$$

so that

$$r_{k,j} = \sigma^2 \sum_{m=0}^{\min(k,j)} b_{k-m} \bar{b}_{j-m}. \tag{5.9}$$
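The matrix form of the moving average and the resulting covariance entries can be verified numerically. This is a sketch of mine, again with hypothetical real taps $b_k = 2^{-k}$ (so conjugation is a plain transpose):

```python
import numpy as np

rng = np.random.default_rng(1)

n = 8
b = 0.5 ** np.arange(n)    # hypothetical causal taps with b_0 = 1 (not from the text)
sigma2 = 1.0

# lower triangular Toeplitz B_n with (k, j) entry b_{k-j} for k >= j, zero otherwise
B = np.array([[b[k - j] if k >= j else 0.0 for j in range(n)] for k in range(n)])

# the matrix form U^n = B_n W^n reproduces the convolution (5.2)
W = rng.normal(size=n)
U = B @ W
assert np.allclose(U, np.convolve(W, b)[:n])

# covariance R_U(n) = sigma^2 B_n B_n^*; its (k, j) entry is
# sigma^2 * sum_{m=0}^{min(k, j)} b_{k-m} * conj(b_{j-m})
R = sigma2 * B @ B.T
k, j = 5, 3
entry = sigma2 * sum(b[k - m] * b[j - m] for m in range(min(k, j) + 1))
assert np.isclose(R[k, j], entry)
```

Note how the $\min(k, j)$ upper limit enters: row $k$ of $B_n$ and row $j$ of $\bar{B}_n$ overlap only in their first $\min(k, j) + 1$ entries.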

From (5.9) it is clear that $r_{k,j}$ is not Toeplitz because of the $\min(k,j)$ in the sum. However, as we next show, as $n \to \infty$ the upper limit becomes large and $R_U(n)$ becomes asymptotically Toeplitz. If we define

$$b(\lambda) = \sum_{k=0}^{\infty} b_k e^{ik\lambda}$$

5.2 Autoregressive Processes
