
The fundamental Toeplitz eigenvalue distribution theory has an interesting application for characterizing the limiting behavior of determinants. Suppose now that $T_n(f)$ is a sequence of Hermitian Toeplitz matrices such that $f(\lambda) \geq m_f > 0$. Let $C_n = C_n(f)$ denote the sequence of circulant matrices constructed from $f$ as in (4.26). Then from (4.28) the eigenvalues of $C_n$ are $f(2\pi m/n)$ for $m = 0, 1, \ldots, n-1$, and hence $\det C_n = \prod_{m=0}^{n-1} f(2\pi m/n)$. This in turn implies that

$$\ln \left( \det(C_n) \right)^{1/n} = \frac{1}{n} \ln \det C_n = \frac{1}{n} \sum_{m=0}^{n-1} \ln f(2\pi m/n).$$
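To make the eigenvalue and determinant identities concrete, the following sketch (an illustration of mine, not from the text) builds the circulant matrix for an assumed example symbol $f(\lambda) = 5 + 4\cos\lambda \geq 1 > 0$, whose only nonzero Fourier coefficients are $r_0 = 5$ and $r_{\pm 1} = 2$, and checks that its eigenvalues and determinant match the values $f(2\pi m/n)$:

```python
import numpy as np

# Assumed example symbol (not from the text): f(lam) = 5 + 4 cos(lam) >= 1 > 0,
# with Fourier coefficients r_0 = 5, r_{+-1} = 2, all others zero.
def f(lam):
    return 5.0 + 4.0 * np.cos(lam)

n = 64
first_col = np.zeros(n)
first_col[0], first_col[1], first_col[-1] = 5.0, 2.0, 2.0

# circulant matrix C_n with (i, j) entry first_col[(i - j) mod n]
C = np.array([[first_col[(i - j) % n] for j in range(n)] for i in range(n)])

eigs = np.linalg.eigvalsh(C)                 # C is real symmetric here
grid = f(2.0 * np.pi * np.arange(n) / n)     # f(2 pi m / n), m = 0, ..., n-1

# eigenvalues of C_n are exactly the f(2 pi m / n)
assert np.allclose(np.sort(eigs), np.sort(grid))

# hence det C_n = prod_m f(2 pi m / n); compare log-determinants for stability
sign, logdet = np.linalg.slogdet(C)
assert sign > 0 and np.isclose(logdet, np.sum(np.log(grid)))
```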

These sums are the Riemann approximations to the limiting integral, whence

$$\lim_{n \to \infty} \ln \left( \det(C_n) \right)^{1/n} = \int_0^1 \ln f(2\pi\lambda) \, d\lambda.$$

Exponentiating, using the continuity of the logarithm for strictly positive arguments, and changing the variables of integration yields

$$\lim_{n \to \infty} \left( \det(C_n) \right)^{1/n} = e^{\frac{1}{2\pi} \int_0^{2\pi} \ln f(\lambda) \, d\lambda}.$$

This integral, the asymptotic equivalence of $C_n$ and $T_n(f)$ (Lemma 4.6), and Corollary 2.5 together yield the following result ([13], p. 65).

Theorem 4.5 Let $T_n(f)$ be a sequence of Hermitian Toeplitz matrices such that $\ln f(\lambda)$ is Riemann integrable and $f(\lambda) \geq m_f > 0$. Then

$$\lim_{n \to \infty} \left( \det(T_n(f)) \right)^{1/n} = e^{\frac{1}{2\pi} \int_0^{2\pi} \ln f(\lambda) \, d\lambda}. \tag{4.64}$$
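Theorem 4.5 can be checked numerically. The sketch below (my own illustration, not from the text) uses the hypothetical symbol $f(\lambda) = 5 + 4\cos\lambda = |2 + e^{i\lambda}|^2$, for which the right-hand side of (4.64) evaluates analytically to $4$, and compares it with $(\det T_n(f))^{1/n}$ for a moderately large $n$:

```python
import numpy as np

# Assumed example symbol (not from the text):
# f(lam) = 5 + 4 cos(lam) = |2 + e^{i lam}|^2, with t_0 = 5, t_{+-1} = 2.
def f(lam):
    return 5.0 + 4.0 * np.cos(lam)

# right-hand side of (4.64) via a Riemann sum; analytically it equals 4
N = 100000
lam = 2.0 * np.pi * np.arange(N) / N
limit = np.exp(np.mean(np.log(f(lam))))

# (det T_n(f))^{1/n} for the tridiagonal Hermitian Toeplitz matrix T_n(f)
n = 200
T = np.array([[{0: 5.0, 1: 2.0, -1: 2.0}.get(i - j, 0.0) for j in range(n)]
              for i in range(n)])
sign, logdet = np.linalg.slogdet(T)
approx = np.exp(logdet / n)

assert abs(limit - 4.0) < 1e-6
assert abs(approx - limit) < 0.05   # convergence is slow, but the value is clearly near 4
```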

Chapter 5

Applications to Stochastic Time Series

Toeplitz matrices arise quite naturally in the study of discrete time random processes. Covariance matrices of weakly stationary processes are Toeplitz, and triangular Toeplitz matrices provide a matrix representation of causal linear time invariant filters. As is well known and as we shall show, these two types of Toeplitz matrices are intimately related. We shall take two viewpoints in the first section of this chapter to show how they are related. In the first part we consider two common linear models of random time series and study the asymptotic behavior of the covariance matrix, its inverse, and its eigenvalues. The well known equivalence of moving average processes and weakly stationary processes will be pointed out. The lesser known fact that we can define something like a power spectral density for autoregressive processes even if they are nonstationary is discussed. In the second part of the first section we take the opposite tack: we start with a Toeplitz covariance matrix and consider the asymptotic behavior of its triangular factors. This simple result provides some insight into the asymptotic behavior of system identification algorithms and Wiener-Hopf factorization.

The second section provides another application of the Toeplitz distribution theorem to stationary random processes by deriving the Shannon information rate of a stationary Gaussian random process.

Let {Xk; k ∈ I} be a discrete time random process. Generally we take I = Z, the space of all integers, in which case we say that the process is two-sided, or I = Z+, the space of all nonnegative integers, in which case we say that the process is one-sided. We will be interested in vector


representations of the process so we define the column vector ($n$-tuple) $X^n = (X_0, X_1, \ldots, X_{n-1})^t$; that is, $X^n$ is an $n$-dimensional column vector. The mean vector is defined by $m^n = E(X^n)$, which we usually assume is zero for convenience. The $n \times n$ covariance matrix $R_n = \{r_{j,k}\}$ is defined by

$$R_n = E[(X^n - m^n)(X^n - m^n)^*]. \tag{5.1}$$

This is the autocorrelation matrix when the mean vector is zero. Subscripts will be dropped when they are clear from context. If the matrix $R_n$ is Toeplitz, say $R_n = T_n(f)$, then $r_{k,j} = r_{k-j}$ and the process is said to be weakly stationary. In this case we can define

$$f(\lambda) = \sum_{k=-\infty}^{\infty} r_k e^{ik\lambda}$$

as the power spectral density of the process. If the matrix $R_n$ is not Toeplitz but is asymptotically Toeplitz, i.e., $R_n \sim T_n(f)$, then we say that the process is asymptotically weakly stationary and once again define $f(\lambda)$ as the power spectral density. The latter situation arises, for example, if an otherwise stationary process is initialized with $X_k = 0$, $k \leq 0$. This will cause a transient and hence the process is, strictly speaking, nonstationary. The transient dies out, however, and the statistics of the process approach those of a weakly stationary process as $n$ grows.
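As a small illustration of mine (not from the text), assume the autocovariance sequence $r_k = \rho^{|k|}$. Its spectral density has the closed form $f(\lambda) = (1-\rho^2)/(1 - 2\rho\cos\lambda + \rho^2)$, and the eigenvalues of the resulting Hermitian Toeplitz covariance matrix must lie between the minimum and maximum of $f$:

```python
import numpy as np

# Assumed example (not from the text): weakly stationary autocovariance r_k = rho^{|k|}
rho = 0.5
n = 128
idx = np.arange(n)
R = rho ** np.abs(np.subtract.outer(idx, idx))   # Toeplitz covariance R_n = T_n(f)

# the power spectral density f(lam) = sum_k rho^{|k|} e^{i k lam} has the
# closed form (1 - rho^2) / (1 - 2 rho cos(lam) + rho^2), with extreme values:
f_min = (1.0 - rho) / (1.0 + rho)   # attained at lam = pi
f_max = (1.0 + rho) / (1.0 - rho)   # attained at lam = 0

# eigenvalues of a Hermitian Toeplitz matrix T_n(f) lie in [min f, max f]
eigs = np.linalg.eigvalsh(R)
assert f_min - 1e-10 <= eigs.min()
assert eigs.max() <= f_max + 1e-10
```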

The results derived herein are essentially trivial if one begins and deals only with doubly infinite matrices. As might be hoped, the results for asymptotic behavior of finite matrices are consistent with this case. The problem is of interest since one often has finite order equations and wishes to know the asymptotic behavior of solutions, or one has some function defined as a limit of solutions of finite equations. These results are useful both for finding theoretical limiting solutions and for finding reasonable approximations for finite order solutions. So much for philosophy. We now proceed to investigate the behavior of two common linear models. For simplicity we will assume the process means are zero.

5.1 Moving Average Processes

By a linear model of a random process we mean a model wherein we pass a zero mean, independent identically distributed (iid) sequence of random variables $W_k$ with variance $\sigma^2$ through a linear time invariant discrete time filter to obtain the desired process. The process $W_k$ is discrete time “white” noise. The most common such model is called a moving average process and is defined by the difference equation

$$U_n = \sum_{k=0}^{n} b_k W_{n-k} = \sum_{k=0}^{n} b_{n-k} W_k \tag{5.2}$$

$$U_n = 0; \quad n < 0.$$

We assume that $b_0 = 1$ with no loss of generality since otherwise we can incorporate $b_0$ into $\sigma^2$. Note that (5.2) is a discrete time convolution, i.e., $U_n$ is the output of a filter with “impulse response” (actually Kronecker $\delta$ response) $b_k$ and input $W_k$. We could be more general by allowing the filter $b_k$ to be noncausal and hence act on future $W_k$'s. We could also allow the $W_k$'s and $U_k$'s to extend into the infinite past rather than being initialized. This would lead to replacing (5.2) by

$$U_n = \sum_{k=-\infty}^{\infty} b_k W_{n-k} = \sum_{k=-\infty}^{\infty} b_{n-k} W_k. \tag{5.3}$$

We will restrict ourselves to causal filters for simplicity and keep the initial conditions since we are interested in limiting behavior. In addition, since stationary distributions may not exist for some models it would be difficult to handle them unless we start at some fixed time. For these reasons we take (5.2) as the definition of a moving average.
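Since (5.2) is just a causal convolution, the model is easy to simulate directly. The filter taps below ($b_k = 2^{-k}$) are a hypothetical choice of mine satisfying the absolute summability condition (5.5), not an example from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical causal, absolutely summable filter with b_0 = 1 (satisfies (5.5))
b = 0.5 ** np.arange(20)
sigma2 = 1.0

n = 1000
W = rng.normal(0.0, np.sqrt(sigma2), n)   # iid zero-mean "white" input W_k

# U_n = sum_{k=0}^{n} b_k W_{n-k}: truncate the full convolution to the first n outputs
U = np.convolve(W, b)[:n]

# spot-check one output index directly against the defining sum in (5.2)
n0 = 50
direct = sum(b[k] * W[n0 - k] for k in range(min(n0 + 1, len(b))))
assert np.isclose(U[n0], direct)
```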

Since we will be studying the statistical behavior of $U_n$ as $n$ gets arbitrarily large, some assumption must be placed on the sequence $b_k$ to ensure that (5.2) converges in the mean-squared sense. The weakest possible assumption that will guarantee convergence of (5.2) is that

$$\sum_{k=0}^{\infty} |b_k|^2 < \infty. \tag{5.4}$$

In keeping with the previous sections, however, we will make the stronger assumption

$$\sum_{k=0}^{\infty} |b_k| < \infty. \tag{5.5}$$

As previously this will result in simpler mathematics.

Equation (5.2) can be rewritten as a matrix equation by defining the lower triangular Toeplitz matrix

$$B_n = \begin{pmatrix}
b_0 & & & & 0 \\
b_1 & b_0 & & & \\
b_2 & b_1 & b_0 & & \\
\vdots & & & \ddots & \\
b_{n-1} & \cdots & b_2 & b_1 & b_0
\end{pmatrix} \tag{5.6}$$

so that (5.2) becomes

$$U^n = B_n W^n. \tag{5.7}$$

If the filter $b_n$ were not causal, then $B_n$ would not be triangular. If in addition (5.3) held, i.e., we looked at the entire process at each time instant, then (5.7) would require infinite vectors and matrices as in Grenander and Rosenblatt [12]. Since the covariance matrix of $W_k$ is simply $\sigma^2 I_n$, where $I_n$ is the $n \times n$ identity matrix, we have for the covariance of $U_n$:

$$R_U(n) = E[U^n (U^n)^*] = E[B_n W^n (W^n)^* B_n^*] = \sigma^2 B_n B_n^* \tag{5.8}$$

so that

$$r_{k,j} = \sigma^2 \sum_{m=0}^{\min(k,j)} b_{k-m} \bar{b}_{j-m}. \tag{5.9}$$
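The matrix form of the moving average and the resulting covariance entries can be verified numerically. This is a sketch of mine, again with hypothetical real taps $b_k = 2^{-k}$ (so conjugation is a plain transpose):

```python
import numpy as np

rng = np.random.default_rng(1)

n = 8
b = 0.5 ** np.arange(n)    # hypothetical causal taps with b_0 = 1 (not from the text)
sigma2 = 1.0

# lower triangular Toeplitz B_n with (k, j) entry b_{k-j} for k >= j, zero otherwise
B = np.array([[b[k - j] if k >= j else 0.0 for j in range(n)] for k in range(n)])

# the matrix form U^n = B_n W^n reproduces the convolution (5.2)
W = rng.normal(size=n)
U = B @ W
assert np.allclose(U, np.convolve(W, b)[:n])

# covariance R_U(n) = sigma^2 B_n B_n^*; its (k, j) entry is
# sigma^2 * sum_{m=0}^{min(k, j)} b_{k-m} * conj(b_{j-m})
R = sigma2 * B @ B.T
k, j = 5, 3
entry = sigma2 * sum(b[k - m] * b[j - m] for m in range(min(k, j) + 1))
assert np.isclose(R[k, j], entry)
```

Note how the $\min(k, j)$ upper limit enters: row $k$ of $B_n$ and row $j$ of $\bar{B}_n$ overlap only in their first $\min(k, j) + 1$ entries.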

From (5.9) it is clear that $r_{k,j}$ is not Toeplitz because of the $\min(k,j)$ in the sum. However, as we next show, as $n \to \infty$ the upper limit becomes large and $R_U(n)$ becomes asymptotically Toeplitz. If we define

$$b(\lambda) = \sum_{k=0}^{\infty} b_k e^{ik\lambda}$$

5.2 Autoregressive Processes
