Absolutely Equal Eigenvalue Distributions

It is possible to strengthen theorem 2.4 and some of the interim results used in its derivation using reasonably elementary methods. The key additional idea required is the Wielandt-Hoffman theorem [27], a result from matrix theory that is of independent interest. The theorem is stated and a proof following Wilkinson [28] is presented for completeness.

Lemma 2.4 (Wielandt-Hoffman theorem) Given two Hermitian matrices A and B with eigenvalues α_k and β_k in nonincreasing order, respectively, then

$$n^{-1} \sum_{k=0}^{n-1} |\alpha_k - \beta_k|^2 \le |A - B|^2.$$

Proof: Since A and B are Hermitian, we can write them as A = U diag(α_k) U^∗, B = W diag(β_k) W^∗, where U and W are unitary. Since the weak norm is not affected by multiplication by a unitary matrix,

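$$\begin{aligned}
|A - B| &= |U \operatorname{diag}(\alpha_k) U^* - W \operatorname{diag}(\beta_k) W^*| \\
&= |\operatorname{diag}(\alpha_k) U^* - U^* W \operatorname{diag}(\beta_k) W^*| \\
&= |\operatorname{diag}(\alpha_k) U^* W - U^* W \operatorname{diag}(\beta_k)| \\
&= |\operatorname{diag}(\alpha_k) Q - Q \operatorname{diag}(\beta_k)|,
\end{aligned}$$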

where Q = U^∗W = {q_{i,j}} is also unitary. The (i, j) entry in the matrix diag(α_k)Q − Q diag(β_k) is (α_i − β_j) q_{i,j} and hence

$$|A - B|^2 = n^{-1} \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} |\alpha_i - \beta_j|^2 |q_{i,j}|^2 \triangleq \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} |\alpha_i - \beta_j|^2 p_{i,j} \tag{2.38}$$

where we have defined p_{i,j} = n⁻¹|q_{i,j}|². Since Q is unitary, we also have that

$$\sum_{i=0}^{n-1} |q_{i,j}|^2 = \sum_{j=0}^{n-1} |q_{i,j}|^2 = 1 \tag{2.39}$$

or

$$\sum_{i=0}^{n-1} p_{i,j} = \sum_{j=0}^{n-1} p_{i,j} = \frac{1}{n}. \tag{2.40}$$

This can be interpreted in probability terms: p_{i,j} = n⁻¹|q_{i,j}|² is a probability mass function or pmf on {0, 1, . . . , n − 1}² with uniform marginal probability mass functions. Recall that it is assumed that the eigenvalues are ordered so that α_0 ≥ α_1 ≥ α_2 ≥ · · · and β_0 ≥ β_1 ≥ β_2 ≥ · · ·.

We claim that for all such matrices P satisfying (2.40), the right hand side of (2.38) is minimized by P = n⁻¹I, where I is the identity matrix, so that

$$\sum_{i=0}^{n-1} \sum_{j=0}^{n-1} |\alpha_i - \beta_j|^2 p_{i,j} \ge n^{-1} \sum_{i=0}^{n-1} |\alpha_i - \beta_i|^2,$$

which will prove the result. To see this suppose the contrary. Let ℓ be the smallest integer in {0, 1, . . . , n − 1} such that P has a nonzero element off the diagonal in either row ℓ or in column ℓ. If there is a nonzero element in row ℓ off the diagonal, say p_{ℓ,a}, then there must also be a nonzero element in column ℓ off the diagonal, say p_{b,ℓ}, in order for the constraints (2.40) to be satisfied. Since ℓ is the smallest such value, ℓ < a and ℓ < b. Let x be the smaller of p_{ℓ,a} and p_{b,ℓ}. Form a new matrix P′ by adding x to p_{ℓ,ℓ} and p_{b,a} and subtracting x from p_{b,ℓ} and p_{ℓ,a}. The new matrix still satisfies the constraints and it has a zero in either position (b, ℓ) or (ℓ, a). Furthermore the norm, that is, the right hand side of (2.38), has changed from that of P by an amount

$$\begin{aligned}
& x \left( (\alpha_\ell - \beta_\ell)^2 + (\alpha_b - \beta_a)^2 - (\alpha_\ell - \beta_a)^2 - (\alpha_b - \beta_\ell)^2 \right) \\
&\quad = -2x (\alpha_\ell - \alpha_b)(\beta_\ell - \beta_a) \\
&\quad \le 0
\end{aligned}$$

since ℓ < b, ℓ < a, the eigenvalues are nonincreasing, and x is positive.
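The exchange step can be traced on a small case. The following is a minimal sketch, not part of the original proof: the eigenvalue lists `alpha` and `beta`, the starting matrix `P`, and the helper names `cost` and `swap_step` are all hypothetical choices made for illustration. Exact rational arithmetic is used so that the comparisons are not blurred by rounding.

```python
from fractions import Fraction

def cost(P, alpha, beta):
    """Right hand side of (2.38): sum_ij |alpha_i - beta_j|^2 p_ij (real eigenvalues)."""
    n = len(P)
    return sum((alpha[i] - beta[j]) ** 2 * P[i][j]
               for i in range(n) for j in range(n))

def swap_step(P):
    """Apply one exchange in place; return False once P is diagonal."""
    n = len(P)
    for l in range(n):
        a = next((j for j in range(n) if j != l and P[l][j] != 0), None)
        b = next((i for i in range(n) if i != l and P[i][l] != 0), None)
        if a is None and b is None:
            continue  # row l and column l are already diagonal
        # For nonnegative P obeying (2.40), a and b both exist at this point.
        x = min(P[l][a], P[b][l])
        P[l][l] += x
        P[b][a] += x
        P[l][a] -= x
        P[b][l] -= x
        return True
    return False

# Hypothetical example with n = 3 and nonincreasing eigenvalues.
alpha = [5, 3, 1]
beta = [4, 2, 0]
n = 3
D = [[2, 2, 0], [1, 1, 2], [1, 1, 2]]   # D / 4 is doubly stochastic
P = [[Fraction(d, 4 * n) for d in row] for row in D]  # rows and columns sum to 1/n

costs = [cost(P, alpha, beta)]
while swap_step(P):
    costs.append(cost(P, alpha, beta))
```

In this example each exchange lowers the cost, and the loop ends with P = n⁻¹I. It is a check on one instance, under the stated assumptions, rather than a substitute for the argument.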

Continuing in this fashion all nonzero off-diagonal elements can be zeroed out without increasing the norm, proving the result. □

Applying the Cauchy-Schwarz inequality and the Wielandt-Hoffman theorem yields the following strengthening of lemma 2.3,

$$n^{-1} \sum_{k=0}^{n-1} |\alpha_k - \beta_k| \le n^{-1} \sqrt{\sum_{k=0}^{n-1} (\alpha_k - \beta_k)^2}\, \sqrt{n} \le |A - B|,$$

which we formalize as the following lemma.

Lemma 2.5 Given two Hermitian matrices A and B with eigenvalues α_k and β_k in nonincreasing order, respectively, then

$$n^{-1} \sum_{k=0}^{n-1} |\alpha_k - \beta_k| \le |A - B|.$$
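Both lemmas can be checked numerically in the smallest nontrivial case. The sketch below assumes real symmetric 2 × 2 matrices, stored as triples (a, b, d) standing for [[a, b], [b, d]], so that the eigenvalues have a closed form. It also assumes that the weak norm is |A| = (n⁻¹ Σ |a_{k,j}|²)^{1/2}, which is the reading consistent with (2.38). The helper names are hypothetical.

```python
import math

def eig_sym2(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]], largest first."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return [mean + radius, mean - radius]

def weak_norm_sym2(a, b, d):
    """Assumed weak norm |A| = (n^{-1} sum_ij |a_ij|^2)^{1/2} with n = 2."""
    return math.sqrt((a * a + 2 * b * b + d * d) / 2)

def lemma_quantities(A, B):
    """Return (mean |alpha_k - beta_k|, rms of alpha_k - beta_k, |A - B|)."""
    alpha, beta = eig_sym2(*A), eig_sym2(*B)
    diffs = [x - y for x, y in zip(alpha, beta)]
    mean_abs = sum(abs(t) for t in diffs) / 2
    rms = math.sqrt(sum(t * t for t in diffs) / 2)
    weak = weak_norm_sym2(*(x - y for x, y in zip(A, B)))
    return mean_abs, rms, weak

# Hypothetical test matrices.
mean_abs, rms, weak = lemma_quantities((2.0, 1.0, 0.0), (1.0, -1.0, 3.0))
```

Lemma 2.4 is the statement rms ≤ weak, and lemma 2.5 is the statement mean_abs ≤ weak. The middle quantity is the Cauchy-Schwarz step.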

Note in particular that the absolute values are outside the sum in lemma 2.3 and inside the sum in lemma 2.5. As was done in the weaker case, the result can be used to prove a stronger version of theorem 2.4. This line of reasoning, using the Wielandt-Hoffman theorem, was pointed out by William F. Trench who used special cases in his paper [20]. Similar arguments have become standard for treating eigenvalue distributions for Toeplitz and Hankel matrices. See, for example, [7, 25, 3]. The following theorem provides the derivation. The specific statement of the result and its proof follow from a private communication from William F. Trench.

Theorem 2.5 Let A_n and B_n be asymptotically equivalent sequences of Hermitian matrices with eigenvalues α_{n,k} and β_{n,k} in nonincreasing order, respectively. Since A_n and B_n are bounded there exist finite numbers m and M such that

$$m \le \alpha_{n,k}, \beta_{n,k} \le M, \quad n = 1, 2, \ldots, \quad k = 0, 1, \ldots, n-1. \tag{2.41}$$

Let F(x) be an arbitrary function continuous on [m, M]. Then

$$\lim_{n \to \infty} n^{-1} \sum_{k=0}^{n-1} |F(\alpha_{n,k}) - F(\beta_{n,k})| = 0. \tag{2.42}$$

Note that the theorem strengthens the result of theorem 2.4 because of the magnitude inside the sum. Following Trench [21] in this case the eigenvalues are said to be asymptotically absolutely equally distributed.

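Proof: From lemma 2.5

$$n^{-1} \sum_{k=0}^{n-1} |\alpha_{n,k} - \beta_{n,k}| \le |A_n - B_n|, \tag{2.43}$$

which, since |A_n − B_n| → 0 for asymptotically equivalent sequences, implies (2.42) for the case F(r) = r. For any nonnegative integer j

$$|\alpha_{n,k}^j - \beta_{n,k}^j| \le j \max(|m|, |M|)^{j-1} |\alpha_{n,k} - \beta_{n,k}|. \tag{2.44}$$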
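A quick numerical sanity check of (2.44) is sketched below. The interval endpoints `m` and `M`, the grid, and the function name `power_gap` are assumptions chosen only for illustration, and the check is restricted to j ≥ 1.

```python
def power_gap(a, b, j, m, M):
    """Return both sides of (2.44) for a, b in [m, M] and an integer j >= 1."""
    lhs = abs(a ** j - b ** j)
    rhs = j * max(abs(m), abs(M)) ** (j - 1) * abs(a - b)
    return lhs, rhs

# Hypothetical interval and an evenly spaced grid of test points in it.
m, M = -2.0, 3.0
grid = [m + (M - m) * s / 10 for s in range(11)]

violations = []
for j in range(1, 6):
    for a in grid:
        for b in grid:
            lhs, rhs = power_gap(a, b, j, m, M)
            if lhs > rhs + 1e-9:   # small slack for floating point rounding
                violations.append((a, b, j))
```

On this grid the list of violations comes out empty, as the bound predicts.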

By way of explanation consider a, b ∈ [m, M]. Simple long division shows that

$$\frac{a^j - b^j}{a - b} = \sum_{l=1}^{j} a^{j-l} b^{l-1}$$

so that

$$\begin{aligned}
\left| \frac{a^j - b^j}{a - b} \right| &= \frac{|a^j - b^j|}{|a - b|} \\
&= \left| \sum_{l=1}^{j} a^{j-l} b^{l-1} \right| \\
&\le \sum_{l=1}^{j} |a^{j-l} b^{l-1}| \\
&= \sum_{l=1}^{j} |a|^{j-l} |b|^{l-1} \\
&\le j \max(|m|, |M|)^{j-1},
\end{aligned}$$

which proves (2.44). This immediately implies that (2.42) holds for functions of the form F(r) = r^j for positive integers j, which in turn means the result holds for any polynomial. If F is an arbitrary continuous function on [m, M], then from theorem 2.3 given ε > 0 there is a polynomial P such that

$$|P(u) - F(u)| \le \epsilon, \quad u \in [m, M].$$

Using the triangle inequality,

$$\begin{aligned}
n^{-1} \sum_{k=0}^{n-1} |F(\alpha_{n,k}) - F(\beta_{n,k})|
&= n^{-1} \sum_{k=0}^{n-1} |F(\alpha_{n,k}) - P(\alpha_{n,k}) + P(\alpha_{n,k}) - P(\beta_{n,k}) + P(\beta_{n,k}) - F(\beta_{n,k})| \\
&\le n^{-1} \sum_{k=0}^{n-1} |F(\alpha_{n,k}) - P(\alpha_{n,k})| + n^{-1} \sum_{k=0}^{n-1} |P(\alpha_{n,k}) - P(\beta_{n,k})| \\
&\quad + n^{-1} \sum_{k=0}^{n-1} |P(\beta_{n,k}) - F(\beta_{n,k})| \\
&\le 2\epsilon + n^{-1} \sum_{k=0}^{n-1} |P(\alpha_{n,k}) - P(\beta_{n,k})|.
\end{aligned}$$

As n → ∞ the remaining sum goes to 0, which proves the theorem since ε can be made arbitrarily small. □

Chapter 3 Circulant Matrices

The properties of circulant matrices are well known and easily derived ([15], p. 267,[6]). Since these matrices are used both to approximate and explain the behavior of Toeplitz matrices, it is instructive to present one version of the relevant derivations here.

A circulant matrix C is one having the form

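$$C = \begin{bmatrix}
c_0 & c_1 & c_2 & \cdots & c_{n-1} \\
c_{n-1} & c_0 & c_1 & \cdots & c_{n-2} \\
c_{n-2} & c_{n-1} & c_0 & \cdots & c_{n-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
c_1 & c_2 & c_3 & \cdots & c_0
\end{bmatrix}, \tag{3.1}$$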

where each row is a cyclic shift of the row above it. The structure can also be characterized by noting that the (k, j) entry of C, C_k,j, is given by

$$C_{k,j} = c_{(j-k) \bmod n},$$

which identifies C as a special type of Toeplitz matrix.
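The index rule translates directly into code. A minimal sketch, with a hypothetical helper name `circulant` and a plain list of lists standing in for the matrix:

```python
def circulant(c):
    """Build the n x n circulant matrix with entries C[k][j] = c[(j - k) mod n]."""
    n = len(c)
    return [[c[(j - k) % n] for j in range(n)] for k in range(n)]

# Hypothetical first row (c_0, c_1, c_2).
C = circulant([1, 2, 3])
# C is [[1, 2, 3], [3, 1, 2], [2, 3, 1]]: each row is the row above shifted right.
```

Because the entry depends on k and j only through j − k, the matrix is constant along each diagonal, which is the Toeplitz property.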

The eigenvalues ψ_k and the eigenvectors y^(k) of C are the solutions of

Cy = ψ y (3.2)

or, equivalently, of the n difference equations

$$\sum_{k=0}^{m-1} c_{n-m+k}\, y_k + \sum_{k=m}^{n-1} c_{k-m}\, y_k = \psi\, y_m; \quad m = 0, 1, \ldots, n-1. \tag{3.3}$$

Changing the summation dummy variable results in

$$\sum_{k=0}^{n-1-m} c_k\, y_{k+m} + \sum_{k=n-m}^{n-1} c_k\, y_{k-(n-m)} = \psi\, y_m; \quad m = 0, 1, \ldots, n-1. \tag{3.4}$$

One can solve difference equations as one solves differential equations: by guessing a (hopefully) intuitive solution and then proving that it works.

Since the equation is linear with constant coefficients a reasonable guess is y_k = ρ^k (analogous to y(t) = e^{st} in linear time invariant differential equations). Substitution into (3.4) and cancellation of ρ^m yields

$$\sum_{k=0}^{n-1-m} c_k \rho^k + \rho^{-n} \sum_{k=n-m}^{n-1} c_k \rho^k = \psi.$$

Thus if we choose ρ^{−n} = 1, i.e., ρ is one of the n distinct complex nth roots of unity, then we have an eigenvalue

$$\psi = \sum_{k=0}^{n-1} c_k \rho^k \tag{3.5}$$

with corresponding eigenvector

$$y = n^{-1/2} \left( 1, \rho, \rho^2, \ldots, \rho^{n-1} \right)', \tag{3.6}$$

where the prime denotes transpose and the normalization is chosen to give the eigenvector unit energy. Choosing ρ_m as the complex nth root of unity, ρ_m = e^{−2πim/n}, we have eigenvalue

$$\psi_m = \sum_{k=0}^{n-1} c_k e^{-2\pi i m k / n} \tag{3.7}$$

and eigenvector

$$y^{(m)} = n^{-1/2} \left( 1, e^{-2\pi i m / n}, \cdots, e^{-2\pi i m (n-1)/n} \right)'.$$
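The guess can be confirmed numerically. The sketch below builds a small circulant matrix, forms ψ_m and y^(m) as above, and measures how far C y^(m) is from ψ_m y^(m). The first row `c` and the helper names are hypothetical, and floating point arithmetic means the residual is only expected to be near zero, not exactly zero.

```python
import cmath

def circulant(c):
    """n x n circulant matrix with entries C[k][j] = c[(j - k) mod n]."""
    n = len(c)
    return [[c[(j - k) % n] for j in range(n)] for k in range(n)]

def eigenpair(c, m):
    """Eigenvalue psi_m from (3.7) and the unit-energy eigenvector y^(m)."""
    n = len(c)
    rho = cmath.exp(-2j * cmath.pi * m / n)   # an n-th root of unity
    psi = sum(c[k] * rho ** k for k in range(n))
    y = [rho ** k / n ** 0.5 for k in range(n)]
    return psi, y

def residual(c, m):
    """Largest entry of |C y - psi y|; near zero when (psi, y) is an eigenpair."""
    n = len(c)
    C = circulant(c)
    psi, y = eigenpair(c, m)
    Cy = [sum(C[k][j] * y[j] for j in range(n)) for k in range(n)]
    return max(abs(Cy[k] - psi * y[k]) for k in range(n))

c = [4, 1, 0, 2]   # hypothetical first row
worst = max(residual(c, m) for m in range(len(c)))
```

Note that the eigenvectors depend only on n, not on the entries of `c`, which is why all n × n circulant matrices share them.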

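From (3.7) we can write

$$C = U \Psi U^*, \tag{3.8}$$

where

$$U = \left\{ y^{(0)} | y^{(1)} | \cdots | y^{(n-1)} \right\} = n^{-1/2} \left\{ e^{-2\pi i m k / n};\ m, k = 0, 1, \ldots, n-1 \right\}$$

$$\Psi = \{ \psi_k \delta_{k-j} \}$$

where δ is the Kronecker delta,

$$\delta_m = \begin{cases} 1 & m = 0 \\ 0 & \text{otherwise.} \end{cases}$$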

To verify (3.8) denote the (k, j)th element of UΨU^∗ by a_{k,j} and observe that a_{k,j} will be the product of the kth row of UΨ, which is {n^{−1/2} e^{−2πimk/n} ψ_m; m = 0, 1, . . . , n − 1}, times the jth column of U^∗, which is the complex conjugate of the jth row of U, {n^{−1/2} e^{2πimj/n}; m = 0, 1, . . . , n − 1},

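so that

$$\begin{aligned}
a_{k,j} &= n^{-1} \sum_{m=0}^{n-1} e^{2\pi i m (j-k)/n} \psi_m \\
&= n^{-1} \sum_{m=0}^{n-1} e^{2\pi i m (j-k)/n} \sum_{r=0}^{n-1} c_r e^{-2\pi i m r / n} \\
&= n^{-1} \sum_{r=0}^{n-1} c_r \sum_{m=0}^{n-1} e^{2\pi i m (j-k-r)/n}.
\end{aligned} \tag{3.9}$$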

But we have

$$\sum_{m=0}^{n-1} e^{2\pi i m (j-k-r)/n} = \begin{cases} n & k - j = -r \bmod n \\ 0 & \text{otherwise} \end{cases}$$

so that a_{k,j} = c_{−(k−j) mod n}. Thus (3.8) and (3.1) are equivalent. Furthermore (3.9) shows that any matrix expressible in the form (3.8) is circulant.
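The first line of (3.9) can be evaluated directly to rebuild a matrix from a list of eigenvalues. The sketch below is illustrative only: `dft_eigenvalues` and `rebuild_from_eigenvalues` are hypothetical helper names, and the sample first row is arbitrary.

```python
import cmath

def dft_eigenvalues(c):
    """psi_m = sum_r c_r exp(-2 pi i m r / n), as in (3.7)."""
    n = len(c)
    return [sum(c[r] * cmath.exp(-2j * cmath.pi * m * r / n) for r in range(n))
            for m in range(n)]

def rebuild_from_eigenvalues(psi):
    """a[k][j] = n^{-1} sum_m exp(2 pi i m (j - k) / n) psi[m], first line of (3.9)."""
    n = len(psi)
    return [[sum(cmath.exp(2j * cmath.pi * m * (j - k) / n) * psi[m]
                 for m in range(n)) / n
             for j in range(n)]
            for k in range(n)]

c = [4, 1, 0, 2]   # hypothetical first row
n = len(c)
A = rebuild_from_eigenvalues(dft_eigenvalues(c))
max_error = max(abs(A[k][j] - c[(j - k) % n])
                for k in range(n) for j in range(n))
```

Up to rounding, `A` reproduces the circulant matrix with first row `c`. Feeding any other list of n numbers to `rebuild_from_eigenvalues` also yields a matrix whose entries depend only on (j − k) mod n, in line with the closing remark that every matrix of the form (3.8) is circulant.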

It should also be familiar to those with standard engineering backgrounds that ψ_m in (3.7) is simply the discrete Fourier transform (DFT) of the sequence c_k and (3.8) can be interpreted as a combination of the Fourier inversion formula and the Fourier cyclic shift formula.

Since C is unitarily similar to a diagonal matrix it is normal. Note that all circulant matrices have the same set of eigenvectors. This leads to the following properties.

Theorem 3.1 Let C = {c_{k−j}} and B = {b_{k−j}} be circulant n × n matrices with eigenvalues

$$\psi_m = \sum_{k=0}^{n-1} c_k e^{-2\pi i m k / n}, \qquad \beta_m = \sum_{k=0}^{n-1} b_k e^{-2\pi i m k / n},$$

respectively.

1. C and B commute and

$$CB = BC = U \gamma U^*,$$

where γ = {ψ_m β_m δ_{k,m}}, and CB is also a circulant matrix.

2. C + B is a circulant matrix and

$$C + B = U \Omega U^*,$$

where Ω = {(ψ_m + β_m) δ_{k,m}}.

3. If ψ_m ≠ 0; m = 0, 1, . . . , n − 1, then C is nonsingular and

$$C^{-1} = U \Psi^{-1} U^*$$

so that the inverse of C can be straightforwardly constructed.

Proof. From (3.8) we have C = UΨU^∗ and B = UΦU^∗ where Ψ and Φ are diagonal matrices with elements ψ_m δ_{k,m} and β_m δ_{k,m}, respectively.

1. CB = UΨU^∗UΦU^∗ = UΨΦU^∗ = UΦΨU^∗ = BC. Since ΨΦ is diagonal, (3.9) implies that CB is circulant.

2. C + B = U(Ψ + Φ)U^∗.

3. C⁻¹ = (UΨU^∗)⁻¹ = UΨ⁻¹U^∗ if Ψ is nonsingular. □

Circulant matrices are an especially tractable class of matrices since inverses, products, and sums are also circulant matrices and hence both straightforward to construct and normal. In addition the eigenvalues of such matrices can easily be found exactly.
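The three properties of theorem 3.1 can be spot-checked on small integer examples. This is a sketch under stated assumptions: the first rows `c` and `b` are arbitrary, the helper names are hypothetical, and the inverse is built from the first line of (3.9) with ψ_m replaced by 1/ψ_m.

```python
import cmath

def circulant(c):
    """n x n circulant matrix with entries C[k][j] = c[(j - k) mod n]."""
    n = len(c)
    return [[c[(j - k) % n] for j in range(n)] for k in range(n)]

def matmul(A, B):
    """Plain matrix product of two square lists of lists."""
    n = len(A)
    return [[sum(A[k][r] * B[r][j] for r in range(n)) for j in range(n)]
            for k in range(n)]

def dft(c):
    """Eigenvalues sum_k c_k exp(-2 pi i m k / n) of the circulant with first row c."""
    n = len(c)
    return [sum(c[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

def inverse_first_row(c):
    """First row of C^{-1}: entry j is n^{-1} sum_m exp(2 pi i m j / n) / psi_m."""
    n = len(c)
    psi = dft(c)   # every psi_m must be nonzero
    return [sum(cmath.exp(2j * cmath.pi * m * j / n) / psi[m] for m in range(n)) / n
            for j in range(n)]

c = [1, 2, 0, 3]   # hypothetical first rows
b = [2, 0, 1, 1]
C, B = circulant(c), circulant(b)
CB, BC = matmul(C, B), matmul(B, C)

product_eigs = dft(CB[0])                          # eigenvalues of CB
expected_eigs = [p * q for p, q in zip(dft(c), dft(b))]
C_inv = circulant(inverse_first_row(c))
```

Here `CB` equals `BC`, `CB` equals `circulant(CB[0])`, the eigenvalues of the product match ψ_m β_m up to rounding, and `matmul(C, C_inv)` is numerically the identity.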

In the next chapter we shall see that certain circulant matrices asymptotically approximate Toeplitz matrices and hence from Chapter 2 results similar to those in theorem 3.1 will hold asymptotically for Toeplitz matrices.

Chapter 4 Toeplitz Matrices

4.1 Bounded Toeplitz Matrices

In this chapter the asymptotic behavior of inverses, products, eigenvalues, and determinants of finite Toeplitz matrices is derived by constructing an asymptotically equivalent circulant matrix and applying the results of the previous chapters. Consider the infinite sequence {t_k; k = 0, ±1, ±2, · · ·} and define the finite (n × n) Toeplitz matrix T_n = {t_{k−j}} as in (1.1). Toeplitz matrices can be classified by the restrictions placed on the sequence t_k. If there exists a finite m such that t_k = 0, |k| > m, then T_n is said to be a finite order Toeplitz matrix. If t_k is an infinite sequence, then there are two common constraints. The more general is to assume that the t_k are square summable, i.e., that

$$\sum_{k=-\infty}^{\infty} |t_k|^2 < \infty. \tag{4.1}$$

Unfortunately this case requires mathematical machinery beyond that assumed in this paper; i.e., Lebesgue integration and a relatively advanced knowledge of Fourier series. We will make the stronger assumption that the t_k are absolutely summable, i.e.,

$$\sum_{k=-\infty}^{\infty} |t_k| < \infty. \tag{4.2}$$

This assumption greatly simplifies the mathematics but does not alter the fundamental concepts involved. As the main purpose here is tutorial and we wish chiefly to relay the flavor and an intuitive feel for the results, this paper will be confined to the absolutely summable case. The main advantage of (4.2) over (4.1) is that it ensures the existence and continuity of the Fourier series f(λ) defined by

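$$f(\lambda) = \sum_{k=-\infty}^{\infty} t_k e^{ik\lambda} = \lim_{n \to \infty} \sum_{k=-n}^{n} t_k e^{ik\lambda}. \tag{4.3}$$

Not only does the limit in (4.3) converge if (4.2) holds, it converges uniformly for all λ, that is, we have that

$$\left| f(\lambda) - \sum_{k=-n}^{n} t_k e^{ik\lambda} \right| = \left| \sum_{|k| > n} t_k e^{ik\lambda} \right| \le \sum_{|k| > n} |t_k|,$$

where the right hand side does not depend on λ and it goes to zero as n → ∞ from (4.2), thus given ε there is a single N, not depending on λ, such that

$$\left| f(\lambda) - \sum_{k=-n}^{n} t_k e^{ik\lambda} \right| \le \epsilon, \quad \text{all } \lambda \in [0, 2\pi], \text{ if } n \ge N. \tag{4.4}$$

Note that (4.2) is indeed a stronger constraint than (4.1) since

$$\sum_{k=-\infty}^{\infty} |t_k|^2 \le \left( \sum_{k=-\infty}^{\infty} |t_k| \right)^2.$$

Note also that (4.2) implies that f(λ) is bounded since

$$|f(\lambda)| \le \sum_{k=-\infty}^{\infty} |t_k e^{ik\lambda}| \le \sum_{k=-\infty}^{\infty} |t_k| \triangleq M_{|f|} < \infty.$$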

The matrix T_n will be Hermitian if and only if f is real, in which case we denote the least upper bound and greatest lower bound of f(λ) by M_f and m_f, respectively. Observe that max(|m_f|, |M_f|) ≤ M_{|f|}.

Since f(λ) is the Fourier series of the sequence t_k, we could alternatively begin with a bounded, Riemann integrable function f(λ) on [0, 2π] (|f(λ)| ≤ M_{|f|} < ∞ for all λ) and define the sequence of n × n Toeplitz matrices

$$T_n(f) = \left\{ (2\pi)^{-1} \int_0^{2\pi} f(\lambda) e^{-i(k-j)\lambda}\, d\lambda;\ k, j = 0, 1, \cdots, n-1 \right\}. \tag{4.5}$$

As before, the Toeplitz matrices will be Hermitian iff f is real. The assumption that f(λ) is Riemann integrable implies that f(λ) is continuous except possibly at a countable number of points. Which assumption is made depends on whether one begins with a sequence t_k or a function f(λ); either assumption will be equivalent for our purposes since it is the Riemann integrability of f(λ) that simplifies the bookkeeping in either case. Before finding a simple asymptotic equivalent matrix to T_n, we use Corollary 2.1 to find a bound on the eigenvalues of T_n when it is Hermitian and an upper bound to the strong norm in the general case.
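As an illustration of (4.5), the sketch below builds T_n(f) by approximating each integral with a Riemann sum. The sample function f(λ) = 2 + 2 cos λ, the number of sample points, and the helper names are all assumptions made for this example. For a trigonometric polynomial such as this one, an equally spaced Riemann sum with enough points recovers the coefficients up to rounding.

```python
import cmath
import math

def fourier_coefficient(f, k, samples=64):
    """Approximate t_k = (2 pi)^{-1} int_0^{2 pi} f(lam) exp(-i k lam) d lam
    by an equally spaced Riemann sum with the given number of samples."""
    total = 0
    for s in range(samples):
        lam = 2 * math.pi * s / samples
        total += f(lam) * cmath.exp(-1j * k * lam)
    return total / samples

def toeplitz_from_function(f, n, samples=64):
    """T_n(f) with (k, j) entry t_{k-j}, following (4.5)."""
    return [[fourier_coefficient(f, k - j, samples) for j in range(n)]
            for k in range(n)]

def f(lam):
    # Hypothetical real-valued example, so T_n(f) is Hermitian.
    return 2 + 2 * math.cos(lam)

T = toeplitz_from_function(f, 4)
```

For this f the matrix is tridiagonal, with 2 on the main diagonal and 1 on the two adjacent diagonals, again up to rounding.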

Lemma 4.1 Let τ_{n,k} be the eigenvalues of a Toeplitz matrix T_n(f). If T_n(f) is Hermitian, then

$$m_f \le \tau_{n,k} \le M_f. \tag{4.6}$$

Whether or not T_n(f) is Hermitian,

$$\| T_n(f) \| \le 2 M_{|f|} \tag{4.7}$$

so that the matrix is uniformly bounded over n if f is bounded.
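The bound (4.6) can be probed through Rayleigh quotients, since the largest and smallest eigenvalues of a Hermitian matrix are the extreme values of x^∗T_n x / x^∗x. The sketch below assumes the example sequence t_0 = 2, t_1 = t_{−1} = 1, for which f(λ) = 2 + 2 cos λ, m_f = 0 and M_f = 4. The helper names, the size n = 8, and the random test vectors are hypothetical choices.

```python
import random

def toeplitz(t, n):
    """T_n = {t_{k-j}}; t maps a lag k to t_k and missing lags count as zero."""
    return [[t.get(k - j, 0) for j in range(n)] for k in range(n)]

def rayleigh_quotient(T, x):
    """x* T x / x* x, which is real when T is Hermitian."""
    n = len(x)
    num = sum(x[k].conjugate() * T[k][j] * x[j]
              for k in range(n) for j in range(n))
    den = sum(abs(v) ** 2 for v in x)
    return (num / den).real

t = {-1: 1, 0: 2, 1: 1}   # assumed example: f(lam) = 2 + 2 cos(lam)
m_f, M_f = 0.0, 4.0
T = toeplitz(t, 8)

rng = random.Random(0)    # fixed seed so the run is repeatable
quotients = []
for _ in range(200):
    x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(8)]
    quotients.append(rayleigh_quotient(T, x))
```

Every quotient falls between m_f and M_f, so the eigenvalues, being the extreme values of the quotient, do too.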

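Proof. Property (4.6) follows from Corollary 2.1:

$$\max_k \tau_{n,k} = \max_x \frac{x^* T_n x}{x^* x}, \qquad \min_k \tau_{n,k} = \min_x \frac{x^* T_n x}{x^* x} \tag{4.8}$$

so that, writing the entries of T_n as in (4.5),

$$x^* T_n x = \sum_{k=0}^{n-1} \sum_{j=0}^{n-1} t_{k-j}\, \bar{x}_k x_j = (2\pi)^{-1} \int_0^{2\pi} f(\lambda) \left| \sum_{k=0}^{n-1} x_k e^{ik\lambda} \right|^2 d\lambda \tag{4.9}$$

and likewise

$$x^* x = \sum_{k=0}^{n-1} |x_k|^2 = (2\pi)^{-1} \int_0^{2\pi} \left| \sum_{k=0}^{n-1} x_k e^{ik\lambda} \right|^2 d\lambda. \tag{4.10}$$

Combining (4.9)-(4.10) results in

$$m_f \le \frac{\int_0^{2\pi} f(\lambda) \left| \sum_{k=0}^{n-1} x_k e^{ik\lambda} \right|^2 d\lambda}{\int_0^{2\pi} \left| \sum_{k=0}^{n-1} x_k e^{ik\lambda} \right|^2 d\lambda} = \frac{x^* T_n x}{x^* x} \le M_f, \tag{4.11}$$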

which with (4.8) yields (4.6). Alternatively, observe in (4.11) that if e^{(k)} is the eigenvector associated with τ_{n,k}, then the quadratic form with x = e^{(k)} yields x^∗T_n x = τ_{n,k} Σ_{j=0}^{n−1} |x_j|². Thus (4.11) implies (4.6) directly. □

We have already seen in (2.13) that if T_n(f) is Hermitian, then ‖T_n(f)‖ = max_k |τ_{n,k}| ≜ |τ_{n,M}|, which we have just shown satisfies |τ_{n,M}| ≤ max(|M_f|, |m_f|), which in turn is no greater than M_{|f|}, which proves (4.7) for Hermitian matrices. Suppose that T_n(f) is not Hermitian or, equivalently, that f is not real. Any function f can be written in terms of its real and imaginary parts, f = f_r + i f_i, where both f_r and f_i are real. In particular, f_r = (f + f^∗)/2

