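A Study of Numerical Ranges of Operators and Matrices (III)

National Science Council, Executive Yuan: Research Project Final Report

A Study of Numerical Ranges of Operators and Matrices (3/3)

Project type: Individual project

Project number: NSC92-2115-M-009-001-

Project period: 1 August 2003 to 31 July 2004

Executing institution: Department of Applied Mathematics, National Chiao Tung University

Principal investigator: Pei Yuan Wu

Report type: Full report

Attachments: Report on attendance at an international conference, and published papers

Availability: This project report is open to public inquiry

18 August 2004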

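Companion matrices: reducibility, numerical ranges and similarity to contractions

Pei Yuan Wu

In this paper, we study certain properties of companion matrices that are invariant under unitary equivalence. We obtain a condition for a companion matrix to be reducible, and we prove that the numerical range of a companion matrix is a circular disc centered at the origin if and only if the matrix equals a (nilpotent) Jordan block. In general, however, a companion matrix is not completely determined by its numerical range. We also determine the possible values of the defect index of a contraction that is similar to a given finite matrix.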

Linear Algebra and its Applications 383 (2004) 127–142

www.elsevier.com/locate/laa

Companion matrices: reducibility, numerical ranges and similarity to contractions ☆

Hwa-Long Gau^a, Pei Yuan Wu^{b,*}

^a Department of Mathematics, National Central University, Chung-Li 320, Taiwan

^b Department of Applied Mathematics, National Chiao Tung University, Hsinchu 300, Taiwan

Received 20 November 2002; accepted 30 November 2003

Submitted by P. Rosenthal

Dedicated to Heydar Radjavi on his 70th birthday

Abstract

In this paper, we study some unitary-equivalence properties of the companion matrices. We obtain a criterion for a companion matrix to be reducible and show that the numerical range of a companion matrix is a circular disc centered at the origin if and only if the matrix equals the (nilpotent) Jordan block. However, the more general assertion that a companion matrix is determined by its numerical range turns out to be false. We also determine, for an n× n matrix A with eigenvalues in the open unit disc, the defect index of a contraction to which A is similar.

AMS classification: 15A60

Keywords: Companion matrix; Reducible matrix; Numerical range; Defect index

☆ Research supported by the National Science Council of the Republic of China under research projects NSC-91-2115-M-008-011 and NSC-91-2115-M-009-007 of the respective authors.

* Corresponding author.

E-mail addresses: hlgau@math.ncu.edu.tw (H.-L. Gau), pywu@math.nctu.edu.tw (P.Y. Wu).

For any complex polynomial $p(z) = z^n + a_1 z^{n-1} + \cdots + a_{n-1} z + a_n$, there is associated an $n \times n$ matrix

$$\begin{pmatrix} 0 & 1 & & & \\ & 0 & 1 & & \\ & & \ddots & \ddots & \\ & & & 0 & 1 \\ -a_n & -a_{n-1} & \cdots & -a_2 & -a_1 \end{pmatrix}, \tag{1}$$

called the companion matrix of $p$ and denoted by $M(p)$. A familiar special case is the (nilpotent) Jordan block $J_n$, obtained when all the $a_j$'s are zero. Every companion matrix $M(p)$ has the property that its minimal polynomial and characteristic polynomial are both equal to $p$. Hence companion matrices are nonderogatory and, in particular, every eigenvalue has geometric multiplicity one. They arise as the building blocks in the rational form of general matrices: every square matrix is similar to a direct sum $M(p_1) \oplus \cdots \oplus M(p_k)$ of companion matrices with $p_{j+1}$ dividing $p_j$ for all $j$.
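The layout in (1) can be illustrated with a minimal Python sketch. The helper names `companion` and `matvec` and the sample cubic are illustrative assumptions, not part of the paper; the check uses the eigenvector $(1, a, a^2, \ldots, a^{n-1})^T$ that appears later in the proofs.

```python
def companion(coeffs):
    """Companion matrix M(p) of p(z) = z^n + a_1 z^(n-1) + ... + a_n.

    `coeffs` is the list [a_1, ..., a_n]. Following (1), the matrix has
    ones on the superdiagonal and last row (-a_n, ..., -a_1).
    """
    n = len(coeffs)
    rows = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        rows[i][i + 1] = 1
    rows[n - 1] = [-a for a in reversed(coeffs)]
    return rows


def matvec(m, v):
    """Matrix-vector product for list-of-lists matrices."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


# Sample polynomial (an assumption for illustration):
# p(z) = (z - 1)(z - 2)(z - 3) = z^3 - 6 z^2 + 11 z - 6
M = companion([-6, 11, -6])

# Each root a of p gives the eigenvector (1, a, a^2)^T of M(p).
for root in (1, 2, 3):
    x = [root ** k for k in range(3)]
    assert matvec(M, x) == [root * t for t in x]
```

Integer arithmetic keeps the eigenvector check exact for this sample; for general complex coefficients a tolerance-based comparison would be needed.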

In this paper, we study some unitary-equivalence properties of the companion matrices. Previous works in this respect are the determination of their singular values [1, pp. 224–225] and an explicit construction of their polar decomposition [10]. Here we consider other properties of these matrices such as their reducibility and their numerical ranges. In Section 1 below, we solve the problem of when a companion matrix is reducible, that is, when it is unitarily equivalent to the direct sum of two other matrices. We obtain a complete characterization of reducibility in terms of the eigenvalues. It roughly says that a companion matrix is reducible when its eigenvalues are “equally distributed” on at most two circles with center at the origin and radii reciprocal to each other. It follows as corollaries that a companion matrix unitarily equivalent to a direct sum with one unitary summand or with at least three summands must itself be unitary. We take up the numerical ranges of companion matrices in Section 2. We show that a companion matrix whose numerical range is a circular disc centered at the origin must be equal to the Jordan block. However, the more general assertion that a companion matrix is determined by its numerical range is not true. We give an example of two distinct $3 \times 3$ companion matrices whose numerical ranges are the same elliptic disc. On the other hand, we also prove that if two $3 \times 3$ companion matrices have the same circular disc as their numerical ranges, then they must be equal. Whether this remains true for $n \times n$ companion matrices is not known. Finally, in Section 3, we use the rational form for matrices to prove an improvement over a classical result of Rota on the similarity of a matrix to a contraction.

1. Reducibility

A matrix is reducible if it is unitarily equivalent to the direct sum of two other matrices. In this section, we give a criterion in terms of the eigenvalues for a companion matrix to be reducible.

Theorem 1.1. An $n \times n$ ($n \ge 2$) companion matrix $A$ is reducible if and only if its eigenvalues are of the form $a\omega_n^{j_1}, \ldots, a\omega_n^{j_p}, (1/\bar a)\omega_n^{j_{p+1}}, \ldots, (1/\bar a)\omega_n^{j_n}$, where $a \ne 0$, $\omega_n$ denotes the $n$th primitive root of 1, $1 \le p \le n-1$, and $\{j_1, \ldots, j_p\}$ and $\{j_{p+1}, \ldots, j_n\}$ form a partition of $\{0, 1, \ldots, n-1\}$. In this case, $A$ is unitarily equivalent to a direct sum $A_1 \oplus A_2$ with $\sigma(A_1) = \{a\omega_n^{j_1}, \ldots, a\omega_n^{j_p}\}$ and $\sigma(A_2) = \{(1/\bar a)\omega_n^{j_{p+1}}, \ldots, (1/\bar a)\omega_n^{j_n}\}$. In particular, every reducible companion matrix is invertible.

Here for any matrix B, σ (B) denotes the set of its eigenvalues.

Proof of Theorem 1.1. Assume that $A$ is unitarily equivalent to the direct sum $A_1 \oplus A_2$ on $\mathbb{C}^p \oplus \mathbb{C}^{n-p}$ ($1 \le p \le n-1$): $UA = (A_1 \oplus A_2)U$ for some unitary $U$. Since $A$ is nonderogatory, $A_1$ and $A_2$ have no common eigenvalue. We next show that all eigenvalues of $A$ have algebraic multiplicity one. Indeed, if $a$ is an eigenvalue of $A$ with algebraic multiplicity bigger than one, then $x_1 = (1, a, a^2, \ldots, a^{n-1})^T$ and $x_2 = (0, 1, 2a, \ldots, (n-1)a^{n-2})^T$ are generalized eigenvectors of $A$ for $a$. We assume that $a$ is also an eigenvalue of $A_1$. Let $b$ be any eigenvalue of $A_2$. Then $b$ is also an eigenvalue of $A$ with the corresponding eigenvector $y = (1, b, b^2, \ldots, b^{n-1})^T$. Since $a \ne b$, we infer that $Ux_1$ and $Ux_2$ are in $\mathbb{C}^p \oplus 0$ and $Uy$ is in $0 \oplus \mathbb{C}^{n-p}$. Hence

$$\langle Ux_1, Uy\rangle = \langle x_1, y\rangle = \sum_{j=0}^{n-1} (a\bar b)^j = 0 \tag{2}$$

and

$$\langle Ux_2, Uy\rangle = \langle x_2, y\rangle = \bar b \sum_{j=0}^{n-1} j (a\bar b)^{j-1} = 0. \tag{3}$$

We obtain $b \ne 0$ from (2) and hence $\sum_{j=0}^{n-1} j (a\bar b)^{j-1} = 0$ from (3). These imply that $a\bar b$ is a multiple zero of the polynomial $\sum_{j=0}^{n-1} z^j$, which is certainly absurd. We conclude that eigenvalues of $A$ can only have algebraic multiplicity one. Moreover, from (2) we also have $(a\bar b)^n = 1$. Since $b$ is an arbitrary eigenvalue of $A_2$, we deduce that eigenvalues of $A_2$ are of the form $(1/\bar a)\omega_n^{j_k}$, $k = p+1, \ldots, n$, while those of $A_1$ are of the form $a\omega_n^{j_k}$, $k = 1, \ldots, p$. It is obvious that $\{j_1, \ldots, j_p\}$ and $\{j_{p+1}, \ldots, j_n\}$ form a partition of $\{0, 1, \ldots, n-1\}$.

To prove the converse, we assume that the eigenvalues of $A$ are of the asserted form. If $b = a\omega_n^{j_k}$ and $c = (1/\bar a)\omega_n^{j_l}$, where $1 \le k \le p$ and $p+1 \le l \le n$, then their corresponding eigenvectors $x = (1, b, b^2, \ldots, b^{n-1})^T$ and $y = (1, c, c^2, \ldots, c^{n-1})^T$ satisfy

$$\langle x, y\rangle = \sum_{j=0}^{n-1} (b\bar c)^j = \sum_{j=0}^{n-1} \omega_n^{j(j_k - j_l)} = 0.$$

Let $H_1$ and $H_2$ be the subspaces of $\mathbb{C}^n$ generated by the eigenvectors for $a\omega_n^{j_1}, \ldots, a\omega_n^{j_p}$ and $(1/\bar a)\omega_n^{j_{p+1}}, \ldots, (1/\bar a)\omega_n^{j_n}$, respectively. Then $H_1$ and $H_2$ are invariant subspaces of $A$ which are orthogonal to each other. $A$ is obviously unitarily equivalent to the direct sum of the restrictions $A_1 = A|H_1$ and $A_2 = A|H_2$. This completes the proof.
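The orthogonality used in the converse direction can be checked numerically. The following Python sketch is only an illustration under assumed values: the size `n`, the number `a`, and the partition of exponents are arbitrary choices, and the helper names are hypothetical.

```python
import cmath


def vandermonde_vector(z, n):
    """The vector (1, z, z^2, ..., z^(n-1))."""
    return [z ** k for k in range(n)]


def inner(x, y):
    """Standard inner product <x, y> = sum_j x_j * conj(y_j)."""
    return sum(s * t.conjugate() for s, t in zip(x, y))


n = 5
a = 0.7 + 0.4j                          # any nonzero complex number
omega = cmath.exp(2j * cmath.pi / n)    # primitive n-th root of 1
first = [0, 2]                          # exponents j_1, ..., j_p
second = [1, 3, 4]                      # exponents j_{p+1}, ..., j_n

# Largest |<x, y>| over eigenvectors x for a*omega^jk and y for (1/conj(a))*omega^jl.
max_abs = 0.0
for jk in first:
    for jl in second:
        x = vandermonde_vector(a * omega ** jk, n)
        y = vandermonde_vector((1 / a.conjugate()) * omega ** jl, n)
        max_abs = max(max_abs, abs(inner(x, y)))
```

Up to rounding, `max_abs` vanishes, because $b\bar c$ is an $n$th root of unity different from 1 whenever the two exponents differ.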

The next corollary gives conditions for a companion matrix to be unitary. The equivalence of (a) and (e) therein is a consequence of Theorem 1.1.

Corollary 1.2. The following conditions are equivalent for an $n \times n$ companion matrix $A$ of the form (1):

(a) $A$ is unitary;

(b) $A$ is normal;

(c) $a_1 = \cdots = a_{n-1} = 0$ and $|a_n| = 1$;

(d) the eigenvalues of $A$ are of the form $a\omega_n^j$, $j = 0, 1, \ldots, n-1$, where $|a| = 1$ and $\omega_n$ is the $n$th primitive root of 1;

(e) $A$ is unitarily equivalent to a direct sum $A_1 \oplus A_2$ with $A_1$ unitary.

Proof. (a)⇒(b) is trivial. To prove (b)⇒(c), assume that $A$ is normal. Carrying out the matrix multiplications in $AA^* = A^*A$ and equating the first $n-1$ diagonal entries of the two products reveals that $|a_n| = 1$ and $a_2 = \cdots = a_{n-1} = 0$. Then the equality of the $(n, 1)$ entries ($-a_{n-1} = a_1 a_n$) yields that $a_1 = 0$. If (c) holds, then the characteristic polynomial of $A$ is $z^n + a_n$. Hence the eigenvalues of $A$ are of the form asserted in (d).

Next assume that (d) holds. If $b = a\omega_n^k$ and $c = a\omega_n^l$ are two distinct eigenvalues of $A$, then their corresponding eigenvectors $x = (1, b, b^2, \ldots, b^{n-1})^T$ and $y = (1, c, c^2, \ldots, c^{n-1})^T$ satisfy

$$\langle x, y\rangle = \sum_{j=0}^{n-1} (b\bar c)^j = \sum_{j=0}^{n-1} \omega_n^{j(k-l)} = 0.$$

Thus $A$ is unitarily equivalent to the diagonal matrix $\mathrm{diag}(a, a\omega_n, \ldots, a\omega_n^{n-1})$, which is unitary since $|a| = 1$. Hence (a) holds.

To complete the proof, we need only show that (e) implies (d). Indeed, if (e) holds, then $A$ is reducible. Therefore, the eigenvalues of $A_1$ and $A_2$ are of the form asserted in Theorem 1.1. Since $A_1$ is unitary, we must have $|a| = 1$. Thus the eigenvalues of $A$ are $a\omega_n^j$, $j = 0, 1, \ldots, n-1$, that is, (d) holds.
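Condition (c) of Corollary 1.2 can be illustrated numerically. The sketch below is a sanity check under assumed values ($n = 4$, $a_n = e^{0.3i}$, and a perturbation $a_1 = 0.5$), not part of the proof; all helper names are hypothetical.

```python
import cmath


def companion(coeffs):
    """Companion matrix with last row (-a_n, ..., -a_1), as in (1)."""
    n = len(coeffs)
    rows = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        rows[i][i + 1] = 1
    rows[n - 1] = [-a for a in reversed(coeffs)]
    return rows


def adjoint(m):
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_gap(a, b):
    """Largest entrywise absolute difference between two matrices."""
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))


n = 4
a_n = cmath.exp(0.3j)                        # |a_n| = 1
identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Condition (c): a_1 = ... = a_{n-1} = 0 and |a_n| = 1, so A*A should be I.
A = companion([0, 0, 0, a_n])
unitary_gap = max_gap(matmul(adjoint(A), A), identity)

# With a_1 = 0.5 condition (c) fails, and B B* differs from B* B.
B = companion([0.5, 0, 0, a_n])
normal_gap = max_gap(matmul(B, adjoint(B)), matmul(adjoint(B), B))
```

Here `unitary_gap` is zero up to rounding, while `normal_gap` is bounded away from zero, which matches the equivalence of (a), (b) and (c) for this sample.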

Corollary 1.3. A companion matrix unitarily equivalent to the direct sum of three or more matrices must be unitary.

Proof. Let $A$ be an $n \times n$ companion matrix unitarily equivalent to $A_1 \oplus \cdots \oplus A_k$, $k \ge 3$, and let $a$, $b$ and $c$ be any eigenvalues of $A_1$, $A_2$ and $A_3$, respectively. We infer from Theorem 1.1 that $|ab| = |bc| = |ca| = 1$ and hence $|a| = |b| = |c| = 1$. This shows that all eigenvalues of $A$ have modulus one. By Theorem 1.1 again, the eigenvalues of $A$ are of the form $a\omega_n^j$, $j = 0, 1, \ldots, n-1$. Therefore, $A$ is unitary by Corollary 1.2.

2. Numerical ranges

Recall that the numerical range of an $n \times n$ matrix $A$ is the subset

$$W(A) = \{\langle Ax, x\rangle : x \in \mathbb{C}^n, \|x\| = 1\}$$

of the plane. Properties of the numerical range can be found in [5, Chapter 1]. In this section, we consider to what extent a companion matrix is determined by its numerical range. For $2 \times 2$ companion matrices, the numerical range provides the complete information: if $A$ and $B$ are $2 \times 2$ companion matrices, then $A = B$ if and only if $W(A) = W(B)$. This is a consequence of the fact that $2 \times 2$ matrices with equal numerical ranges are unitarily equivalent. Unfortunately, the same cannot be said about companion matrices of size three. The next example gives two distinct such matrices with equal numerical ranges.

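Example 2.1. Let

$$A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -\sqrt{3}\,i & 4 & (\sqrt{3}/4)\,i \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ \sqrt{3}\,i & 4 & -(\sqrt{3}/4)\,i \end{pmatrix}.$$

We show that $W(A) = W(B)$ via a result of Kippenhahn [7] that the numerical range of any $n \times n$ matrix $C$ equals the convex hull of the real points of the dual curve of $p_C(x, y, z) = 0$, where $p_C$ is the degree-$n$ homogeneous polynomial in $x$, $y$ and $z$ given by

$$p_C(x, y, z) = \det(x\,\mathrm{Re}\,C + y\,\mathrm{Im}\,C + zI_n)$$

with $\mathrm{Re}\,C = (C + C^*)/2$ and $\mathrm{Im}\,C = (C - C^*)/(2i)$. In our case, we have

$$p_A(x, y, z) = \det\left( x \begin{pmatrix} 0 & 1/2 & (\sqrt{3}/2)i \\ 1/2 & 0 & 5/2 \\ -(\sqrt{3}/2)i & 5/2 & 0 \end{pmatrix} + y \begin{pmatrix} 0 & -i/2 & -\sqrt{3}/2 \\ i/2 & 0 & (3/2)i \\ -\sqrt{3}/2 & -(3/2)i & \sqrt{3}/4 \end{pmatrix} + z \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \right)$$

$$= \det \begin{pmatrix} z & (x - yi)/2 & \sqrt{3}(-y + xi)/2 \\ (x + yi)/2 & z & (5x + 3yi)/2 \\ -\sqrt{3}(y + xi)/2 & (5x - 3yi)/2 & (\sqrt{3}/4)y + z \end{pmatrix}$$

$$= z^3 + \frac{\sqrt{3}}{4} y z^2 - \frac{1}{4}(29x^2 + 13y^2) z - \frac{\sqrt{3}}{16}(29x^2 + 13y^2) y = \left(z + \frac{\sqrt{3}}{4} y\right)\left(z^2 - \frac{29}{4} x^2 - \frac{13}{4} y^2\right)$$

and, similarly,

$$p_B(x, y, z) = \left(z - \frac{\sqrt{3}}{4} y\right)\left(z^2 - \frac{29}{4} x^2 - \frac{13}{4} y^2\right).$$

The dual curve of $p_A = 0$ consists of the point $(0, \sqrt{3}/4)$ and the ellipse

$$\frac{4x^2}{29} + \frac{4y^2}{13} = 1. \tag{4}$$

Since $(0, \sqrt{3}/4)$ lies inside the ellipse, the numerical range $W(A)$ equals the (closed) elliptic disc bounded by (4). In a similar fashion, we obtain that $W(B)$ equals this same elliptic disc.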
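The two factorizations in Example 2.1 can be spot-checked numerically. The Python sketch below evaluates $\det(x\,\mathrm{Re}\,M + y\,\mathrm{Im}\,M + zI_3)$ directly at a few sample points; the function names and the sample points are assumptions chosen for illustration.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def kippenhahn(M, x, y, z):
    """p_M(x, y, z) = det(x Re M + y Im M + z I) for a 3x3 matrix M."""
    rows = []
    for r in range(3):
        row = []
        for c in range(3):
            adj = complex(M[c][r]).conjugate()        # (M*)[r][c]
            re_part = (M[r][c] + adj) / 2             # (Re M)[r][c]
            im_part = (M[r][c] - adj) / 2j            # (Im M)[r][c]
            row.append(x * re_part + y * im_part + (z if r == c else 0))
        rows.append(row)
    return det3(rows)


s3 = math.sqrt(3)
A = [[0, 1, 0], [0, 0, 1], [-s3 * 1j, 4, (s3 / 4) * 1j]]
B = [[0, 1, 0], [0, 0, 1], [s3 * 1j, 4, -(s3 / 4) * 1j]]


def factored(sign, x, y, z):
    """(z + sign*(sqrt(3)/4) y)(z^2 - (29/4) x^2 - (13/4) y^2)."""
    return (z + sign * (s3 / 4) * y) * (z * z - 29 / 4 * x * x - 13 / 4 * y * y)


samples = [(1.0, 0.0, 0.5), (0.3, -1.2, 2.0), (-0.7, 0.9, -1.1)]
err_A = max(abs(kippenhahn(A, *p) - factored(+1, *p)) for p in samples)
err_B = max(abs(kippenhahn(B, *p) - factored(-1, *p)) for p in samples)
```

Agreement at finitely many points is only a consistency check, not a proof of the polynomial identity, but it guards against sign slips in the matrices.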

The (noncircular) elliptic disc turns out to be the only exceptional numerical range for 3× 3 companion matrices.

Theorem 2.2. Let A and B be 3× 3 companion matrices. If W(A) = W(B) is not

a noncircular elliptic disc, then A= B.

We start the proof by noting that a classification of numerical ranges of $3 \times 3$ matrices $A$ was obtained before by Kippenhahn [7] followed by Keeler et al. [6]. The former is based on the factorability of $p_A$ and has $W(A)$ classified into four classes:

(a) $p_A$ factors into three linear factors:

$$p_A(x, y, z) = \prod_{j=1}^{3} (z + a_j x + b_j y).$$

In this case, $A$ is normal and $W(A)$ is the (closed) triangular region with vertices $(a_j, b_j)$, $j = 1, 2, 3$.

(b) $p_A$ factors into a linear factor and an irreducible quadratic one:

$$p_A(x, y, z) = (z + ax + by)\,q(x, y, z).$$

Then $W(A)$ is the convex hull of the point $(a, b)$ and the ellipse $E$ which is the dual of $q(x, y, z) = 0$. It is an elliptic disc if $(a, b)$ lies inside $E$.

(c) $p_A$ is irreducible and the dual curve of $p_A = 0$ has degree four. In this case, $W(A)$ has a line segment on its boundary.

(d) $p_A$ is irreducible and the dual curve of $p_A = 0$ has degree six. Then the dual curve consists of two parts, one inside the other, and $W(A)$ is an ovular region (that is, a region with a strictly convex boundary).

The paper [6] further develops these into criteria in terms of entries of $A$ for the above cases. These we will also use in the following discussions.

The next proposition and its corollaries take care of the cases of irreducible pA.

Proposition 2.3. Let $A$ and $B$ be square matrices (not necessarily of the same size). If $W(A) = W(B)$, then $p_A$ and $p_B$ have a common irreducible factor.

Proof. Let $p_A = p_1 \cdots p_k$ and $p_B = q_1 \cdots q_l$ be factorizations of $p_A$ and $p_B$ into irreducible factors. Let $C_i$ and $D_j$ denote the curves $p_i = 0$ and $q_j = 0$, respectively, and let $C_i^*$ and $D_j^*$ be their respective duals. If $W(A) = W(B)$, then some $C_i^*$ and $D_j^*$ have a common arc (by Kippenhahn's result) so that they have infinitely many common tangent lines. By duality, the curves $C_i$ and $D_j$ have infinitely many common points. Since $p_i$ and $q_j$ are irreducible, Bézout's theorem [8, Theorem 3.1] implies that $p_i = q_j$ as required.

Corollary 2.4. Let $A$ and $B$ be $n \times n$ matrices and assume that $p_A$ is irreducible. Then $W(A) = W(B)$ if and only if $p_A = p_B$.

Proof. The necessity follows easily from Proposition 2.3 while the sufficiency is a consequence of Kippenhahn's result.

Corollary 2.5. Let $A$ and $B$ be $n \times n$ companion matrices and assume that $p_A$ is irreducible. Then $W(A) = W(B)$ if and only if $A = B$.

Proof. In view of Corollary 2.4 and the fact that a companion matrix is completely determined by its eigenvalues, we need only prove that the equality of $p_A$ and $p_B$ implies that $A$ and $B$ have the same eigenvalues. Indeed, if $p_A(x, y, z) = p_B(x, y, z)$ for all $x$, $y$ and $z$, then, letting $x = 1$ and $y = i$, we obtain $\det(\mathrm{Re}\,A + i\,\mathrm{Im}\,A + zI_n) = \det(\mathrm{Re}\,B + i\,\mathrm{Im}\,B + zI_n)$ or $\det(A + zI_n) = \det(B + zI_n)$ for all $z$, which shows that $A$ and $B$ have the same eigenvalues.

We next move to the case of reducible $p_A$. The following proposition gives the uniqueness result for $3 \times 3$ companion matrices when the numerical range is a circular disc.

Proposition 2.6. (a) For any point $a$ in the plane, there is a $3 \times 3$ companion matrix whose numerical range is a circular disc centered at $a$. The number of such matrices is at most three.

(b) Let $A$ and $B$ be $3 \times 3$ companion matrices. If $W(A) = W(B)$ is a circular disc, then $A = B$.

The proof of this proposition involves a fair amount of algebraic computation. It depends on the following criterion for a $3 \times 3$ matrix to have a circular numerical range (cf. [6, Corollary 2.5]).

Proposition 2.7. The numerical range of a $3 \times 3$ matrix $A$ is a circular disc if and only if

(a) $A$ has a multiple eigenvalue $a$ (so that its eigenvalues are $a$, $a$ and $b$),

(b) $2a\,\mathrm{tr}(A^*A) = \mathrm{tr}(A^*A^2) + 2|a|^2 a + (2a - b)|b|^2$, and

(c) $4|a - b|^2 + 2|a|^2 + |b|^2 \le \mathrm{tr}(A^*A)$.

In this case, $W(A)$ is the circular disc with center $a$ and radius $(\mathrm{tr}(A^*A) - 2|a|^2 - |b|^2)^{1/2}/2$.

The next lemma simplifies the present situation by allowing us to focus on the companion matrices whose circular numerical ranges are centered on the x-axis.

Lemma 2.8. If $A$ is a companion matrix, then $\lambda A$ is unitarily equivalent to a companion matrix for any $\lambda$ with $|\lambda| = 1$.

Proof. Assume that the companion matrix $A$ is of size $n$. For any $\lambda$ with $|\lambda| = 1$, let $U$ be the $n \times n$ unitary matrix $\mathrm{diag}(\lambda, 1, \bar\lambda, \bar\lambda^2, \ldots, \bar\lambda^{n-2})$. A little computation shows that $U^*(\lambda A)U$ is a companion matrix.

We now proceed to prove Proposition 2.6.
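As a numerical sanity check of Lemma 2.8, the following Python sketch carries out the "little computation" for one assumed example ($n = 4$, arbitrary coefficients, $\lambda = e^{0.9i}$). It compares $U^*(\lambda A)U$ with the companion matrix whose coefficients are $\lambda^k a_k$, which is what one expects since the eigenvalues of $\lambda A$ are $\lambda$ times those of $A$. The helper names are hypothetical.

```python
import cmath


def companion(coeffs):
    """Companion matrix with last row (-a_n, ..., -a_1), as in (1)."""
    n = len(coeffs)
    rows = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        rows[i][i + 1] = 1
    rows[n - 1] = [-a for a in reversed(coeffs)]
    return rows


n = 4
coeffs = [1 + 2j, -0.5, 3j, 2 - 1j]       # a_1, ..., a_n (illustrative values)
lam = cmath.exp(0.9j)                      # |lam| = 1
A = companion(coeffs)

# Diagonal of U = diag(lam, 1, conj(lam), conj(lam)^2, ..., conj(lam)^(n-2)).
u = [lam] + [lam.conjugate() ** k for k in range(n - 1)]

# For diagonal U: (U* (lam A) U)[i][j] = conj(u_i) * lam * A[i][j] * u_j.
C = [[u[i].conjugate() * lam * A[i][j] * u[j] for j in range(n)]
     for i in range(n)]

# Companion matrix of the polynomial with coefficients lam^k * a_k.
expected = companion([lam ** (k + 1) * a for k, a in enumerate(coeffs)])
gap = max(abs(C[i][j] - expected[i][j]) for i in range(n) for j in range(n))
```

Because $U$ is diagonal, the conjugation only rescales entries, so the superdiagonal ones are preserved exactly and only the last row changes.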

Proof of Proposition 2.6. (a) In view of Lemma 2.8, we may assume that $a \ge 0$. Since, by Proposition 2.7, a matrix with its numerical range a circular disc has two eigenvalues equal to its center, we need only consider the companion matrix $A$ of the form

$$\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ a^2 b & -(a^2 + 2ab) & 2a + b \end{pmatrix},$$

where $b$ is to be determined. Some computations with the above matrix yield that the equality in Proposition 2.7(b) is the same as

$$2a(a^4|b|^2 + |a^2 + 2ab|^2 + |2a + b|^2 + 2) = (2a + b)a^4|b|^2 - a(a + 2\bar b)[a^2 b - a(a + 2b)(2a + b)] + (2a + b) + (2a + \bar b)[-a(a + 2b) + (2a + b)^2] + 2a^3 + (2a - b)|b|^2,$$

which can be simplified to

$$a^2(a^2 + 4)|b|^2 b + 2a(a^2 + 1)b^2 + 2a|b|^2 + b - a^2\bar b - 2a = 0. \tag{5}$$

We show that any $b$ satisfying (5) must be real. Indeed, substituting $b = x + iy$ ($x$, $y$ real) into (5) and taking the real and imaginary parts of the resulting equality, we obtain

$$a^2(a^2 + 4)(x^2 + y^2)x + 2a(a^2 + 1)(x^2 - y^2) + 2a(x^2 + y^2) + x - a^2 x - 2a = 0 \tag{6}$$

and

$$a^2(a^2 + 4)(x^2 + y^2)y + 2a(a^2 + 1)\,2xy + y + a^2 y = 0. \tag{7}$$

If $y \ne 0$, then we derive from (7) that

$$y^2 = -x^2 - \frac{4(a^2 + 1)}{a(a^2 + 4)}\,x - \frac{a^2 + 1}{a^2(a^2 + 4)}. \tag{8}$$

Plugging (8) into (6) and simplifying the resulting equality yields $a^6 x = a^3$. If $a = 0$, then (7) already gives $y = 0$, contradicting our assumption. Thus $a \ne 0$ and hence $x = 1/a^3$. Equality (8) then becomes

$$y^2 = -\frac{1}{a^6} - \frac{4(a^2 + 1)}{a^4(a^2 + 4)} - \frac{a^2 + 1}{a^2(a^2 + 4)} < 0,$$

a contradiction. Thus $y$ must be zero and every $b$ satisfying (5) is real. Consider the cubic polynomial

$$p(z) = a^2(a^2 + 4)z^3 + 2a(a^2 + 2)z^2 + (1 - a^2)z - 2a \tag{9}$$

associated with (5). It has one or three real zeros $b$. These $b$'s certainly satisfy (5) and hence also the equality in Proposition 2.7(b) for our $A$. We now check that they satisfy the inequality of Proposition 2.7(c). Assume otherwise that some $b$ is outside the circle with center $a$ and radius $(\mathrm{tr}(A^*A) - 2|a|^2 - |b|^2)^{1/2}/2$. Then $b$ must be a corner of $W(A)$ and hence a reducing eigenvalue of $A$ ($Ax = bx$ and $A^*x = \bar b x$ for some nonzero vector $x$) (cf. [5, Theorems 1.6.3 and 1.6.6]). Thus $A$ is reducible, and Theorem 1.1 implies that the eigenvalues of $A$ are distinct, which contradicts our assumption that $a$ is a multiple eigenvalue of $A$. We conclude from Proposition 2.7 that there exists a companion matrix with numerical range a circular disc centered at $a$.
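The construction in part (a) can be tried out for a concrete centre. The Python sketch below assumes $a = 1$, locates a positive real zero $b$ of the cubic (9) by bisection, builds the corresponding companion matrix, and then evaluates the equality and inequality of Proposition 2.7, with (b) taken in the form used in the displayed computation above (last term $(2a - b)|b|^2$). The function names and the bisection scheme are illustrative choices, not the authors' method.

```python
def cubic_p(a, z):
    """The cubic (9) for a real centre a."""
    return (a * a * (a * a + 4) * z ** 3 + 2 * a * (a * a + 2) * z ** 2
            + (1 - a * a) * z - 2 * a)


def positive_root(a):
    """Bisection for a zero of (9) in (0, hi); needs a > 0, so p(0) = -2a < 0."""
    lo, hi = 0.0, 1.0
    while cubic_p(a, hi) <= 0:
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if cubic_p(a, mid) <= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def matmul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace_of_adjoint_times(p, q):
    """tr(P* Q) for real matrices: the sum of entrywise products."""
    return sum(p[i][j] * q[i][j] for i in range(len(p)) for j in range(len(p)))


a = 1.0                                   # assumed centre on the positive x-axis
b = positive_root(a)
A = [[0, 1, 0], [0, 0, 1], [a * a * b, -(a * a + 2 * a * b), 2 * a + b]]

tr_AA = trace_of_adjoint_times(A, A)                  # tr(A* A)
tr_AA2 = trace_of_adjoint_times(A, matmul(A, A))      # tr(A* A^2)

# Proposition 2.7(b): difference of the two sides (should vanish).
residual_b = 2 * a * tr_AA - (tr_AA2 + 2 * a ** 3 + (2 * a - b) * b * b)
# Proposition 2.7(c): slack of the inequality (should be nonnegative).
slack_c = tr_AA - (4 * (a - b) ** 2 + 2 * a * a + b * b)
radius = (tr_AA - 2 * a * a - b * b) ** 0.5 / 2
```

For this sample the residual of (b) is at rounding level and the slack in (c) is positive, consistent with the argument that every real zero of (9) yields a companion matrix with circular numerical range.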


(b) Again, by Lemma 2.8 we may assume that $W(A) = W(B)$ is a circular disc centered at a point $a \ge 0$. Then $A$ and $B$ are of the forms

$$A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ a^2 b & -(a^2 + 2ab) & 2a + b \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ a^2 c & -(a^2 + 2ac) & 2a + c \end{pmatrix}.$$

We need to prove that $b = c$. As in (a), $b$ and $c$ must both be real. If $a = 0$, then $b = c = 0$ by (5) and hence $A = B$ as desired. For the remaining part of the proof, we assume that $a > 0$. Since the radius of $W(A)$ is given by $(\mathrm{tr}(A^*A) - 2a^2 - b^2)^{1/2}/2$ or

$$\frac{1}{2}\left(a^4 b^2 + (a^2 + 2ab)^2 + (2a + b)^2 + 2 - 2a^2 - b^2\right)^{1/2}$$

by Proposition 2.7 and that for $W(B)$ by a similar expression, we have

$$a^4 b^2 + (a^2 + 2ab)^2 + (2a + b)^2 - b^2 = a^4 c^2 + (a^2 + 2ac)^2 + (2a + c)^2 - c^2.$$

This can be simplified to

$$[a(a^2 + 4)(b + c) + 4(a^2 + 1)](b - c) = 0.$$

Assume to the contrary that $b \ne c$. Then we obtain from above

$$b + c = -\frac{4(a^2 + 1)}{a(a^2 + 4)}. \tag{10}$$

On the other hand, consider the cubic polynomial $p$ given in (9). As proved in part (a), we have $p(b) = p(c) = 0$. Let $d$ be the remaining (real) zero of $p$ so that

$$b + c + d = -\frac{2(a^2 + 2)}{a(a^2 + 4)} \tag{11}$$

and

$$bcd = \frac{2}{a(a^2 + 4)} \tag{12}$$

hold. We subtract (10) from (11) to obtain $d = 2a/(a^2 + 4)$ and then divide (12) by this expression for $d$ to have $bc = 1/a^2$. Using this, we may eliminate $c$ from (10) to obtain

$$a^2(a^2 + 4)b^2 + 4a(a^2 + 1)b + (a^2 + 4) = 0. \tag{13}$$

Moreover, substitute $d = 2a/(a^2 + 4)$ into

$$p(d) = a^2(a^2 + 4)d^3 + 2a(a^2 + 2)d^2 + (1 - a^2)d - 2a = 0$$

and simplify the resulting equality to obtain $6a(a^2 + 4)(2a^4 - a^2 - 4) = 0$ or

$$2a^4 = a^2 + 4. \tag{14}$$

This can be solved for $a^2$ to give

$$a^2 = \frac{1}{4}(1 + \sqrt{33}) < \frac{7}{4}. \tag{15}$$

Using (14), we simplify (13) to

$$a^5 b^2 + 2(a^2 + 1)b + a^3 = 0. \tag{16}$$

Considered as an equation in $b$, this has discriminant

$$4(a^2 + 1)^2 - 4a^8 = \frac{3}{2}(a^2 - 4) < 0$$

by (14) and (15). This shows that solutions $b$ of (16) are not real, a contradiction. We conclude that $b = c$ and hence $A = B$, completing the proof.
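The closing arithmetic of part (b) is easy to confirm numerically. This short Python sketch, with variable names chosen only for illustration, evaluates (14), (15) and the discriminant of (16) at $a^2 = (1 + \sqrt{33})/4$.

```python
import math

a2 = (1 + math.sqrt(33)) / 4             # a^2 from (15)

# (14): 2 a^4 = a^2 + 4, written as a difference that should vanish.
gap_14 = 2 * a2 ** 2 - (a2 + 4)

# Discriminant of (16) as a quadratic in b: 4 (a^2 + 1)^2 - 4 a^8.
disc = 4 * (a2 + 1) ** 2 - 4 * a2 ** 4

# Claimed closed form (3/2)(a^2 - 4), valid when (14) holds.
gap_disc = disc - 1.5 * (a2 - 4)
```

Since `disc` is negative, the quadratic (16) has no real solution $b$, which is the contradiction used in the proof.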

We now wrap up the proof of Theorem 2.2.

Proof of Theorem 2.2. Consider the following three cases:

(a) $p_A$ factors into three linear factors. Then $A$ is normal and hence $W(A)$ is an equilateral triangular region (cf. Corollary 1.2). Thus if $W(A) = W(B)$, then the eigenvalues of $A$ and $B$ are both the vertices of the triangular region. Hence $A = B$.

(b) $p_A$ factors into a linear factor and an irreducible quadratic one. Then $W(A)$ is the convex hull of a point $P$ and an ellipse $E$. If $P$ is inside $E$, then $W(A)$ is, by our assumption, a circular disc. Hence the equality of $W(A)$ and $W(B)$ implies $A = B$ by Proposition 2.6(b). On the other hand, if $P$ is outside $E$, then $P$ is a corner of $W(A)$. In this case, $W(A) = W(B)$ implies that the three eigenvalues (one is the point $P$ and the other two are the foci of $E$) of $A$ and $B$ coincide. It follows that $A = B$.

(c) $p_A$ is irreducible. Then $A = B$ follows from Corollary 2.5.

In view of Proposition 2.6(b), we may wonder whether an $n \times n$ companion matrix can be completely determined by its circular numerical range. For this we have some reservation. But in case the circular numerical range is centered at the origin, then this is indeed true.

Theorem 2.9. If $A$ is an $n \times n$ companion matrix whose numerical range is a circular disc centered at the origin, then $A$ equals the Jordan block $J_n$.

The proof is based on two known facts: (a) a finite matrix $A$ has its numerical range equal to a circular disc centered at the origin if and only if the maximum eigenvalue $\lambda$ of $\mathrm{Re}(wA)$ is independent of $w$, $|w| = 1$, in which case $\lambda$ is the radius of the disc, and (b) $W(J_n)$ is a circular disc with center at the origin and radius $\cos(\pi/(n+1))$ (cf. [3, Proposition 1]).
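Fact (b) can be made plausible with a short computation. The Python sketch below, for an assumed size $n = 6$, checks that $\cos(\pi/(n+1))$ is an eigenvalue of $\mathrm{Re}\,J_n$ with the entrywise positive eigenvector $v_k = \sin(k\pi/(n+1))$; by the Perron–Frobenius theorem a positive eigenvector of this nonnegative irreducible matrix belongs to its largest eigenvalue. The function name is hypothetical.

```python
import math


def re_jordan_times(v):
    """Apply Re J_n = (J_n + J_n*)/2, a tridiagonal matrix with 1/2 off-diagonals."""
    n = len(v)
    out = []
    for k in range(n):
        left = v[k - 1] if k > 0 else 0.0
        right = v[k + 1] if k < n - 1 else 0.0
        out.append((left + right) / 2)
    return out


n = 6
theta = math.pi / (n + 1)
v = [math.sin((k + 1) * theta) for k in range(n)]     # positive entries
w = re_jordan_times(v)

# (Re J_n) v should equal cos(pi/(n+1)) * v.
gap = max(abs(w[k] - math.cos(theta) * v[k]) for k in range(n))
```

The identity behind the check is $\sin((k-1)\theta) + \sin((k+1)\theta) = 2\cos\theta\,\sin(k\theta)$, together with $\sin(0) = \sin((n+1)\theta) = 0$ at the two ends.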

Proof of Theorem 2.9. Let $A$ be the matrix given by (1) and let $\lambda$ be the maximum eigenvalue of $\mathrm{Re}(wA)$, $|w| = 1$. Thus $\det(\lambda I_n - \mathrm{Re}(wA)) = 0$ for all $w$. The expansion of the determinant of

$$\lambda I_n - \mathrm{Re}(wA) = \begin{pmatrix} \lambda & -w/2 & & & \overline{a_n w}/2 \\ -\bar w/2 & \lambda & \ddots & & \vdots \\ & \ddots & \ddots & -w/2 & \overline{a_3 w}/2 \\ & & -\bar w/2 & \lambda & (\overline{a_2 w} - w)/2 \\ a_n w/2 & \cdots & a_3 w/2 & (a_2 w - \bar w)/2 & \lambda + \mathrm{Re}(a_1 w) \end{pmatrix} \tag{17}$$

can be considered as a (trigonometric) polynomial in $w$. Since it has infinitely many zeros, the coefficients of $w^j$ for $j = 0, \pm 1, \ldots, \pm n$ are all zero. Making use of this, we show that all the $a_j$'s are also zero. Indeed, since the coefficient of $w^n$ can be computed to be $(-1)^{n-1} a_n/2^n$, it follows that $a_n = 0$. Assuming that $a_n = \cdots = a_{j+1} = 0$ ($2 \le j \le n-1$), we prove by induction that $a_j = 0$. Consider the matrix in (17) partitioned as

$$\begin{pmatrix} A_j & B_j \\ C_j & D_j \end{pmatrix},$$

where $A_j$, $B_j$, $C_j$ and $D_j$ are submatrices of sizes $(n-j) \times (n-j)$, $(n-j) \times j$, $j \times (n-j)$ and $j \times j$, respectively. We claim that $A_j$ is invertible. Indeed, if $p_j$ denotes the characteristic polynomial of $\mathrm{Re}\,J_{n-j}$, then $\det A_j = \det(\lambda I_{n-j} - \mathrm{Re}(wJ_{n-j})) = p_j(\lambda)$. Hence we have to show that $p_j(\lambda) \ne 0$. Assume otherwise that $p_j(\lambda) = 0$. Then $\lambda$ is an eigenvalue of $\mathrm{Re}\,J_{n-j}$ and hence is in $W(\mathrm{Re}\,J_{n-j}) = \mathrm{Re}\,W(J_{n-j})$, which implies that $\lambda \le \cos(\pi/(n-j+1))$. On the other hand, since $J_{n-1}$ is a submatrix of $A$, we have $W(J_{n-1}) \subseteq W(A)$. These are circular discs with center the origin and radii $\cos(\pi/n)$ and $\lambda$, respectively. Thus $\cos(\pi/n) \le \lambda$ and therefore $\cos(\pi/n) \le \cos(\pi/(n-j+1))$. It follows that $j \le 1$, contradicting our assumption. Hence we have $p_j(\lambda) = \det A_j \ne 0$ and therefore $A_j$ is invertible. Then

$$\det(\lambda I_n - \mathrm{Re}(wA)) = \det A_j \cdot \det(D_j - C_j A_j^{-1} B_j).$$

Since

$$\det(D_j - C_j A_j^{-1} B_j) = \det\left( D_j - \begin{pmatrix} 0 & \cdots & -\bar w/2 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{pmatrix} \begin{pmatrix} * & \cdots & * \\ \vdots & & \vdots \\ * & \cdots & \det A_{j+1}/\det A_j \end{pmatrix} \begin{pmatrix} 0 & \cdots & 0 \\ \vdots & & \vdots \\ -w/2 & \cdots & 0 \end{pmatrix} \right) = \det \begin{pmatrix} \lambda - (1/4)(p_{j+1}(\lambda)/p_j(\lambda)) & \cdots & * \\ \vdots & & \vdots \\ * & \cdots & * \end{pmatrix},$$

where the remaining entries of this last matrix are exactly the same as those of $D_j$, the coefficient of $w^j$ in $\det(D_j - C_j A_j^{-1} B_j)$ is $a_j/2^j$. Hence the coefficient of $w^j$ in $\det(\lambda I_n - \mathrm{Re}(wA))$ equals $p_j(\lambda) a_j/2^j$. Since this is zero and $p_j(\lambda) \ne 0$, we obtain $a_j = 0$ as asserted.

Finally, we need to check that $a_1 = 0$. Since

$$\lambda I_n - \mathrm{Re}(wA) = \begin{pmatrix} \lambda & -w/2 & & & \\ -\bar w/2 & \lambda & \ddots & & \\ & \ddots & \ddots & \ddots & \\ & & \ddots & \lambda & -w/2 \\ & & & -\bar w/2 & \lambda + \mathrm{Re}(a_1 w) \end{pmatrix}$$

from what was proved above, we obtain

$$\det(\lambda I_n - \mathrm{Re}(wA)) = (\lambda + \mathrm{Re}(a_1 w))\,p_1(\lambda) - \left(-\frac{w}{2}\right)\left(-\frac{\bar w}{2}\right) p_2(\lambda) = (\lambda + \mathrm{Re}(a_1 w))\,p_1(\lambda) - \frac{1}{4} p_2(\lambda) = 0. \tag{18}$$

The coefficient of $w$ in $\det(\lambda I_n - \mathrm{Re}(wA))$ is $p_1(\lambda) a_1/2$. Hence $p_1(\lambda) a_1 = 0$. We claim that $p_1(\lambda) \ne 0$. Indeed, if $p_1(\lambda) = 0$, then (18) yields $p_2(\lambda) = 0$, which in turn leads to $\lambda \le \cos(\pi/(n-1))$ and hence contradicts $\lambda \ge \cos(\pi/n)$ as before. We conclude that $a_1 = 0$ and $A = J_n$ as asserted.

We end this section with a general question on the numerical ranges of companion matrices.

Problem 2.10. Which nonempty closed convex subset of the plane is the numerical range of some $n \times n$ companion matrix?

We doubt that there will be any easy-to-describe and clean-cut answer. However, some partial ones obtained from the results in these two sections are already very interesting. For example, we have the answer for the $2 \times 2$ case: a closed elliptic disc with foci $a$ and $b$ is the numerical range of some $2 \times 2$ companion matrix if and only if its minor axis has length $|1 + a\bar b|$. This is the same as saying that the matrix

$$\begin{pmatrix} a & c \\ 0 & b \end{pmatrix}$$

is unitarily equivalent to a companion matrix if and only if $|c| = |1 + a\bar b|$. The latter can be proved by the equalities of traces, determinants and Frobenius norms of two $2 \times 2$ unitarily equivalent matrices. On the other hand, a closed polygonal region (with at least three sides) is the numerical range of some $n \times n$ companion matrix if and only if its boundary is a regular $n$-gon which is inscribed in the unit circle. This is a consequence of Corollaries 1.3 and 1.2. Finally, a closed circular disc centered at the origin is the numerical range of some $n \times n$ companion matrix if and only if its radius equals $\cos(\pi/(n+1))$ (cf. Theorem 2.9).
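The Frobenius-norm comparison behind the $2 \times 2$ criterion can be sketched in a few lines of Python. The eigenvalues `a` and `b` below are arbitrary assumed values; the companion matrix with these eigenvalues has last row $(-ab, a+b)$, and the triangular matrix uses $c = |1 + a\bar b|$.

```python
def frobenius_sq(m):
    """Squared Frobenius norm: the sum of |entry|^2."""
    return sum(abs(t) ** 2 for row in m for t in row)


a, b = 0.3 + 0.5j, -0.8 + 0.1j           # assumed eigenvalues

# Companion matrix of (z - a)(z - b) = z^2 - (a + b) z + a b.
comp = [[0, 1], [-a * b, a + b]]

# Upper triangular matrix with the same eigenvalues and |c| = |1 + a conj(b)|.
c_abs = abs(1 + a * b.conjugate())
tri = [[a, c_abs], [0, b]]

# Traces and determinants agree by construction; compare Frobenius norms.
gap = frobenius_sq(comp) - frobenius_sq(tri)
```

The vanishing of `gap` reflects the identity $1 + |ab|^2 + |a + b|^2 - |a|^2 - |b|^2 = |1 + a\bar b|^2$.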

3. Similarity to contractions

A classical result of Rota on similarity to contractions says that if $A$ is an operator with spectrum contained in the open unit disc $\mathbb{D}$, then $A$ is similar to a strict contraction (one with norm strictly less than one) (cf. [4, Corollary 2 to Problem 153]). In this section, we use a slight generalization of his arguments to prove a more precise improvement for finite matrices.

For any $n \times n$ matrix $A$, let $\mu(A)$ be its multiplicity, that is, $\mu(A)$ is the minimum number $m$ of vectors $\{x_1, \ldots, x_m\}$ in $\mathbb{C}^n$ for which $\{A^j x_k : j \ge 0, 1 \le k \le m\}$ spans $\mathbb{C}^n$. It is well known that $\mu(A)$ equals the number of companion matrices in the rational form of $A$. $A$ is cyclic if its multiplicity is one. The defect index of an $n \times n$ contraction $A$ is $d_A = \mathrm{rank}(I_n - A^*A)$.

Theorem 3.1. Let $A$ be an $n \times n$ matrix with all its eigenvalues in $\mathbb{D}$. Then $A$ is similar to a contraction with defect index $k$ if and only if $\mu(A) \le k \le n$.

Since an n× n contraction has defect index n if and only if it is a strict one, the aforementioned result of Rota (or rather its finite-dimensional version) is a special case of the preceding theorem. The next lemma is another special case. Its proof is inspired by that of [11, Theorem 2].

Lemma 3.2. If A is an n× n companion matrix with eigenvalues in D and k is a

natural number less than or equal to n, then A is similar to a contraction with defect index k.

Proof. Since the eigenvalues of $A$ are in $\mathbb{D}$, the series $\sum_{m=0}^{\infty} \|A^m\|^2$ converges. Let $P$ be the $n \times n$ diagonal matrix $\mathrm{diag}(1, \ldots, 1, 0, \ldots, 0)$ with $k$ many 1's and let $X = \sum_{m=0}^{\infty} (A^m)^* P A^m$. This latter series also converges because $(A^m)^* P A^m \le (A^m)^* A^m \le \|A^m\|^2 I_n$. Since

$$X \ge \sum_{m=0}^{n-k} A^{*m} P A^m = \sum_{m=0}^{n-k} \mathrm{diag}(0, \ldots, 0, \underbrace{1}_{(m+1)\text{st}}, \ldots, \underbrace{1}_{(m+k)\text{th}}, 0, \ldots, 0) \ge I_n,$$

we infer that $X$ is invertible. If $B = X^{1/2} A X^{-1/2}$, then, letting $y = X^{-1/2} x$, we have

$$\|Bx\|^2 = \|X^{1/2} A X^{-1/2} x\|^2 = \langle A^* X A y, y\rangle = \langle (X - P) y, y\rangle = \langle Xy, y\rangle - \langle Py, y\rangle = \langle X^{1/2} x, X^{-1/2} x\rangle - \|Py\|^2 = \|x\|^2 - \|P X^{-1/2} x\|^2 \le \|x\|^2$$

for any $x$. This shows that $B$ is a contraction. Moreover, since

$$\ker(I_n - B^*B) = \{x \in \mathbb{C}^n : \|Bx\| = \|x\|\} = \{x \in \mathbb{C}^n : P X^{-1/2} x = 0\}$$

from above, we infer that $\dim\ker(I_n - B^*B) = \dim\ker P = n - k$ and hence $d_B = k$. This proves that $A$ is similar to the contraction $B$ with defect index $k$.
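The two properties of $X$ used in the proof, namely $A^*XA = X - P$ and $X \ge I_n$, can be checked on a small example. The Python sketch below assumes a real $3 \times 3$ companion matrix with eigenvalues $0.5$, $-0.3$, $0.2$ and $k = 2$, and truncates the series after 200 terms; these choices and the helper names are illustrative assumptions. Only the diagonal of $X$ is tested against 1, which is a necessary consequence of $X \ge I_n$ rather than a full verification.

```python
def companion(coeffs):
    """Companion matrix with last row (-a_n, ..., -a_1), as in (1)."""
    n = len(coeffs)
    rows = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        rows[i][i + 1] = 1.0
    rows[n - 1] = [-a for a in reversed(coeffs)]
    return rows


def matmul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(p):
    n = len(p)
    return [[p[j][i] for j in range(n)] for i in range(n)]


n, k = 3, 2
# (z - 0.5)(z + 0.3)(z - 0.2) = z^3 - 0.4 z^2 - 0.11 z + 0.03
A = companion([-0.4, -0.11, 0.03])
P = [[1.0 if i == j and i < k else 0.0 for j in range(n)] for i in range(n)]

# Truncated series X = sum_m (A^m)* P A^m (real entries, so * is the transpose).
X = [[0.0] * n for _ in range(n)]
power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
for _ in range(200):
    term = matmul(transpose(power), matmul(P, power))
    X = [[X[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    power = matmul(power, A)

# A* X A should equal X - P up to the (negligible) truncation error.
lhs = matmul(transpose(A), matmul(X, A))
gap = max(abs(lhs[i][j] - (X[i][j] - P[i][j]))
          for i in range(n) for j in range(n))
min_diag = min(X[i][i] for i in range(n))
```

The truncation error is of the order of $\|A^{200}\|^2$, which is far below rounding level for a spectral radius of $0.5$.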

Proof of Theorem 3.1. Assume that $A$ is similar to a contraction $B$ with defect index $k$. It is known that $\mu(B) \le k$ for any contraction $B$ with eigenvalues in $\mathbb{D}$ and defect index $k$ (cf. [2, Proposition 5.3]). Hence $\mu(A) = \mu(B) \le k$ as asserted.

Conversely, assume that $\mu(A) \le k \le n$. Since $A$ is similar to a direct sum $A_1 \oplus \cdots \oplus A_l$ ($l = \mu(A)$) of companion matrices with eigenvalues all in $\mathbb{D}$ and since Lemma 3.2 implies that each $A_j$ is similar to a contraction $B_j$ with

$$d_{B_j} = \begin{cases} k - l + 1 & \text{if } j = 1, \\ 1 & \text{otherwise}, \end{cases}$$

we obtain that $A$ is similar to the contraction $B = B_1 \oplus \cdots \oplus B_l$ with

$$d_B = \sum_{j=1}^{l} d_{B_j} = (k - l + 1) + \underbrace{1 + \cdots + 1}_{l-1} = k.$$

This completes the proof.

The next corollary appears in [9, Theorem 3.27].

Corollary 3.3. A finite matrix is similar to a contraction of class $S_n$ if and only if it has eigenvalues in $\mathbb{D}$ and is cyclic.

Recall that an $n \times n$ matrix is said to be of class $S_n$ if it is a contraction, has all its eigenvalues in $\mathbb{D}$ and has its defect index equal to one.

Proof of Corollary 3.3. For the necessity, since contractions of class $S_n$ have eigenvalues in $\mathbb{D}$ and have multiplicity one, the same is true for any matrix similar to an $S_n$-contraction. The sufficiency follows from Theorem 3.1.

References

[1] S. Barnett, Matrices: Methods and Applications, Clarendon Press, Oxford, 1990.

[2] R.G. Douglas, Canonical models, Topics in Operator Theory, Amer. Math. Soc., Providence, 1974, pp. 161–218.

[3] U. Haagerup, P. de la Harpe, The numerical radius of a nilpotent operator on a Hilbert space, Proc. Amer. Math. Soc. 115 (1992) 371–379.

[4] P.R. Halmos, A Hilbert Space Problem Book, second ed., Springer, New York, 1982.

[5] R.A. Horn, C.R. Johnson, Topics in Matrix Analysis, Cambridge University Press, Cambridge, 1991.

[6] D.S. Keeler, L. Rodman, I.M. Spitkovsky, The numerical range of 3× 3 matrices, Linear Algebra Appl. 252 (1997) 115–139.

[7] R. Kippenhahn, Über den Wertevorrat einer Matrix, Math. Nachr. 6 (1951) 193–228.

[8] F. Kirwan, Complex Algebraic Curves, Cambridge University Press, Cambridge, 1992.

[9] H. Radjavi, P. Rosenthal, Invariant Subspaces, Springer, New York, 1973.

[10] P. van den Driessche, H.K. Wimmer, Explicit polar decomposition of companion matrices, Electron. J. Linear Algebra 1 (1996) 64–69.

[11] N.J. Young, Analytic programmes in matrix algebras, Proc. Lond. Math. Soc. 36 (3) (1978) 226–242.