
Department of Applied Mathematics, College of Science, National Chiao Tung University

Doctoral Dissertation

Gau-Wu numbers of certain matrices

Student: Hsin-Yi Lee

Advisor: Professor Pei Yuan Wu

June 2014

Gau-Wu numbers of certain matrices

Student: Hsin-Yi Lee

Advisor: Dr. Pei Yuan Wu

A Thesis
Submitted to the Department of Applied Mathematics
College of Science
National Chiao Tung University
in Partial Fulfillment of the Requirements
for the Degree of
Doctor of Philosophy
in
Applied Mathematics

June 2014

Hsinchu, Taiwan, Republic of China

Gau-Wu numbers of certain matrices

Student: Hsin-Yi Lee    Advisor: Professor Pei Yuan Wu

Department of Applied Mathematics, National Chiao Tung University

Chinese Abstract

For an n-by-n matrix A, let k(A) denote the maximal number of orthonormal vectors x1, ..., xk such that the values ⟨Axj, xj⟩ all lie on the boundary of the numerical range of A. This number k(A) is called the Gau-Wu number of A. If A is a normal or a quadratic matrix, then k(A) can be computed explicitly. For a matrix A of the form B ⊕ C, we show that k(A) = 2 if and only if the numerical range of one summand, say B, is contained in the interior of the numerical range of the other summand C and k(C) = 2. For an irreducible matrix A, we determine exactly when k(A) equals n. Together with the known shapes of the numerical ranges of 4-by-4 matrices, these results determine the Gau-Wu number of any reducible 4-by-4 matrix.

Moreover, let A be an n-by-n (n ≥ 2) nonnegative matrix of the form
\[
\begin{bmatrix}
0 & A_1 & & \\
 & 0 & \ddots & \\
 & & \ddots & A_{m-1}\\
 & & & 0
\end{bmatrix},
\]
where m ≥ 3 and the diagonal zeros are zero square matrices. If the real part of A is irreducible, then its Gau-Wu number k(A) is at most m − 1, and we obtain necessary and sufficient conditions for this upper bound to be attained. We also study another class of nonnegative matrices, the doubly stochastic ones. We prove that the Gau-Wu number of any 3-by-3 doubly stochastic matrix equals 3, and we determine the numerical range and the Gau-Wu number of every 4-by-4 doubly stochastic matrix. Finally, for a general n-by-n (n ≥ 5) doubly stochastic matrix, a lower bound for its Gau-Wu number is obtained from the possible shapes of its numerical range.

Gau-Wu numbers of certain matrices

Student: Hsin-Yi Lee    Advisor: Dr. Pei Yuan Wu

Department of Applied Mathematics

National Chiao Tung University

ABSTRACT

For any n-by-n matrix A, let k(A) stand for the maximal number of orthonormal vectors x1, ..., xk such that the scalar products ⟨Axj, xj⟩ lie in the boundary of the numerical range W(A). This number k(A) is called the Gau-Wu number of the matrix A. If A is a normal or a quadratic matrix, then the exact value of k(A) can be computed. For a matrix A of the form B ⊕ C, we show that k(A) = 2 if and only if the numerical range of one summand, say, B, is contained in the interior of the numerical range of the other summand C and k(C) = 2. For an irreducible matrix A, we can determine exactly when the value of k(A) equals the size of A. These are then applied to determine k(A) for a reducible matrix A of size 4 in terms of the shape of W(A).

Moreover, if A is an n-by-n (n ≥ 2) nonnegative matrix of the form
\[
\begin{bmatrix}
0 & A_1 & & \\
 & 0 & \ddots & \\
 & & \ddots & A_{m-1}\\
 & & & 0
\end{bmatrix},
\]
where m ≥ 3 and the diagonal zeros are zero square matrices, with irreducible real part, then k(A) has an upper bound m − 1. In addition, we also obtain necessary and sufficient conditions for k(A) = m − 1 for such a matrix A. The other class of nonnegative matrices we study is the doubly stochastic ones. We prove that the value of k(A) is equal to 3 for any 3-by-3 doubly stochastic matrix A. Next, for any 4-by-4 doubly stochastic matrix, we also determine its numerical range. This result can be applied to find the value of k(A) for any doubly stochastic matrix A of size 4 in terms of the shape of W(A). Furthermore, a lower bound of k(A) is also found for a general n-by-n (n ≥ 5) doubly stochastic matrix A via the possible shapes of W(A).

Acknowledgements

This thesis could not have been completed without, first and foremost, my advisor, Professor 吳培元 (Pei Yuan Wu), who over these five and more years has constantly helped me in both my studies and my research. Beyond his academic guidance, I have also learned from him how to conduct myself and how to approach my work; I am deeply grateful for everything he has taught me. I also thank the members of my oral examination committee, Professors 黃毅青, 簡茂丁, 高華隆, and 王國仲, whose comments have made this thesis more complete.

Throughout my doctoral studies I have benefited from the guidance of many professors in the Department of Applied Mathematics at National Chiao Tung University and from the help of its assistants. In particular, Professor 王國仲 offered valuable suggestions and comments on the Gau-Wu number studied here, which let me see my own blind spots and solve the problem for one class of matrices. I also thank Chair Professor 林文偉 and Professors 李明佳, 王夏聲, 林琦焜, 白啓光, 葉立明, 許義容, and 張書銘, among others, for their help with coursework and in other respects. I am grateful as well to Professor 高華隆 of National Central University, whose assistance and encouragement while I worked on problems about the numerical range gave me an entirely new view of this field.

Without the help of my senior fellow students it would have been hard to get through this long doctoral journey. I thank Professor 張其棟 of Feng Chia University for teaching me how to use plotting software, and Dr. 蔡明誠 of National Sun Yat-sen University for the countless evenings spent with me on video calls, checking my proofs and offering constant encouragement, which brought home to me the lines: "I ask the pine forest: how many winters have you passed? Are the mountains and rivers as they were of old? The wind and clouds are the same as in ancient times." I also thank the senior students, classmates, and junior students of the department, including Professor 吳恭儉 of National Kaohsiung Normal University, Dr. 呂明杰, Dr. 黃韋強, Dr. 黃皜文, Dr. 李忠逵, 陳德軒, 龔柏任, 黃俊銘, and 陳哲楷, as well as the many friends I have come to know online (too numerous to list); their company has made my research life rich and colorful.

Finally, I especially thank my family: my parents, my beloved wife 文鳳, my sisters, and my daughter 采潔. Their unfailing support, selfless devotion, and care allowed me to concentrate on my research. While my daughter was still very young, my parents and my wife looked after her so attentively that I never had to worry about household matters or the children, and could thus complete my doctorate smoothly. I dedicate this thesis to them.

Contents

Chinese Abstract ……… i
English Abstract ……… ii
Acknowledgement ……… iii
Contents ……… iv
1 Introduction ……… 1
2 Gau-Wu numbers of direct sums of matrices ……… 5
  2.1 Introduction ……… 5
  2.2 Direct sum ……… 6
  2.3 Applications and discussions ……… 21
3 Gau-Wu numbers of nonnegative matrices ……… 30
  3.1 Introduction ……… 30
  3.2 Nonnegative block shift matrix ……… 31
  3.3 Doubly stochastic matrix ……… 37
4 References ……… 43

1  Introduction

Let A be an n-by-n complex matrix. Its numerical range W(A) is, by definition, the set {⟨Ax, x⟩ : x ∈ Cn, ‖x‖ = 1}, where ⟨·, ·⟩ and ‖·‖ denote the standard inner product and its associated norm in Cn, respectively. One of the most important properties of the numerical range is its convexity. In fact, the study of the numerical range originates from the discovery of this property by Toeplitz [17] and Hausdorff [7]: the former proved that the boundary of the numerical range is always a convex curve, but left open the possibility that it may have interior holes, while the latter, using a different approach, showed that this cannot happen. An interesting account of the history of this theorem can be found in [6].

For a matrix A, let A* denote its adjoint, Re A its real part (A + A*)/2, and Im A its imaginary part (A − A*)/(2i). The set of eigenvalues of A is denoted by σ(A). For any subset △ of C, △^ denotes its convex hull, that is, △^ is the smallest convex set containing △. We list below several important properties of the numerical range.

(1) W(U*AU) = W(A) for any unitary matrix U.
(2) W(A) is a compact subset of C.
(3) W(aA + bI) = aW(A) + b for any scalars a and b.
(4) W(Re A) = Re W(A) and W(Im A) = Im W(A).
(5) If B is the upper left-hand corner of A, that is, A = \(\begin{bmatrix} B & * \\ * & * \end{bmatrix}\) as a block matrix, then W(B) ⊆ W(A).
(6) σ(A) ⊆ W(A).
(7) If A is normal, then W(A) is equal to σ(A)^.
(8) W(⊕n An) = (∪n W(An))^.

For other properties of the numerical range, the reader may consult [8, Chapter 1].
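As a concrete illustration of how boundary points of W(A) arise (this numerical sketch is not part of the thesis; the function name and the use of NumPy are incidental choices), recall the fact reviewed at the beginning of Section 2.2 below: for each direction θ, a unit eigenvector x belonging to the largest eigenvalue of Re(e−iθA) satisfies ⟨Ax, x⟩ ∈ ∂W(A).

    import numpy as np

    def sample_boundary_points(A, num_angles=360):
        # For each direction theta, the largest eigenvalue of Re(e^{-i*theta} A) is
        # attained at a unit eigenvector x, and <Ax, x> then lies on the boundary
        # of the numerical range W(A).
        A = np.asarray(A, dtype=complex)
        points = []
        for theta in np.linspace(0.0, 2.0 * np.pi, num_angles, endpoint=False):
            B = np.exp(-1j * theta) * A
            H = (B + B.conj().T) / 2                  # Re(e^{-i*theta} A)
            eigvals, eigvecs = np.linalg.eigh(H)      # ascending eigenvalues
            x = eigvecs[:, -1]                        # unit eigenvector of the largest one
            points.append(np.vdot(x, A @ x))          # <Ax, x> = x* A x
        return np.array(points)

Plotting the returned points for a small test matrix traces out ∂W(A) and makes properties (1)–(8) easy to check by experiment.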

In Chapter 2, we consider the maximum number k = k(A) for which there exist orthonormal vectors x1, ..., xk ∈ Cn with ⟨Axj, xj⟩ in the boundary ∂W(A) of W(A) for all j. Note that k(A) is also the maximum size of a compression of A with all its diagonal entries in ∂W(A). Recall that a k-by-k matrix B is a compression of A if B = V*AV for some n-by-k matrix V with V*V = Ik. Here Ik denotes the k-by-k identity matrix. In particular, if n equals k, then A and B are said to be unitarily similar, which we denote by A ≅ B. The number k(A) was introduced in [5] and [19] and is called the Gau-Wu number by [2]. It relates properties of the numerical range to the compressions of A. In particular, it was shown in [5, Lemma 4.1 and Theorem 4.4] that 2 ≤ k(A) ≤ n for any n-by-n (n ≥ 2) matrix A, and k(A) = ⌈n/2⌉ for any Sn-matrix A (n ≥ 3). Recall that an n-by-n matrix A is of class Sn if it is a contraction, that is, ‖A‖ ≡ max{‖Ax‖ : ‖x‖ = 1} ≤ 1, its eigenvalues are all in the open unit disc D ≡ {z ∈ C : |z| < 1}, and the rank of In − A*A equals one. In [19, Theorem 3.1], it was proven that, for an n-by-n (n ≥ 2) weighted shift matrix A with weights w1, ..., wn, k(A) = n if and only if either |w1| = ··· = |wn| or n is even and |w1| = |w3| = ··· = |wn−1| and |w2| = |w4| = ··· = |wn|. Recall that an n-by-n (n ≥ 2) matrix of the form
\[
\begin{bmatrix}
0 & w_1 & & \\
 & 0 & \ddots & \\
 & & \ddots & w_{n-1}\\
w_n & & & 0
\end{bmatrix}
\]
is called a weighted shift matrix with weights w1, ..., wn. Moreover, in [2] k(A) is computed for two classes of n-by-n matrices as follows. An n-by-n matrix A is almost normal if it has n − 1 orthogonal eigenvectors. Note that every almost normal matrix is unitarily similar to An ⊕ Aa, where An is normal while Aa is almost normal and unitarily irreducible (cf. [14]). Recall that a matrix A is unitarily reducible if and only if A is unitarily similar to A1 ⊕ A2 for some lower-dimensional matrices A1 and A2; otherwise, A is unitarily irreducible. In [2, Theorem 3], it was proven that, for any almost normal matrix A, k(A) = l1 + l2, where l1 is the number of eigenvalues of An located on ∂W(A), counting their multiplicities, and
\[
l_2=\begin{cases}
0 & \text{if } W(A_a) \text{ lies in the interior of } W(A_n),\\
2 & \text{if there exist distinct parallel supporting lines of } W(A) \text{ passing through points of } W(A_a), \text{ or}\\
1 & \text{otherwise.}
\end{cases}
\]
Furthermore, [2, Theorem 5] shows that if A is an n-by-n (n ≥ 3) tridiagonal Toeplitz matrix of the form
\[
\begin{bmatrix}
a & c & & & \\
b & a & c & & \\
 & \ddots & \ddots & \ddots & \\
 & & b & a & c\\
 & & & b & a
\end{bmatrix},
\qquad\text{then}\qquad
k(A)=\begin{cases} n & \text{if } |b| = |c|,\\ \lceil n/2\rceil & \text{otherwise.}\end{cases}
\]
We will show that if A is a normal or a quadratic matrix, then the exact value of k(A) can be computed. Recall that a quadratic matrix A is one which satisfies A² + z1A + z2I = 0 for some scalars z1 and z2. For a matrix A of the form B ⊕ C, we show that k(A) = 2 if and only if the numerical range of one summand, say, B, is contained in the interior of the numerical range of the other summand C and k(C) = 2. For an irreducible matrix A, we can determine exactly when the value of k(A) equals the size of A. These are then applied to determine k(A) for a reducible matrix A of size 4 in terms of the shape of W(A). These results also appeared in [10].
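The weighted-shift criterion of [19, Theorem 3.1] recalled above is easy to test mechanically; the following sketch (ours; the function name is arbitrary) returns True exactly when the stated condition on the moduli |w1|, ..., |wn| holds.

    def weighted_shift_attains_n(weights, tol=1e-12):
        # Criterion of [19, Theorem 3.1]: k(A) = n for the weighted shift with
        # weights w1, ..., wn iff all |wj| agree, or n is even and the |wj| with
        # odd indices agree and the |wj| with even indices agree.
        w = [abs(z) for z in weights]
        n = len(w)
        if max(w) - min(w) <= tol:
            return True
        if n % 2 == 0:
            odd, even = w[0::2], w[1::2]
            return max(odd) - min(odd) <= tol and max(even) - min(even) <= tol
        return False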

In Chapter 3, we continue to study k(A) for two classes of n-by-n nonnegative matrices A. Recall that an n-by-n matrix A = [aij] (1 ≤ i, j ≤ n) is a nonnegative matrix, denoted by A ≥ 0, if aij ≥ 0 for all i and j. Recall also that a square matrix P is a permutation matrix if exactly one entry in each of its rows and columns is 1 and all other entries are 0. Note that any permutation matrix P is unitary with P* = Pᵀ = P⁻¹. Two square matrices A and B of the same size are permutationally similar if there is a permutation matrix P such that PᵀAP = B, which is denoted by A ≅p B. A matrix A is permutationally reducible if it is permutationally similar to a matrix of the form
\[
\begin{bmatrix} B & C\\ 0 & D\end{bmatrix},
\]
where B and D are square matrices; otherwise, A is permutationally irreducible. This should not be confused with the notion of unitarily reducible (resp., irreducible) matrix. For nonnegative matrices, reducibility (resp., irreducibility) in general refers to the permutational one. Note that the reducibility (or irreducibility, for that matter) of nonnegative matrices is preserved under permutational similarity, and the irreducibility of a nonnegative matrix A passes to that of Re A. The converse of the latter is false, as witnessed by
\[
A=\begin{bmatrix}0 & 1\\ 0 & 0\end{bmatrix}.
\]
If A is an n-by-n (n ≥ 2) nonnegative matrix of the form
\[
\begin{bmatrix}
0 & A_1 & & \\
 & 0 & \ddots & \\
 & & \ddots & A_{m-1}\\
 & & & 0
\end{bmatrix},
\]
where m ≥ 3 and the diagonal zeros are zero square matrices, with irreducible real part, then k(A) has an upper bound m − 1. In addition, we also obtain necessary and sufficient conditions for k(A) = m − 1 for such a matrix A. The other class of nonnegative matrices we study is the doubly stochastic ones. Recall that an n-by-n nonnegative matrix A is doubly stochastic if its row sums and column sums are all equal to one. It is proven that the value of k(A) can be determined for any doubly stochastic matrix A of size 3 or 4 in terms of the shape of W(A). Note that the shapes of W(A) can be determined completely by the tests given in [1, Theorems 1 and 3]. Moreover, a lower bound of k(A), in general, is also found for an n-by-n (n ≥ 5) doubly stochastic matrix via possible shapes of W(A).

2  Gau-Wu numbers of direct sums of matrices

2.1  Introduction

In Section 2.2 below, we first determine the value of k(A) for a normal matrix A (Proposition 2.2.1). Then we consider the direct sum A = B ⊕ C, where the numerical ranges W(B) and W(C) are assumed to be disjoint. In this case, we show that the value of k(A) is equal to the sum of k1(B) and k1(C) (Theorem 2.2.2), where k1(B) and k1(C) are defined as follows. We define k1(B) to be the maximum number k for which there are orthonormal vectors x1, . . . , xk in Cn such that ⟨Bxi, xi⟩ is in ∂W(A) ∩ ∂W(B) for all i = 1, . . . , k, and similarly for k1(C). Based on the proof of Theorem 2.2.2, we obtain the same formula for k(A) under a slightly weaker condition on B and C (Theorem 2.2.6). In Section 2.3, we give some applications of Theorem 2.2.6. The first one (Proposition 2.3.1) shows that the equality k(A) = k1(B) + k1(C) holds for a matrix A of the form B ⊕ C with normal C. In particular, we are able to determine the value of k(A) for any 4-by-4 reducible matrix A (Corollary 2.3.4 and Propositions 2.3.7–2.3.9). Moreover, the number k(A ⊕ (A + aIn)) can be determined for any n-by-n matrix A and nonzero complex number a (Proposition 2.3.10). At the end of Section 2.3, we propose several open questions on k(B ⊕ C) and give a partial answer for one of them (Proposition 2.3.11). That is, the equality k(⊕_{j=1}^{m} A) = m · k(A) holds if the dimension of Hξ(A) equals one for each ξ ∈ ∂W(A), where the subspace Hξ(A) is defined in the first paragraph of Section 2.2. By using this, we can determine the value of k(A) for a quadratic matrix A (Corollary 2.3.12).

Note that all of the results in Sections 2.2 and 2.3 have also appeared in [10].

(12)

positive definite, denoted by A > 0, if A is Hermitian and hAx, xi > 0 for all x 6= 0. In

is the n-by-n identity matrix. The n-by-n diagonal matrix with diagonals ξ1, ..., ξn is

denoted by diag (ξ1, ..., ξn). The cardinal number of a set S is #(S). The notation δij

is the Kronecker delta, that is, δij has the value 1 if i = j, and the value 0 if otherwise.

The span of a nonempty subset S of a vector space V , denoted by span (S), is the subspace consisting of all linear combinations of the vectors in S.

2.2  Direct sum

We start by reviewing a few basic facts concerning the boundary points of a numerical range. For an n-by-n matrix A, a point ξ in ∂W(A) and a supporting line L of W(A) which passes through ξ, there is a θ in [0, 2π) such that the ray from the origin which forms angle θ with the positive x-axis is perpendicular to L. In this case, Re(e−iθξ) is the maximum eigenvalue of Re(e−iθA) with the corresponding eigenspace Eξ,L(A) ≡ ker Re(e−iθ(A − ξIn)). Let Kξ(A) denote the set {x ∈ Cn : ⟨Ax, x⟩ = ξ‖x‖²} and Hξ(A) the subspace spanned by Kξ(A). If the matrix A is clear from the context, we will abbreviate these to Eξ,L, Kξ and Hξ, respectively. For other related properties, we refer the reader to [4, Theorem 1] and [19, Proposition 2.2]. The next proposition on the value of k(A) for a normal matrix A is an easy consequence of [19, Lemma 2.9]. It can be regarded as a motivation for our study of this topic.
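In computations, the eigenspace Eξ,L(A) just described can be read off from Re(e−iθA); the short sketch below (ours, assuming NumPy) returns its dimension for a given direction θ.

    import numpy as np

    def eigenspace_dimension(A, theta, tol=1e-9):
        # Dimension of ker Re(e^{-i*theta}(A - xi*I)), i.e. of the eigenspace of
        # Re(e^{-i*theta} A) belonging to its largest eigenvalue; xi is the boundary
        # point of W(A) cut out by the supporting line perpendicular to theta.
        B = np.exp(-1j * theta) * np.asarray(A, dtype=complex)
        H = (B + B.conj().T) / 2
        w = np.linalg.eigvalsh(H)          # ascending eigenvalues
        return int(np.sum(w >= w[-1] - tol))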

Proposition 2.2.1. If A is an n-by-n normal matrix with p eigenvalues (counting multiplicity) in ∂W(A), then k(A) = p.

Proof. We may assume, after a unitary similarity, that A is a matrix of the form B ⊕ C, where B = diag(λ1, . . . , λp) and C = diag(λp+1, . . . , λn) with λ1, . . . , λp ∈ ∂W(A) and λp+1, . . . , λn ∈ int W(B). It follows from [19, Lemma 2.9] that k(A) = k(B ⊕ C) = k(B) = p. ∎

One of our main results of this section is the following theorem for k(A) when A is a matrix of the form B ⊕ C with disjoint W(B) and W(C). Recall that the value of k1(B) is the maximum number k for which there are orthonormal vectors x1, . . . , xk in Cn such that ⟨Bxi, xi⟩ is in ∂W(A) ∩ ∂W(B) for all i = 1, . . . , k. If the subset ∂W(A) ∩ ∂W(B) is empty, then we define k1(B) = 0. The following theorem provides a formula for determining the value of k(A) by k1(B) and k1(C).

Theorem 2.2.2. Let A = B ⊕ C, where B and C are n-by-n and m-by-m

matrices, respectively. If the numerical ranges W (B) and W (C) are disjoint, then

k(A) = k1(B) + k1(C) ≤ k(B) + k(C). In this case, k(A) = k(B) + k(C) if and only if k1(B) = k(B) and k1(C) = k(C). In particular, k(A) = m + n if and only if

k1(B) = k(B) = n and k1(C) = k(C) = m.

This will be proven after the following lemma which is the case when C equals a 1-by-1 matrix [c].
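As a simple illustration of Theorem 2.2.2 (the matrices here are chosen only for this example), take B = diag(0, 1) and C = diag(4, 5, 4 + i). Then W(B) = [0, 1] and W(C) is the triangle with vertices 4, 5, and 4 + i, so W(B) and W(C) are disjoint. All five eigenvalues of A = B ⊕ C lie on ∂W(A): 0, 1, 4, and 5 lie on the bottom edge of W(A), and 4 + i is a vertex. Hence k1(B) = k(B) = 2, k1(C) = k(C) = 3, and Theorem 2.2.2 gives
\[
k(A) = k_1(B) + k_1(C) = 2 + 3 = 5 = k(B) + k(C),
\]
in agreement with Proposition 2.2.1, since A is normal with all of its eigenvalues on ∂W(A).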

Recall that z is an extreme point of the convex subset ∆ of C if z belongs to ∆ and cannot be expressed as a convex combination of two other (distinct) points of ∆; otherwise, z is a nonextreme point. Recall also that a point z is a corner of a convex set ∆ of the complex plane if z is in the closure of ∆ and ∆ has two supporting lines passing through z. If A is a finite matrix, ξ = hAx, xi and kxk = 1, then x is called a unit vector corresponding to the point ξ in W (A).

Lemma 2.2.3. If A = B ⊕ [c] is an n-by-n matrix, where B is of size n − 1 and c is a scalar, then k(A) = k1(B) + k1([c]).

Proof. By Proposition 2.2.1, we may assume that the interior of the numerical range W(B) is nonempty. If c lies in the interior of W(B), then k(A) = k(B) by [19, Lemma 2.9]. Obviously, k(B) = k1(B) and k1([c]) = 0 in this case. Hence it remains

to consider the case when c is outside the interior of W(B). That is, we will prove that k(A) = k1(B) + 1 for c ∉ int W(B). By the definition of k(A), there are points ξj = ⟨Azj, zj⟩ in ∂W(A), j = 1, 2, . . . , k(A), with ⟨zi, zj⟩ = δij for i, j = 1, ..., k(A). Clearly, the inequality k(A) ≥ k1(B) + 1 holds. Assume that k(A) ≥ k1(B) + 2. Let zj = xj ⊕ yj for each j. We claim that every xj is a nonzero vector. Indeed, if xj0 = 0 for some j0, then yj0 ≠ 0 and ⟨zj, zj0⟩ = ⟨yj, yj0⟩ = 0 for all j ≠ j0. This implies that yj = 0 for all j ≠ j0 and thus k1(B) is at least k1(B) + 1, which is absurd. Hence the claim has been proven. From ξj = ⟨Azj, zj⟩ = ‖xj‖²bj + ‖yj‖²c ∈ ∂W(A), where bj = ⟨B(xj/‖xj‖), xj/‖xj‖⟩, it follows that ξj is in the (possibly degenerate) line segment [c, bj], and bj is in the boundary of W(B) for each j. We note that there are at least two nonzero yj's; this is because if otherwise, then we obtain the inequality k1(B) ≥ k1(B) + 1, which is a contradiction. Hence we may assume that y1, ..., yh ≠ 0, where h ≥ 2, and that this h is the maximal such number.

If c is not in W(B), then there are exactly two points p and q in the boundary of W(B) such that the two line segments [c, p] and [c, q] are in the boundary of W(A) and the relative interiors of these two line segments are disjoint from the boundary of W(B), by the fact that W(A) is the convex hull of the union of W(B) and the singleton {c}. Hence there are three cases to consider: the intersection of the boundary of W(B) and the supporting line at p (resp., q) containing [c, p] (resp., [c, q]) is (1) {p} (resp., {q}), (2) a line segment [p, p′] (resp., {q}) or {p} (resp., a line segment [q, q′]), or (3) a line segment [p, p′] (resp., a line segment [q, q′]) (cf. Figure 2.2.4). We need only prove case (2) since the other cases can be done similarly.

[Figure 2.2.4: the three configurations (1), (2), and (3) of the points c, p, q, p′, q′ relative to W(B).]

Define three (disjoint) subsets consisting of the corresponding unit vectors, and their cardinal numbers, respectively, in the following:

R ≡ {zj : ξj ∈ [c, p′)} with r ≡ #(R),
S ≡ {zj : ξj ∈ (c, q)} with s ≡ #(S), and
T ≡ {zj : ξj ∈ ∂W(A) \ ([c, p′) ∪ (c, q))} with t ≡ #(T).

So, k(A) = r + s + t. Obviously, every zj ∈ T is of the form xj ⊕ 0. Moreover, we partition R into two disjoint subsets R1 ≡ {zj : yj ≠ 0} and R2 ≡ {zj : yj = 0}. We call their cardinal numbers r1 and r2, respectively. Without loss of generality, we may assume that R1 = {z1, ..., zr1}, R2 = {zr1+1, ..., zr1+r2}, S = {zr+1, ..., zr+s}, and T = {zr+s+1, ..., zr+s+t}, where r1 + r2 = r. This shows that r1 + s = h ≥ 2.

First assume that s = 0. Then r1 ≥ 2. For the clarity of the proof, the following method is called (∗). Since every yj, j = 1, . . . , r1, is nonzero, we define the vectors z′j = (xj/yj) ⊕ 1 for these j's so that the vectors in
\[
M \equiv \left\{ (z'_1 - z'_j)/\|z'_1 - z'_j\| \right\}_{j=2}^{r_1} = \left\{ \bigl(((x_1/y_1) - (x_j/y_j)) \oplus 0\bigr)/\|z'_1 - z'_j\| \right\}_{j=2}^{r_1}
\]
are linearly independent and are perpendicular to the vectors in T ∪ R2. This together with [4, Theorem 1] shows that span(M) ⊆ ∪_{η∈[c,p′]} Kη(A) and thus every unit vector in span(M) is a unit vector corresponding to some η ∈ ∂W(B). Choosing an orthonormal basis {vj ⊕ 0}_{j=2}^{r1} for the subspace span(M), we deduce from the orthonormality of the vectors in T ∪ R2 ∪ {vj ⊕ 0}_{j=2}^{r1} that
\[
k_1(B) \ge t + r_2 + (r_1 - 1) = r + s + t - 1 = k(A) - 1 \ge k_1(B) + 1,
\]
which is impossible. Hence we must have s ≥ 1.

If s = 1, then r1 ≥ 1. A similar argument as above yields that
\[
k_1(B) \ge \begin{cases} t + r_2 + 1 & \text{if } r_1 = 1, \text{ and}\\ t + r_2 + (r_1 - 1) + 1 & \text{if } r_1 \ge 2,\end{cases}
\]
by considering the orthonormal subsets T ∪ R2 ∪ {(xr+1/‖xr+1‖) ⊕ 0} and T ∪ R2 ∪ {vj ⊕ 0}_{j=2}^{r1} ∪ {(xr+1/‖xr+1‖) ⊕ 0}, where {vj ⊕ 0}_{j=2}^{r1} is an orthonormal subset of span(R1), obtained by applying (∗) to R1. The above inequalities imply that
\[
k_1(B) \ge \begin{cases} r + s + t - 1 \ge k(A) - 1 \ge k_1(B) + 1 & \text{if } r_1 = 1, \text{ and}\\ r + s + t - 1 \ge k(A) - 1 \ge k_1(B) + 1 & \text{if } r_1 \ge 2.\end{cases}
\]

This is a contradiction. Hence s ≥ 2.

If r1 = 0, then applying (∗) on S, we reach a contradiction since
\[
k_1(B) \ge t + r_2 + (s - 1) = r + s + t - 1 = k(A) - 1 \ge k_1(B) + 1.
\]
If r1 = 1, then we obviously have the linear independence of the subset
\[
N \equiv \left\{ (z'_1 - z'_j)/\|z'_1 - z'_j\| \right\}_{j=r+2}^{r+s} = \left\{ \bigl(((x_1/y_1) - (x_j/y_j)) \oplus 0\bigr)/\|z'_1 - z'_j\| \right\}_{j=r+2}^{r+s}
\]
by applying (∗) on S again. Let {vj ⊕ 0}_{j=r+2}^{r+s} be an orthonormal basis for the subspace span(N). Hence
\[
k_1(B) \ge t + r_2 + (s - 1) + 1 = r + s + t - 1 = k(A) - 1 \ge k_1(B) + 1
\]
by the orthonormality of the vectors in T ∪ R2 ∪ {vj ⊕ 0}_{j=r+2}^{r+s} ∪ {(x1/‖x1‖) ⊕ 0}. This is again a contradiction. If r1 ≥ 2, then applying (∗) on S and R1, we have the linear independence of the subsets
\[
P \equiv \left\{ (z'_1 - z'_j)/\|z'_1 - z'_j\| \right\}_{j=r+2}^{r+s} = \left\{ \bigl(((x_1/y_1) - (x_j/y_j)) \oplus 0\bigr)/\|z'_1 - z'_j\| \right\}_{j=r+2}^{r+s}
\]
and
\[
Q \equiv \left\{ (z'_1 - z'_j)/\|z'_1 - z'_j\| \right\}_{j=2}^{r_1} = \left\{ \bigl(((x_1/y_1) - (x_j/y_j)) \oplus 0\bigr)/\|z'_1 - z'_j\| \right\}_{j=2}^{r_1},
\]
respectively. Let {vj ⊕ 0}_{j=r+2}^{r+s} be an orthonormal basis for span(P). Then span(P) ⊕ span(x ⊕ y) = span(S) for some unit vector x ⊕ y orthogonal to span(P). Clearly, x is a nonzero vector; this is because if otherwise, then 0 ⊕ y (∈ span(S)) is orthogonal to z1 = x1 ⊕ y1 (∈ R1), which contradicts the fact that y and y1 are nonzero scalars. Let {vj ⊕ 0}_{j=2}^{r1} be an orthonormal basis for the subspace span(Q). Then we conclude that the subset T ∪ R2 ∪ {vj ⊕ 0}_{j=2}^{r1} ∪ {vj ⊕ 0}_{j=r+2}^{r+s} ∪ {(x/‖x‖) ⊕ 0} is orthonormal, so that
\[
k_1(B) \ge t + r_2 + (r_1 - 1) + (s - 1) + 1 = r + s + t - 1 = k(A) - 1 \ge k_1(B) + 1,
\]
which is a contradiction. This completes the proof of case (2).

In case (1), we define three subsets consisting of the corresponding unit vectors, and their cardinal numbers, respectively, as follows:

R ≡ {zj : ξj ∈ [c, p)} with r ≡ # (R) ,

S ≡ {zj : ξj ∈ (c, q)} with s ≡ # (S) , and

T ≡ {zj : ξj ∈ ∂W (A)\([c, p) ∪ (c, q))} with t ≡ # (T ) .

As for case (3), we have

R ≡ {zj : ξj ∈ [c, p′)} with r ≡ # (R) ,

S ≡ {zj : ξj ∈ (c, q′)} with s ≡ # (S) , and

T ≡ {zj : ξj ∈ ∂W (A)\([c, p′) ∪ (c, q′))} with t ≡ # (T ) .

As before, we partition R (resp., S) into two disjoint subsets R1 ≡ {zj : yj 6= 0}

and R2 ≡ {zj : yj = 0} (resp., S1 ≡ {zj : yj 6= 0} and S2 ≡ {zj : yj = 0}). Based

on the arguments for case (2), we get a series of contradictions for each individual case. In a similar fashion, we remark that if A = B ⊕ cIm, where c /∈ W (B), then

k(A) = k1(B) + k1(cIm) = k1(B) + m. This remark will be used in the remaining

part of the proof.

To complete the proof, we let c be in the boundary of W(B). Assume that ∂W(B) contains no line segment. We infer that c = bj = ξj for j = 1, ..., h, since these ξj's lie in the segments [c, bj], which would otherwise yield a line segment in the boundary of W(B). Define a new vector z′j = (xj/yj) ⊕ 1 for each j = 1, ..., h. Then the subset
\[
S \equiv \left\{ (z'_1 - z'_j)/\|z'_1 - z'_j\| \right\}_{j=2}^{h} = \left\{ \bigl(((x_1/y_1) - (x_j/y_j)) \oplus 0\bigr)/\|z'_1 - z'_j\| \right\}_{j=2}^{h}
\]
is linearly independent. Since c is an extreme point of W(A), we have Hc(A) = Kc(A) by [4, Theorem 1], and span(S) is a subspace of Hc(A). Let {vj ⊕ 0}_{j=2}^{h} be an orthonormal basis for span(S). Then c = ⟨A(vj ⊕ 0), vj ⊕ 0⟩ = ⟨Bvj, vj⟩ is in ∂W(B) for j = 2, . . . , h. Hence
\[
k(B) \ge (h - 1) + (k(A) - h) = k(A) - 1 \ge k(B) + 1.
\]

This is a contradiction. So, we may assume that ∂W (B) contains a line segment l such that c belongs to l. If c is not an extreme point of l, then we infer that c = bj = ξj or ξj ∈ (c, bj) for j = 1, ..., h since xj and yj are nonzero vectors for these

j’s. Hence zj ∈ Hc(A) for j = 1, ..., h by [4, Theorem 1]. Similar arguments show

that Hc(A) has an orthonormal subset {wj⊕ 0}hj=2. Since Hc(A) = ∪η∈lKη(A) by [4,

Theorem 1], this implies that wj ⊕ 0 ∈ Kηj(A), where ηj ∈ l, for j = 2, ..., h. From

ηj = hA (wj⊕ 0) , wj⊕ 0i = hBwj, wji ∈ l ⊆ ∂W (B), where j = 2, ..., h, we reach a

contradiction since

k(B) ≥ (h − 1) + (k(A) − h) = k(A) − 1 ≥ k(B) + 1.

For the remaining part of the proof, let c be an extreme point of l, where l is a line segment on the boundary of W (B). We consider two cases: either (a) there is only one line segment in ∂W (B) passing through c, or (b) there are exactly two line segments in ∂W (B) passing through c. In case (a), since xj and yj are nonzero vectors

for j = 1, ..., h, we infer that c = bj = ξj or ξj ∈ (c, bj) for these j’s. This implies that

zj ∈ Hη(A) by [4, Theorem 1], where η is not an extreme point of l. So, the same arguments as above lead us to a contradiction. For case (b), since c is a corner of W(B), c is a reducing eigenvalue of B by [3, Theorem 1]. Thus B is unitarily similar to a matrix of the form B′ ⊕ cIn′, where c is not an eigenvalue of B′, and the size of B′ and n′ are both less than n. Obviously, c ∉ W(B′). We apply the preceding remark as for the case of c ∉ W(B) to see that k(A) = k(B′ ⊕ cIn′+1) = k1(B′) + n′ + 1, and k(B) = k(B′ ⊕ cIn′) = k1(B′) + n′. In addition, k(B) = k1(B) in this case. Hence we obtain that k(A) = k1(B) + 1, which contradicts our assumption that k(A) ≥ k1(B) + 2. With this, we conclude the proof of the asserted equality. ∎

We remark that the part of the proof of Lemma 2.2.3 on c /∈ W (B) involves the following three cases (1), (2), and (3) depending on whether ∂W (B) contains a line segment or otherwise. In case (1), we have R = {zj : yj 6= 0} and S = {zj : yj 6= 0},

in (2) R = R1 ∪ R2, where R1 = {zj : yj 6= 0} and R2 = {zj : yj = 0}, and

S = {zj : yj 6= 0}, and in (3) R = R1 ∪ R2, where R1 = {zj : yj 6= 0} and

R2 = {zj : yj = 0}, and S = S1∪ S2, where S1 = {zj : yj 6= 0} and S2 = {zj : yj = 0}.

Note that the key point is to handle R and S in (1), R1 and S in (2), and R1 and

S1 in (3), that is, all nonzero yj’s of the three cases. We find that the proofs of the

three cases are almost the same. This observation can facilitate the proof of Theorem 2.2.2 as follows. If ∂W (B) contains a line segment such that this line segment is a portion of ∂W (A) and stretches to a point of ∂W (C), then we take the same method as the proof of Lemma 2.2.3 on c /∈ W (B) to partition the corresponding R into R1 = {zj : yj 6= 0} and R2 = {zj : yj = 0}. As mentioned above, we need only handle

R1. On the other hand, if ∂W (B) contains no such line segments, then we need only

handle the corresponding R = {zj : yj 6= 0}. From this, there is no difference between

the proofs of the two cases. Hence we may assume, in the proof of Theorem 2.2.2, that ∂W (B) and ∂W (C) contain no line segments.

Before giving a proof of Theorem 2.2.2, we note several things. First of all, by Lemma 2.2.3, we may assume that both of the numerical ranges W (B) and W (C) are not singletons. Secondly, we may further assume that ∂W (B) and ∂W (C) contain no line segment by the above remark. Thirdly, since W (A) is the convex hull of the union of W (B) and W (C), there are two line segments, called [a, p] and [b, q], in ∂W (A), where a, b ∈ ∂W (B) and p, q ∈ ∂W (C). Fourthly, it is easy to check that a 6= b and


p 6= q. Indeed, if a = b, then a is a corner. By [3, Theorem 1], we obtain that a is a reducing eigenvalue of A, and hence a is a reducing eigenvalue of B. This shows that W (B) must contain a line segment, which contradicts our previous assumption. Similarly, we also have p 6= q. Combining the above, we have the following Figure 2.2.5 as the numerical range W (A).

[Figure 2.2.5: W(A) for disjoint W(B) and W(C), with the boundary segments [a, p] and [b, q].]

As before, by the definition of k(A), there exist ξj = hAzj, zji ∈ ∂W (A), j =

1, 2, . . . , k(A), where zj = xj ⊕ yj, and hzi, zji = δij for i, j = 1, ..., k(A). We define

four (disjoint) subsets consisting of the corresponding unit vectors, and their cardinal numbers, respectively, as follows:

R ≡ {zj : ξj ∈ (a, p)} with r ≡ # (R) ,

S ≡ {zj : ξj ∈ (b, q)} with s ≡ # (S) ,

TB ≡ {zj : ξj ∈ ∂W (A) ∩ ∂W (B)} with t1 ≡ # (TB) , and

TC ≡ {zj : ξj ∈ ∂W (A) ∩ ∂W (C)} with t2 ≡ # (TC) .

Since the intersection of W (B) and W (C) is empty, and ∂W (B) and ∂W (C) contain


no line segment, we may assume that
\[
\begin{aligned}
R &= \{z_j = x_j \oplus y_j : x_j \ne 0 \text{ and } y_j \ne 0\}_{j=1}^{r},\\
S &= \{z_j = x_j \oplus y_j : x_j \ne 0 \text{ and } y_j \ne 0\}_{j=r+1}^{r+s},\\
T_B &= \{z_j = x_j \oplus 0 : x_j \ne 0\}_{j=r+s+1}^{r+s+t_1}, \text{ and}\\
T_C &= \{z_j = 0 \oplus y_j : y_j \ne 0\}_{j=r+s+t_1+1}^{r+s+t_1+t_2}.
\end{aligned}
\]

So, k(A) = r + s + t1 + t2, k1(B) ≥ t1 and k1(C) ≥ t2. Clearly, the inequality

k(A) ≥ k1(B) + k1(C) holds. Now we are ready to prove Theorem 2.2.2.

Proof of Theorem 2.2.2. We need only prove that the reversed inequality k1(B) +

k1(C) ≥ k(A) holds. First, we consider the case r = 0. Assume that s = 0. Then our

assertion is obvious since

k1(B) + k1(C) ≥ t1+ t2 = r + s + t1+ t2 = k(A).

Assume that s = 1, i.e., z1 = x1 ⊕ y1 ∈ S. Then k1(B) ≥ t1 + 1 since the unit vector (x1/‖x1‖) ⊕ 0 is clearly orthogonal to TB and ⟨B(x1/‖x1‖), x1/‖x1‖⟩ is in ∂W(B) by the convex combination
\[
\langle Az_1, z_1\rangle = \|x_1\|^2\left\langle B\frac{x_1}{\|x_1\|}, \frac{x_1}{\|x_1\|}\right\rangle + \|y_1\|^2\left\langle C\frac{y_1}{\|y_1\|}, \frac{y_1}{\|y_1\|}\right\rangle \in (b, q).
\]
Hence k1(B) + k1(C) ≥ (t1 + 1) + t2 = r + s + t1 + t2 = k(A).

Assume that s = 2, i.e., z1 = x1 ⊕ y1 and z2 = x2 ⊕ y2 ∈ S. If x1 and x2 are linearly independent, then by the Gram-Schmidt process there are two unit vectors z′1 and z′2, where z′j = x′j ⊕ y′j with x′j ≠ 0 for j = 1, 2, such that x′1 and x′2 are mutually orthogonal and span({z1, z2}) is equal to span({z′1, z′2}). Choosing the two unit vectors (x′1/‖x′1‖) ⊕ 0 and (x′2/‖x′2‖) ⊕ 0, we obtain that k1(B) ≥ t1 + 2. Hence k1(B) + k1(C) ≥ (t1 + 2) + t2 = r + s + t1 + t2 = k(A).

On the other hand, if x1 and x2 are linearly dependent, say, x2 = λx1 for some scalar λ, then we define a new unit vector
\[
z'_2 = \frac{z_2 - \lambda z_1}{\|z_2 - \lambda z_1\|} = 0 \oplus \frac{y_2 - \lambda y_1}{\|y_2 - \lambda y_1\|} \in \operatorname{span}(\{z_1, z_2\})
\]
so that span({z1, z2}) = span({z′1}) ⊕ span({z′2}) for some unit vector z′1 ≡ x′1 ⊕ y′1, where z′1 and z′2 are mutually orthogonal. Clearly, x′1 ≠ 0, for otherwise it leads to x1 = x2 = 0, which contradicts the definition of S. From the two unit vectors (x′1/‖x′1‖) ⊕ 0 and z′2, we infer that k1(B) ≥ t1 + 1 and k1(C) ≥ t2 + 1. Hence
\[
k_1(B) + k_1(C) \ge (t_1 + 1) + (t_2 + 1) = r + s + t_1 + t_2 = k(A).
\]

Assume that s ≥ 3, that is, S = {zj = xj ⊕ yj : xj ≠ 0 and yj ≠ 0}_{j=1}^{s}. We consider the largest linearly independent subset of {xj}_{j=1}^{s} as follows. Without loss of generality, we may assume that this can be {xj}_{j=1}^{s}, {x1} or {xj}_{j=1}^{l}, where 1 < l < s. For the first two cases, it can be done by applying similar arguments as for the case of s = 2. In the last case, since xj is a linear combination of x1, ..., xl for j = l + 1, ..., s, say xj = Σ_{i=1}^{l} a_i^{(j)} xi, it is easy to check that the unit vectors
\[
\frac{z_j - \sum_{i=1}^{l} a_i^{(j)} z_i}{\bigl\|z_j - \sum_{i=1}^{l} a_i^{(j)} z_i\bigr\|} = 0 \oplus \frac{y_j - \sum_{i=1}^{l} a_i^{(j)} y_i}{\bigl\|y_j - \sum_{i=1}^{l} a_i^{(j)} y_i\bigr\|}, \qquad j = l + 1, ..., s, \tag{1}
\]
are linearly independent. Let y′j = (yj − Σ_{i=1}^{l} a_i^{(j)} yi)/‖yj − Σ_{i=1}^{l} a_i^{(j)} yi‖ for j = l + 1, ..., s. Since F ≡ span{z′j = 0 ⊕ y′j}_{j=l+1}^{s} is a subspace of the space V ≡ span{zj}_{j=1}^{s}, the orthogonal complement of F in V, called E, can be written as span{z′j ≡ x′j ⊕ y′j}_{j=1}^{l} for some unit vectors z′j, j = 1, ..., l. By (1), we see that {x′j}_{j=1}^{l} is linearly independent since {xj}_{j=1}^{l} is linearly independent. Hence we may assume that both {x′j}_{j=1}^{l} and {y′j}_{j=l+1}^{s} are orthogonal subsets by the Gram-Schmidt process. This shows that G1 ≡ {(x′j/‖x′j‖) ⊕ 0}_{j=1}^{l} and G2 ≡ {0 ⊕ y′j}_{j=l+1}^{s} are orthogonal to TB and TC, respectively. Since every vector v in G1 (resp., G2) is such that ⟨Av, v⟩ is in ∂W(B) (resp., ∂W(C)), we obtain that k1(B) + k1(C) ≥ k(A) from k1(B) ≥ t1 + l and k1(C) ≥ t2 + s − l. This completes the proof of the case r = 0.


Next, we prove the case r = 1. Obviously, it is sufficient to consider s ≥ 1 since the case r = 1, s = 0 is the same as the case r = 0, s = 1. Assume that s = 1, that is, z1 = x1 ⊕ y1 ∈ R and z2 = x2 ⊕ y2 ∈ S. Then k1(B) ≥ t1 + 1 and k1(C) ≥ t2 + 1 since (x1/‖x1‖) ⊕ 0 and 0 ⊕ (y2/‖y2‖) are orthogonal to TB and TC, respectively. Moreover, ⟨B(x1/‖x1‖), x1/‖x1‖⟩ is in the boundary of W(B) by the convex combination
\[
\langle Az_1, z_1\rangle = \|x_1\|^2\left\langle B\frac{x_1}{\|x_1\|}, \frac{x_1}{\|x_1\|}\right\rangle + \|y_1\|^2\left\langle C\frac{y_1}{\|y_1\|}, \frac{y_1}{\|y_1\|}\right\rangle \in (a, p),
\]
and ⟨C(y2/‖y2‖), y2/‖y2‖⟩ is in the boundary of W(C) by the same arguments. Hence
\[
k_1(B) + k_1(C) \ge (t_1 + 1) + (t_2 + 1) = r + s + t_1 + t_2 = k(A).
\]

Assume that s = 2. Then we have R = {z1 = x1 ⊕ y1 : x1 ≠ 0 and y1 ≠ 0} and S = {zj = xj ⊕ yj : xj ≠ 0 and yj ≠ 0}_{j=2}^{3}. If {x2, x3} is linearly independent, then we may assume that it is an orthogonal set by the Gram-Schmidt process. By the convex combination mentioned above, we infer from the three unit vectors 0 ⊕ (y1/‖y1‖), (x2/‖x2‖) ⊕ 0, and (x3/‖x3‖) ⊕ 0 that k1(B) ≥ t1 + 2 and k1(C) ≥ t2 + 1. Hence
\[
k_1(B) + k_1(C) \ge (t_1 + 2) + (t_2 + 1) = r + s + t_1 + t_2 = k(A).
\]
On the other hand, if {x2, x3} is linearly dependent, say, x2 = λx3 for some scalar λ, then we define a new unit vector
\[
z'_2 = \frac{z_2 - \lambda z_3}{\|z_2 - \lambda z_3\|} = 0 \oplus \frac{y_2 - \lambda y_3}{\|y_2 - \lambda y_3\|} \in \operatorname{span}(\{z_2, z_3\})
\]
so that span({z2, z3}) = span({z′2}) ⊕ span({z′3}) for some unit vector z′3 ≡ x′3 ⊕ y′3, where z′2 is orthogonal to z′3. Clearly, x′3 ≠ 0, for otherwise it leads to x2 = x3 = 0, which contradicts the definition of S. From the three unit vectors 0 ⊕ (y1/‖y1‖), 0 ⊕ ((y2 − λy3)/‖y2 − λy3‖), and (x′3/‖x′3‖) ⊕ 0, we infer that k1(B) ≥ t1 + 1 and k1(C) ≥ t2 + 2. Hence
\[
k_1(B) + k_1(C) \ge (t_1 + 1) + (t_2 + 2) = r + s + t_1 + t_2 = k(A).
\]

Assume that s ≥ 3, that is, S = {zj = xj ⊕ yj : xj 6= 0 and yj 6= 0}s+1j=2, and R =

{z1 = x1⊕ y1 : x1 6= 0 and y1 6= 0}. We consider the largest linearly independent

subset of {xj}s+1j=2, which we may assume to be {xj}j=2s+1, {x2} or {xj}lj=2, where

2 < l < s + 1. These three largest subsets are similar to those considered under r = 0, s ≥ 3. Indeed, we need only add the unit vector 0 ⊕ (y1/ ky1k) to every

sub-case of the case r = 0, s ≥ 3. Hence we have proved that the reversed inequality k1(B) + k1(C) ≥ k(A). This completes the proof of the case r = 1.

Let r = 2. With the help of the preceding discussions, we may assume that s ≥ 2. Assume that s = 2, that is, R = {zj = xj ⊕ yj : xj 6= 0 and yj 6= 0}2j=1 and S =

{zj = xj ⊕ yj : xj 6= 0 and yj 6= 0}4j=3. If {x3, x4} is linearly independent, then we

consider two cases as follows. First, we assume that {y1, y2} is linearly independent.

We may further assume that {x3, x4} and {y1, y2} are orthogonal subsets by the

Gram-Schmidt process. Obviously, the two subsets H1 ≡ {0 ⊕ (y1/ ky1k) , 0 ⊕ (y2/ ky2k)}

and H2 ≡ {(x3/ kx3k) ⊕ 0, (x4/ kx4k) ⊕ 0} are orthogonal to TC and TB, respectively.

Since every vector v in H1 (resp., H2) is such that hAv, vi is in the boundary of W (C)

(resp., W (B)), we infer, from k1(B) ≥ t1+2 and k1(C) ≥ t2+2, that k1(B)+k1(C) ≥

k(A). On the other hand, assume that {y1, y2} is linearly dependent, say, y1 = λy2

for some scalar λ. Then we define a new unit vector z′

1 = (z1− λz2)/kz1 − λz2k =

((x1−λx2)/kx1−λx2k)⊕0 so that span ({z1, z2}) = span ({z1′})⊕span ({z2′}) for some

unit vector z′

2 ≡ x′2⊕ y2′, where z1′ and z2′ are mutually orthogonal. Clearly, y2′ 6= 0 for

otherwise it leads to y1 = y2 = 0, which contradicts the definition of R. Moreover,

we may assume that {x3, x4} is an orthogonal subset by the Gram-Schmidt

pro-cess. Hence H3 ≡ {((x1 − λx2) / kx1 − λx2k) ⊕ 0, (x3/ kx3k) ⊕ 0, (x4/ kx4k) ⊕ 0}

and H4 ≡ {0 ⊕ (y′2/ ky2′k)} are orthogonal to TB and TC, respectively. Since every

vector v in H3 (resp., H4) is such that hAv, vi is in the boundary of W (B) (resp.,

W (C)), we infer, from k1(B) ≥ t1+ 3 and k1(C) ≥ t2+ 1, that k1(B) + k1(C) ≥ k(A).

On the other hand, if {x3, x4} is linearly dependent, then we need only consider

the case that {y1, y2} is linearly dependent. So, we may assume that y1 = λy2 and x3 = µx4 for some scalars λ and µ. Define two new unit vectors
\[
z'_1 = \frac{z_1 - \lambda z_2}{\|z_1 - \lambda z_2\|} = \frac{x_1 - \lambda x_2}{\|x_1 - \lambda x_2\|} \oplus 0 \quad\text{and}\quad z'_3 = \frac{z_3 - \mu z_4}{\|z_3 - \mu z_4\|} = 0 \oplus \frac{y_3 - \mu y_4}{\|y_3 - \mu y_4\|}.
\]

Then span ({z1, z2}) = span ({z1′})⊕span ({z2′}) and span ({z3, z4}) = span ({z′3})⊕

span ({z′

4}) for some unit vectors z2′ = x′2⊕ y2′ and z′4 = x′4⊕ y4′, where z2′ (resp., z4′)

is orthogonal to z′

1 (resp., z′3). Clearly, y2′ and x′4 are nonzero by the same argument

as above. Hence H5 ≡ {((x1− λx2) / kx1− λx2k) ⊕ 0, (x′4/ kx′4k) ⊕ 0} and H6 ≡

{0 ⊕ (y′

2/ ky2′k) , 0 ⊕ ((y3− λy4) / ky3− λy4k)} are orthogonal to TB and TC,

respec-tively. Since every vector v in H5 (resp., H6) is such that hAv, vi is in the boundary

of W (B) (resp., W (C)), we infer, from k1(B) ≥ t1+2 and k1(C) ≥ t2+2, that k1(B)+

k1(C) ≥ k(A). Assume that s ≥ 3, that is, R = {zj = xj ⊕ yj : xj 6= 0 and yj 6= 0}2j=1,

and S = {zj = xj ⊕ yj : xj 6= 0 and yj 6= 0}s+2j=3. If {y1, y2} is linearly independent,

then we may assume that {y1, y2} is orthogonal by the Gram-Schmidt process. In

this case, we consider the largest linearly independent subset of {xj}s+2j=3, which may

be assumed to be {xj}s+2j=3, {x3} or {xj}lj=3 (3 < l < s + 2). Each of the three

cases can be handled by applying similar arguments as for the cases of r = 0, s ≥ 2. On the other hand, if {y1, y2} is linearly dependent, say, y1 = λy2 for some

scalar λ, then we define a new unit vector z′

1 = ((x1− λx2)/kx1− λx2k) ⊕ 0 so that

span ({z1, z2}) = span ({z1′}) ⊕ span ({z′2}) for some unit vector z′2 = x′2⊕ y′2, where z1′

and z′

2 are mutually orthogonal. Clearly, y2′ is nonzero by the same argument as for

the case of r = 0, s = 2. To complete the proof, it remains to consider the three cases mentioned above. By applying similar arguments again as for the cases of r = 0, s ≥ 2, we obtain the reversed inequality k1(B) + k1(C) ≥ k(A). This completes the

proof of the case r = 2.

Finally, assume that r ≥ 3. It suffices to consider s ≥ 3 since s ≤ 2 has been proven if we exchange the roles of s and r. Hence R = {zj = xj ⊕ yj : xj ≠ 0 and yj ≠ 0}_{j=1}^{r} and S = {zj = xj ⊕ yj : xj ≠ 0 and yj ≠ 0}_{j=r+1}^{r+s}. As mentioned previously, there are largest linearly independent subsets of {yj}_{j=1}^{r} and of {xj}_{j=r+1}^{r+s}. Without loss of generality, we may assume that these subsets are {yj}_{j=1}^{r}, {y1} or {yj}_{j=1}^{l1}, where 1 < l1 < r, and {xj}_{j=r+1}^{r+s}, {xr+1} or {xj}_{j=r+1}^{r+l2}, where 1 < l2 < s. There are a total of nine cases to be considered. Since each case is similar to the one under r = 0, s ≥ 1, it follows that the reversed inequality k1(B) + k1(C) ≥ k(A)

holds. This completes the proof of the case r ≥ 3.  At the end of the section, we give a generalization of Theorem 2.2.2 under a slightly weaker condition on B and C. Let A be a matrix of the form B ⊕ C. Since W (A) is the convex hull of the union of W (B) and W (C), we consider two (disjoint) subsets of ∂W (A) as follows: one is ∂W (A) \ (∂W (B) ∪ ∂W (C)) ≡ Γ1, and the other

is ∂W (A) ∩ ∂W (B) ∩ ∂W (C) ≡ Γ2. Geometrically, Γ1 consists of the line segments

contained in ∂W (A) but not in ∂W (B) ∪ ∂W (C). On the other hand, since the common boundaries of the three numerical ranges consist of line segments and points which are not in any line segments, every point of the latter can be regarded as a degenerate line segment. Hence Γ2 consists of the (possibly degenerate) line segments

contained in the common boundaries of the three numerical ranges. If Γ ≡ Γ1∪ Γ2

consists of at most two (possibly degenerate) line segments, then we say that W (A) has property Λ. Evidently, the disjointness of W (B) and W (C) implies that property Λ holds since Γ1 consists of exactly two line segments and Γ2 is empty.

Applying similar arguments as in the proof of Theorem 2.2.2, property Λ is enough to establish the equality k(A) = k1(B)+k1(C). Hence we have the following theorem.

Theorem 2.2.6. Let A = B⊕C, where B and C are n-by-n and m-by-m matrices,

respectively. If W (A) has property Λ, then k(A) = k1(B) + k1(C) ≤ k(B) + k(C). In this case, k(A) = k(B) + k(C) if and only if k1(B) = k(B) and k1(C) = k(C). In particular, k(A) = m + n if and only if k1(B) = k(B) = n and k1(C) = k(C) = m.
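To see that property Λ genuinely relaxes the disjointness assumption of Theorem 2.2.2, consider for instance (an illustrative choice of matrices, not from the thesis)
\[
B=\begin{bmatrix}0 & 2\\ 0 & 0\end{bmatrix}, \qquad C=\begin{bmatrix}1 & 2\\ 0 & 1\end{bmatrix},
\]
whose numerical ranges are the closed discs of radius 1 centered at 0 and at 1, respectively. They overlap, yet Γ2 is empty and Γ1 consists of the two segments of ∂W(A) on the lines Im z = ±1, so W(A) has property Λ. Since ±i ∈ ∂W(A) ∩ ∂W(B) and 1 ± i ∈ ∂W(A) ∩ ∂W(C) are attained at orthonormal pairs of unit vectors, k1(B) = k1(C) = 2, and Theorem 2.2.6 yields k(A) = k1(B) + k1(C) = 4.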

2.3  Applications and discussions

The first application of our results in Section 2.2 is a generalization of Lemma 2.2.3. Indeed, we are able to determine the value of k(A) for A = B ⊕ C with normal C.

Proposition 2.3.1. Let A = B ⊕ C, where C is an m-by-m normal matrix.

Then k(A) = k1(B) + k1(C). In this case, k(A) = k(B) + k(C) if and only if

k1(B) = k(B) and k1(C) = k(C). In particular, if C = cIm for some scalar c, then

k(A) = k1(B) + k1(cIm).

Proof. Let the normal C be unitarily similar to ⊕_{j=1}^{m}[cj]. By [19, Lemma 2.9], we may assume that all the cj's lie in ∂W(A). This shows that k1(C) = m immediately.

On the other hand, we also obtain k(A) = k1(B) + m by Lemma 2.2.3. Hence the

asserted equality k(A) = k1(B) + k1(C) has been proven. The remaining assertions

hold trivially by this equality. 

An easy corollary of Proposition 2.3.1 is to determine when k (A) equals the size of A for a matrix A = B ⊕ C with normal C.

Corollary 2.3.2. Let A = B ⊕ C, where B is an n-by-n matrix and C is an

m-by-m normal matrix. Then k(A) = n + m if and only if k1(B) = n and k1(C) = m. Assume, moreover, that dim Hη = 1 for all η ∈ ∂W (B). Then k(A) = n + m if and only if k1(B) = n ≤ 2 and k1(C) = m.

Proof. By Proposition 2.3.1, it is clear that k(A) equals the size of A if and only if

k1(B) and k1(C) equal the sizes of B and C, respectively. In this case, the assumption

on Hη implies that k1(B) = n ≤ 2 by [19, Proposition 2.10]. This completes the proof.



Recall the set Γ ≡ Γ1 ∪ Γ2 defined at the end of Section 2.2, where Γ1 = ∂W(A) \ (∂W(B) ∪ ∂W(C)) and Γ2 = ∂W(A) ∩ ∂W(B) ∩ ∂W(C). The next proposition gives a lower bound for k(A).

Proposition 2.3.3. Let A = B ⊕ C be an n-by-n (n ≥ 3) matrix. Then Γ is

empty if and only if the numerical range of one summand is contained in the interior of the numerical range of the other. In particular, if Γ is nonempty, then k(A) ≥ 3.

Proof. If Γ = Γ1 ∪ Γ2 is empty, then both Γ1 and Γ2 are empty. Since Γ1 is

empty, ∂W (A) is contained in ∂W (B) ∪ ∂W (C). This implies that W (B) ∩ W (C) is nonempty, and thus W (B) = W (C), W (B) ⊆ int W (C) or W (C) ⊆ int W (B). Moreover, Γ2 = φ implies that W (B) 6= W (C). With this, we conclude that either

W (B) ⊆ int W (C) or W (C) ⊆ int W (B). The converse is obvious. Hence we have proved the first assertion. Let Γ be nonempty, that is, either Γ1 or Γ2 is nonempty. If

Γ1 is nonempty, then there is a line segment on the boundary of W (A). This shows

that k(A) ≥ 3 by [19, Corollary 2.5]. On the other hand, if Γ2 is nonempty, then

there is a (possibly degenerate) line segment on the common boundaries of the three numerical ranges W(A), W(B) and W(C). Using [19, Corollary 2.5] again, we may assume that the line segment is degenerate, say, to {ξ}. This implies immediately that dim Hξ(A) ≥ 2. Thus k(A) ≥ 3 by [19, Proposition 2.4]. ∎

As an application, when A is reducible, the next corollary gives a necessary and sufficient condition for k(A) = 2.

Corollary 2.3.4. Let A = B ⊕ C be an n-by-n (n ≥ 3) matrix. Then k(A) = 2

if and only if either k(B) = 2 and W (C) ⊆ int W (B), or k(C) = 2 and W (B) ⊆

int W (C).
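Before the proof, a small illustration (the matrices are chosen only for this purpose): let
\[
B=\begin{bmatrix}0 & 1\\ 0 & 0\end{bmatrix},
\]
so that W(B) is the closed disc of radius 1/2 centered at the origin and k(B) = 2. If C = [1/4], then W(C) ⊆ int W(B) and the corollary gives k(B ⊕ C) = 2. If instead C = [1/2], the point 1/2 lies on ∂W(B), neither numerical range is contained in the interior of the other, and indeed Lemma 2.2.3 gives k(B ⊕ C) = k1(B) + k1([1/2]) = 2 + 1 = 3.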

Proof. If k(A) = 2, then Proposition 2.3.3 shows that Γ is empty, and thus the

numerical range of one summand, say, B is contained in the interior of the numerical range of C. Hence k(C) = 2 by [19, Lemma 2.9]. The converse is obvious by [19,


Lemma 2.9] again.  The following proposition determines exactly when k(A) equals the size of A for an irreducible matrix A. It is also stated in [2, Theorem 7] while the proof there is different from ours.

Proposition 2.3.5. Let A be an n-by-n (n ≥ 3) irreducible matrix. Then k(A) = n if and only if ∂W (A) contains a line segment l and there are n points (not necessarily

distinct) in l ∪ (∂W (A) ∩ L), where L is the supporting line parallel to l such that their corresponding unit vectors form an orthonormal basis for Cn.

Proof. We need only prove the necessity. Assume that A is an n-by-n (n ≥ 3)

irreducible matrix with k(A) = n. If ∂W (A) contains no line segment, then dim Hξ=

dim Eξ,l≤ n/2 for all ξ ∈ ∂W (A) by [19, Proposition 2.2]. If n is odd, say, n = 2m+1,

then dim Hξ = dim Eξ,l ≤ m for all ξ ∈ ∂W (A). Since k(A) = n, it follows from [19,

Theorem 2.7] that A is reducible, which is absurd. If n is even, say, n = 2m, then m ≥ 2 by our assumption that n ≥ 3. Since k(A) = n and ∂W(A) contains no line segment, A is unitarily similar to a matrix of the form
\[
\begin{bmatrix} \xi I_m & e^{i\theta}D\\ -e^{i\theta}D^{*} & \eta I_m\end{bmatrix}
\]
by [19, Theorem 2.7], where dim Hξ = dim Hη = m. Let D = USV be the singular value decomposition of D, where U and V are unitary and S = diag(s1, ..., sm) is a diagonal matrix with sj ≥ 0, j = 1, ..., m. Then
\[
\begin{bmatrix} U^{*} & 0\\ 0 & V\end{bmatrix}
\begin{bmatrix} \xi I_m & e^{i\theta}D\\ -e^{i\theta}D^{*} & \eta I_m\end{bmatrix}
\begin{bmatrix} U & 0\\ 0 & V^{*}\end{bmatrix}
=
\begin{bmatrix} \xi I_m & e^{i\theta}S\\ -e^{i\theta}S & \eta I_m\end{bmatrix}
\]
and the latter is unitarily similar to
\[
\bigoplus_{j=1}^{m}\begin{bmatrix} \xi & e^{i\theta}s_j\\ -e^{i\theta}s_j & \eta\end{bmatrix}.
\]

This contradicts the irreducibility of A. Hence ∂W (A) must contain a line segment. We then apply [19, Theorem 2.7] again to complete the proof.  An easy corollary of Proposition 2.3.5 is the following upper bound for k(A). This was given in [19, Proposition 2.10]. Here we give a simpler proof.

Corollary 2.3.6. If A is an n-by-n (n ≥ 3) matrix with dim Hξ = 1 for all ξ ∈ ∂W (A), then k(A) ≤ n − 1.

Proof. Assume that k(A) = n. It suffices to consider that A is reducible; this

is because if otherwise, then Proposition 2.3.5 shows that ∂W (A) contains a line segment, which contradicts the assumption on Hξ. Let A = B ⊕ C. Then our

assumption on Hξ implies that Γ is empty. By Proposition 2.3.3, we obtain that the

numerical range of one summand is contained in the interior of the numerical range of the other summand. It follows from [19, Lemma 2.9] that the value of k(A) equals k(B) or k(C). Thus k(A) ≤ n − 1 as asserted.  We now combine Proposition 2.3.1, Corollary 2.3.2, Corollary 2.3.4, and Proposi-tion 2.3.5 to determine the value of k(A) for any 4-by-4 reducible matrix A. Corollary 2.3.4 shows exactly when the value of k(A) equals two. By Proposition 2.3.1, Corol-lary 2.3.2 and Proposition 2.3.5, we get a necessary and sufficient condition for the value of k(A) to be equal to four. In other words, the value of k(A) can be determined completely for any 4-by-4 reducible matrix A. To do this, we note that a 4-by-4 re-ducible matrix A can be written, after a unitary similarity, as (i) A = B ⊕ [c], where B is a 3-by-3 irreducible matrix and c is a complex number, (ii) A = B ⊕ [c], where B is a 3-by-3 reducible matrix and c is a complex number, or (iii) A = B ⊕ C, where B and C are 2-by-2 irreducible matrices. Proposition 2.3.7 below is to deal with case (i).

Recall that for a 3-by-3 irreducible matrix A, W(A) is of one of the following shapes (cf. [9]): an elliptic disc, the convex hull of a heart-shaped region, in which case ∂W(A) contains a line segment, and an oval region.

Proposition 2.3.7. Let A = B ⊕ [c], where B is a 3-by-3 irreducible matrix and c is a complex number. Then k(A) = 4 if and only if c /∈ int W (B) and {a1, a2, b} ⊆

∂W (A), where W (B) is the convex hull of a heart-shaped region, in which case ∂W (B)

contains a line segment [a1, a2] contained in the supporting line L1 of W (B) and L2 is the supporting line of W (B) passing through b and parallel to L1.

Proof. By Corollary 2.3.2, we see that k(A) = 4 is equivalent to k1(B) = 3

and k1([c]) = 1. Since a necessary and sufficient condition for k1([c]) = 1 is that

c /∈ int W (B), it remains to show that k1(B) = 3 if and only if {a1, a2, b} ⊆ ∂W (A)

and W (B) satisfies the asserted properties. If k1(B) = 3, then k(B) = 3. Hence

it follows from Proposition 2.3.5 that ∂W (A) contains {a1, a2, b}, and W (B) is as

asserted. The converse is trivial.  For case (ii), let A = B ⊕ [c], where B is a 3-by-3 reducible matrix. After a unitary similarity, B can be written as C ⊕ [b], where C is a 2-by-2 matrix, so that k(A) = k1(C) + k1([b] ⊕ [c]) by Proposition 2.3.1. The following proposition gives a

necessary and sufficient condition for k(A) to be equal to four.

Proposition 2.3.8. Let A = C ⊕ [b] ⊕ [c], where C is a 2-by-2 matrix, and b and c are complex numbers. Then k(A) = 4 if and only if both b and c are in ∂W (A) and k1(C) = 2.

Proof. By Corollary 2.3.2, it is obvious that k (A) = 4 if and only if k1(C) = 2

and k1([b] ⊕ [c]) = 2. Moreover, it is also clear that k1([b] ⊕ [c]) = 2 is equivalent to

both of b and c being in ∂W (A). Hence the proof is complete.  To prove for case (iii), let A = B ⊕ C, where B and C are 2-by-2 irreducible


matrices. Since W (A) is the convex hull of the union of the two elliptic discs W (B) and W (C), either W (B) equals W (C), or Γ consists of at most four (possibly degen-erate) line segments. With this, we are now ready to give a necessary and sufficient condition for k(A) = 4.

Proposition 2.3.9. Let A = B ⊕ C, where B and C are 2-by-2 irreducible

matri-ces. Then k(A) = 4 if and only if Γ consists of at least three line segments (including the possibly degenerate ones), or Γ consists of exactly two (possibly degenerate) line segments such that k1(B) = k1(C) = 2.

Proof. If Γ consists of more than four (possibly degenerate) line segments, then

the two elliptic discs W (B) and W (C) are identical. Hence k(A) = 4 by direct computations. If Γ consists of four or three (possibly degenerate) line segments, then the endpoints of the major axes of the two elliptic discs W (B) and W (C) are in ∂W (A). Hence k(A) = 4. If Γ consists of exactly two (possibly degenerate) line segments such that k1(B) = k1(C) = 2, then k(A) = 4 by Theorem 2.2.6. Therefore

we have proved the sufficient condition for k(A) = 4. Next assume that k(A) = 4 and either Γ consists of exactly two (possibly degenerate) line segments such that the equalities k1(B) = k1(C) = 2 fail, or Γ consists of at most one (possibly degenerate)

line segment. Since property Λ holds in each case, we must have k1(B) = k1(C) = 2

by Theorem 2.2.6. This shows that we need only consider the latter. If Γ consists of exactly one (possibly degenerate) line segment, then Γ1 is empty and Γ2is a singleton.

Hence we may assume that W (B) is contained in W (C) and the intersection of W (B) and W (C) is Γ. This shows that k1(B) = 1 and k1(C) = 2, which is a contradiction.

If Γ is empty, then it follows from Proposition 2.3.3 that the numerical range of one summand, say, B is contained in the interior of the numerical range of the other summand C. By Corollary 2.3.4 and [5, Lemma 4.1], we see that k(A) = k(C) = 2, which is absurd. This completes the proof. 


As a final application of Theorem 2.2.6, it is obvious that the convex hull of the union of W (A) and W (A + aIn) has property Λ for any a 6= 0. Hence we obtain the

following proposition.

Proposition 2.3.10. Let A be an n-by-n matrix and a be a nonzero complex num-ber. Then k(A ⊕ (A + aIn)) = k1(A) + k1(A + aIn). In this case, k(A ⊕ (A + aIn)) =

2k(A) if and only if k1(A + aIn) = k1(A) = k(A).

We conclude this paper by stating the following open questions concerning this topic. Is it true that the equality k(A) = k1(B) + k1(C) holds for a matrix A of the

form B ⊕C even if property Λ fails? We note that although property Λ fails, the men-tioned formula may still be correct (cf. Proposition 2.3.1). Another natural example of the failure of property Λ is that both W (B) and W (C) have the same numerical range. Is it true that k (B ⊕ C) = k(B) + k(C) in this case? In particular, can we determine the value of k (A ⊕ A) (cf. Proposition 2.3.10)? The following proposition gives a partial answer for k (A ⊕ A) if we assume, in addition, that dim Hξ = 1 for

all ξ ∈ ∂W (A).

Proposition 2.3.11. If A is an n-by-n matrix with dim Hξ = 1 for all ξ ∈ ∂W(A), then
\[
k\!\left(\bigoplus_{j=1}^{m} A\right) = m \cdot k(A).
\]

Proof. Obviously, the inequality k(⊕_{j=1}^{m} A) ≥ m · k(A) holds. To prove the reversed inequality, we consider, for convenience, the case m = 2. Let ξ1 ∈ ∂W(A ⊕ A). Then dim Hξ1(A ⊕ A) = 2 by our assumption on Hξ(A). Hence the subspace Hξ1(A ⊕ A) is spanned by the two unit vectors x1 ⊕ 0 and 0 ⊕ x1, where ξ1 = ⟨Ax1, x1⟩. Let z1 be a unit vector in Hξ1(A ⊕ A). Then z1 = (α1x1 ⊕ α2x1)/√(|α1|² + |α2|²), where α1 and α2 are in C. Similarly, for ξ2 ∈ ∂W(A ⊕ A), the subspace Hξ2(A ⊕ A) is spanned by the two unit vectors x2 ⊕ 0 and 0 ⊕ x2, where ξ2 = ⟨Ax2, x2⟩. Moreover, if z2 is a unit vector in Hξ2(A ⊕ A), then z2 = (β1x2 ⊕ β2x2)/√(|β1|² + |β2|²), where β1 and β2 are in C. Obviously, the orthogonality of z1 and z2 is equivalent to (α1β̄1 + α2β̄2)⟨x1, x2⟩ = 0, that is,
\[
\left\langle \begin{bmatrix} \alpha_1\\ \alpha_2\end{bmatrix}, \begin{bmatrix} \beta_1\\ \beta_2\end{bmatrix}\right\rangle \langle x_1, x_2\rangle = 0.
\]
This shows that k(A ⊕ A) ≤ 2k(A) immediately by the definition of k(A). For general m, a similar argument as above yields that
\[
\left\langle \begin{bmatrix} \alpha_1\\ \vdots\\ \alpha_m\end{bmatrix}, \begin{bmatrix} \beta_1\\ \vdots\\ \beta_m\end{bmatrix}\right\rangle \langle x_1, x_2\rangle = 0
\]

for some scalars α1, ..., αm and β1, ..., βm, where x1 and x2 are similarly defined. Since

the dimension of Cm is m, the number of vectors of the form [α1, ..., αm]ᵀ which are orthogonal to each other is at most m. We infer from this and the above equality that the reversed inequality k(⊕_{j=1}^{m} A) ≤ m · k(A) holds. Therefore we have the asserted equality. ∎

At the end of this section, we apply Proposition 2.3.11 to the quadratic matrices. Recall that an n-by-n quadratic matrix A is unitarily similar to a matrix of the form
\[
aI_{n_1} \oplus bI_{n_2} \oplus \begin{bmatrix} aI_{n_3} & D\\ 0 & bI_{n_3}\end{bmatrix},
\]
where n1, n2, n3 ≥ 0, n1 + n2 + 2n3 = n, D > 0, and a, b ∈ σ(A) (cf. [18, Theorem 2.1]).

Corollary 2.3.12. If A is an n-by-n quadratic matrix of the above form and D is not missing, then k(A) = 2 · #({λ ∈ σ(D) : λ = ‖D‖}).

Proof. If D > 0, then D is unitarily similar to diag(d1, ..., dn3), where d1 = ··· = dp = ‖D‖ ≡ d > dp+1 ≥ ··· ≥ dn3 ≥ 0 (1 ≤ p ≤ n3). Hence A is unitarily similar to a matrix of the form aI_{n1} ⊕ bI_{n2} ⊕ (⊕_{j=1}^{p} B) ⊕ (⊕_{j=p+1}^{n3} Bj), where n1 + n2 + 2n3 = n,
\[
B \equiv \begin{bmatrix} a & d\\ 0 & b\end{bmatrix} \quad\text{and}\quad B_j \equiv \begin{bmatrix} a & d_j\\ 0 & b\end{bmatrix}, \quad j = p + 1, \ldots, n_3.
\]
Since the set {a, b} and all of the numerical ranges W(Bj), j = p + 1, . . . , n3, are contained in the interior of W(B), it follows from [19, Lemma 2.9] that k(A) = k(⊕_{j=1}^{p} B). Since dim Hξ(B) = 1 for all ξ ∈ ∂W(B), we have k(A) = p · k(B) by Proposition 2.3.11. Obviously, k(B) = 2 by [5, Lemma 4.1]. Thus k(A) = 2p as asserted. ∎
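For a concrete instance (our choice of matrix, for illustration only), take the quadratic matrix
\[
A=\begin{bmatrix}0 & 2\\ 0 & 0\end{bmatrix} \oplus \begin{bmatrix}0 & 1\\ 0 & 0\end{bmatrix},
\]
which is unitarily similar to the above canonical form with a = b = 0, n1 = n2 = 0, n3 = 2, and D = diag(2, 1). Here ‖D‖ = 2 is attained by exactly one eigenvalue of D, so Corollary 2.3.12 gives k(A) = 2; this agrees with [19, Lemma 2.9], since the numerical range of the second summand (the disc of radius 1/2) is contained in the interior of that of the first summand (the disc of radius 1).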

We remark that in the preceding proof the equality k(⊕pj=1B) = 2p can also be

established directly. Indeed, the inequality k(⊕pj=1B) ≥ 2p holds trivially and we can

3  Gau-Wu numbers of nonnegative matrices

3.1  Introduction

In Section 3.2 below, we first consider a matrix A of the form
\[
\begin{bmatrix}
0 & A_1 & & \\
 & 0 & \ddots & \\
 & & \ddots & A_{m-1}\\
A_m & & & 0
\end{bmatrix} \qquad (m \ge 2),
\]
where the diagonal zeros are zero square matrices. In this case, we obtain that k(A) has a lower bound m (Proposition 3.2.1) if A has a boundary vector x = ⊕_{k=1}^{m} xk, that is, ⟨Ax, x⟩ ∈ ∂W(A), with all component vectors xk having the same norm 1/√m. Next, we study a nonnegative matrix A of the above form with irreducible real part and Am = 0. Proposition 3.2.3 yields that k(A) ≤ m − 1. Moreover, with the help of [19], we are able to give necessary and sufficient conditions for such a matrix A with the value of k(A) equal to m − 1 (Theorem 3.2.4). Finally, we also consider a nonnegative matrix A of the above form with irreducible real part. Example 3.2.6 shows that no analogous results hold for such an A. In Section 3.3, we consider more special nonnegative matrices, namely, the doubly stochastic matrices. It can be proven that k(A) equals 3 for any 3-by-3 doubly stochastic matrix (Proposition 3.3.2). Moreover, for a 4-by-4 doubly stochastic matrix A, we determine the value of k(A) completely and give a description of its numerical range W(A) (Propositions 3.3.4 and 3.3.5). For general n, we obtain a lower bound of k(A) for an n-by-n doubly stochastic matrix A (Theorems 3.3.6 and 3.3.7). In particular, for an n-by-n irreducible doubly stochastic matrix A, we obtain a necessary and sufficient condition for k(A) to be equal to this lower bound (Theorem 3.3.7).
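The two standing hypotheses of this chapter, entrywise nonnegativity with irreducible real part and double stochasticity, are easy to verify numerically; the following sketch (ours, assuming NumPy and real input matrices; the function names are arbitrary) implements both checks.

    import numpy as np

    def is_doubly_stochastic(A, tol=1e-12):
        # Nonnegative entries with all row sums and column sums equal to one.
        A = np.asarray(A, dtype=float)
        return bool((A >= -tol).all()
                    and np.allclose(A.sum(axis=0), 1.0, atol=tol)
                    and np.allclose(A.sum(axis=1), 1.0, atol=tol))

    def has_irreducible_real_part(A, tol=0.0):
        # For a real nonnegative A, Re A = (A + A^T)/2 is irreducible exactly when
        # the undirected graph on the support of A + A^T is connected.
        A = np.asarray(A, dtype=float)
        n = A.shape[0]
        adjacent = (A + A.T) > tol
        seen, stack = {0}, [0]
        while stack:
            i = stack.pop()
            for j in range(n):
                if adjacent[i, j] and j not in seen:
                    seen.add(j)
                    stack.append(j)
        return len(seen) == n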

We end this section by fixing some notations. For any finite matrix A, its trace, determinant, and spectral radius are denoted by tr A, det A, and r(A), respectively. The number m of eigenvalues z of A with |z| = r(A) is called the index of imprimitivity of A.

3.2  Nonnegative block shift matrix

We start by reviewing a couple of basic facts on a block shift matrix. Recall that a block shift matrix A is one of the form

\[
\begin{bmatrix}
0_1 & A_1 & & \\
 & 0_2 & \ddots & \\
 & & \ddots & A_{m-1}\\
A_m & & & 0_m
\end{bmatrix} \qquad (m \ge 2),
\]

where the diagonal zeros 0j (j = 1, ..., m) are zero square matrices. Let ϕ = 2π/m. Then it is easy to see that the numerical range W(A) is an m-symmetric compact convex region since U*AU = e^{iϕ}A, where U is a unitary matrix of the form
\[
\begin{bmatrix}
e^{i\varphi}I_1 & & & \\
 & e^{2i\varphi}I_2 & & \\
 & & \ddots & \\
 & & & e^{mi\varphi}I_m
\end{bmatrix},
\]

where the diagonal identity matrix Ij is of the same size as the corresponding 0j (j = 1, ..., m). Let ⟨Ax, x⟩ be a boundary point of W(A), where x = ⊕_{k=1}^{m} xk is a unit vector. We define x_{0ϕ} = x and x_{jϕ} = ⊕_{k=1}^{m} e^{i(k−1)jϕ} xk for j = 1, ..., m − 1. With these notations, we can give a lower bound for k(A).

Proposition 3.2.1. Let A be a block shift matrix of the above form with the corresponding notations as above. Then ‖xk‖ is equal to 1/√m for all k = 1, ..., m if and only if the vectors x_{pϕ}, 0 ≤ p ≤ m − 1, are orthonormal. In this case, we have k(A) ≥ m.

Proof. Assume that ⟨x_{pϕ}, x_{qϕ}⟩ = 0 for 0 ≤ p ≠ q ≤ m − 1. This is equivalent to the equation
\[
\|x_1\|^2 + e^{i(p-q)\varphi}\|x_2\|^2 + \cdots + e^{i(m-1)(p-q)\varphi}\|x_m\|^2 = 0
\]
for 0 ≤ p ≠ q ≤ m − 1. That is, e^{iϕ}, ..., e^{i(m−1)ϕ} are the roots of the polynomial
\[
\|x_1\|^2 + \|x_2\|^2 t + \cdots + \|x_m\|^2 t^{m-1}.
\]
Hence each ‖xk‖ is equal to 1/√m for k = 1, ..., m by comparing the coefficients of the above polynomial with those of ‖xm‖² ∏_{j=1}^{m−1}(t − e^{ijϕ}). Conversely, if ‖xk‖ is equal to 1/√m for all k = 1, ..., m, then it is a routine matter to check that x_{pϕ} and x_{qϕ} are orthonormal for 0 ≤ p ≠ q ≤ m − 1. Clearly, in this case, k(A) has a lower bound m. ∎
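For instance, when m = 3 and ϕ = 2π/3, we have ∏_{j=1}^{2}(t − e^{ijϕ}) = t² + t + 1, so e^{iϕ} and e^{2iϕ} are roots of ‖x1‖² + ‖x2‖²t + ‖x3‖²t² precisely when ‖x1‖² = ‖x2‖² = ‖x3‖² = 1/3; in that case x, x_ϕ, and x_{2ϕ} are orthonormal boundary vectors and k(A) ≥ 3.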

Recall that the numerical radius ω(A) of a matrix A is the quantity max{|z| : z ∈ W(A)}. For a nonnegative matrix with irreducible real part, [16, Lemma 1] says that, for ω(A)e^{iθ} in W(A), where θ is a real number with e^{iθ} ≠ 1, (a) if θ is an irrational multiple of 2π, then A is permutationally similar to a matrix of the form
\[
(1)\qquad
\begin{bmatrix}
0 & A_1 & & \\
 & 0 & \ddots & \\
 & & \ddots & A_{m-1}\\
 & & & 0
\end{bmatrix} \qquad (m \ge 2),
\]
where the diagonal zeros are zero square matrices, and, in particular, W(A) is a circular disc centered at the origin, and (b) if θ is a rational multiple of 2π, say, θ = 2πp/q, where p and q are relatively prime integers and q ≥ 2, then A is permutationally similar to
\[
(2)\qquad
\begin{bmatrix}
0 & A_1 & & \\
 & 0 & \ddots & \\
 & & \ddots & A_{q-1}\\
A_q & & & 0
\end{bmatrix} \qquad (q \ge 2),
\]
and, in particular, W(A) = e^{2πi/q}W(A).

The following lemma is a generalization of [19, Lemma 3.6], which is useful for the proof of Proposition 3.2.3. Recall that a vector x with positive components, denoted by x ≻ 0, is called positive.

Lemma 3.2.2. Let A be an n-by-n (n ≥ 2) nonnegative matrix of the form (1) with irreducible real part and m ≥ 2. Then the following hold:

(a) W(A) = {z ∈ C : |z| ≤ ω(A)}.

(b) There is a unique positive unit vector x = x1 ⊕ ··· ⊕ xm ∈ Cn such that ⟨Ax, x⟩ = ω(A).

(c) For any a = ω(A)e^{iθ}, θ ∈ [0, 2π), in ∂W(A), if x_θ = x1 ⊕ e^{iθ}x2 ⊕ ··· ⊕ e^{i(m−1)θ}xm, then a = ⟨Ax_θ, x_θ⟩ and H_a is generated by x_θ.

(d) Let a_j = ω(A)e^{iθ_j} (θ_j ∈ [0, 2π)), j = 1, 2, be two points in ∂W(A) with the corresponding unit vectors x_{θ_j}. Then x_{θ_1} and x_{θ_2} are orthogonal to each other if and only if e^{i(θ_1−θ_2)} is a zero of the polynomial ‖x1‖² + ‖x2‖²t + ··· + ‖xm‖²t^{m−1}.

Proof. Since Uθ*AUθ = e^{iθ}A for any θ, where Uθ = ⊕_{k=1}^{m} e^{i(k−1)θ}Ik, that is, A is unitarily similar to e^{iθ}A for any θ, (a) follows immediately. (b) is a consequence of [11, Proposition 3.3]. To prove (c), note that
\[
a = \omega(A)e^{i\theta} = \langle e^{i\theta}Ax, x\rangle = \langle U_\theta^{*}AU_\theta x, x\rangle = \langle A(U_\theta x), (U_\theta x)\rangle = \langle Ax_\theta, x_\theta\rangle,
\]
which shows that x_θ is in H_a. That dim H_a = 1 is by [11, Corollary 3.10]. Thus H_a is generated by x_θ. (d) follows from the fact that ⟨x_{θ_1}, x_{θ_2}⟩ = Σ_{k=1}^{m} e^{i(k−1)(θ_1−θ_2)}‖xk‖². This completes the proof. ∎
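The polynomial appearing in Lemma 3.2.2(d) is computable directly from A: for a nonnegative A of the form (1) with irreducible real part, the positive boundary vector of part (b) is the Perron eigenvector of Re A = (A + Aᵀ)/2, so its block norms give the coefficients of the polynomial. The sketch below (ours, assuming NumPy; block_sizes lists the sizes of the diagonal zero blocks) returns these coefficients.

    import numpy as np

    def gau_wu_polynomial_coefficients(A, block_sizes):
        # Coefficients ||x_1||^2, ..., ||x_m||^2 of the polynomial in Lemma 3.2.2(d),
        # where x = x_1 (+) ... (+) x_m is the positive unit vector with <Ax, x> = w(A);
        # for nonnegative A with irreducible Re A it is the Perron vector of Re A.
        A = np.asarray(A, dtype=float)
        H = (A + A.T) / 2
        eigvals, eigvecs = np.linalg.eigh(H)
        x = eigvecs[:, -1]            # unit eigenvector of the largest eigenvalue
        if x.sum() < 0:
            x = -x                    # fix the sign so that x is entrywise positive
        coefficients, start = [], 0
        for size in block_sizes:
            xk = x[start:start + size]
            coefficients.append(float(np.dot(xk, xk)))
            start += size
        return coefficients

The unimodular zeros of this polynomial then govern the orthogonality of boundary vectors, and hence k(A), as explained next.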

Thus, for a nonnegative matrix A of the form (1) with irreducible real part, k(A) equals the maximum number of θ1, ..., θk in [0, 2π) for which e^{i(θj−θl)} is a zero of p(t) ≡ ‖x1‖² + ‖x2‖²t + ··· + ‖xm‖²t^{m−1} for all j and l, 1 ≤ j ≠ l ≤ k. If m = 2,
