
SIAM Journal on Optimization, vol. 21, pp. 1392-1417, 2011

Nonsingularity conditions for FB system of nonlinear SDPs


Shujun Bi, Shaohua Pan and Jein-Shan Chen§

February 14, 2011

(revised on July 29, 2011, August 21, 2011)

Abstract. For a locally optimal solution to the nonlinear semidefinite programming problem, under Robinson's constraint qualification, we show that the nonsingularity of Clarke's Jacobian of the Fischer-Burmeister (FB) nonsmooth system is equivalent to the strong regularity of the Karush-Kuhn-Tucker point. Consequently, from Sun's paper (Mathematics of Operations Research, vol. 31, pp. 761-776, 2006), the semismooth Newton method applied to the FB system may attain locally quadratic convergence under the strong second order sufficient condition and constraint nondegeneracy.

Key words: nonlinear semidefinite programming; the FB system; Clarke's Jacobian; nonsingularity; strong regularity.

AMS subject classifications. 90C22, 90C25, 90C31, 65K05

1 Introduction

Let X be a finite dimensional real vector space endowed with an inner product ⟨·, ·⟩ and its induced norm ∥·∥. Consider the nonlinear semidefinite programming (NLSDP)
\[
\begin{array}{cl}
\displaystyle\min_{x \in \mathcal{X}} & f(x)\\
\mathrm{s.t.} & h(x) = 0,\\
& g(x) \in S^n_+,
\end{array} \tag{1}
\]

This work was supported by the National Young Natural Science Foundation (No. 10901058) and the Guangdong Natural Science Foundation (No. 9251802902000001).

Department of Mathematics, South China University of Technology, Guangzhou, China (beamilan@163.com).

Department of Mathematics, South China University of Technology, Guangzhou, China (shhpan@scut.edu.cn).

§Corresponding author. Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author's work is partially supported by the National Science Council of Taiwan. Department of Mathematics, National Taiwan Normal University, Taipei, Taiwan 11677 (jschen@math.ntnu.edu.tw).


where f : X → IR, h : X → IR^m and g : X → S^n are twice continuously differentiable functions, S^n is the linear space of all n × n real symmetric matrices, and S^n_+ is the cone of all n × n positive semidefinite matrices. By introducing a slack variable X ∈ S^n_+ for the conic constraint g(x) ∈ S^n_+, we can rewrite the NLSDP (1) as follows:

\[
\begin{array}{cl}
\displaystyle\min_{(x,X) \in \mathcal{X} \times S^n} & f(x)\\
\mathrm{s.t.} & h(x) = 0,\\
& g(x) - X = 0, \quad X \in S^n_+.
\end{array} \tag{2}
\]

In this paper, we will concentrate on this equivalent formulation of the NLSDP (1).

The Karush-Kuhn-Tucker (KKT) condition for the NLSDP (2) takes the form
\[
\mathcal{J}_{x,X}L(x, X, \mu, S, Y) = 0, \quad h(x) = 0, \quad g(x) - X = 0, \quad -Y \in N_{S^n_+}(X), \tag{3}
\]
where the Lagrangian function L : X × S^n × IR^m × S^n × S^n → IR is defined by
\[
L(x, X, \mu, S, Y) := f(x) + \langle \mu, h(x)\rangle + \langle S, g(x) - X\rangle - \langle X, Y\rangle,
\]
J_{x,X}L(x, X, µ, S, Y) is the derivative of L at (x, X, µ, S, Y) with respect to (x, X), and N_{S^n_+}(X) denotes the normal cone of S^n_+ at X in the sense of convex analysis [16]:
\[
N_{S^n_+}(X) =
\begin{cases}
\{Z \in S^n : \langle Z, W - X\rangle \le 0 \ \ \forall\, W \in S^n_+\} & \text{if } X \in S^n_+,\\
\emptyset & \text{if } X \notin S^n_+.
\end{cases}
\]

Recall that Φ : S^n × S^n → S^n is a semidefinite cone (SDC) complementarity function if
\[
\Phi(X, Y) = 0 \iff X \in S^n_+,\ Y \in S^n_+,\ \langle X, Y\rangle = 0 \iff -Y \in N_{S^n_+}(X).
\]

Then, with an SDC complementarity function Φ, the KKT optimality conditions in (3) can be reformulated as the following nonsmooth system:

\[
E(x, X, \mu, S, Y) :=
\begin{pmatrix}
\mathcal{J}_{x,X}L(x, X, \mu, S, Y)\\
h(x)\\
g(x) - X\\
\Phi(X, Y)
\end{pmatrix}
= 0. \tag{4}
\]

The most popular SDC complementarity functions include the matrix-valued natural residual (NR) function and the Fischer-Burmeister (FB) function, which are defined as
\[
\Phi_{NR}(X, Y) := X - \Pi_{S^n_+}(X - Y) \quad \forall X, Y \in S^n \quad\text{and}
\]
\[
\Phi_{FB}(X, Y) := (X + Y) - \sqrt{X^2 + Y^2} \quad \forall X, Y \in S^n, \tag{5}
\]
respectively, where Π_{S^n_+}(·) denotes the projection operator onto S^n_+. It turns out that Φ_FB has almost all favorable properties of Φ_NR (see [21]). Also, the squared norm of Φ_FB induces a continuously differentiable merit function whose derivative is globally Lipschitz continuous [18, 24]. This greatly facilitates the globalization of the semismooth Newton method [14, 15] for solving the FB system of (2). The FB system and the NR system mean E_FB(x, X, µ, S, Y) = 0 and E_NR(x, X, µ, S, Y) = 0, respectively, with the mappings E_FB and E_NR defined as in E except that Φ is specified as Φ_FB and Φ_NR, respectively.
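The two functions in (5) are easy to evaluate numerically. The sketch below is my own illustration (it assumes SciPy's `sqrtm` for the matrix square root and an eigenvalue-clipping projection for Π_{S^n_+}); it checks the defining equivalence Φ(X, Y) = 0 ⇔ X, Y ∈ S^n_+, ⟨X, Y⟩ = 0 on a complementary pair.

```python
import numpy as np
from scipy.linalg import sqrtm

def proj_psd(M):
    """Projection onto S^n_+ via eigenvalue clipping."""
    w, Q = np.linalg.eigh((M + M.T) / 2)
    return Q @ np.diag(np.clip(w, 0.0, None)) @ Q.T

def phi_nr(X, Y):
    """Natural residual: X - Pi_{S^n_+}(X - Y)."""
    return X - proj_psd(X - Y)

def phi_fb(X, Y):
    """Fischer-Burmeister: (X + Y) - sqrt(X^2 + Y^2)."""
    return X + Y - np.real(sqrtm(X @ X + Y @ Y))

# A complementary pair: X, Y psd with a common eigenbasis and disjoint supports,
# so that X, Y >= 0 and <X, Y> = 0; both residuals should vanish.
Q, _ = np.linalg.qr(np.random.default_rng(1).standard_normal((4, 4)))
X = Q @ np.diag([2.0, 1.0, 0.0, 0.0]) @ Q.T
Y = Q @ np.diag([0.0, 0.0, 3.0, 0.5]) @ Q.T
print(np.linalg.norm(phi_fb(X, Y)), np.linalg.norm(phi_nr(X, Y)))  # both ~ 1e-15
```

Both printed norms are at rounding-error level, as the characterization predicts.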

Strong regularity is one of the important concepts in sensitivity and perturbation analysis, introduced by Robinson in his seminal paper [17]. For the NLSDP (1), Sun [22] offered a characterization of strong regularity via the study of the nonsingularity of Clarke's Jacobian of the NR system under the strong second order sufficient condition and constraint nondegeneracy, and established its equivalence to other characterizations discussed in a wide range of literature. Later, for the linear SDP, Chan and Sun [3] obtained more insightful characterizations of strong regularity, again via the study of the nonsingularity of Clarke's Jacobian of the NR system. It is then natural for us to ask: is it possible to give a characterization of the strong regularity of NLSDPs by studying the nonsingularity of Clarke's Jacobian of the FB system? Note that, up to now, it has not even been known whether the B-subdifferential of the FB system is nonsingular without strict complementarity of locally optimal solutions.

In this work, for a locally optimal solution to the NLSDP (2), we prove that under Robinson's constraint qualification, the nonsingularity of Clarke's Jacobian of the FB system is equivalent to the strong regularity of the KKT point, which by [22, Theorem 4.1] is further equivalent to the strong second order sufficient condition and constraint nondegeneracy. This result is of interest since, on one hand, it relates the nonsingularity of Clarke's Jacobian of the FB system to Robinson's strong regularity condition and, on the other hand, it allows us to obtain the quadratic convergence of the semismooth Newton method [15, 14] for the FB system without the strict complementarity assumption. In addition, it extends the result of [9, Corollary 3.7] for variational inequalities with polyhedral cone constraints to the setting of semidefinite cones. It is worthwhile to point out that [22, Theorem 4.1] plays a key role in achieving this objective.

Throughout this paper, J_z f(z) and J²_{zz} f(z) denote the derivative and the second order derivative, respectively, of a twice differentiable function f with respect to z, and I denotes an identity operator. For any n × m real matrices A and B, ⟨A, B⟩ means their Frobenius inner product, and ∥A∥ denotes the norm of A induced by the Frobenius inner product. For X ∈ S^n, we write X ≽ 0 (respectively, X ≻ 0) to mean X ∈ S^n_+ (respectively, X ∈ S^n_{++}). For a linear operator A, we denote by A^* the adjoint of A, and by ∥A∥₂ the operator norm of A. For a linear operator A : S^n → S^n, we write A ≽ 0 (respectively, A ≻ 0) if ⟨W, A(W)⟩ ≥ 0 for any W ∈ S^n (respectively, ⟨W, A(W)⟩ > 0 for any nonzero W ∈ S^n). For any given index sets α and β, we designate by A_{αβ} the submatrix of A whose row indices belong to α and column indices belong to β, and use |α| to denote the number of elements in the set α.

2 Preliminary results

Let X and Y be two arbitrary finite dimensional real vector spaces, each equipped with a scalar product ⟨·, ·⟩ and its induced norm ∥·∥. Let O be an open set in X and Ξ : O → Y be a locally Lipschitz continuous function on the set O. By Rademacher's theorem, Ξ is almost everywhere F(réchet)-differentiable in O. We denote by D_Ξ the set of points in O where Ξ is F-differentiable. Then Clarke's Jacobian of Ξ at x is well defined [6]:

\[
\partial \Xi(x) := \mathrm{conv}\{\partial_B \Xi(x)\},
\]
where "conv" means the convex hull, and ∂_B Ξ(x) is the B-subdifferential of Ξ at x:
\[
\partial_B \Xi(x) := \Big\{ V : V = \lim_{k\to\infty} \mathcal{J}_x \Xi(x^k),\ x^k \to x,\ x^k \in D_\Xi \Big\}.
\]

For the concepts of (strong) semismoothness, please refer to the literature [15, 14, 20].

The following matrix inequalities are used in the proof of Lemma 3.3; see Appendix.

Lemma 2.1 For any n × m real matrices A, B and any Z ∈ S^n_+, it holds that
\[
(A + B)^T Z (A + B) \preceq 2\big(A^T Z A + B^T Z B\big), \tag{6}
\]
\[
(A - B)^T Z (A - B) \preceq 2\big(A^T Z A + B^T Z B\big). \tag{7}
\]

Proof. Fix any Z ∈ S^n_+. Then, for any n × m real matrices A and B, we have that
\[
0 \preceq (A - B)^T Z (A - B) = \big(A^T Z A + B^T Z B\big) - \big(A^T Z B + B^T Z A\big),
\]
\[
0 \preceq (A + B)^T Z (A + B) = \big(A^T Z A + B^T Z B\big) + \big(A^T Z B + B^T Z A\big).
\]
The first relation means that A^T Z B + B^T Z A ≼ A^T Z A + B^T Z B, which along with the second relation yields (6). The second relation implies that −(A^T Z B + B^T Z A) ≼ A^T Z A + B^T Z B, which along with the first relation yields (7). □

Lemma 2.2 Let X, Y ∈ S^n with X² + Y² ≻ 0. Then for any n × m real matrices A, B,
\[
A^T A + B^T B - (A^T X + B^T Y)(X^2 + Y^2)^{-1}(XA + YB) \succeq 0.
\]

Proof. Note that A^T A + B^T B − (A^T X + B^T Y)(X² + Y²)^{-1}(XA + YB) is the Schur complement of X² + Y² in the following block symmetric matrix
\[
\Sigma = \begin{bmatrix} X^2 + Y^2 & XA + YB\\ (XA + YB)^T & A^T A + B^T B \end{bmatrix}.
\]
We only need to prove Σ ⪰ 0 (see [10, Theorem 7.7.6]). For any ζ = (ζ₁, ζ₂) ∈ IR^n × IR^m,
\[
\zeta^T \Sigma \zeta = \zeta_1^T (X^2 + Y^2)\zeta_1 + 2\zeta_1^T (XA + YB)\zeta_2 + \zeta_2^T (A^T A + B^T B)\zeta_2
= \|X\zeta_1 + A\zeta_2\|^2 + \|Y\zeta_1 + B\zeta_2\|^2 \ge 0,
\]
which shows that Σ ⪰ 0. The proof is then complete. □

For any given X ∈ S^n, let L_X : S^n → S^n be the Lyapunov operator associated with X:
\[
L_X(Y) := XY + YX \quad \forall Y \in S^n.
\]
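For numerical experiments, the Lyapunov operator L_X can be represented by the Kronecker sum I ⊗ X + X ⊗ I acting on the vectorization of Y (a standard identity, valid here because X is symmetric); the following small sketch, which is not part of the paper, verifies it.

```python
import numpy as np

def lyap_matrix(X):
    """Matrix of L_X acting on vec(Y) (row-major ravel); valid for symmetric X."""
    n = X.shape[0]
    return np.kron(np.eye(n), X) + np.kron(X, np.eye(n))

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4)); X = (X + X.T) / 2
Y = rng.standard_normal((4, 4)); Y = (Y + Y.T) / 2
lhs = (X @ Y + Y @ X).ravel()
rhs = lyap_matrix(X) @ Y.ravel()
print(np.linalg.norm(lhs - rhs))   # ~ 1e-15
```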

We next study several properties of the Lyapunov operators associated with X, Y ∈ S^n and Z ∈ S^n_+ with Z² ⪰ X² + Y². To this end, we need to establish two trace inequalities.

Lemma 2.3 Let X, Y ∈ S^n with X ⪰ |Y|. Then, for any W ∈ S^n, it holds that
\[
\mathrm{Trace}(WXWX) \ge \mathrm{Trace}(WYWY).
\]

Proof. Fix any W ∈ S^n. By the trace property of symmetric matrices, we have that
\[
\begin{aligned}
\mathrm{Trace}(WXWX) - \mathrm{Trace}(WYWY)
&= \mathrm{Trace}\big[WXW(X-Y)\big] + \mathrm{Trace}\big[W(X-Y)WY\big]\\
&= \mathrm{Trace}\big[W(X-Y)WX\big] + \mathrm{Trace}\big[W(X-Y)WY\big]\\
&= \mathrm{Trace}\big[W(X-Y)W(X+Y)\big].
\end{aligned}
\]
Since X ⪰ |Y|, we have W(X − Y)W ⪰ 0 and X + Y ⪰ 0. From [10, Theorem 7.6.3], it then follows that Trace[W(X − Y)W(X + Y)] ≥ 0. The result is thus proved. □

Lemma 2.4 For any given X, Y ∈ S^n and Z ∈ S^n_+ satisfying Z ⪰ √(X² + Y²), we have
\[
\mathrm{Trace}(WZWZ) \ge \mathrm{Trace}(W|X|W|X|) + \mathrm{Trace}(W|Y|W|Y|) \quad \forall W \in S^n.
\]

Proof. Fix any W ∈ S^n. Applying Lemma 2.3, we readily obtain that
\[
\mathrm{Trace}(WZWZ) \ \ge\ \mathrm{Trace}\Big(W\sqrt{X^2 + Y^2}\,W\sqrt{X^2 + Y^2}\Big). \tag{8}
\]
In addition, from [1, Theorem IX.6.1], we know that φ(A, B) := Trace(W√A W√B) is a jointly concave function on S^n_+ × S^n_+, which means that for any A₁, A₂, B₁, B₂ ∈ S^n_+,
\[
\varphi\Big(\frac{A_1 + A_2}{2}, \frac{B_1 + B_2}{2}\Big) \ \ge\ \frac{1}{2}\big[\varphi(A_1, B_1) + \varphi(A_2, B_2)\big].
\]
Using this inequality with A₁ = B₁ = X² and A₂ = B₂ = Y², we obtain that
\[
2\,\varphi\Big(\frac{X^2 + Y^2}{2}, \frac{X^2 + Y^2}{2}\Big) \ \ge\ \mathrm{Trace}(W|X|W|X|) + \mathrm{Trace}(W|Y|W|Y|).
\]

This, together with the definition of φ and inequality (8), implies the result. □
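A small numerical illustration of Lemma 2.4 (again my own sketch, not from the paper): choosing Z as √(X² + Y²) plus a positive semidefinite perturbation satisfies the hypothesis, and the trace gap below should be nonnegative.

```python
import numpy as np
from scipy.linalg import sqrtm

def sym(M):  return (M + M.T) / 2
def abs_mat(M):
    """Matrix absolute value |M| = (M^2)^{1/2} for symmetric M."""
    w, Q = np.linalg.eigh(sym(M))
    return Q @ np.diag(np.abs(w)) @ Q.T

rng = np.random.default_rng(2)
n = 5
X, Y, W = (sym(rng.standard_normal((n, n))) for _ in range(3))
E = rng.standard_normal((n, n)); E = E @ E.T     # psd perturbation
Z = np.real(sqrtm(X @ X + Y @ Y)) + E            # Z >= sqrt(X^2 + Y^2), Z psd

gap = (np.trace(W @ Z @ W @ Z)
       - np.trace(W @ abs_mat(X) @ W @ abs_mat(X))
       - np.trace(W @ abs_mat(Y) @ W @ abs_mat(Y)))
print(gap)   # >= 0 up to rounding error
```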

The following proposition, extending the result of [8, Proposition 3.4] associated with second-order cones to SDCs, is used to prove Proposition 2.2. Among others, Proposition 2.2 is the key to characterizing the properties of Clarke's Jacobian of Φ_FB; see Section 4.

Proposition 2.1 For any given X, Y ∈ S^n and Z ∈ S^n_+, the following implication holds:

\[
Z^2 \succeq X^2 + Y^2 \ \Longrightarrow\ L_Z^2 \succeq L_X^2 + L_Y^2.
\]

Proof. Since Z² ⪰ X² + Y² and Z ∈ S^n_+, from [1, Proposition V.1.8] it follows that Z ⪰ √(X² + Y²). Now choose a matrix W ∈ S^n arbitrarily. Then, a simple computation yields that
\[
\begin{aligned}
\langle W, (L_Z^2 - L_X^2 - L_Y^2)W\rangle
&= 2\big[\mathrm{Trace}(WZWZ) + \mathrm{Trace}(W^2Z^2) - \mathrm{Trace}(WXWX)\\
&\qquad\ - \mathrm{Trace}(W^2X^2) - \mathrm{Trace}(W^2Y^2) - \mathrm{Trace}(WYWY)\big]\\
&= 2\big[\mathrm{Trace}\big(W^2(Z^2 - X^2 - Y^2)\big) + \mathrm{Trace}(WZWZ)\\
&\qquad\ - \mathrm{Trace}(WXWX) - \mathrm{Trace}(WYWY)\big]\\
&\ge 2\big[\mathrm{Trace}(WZWZ) - \mathrm{Trace}(WXWX) - \mathrm{Trace}(WYWY)\big]\\
&\ge 0,
\end{aligned}
\]
where the first inequality is due to Z² ⪰ X² + Y², and the second one uses Z ⪰ √(X² + Y²) and Lemmas 2.4 and 2.3. Since W is arbitrary in S^n, the result follows. □

Proposition 2.2 For any given X, Y ∈ S^n and Z ∈ S^n_{++}, define A : S^n × S^n → S^n by
\[
\mathcal{A}(\triangle U, \triangle V) := L_Z^{-1}L_X(\triangle U) + L_Z^{-1}L_Y(\triangle V) \quad \forall \triangle U, \triangle V \in S^n.
\]
If Z² ⪰ X² + Y², then the linear operator A satisfies ∥A∥₂ ≤ 1, and consequently
\[
\big\|L_Z^{-1}L_X(\triangle U) + L_Z^{-1}L_Y(\triangle V)\big\| \ \le\ \sqrt{\|\triangle U\|^2 + \|\triangle V\|^2} \quad \forall \triangle U, \triangle V \in S^n. \tag{9}
\]

Proof. Assume that Z² ⪰ X² + Y². By the definition of A and Proposition 2.1, we have
\[
\mathcal{A}\mathcal{A}^* = L_Z^{-1}\big(L_X^2 + L_Y^2\big)L_Z^{-1} \ \preceq\ L_Z^{-1}L_Z^2L_Z^{-1} = \mathcal{I}.
\]
This means that the largest eigenvalue of AA^* is no greater than 1, and consequently,
\[
\|\mathcal{A}\|_2 = \sqrt{\|\mathcal{A}^*\mathcal{A}\|_2} = \sqrt{\lambda_{\max}(\mathcal{A}^*\mathcal{A})} = \sqrt{\lambda_{\max}(\mathcal{A}\mathcal{A}^*)} \le 1.
\]
This completes the proof of the first part. By the definition of the operator norm, we have
\[
\big\|L_Z^{-1}L_X(\triangle U) + L_Z^{-1}L_Y(\triangle V)\big\| = \|\mathcal{A}(\triangle U, \triangle V)\| \le \|\mathcal{A}\|_2\,\|(\triangle U, \triangle V)\|.
\]
Together with the first part, we prove that inequality (9) holds. □
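The conclusions of Propositions 2.1 and 2.2 can be probed numerically. The sketch below is my own illustration (SciPy's Sylvester solver plays the role of L_Z^{-1}); it draws random symmetric data with Z² ⪰ X² + Y² and checks the quadratic form of Proposition 2.1 and inequality (9) on random directions.

```python
import numpy as np
from scipy.linalg import sqrtm, solve_sylvester

def sym(M): return (M + M.T) / 2

def L(X, W):        # Lyapunov operator L_X(W) = XW + WX
    return X @ W + W @ X

def Linv(Z, W):     # L_Z^{-1}(W): solve Z V + V Z = W
    return solve_sylvester(Z, Z, W)

rng = np.random.default_rng(4)
n = 5
X, Y = sym(rng.standard_normal((n, n))), sym(rng.standard_normal((n, n)))
E = rng.standard_normal((n, n)); E = E @ E.T + 1e-6 * np.eye(n)
Z = np.real(sqrtm(X @ X + Y @ Y + E))        # Z in S^n_++ with Z^2 >= X^2 + Y^2

for _ in range(5):
    W, dU, dV = (sym(rng.standard_normal((n, n))) for _ in range(3))
    # Proposition 2.1: <W, (L_Z^2 - L_X^2 - L_Y^2) W> >= 0
    q = (np.tensordot(W, L(Z, L(Z, W))) - np.tensordot(W, L(X, L(X, W)))
         - np.tensordot(W, L(Y, L(Y, W))))
    # Inequality (9): ||L_Z^{-1}L_X(dU) + L_Z^{-1}L_Y(dV)|| <= sqrt(||dU||^2 + ||dV||^2)
    lhs = np.linalg.norm(Linv(Z, L(X, dU)) + Linv(Z, L(Y, dV)))
    rhs = np.sqrt(np.linalg.norm(dU) ** 2 + np.linalg.norm(dV) ** 2)
    print(q >= -1e-8, lhs <= rhs + 1e-8)
```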

Let α, β and γ be disjoint index sets with α ∪ β ∪ γ = {1, 2, ..., n}. Define
\[
\Gamma(X, Y) := \big(X_{\beta\beta}^2 + Y_{\beta\beta}^2 + X_{\beta\gamma}X_{\gamma\beta} + Y_{\beta\alpha}Y_{\alpha\beta}\big)^{1/2} \quad \forall X, Y \in S^n. \tag{10}
\]
The following property of the function Γ will be used in the subsequent sections.

Proposition 2.3 Let X, Y ∈ S^n be such that Γ(X, Y) ≻ 0. Then for any G, H ∈ S^n,
\[
\big\|L_{\Gamma(X,Y)}^{-1}\big(X_{\beta\gamma}G_{\gamma\beta} + G_{\beta\gamma}X_{\gamma\beta}\big)\big\| \ \le\ 2\sqrt{|\beta||\gamma|}\;\|G_{\gamma\beta}\|,
\]
\[
\big\|L_{\Gamma(X,Y)}^{-1}\big(Y_{\beta\alpha}H_{\alpha\beta} + H_{\beta\alpha}Y_{\alpha\beta}\big)\big\| \ \le\ 2\sqrt{|\beta||\alpha|}\;\|H_{\alpha\beta}\|.
\]

Proof. Let Γ(X, Y) = Q_β diag(λ₁, ..., λ_{|β|}) Q_β^T be the spectral decomposition of Γ(X, Y), where λ_i > 0 for each i. Let Q_γ and Q_α be arbitrary but fixed |γ| × |γ| and |α| × |α| orthogonal matrices, respectively. Define X̃_βγ := Q_β^T X_βγ Q_γ and Ỹ_βα := Q_β^T Y_βα Q_α. Then, from the expression of Γ(X, Y) and its spectral decomposition, it is easy to get that
\[
\lambda_i^2 \ \ge\ \sum_{k=1}^{|\gamma|} \widetilde X_{ik}^2 + \sum_{l=1}^{|\alpha|} \widetilde Y_{il}^2 \quad \text{for all } i = 1, \ldots, |\beta|.
\]
This means that for 1 ≤ k ≤ |γ|, 1 ≤ l ≤ |α|, 1 ≤ i ≤ |β| and 1 ≤ j ≤ |β|,
\[
\frac{|\widetilde X_{ik}|}{\lambda_i + \lambda_j} \le 1, \qquad \frac{|\widetilde X_{kj}|}{\lambda_i + \lambda_j} \le 1, \qquad \frac{|\widetilde Y_{il}|}{\lambda_i + \lambda_j} \le 1, \qquad \frac{|\widetilde Y_{lj}|}{\lambda_i + \lambda_j} \le 1. \tag{11}
\]
For any G, H ∈ S^n, with G̃_βγ = Q_β^T G_βγ Q_γ and H̃_βα = Q_β^T H_βα Q_α, we calculate that
\[
Q_\beta^T L_{\Gamma(X,Y)}^{-1}\big(X_{\beta\gamma}G_{\gamma\beta} + G_{\beta\gamma}X_{\gamma\beta}\big)Q_\beta
= \left[\frac{\sum_{k=1}^{|\gamma|}\big(\widetilde X_{ik}\widetilde G_{kj} + \widetilde G_{ik}\widetilde X_{kj}\big)}{\lambda_i + \lambda_j}\right]_{1\le i,j\le |\beta|},
\]
\[
Q_\beta^T L_{\Gamma(X,Y)}^{-1}\big(Y_{\beta\alpha}H_{\alpha\beta} + H_{\beta\alpha}Y_{\alpha\beta}\big)Q_\beta
= \left[\frac{\sum_{l=1}^{|\alpha|}\big(\widetilde Y_{il}\widetilde H_{lj} + \widetilde H_{il}\widetilde Y_{lj}\big)}{\lambda_i + \lambda_j}\right]_{1\le i,j\le |\beta|}.
\]
Using the inequalities in (11) and noting that the Frobenius norm is orthogonally invariant, from the last two equalities we obtain the desired result. □

In the subsequent sections, we always use C : S^n × S^n → S^n to denote the function
\[
C(X, Y) := \sqrt{X^2 + Y^2} \quad \forall X, Y \in S^n, \tag{12}
\]
and for any given X, Y ∈ S^n assume that C(X, Y) has the spectral decomposition
\[
C(X, Y) = P\,\mathrm{diag}(\lambda_1, \ldots, \lambda_n)\,P^T = PDP^T, \tag{13}
\]
where P is an n × n orthogonal matrix, and D = diag(λ₁, ..., λ_n) with λ_i ≥ 0 for all i. Define the index sets κ and β associated with the eigenvalues of C(X, Y) by
\[
\kappa := \{i : \lambda_i > 0\} \quad\text{and}\quad \beta := \{i : \lambda_i = 0\}.
\]
Then, by permuting the rows and columns of C(X, Y) if necessary, we may assume that
\[
D = \begin{bmatrix} D_\kappa & 0\\ 0 & D_\beta \end{bmatrix} = \begin{bmatrix} D_\kappa & 0\\ 0 & 0 \end{bmatrix}.
\]
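For a concrete feel of the objects in (12)-(13), the following small sketch (illustrative only, not from the paper) builds a pair (X, Y) with a common null direction, computes C(X, Y) through an eigenvalue decomposition of X² + Y², and extracts the index sets κ and β; by Corollary 3.1 below, β ≠ ∅ is exactly the situation where Φ_FB fails to be F-differentiable at (X, Y).

```python
import numpy as np

# X, Y in S^3 with a common null direction, so C(X, Y) = (X^2 + Y^2)^{1/2} is singular.
X = np.diag([2.0, 1.0, 0.0])
Y = np.array([[0.0, 1.0, 0.0],
              [1.0, 3.0, 0.0],
              [0.0, 0.0, 0.0]])
w, P = np.linalg.eigh(X @ X + Y @ Y)        # X^2 + Y^2 = P diag(w) P^T
lam = np.sqrt(np.clip(w, 0.0, None))        # eigenvalues of C(X, Y), as in (13)
kappa = np.where(lam > 1e-10)[0]            # kappa = {i : lam_i > 0}
beta  = np.where(lam <= 1e-10)[0]           # beta  = {i : lam_i = 0}
print(lam, kappa, beta)
```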

3 Directional derivative and B-subdifferential

The function Φ_FB is directionally differentiable everywhere in S^n × S^n; see [21, Corollary 2.3]. But, to the best of our knowledge, the expression of its directional derivative is not given in the literature. Next we derive it and use it to show that the B-subdifferential of Φ_FB at a general point coincides with that of its directional derivative function at the origin.²

²When we were preparing this manuscript, we learned that these results were obtained by Zhang, Zhang and Pang (see [26]) via the singular value decomposition. In contrast, we achieve them independently by the eigenvalue decomposition in order to obtain Proposition 3.2.

Proposition 3.1 For any given X, Y ∈ S^n, let C(X, Y) have the spectral decomposition as in (13). Then, the directional derivative Φ'_FB((X, Y); (G, H)) of Φ_FB at (X, Y) with the direction (G, H) ∈ S^n × S^n has the following expression
\[
(G + H) - P
\begin{bmatrix}
L_{D_\kappa}^{-1}\big(L_{\widetilde X_{\kappa\kappa}}(\widetilde G_{\kappa\kappa}) + L_{\widetilde Y_{\kappa\kappa}}(\widetilde H_{\kappa\kappa})\big) &
D_\kappa^{-1}\big(\widetilde X_{\kappa\kappa}\widetilde G_{\kappa\beta} + \widetilde Y_{\kappa\kappa}\widetilde H_{\kappa\beta}\big)\\[2pt]
\big(\widetilde G_{\beta\kappa}\widetilde X_{\kappa\kappa} + \widetilde H_{\beta\kappa}\widetilde Y_{\kappa\kappa}\big)D_\kappa^{-1} &
\Theta(\widetilde G, \widetilde H)
\end{bmatrix}
P^T, \tag{14}
\]
where X̃ := P^T X P, Ỹ := P^T Y P, G̃ := P^T G P, H̃ := P^T H P, and Θ is defined by
\[
\Theta(U, V) := \Big[\,U_{\beta\beta}^2 + V_{\beta\beta}^2 + U_{\beta\kappa}U_{\kappa\beta} + V_{\beta\kappa}V_{\kappa\beta}
- \big(U_{\beta\kappa}\widetilde X_{\kappa\kappa} + V_{\beta\kappa}\widetilde Y_{\kappa\kappa}\big)D_\kappa^{-2}\big(\widetilde X_{\kappa\kappa}U_{\kappa\beta} + \widetilde Y_{\kappa\kappa}V_{\kappa\beta}\big)\Big]^{1/2} \quad \forall U, V \in S^n. \tag{15}
\]

Proof. Fix any G, H ∈ S^n. Assume that (X, Y) ≠ (0, 0). Then, for any t > 0, we have
\[
\Phi_{FB}(X + tG, Y + tH) - \Phi_{FB}(X, Y) = t(G + H) - \triangle(t) \tag{16}
\]
with

\[
\triangle(t) \equiv \big[\,C^2(X, Y) + t\big(L_X(G) + L_Y(H)\big) + t^2(G^2 + H^2)\big]^{1/2} - C(X, Y).
\]
Let X̃, Ỹ, G̃ and H̃ be defined as in the proposition. It is easy to see that
\[
\widetilde{\triangle}(t) := P^T \triangle(t) P = (D^2 + \widetilde W)^{1/2} - D, \tag{17}
\]
where
\[
\widetilde W = t\big(\widetilde X\widetilde G + \widetilde G\widetilde X + \widetilde Y\widetilde H + \widetilde H\widetilde Y\big) + t^2\big(\widetilde G^2 + \widetilde H^2\big).
\]
Since X̃² + Ỹ² = D² and D_β = 0, we have X̃ = diag(X̃_κκ, 0) and Ỹ = diag(Ỹ_κκ, 0). So,
\[
\widetilde W = t
\begin{bmatrix}
L_{\widetilde X_{\kappa\kappa}}(\widetilde G_{\kappa\kappa}) + L_{\widetilde Y_{\kappa\kappa}}(\widetilde H_{\kappa\kappa}) &
\widetilde X_{\kappa\kappa}\widetilde G_{\kappa\beta} + \widetilde Y_{\kappa\kappa}\widetilde H_{\kappa\beta}\\
\widetilde G_{\beta\kappa}\widetilde X_{\kappa\kappa} + \widetilde H_{\beta\kappa}\widetilde Y_{\kappa\kappa} & 0
\end{bmatrix}
+
\begin{bmatrix}
o(t) & o(t)\\
o(t) & t^2\big(\widetilde G_{\beta\beta}^2 + \widetilde H_{\beta\beta}^2 + \widetilde G_{\beta\kappa}\widetilde G_{\kappa\beta} + \widetilde H_{\beta\kappa}\widetilde H_{\kappa\beta}\big)
\end{bmatrix}.
\]

By equation (17) and [24, Lemma 6.2], we know that
\[
\widetilde{\triangle}(t)_{\kappa\kappa} = L_{D_\kappa}^{-1}(\widetilde W_{\kappa\kappa}) + o(\|\widetilde W\|), \qquad
\widetilde{\triangle}(t)_{\kappa\beta} = D_\kappa^{-1}\widetilde W_{\kappa\beta} + o(\|\widetilde W\|), \qquad
\widetilde W_{\beta\beta} = \widetilde{\triangle}(t)_{\kappa\beta}^T\widetilde{\triangle}(t)_{\kappa\beta} + \widetilde{\triangle}(t)_{\beta\beta}^2. \tag{18}
\]
From the second equality of (18) and the expression of W̃_κβ, it follows that
\[
\widetilde{\triangle}(t)_{\kappa\beta} = tD_\kappa^{-1}\big(\widetilde X_{\kappa\kappa}\widetilde G_{\kappa\beta} + \widetilde Y_{\kappa\kappa}\widetilde H_{\kappa\beta}\big) + o(t), \tag{19}
\]
and consequently,
\[
\widetilde{\triangle}(t)_{\kappa\beta}^T\widetilde{\triangle}(t)_{\kappa\beta}
= t^2\big(\widetilde G_{\beta\kappa}\widetilde X_{\kappa\kappa} + \widetilde H_{\beta\kappa}\widetilde Y_{\kappa\kappa}\big)D_\kappa^{-2}\big(\widetilde X_{\kappa\kappa}\widetilde G_{\kappa\beta} + \widetilde Y_{\kappa\kappa}\widetilde H_{\kappa\beta}\big) + o(t^2).
\]
This, together with the third equation of (18) and the expression of W̃_ββ, implies that
\[
\widetilde{\triangle}(t)_{\beta\beta}^2
= t^2\big(\widetilde G_{\beta\kappa}\widetilde G_{\kappa\beta} + \widetilde H_{\beta\kappa}\widetilde H_{\kappa\beta} + \widetilde G_{\beta\beta}^2 + \widetilde H_{\beta\beta}^2\big)
- t^2\big(\widetilde G_{\beta\kappa}\widetilde X_{\kappa\kappa} + \widetilde H_{\beta\kappa}\widetilde Y_{\kappa\kappa}\big)D_\kappa^{-2}\big(\widetilde X_{\kappa\kappa}\widetilde G_{\kappa\beta} + \widetilde Y_{\kappa\kappa}\widetilde H_{\kappa\beta}\big) + o(t^2).
\]
Since D_β = 0, the expression of △̃(t) in (17) implies that △̃(t)_ββ ⪰ 0. Therefore,
\[
\lim_{t\downarrow 0}\frac{\widetilde{\triangle}(t)_{\beta\beta}}{t}
= \lim_{t\downarrow 0}\frac{\big[\widetilde{\triangle}(t)_{\beta\beta}^2\big]^{1/2}}{t}
= \Theta(\widetilde G, \widetilde H).
\]
In addition, from the first equation in (18) and the expression of W̃_κκ, we have
\[
\widetilde{\triangle}(t)_{\kappa\kappa} = tL_{D_\kappa}^{-1}\big(L_{\widetilde X_{\kappa\kappa}}(\widetilde G_{\kappa\kappa}) + L_{\widetilde Y_{\kappa\kappa}}(\widetilde H_{\kappa\kappa})\big) + o(t).
\]
Combining the last two equations with (19), we immediately obtain that
\[
\lim_{t\downarrow 0}\frac{\widetilde{\triangle}(t)}{t}
= \begin{bmatrix}
L_{D_\kappa}^{-1}\big(L_{\widetilde X_{\kappa\kappa}}(\widetilde G_{\kappa\kappa}) + L_{\widetilde Y_{\kappa\kappa}}(\widetilde H_{\kappa\kappa})\big) &
D_\kappa^{-1}\big(\widetilde X_{\kappa\kappa}\widetilde G_{\kappa\beta} + \widetilde Y_{\kappa\kappa}\widetilde H_{\kappa\beta}\big)\\[2pt]
\big(\widetilde G_{\beta\kappa}\widetilde X_{\kappa\kappa} + \widetilde H_{\beta\kappa}\widetilde Y_{\kappa\kappa}\big)D_\kappa^{-1} &
\Theta(\widetilde G, \widetilde H)
\end{bmatrix}.
\]

This, along with (16), shows that Φ'_FB((X, Y); (G, H)) has the expression given by (14).

When (X, Y) = (0, 0), by the positive homogeneity of Φ_FB, we immediately have
\[
\Phi_{FB}'((X, Y); (G, H)) = (G + H) - \sqrt{G^2 + H^2} = \Phi_{FB}(G, H).
\]
Note that this is a special case of (14) with κ = ∅. The result then follows. □

Note that the function Θ in (15) is always well defined since, by Lemma 2.2,

\[
U_{\beta\kappa}U_{\kappa\beta} + V_{\beta\kappa}V_{\kappa\beta} - \big(U_{\beta\kappa}\widetilde X_{\kappa\kappa} + V_{\beta\kappa}\widetilde Y_{\kappa\kappa}\big)D_\kappa^{-2}\big(\widetilde X_{\kappa\kappa}U_{\kappa\beta} + \widetilde Y_{\kappa\kappa}V_{\kappa\beta}\big) \succeq 0
\]
for all U, V ∈ S^n. As a consequence of Proposition 3.1, we readily obtain the following necessary and sufficient characterization for the differentiable points of the function Φ_FB.

Corollary 3.1 The function Φ_FB is F-differentiable at (X, Y) if and only if C(X, Y) ≻ 0.

Furthermore, when C(X, Y) ≻ 0, we have for any (G, H) ∈ S^n × S^n,
\[
\mathcal{J}\Phi_{FB}(X, Y)(G, H) = (G + H) - L_{C(X,Y)}^{-1}\big(L_X(G) + L_Y(H)\big). \tag{20}
\]

Proof. The "if" part is direct by [1, Theorem V.3.3] or [5, Proposition 4.3]. We next prove the "only if" part by contradiction. Suppose that Φ_FB is F-differentiable at (X, Y), but C(X, Y) ≻ 0 does not hold. Then β ≠ ∅. Since Φ_FB is F-differentiable at (X, Y), Φ'_FB((X, Y); (·, ·)) is a linear operator. But, letting (G₁, H₁), (G₂, H₂) ∈ S^n × S^n be such that G₁ = G₂ = 0, H₁ = diag(0, I_{|β|}) and H₂ = −H₁, we obtain that
\[
\begin{aligned}
0 &= \Phi_{FB}'\big((X, Y); (G_1, H_1) + (G_2, H_2)\big)\\
&= \Phi_{FB}'\big((X, Y); (G_1, H_1)\big) + \Phi_{FB}'\big((X, Y); (G_2, H_2)\big)\\
&= -2P\begin{pmatrix} 0 & 0\\ 0 & I_{|\beta|}\end{pmatrix}P^T,
\end{aligned}
\]
which is a contradiction. This contradiction shows that the "only if" part holds. The formula in (20) follows by [4, Lemma 2] or [11, Theorem 3.4]. □
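At a point with C(X, Y) ≻ 0, formula (20) can be compared with a finite-difference quotient. The sketch below is an illustration of mine (SciPy's Sylvester solver is used for L_{C(X,Y)}^{-1}); the printed discrepancy is of order t.

```python
import numpy as np
from scipy.linalg import sqrtm, solve_sylvester

def sym(M): return (M + M.T) / 2

def phi_fb(X, Y):
    return X + Y - np.real(sqrtm(X @ X + Y @ Y))

rng = np.random.default_rng(5)
n = 4
X = sym(rng.standard_normal((n, n))) + 3.0 * np.eye(n)   # generic, well-conditioned point
Y = sym(rng.standard_normal((n, n)))                     # X^2 + Y^2 is positive definite here
G, H = sym(rng.standard_normal((n, n))), sym(rng.standard_normal((n, n)))

C = np.real(sqrtm(X @ X + Y @ Y))
# Formula (20): J Phi_FB(X, Y)(G, H) = (G + H) - L_C^{-1}(L_X(G) + L_Y(H))
jac = (G + H) - solve_sylvester(C, C, X @ G + G @ X + Y @ H + H @ Y)

t = 1e-6
fd = (phi_fb(X + t * G, Y + t * H) - phi_fb(X, Y)) / t   # finite-difference quotient
print(np.linalg.norm(jac - fd))                          # small, O(t)
```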

Next we derive the expression of the directional derivative of Θ at (U, V) with the direction (G, H) ∈ S^n × S^n, which is used to characterize the F-differentiable points of Θ in Lemma 3.2 below. Define Ω₁ : S^n × S^n → IR^{|β|×|κ|} and Ω₂ : S^n × S^n → IR^{|β|×|κ|} by
\[
\Omega_1(U, V) := U_{\beta\kappa} - \big(U_{\beta\kappa}\widetilde X_{\kappa\kappa} + V_{\beta\kappa}\widetilde Y_{\kappa\kappa}\big)D_\kappa^{-2}\widetilde X_{\kappa\kappa} \quad \forall U, V \in S^n \quad\text{and}
\]
\[
\Omega_2(U, V) := V_{\beta\kappa} - \big(U_{\beta\kappa}\widetilde X_{\kappa\kappa} + V_{\beta\kappa}\widetilde Y_{\kappa\kappa}\big)D_\kappa^{-2}\widetilde Y_{\kappa\kappa} \quad \forall U, V \in S^n,
\]
respectively. Noting that X̃²_κκ + Ỹ²_κκ = D²_κ, we can rewrite the function Θ in (15) as
\[
\Theta(U, V) = \big[\,U_{\beta\beta}^2 + V_{\beta\beta}^2 + \Omega_1(U, V)\Omega_1(U, V)^T + \Omega_2(U, V)\Omega_2(U, V)^T\big]^{1/2} \quad \forall U, V \in S^n. \tag{21}
\]


For any given U, V ∈ S^n, assume that Θ(U, V) has the following spectral decomposition
\[
\Theta(U, V) = R\Lambda R^T = R\,\mathrm{diag}(\vartheta_1, \ldots, \vartheta_{|\beta|})\,R^T,
\]
where Λ = diag(ϑ₁, ..., ϑ_{|β|}) is the diagonal matrix of eigenvalues of Θ(U, V) and R is a corresponding matrix of orthonormal eigenvectors. Define the index sets I and J associated with the eigenvalues of Θ(U, V) by
\[
I := \{i : \vartheta_i > 0\} \quad\text{and}\quad J := \{i : \vartheta_i = 0\}.
\]
Then, by permuting the rows and columns of Θ(U, V) if necessary, we may assume that
\[
\Lambda = \begin{bmatrix} \Lambda_I & 0\\ 0 & \Lambda_J \end{bmatrix} = \begin{bmatrix} \Lambda_I & 0\\ 0 & 0 \end{bmatrix}.
\]

From (21) and the spectral decomposition of Θ(U, V), it is easy to obtain that
\[
[R^T U_{\beta\beta}]_{J\beta} = 0, \quad [R^T V_{\beta\beta}]_{J\beta} = 0, \quad [R^T \Omega_1(U, V)]_{J\kappa} = 0, \quad [R^T \Omega_2(U, V)]_{J\kappa} = 0. \tag{22}
\]

Lemma 3.1 For any given (U, V) ∈ S^n × S^n, assume that Θ(U, V) has the spectral decomposition as above. Then, the directional derivative Θ'((U, V); (G, H)) of Θ at (U, V) with the direction (G, H) ∈ S^n × S^n has the following expression

\[
R
\begin{bmatrix}
L_{\Lambda_I}^{-1}[\widetilde W_{II}] & \Lambda_I^{-1}\widetilde W_{IJ}\\[2pt]
\widetilde W_{IJ}^T\Lambda_I^{-1} & \big(\widetilde\Theta_{JJ} - \widetilde W_{IJ}^T\Lambda_I^{-2}\widetilde W_{IJ}\big)^{1/2}
\end{bmatrix}
R^T, \tag{23}
\]
where Θ̃ := R^T Θ²(G, H) R, and W̃ := R^T W(G, H) R with W(G, H) given by
\[
W(G, H) := \Omega_1(U, V)\Omega_1(G, H)^T + \Omega_1(G, H)\Omega_1(U, V)^T + L_{U_{\beta\beta}}(G_{\beta\beta}) + L_{V_{\beta\beta}}(H_{\beta\beta}) + \Omega_2(U, V)\Omega_2(G, H)^T + \Omega_2(G, H)\Omega_2(U, V)^T.
\]

Proof. Assume that Θ(U, V) ≠ 0. For any t > 0, we calculate that

\[
\Delta(t) := \Theta(U + tG, V + tH) - \Theta(U, V)
= \big[\,\Theta^2(U, V) + tW(G, H) + t^2\Theta^2(G, H)\big]^{1/2} - \Theta(U, V).
\]

From the spectral decomposition of Θ(U, V), it then follows that
\[
\widetilde\Delta(t) := R^T\Delta(t)R = \big(\Lambda^2 + t\widetilde W + t^2\widetilde\Theta\big)^{1/2} - \Lambda, \tag{24}
\]
where Θ̃ and W̃ are defined as in the lemma. From (24) and [24, Lemma 6.2], we have
\[
\widetilde\Delta(t)_{II} = tL_{\Lambda_I}^{-1}[\widetilde W_{II}] + o(t), \qquad
\widetilde\Delta(t)_{IJ} = t\Lambda_I^{-1}\widetilde W_{IJ} + o(t), \qquad
t\widetilde W_{JJ} + t^2\widetilde\Theta_{JJ} = \widetilde\Delta(t)_{IJ}^T\widetilde\Delta(t)_{IJ} + \widetilde\Delta(t)_{JJ}^2. \tag{25}
\]


By equation (22) and the definition of W̃, we have W̃_JJ = 0. Then, from the last two equalities of (25), it follows that
\[
\widetilde\Delta(t)_{JJ}^2 = t^2\widetilde\Theta_{JJ} - \widetilde\Delta(t)_{IJ}^T\widetilde\Delta(t)_{IJ}
= t^2\big(\widetilde\Theta_{JJ} - \widetilde W_{IJ}^T\Lambda_I^{-2}\widetilde W_{IJ}\big) + o(t^2).
\]

Since Λ_J = 0, the expression of ∆̃(t) in (24) implies that ∆̃(t)_JJ ⪰ 0. Therefore,
\[
\lim_{t\downarrow 0}\frac{\widetilde\Delta(t)_{JJ}}{t}
= \lim_{t\downarrow 0}\frac{\big[\widetilde\Delta(t)_{JJ}^2\big]^{1/2}}{t}
= \big(\widetilde\Theta_{JJ} - \widetilde W_{IJ}^T\Lambda_I^{-2}\widetilde W_{IJ}\big)^{1/2}.
\]
This, together with the first two equalities of (25), yields that
\[
\Theta'((U, V); (G, H)) = \lim_{t\downarrow 0}\frac{R\widetilde\Delta(t)R^T}{t}
= R
\begin{bmatrix}
L_{\Lambda_I}^{-1}[\widetilde W_{II}] & \Lambda_I^{-1}\widetilde W_{IJ}\\[2pt]
\widetilde W_{IJ}^T\Lambda_I^{-1} & \big(\widetilde\Theta_{JJ} - \widetilde W_{IJ}^T\Lambda_I^{-2}\widetilde W_{IJ}\big)^{1/2}
\end{bmatrix}
R^T.
\]

If Θ(U, V) = 0, then U_ββ = 0, V_ββ = 0, Ω₁(U, V) = 0 and Ω₂(U, V) = 0. By this, it is easy to compute that Θ'((U, V); (G, H)) = Θ(G, H). Note that Θ(G, H) is a special case of (23) with I = ∅. The result then follows. □

Remark 3.1 Lemma 3.1 shows that the function Θ defined by (15) is directionally differentiable everywhere in S^n × S^n. In fact, Θ is also globally Lipschitz continuous and strongly semismooth in S^n × S^n. Let Ψ(U, V) := [U_ββ  V_ββ  Ω₁(U, V)  Ω₂(U, V)] for U, V ∈ S^n, and G_mat(A) := √(AA^T) for A ∈ IR^{|β|×2n}. Comparing with (21), we have that Θ(U, V) ≡ G_mat(Ψ(U, V)). By [21, Theorem 2.2], G_mat is globally Lipschitz continuous and strongly semismooth everywhere in IR^{|β|×2n}. Since Ψ is a linear function, the composition of G_mat and Ψ, i.e. the function Θ, is globally Lipschitz continuous and strongly semismooth everywhere in S^n × S^n by [7, Theorem 19].

By the expression of the directional derivative of Θ, we may present the necessary and sufficient characterization for the differentiable points of Θ.

Lemma 3.2 The function Θ is F-differentiable at (U, V) if and only if Θ(U, V) ≻ 0. Furthermore, when Θ(U, V) ≻ 0, we have for any (G, H) ∈ S^n × S^n,
\[
\begin{aligned}
\mathcal{J}\Theta(U, V)(G, H) = L_{\Theta(U,V)}^{-1}\Big[
&(U_{\beta\kappa}G_{\kappa\beta} + G_{\beta\kappa}U_{\kappa\beta}) + (V_{\beta\kappa}H_{\kappa\beta} + H_{\beta\kappa}V_{\kappa\beta})\\
&- \big(G_{\beta\kappa}\widetilde X_{\kappa\kappa} + H_{\beta\kappa}\widetilde Y_{\kappa\kappa}\big)D_\kappa^{-2}\big(\widetilde X_{\kappa\kappa}U_{\kappa\beta} + \widetilde Y_{\kappa\kappa}V_{\kappa\beta}\big)\\
&- \big(U_{\beta\kappa}\widetilde X_{\kappa\kappa} + V_{\beta\kappa}\widetilde Y_{\kappa\kappa}\big)D_\kappa^{-2}\big(\widetilde X_{\kappa\kappa}G_{\kappa\beta} + \widetilde Y_{\kappa\kappa}H_{\kappa\beta}\big)\\
&+ L_{U_{\beta\beta}}(G_{\beta\beta}) + L_{V_{\beta\beta}}(H_{\beta\beta})\Big].
\end{aligned}
\]


Proof. We only need to prove the "only if" part. If Θ is F-differentiable at (U, V), then Θ'((U, V); (G, H)) is a linear function of (G, H), which by equation (23) implies that (Θ̃_JJ − W̃_IJ^T Λ_I^{-2} W̃_IJ)^{1/2} is a linear function of (G, H). We next argue that this holds true only if J = ∅. Indeed, if J ≠ ∅, by taking
\[
G = \begin{bmatrix} 0 & 0\\ 0 & G_{\beta\beta}\end{bmatrix} \quad\text{and}\quad H = \begin{bmatrix} 0 & 0\\ 0 & H_{\beta\beta}\end{bmatrix}
\]
with G_ββ ≻ 0 and H_ββ ≻ 0, we have Ω₁(G, H) = 0 and Ω₂(G, H) = 0 which, together with [R^T U_ββ]_{Jβ} = 0 and [R^T V_ββ]_{Jβ} = 0, implies that W̃_JI = [R^T W(G, H)R]_{JI} = 0. Note that Θ²(G, H) = G_ββ² + H_ββ². Then,
\[
\big(\widetilde\Theta_{JJ} - \widetilde W_{IJ}^T\Lambda_I^{-2}\widetilde W_{IJ}\big)^{1/2} = \big([R^T(G_{\beta\beta}^2 + H_{\beta\beta}^2)R]_{JJ}\big)^{1/2},
\]
which is clearly nonlinear. The Jacobian formula of Θ is direct by a simple computation. □

Remark 3.2 Combining Proposition 3.1 with Lemma 3.2, we immediately obtain that Φ'_FB((X, Y); (·, ·)) is F-differentiable at (G, H) if and only if Θ(P^T GP, P^T HP) ≻ 0.

By the definition of Θ and Lemma 3.2, we can prove the following result (see the Appendix for the proof), which corresponds to the property of Φ_NR in [13, Lemma 11].

Lemma 3.3 For any given X, Y ∈ S^n, let Ψ_FB(·, ·) ≡ Φ'_FB((X, Y); (·, ·)). Then,
\[
\partial_B \Phi_{FB}(X, Y) = \partial_B \Psi_{FB}(0, 0).
\]

Now Lemma 3.3 and Proposition 3.1 allow us to obtain the main result of this section.

Proposition 3.2 For any given X, Y ∈ S^n, let C(X, Y) have the spectral decomposition as in (13). Then, (𝒰, 𝒱) ∈ ∂_B Φ_FB(X, Y) (respectively, ∂Φ_FB(X, Y)) if and only if there exists (𝒢, ℋ) ∈ ∂_B Θ(0, 0) (respectively, ∂Θ(0, 0)) such that for any G, H ∈ S^n,
\[
(\mathcal{I} - \mathcal{U})(G) + (\mathcal{I} - \mathcal{V})(H)
= P
\begin{bmatrix}
L_{D_\kappa}^{-1}\big(L_{\widetilde X_{\kappa\kappa}}(\widetilde G_{\kappa\kappa}) + L_{\widetilde Y_{\kappa\kappa}}(\widetilde H_{\kappa\kappa})\big) &
D_\kappa^{-1}\big(\widetilde X_{\kappa\kappa}\widetilde G_{\kappa\beta} + \widetilde Y_{\kappa\kappa}\widetilde H_{\kappa\beta}\big)\\[2pt]
\big(\widetilde G_{\beta\kappa}\widetilde X_{\kappa\kappa} + \widetilde H_{\beta\kappa}\widetilde Y_{\kappa\kappa}\big)D_\kappa^{-1} &
\mathcal{G}(\widetilde G) + \mathcal{H}(\widetilde H)
\end{bmatrix}
P^T, \tag{26}
\]
where X̃ := P^T XP, Ỹ := P^T YP, G̃ := P^T GP, and H̃ := P^T HP.

Proof. For any G, H ∈ S^n, let Ψ(G, H) := (P^T GP, P^T HP). Define Ξ : S^n × S^n → S^n by
\[
\Xi(S, T) := P
\begin{bmatrix}
L_{D_\kappa}^{-1}\big(L_{\widetilde X_{\kappa\kappa}}(S_{\kappa\kappa}) + L_{\widetilde Y_{\kappa\kappa}}(T_{\kappa\kappa})\big) &
D_\kappa^{-1}\big(\widetilde X_{\kappa\kappa}S_{\kappa\beta} + \widetilde Y_{\kappa\kappa}T_{\kappa\beta}\big)\\[2pt]
\big(S_{\beta\kappa}\widetilde X_{\kappa\kappa} + T_{\beta\kappa}\widetilde Y_{\kappa\kappa}\big)D_\kappa^{-1} &
\Theta(S, T)
\end{bmatrix}
P^T.
\]
By Proposition 3.1, clearly, Ψ_FB(G, H) = (G + H) − Ξ(Ψ(G, H)) for any G, H ∈ S^n. Note that Ξ is globally Lipschitz continuous in S^n × S^n by the remarks after (21), and JΨ(G, H) for any G, H ∈ S^n is onto. Applying [3, Lemma 2.1] to the composite mapping Ξ ∘ Ψ at (0, 0), we have that ∂_B(Ξ ∘ Ψ)(0, 0) = ∂_B Ξ(Ψ(0, 0))JΨ(0, 0) = ∂_B Ξ(0, 0)Ψ. So,
\[
\partial_B \Psi_{FB}(0, 0) = (\mathcal{I}, \mathcal{I}) - \partial_B \Xi(0, 0)\Psi.
\]
This, together with Lemma 3.3 and the expression of Ξ, completes the proof. □
