Smooth and nonsmooth analysis of vector-valued functions associated with circular cones


to appear in Nonlinear Analysis: Theory, Methods and Applications, 2013


Yu-Lin Chang

Department of Mathematics National Taiwan Normal University

Taipei 11677, Taiwan

E-mail: ylchang@math.ntnu.edu.tw

Ching-Yu Yang

Department of Mathematics National Taiwan Normal University

Taipei 11677, Taiwan

E-mail: yangcy@math.ntnu.edu.tw

Jein-Shan Chen ¹

Department of Mathematics National Taiwan Normal University

Taipei 11677, Taiwan

E-mail: jschen@math.ntnu.edu.tw

January 8, 2013

Abstract. Let L_θ be the circular cone in IRⁿ, which includes the second-order cone as a special case. For any function f from IR to IR, one can define a corresponding vector-valued function f^c(x) on IRⁿ by applying f to the spectral values of the spectral decomposition of x ∈ IRⁿ with respect to L_θ. We show that this vector-valued function inherits from f the properties of continuity, Lipschitz continuity, directional differentiability, Fréchet differentiability, continuous differentiability, as well as semismoothness. These results will play a crucial role in designing solution methods for optimization problems associated with the circular cone.

Key words. Circular cone, vector-valued function, semismooth function, complementarity, spectral decomposition.

AMS subject classifications. 26A27, 26B05, 26B35, 49J52, 90C33, 65K05

¹ Corresponding author. Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author’s work is supported by National Science Council of Taiwan


1 Introduction

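The circular cone [1, 2] is a pointed closed convex cone having hyperspherical sections orthogonal to its axis of revolution, about which the cone is invariant to rotation. Let its half-aperture angle be θ with θ ∈ (0, π/2). Then, the n-dimensional circular cone, denoted by L_θ, can be expressed as

\[
\begin{aligned}
\mathcal{L}_\theta &:= \left\{ x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1} \,\middle|\, \|x\| \cos\theta \le x_1 \right\} \\
&:= \left\{ x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1} \,\middle|\, \|x_2\| \cot\theta \le x_1 \right\}.
\end{aligned}
\tag{1}
\]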

See Figure 1 below.

[Figure 1: The graphs of circular cones. Panels: (a) 0 < θ < 45°, (b) θ = 45°, (c) 45° < θ < 90°.]

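When θ = 45°, the circular cone reduces to the well-known second-order cone (SOC, also called the Lorentz cone) given by

\[
\mathcal{K}^n := \left\{ x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1} \,\middle|\, \|x_2\| \le x_1 \right\}
:= \left\{ (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1} \,\middle|\, \|x\| \cos 45^\circ \le x_1 \right\}.
\]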
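The two descriptions of L_θ in (1), and the reduction to the second-order cone at θ = 45°, can be checked numerically. The following is a minimal sketch in plain Python; the function names `in_cone_norm_form` and `in_cone_cot_form` and the test point are our own illustrative choices, not notation from the paper.

```python
import math

def in_cone_norm_form(x, theta):
    """Test ||x|| cos(theta) <= x1, the first description in (1)."""
    return math.sqrt(sum(t * t for t in x)) * math.cos(theta) <= x[0]

def in_cone_cot_form(x, theta):
    """Test ||x2|| cot(theta) <= x1, the second description in (1)."""
    r = math.sqrt(sum(t * t for t in x[1:]))
    return r / math.tan(theta) <= x[0]

x = [1.0, 0.9, 0.0]       # hypothetical test point (x1, x2) with x2 = (0.9, 0)
narrow = math.pi / 6      # theta = 30 degrees
soc = math.pi / 4         # theta = 45 degrees, i.e. the second-order cone

print(in_cone_norm_form(x, narrow), in_cone_cot_form(x, narrow))  # False False
print(in_cone_norm_form(x, soc), in_cone_cot_form(x, soc))        # True True
```

The same point lies in the second-order cone but outside the narrower cone with θ = 30°, and both descriptions agree in each case. Points exactly on the boundary may be classified differently by the two tests because of floating-point rounding.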

With respect to SOC, for any x = (x₁, x₂) ∈ IR × IRⁿ⁻¹, we can decompose x as

\[
x = \lambda_1(x)\, u_x^{(1)} + \lambda_2(x)\, u_x^{(2)}, \tag{2}
\]

where λ₁(x), λ₂(x) and u⁽¹⁾_x, u⁽²⁾_x are the spectral values and the associated spectral vectors of x with respect to Kⁿ, given by

\[
\lambda_i(x) = x_1 + (-1)^i \|x_2\|,
\]

\[
u_x^{(i)} =
\begin{cases}
\dfrac{1}{2}\left(1,\ (-1)^i \dfrac{x_2}{\|x_2\|}\right), & \text{if } x_2 \neq 0, \\[2mm]
\dfrac{1}{2}\left(1,\ (-1)^i w\right), & \text{if } x_2 = 0,
\end{cases}
\]

for i = 1, 2, with w being any vector in IRⁿ⁻¹ satisfying ‖w‖ = 1. If x₂ ≠ 0, the decomposition (2) is unique. With this spectral decomposition (2), for any function


f : IR → IR, the following vector-valued function associated with Kⁿ (n ≥ 1) is considered (see [3, 4]):

\[
f^{\mathrm{soc}}(x) = f(\lambda_1)\, u^{(1)} + f(\lambda_2)\, u^{(2)} \qquad \forall x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}. \tag{3}
\]

If f is defined only on a subset of IR, then f^soc is defined on the corresponding subset of IRⁿ. The definition (3) is unambiguous whether x₂ ≠ 0 or x₂ = 0. The above definition (3) is analogous to the one associated with the semidefinite cone Sⁿ; see [5, 6]. It was shown in [4] that the properties of continuity, strict continuity, Lipschitz continuity, directional differentiability, differentiability, continuous differentiability, and semismoothness are each inherited by f^soc from f. These results are useful in the design and analysis of smoothing and nonsmooth methods for solving second-order cone programs (SOCP) and second-order cone complementarity problems (SOCCP); see [3, 4, 7, 8] and references therein.

Recently, circular cone constraints have been found in real engineering problems. For example, in the formulation of optimal grasping manipulation for multi-fingered robots, the grasping force of the i-th finger is subject to a contact friction constraint expressed as

\[
\left\| (u_{i2}, u_{i3}) \right\| \le \mu\, u_{i1}, \tag{4}
\]

where µ is the friction coefficient; see Figure 2. Indeed, (4) is a circular cone constraint corresponding to u_i = (u_i1, u_i2, u_i3) ∈ L_θ with θ = tan⁻¹µ < 45°.

[Figure 2: The grasping force forms a circular cone where α = tan⁻¹µ < 45°.]

Note that the circular cone L_θ is a non-self-dual (non-symmetric) cone, and the related literature is rather limited. Nonetheless, motivated by real-world applications involving circular cones, the structure and properties of L_θ are investigated in [2]. In particular, the spectral factorization of z associated with the circular cone is characterized in [2, Theorem 3.1]. For convenience, we restate it below.


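Theorem 1.1. [2, Theorem 3.1] For any z = (z₁, z₂) ∈ IR × IRⁿ⁻¹, one has

\[
z = \lambda_1(z) \cdot u_z^{(1)} + \lambda_2(z) \cdot u_z^{(2)}, \tag{5}
\]

where

\[
\begin{aligned}
\lambda_1(z) &= z_1 - \|z_2\| \cot\theta, \\
\lambda_2(z) &= z_1 + \|z_2\| \tan\theta,
\end{aligned}
\tag{6}
\]

and

\[
\begin{aligned}
u_z^{(1)} &= \frac{1}{1 + \cot^2\theta}
\begin{bmatrix} 1 & 0 \\ 0 & \cot\theta \cdot I \end{bmatrix}
\begin{bmatrix} 1 \\ -w \end{bmatrix}
= \begin{bmatrix} \sin^2\theta \\ -(\sin\theta\cos\theta)\, w \end{bmatrix}, \\[2mm]
u_z^{(2)} &= \frac{1}{1 + \tan^2\theta}
\begin{bmatrix} 1 & 0 \\ 0 & \tan\theta \cdot I \end{bmatrix}
\begin{bmatrix} 1 \\ w \end{bmatrix}
= \begin{bmatrix} \cos^2\theta \\ (\sin\theta\cos\theta)\, w \end{bmatrix},
\end{aligned}
\tag{7}
\]

with w = z₂/‖z₂‖ if z₂ ≠ 0, and w any vector in IRⁿ⁻¹ satisfying ‖w‖ = 1 if z₂ = 0.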

Analogous to (3), with the spectral factorization (5), for any function f : IR → IR, we consider the following vector-valued function associated with L_θ (n ≥ 1):

\[
f^{c}(z) = f(\lambda_1)\, u_z^{(1)} + f(\lambda_2)\, u_z^{(2)} \qquad \forall z = (z_1, z_2) \in \mathbb{R} \times \mathbb{R}^{n-1}. \tag{8}
\]

Are the properties of continuity, strict continuity, Lipschitz continuity, directional differentiability, differentiability, continuous differentiability, and semismoothness each inherited by f^c from f? This is the question we explore in this paper.
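The spectral factorization and the definition of f^c can be sketched in a few lines of plain Python. This is only an illustration under our own assumptions: the names `spectral` and `f_c` are hypothetical, vectors are Python lists of length n ≥ 2, and when the tail part of z vanishes the unit vector w is fixed arbitrarily as the first coordinate vector.

```python
import math

def spectral(z, theta):
    """Spectral values and vectors of z = (z1, z2) w.r.t. L_theta; needs n >= 2."""
    r = math.sqrt(sum(t * t for t in z[1:]))
    # If z2 = 0, any unit vector w is admissible; e1 is an arbitrary choice.
    w = [t / r for t in z[1:]] if r > 0 else [1.0] + [0.0] * (len(z) - 2)
    s, c = math.sin(theta), math.cos(theta)
    lam1 = z[0] - r * c / s            # z1 - ||z2|| cot(theta)
    lam2 = z[0] + r * s / c            # z1 + ||z2|| tan(theta)
    u1 = [s * s] + [-s * c * t for t in w]
    u2 = [c * c] + [s * c * t for t in w]
    return lam1, lam2, u1, u2

def f_c(f, z, theta):
    """Vector-valued function f(lam1) u1 + f(lam2) u2."""
    lam1, lam2, u1, u2 = spectral(z, theta)
    return [f(lam1) * p + f(lam2) * q for p, q in zip(u1, u2)]

z, theta = [1.0, 2.0, -2.0], math.pi / 6
# With f the identity, f_c must reproduce z itself.
recon = f_c(lambda t: t, z, theta)
recon_err = max(abs(p - q) for p, q in zip(recon, z))

# At theta = 45 degrees the spectral values reduce to z1 -/+ ||z2|| (the SOC case).
lam1, lam2, _, _ = spectral(z, math.pi / 4)
r = math.sqrt(8.0)
soc_err = max(abs(lam1 - (z[0] - r)), abs(lam2 - (z[0] + r)))
```

Applying f_c with the identity function is a convenient self-test of the factorization, since the result should agree with z up to rounding.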

Finally, we say a few words about notation. In what follows, for any differentiable (in the Fréchet sense) mapping F : IRⁿ → IR^m, we denote its Jacobian (not transposed) at x ∈ IRⁿ by ∇F(x) ∈ IR^{m×n}, i.e., (F(x + u) − F(x) − ∇F(x)u)/‖u‖ → 0 as u → 0.

The symbol “:=” means “define”. We write z = O(α) (respectively, z = o(α)), with α ∈ IR and z ∈ IRⁿ, to mean that ‖z‖/|α| is uniformly bounded (respectively, tends to zero) as α → 0.

2 Preliminaries

In this section, we review some basic concepts regarding vector-valued functions. These include continuity, (local) Lipschitz continuity, directional differentiability, differentiability, continuous differentiability, as well as semismoothness.

Suppose F : IRⁿ → IR^m. Then, F is continuous at x ∈ IRⁿ if F(y) → F(x) as y → x; and F is continuous if F is continuous at every x ∈ IRⁿ. We say F is strictly continuous (also called “locally Lipschitz continuous”) at x ∈ IRⁿ if there exist scalars κ > 0 and δ > 0 such that

\[
\|F(y) - F(z)\| \le \kappa \|y - z\| \qquad \forall y, z \in \mathbb{R}^n \ \text{with}\ \|y - x\| \le \delta,\ \|z - x\| \le \delta;
\]


and F is strictly continuous if F is strictly continuous at every x ∈ IRⁿ. We say F is directionally differentiable at x ∈ IRⁿ if

\[
F'(x; h) := \lim_{t \to 0^+} \frac{F(x + th) - F(x)}{t} \quad \text{exists } \forall h \in \mathbb{R}^n;
\]

and F is directionally differentiable if F is directionally differentiable at every x ∈ IRⁿ. F is differentiable (in the Fréchet sense) at x ∈ IRⁿ if there exists a linear mapping ∇F(x) : IRⁿ → IR^m such that

\[
F(x + h) - F(x) - \nabla F(x) h = o(\|h\|).
\]

If F is differentiable at every x ∈ IRⁿ and ∇F is continuous, then F is continuously differentiable. We note that, in the above definition of strict continuity of F, if δ can be taken to be ∞, then F is called Lipschitz continuous with Lipschitz constant κ.

It is well known that if F is strictly continuous, then F is almost everywhere differentiable by Rademacher’s Theorem; see [9] and [10, Section 9J]. In this case, the generalized Jacobian ∂F(x) of F at x (in the Clarke sense) can be defined as the convex hull of the generalized Jacobian ∂_B F(x), where

\[
\partial_B F(x) := \left\{ \lim_{x^j \to x} \nabla F(x^j) \,\middle|\, F \text{ is differentiable at } x^j \in \mathbb{R}^n \right\}.
\]

The notation ∂_B is adopted from [11]. In [10, Chapter 9], the case of m = 1 is considered and the notations “∇̄” and “∂̄” are used instead of, respectively, “∂_B” and “∂”. Assume F : IRⁿ → IR^m is strictly continuous. Then F is said to be semismooth at x if F is directionally differentiable at x and, for any V ∈ ∂F(x + h), we have

\[
F(x + h) - F(x) - V h = o(\|h\|).
\]

Moreover, F is called ρ-order semismooth at x (0 < ρ < ∞) if F is semismooth at x and, for any V ∈ ∂F(x + h), we have

\[
F(x + h) - F(x) - V h = O(\|h\|^{1+\rho}).
\]

The following lemma, proven by Sun and Sun [5, Theorem 3.6] using the definition of the generalized Jacobian, enables one to study the semismoothness of f^c by examining only those points x ∈ IRⁿ where f^c is differentiable, and thus to work only with the Jacobian of f^c rather than the generalized Jacobian. It is a very useful working lemma for verifying semismoothness in Section 4.

Lemma 2.1. Suppose F : IRⁿ → IRⁿ is strictly continuous and directionally differentiable in a neighborhood of x ∈ IRⁿ. Then, for any 0 < ρ < ∞, the following two statements are equivalent:

(a) For any V ∈ ∂F(x + h) and h → 0,

\[
F(x + h) - F(x) - V h = o(\|h\|) \quad (\text{respectively, } O(\|h\|^{1+\rho})).
\]

(b) For any h → 0 such that F is differentiable at x + h,

\[
F(x + h) - F(x) - \nabla F(x + h) h = o(\|h\|) \quad (\text{respectively, } O(\|h\|^{1+\rho})).
\]

We say F is semismooth (respectively, ρ-order semismooth) if F is semismooth (respectively, ρ-order semismooth) at every x ∈ IRⁿ. We say F is strongly semismooth if it is 1-order semismooth. Convex functions and piecewise continuously differentiable functions are examples of semismooth functions. The composition of two (respectively, ρ-order) semismooth functions is also a (respectively, ρ-order) semismooth function. The property of semismoothness, as introduced by Mifflin [12] for functionals and scalar-valued functions and further extended by Qi and Sun [13] for vector-valued functions, is of particular interest due to the key role it plays in the superlinear convergence analysis of certain generalized Newton methods [11, 13, 14, 15, 16]. For extensive discussions of semismooth functions, see [12, 13, 17].
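As a small worked illustration of these definitions (our own example, not taken from the cited references), consider the absolute value function at its kink x = 0. The only fact assumed is that the Clarke generalized derivative of |t| at a point h ≠ 0 is the singleton {sign(h)}.

```latex
\[
f(t) = |t|, \qquad \partial f(h) = \{\operatorname{sign}(h)\} \quad \text{for } h \neq 0 ,
\]
\[
f(0 + h) - f(0) - V h \;=\; |h| - \operatorname{sign}(h)\, h \;=\; 0
\qquad \text{for all } h \neq 0,\ V \in \partial f(h).
\]
```

Since the residual vanishes identically, it is in particular O(|h|²). Together with the directional derivative f′(0; d) = |d|, this shows that |t| is strongly semismooth at 0.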

3 Properties of Continuity and Differentiability

In this section, we focus on how the properties of continuity and differentiability carry over between f and f^c. Before starting the proofs, we need some technical lemmas, which follow from the simple structure of the circular cone and basic definitions.

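Lemma 3.1. Let λ₁ ≤ λ₂ be the spectral values of x ∈ IRⁿ and m₁ ≤ m₂ be the spectral values of y ∈ IRⁿ. Then, we have

\[
|\lambda_1 - m_1|^2 \sin^2\theta + |\lambda_2 - m_2|^2 \cos^2\theta \le \|x - y\|^2, \tag{9}
\]

and hence |λ_i − m_i| ≤ c‖x − y‖ for i = 1, 2, where c = max{sec θ, csc θ}.

Proof. A direct computation using (6) shows that the left-hand side of (9) equals (x₁ − y₁)² + (‖x₂‖ − ‖y₂‖)², which is at most ‖x − y‖² because |‖x₂‖ − ‖y₂‖| ≤ ‖x₂ − y₂‖. □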
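The bound |λ_i − m_i| ≤ c‖x − y‖ of Lemma 3.1 can be sanity-checked on random data. The sketch below is our own illustration in plain Python; the names `spectral_values` and `bound_ratio`, the dimension n = 4, the angle, and the sampling range are all arbitrary assumptions.

```python
import math
import random

def spectral_values(x, theta):
    """(lam1, lam2) = (x1 - ||x2|| cot(theta), x1 + ||x2|| tan(theta))."""
    r = math.sqrt(sum(t * t for t in x[1:]))
    return x[0] - r / math.tan(theta), x[0] + r * math.tan(theta)

def bound_ratio(x, y, theta):
    """max_i |lam_i - m_i| divided by c * ||x - y||, c = max(sec, csc)."""
    l1, l2 = spectral_values(x, theta)
    m1, m2 = spectral_values(y, theta)
    c = max(1.0 / math.cos(theta), 1.0 / math.sin(theta))
    dist = math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))
    return max(abs(l1 - m1), abs(l2 - m2)) / (c * dist)

random.seed(0)
theta = math.pi / 6
worst = 0.0
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(4)]
    y = [random.uniform(-1, 1) for _ in range(4)]
    worst = max(worst, bound_ratio(x, y, theta))
# The lemma predicts worst <= 1 (up to rounding).
```

A ratio above 1 in such an experiment would indicate a mistake in the spectral values or in the constant c; it is of course not a proof of the lemma.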

Lemma 3.2. Let x = (x₁, x₂) ∈ IR × IRⁿ⁻¹ and y = (y₁, y₂) ∈ IR × IRⁿ⁻¹.

(a) If x₂ ≠ 0 and y₂ ≠ 0, then we have

\[
\|u^{(i)} - v^{(i)}\| \le \frac{2 \sin\theta \cos\theta}{\|x_2\|}\, \|x - y\|, \qquad i = 1, 2, \tag{10}
\]

where u⁽ⁱ⁾, v⁽ⁱ⁾ are the unique spectral vectors of x and y, respectively.

(b) If either x₂ = 0 or y₂ = 0, then we can choose u⁽ⁱ⁾, v⁽ⁱ⁾ such that the left-hand side of inequality (10) is zero.

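Proof. (a) From the spectral factorization (5), we know that

\[
u^{(1)} = \sin^2\theta \left(1,\ (-1)\cot\theta\, \frac{x_2}{\|x_2\|}\right), \qquad
v^{(1)} = \sin^2\theta \left(1,\ (-1)\cot\theta\, \frac{y_2}{\|y_2\|}\right),
\]

where u⁽¹⁾, v⁽¹⁾ are unique. This gives

\[
u^{(1)} - v^{(1)} = \sin^2\theta \left(0,\ (-1)\cot\theta \left(\frac{x_2}{\|x_2\|} - \frac{y_2}{\|y_2\|}\right)\right).
\]

Then,

\[
\begin{aligned}
\|u^{(1)} - v^{(1)}\|
&= \sin\theta\cos\theta \left\| \frac{x_2}{\|x_2\|} - \frac{y_2}{\|y_2\|} \right\| \\
&= \sin\theta\cos\theta \left\| \frac{x_2 - y_2}{\|x_2\|} + \frac{(\|y_2\| - \|x_2\|)\, y_2}{\|x_2\| \cdot \|y_2\|} \right\| \\
&\le \sin\theta\cos\theta \left( \frac{1}{\|x_2\|} \|x_2 - y_2\| + \frac{1}{\|x_2\|} \bigl| \|y_2\| - \|x_2\| \bigr| \right) \\
&\le \sin\theta\cos\theta \left( \frac{1}{\|x_2\|} \|x_2 - y_2\| + \frac{1}{\|x_2\|} \|x_2 - y_2\| \right) \\
&\le \frac{2\sin\theta\cos\theta}{\|x_2\|}\, \|x - y\|,
\end{aligned}
\]

where the inequalities follow from the triangle inequality. Similar arguments apply to ‖u⁽²⁾ − v⁽²⁾‖.

(b) Since either x₂ = 0 or y₂ = 0, the spectral factorization (5) allows us to choose the same spectral vectors for x and y, so the left-hand side of (10) is zero. □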
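Inequality (10) can likewise be probed numerically. The following plain-Python sketch is our own illustration; the helper names `spectral_vectors`, `dist`, and `lemma_3_2_ratio`, as well as the dimension and angle, are hypothetical choices, and the random points are assumed to have nonzero tail parts.

```python
import math
import random

def spectral_vectors(x, theta):
    """Spectral vectors of x w.r.t. L_theta; assumes the tail part x2 is nonzero."""
    r = math.sqrt(sum(t * t for t in x[1:]))
    s, c = math.sin(theta), math.cos(theta)
    w = [t / r for t in x[1:]]
    return [s * s] + [-s * c * t for t in w], [c * c] + [s * c * t for t in w]

def dist(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def lemma_3_2_ratio(x, y, theta):
    """max_i ||u_i - v_i|| divided by the right-hand side of (10)."""
    u1, u2 = spectral_vectors(x, theta)
    v1, v2 = spectral_vectors(y, theta)
    r = math.sqrt(sum(t * t for t in x[1:]))
    rhs = 2.0 * math.sin(theta) * math.cos(theta) / r * dist(x, y)
    return max(dist(u1, v1), dist(u2, v2)) / rhs

random.seed(1)
theta = math.pi / 5
worst = max(
    lemma_3_2_ratio([random.uniform(-1, 1) for _ in range(4)],
                    [random.uniform(-1, 1) for _ in range(4)], theta)
    for _ in range(1000)
)
# Lemma 3.2(a) predicts worst <= 1 (up to rounding).
```

Note that the right-hand side of (10) blows up as the tail part of x shrinks, which is why the case of a vanishing tail is treated separately in part (b).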

Lemma 3.3. For any nonzero w ∈ IRⁿ, we have

\[
\nabla_w \left( \frac{w}{\|w\|} \right) = \frac{1}{\|w\|} \left( I - \frac{w w^T}{\|w\|^2} \right).
\]

Proof. See [18, Lemma 3.3] or check it by direct computation. □
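The direct computation behind Lemma 3.3 can be cross-checked against central finite differences. This is a minimal plain-Python sketch of our own; the names `normalize`, `jacobian_formula`, and `jacobian_fd`, the step size, and the sample vector are illustrative assumptions.

```python
import math

def normalize(w):
    r = math.sqrt(sum(t * t for t in w))
    return [t / r for t in w]

def jacobian_formula(w):
    """(1/||w||) (I - w w^T / ||w||^2), the right-hand side of Lemma 3.3."""
    n = len(w)
    r2 = sum(t * t for t in w)
    r = math.sqrt(r2)
    return [[((1.0 if i == j else 0.0) - w[i] * w[j] / r2) / r
             for j in range(n)] for i in range(n)]

def jacobian_fd(w, eps=1e-6):
    """Central finite differences; column j approximates d(w/||w||)/dw_j."""
    n = len(w)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        wp = list(w); wp[j] += eps
        wm = list(w); wm[j] -= eps
        fp, fm = normalize(wp), normalize(wm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * eps)
    return J

w = [1.0, -2.0, 0.5]
A, B = jacobian_formula(w), jacobian_fd(w)
max_err = max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))
```

The matrix is symmetric and annihilates w itself, reflecting that w/‖w‖ does not change when w is rescaled.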

Now, we are ready to present our first main result, which relates the continuity of f and f^c.

Theorem 3.1. For any f : IR → IR, f^c is continuous at x ∈ IRⁿ with spectral values λ₁, λ₂ if and only if f is continuous at λ₁, λ₂.

Proof. “⇐” Suppose f is continuous at λ₁, λ₂. For any fixed x = (x₁, x₂) ∈ IR × IRⁿ⁻¹ and y → x, let the spectral factorizations of x, y be x = λ₁u⁽¹⁾ + λ₂u⁽²⁾ and y = m₁v⁽¹⁾ + m₂v⁽²⁾, respectively. Then, we discuss two cases.

Case (i): If x₂ ≠ 0, then we have

\[
\begin{aligned}
f^c(y) - f^c(x)
&= f(m_1)\left[ v^{(1)} - u^{(1)} \right] + \left[ f(m_1) - f(\lambda_1) \right] u^{(1)} \\
&\quad + f(m_2)\left[ v^{(2)} - u^{(2)} \right] + \left[ f(m_2) - f(\lambda_2) \right] u^{(2)}.
\end{aligned}
\tag{11}
\]

Since f is continuous at λ₁, λ₂ and, from Lemma 3.1, |m_i − λ_i| ≤ c‖y − x‖, we know f(m_i) → f(λ_i) as y → x. In addition, by Lemma 3.2, we have ‖v⁽ⁱ⁾ − u⁽ⁱ⁾‖ → 0 as y → x. Thus, equation (11) yields f^c(y) → f^c(x) as y → x because both f(m_i) and ‖u⁽ⁱ⁾‖ are bounded. Hence, f^c is continuous at x ∈ IRⁿ.

Case (ii): If x₂ = 0, then whether or not y₂ is zero, we can arrange that x and y have the same spectral vectors. Thus, f^c(y) − f^c(x) = [f(m₁) − f(λ₁)]u⁽¹⁾ + [f(m₂) − f(λ₂)]u⁽²⁾. Then, f^c is continuous at x ∈ IRⁿ by similar arguments.

“⇒” The proof for this direction is straightforward; alternatively, refer to similar arguments for [4, Prop. 2]. □

Theorem 3.2. For any f : IR → IR, f^c is directionally differentiable at x ∈ IRⁿ with spectral values λ₁, λ₂ if and only if f is directionally differentiable at λ₁, λ₂.

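Proof. “⇐” Suppose f is directionally differentiable at λ₁, λ₂. Fix any x = (x₁, x₂) ∈ IR × IRⁿ⁻¹; we discuss two cases.

Case (i): If x₂ ≠ 0, we have f^c(x) = f(λ₁)u⁽¹⁾ + f(λ₂)u⁽²⁾, where λ_i = x₁ + (−1)ⁱ(tan θ)^{(−1)ⁱ}‖x₂‖ for i = 1, 2 and, by (7),

\[
u^{(1)} = \left( \sin^2\theta,\ -\sin\theta\cos\theta\, \frac{x_2}{\|x_2\|} \right), \qquad
u^{(2)} = \left( \cos^2\theta,\ \sin\theta\cos\theta\, \frac{x_2}{\|x_2\|} \right).
\]

From Lemma 3.3, we know that u⁽ⁱ⁾ is Fréchet-differentiable with respect to x, with

\[
\nabla_x u^{(i)} = (-1)^i\, \frac{\sin\theta\cos\theta}{\|x_2\|}
\begin{bmatrix} 0 & 0 \\ 0 & I - \dfrac{x_2 x_2^T}{\|x_2\|^2} \end{bmatrix}
\qquad \forall i = 1, 2. \tag{12}
\]

Also, by the expression of λ_i, we know that λ_i is Fréchet-differentiable with respect to x, with

\[
\nabla_x \lambda_i = \left( 1,\ (-1)^i (\tan\theta)^{(-1)^i}\, \frac{x_2^T}{\|x_2\|} \right)
\qquad \forall i = 1, 2. \tag{13}
\]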

In general, the chain rule cannot be applied when functions are only directionally differentiable. However, it does work when a directionally differentiable single-variable function is composed with a differentiable function. By hypothesis, f is directionally differentiable at λ₁, so it is easy to compute

\[
\lim_{t \to 0^+} \frac{f(\lambda_1 + t \times 1) - f(\lambda_1)}{t} = f'(\lambda_1; 1), \qquad
\lim_{t \to 0^+} \frac{f(\lambda_1 - t \times 1) - f(\lambda_1)}{t} = f'(\lambda_1; -1), \qquad
\lim_{t \to 0^+} \frac{f(\lambda_1 + o(t)) - f(\lambda_1)}{t} = 0.
\]

Note that the spectral value function λ₁(x) = x₁ − cot θ‖x₂‖ is differentiable when x₂ ≠ 0, which yields

\[
\lambda_1(x + th) = \lambda_1(x) + t \nabla_x \lambda_1 h + o(t).
\]


Let y := ∇_xλ₁h + o(t)/t. For the case of ∇_xλ₁h < 0, we know y < 0 when t is small. Thus,

\[
\begin{aligned}
\lim_{t \to 0^+} \frac{f(\lambda_1(x + th)) - f(\lambda_1(x))}{t}
&= \lim_{t \to 0^+} \frac{f(\lambda_1(x) + ty) - f(\lambda_1(x))}{t} \\
&= \lim_{t \to 0^+} \frac{f(\lambda_1(x) - (-ty)) - f(\lambda_1(x))}{-ty}\, (-y) \\
&= \lim_{-ty \to 0^+} \frac{f(\lambda_1(x) - (-ty)) - f(\lambda_1(x))}{-ty}\ \lim_{t \to 0^+} (-y) \\
&= f'(\lambda_1(x); -1)\, (-\nabla_x \lambda_1 h) \\
&= f'(\lambda_1(x); \nabla_x \lambda_1 h).
\end{aligned}
\]

Here the positive homogeneity of the directional derivative is used in the last equation. Similarly, for the other case of ∇_xλ₁h ≥ 0, we have

\[
\lim_{t \to 0^+} \frac{f(\lambda_1(x + th)) - f(\lambda_1(x))}{t} = f'(\lambda_1(x); \nabla_x \lambda_1 h).
\]

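In summary, the composite function f ∘ λ₁(·) is directionally differentiable at x. Now we can apply the chain rule and the product rule to f^c(x) = f(λ₁)u⁽¹⁾ + f(λ₂)u⁽²⁾. In other words,

\[
\begin{aligned}
(f^c)'(x; h)
&= f(\lambda_1) \nabla_x u^{(1)} h + f'(\lambda_1; \nabla_x \lambda_1 h)\, u^{(1)} + f(\lambda_2) \nabla_x u^{(2)} h + f'(\lambda_2; \nabla_x \lambda_2 h)\, u^{(2)} \\
&= (A_1, A_2) \in \mathbb{R} \times \mathbb{R}^{n-1},
\end{aligned}
\]

where

\[
A_1 = f'\!\left( \lambda_1;\ h_1 - \cot\theta\, \frac{x_2^T h_2}{\|x_2\|} \right) \sin^2\theta
+ f'\!\left( \lambda_2;\ h_1 + \tan\theta\, \frac{x_2^T h_2}{\|x_2\|} \right) \cos^2\theta \tag{14}
\]

and

\[
\begin{aligned}
A_2 &= \left[ f'\!\left( \lambda_2;\ h_1 + \tan\theta\, \frac{x_2^T h_2}{\|x_2\|} \right)
- f'\!\left( \lambda_1;\ h_1 - \cot\theta\, \frac{x_2^T h_2}{\|x_2\|} \right) \right] \sin\theta\cos\theta\, \frac{x_2}{\|x_2\|} \\
&\quad + \frac{f(\lambda_2) - f(\lambda_1)}{\lambda_2 - \lambda_1} \left( I - \frac{x_2 x_2^T}{\|x_2\|^2} \right) h_2,
\end{aligned}
\tag{15}
\]

with h = (h₁, h₂) ∈ IR × IRⁿ⁻¹.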

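Now, applying equations (12) and (13) and using the fact that λ₂ − λ₁ = ‖x₂‖/(sin θ cos θ) in the A₂ term, we see that (f^c)′(x; h) can be rewritten in a more compact form:

\[
\begin{aligned}
(f^c)'(x; h)
&= f'\!\left( \lambda_1;\ h_1 - \cot\theta\, \frac{x_2^T h_2}{\|x_2\|} \right) u^{(1)}
+ f'\!\left( \lambda_2;\ h_1 + \tan\theta\, \frac{x_2^T h_2}{\|x_2\|} \right) u^{(2)} \\
&\quad + \frac{f(\lambda_2) - f(\lambda_1)}{\lambda_2 - \lambda_1}
\begin{pmatrix} 0 \\ \left( I - \dfrac{x_2 x_2^T}{\|x_2\|^2} \right) h_2 \end{pmatrix}.
\end{aligned}
\tag{16}
\]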


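Case (ii): If x₂ = 0, we compute the directional derivative (f^c)′(x; h) at x for any direction h by definition. Let h = (h₁, h₂) ∈ IR × IRⁿ⁻¹. We have two subcases.

First, consider the subcase of h₂ ≠ 0. From the spectral factorization, we can choose

\[
u^{(1)} = \left( \sin^2\theta,\ -\sin\theta\cos\theta\, \frac{h_2}{\|h_2\|} \right)
\quad \text{and} \quad
u^{(2)} = \left( \cos^2\theta,\ \sin\theta\cos\theta\, \frac{h_2}{\|h_2\|} \right)
\]

such that

\[
\begin{aligned}
f^c(x + th) &= f(\lambda + \Delta\lambda_1)\, u^{(1)} + f(\lambda + \Delta\lambda_2)\, u^{(2)}, \\
f^c(x) &= f(\lambda)\, u^{(1)} + f(\lambda)\, u^{(2)},
\end{aligned}
\]

where λ = x₁ and Δλ_i = t(h₁ + (−1)ⁱ(tan θ)^{(−1)ⁱ}‖h₂‖) for i = 1, 2. Thus, we obtain

\[
f^c(x + th) - f^c(x) = \left[ f(\lambda + \Delta\lambda_1) - f(\lambda) \right] u^{(1)} + \left[ f(\lambda + \Delta\lambda_2) - f(\lambda) \right] u^{(2)}.
\]

Using the facts

\[
\begin{aligned}
\lim_{t \to 0^+} \frac{f(\lambda + \Delta\lambda_1) - f(\lambda)}{t}
&= \lim_{t \to 0^+} \frac{f(\lambda + t(h_1 - \cot\theta \|h_2\|)) - f(\lambda)}{t}
= f'(\lambda;\ h_1 - \cot\theta \|h_2\|), \\
\lim_{t \to 0^+} \frac{f(\lambda + \Delta\lambda_2) - f(\lambda)}{t}
&= \lim_{t \to 0^+} \frac{f(\lambda + t(h_1 + \tan\theta \|h_2\|)) - f(\lambda)}{t}
= f'(\lambda;\ h_1 + \tan\theta \|h_2\|)
\end{aligned}
\]

yields

\[
\begin{aligned}
\lim_{t \to 0^+} \frac{f^c(x + th) - f^c(x)}{t}
&= \lim_{t \to 0^+} \frac{f(\lambda + \Delta\lambda_1) - f(\lambda)}{t}\, u^{(1)}
+ \lim_{t \to 0^+} \frac{f(\lambda + \Delta\lambda_2) - f(\lambda)}{t}\, u^{(2)} \\
&= f'(\lambda;\ h_1 - \cot\theta \|h_2\|)\, u^{(1)} + f'(\lambda;\ h_1 + \tan\theta \|h_2\|)\, u^{(2)},
\end{aligned}
\tag{17}
\]

which says (f^c)′(x; h) exists.

Secondly, for the subcase of h₂ = 0, the same arguments apply except that h₂/‖h₂‖ is replaced by any w ∈ IRⁿ⁻¹ with ‖w‖ = 1, i.e., we choose u⁽¹⁾ = (sin²θ, −sin θ cos θ w) and u⁽²⁾ = (cos²θ, sin θ cos θ w). Analogously, we obtain

\[
\lim_{t \to 0^+} \frac{f^c(x + th) - f^c(x)}{t} = f'(\lambda; h_1)\, u^{(1)} + f'(\lambda; h_1)\, u^{(2)}, \tag{18}
\]

which implies that (f^c)′(x; h) exists and has the form (18). Altogether, this shows that f^c is directionally differentiable at x when x₂ = 0, and its directional derivative (f^c)′(x; h) is given by (17) or (18).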

“⇒” Suppose f^c is directionally differentiable at x ∈ IRⁿ with spectral values λ₁, λ₂. We will prove that f is directionally differentiable at λ₁, λ₂. For λ₁ ∈ IR and any direction d₁ ∈ IR, let h := d₁u⁽¹⁾ + 0u⁽²⁾, where x = λ₁u⁽¹⁾ + λ₂u⁽²⁾. Then, x + th = (λ₁ + td₁)u⁽¹⁾ + λ₂u⁽²⁾ and

\[
\frac{f^c(x + th) - f^c(x)}{t} = \frac{f(\lambda_1 + t d_1) - f(\lambda_1)}{t}\, u^{(1)}.
\]

Since f^c is directionally differentiable at x, the above equation implies that

\[
f'(\lambda_1; d_1) = \lim_{t \to 0^+} \frac{f(\lambda_1 + t d_1) - f(\lambda_1)}{t} \quad \text{exists}.
\]

This means f is directionally differentiable at λ₁. Similarly, f is also directionally differentiable at λ₂. □
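To make the directional derivative formula (16) concrete, the sketch below evaluates it for the nonsmooth choice f(t) = |t| at a point whose first spectral value sits exactly at the kink, and compares it with a one-sided difference quotient. Everything here is our own illustration in plain Python: the helper names, the test point, the direction h, and the step t are assumptions, and `abs_dir` encodes the elementary directional derivative of the absolute value.

```python
import math

def spectral(z, theta):
    """Spectral values and vectors of z w.r.t. L_theta; assumes a nonzero tail z2."""
    r = math.sqrt(sum(t * t for t in z[1:]))
    w = [t / r for t in z[1:]]
    s, c = math.sin(theta), math.cos(theta)
    u1 = [s * s] + [-s * c * t for t in w]
    u2 = [c * c] + [s * c * t for t in w]
    return z[0] - r * c / s, z[0] + r * s / c, u1, u2

def f_c(f, z, theta):
    lam1, lam2, u1, u2 = spectral(z, theta)
    return [f(lam1) * p + f(lam2) * q for p, q in zip(u1, u2)]

def abs_dir(lam, d, tol=1e-12):
    """Directional derivative of t -> |t| at lam in direction d."""
    if lam > tol:
        return d
    if lam < -tol:
        return -d
    return abs(d)

def dir_derivative_16(x, h, theta):
    """Right-hand side of (16) for f = abs, with x2 != 0."""
    lam1, lam2, u1, u2 = spectral(x, theta)
    x2, h2 = x[1:], h[1:]
    r = math.sqrt(sum(t * t for t in x2))
    p = sum(a * b for a, b in zip(x2, h2)) / r           # x2^T h2 / ||x2||
    d1 = abs_dir(lam1, h[0] - p / math.tan(theta))
    d2 = abs_dir(lam2, h[0] + p * math.tan(theta))
    a = (abs(lam2) - abs(lam1)) / (lam2 - lam1)
    proj = [hb - p * xb / r for xb, hb in zip(x2, h2)]   # (I - x2 x2^T/||x2||^2) h2
    tail = [0.0] + [a * t for t in proj]
    return [d1 * s + d2 * q + e for s, q, e in zip(u1, u2, tail)]

theta = math.pi / 3
x = [math.cos(theta) / math.sin(theta), 1.0, 0.0]   # chosen so that lam1(x) = 0
h = [-0.3, 0.7, 0.4]
t = 1e-6
x_t = [xi + t * hi for xi, hi in zip(x, h)]
fd = [(p - q) / t for p, q in zip(f_c(abs, x_t, theta), f_c(abs, x, theta))]
formula = dir_derivative_16(x, h, theta)
err = max(abs(p - q) for p, q in zip(fd, formula))
```

Because the difference quotient is one-sided, agreement is only expected up to an error of order t; replacing h by −h gives a value that is not the negative of the first, which is the expected signature of a directional derivative at a nondifferentiable point.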

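Theorem 3.3. For any f : IR → IR, f^c is differentiable at x = (x₁, x₂) ∈ IR × IRⁿ⁻¹ with spectral values λ₁, λ₂ if and only if f is differentiable at λ₁, λ₂. Moreover, for given h = (h₁, h₂) ∈ IR × IRⁿ⁻¹, we have

\[
\nabla f^c(x) h =
\begin{bmatrix}
b & \dfrac{c\, x_2^T}{\|x_2\|} \\[3mm]
\dfrac{c\, x_2}{\|x_2\|} & aI + (\bar{b} - a) \dfrac{x_2 x_2^T}{\|x_2\|^2}
\end{bmatrix}
\begin{bmatrix} h_1 \\ h_2 \end{bmatrix},
\qquad \text{when } x_2 \neq 0,
\]

where

\[
\begin{aligned}
a &= \frac{f(\lambda_2) - f(\lambda_1)}{\lambda_2 - \lambda_1}, \\
b &= f'(\lambda_1) \sin^2\theta + f'(\lambda_2) \cos^2\theta, \\
\bar{b} &= f'(\lambda_1) \cos^2\theta + f'(\lambda_2) \sin^2\theta, \\
c &= \left[ f'(\lambda_2) - f'(\lambda_1) \right] \sin\theta\cos\theta.
\end{aligned}
\]

When x₂ = 0, ∇f^c(x) = f′(λ)I with λ = x₁.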

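Proof. “⇐” The proof of this direction is identical to the one given for Theorem 3.2, in which only “directionally differentiable” needs to be replaced by “differentiable”.

Since f is differentiable at λ₁ and λ₂, we have that f′(λ₁; ·) and f′(λ₂; ·) are linear, which means f′(λ_i; a + b) = f′(λ_i)a + f′(λ_i)b. This together with equations (14) and (15) yields

\[
\begin{aligned}
A_1 &= f'\!\left( \lambda_1;\ h_1 - \cot\theta\, \frac{x_2^T h_2}{\|x_2\|} \right) \sin^2\theta
+ f'\!\left( \lambda_2;\ h_1 + \tan\theta\, \frac{x_2^T h_2}{\|x_2\|} \right) \cos^2\theta \\
&= f'(\lambda_1) h_1 \sin^2\theta - f'(\lambda_1) \cot\theta\, \frac{x_2^T h_2}{\|x_2\|} \sin^2\theta
+ f'(\lambda_2) h_1 \cos^2\theta + f'(\lambda_2) \tan\theta\, \frac{x_2^T h_2}{\|x_2\|} \cos^2\theta \\
&= \left[ f'(\lambda_1) \sin^2\theta + f'(\lambda_2) \cos^2\theta \right] h_1
+ \left[ f'(\lambda_2) - f'(\lambda_1) \right] \sin\theta\cos\theta\, \frac{x_2^T}{\|x_2\|} h_2
\end{aligned}
\]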


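and

\[
\begin{aligned}
A_2 &= \left[ f'\!\left( \lambda_2;\ h_1 + \tan\theta\, \frac{x_2^T h_2}{\|x_2\|} \right)
- f'\!\left( \lambda_1;\ h_1 - \cot\theta\, \frac{x_2^T h_2}{\|x_2\|} \right) \right] \sin\theta\cos\theta\, \frac{x_2}{\|x_2\|}
+ \frac{f(\lambda_2) - f(\lambda_1)}{\lambda_2 - \lambda_1} \left( I - \frac{x_2 x_2^T}{\|x_2\|^2} \right) h_2 \qquad (19) \\
&= \left[ f'(\lambda_2) h_1 - f'(\lambda_1) h_1 + f'(\lambda_2) \tan\theta\, \frac{x_2^T h_2}{\|x_2\|} + f'(\lambda_1) \cot\theta\, \frac{x_2^T h_2}{\|x_2\|} \right] \sin\theta\cos\theta\, \frac{x_2}{\|x_2\|}
+ \frac{f(\lambda_2) - f(\lambda_1)}{\lambda_2 - \lambda_1} \left( I - \frac{x_2 x_2^T}{\|x_2\|^2} \right) h_2 \\
&= \left[ f'(\lambda_2) - f'(\lambda_1) \right] \sin\theta\cos\theta\, \frac{x_2}{\|x_2\|} h_1
+ \left[ f'(\lambda_2) \sin^2\theta + f'(\lambda_1) \cos^2\theta \right] \frac{x_2 x_2^T}{\|x_2\|^2} h_2
+ \frac{f(\lambda_2) - f(\lambda_1)}{\lambda_2 - \lambda_1} \left( I - \frac{x_2 x_2^T}{\|x_2\|^2} \right) h_2.
\end{aligned}
\]

Thus, for x₂ ≠ 0, we have

\[
\nabla f^c(x) h =
\begin{bmatrix}
b & \dfrac{c\, x_2^T}{\|x_2\|} \\[3mm]
\dfrac{c\, x_2}{\|x_2\|} & aI + (\bar{b} - a) \dfrac{x_2 x_2^T}{\|x_2\|^2}
\end{bmatrix}
\begin{bmatrix} h_1 \\ h_2 \end{bmatrix}
\tag{20}
\]

with

\[
\begin{aligned}
a &= \frac{f(\lambda_2) - f(\lambda_1)}{\lambda_2 - \lambda_1}, \\
b &= f'(\lambda_1) \sin^2\theta + f'(\lambda_2) \cos^2\theta, \\
\bar{b} &= f'(\lambda_1) \cos^2\theta + f'(\lambda_2) \sin^2\theta, \\
c &= \left[ f'(\lambda_2) - f'(\lambda_1) \right] \sin\theta\cos\theta.
\end{aligned}
\tag{21}
\]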

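From equation (16), ∇f^c(x)h can also be recast in a more compact form:

\[
\begin{aligned}
\nabla f^c(x) h
&= f'(\lambda_1) \left( h_1 - \cot\theta\, \frac{x_2^T h_2}{\|x_2\|} \right) u^{(1)}
+ f'(\lambda_2) \left( h_1 + \tan\theta\, \frac{x_2^T h_2}{\|x_2\|} \right) u^{(2)} \\
&\quad + \frac{f(\lambda_2) - f(\lambda_1)}{\lambda_2 - \lambda_1}
\begin{pmatrix} 0 \\ \left( I - \dfrac{x_2 x_2^T}{\|x_2\|^2} \right) h_2 \end{pmatrix}.
\end{aligned}
\tag{22}
\]

For the case of x₂ = 0, with the linearity of f′(λ; ·) and equations (17) and (18), we have

\[
\nabla f^c(x) = f'(\lambda) I, \tag{23}
\]

where λ = λ₁ = λ₂ = x₁.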

“⇒” Let f^c be Fréchet-differentiable at x ∈ IRⁿ with spectral values λ₁, λ₂. We will show that f is Fréchet-differentiable at λ₁, λ₂. Suppose not; then f is not Fréchet-differentiable at λ_i for some i ∈ {1, 2}. Thus, either the right- and left-directional derivatives of f at λ_i are unequal or one of them does not exist. In either case, this implies that there exist two sequences of non-zero scalars t^ν and τ^ν, ν = 1, 2, . . . , converging to zero such that the limits

\[
\lim_{\nu \to \infty} \frac{f(\lambda_i + t^\nu) - f(\lambda_i)}{t^\nu}, \qquad
\lim_{\nu \to \infty} \frac{f(\lambda_i + \tau^\nu) - f(\lambda_i)}{\tau^\nu}
\]

either are unequal or one of them does not exist. Without loss of generality, take i = 1. Now for any x = λ₁u⁽¹⁾ + λ₂u⁽²⁾, let h := 1 · u⁽¹⁾ + 0 · u⁽²⁾ = u⁽¹⁾. Then, we know x + th = (λ₁ + t)u⁽¹⁾ + λ₂u⁽²⁾ and f^c(x + th) = f(λ₁ + t)u⁽¹⁾ + f(λ₂)u⁽²⁾, which give

\[
\begin{aligned}
\lim_{\nu \to \infty} \frac{f^c(x + t^\nu h) - f^c(x)}{t^\nu}
&= \lim_{\nu \to \infty} \frac{f(\lambda_1 + t^\nu) - f(\lambda_1)}{t^\nu}\, u^{(1)}, \\
\lim_{\nu \to \infty} \frac{f^c(x + \tau^\nu h) - f^c(x)}{\tau^\nu}
&= \lim_{\nu \to \infty} \frac{f(\lambda_1 + \tau^\nu) - f(\lambda_1)}{\tau^\nu}\, u^{(1)}.
\end{aligned}
\]

It follows that these two limits either are unequal or one of them does not exist. This implies that f^c is not Fréchet-differentiable at x, which is a contradiction. □
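The Jacobian formula (20)–(21) can be compared against central finite differences of f^c for a smooth test function. The plain-Python sketch below is our own illustration; the helper names, the choice f(t) = t³, the test point, the angle, and the step size are all assumptions made for the example.

```python
import math

def spectral(z, theta):
    """Spectral values and vectors of z w.r.t. L_theta; needs n >= 2."""
    r = math.sqrt(sum(t * t for t in z[1:]))
    # For z2 = 0 any unit vector w is admissible; e1 is an arbitrary choice.
    w = [t / r for t in z[1:]] if r > 0 else [1.0] + [0.0] * (len(z) - 2)
    s, c = math.sin(theta), math.cos(theta)
    u1 = [s * s] + [-s * c * t for t in w]
    u2 = [c * c] + [s * c * t for t in w]
    return z[0] - r * c / s, z[0] + r * s / c, u1, u2

def f_c(f, z, theta):
    lam1, lam2, u1, u2 = spectral(z, theta)
    return [f(lam1) * p + f(lam2) * q for p, q in zip(u1, u2)]

def jacobian_formula(f, df, x, theta):
    """Matrix in (20) with coefficients (21); assumes x2 != 0."""
    lam1, lam2, _, _ = spectral(x, theta)
    r = math.sqrt(sum(t * t for t in x[1:]))
    w = [t / r for t in x[1:]]
    s, c = math.sin(theta), math.cos(theta)
    a = (f(lam2) - f(lam1)) / (lam2 - lam1)
    b = df(lam1) * s * s + df(lam2) * c * c
    b_bar = df(lam1) * c * c + df(lam2) * s * s
    cc = (df(lam2) - df(lam1)) * s * c
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    J[0][0] = b
    for i in range(1, n):
        J[0][i] = J[i][0] = cc * w[i - 1]
        for j in range(1, n):
            J[i][j] = (a if i == j else 0.0) + (b_bar - a) * w[i - 1] * w[j - 1]
    return J

def jacobian_fd(f, x, theta, eps=1e-6):
    """Central finite-difference approximation of the Jacobian of f_c."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp = list(x); xp[j] += eps
        xm = list(x); xm[j] -= eps
        fp, fm = f_c(f, xp, theta), f_c(f, xm, theta)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * eps)
    return J

f = lambda t: t ** 3
df = lambda t: 3.0 * t ** 2
x, theta = [0.5, 1.0, -2.0, 0.3], math.pi / 5
A = jacobian_formula(f, df, x, theta)
B = jacobian_fd(f, x, theta)
max_err = max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))
```

The matrix in (20) is symmetric by construction, so asymmetry in the finite-difference estimate is a quick indicator of a step size that is too large or too small.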

Theorem 3.4. For any f : IR → IR, f^c is continuously differentiable (smooth) at x ∈ IRⁿ with spectral values λ₁, λ₂ if and only if f is continuously differentiable (smooth) at λ₁, λ₂.

Proof. “⇐” Suppose f is continuously differentiable. From equation (20), it can be seen that ∇f^c is continuous at every x with x₂ ≠ 0. It remains to show that ∇f^c is continuous at every x with x₂ = 0. Fix any x = (x₁, 0) ∈ IRⁿ, which says λ₁ = λ₂ = x₁. Let y^ν = (y₁^ν, y₂^ν) ∈ IR × IRⁿ⁻¹ be any sequence converging to x. For those y^ν with y₂^ν = 0, applying equation (23) gives ∇f^c(y^ν) = f′(λ(y^ν))I. Suppose y₂^ν ≠ 0. From equation (21), we have

\[
\begin{aligned}
\lim_{y^\nu \to x,\ y_2^\nu \neq 0} a
&= \lim_{y^\nu \to x,\ y_2^\nu \neq 0} \frac{f(\lambda_2(y^\nu)) - f(\lambda_1(y^\nu))}{\lambda_2(y^\nu) - \lambda_1(y^\nu)} = f'(x_1), \\
\lim_{y^\nu \to x,\ y_2^\nu \neq 0} b
&= \lim_{y^\nu \to x,\ y_2^\nu \neq 0} \left[ f'(\lambda_1(y^\nu)) \sin^2\theta + f'(\lambda_2(y^\nu)) \cos^2\theta \right] = f'(x_1), \\
\lim_{y^\nu \to x,\ y_2^\nu \neq 0} c\, \frac{y_2^\nu}{\|y_2^\nu\|}
&= \lim_{y^\nu \to x,\ y_2^\nu \neq 0} \sin\theta\cos\theta \left[ f'(\lambda_2(y^\nu)) - f'(\lambda_1(y^\nu)) \right] \frac{y_2^\nu}{\|y_2^\nu\|} = 0, \\
\lim_{y^\nu \to x,\ y_2^\nu \neq 0} (\bar{b} - a) \frac{y_2^\nu (y_2^\nu)^T}{\|y_2^\nu\|^2}
&= \lim_{y^\nu \to x,\ y_2^\nu \neq 0} \left[ f'(\lambda_1(y^\nu)) \cos^2\theta + f'(\lambda_2(y^\nu)) \sin^2\theta - \frac{f(\lambda_2(y^\nu)) - f(\lambda_1(y^\nu))}{\lambda_2(y^\nu) - \lambda_1(y^\nu)} \right] \frac{y_2^\nu (y_2^\nu)^T}{\|y_2^\nu\|^2} = 0.
\end{aligned}
\]

Using the facts that both y₂^ν/‖y₂^ν‖ and y₂^ν(y₂^ν)ᵀ/‖y₂^ν‖² are bounded by 1 and then taking the limit in (20) as y → x yields

\[
\lim_{y \to x} \nabla f^c(y) = f'(x_1) I = \nabla f^c(x).
\]

This says ∇f^c is continuous at every x ∈ IRⁿ.


“⇒” The proof for this direction is similar to the one for [4, Prop. 5], so we omit it. 2

Next, we turn to the property of (local) Lipschitz continuity. To this end, we need the following result from [10, Theorem 9.67].

Lemma 3.4. [10, Theorem 9.67] Suppose f : IRⁿ → IR is strictly continuous. Then, there exist continuously differentiable functions f^ν : IRⁿ → IR, ν = 1, 2, · · · , converging uniformly to f on any compact set C in IRⁿ and satisfying

\[
\|\nabla f^\nu(x)\| \le \sup_{y \in C} \operatorname{Lip} f(y) \qquad \forall x \in C,\ \nu = 1, 2, 3, \cdots
\]

where

\[
\operatorname{Lip} f(x) := \limsup_{y, z \to x,\ y \neq z} \frac{\|f(y) - f(z)\|}{\|y - z\|}.
\]

Theorem 3.5. For any f : IR → IR, the following results hold:

(a) f^c is strictly continuous at x ∈ IRⁿ with spectral values λ₁, λ₂ if and only if f is strictly continuous at λ₁, λ₂.

(b) f^c is Lipschitz continuous (with respect to ‖ · ‖) with constant κ if and only if f is Lipschitz continuous with constant κ.

Proof. (a) “⇐” Fix any x ∈ IRⁿ with spectral values λ₁ and λ₂ given by (6). Suppose f is strictly continuous at λ₁ and λ₂. Then, there exist κ_i > 0 and δ_i > 0 for i = 1, 2 such that

\[
|f(b) - f(a)| \le \kappa_i |b - a|, \qquad \forall a, b \in [\lambda_i - \delta_i, \lambda_i + \delta_i], \quad i = 1, 2.
\]

Let δ̄ := min{δ₁, δ₂} and C := [λ₁ − δ̄, λ₁ + δ̄] ∪ [λ₂ − δ̄, λ₂ + δ̄]. Define a real-valued function f̄ : IR → IR as

\[
\bar{f}(a) =
\begin{cases}
f(a) & \text{if } a \in C, \\
(1 - t) f(\lambda_1 + \bar\delta) + t f(\lambda_2 - \bar\delta) & \text{if } \lambda_1 + \bar\delta < \lambda_2 - \bar\delta \text{ and, for some } t \in (0, 1),\ a = (1 - t)(\lambda_1 + \bar\delta) + t(\lambda_2 - \bar\delta), \\
f(\lambda_1 - \bar\delta) & \text{if } a < \lambda_1 - \bar\delta, \\
f(\lambda_2 + \bar\delta) & \text{if } a > \lambda_2 + \bar\delta.
\end{cases}
\]

From the above, we know that f̄ is Lipschitz continuous, which means there exists a scalar κ > 0 such that Lip f̄(a) ≤ κ for all a ∈ IR. Since C is compact, by Lemma 3.4, there exist continuously differentiable functions f^ν : IR → IR, ν = 1, 2, · · · , converging uniformly to f̄ and satisfying

\[
|(f^\nu)'(a)| \le \kappa, \qquad \forall a \in C,\ \forall \nu.
\]


On the other hand, from Lemma 3.1, there exists a δ > 0 such that C contains all spectral values of w ∈ B(x, δ). Moreover, for any w ∈ B(x, δ) with spectral factorization w = µ₁u⁽¹⁾ + µ₂u⁽²⁾, by direct computation, we have

\[
\left\| (f^\nu)^c(w) - f^c(w) \right\|^2 = \sin^2\theta\, |f^\nu(\mu_1) - f(\mu_1)|^2 + \cos^2\theta\, |f^\nu(\mu_2) - f(\mu_2)|^2.
\]

This, together with f^ν converging uniformly to f on C, implies that (f^ν)^c converges uniformly to f^c on B(x, δ).

Next, we explain that ‖∇(f^ν)^c(w)‖ is uniformly bounded. Indeed, for w₂ = 0, from equation (23) we have ‖∇(f^ν)^c(w)‖ = |(f^ν)′(w₁)| ≤ κ. For general w₂ ≠ 0, it is not hard to check, using equation (22), that ‖∇(f^ν)^c(w)‖ ≤ M for some uniform bound M ≥ κ.

Fix any y, z ∈ B(x, δ). Since (f^ν)^c converges uniformly to f^c, for any ε > 0 there exists an integer ν₀ such that for all ν ≥ ν₀ we have

\[
\left\| (f^\nu)^c(w) - f^c(w) \right\| \le \varepsilon \|y - z\| \qquad \forall w \in B(x, \delta).
\]

Since f^ν is continuously differentiable, Theorem 3.4 implies that (f^ν)^c is also continuously differentiable. Then, by the fact that ‖∇(f^ν)^c(w)‖ is uniformly bounded by M and the Mean Value Theorem for continuously differentiable functions, we obtain

\[
\begin{aligned}
\left\| f^c(y) - f^c(z) \right\|
&= \left\| f^c(y) - (f^\nu)^c(y) + (f^\nu)^c(y) - (f^\nu)^c(z) + (f^\nu)^c(z) - f^c(z) \right\| \\
&\le \left\| f^c(y) - (f^\nu)^c(y) \right\| + \left\| (f^\nu)^c(y) - (f^\nu)^c(z) \right\| + \left\| (f^\nu)^c(z) - f^c(z) \right\| \\
&\le 2\varepsilon \|y - z\| + \left\| \int_0^1 \nabla (f^\nu)^c(z + t(y - z))(y - z)\, dt \right\| \\
&\le (M + 2\varepsilon) \|y - z\|.
\end{aligned}
\]

This shows that f^c is strictly continuous at x.

“⇒” Suppose that f^c is strictly continuous at x with spectral values λ₁ and λ₂ and spectral vectors u⁽¹⁾ and u⁽²⁾. This means there exist δ and M such that for y, z ∈ B(x, δ), we have

\[
\left\| f^c(y) - f^c(z) \right\| \le M \|y - z\|.
\]

For any i ∈ {1, 2} and any a, b ∈ [λ_i − δ, λ_i + δ], denote

\[
y := x + (a - \lambda_i)\, u^{(i)}, \qquad z := x + (b - \lambda_i)\, u^{(i)}.
\]

Then, ‖y − x‖ = |a − λ_i| ‖u⁽ⁱ⁾‖ ≤ δ and ‖z − x‖ = |b − λ_i| ‖u⁽ⁱ⁾‖ ≤ δ. Thus,

\[
|f(b) - f(a)| \cdot \left\| u^{(i)} \right\| = \left\| f^c(y) - f^c(z) \right\| \le M \|y - z\|,
\]

which says that f is strictly continuous at λ₁ and λ₂ because ‖u⁽¹⁾‖ = sin θ and ‖u⁽²⁾‖ = cos θ.

(b) This is an immediate consequence of part (a). □


4 Semismoothness Property

This section is devoted to the relationship between the semismoothness of f and that of f^c. As mentioned earlier, Lemma 2.1 will be employed frequently in our analysis.

Theorem 4.1. For any f : IR → IR, f^c is semismooth at x ∈ IRⁿ with spectral values λ₁, λ₂ if and only if f is semismooth at λ₁, λ₂.

Proof. “⇒” Suppose f^c is semismooth; then f^c is strictly continuous and directionally differentiable. By Theorem 3.2 and Theorem 3.5, f is strictly continuous and directionally differentiable. Now, for any α ∈ IR and any η ∈ IR such that f is differentiable at α + η, Theorem 3.3 yields that f^c is differentiable at x + h, where x := (α, 0) ∈ IR × IRⁿ⁻¹ and h := (η, 0) ∈ IR × IRⁿ⁻¹. Hence, we can choose the same spectral vectors for x + h = (α + η, 0) and x = (α, 0) such that

\[
f^c(x + h) = f(\alpha + \eta)\, u^{(1)} + f(\alpha + \eta)\, u^{(2)}, \qquad
f^c(x) = f(\alpha)\, u^{(1)} + f(\alpha)\, u^{(2)}.
\]

Since f^c is semismooth, by Lemma 2.1, we know

\[
f^c(x + h) - f^c(x) - \nabla f^c(x + h) h = o(\|h\|). \tag{24}
\]

On the other hand, equation (23) yields ∇f^c(x + h)h = f′(α + η)Ih = (f′(α + η)η, 0). Plugging this into equation (24) yields f(α + η) − f(α) − f′(α + η)η = o(|η|). Thus, by Lemma 2.1 again, it follows that f is semismooth at α. Since α is arbitrary, f is semismooth.

“⇐” Suppose f is semismooth; then f is strictly continuous and directionally differentiable. By Theorem 3.2 and Theorem 3.5, f^c is strictly continuous and directionally differentiable. For any x = (x₁, x₂) ∈ IR × IRⁿ⁻¹ and h = (h₁, h₂) ∈ IR × IRⁿ⁻¹ such that f^c is differentiable at x + h, we will verify that

\[
f^c(x + h) - f^c(x) - \nabla f^c(x + h) h = o(\|h\|).
\]

Case (i): If x₂ ≠ 0, let λ_i be the spectral values of x and u⁽ⁱ⁾ be the associated spectral vectors. For convenience, we denote x + h by z, i.e., z := x + h, and let m_i be the spectral values of z with the associated spectral vectors v⁽ⁱ⁾. Hence, we have

\[
f^c(x) = f(\lambda_1)\, u^{(1)} + f(\lambda_2)\, u^{(2)}, \qquad
f^c(x + h) = f(m_1)\, v^{(1)} + f(m_2)\, v^{(2)}.
\]

Suppose now f^c is differentiable at z. From (20), we know

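\[
\nabla f^c(x + h) =
\begin{bmatrix}
b & \dfrac{c\, z_2^T}{\|z_2\|} \\[3mm]
\dfrac{c\, z_2}{\|z_2\|} & aI + (\bar{b} - a) \dfrac{z_2 z_2^T}{\|z_2\|^2}
\end{bmatrix},
\]

where

\[
\begin{aligned}
a &= \frac{f(m_2) - f(m_1)}{m_2 - m_1}, \\
b &= f'(m_1) \sin^2\theta + f'(m_2) \cos^2\theta, \\
\bar{b} &= f'(m_1) \cos^2\theta + f'(m_2) \sin^2\theta, \\
c &= \left[ f'(m_2) - f'(m_1) \right] \sin\theta\cos\theta.
\end{aligned}
\]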

With this, we can write f^c(x + h) − f^c(x) − ∇f^c(x + h)h := (Ξ₁, Ξ₂), where Ξ₁ ∈ IR and Ξ₂ ∈ IRⁿ⁻¹ denote the first and second components of the (rather long) expansion. We will show that Ξ₁ and Ξ₂ are both o(‖h‖). First, we compute the first component Ξ₁:

\[
\begin{aligned}
\Xi_1 &= \sin^2\theta \left\{ f(m_1) - f(\lambda_1) - f'(m_1) \left( h_1 - \cot\theta\, \frac{z_2^T h_2}{\|z_2\|} \right) \right\}
+ \cos^2\theta \left\{ f(m_2) - f(\lambda_2) - f'(m_2) \left( h_1 + \tan\theta\, \frac{z_2^T h_2}{\|z_2\|} \right) \right\} \\
&= \sin^2\theta \left\{ f(m_1) - f(\lambda_1) - f'(m_1) \bigl( h_1 - \cot\theta (\|z_2\| - \|x_2\|) \bigr) + o(\|h\|) \right\} \\
&\quad + \cos^2\theta \left\{ f(m_2) - f(\lambda_2) - f'(m_2) \bigl( h_1 + \tan\theta (\|z_2\| - \|x_2\|) \bigr) + o(\|h\|) \right\} \\
&= o\bigl( h_1 - \cot\theta (\|z_2\| - \|x_2\|) \bigr) + o(\|h\|) + o\bigl( h_1 + \tan\theta (\|z_2\| - \|x_2\|) \bigr) + o(\|h\|).
\end{aligned}
\]

In the above expression of Ξ₁, the replacement of z₂ᵀh₂/‖z₂‖ by ‖z₂‖ − ‖x₂‖ up to o(‖h‖) holds because

\[
\begin{aligned}
\frac{z_2^T h_2}{\|z_2\|} &= \frac{z_2^T (z_2 - x_2)}{\|z_2\|} = \|z_2\| - \frac{\|z_2\| \|x_2\|}{\|z_2\|} \cos\alpha \\
&= \|z_2\| - \|x_2\| \bigl( 1 + O(\alpha^2) \bigr) = \|z_2\| - \|x_2\| \bigl( 1 + O(\|h\|^2) \bigr) \\
&= \|z_2\| - \|x_2\| \bigl( 1 + o(\|h\|) \bigr),
\end{aligned}
\]

where α is the angle between x₂ and z₂; note that z₂ − x₂ = h₂ gives O(α²) = O(‖h‖²). In addition, the last equality in the expression of Ξ₁ holds because f is semismooth and m_i − λ_i = h₁ + (−1)ⁱ(tan θ)^{(−1)ⁱ}(‖z₂‖ − ‖x₂‖).

On the other hand, we have

\[
\left| h_1 + (-1)^i (\tan\theta)^{(-1)^i} (\|z_2\| - \|x_2\|) \right| \le |h_1| + M \|z_2 - x_2\| \le M (|h_1| + \|h_2\|),
\]

where M = max{tan θ, cot θ} ≥ 1. Then, we observe that when ‖h‖ → 0,

\[
h_1 + (-1)^i (\tan\theta)^{(-1)^i} (\|z_2\| - \|x_2\|) \to 0
\quad \text{and} \quad
h_1 + (-1)^i (\tan\theta)^{(-1)^i} (\|z_2\| - \|x_2\|) = O(\|h\|).
\]

Thus, we obtain o(h₁ + (−1)ⁱ(tan θ)^{(−1)ⁱ}(‖z₂‖ − ‖x₂‖)) = o(‖h‖), which implies that the first component Ξ₁ is o(‖h‖).