The same growth of FB-SOC merit function and NR-SOC merit function
1Shaohua Pan
School of Mathematical Sciences South China University of Technology
Guangzhou 510640, China E-mail: shhpan@scut.edu.cn
Jein-Shan Chen Department of Mathematics National Taiwan Normal University
Taipei, Taiwan 11677 E-mail: jschen@math.ntnu.edu.tw
Jing-Fan Li
Department of Mathematics National Taiwan Normal University
Taipei, Taiwan 11677 E-mail: 697400011@ntnu.edu.tw
September 6, 2009
Abstract. We establish the same growth of the Fischer-Burmeister (FB) merit function and the natural residual (NR) merit function associated with second-order cones. This extends an important result proved by Tseng in [12, Lemma 3.1] to the setting of second-order cones. In particular, using this relation, we obtain the global error bound property of the FB merit function for the second-order cone complementarity problem (SOCCP), which plays a key role in analyzing the convergence rates of algorithms based on the FB SOC complementarity function for the second-order cone program and the SOCCP.
Key Words. Fischer-Burmeister function, Natural-Residual function, merit function, second-order cone.
1A much simpler proof for this result can be seen in [1].
1 Introduction
A well-known approach for solving the nonlinear complementarity problem (NCP) is to reformulate it as a global minimization over $\mathbb{R}^n$ via a certain merit function. For the approach to be effective, the choice of the merit function is crucial. A popular choice is the Fischer-Burmeister (FB) merit function:
\[ \psi_{\rm FB}(a,b) = \frac{1}{2}\,|\phi_{\rm FB}(a,b)|^2, \]
where $\phi_{\rm FB}: \mathbb{R}\times\mathbb{R}\to\mathbb{R}$ is the FB NCP-function defined by
\[ \phi_{\rm FB}(a,b) := \sqrt{a^2+b^2} - (a+b) \quad \forall a,b\in\mathbb{R}. \]
It turns out that $\phi_{\rm FB}$ and $\psi_{\rm FB}$ have many desirable properties; see [4, 5]. Another popular choice is the natural residual (NR) merit function:
\[ \psi_{\rm NR}(a,b) = \frac{1}{2}\,|\phi_{\rm NR}(a,b)|^2, \]
where $\phi_{\rm NR}: \mathbb{R}\times\mathbb{R}\to\mathbb{R}$ is the NR NCP-function defined as
\[ \phi_{\rm NR}(a,b) := a - (a-b)_+ = \min\{a,b\} \quad \forall a,b\in\mathbb{R}. \]
The NR merit function $\psi_{\rm NR}$ is not differentiable, which is its main drawback compared with $\psi_{\rm FB}$. For the two functions, Tseng [12] proved the following important inequality:
\[ (2-\sqrt{2})\,|\phi_{\rm NR}(a,b)| \le |\phi_{\rm FB}(a,b)| \le (2+\sqrt{2})\,|\phi_{\rm NR}(a,b)| \quad \forall a,b\in\mathbb{R}, \tag{1} \]
which says that $\psi_{\rm FB}$ and $\psi_{\rm NR}$ have the same order of growth. It is this relation that accounts for the global error bound property of the FB merit function, which plays a key role in analyzing the convergence rates of algorithms based on $\phi_{\rm FB}$; see [9, 11, 13].
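The two-sided bound (1) can be sanity-checked numerically. The following is only an illustrative sketch, not part of Tseng's proof: the function names, the sampling range and the tolerance `tol` are assumptions introduced here, and a random test can support but never establish the inequality.

```python
import math
import random

def phi_fb(a, b):
    """Scalar FB NCP-function: sqrt(a^2 + b^2) - (a + b)."""
    return math.hypot(a, b) - (a + b)

def phi_nr(a, b):
    """Scalar NR NCP-function: a - (a - b)_+ = min{a, b}."""
    return min(a, b)

def check_tseng_bounds(samples=10000, seed=0, tol=1e-12):
    """Return True if inequality (1) holds (up to tol) on random samples."""
    rng = random.Random(seed)
    lo, hi = 2.0 - math.sqrt(2.0), 2.0 + math.sqrt(2.0)
    for _ in range(samples):
        a = rng.uniform(-10.0, 10.0)
        b = rng.uniform(-10.0, 10.0)
        fb, nr = abs(phi_fb(a, b)), abs(phi_nr(a, b))
        # tol absorbs floating-point rounding near the equality cases
        if not (lo * nr - tol <= fb <= hi * nr + tol):
            return False
    return True
```

The constants are attained: at $a = b = 1$ the lower bound holds with equality, and at $a = b = -1$ the upper bound does, which is why a small tolerance is used in the comparison.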
The second-order cone complementarity problem (SOCCP), as an extension of the NCP, is to find a vector $x\in\mathbb{R}^n$ such that
\[ x\in\mathcal{K}, \quad F(x)\in\mathcal{K}, \quad \langle x, F(x)\rangle = 0, \tag{2} \]
where $F:\mathbb{R}^n\to\mathbb{R}^n$ is a continuous mapping, $\langle\cdot,\cdot\rangle$ denotes the Euclidean inner product, and $\mathcal{K}$ is the Cartesian product of second-order cones (SOCs). In other words,
\[ \mathcal{K} = \mathcal{K}^{n_1}\times\mathcal{K}^{n_2}\times\cdots\times\mathcal{K}^{n_m}, \]
where $n_1,\ldots,n_m\ge 1$, $n_1+\cdots+n_m = n$, and $\mathcal{K}^{n_i}$ is the SOC in $\mathbb{R}^{n_i}$ defined by
\[ \mathcal{K}^{n_i} := \{x = (x_1,x_2)\in\mathbb{R}\times\mathbb{R}^{n_i-1} \mid x_1\ge\|x_2\|\}, \]
with $\|\cdot\|$ denoting the Euclidean norm and $\mathcal{K}^1$ being the set of nonnegative real numbers $\mathbb{R}_+$. Clearly, when $m = n$ and $n_1 = \cdots = n_m = 1$, the SOCCP (2) becomes the NCP.
The merit function approach for the NCP can be extended to the SOCCP case; see [2]. This approach aims to find a function $\psi:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}_+$ satisfying
\[ \psi(x,y) = 0 \iff x\in\mathcal{K}^n,\ y\in\mathcal{K}^n,\ \langle x,y\rangle = 0, \]
and then the SOCCP can be reformulated as an unconstrained minimization problem:
\[ \min_{x\in\mathbb{R}^n}\ \Psi(x) := \sum_{i=1}^m \psi(x_i, F_i(x)), \]
where $x_i\in\mathbb{R}^{n_i}$ and $F_i:\mathbb{R}^n\to\mathbb{R}^{n_i}$. We call such $\psi$ an SOC merit function. Analogous to the NCP case, a popular choice for $\psi$ is the FB SOC merit function
\[ \psi_{\rm FB}(x,y) := \frac{1}{2}\,\|\phi_{\rm FB}(x,y)\|^2, \tag{3} \]
where $\phi_{\rm FB}:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}^n$ is the vector-valued FB function defined by
\[ \phi_{\rm FB}(x,y) := (x^2+y^2)^{1/2} - x - y \tag{4} \]
with $x^2 = x\circ x$ denoting the Jordan product of $x$ and itself, and $x^{1/2}$ meaning the vector such that $(x^{1/2})^2 = x$. Similarly, the NR merit function can be defined in the SOC case:
\[ \psi_{\rm NR}(x,y) := \frac{1}{2}\,\|\phi_{\rm NR}(x,y)\|^2, \tag{5} \]
where $\phi_{\rm NR}:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}^n$ is the vector-valued natural residual function:
\[ \phi_{\rm NR}(x,y) := x - (x-y)_+ \tag{6} \]
with $(\cdot)_+$ denoting the projection onto the cone $\mathcal{K}^n$. The function $\phi_{\rm NR}$ was employed in [6, 7] to develop smoothing Newton methods for the SOCCP.
It has been an open question whether $\psi_{\rm FB}$ and $\psi_{\rm NR}$ have the same order of growth in the SOC case. In other words, can the inequality (1) be established for the SOC case? We answer this question affirmatively in this paper. In particular, using this result, we show that the square root of the following FB merit function
\[ \Psi_{\rm FB}(x) := \sum_{i=1}^m \psi_{\rm FB}(x_i, F_i(x)) \tag{7} \]
provides a global error bound for the solution of (2) under suitable conditions on $F$. This is a key to analyzing the convergence rates of algorithms based on $\phi_{\rm FB}$ for the SOCCP.
Throughout this paper, we denote by ${\rm bd}(\mathcal{K}^n)$ the boundary of $\mathcal{K}^n$, and by ${\rm int}(\mathcal{K}^n)$ the interior of $\mathcal{K}^n$. The notation $o(\|x^k\|^2)$ means a function satisfying $\lim_{k\to\infty} \frac{o(\|x^k\|^2)}{\|x^k\|^2} = 0$, and $O(\|x^k\|^2)$ denotes a function satisfying $|O(\|x^k\|^2)| \le C\|x^k\|^2$ for some constant $C > 0$.
2 Preliminaries
For any $x = (x_1,x_2)$, $y = (y_1,y_2)\in\mathbb{R}\times\mathbb{R}^{n-1}$, their Jordan product [3] is defined as
\[ x\circ y := (\langle x,y\rangle,\ x_1y_2 + y_1x_2), \]
which, unlike scalar or matrix multiplication, is not associative in general. The identity element under this product is $e := (1,0,\ldots,0)^T\in\mathbb{R}^n$, i.e., $e\circ x = x$ for any $x\in\mathbb{R}^n$. We recall from [6] that each $x = (x_1,x_2)\in\mathbb{R}\times\mathbb{R}^{n-1}$ admits a spectral factorization, associated with $\mathcal{K}^n$, of the form
\[ x = \lambda_1(x)\,u_x^{(1)} + \lambda_2(x)\,u_x^{(2)}, \tag{8} \]
where $\lambda_i(x)$ and $u_x^{(i)}$ for $i = 1,2$ are the spectral values and the associated spectral vectors of $x$, with respect to $\mathcal{K}^n$, given by
\[ \lambda_i(x) := x_1 + (-1)^i\|x_2\|, \qquad u_x^{(i)} := \frac{1}{2}\big(1,\ (-1)^i\bar{x}_2\big), \tag{9} \]
with $\bar{x}_2 = \frac{x_2}{\|x_2\|}$ if $x_2\ne 0$ and otherwise being any vector in $\mathbb{R}^{n-1}$ satisfying $\|\bar{x}_2\| = 1$.
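The spectral factorization (8)-(9) is what makes the SOC square root in (4) and the projection in (6) computable: one applies a scalar function to the two spectral values and recombines with the spectral vectors. The sketch below illustrates this under stated assumptions. All function names are hypothetical, vectors are plain Python lists, and small negative spectral values caused by rounding are clamped to zero inside the square root.

```python
import math

def norm(v):
    return math.sqrt(sum(t * t for t in v))

def jordan(x, y):
    """Jordan product x o y = (<x, y>, x1*y2 + y1*x2)."""
    head = sum(a * b for a, b in zip(x, y))
    tail = [x[0] * b + y[0] * a for a, b in zip(x[1:], y[1:])]
    return [head] + tail

def spectral(x):
    """Spectral values and vectors of x with respect to K^n, as in (8)-(9)."""
    r = norm(x[1:])
    if r > 0:
        d = [t / r for t in x[1:]]
    elif len(x) > 1:
        d = [1.0] + [0.0] * (len(x) - 2)  # any unit vector is allowed when x2 = 0
    else:
        d = []                            # n = 1: the cone is the half-line R_+
    u1 = [0.5] + [-0.5 * t for t in d]
    u2 = [0.5] + [0.5 * t for t in d]
    return x[0] - r, x[0] + r, u1, u2

def spectral_apply(x, f):
    """Return f(lambda_1) u^(1) + f(lambda_2) u^(2)."""
    lam1, lam2, u1, u2 = spectral(x)
    return [f(lam1) * a + f(lam2) * b for a, b in zip(u1, u2)]

def soc_sqrt(w):
    """w^{1/2} for w in K^n (negative rounding noise is clamped to 0)."""
    return spectral_apply(w, lambda lam: math.sqrt(max(lam, 0.0)))

def soc_proj(z):
    """Projection (z)_+ onto K^n."""
    return spectral_apply(z, lambda lam: max(lam, 0.0))

def phi_fb(x, y):
    """Vector-valued FB function (4)."""
    w = [a + b for a, b in zip(jordan(x, x), jordan(y, y))]
    return [s - a - b for s, a, b in zip(soc_sqrt(w), x, y)]

def phi_nr(x, y):
    """Vector-valued NR function (6)."""
    p = soc_proj([a - b for a, b in zip(x, y)])
    return [a - q for a, q in zip(x, p)]
```

For instance, with $x = (1,1,0)$ and $y = (1,-1,0)$, both on the boundary of $\mathcal{K}^3$ and orthogonal to each other, `phi_fb(x, y)` and `phi_nr(x, y)` both return the zero vector, as expected for a complementary pair.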
The following lemmas are used in the subsequent analysis; see [2] for their proofs.
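Lemma 2.1 For any $x = (x_1,x_2)$, $y = (y_1,y_2)\in\mathbb{R}\times\mathbb{R}^{n-1}$, if $x^2+y^2\in{\rm bd}(\mathcal{K}^n)$, then
\[ x_1^2 = \|x_2\|^2, \quad y_1^2 = \|y_2\|^2, \quad x_1y_1 = x_2^Ty_2, \quad x_1y_2 = y_1x_2. \]
Lemma 2.2 Let $\phi_{\rm FB}$ and $\phi_{\rm NR}$ be defined by (4) and (6), respectively. Then,
\[ \phi_{\rm FB}(x,y) = 0 \iff \phi_{\rm NR}(x,y) = 0 \iff x\in\mathcal{K}^n,\ y\in\mathcal{K}^n,\ \langle x,y\rangle = 0. \]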
3 Main result
In this section, we concentrate on the main result of this paper, which is stated in Theorem 3.1. To prove the theorem, we need the following two crucial lemmas.
Lemma 3.1 Let $\phi_{\rm FB}$ and $\phi_{\rm NR}$ be defined by (4) and (6), respectively. Then, for any $x = (x_1,x_2)$, $y = (y_1,y_2)\in\mathbb{R}\times\mathbb{R}^{n-1}$ with $x^2+y^2\in{\rm bd}(\mathcal{K}^n)$, we have
\[ (2-\sqrt{2})\,\|\phi_{\rm NR}(x,y)\| \le \|\phi_{\rm FB}(x,y)\| \le (2+\sqrt{2})\,\|\phi_{\rm NR}(x,y)\|. \]
Proof. Fix any $x = (x_1,x_2)$, $y = (y_1,y_2)\in\mathbb{R}\times\mathbb{R}^{n-1}$ with $x^2+y^2\in{\rm bd}(\mathcal{K}^n)$. If $(x,y) = (0,0)$, then the result is direct by Lemma 2.2. We next suppose $(x,y)\ne(0,0)$. Let $w = (w_1,w_2) := x^2+y^2$. From $(x,y)\ne(0,0)$ and $x^2+y^2\in{\rm bd}(\mathcal{K}^n)$, it follows that
\[ w_1 = \|w_2\| = 2\|x_1x_2+y_1y_2\| \ne 0, \quad \lambda_1(w) = 0 \quad \text{and} \quad \lambda_2(w) = 4(x_1^2+y_1^2), \]
where the last equality uses Lemma 2.1. Together with the formulas (8)–(9), we get
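\[ \phi_{\rm FB}(x,y) = w^{1/2} - (x+y) = \begin{pmatrix} \sqrt{x_1^2+y_1^2} \\[4pt] \dfrac{x_1x_2+y_1y_2}{\sqrt{x_1^2+y_1^2}} \end{pmatrix} - \begin{pmatrix} x_1+y_1 \\ x_2+y_2 \end{pmatrix}. \]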
In addition, using Lemma 2.1, it is not hard to calculate that
\[ \left\| \frac{x_1x_2+y_1y_2}{\sqrt{x_1^2+y_1^2}} - (x_2+y_2) \right\|^2 = \left( \sqrt{x_1^2+y_1^2} - (x_1+y_1) \right)^2. \]
From the last two equalities, we immediately obtain
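\[ \|\phi_{\rm FB}(x,y)\|^2 = 2\left( \sqrt{x_1^2+y_1^2} - (x_1+y_1) \right)^2, \]
which together with the inequality (1) yields
\[ 2(2-\sqrt{2})^2(\min\{x_1,y_1\})^2 \le \|\phi_{\rm FB}(x,y)\|^2 \le 2(2+\sqrt{2})^2(\min\{x_1,y_1\})^2. \tag{10} \]
Now we consider the cases $(x-y)\in\mathcal{K}^n$, $(x-y)\in-\mathcal{K}^n$ and $(x-y)\notin-\mathcal{K}^n\cup\mathcal{K}^n$, respectively.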
Case 1: $(x-y)\in\mathcal{K}^n$. In this case, $\phi_{\rm NR}(x,y) = y$, and applying Lemma 2.1,
\[ \|\phi_{\rm NR}(x,y)\|^2 = y_1^2 + \|y_2\|^2 = 2y_1^2. \]
In addition, since $(x-y)\in\mathcal{K}^n$ implies $x_1\ge y_1$, the inequality (10) is equivalent to
\[ 2(2-\sqrt{2})^2y_1^2 \le \|\phi_{\rm FB}(x,y)\|^2 \le 2(2+\sqrt{2})^2y_1^2. \]
Combining the last two relations, we readily obtain the desired result.
Case 2: $(x-y)\in-\mathcal{K}^n$. Now $\phi_{\rm NR}(x,y) = x$ and $x_1\le y_1$. Along with Lemma 2.1,
\[ \|\phi_{\rm NR}(x,y)\|^2 = \|x\|^2 = 2x_1^2 = 2(\min\{x_1,y_1\})^2, \]
and the desired result follows from (10) directly.
Case 3: $(x-y)\notin-\mathcal{K}^n\cup\mathcal{K}^n$. In this case, we must have $\|x_2-y_2\| > |x_1-y_1|$. On the other hand, by Lemma 2.1, $\|x_2-y_2\| = |x_1-y_1|$. This is a contradiction, so this case does not actually occur. The proof is completed. □
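By Lemma 2.1, nonzero pairs with $x^2+y^2\in{\rm bd}(\mathcal{K}^n)$ have the form $x = a(1,d)$, $y = b(1,d)$ with a common unit vector $d$, so Lemma 3.1 reduces to the scalar inequality (1) applied to $(a,b)$. The sketch below checks the identities used in the proof on such pairs. It is an illustration only: the helper names, the parametrization by `a`, `b`, `d`, and the tolerance are assumptions made here, and the loose tolerance reflects that the zero spectral value of $x^2+y^2$ is sensitive to rounding.

```python
import math

def norm(v):
    return math.sqrt(sum(t * t for t in v))

def spectral_apply(x, f):
    """Apply scalar f to the spectral values of x and recombine, per (8)-(9)."""
    r = norm(x[1:])
    f1, f2 = f(x[0] - r), f(x[0] + r)
    # When x2 = 0 we have f1 == f2, so the tail of the result is zero anyway.
    d = [t / r for t in x[1:]] if r > 0 else [0.0] * (len(x) - 1)
    return [0.5 * (f1 + f2)] + [0.5 * (f2 - f1) * t for t in d]

def square(x):
    """Jordan square x o x = (||x||^2, 2*x1*x2)."""
    return [sum(t * t for t in x)] + [2.0 * x[0] * t for t in x[1:]]

def phi_fb(x, y):
    w = [a + b for a, b in zip(square(x), square(y))]
    root = spectral_apply(w, lambda lam: math.sqrt(max(lam, 0.0)))
    return [s - a - b for s, a, b in zip(root, x, y)]

def phi_nr(x, y):
    proj = spectral_apply([a - b for a, b in zip(x, y)], lambda lam: max(lam, 0.0))
    return [a - p for a, p in zip(x, proj)]

def check_boundary_case(a, b, d, tol=1e-6):
    """Check the identities behind (10) for x = a(1, d), y = b(1, d), ||d|| = 1."""
    x = [a] + [a * t for t in d]
    y = [b] + [b * t for t in d]
    fb, nr = norm(phi_fb(x, y)), norm(phi_nr(x, y))
    scalar_fb = abs(math.hypot(a, b) - (a + b))
    root2 = math.sqrt(2.0)
    ok_fb = abs(fb - root2 * scalar_fb) <= tol          # ||phi_FB||^2 = 2 (scalar FB)^2
    ok_nr = abs(nr - root2 * abs(min(a, b))) <= tol     # ||phi_NR||^2 = 2 min{a, b}^2
    ok_bounds = (2.0 - root2) * nr - tol <= fb <= (2.0 + root2) * nr + tol
    return ok_fb and ok_nr and ok_bounds
```

A call such as `check_boundary_case(3.0, -1.0, [1.0, 0.0])` exercises Case 1 of the proof, while `check_boundary_case(1.0, 2.0, [0.0, 1.0])` exercises Case 2.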
For convenience, in the rest of this paper, the notation $S$ always represents the set
\[ S := \{(x,y)\in\mathbb{R}^n\times\mathbb{R}^n \mid x\ne 0,\ y\ne 0,\ \|\phi_{\rm FB}(x,y)\|\ne 0,\ x^2+y^2\in{\rm int}(\mathcal{K}^n)\}. \]
The following lemma shows that $\phi_{\rm FB}$ and $\phi_{\rm NR}$ have the same order of growth in $S$.
Lemma 3.2 Let $\phi_{\rm FB}$ and $\phi_{\rm NR}$ be defined by (4) and (6), respectively. Then, there exist positive constants $\bar{c}_1$ and $\bar{c}_2$ such that, for all $(x,y)\in S$,
\[ \bar{c}_1\|\phi_{\rm NR}(x,y)\| \le \|\phi_{\rm FB}(x,y)\| \le \bar{c}_2\|\phi_{\rm NR}(x,y)\|. \]
Proof. We first argue that, if the result holds when $S$ is bounded, then it also holds when $S$ is unbounded. Fix any $(x,y)\in S$ with $S$ being unbounded. Let
\[ S' := \{(x',y')\in\mathbb{R}^n\times\mathbb{R}^n \mid x'\ne 0,\ y'\ne 0,\ (x')^2+(y')^2\in{\rm int}(\mathcal{K}^n),\ \|x'\|^2+\|y'\|^2\le 2\}. \]
Clearly, $S'$ is bounded, and therefore, for all $(x',y')\in S'$,
\[ \bar{c}_1\|\phi_{\rm NR}(x',y')\| \le \|\phi_{\rm FB}(x',y')\| \le \bar{c}_2\|\phi_{\rm NR}(x',y')\|. \]
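If $\|x\|\ge\|y\|$, then $\left(\frac{x}{\|x\|},\frac{y}{\|x\|}\right)\in S'$. From the last inequality, it follows that
\[ \bar{c}_1\left\|\phi_{\rm NR}\!\left(\frac{x}{\|x\|},\frac{y}{\|x\|}\right)\right\| \le \left\|\phi_{\rm FB}\!\left(\frac{x}{\|x\|},\frac{y}{\|x\|}\right)\right\| \le \bar{c}_2\left\|\phi_{\rm NR}\!\left(\frac{x}{\|x\|},\frac{y}{\|x\|}\right)\right\|. \]
Multiplying both sides by $\|x\|$ and using the expressions of $\phi_{\rm FB}$ and $\phi_{\rm NR}$ yields the result. If $\|x\|\le\|y\|$, then $\left(\frac{x}{\|y\|},\frac{y}{\|y\|}\right)\in S'$, and the result follows by similar arguments.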
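The scaling step relies on both functions being positively homogeneous of degree one. Spelled out as a short sketch (the scalar $t > 0$ is a symbol introduced here for illustration; the argument takes $t = 1/\|x\|$ or $t = 1/\|y\|$):

```latex
\begin{align*}
\phi_{\rm FB}(tx,ty) &= \big(t^2(x^2+y^2)\big)^{1/2} - t(x+y)
                      = t\,(x^2+y^2)^{1/2} - t(x+y) = t\,\phi_{\rm FB}(x,y), \\
\phi_{\rm NR}(tx,ty) &= tx - \big(t(x-y)\big)_+
                      = tx - t\,(x-y)_+ = t\,\phi_{\rm NR}(x,y).
\end{align*}
```

The first line uses that, by (8)-(9), multiplying a vector by $t^2$ scales its spectral values by $t^2$ and leaves its spectral vectors unchanged; the second uses that the projection onto a cone commutes with multiplication by a positive scalar.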
Next, we assume that the set $S$ is bounded, and prove the result in two steps.
Step 1: Suppose that such $\bar{c}_2$ does not exist. Then there must exist a sequence $\{(x^k,y^k)\}\subseteq S$ such that
\[ \|\phi_{\rm NR}(x^k,y^k)\| < \frac{1}{k}\,\|\phi_{\rm FB}(x^k,y^k)\|. \tag{11} \]
Without loss of generality, assume that the sequences $\{x^k\}$ and $\{y^k\}$ converge to $x^*$ and $y^*$, respectively. Since $\{\phi_{\rm FB}(x^k,y^k)\}$ is bounded and $\|\phi_{\rm NR}(x,y)\|$ is continuous, taking the limit on both sides of (11) yields $\|\phi_{\rm NR}(x^*,y^*)\| = 0$. By Lemma 2.2,
\[ x^*\in\mathcal{K}^n, \quad y^*\in\mathcal{K}^n, \quad \langle x^*,y^*\rangle = 0. \tag{12} \]
We proceed by considering two cases: $(x^*)^2+(y^*)^2\in{\rm int}(\mathcal{K}^n)$ and $(x^*)^2+(y^*)^2\in{\rm bd}(\mathcal{K}^n)$.
Unless otherwise stated, the index $k$ appearing in the sequel is assumed to be sufficiently large.
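Case 1: $(x^*)^2+(y^*)^2\in{\rm int}(\mathcal{K}^n)$. In this case, it suffices to consider the following two subcases: (1.1) $x^*\in{\rm int}(\mathcal{K}^n)$ or $y^*\in{\rm int}(\mathcal{K}^n)$; (1.2) $x^*\in{\rm bd}(\mathcal{K}^n)$ and $y^*\in{\rm bd}(\mathcal{K}^n)$.
Subcase 1.1: $x^*\in{\rm int}(\mathcal{K}^n)$ or $y^*\in{\rm int}(\mathcal{K}^n)$. Without loss of generality, we assume that $y^*\in{\rm int}(\mathcal{K}^n)$. Along with $\langle x^*,y^*\rangle = 0$ in (12), we have $x^* = 0$, and $(x^*-y^*)\in-{\rm int}(\mathcal{K}^n)$.
This means that $(x^k-y^k)\in-{\rm int}(\mathcal{K}^n)$, so that $\phi_{\rm NR}(x^k,y^k) = x^k$ and consequently $\|\phi_{\rm NR}(x^k,y^k)\| = \|x^k\|$.
Let $w^k = (w_1^k,w_2^k) = (x^k)^2+(y^k)^2$, and let $\lambda_1^k$ and $\lambda_2^k$ be the spectral values of $w^k$. Then,
\[ \lambda_i^k = w_1^k + (-1)^i\|w_2^k\| = \|x^k\|^2+\|y^k\|^2 + (-1)^i\,\|2(x_1^kx_2^k+y_1^ky_2^k)\|, \quad i = 1,2. \]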
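(a) Suppose that $w_2^k\ne 0$. By formulas (8)–(9), it follows that
\[ (w^k)^{1/2} = \begin{pmatrix} \dfrac{\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}}{2} \\[10pt] \dfrac{\sqrt{\lambda_2^k}-\sqrt{\lambda_1^k}}{2}\,\dfrac{w_2^k}{\|w_2^k\|} \end{pmatrix} = \begin{pmatrix} \dfrac{\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}}{2} \\[10pt] \dfrac{w_2^k}{\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}} \end{pmatrix}. \tag{13} \]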
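Together with $\phi_{\rm FB}(x^k,y^k) = (w^k)^{1/2} - (x^k+y^k)$, we have that
\begin{align*}
\frac{1}{2}\|\phi_{\rm FB}(x^k,y^k)\|^2 &= \|x^k\|^2+\|y^k\|^2+(x^k)^Ty^k - \frac{x_1^k+y_1^k}{\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}}\left(\|x^k\|^2+\|y^k\|^2+\sqrt{\lambda_2^k}\sqrt{\lambda_1^k}\right) \\
&\quad - \frac{2\big(x_1^k\|x_2^k\|^2+y_1^k\|y_2^k\|^2\big) + 2(x_1^k+y_1^k)(x_2^k)^Ty_2^k}{\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}} \\
&= T_1^k + T_2^k + T_3^k, \tag{14}
\end{align*}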
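where
\begin{align*}
T_1^k &:= \|x^k\|^2 - \frac{(x_1^k+y_1^k)\|x^k\|^2}{\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}} - \frac{2x_1^k\|x_2^k\|^2}{\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}}, \\
T_2^k &:= \|y^k\|^2 - \frac{y_1^k\left(\|y^k\|^2+\sqrt{\lambda_2^k}\sqrt{\lambda_1^k}\right)}{\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}} - \frac{2y_1^k\|y_2^k\|^2}{\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}} - \frac{2x_1^k(x_2^k)^Ty_2^k}{\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}}, \tag{15} \\
T_3^k &:= (x^k)^Ty^k - \frac{x_1^k\left(\|y^k\|^2+\sqrt{\lambda_2^k}\sqrt{\lambda_1^k}\right)}{\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}} - \frac{2y_1^k(x_2^k)^Ty_2^k}{\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}}.
\end{align*}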
Since $x^k\to x^* = 0$ and $y^k\to y^*\in{\rm int}(\mathcal{K}^n)$, it is not hard to verify that
\[ \lim_{k\to\infty}\frac{T_1^k}{\|x^k\|^2} = 1 - \frac{y_1^*}{y_1^*+\|y_2^*\| + y_1^*-\|y_2^*\|} = \frac{1}{2}. \tag{16} \]
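From the expression of $T_2^k$, it follows that $\left(\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}\right)T_2^k + 2x_1^k(x_2^k)^Ty_2^k$ equals
\[ \left(\sqrt{\lambda_2^k}-y_1^k-\|y_2^k\|\right)\left(\|y_2^k\|^2+y_1^k\|y_2^k\|\right) + \left(\sqrt{\lambda_1^k}-y_1^k+\|y_2^k\|\right)\left(\|y^k\|^2-y_1^k\sqrt{\lambda_2^k}\right). \]
In addition, an elementary calculation yields that
\begin{align*}
\sqrt{\lambda_2^k}-y_1^k-\|y_2^k\| &= \frac{4x_1^ky_1^k(x_2^k)^Ty_2^k}{\left(\sqrt{\lambda_2^k}+y_1^k+\|y_2^k\|\right)\left(\|x_1^kx_2^k+y_1^ky_2^k\|+y_1^k\|y_2^k\|\right)} + \frac{\|x^k\|^2}{\sqrt{\lambda_2^k}+y_1^k+\|y_2^k\|} + o(\|x^k\|^2), \tag{17} \\
\sqrt{\lambda_1^k}-y_1^k+\|y_2^k\| &= -\frac{4x_1^ky_1^k(x_2^k)^Ty_2^k}{\left(\sqrt{\lambda_1^k}+y_1^k-\|y_2^k\|\right)\left(\|x_1^kx_2^k+y_1^ky_2^k\|+y_1^k\|y_2^k\|\right)} + \frac{\|x^k\|^2}{\sqrt{\lambda_1^k}+y_1^k-\|y_2^k\|} + o(\|x^k\|^2).
\end{align*}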
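The last two equalities imply that $\left(\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}\right)T_2^k$ is equal to
\begin{align*}
&\frac{\|x^k\|^2\left(\|y_2^k\|^2+y_1^k\|y_2^k\|\right)}{\sqrt{\lambda_2^k}+y_1^k+\|y_2^k\|} + \frac{\|x^k\|^2\left(\|y^k\|^2-y_1^k\sqrt{\lambda_2^k}\right)}{\sqrt{\lambda_1^k}+y_1^k-\|y_2^k\|} + o(\|x^k\|^2) \\
&\quad + \frac{4x_1^ky_1^k(x_2^k)^Ty_2^k\left(\|y_2^k\|^2+y_1^k\|y_2^k\|\right)}{\left(\sqrt{\lambda_2^k}+y_1^k+\|y_2^k\|\right)\left(\|x_1^kx_2^k+y_1^ky_2^k\|+y_1^k\|y_2^k\|\right)} \\
&\quad - \frac{4x_1^ky_1^k(x_2^k)^Ty_2^k\left(\|y^k\|^2-y_1^k\sqrt{\lambda_2^k}\right)}{\left(\sqrt{\lambda_1^k}+y_1^k-\|y_2^k\|\right)\left(\|x_1^kx_2^k+y_1^ky_2^k\|+y_1^k\|y_2^k\|\right)} - 2x_1^k(x_2^k)^Ty_2^k.
\end{align*}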
Let $T_{21}^k$ and $T_{22}^k$ denote the sums of the first three terms and the last three terms, respectively, on the right-hand side of the last equation. By the expression of $T_{21}^k$,
\[ \lim_{k\to\infty}\frac{T_{21}^k}{\|x^k\|^2} = \frac{\|y_2^*\|^2+y_1^*\|y_2^*\|}{2(y_1^*+\|y_2^*\|)} + \frac{\|y^*\|^2-y_1^*(y_1^*+\|y_2^*\|)}{2(y_1^*-\|y_2^*\|)} = 0. \]
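Moreover, from the expression of $T_{22}^k$, it follows that
\begin{align*}
\frac{T_{22}^k}{\|x^k\|^2} = \frac{4x_1^ky_1^k(x_2^k)^Ty_2^k}{\|x^k\|^2}\Bigg[ &\frac{\|y_2^k\|^2+y_1^k\|y_2^k\|}{\left(\sqrt{\lambda_2^k}+y_1^k+\|y_2^k\|\right)\left(\|x_1^kx_2^k+y_1^ky_2^k\|+y_1^k\|y_2^k\|\right)} - \frac{1}{2y_1^k} \\
&- \frac{\|y^k\|^2-y_1^k\sqrt{\lambda_2^k}}{\left(\sqrt{\lambda_1^k}+y_1^k-\|y_2^k\|\right)\left(\|x_1^kx_2^k+y_1^ky_2^k\|+y_1^k\|y_2^k\|\right)} \Bigg].
\end{align*}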
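Since the sequence $\left\{\frac{4x_1^ky_1^k(x_2^k)^Ty_2^k}{\|x^k\|^2}\right\}$ is bounded, and the limit of the term in the bracket as $k\to\infty$ equals 0, we have $\lim_{k\to\infty} T_{22}^k/\|x^k\|^2 = 0$. Thus, we get
\[ \lim_{k\to\infty}\frac{T_2^k}{\|x^k\|^2} = \lim_{k\to\infty}\frac{T_{21}^k+T_{22}^k}{\|x^k\|^2\left(\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}\right)} = 0. \tag{18} \]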
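We next take a look at $T_3^k$. By the expression of $T_3^k$, the quantity $\left(\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}\right)T_3^k$ equals
\begin{align*}
&(x^k)^Ty^k\left(\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}\right) - x_1^k\left(\|y^k\|^2+\sqrt{\lambda_2^k}\sqrt{\lambda_1^k}\right) - 2y_1^k(x_2^k)^Ty_2^k \\
&\quad = \left(\sqrt{\lambda_2^k}-y_1^k-\|y_2^k\|\right)\left[(x_2^k)^Ty_2^k+x_1^k\|y_2^k\|\right] + \left(\sqrt{\lambda_1^k}-y_1^k+\|y_2^k\|\right)\left[(x^k)^Ty^k-x_1^k\sqrt{\lambda_2^k}\right],
\end{align*}
which, together with (17), implies that $\left(\sqrt{\lambda_2^k}+\sqrt{\lambda_1^k}\right)T_3^k = o(\|x^k\|^2)$. Therefore,
\[ \lim_{k\to\infty}\frac{T_3^k}{\|x^k\|^2} = 0. \]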
Together with equations (14), (16) and (18), we immediately obtain
\[ \lim_{k\to\infty}\frac{\|\phi_{\rm FB}(x^k,y^k)\|^2}{\|\phi_{\rm NR}(x^k,y^k)\|^2} = \lim_{k\to\infty}\frac{2(T_1^k+T_2^k+T_3^k)}{\|x^k\|^2} = 1, \]
which contradicts the inequality (11).
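(b) Suppose that $w_2^k = 0$. In this subcase, since $\lambda_1^k = \lambda_2^k$, we have that
\[ \frac{1}{2}\|\phi_{\rm FB}(x^k,y^k)\|^2 = \|x^k\|^2+\|y^k\|^2+(x^k)^Ty^k - \frac{(x_1^k+y_1^k)\left(\|x^k\|^2+\|y^k\|^2\right)}{\sqrt{\|x^k\|^2+\|y^k\|^2}}. \]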
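Since $y_1^k > 0$ by $y^*\in{\rm int}(\mathcal{K}^n)$, from $w_2^k = 0$ we get $y_2^k = -(x_1^k/y_1^k)\,x_2^k$, which means that
\[ \|y_2^k\|^2 = o(\|x^k\|^2) \quad \text{and} \quad (x^k)^Ty^k = x_1^ky_1^k - (x_1^k/y_1^k)\|x_2^k\|^2 = x_1^ky_1^k + o(\|x^k\|^2). \]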
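Substituting the two equalities into the above expression of $\frac{1}{2}\|\phi_{\rm FB}(x^k,y^k)\|^2$ yields that
\[ \frac{1}{2}\|\phi_{\rm FB}(x^k,y^k)\|^2 = \|x^k\|^2 - \frac{(x_1^k+y_1^k)\|x^k\|^2}{\sqrt{\|x^k\|^2+\|y^k\|^2}} + (y_1^k)^2 - \frac{(x_1^k+y_1^k)(y_1^k)^2}{\sqrt{\|x^k\|^2+\|y^k\|^2}} + x_1^ky_1^k + o(\|x^k\|^2). \]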
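In addition, using $\|y_2^k\|^2 = o(\|x^k\|^2)$ and an elementary calculation, we have that
\begin{align*}
(y_1^k)^2 - \frac{(x_1^k+y_1^k)(y_1^k)^2}{\sqrt{\|x^k\|^2+\|y^k\|^2}} + x_1^ky_1^k &= (y_1^k)^2 - \frac{(y_1^k)^3}{\sqrt{\|x^k\|^2+\|y^k\|^2}} + x_1^ky_1^k - \frac{x_1^k(y_1^k)^2}{\sqrt{\|x^k\|^2+\|y^k\|^2}} \\
&= \frac{(y_1^k)^2\|x^k\|^2}{\sqrt{\|x^k\|^2+\|y^k\|^2}\left(\sqrt{\|x^k\|^2+\|y^k\|^2}+y_1^k\right)} + o(\|x^k\|^2).
\end{align*}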
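From the last two equations, $x^k\to x^* = 0$ and $y^k\to y^*$, it follows that
\[ \lim_{k\to\infty}\frac{\|\phi_{\rm FB}(x^k,y^k)\|^2}{\|\phi_{\rm NR}(x^k,y^k)\|^2} = \lim_{k\to\infty}\frac{\|\phi_{\rm FB}(x^k,y^k)\|^2}{\|x^k\|^2} = 2 - \frac{2y_1^*}{\|y^*\|} + \frac{2(y_1^*)^2}{\|y^*\|\left(\|y^*\|+y_1^*\right)} = 1, \]
which contradicts the inequality (11).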
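The limit obtained in Subcase 1.1 can be observed numerically: with $y$ fixed in the interior of the cone and $x = tv$ shrinking to zero, the ratio $\|\phi_{\rm FB}(x,y)\|/\|\phi_{\rm NR}(x,y)\|$ approaches 1. The sketch below is a numerical illustration only and plays no role in the proof. The helper names, the direction `v`, the point `y` and the step sizes are all arbitrary choices made here.

```python
import math

def norm(v):
    return math.sqrt(sum(t * t for t in v))

def spectral_apply(x, f):
    """Apply scalar f to the spectral values of x and recombine, per (8)-(9)."""
    r = norm(x[1:])
    f1, f2 = f(x[0] - r), f(x[0] + r)
    # When x2 = 0 we have f1 == f2, so the tail of the result is zero anyway.
    d = [t / r for t in x[1:]] if r > 0 else [0.0] * (len(x) - 1)
    return [0.5 * (f1 + f2)] + [0.5 * (f2 - f1) * t for t in d]

def square(x):
    """Jordan square x o x = (||x||^2, 2*x1*x2)."""
    return [sum(t * t for t in x)] + [2.0 * x[0] * t for t in x[1:]]

def phi_fb(x, y):
    w = [a + b for a, b in zip(square(x), square(y))]
    root = spectral_apply(w, lambda lam: math.sqrt(max(lam, 0.0)))
    return [s - a - b for s, a, b in zip(root, x, y)]

def phi_nr(x, y):
    proj = spectral_apply([a - b for a, b in zip(x, y)], lambda lam: max(lam, 0.0))
    return [a - p for a, p in zip(x, proj)]

def growth_ratio(t, v=(1.0, 0.5, -0.25), y=(2.0, 1.0, 0.0)):
    """Return ||phi_FB(t*v, y)|| / ||phi_NR(t*v, y)|| for a small t > 0."""
    x = [t * c for c in v]
    y = list(y)  # y = (2, 1, 0) lies in int(K^3) since 2 > ||(1, 0)||
    return norm(phi_fb(x, y)) / norm(phi_nr(x, y))

if __name__ == "__main__":
    for t in (1e-1, 1e-2, 1e-3, 1e-4):
        print(t, growth_ratio(t))
```

For small $t$ the difference $x - y$ lies in $-{\rm int}(\mathcal{K}^3)$, so `phi_nr` returns $x$ itself, matching the identity $\phi_{\rm NR}(x^k,y^k) = x^k$ used above.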
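Subcase 1.2: $x^*\in{\rm bd}(\mathcal{K}^n)$ and $y^*\in{\rm bd}(\mathcal{K}^n)$. Since $(x^*)^2+(y^*)^2\in{\rm int}(\mathcal{K}^n)$, we must have $x_1^* > 0$ and $y_1^* > 0$, which implies $x_1^k > 0$ and $y_1^k > 0$. Also, $x^k-y^k\notin-\mathcal{K}^n\cup\mathcal{K}^n$. If not, $\phi_{\rm NR}(x^k,y^k) = x^k$ or $y^k$, which implies that $\|\phi_{\rm NR}(x^*,y^*)\|\ne 0$. In addition, noting that
\[ 0 = \langle x^*,y^*\rangle = \|x_2^*\|\|y_2^*\| + (x_2^*)^Ty_2^* = \lim_{k\to\infty}\left[(x_2^k)^Ty_2^k + \|x_2^k\|\|y_2^k\|\right], \]
we have
\[ (x_2^k)^Ty_2^k = -\|x_2^k\|\|y_2^k\| + \alpha_k \quad \text{with} \quad \lim_{k\to\infty}\alpha_k = 0. \tag{19} \]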