# Interior proximal methods for SOCP

In document SOC Functions and Their Applications (Page 136-160)

Since $\sum_{k=1}^{\infty}\mu_k\epsilon_k < \infty$, using Lemma 3.7 with $v_k := D(\zeta^k, \zeta) \ge 0$ and $\beta_k := \mu_k\epsilon_k \ge 0$ yields that the sequence $\{D(\zeta^k, \zeta)\}$ converges. Thus, by Proposition 3.10(e), the sequence $\{\zeta^k\}$ is bounded and consequently has an accumulation point. Without loss of generality, let $\hat\zeta \in F$ be an accumulation point of $\{\zeta^k\}$. Then there exists a subsequence $\{\zeta^{k_j}\} \to \hat\zeta$ as $k_j \to \infty$. Since $f$ is lower semi-continuous, we obtain $f(\hat\zeta) \le \liminf_{k_j\to\infty} f(\zeta^{k_j})$. On the other hand, $f(\zeta^{k_j}) \to f_*$ by part (a). The two sides imply that $f(\hat\zeta) = f_*$. Therefore, $\hat\zeta$ is a solution of the CSOCP. The proof is thus complete. □

where $\{\lambda_k\}$ is a sequence of positive parameters, and $H : \mathbb{R}^n \times \mathbb{R}^n \to (-\infty, \infty]$ is a proximal distance with respect to $\operatorname{int}(K)$ (see Def. 3.1) which plays the same role as the squared Euclidean distance $\|x - y\|^2$ in the classical proximal algorithms (see, e.g., [105, 132]), but possesses additional properties that force the iterates to stay in $K \cap V$, thus eliminating the constraints automatically. As will be shown, such proximal distances can be produced with an appropriate closed proper univariate function.

In the rest of this section, we focus on the case where K = Kn, and all the analysis can be carried over to the case where K has the direct product structure. Unless otherwise stated, we make the following minimal assumption for the CSOCP (3.64):

(A1) $\operatorname{dom} f \cap (V \cap \operatorname{int}(K^n)) \neq \emptyset$ and $f_* := \inf\{f(x) \mid x \in V \cap K^n\} > -\infty$.

Definition 3.2. An extended-valued function H : IRn × IRn → (−∞, ∞] is called a proximal distance with respect to int(Kn) if it satisfies the following properties:

(P1) domH(·, ·) = C1× C2 with int(Kn) × int(Kn) ⊂ C1× C2 ⊆ Kn× Kn.

(P2) For each given y ∈ int(Kn), H(·, y) is continuous and strictly convex on C1, and it is continuously differentiable on int(Kn) with dom∇1H(·, y) = int(Kn).

(P3) H(x, y) ≥ 0 for all x, y ∈ IRn, and H(y, y) = 0 for all y ∈ int(Kn).

(P4) For each fixed y ∈ C2, the sets {x ∈ C1 : H(x, y) ≤ γ} are bounded for all γ ∈ IR.

Definition 3.2 differs slightly from Definition 2.1 of  for a proximal distance w.r.t. $\operatorname{int}(K^n)$, since here $H(\cdot, y)$ is required to be strictly convex over $C_1$ for any fixed $y \in \operatorname{int}(K^n)$. We denote by $D(\operatorname{int}(K^n))$ the family of functions $H$ satisfying Definition 3.2. With a given $H \in D(\operatorname{int}(K^n))$, we have the following basic iterative algorithm for (3.64).

Interior Proximal Algorithm (IPA). Given $H \in D(\operatorname{int}(K^n))$ and $x^0 \in V \cap \operatorname{int}(K^n)$. For $k = 1, 2, \ldots$, with $\lambda_k > 0$ and $\epsilon_k \ge 0$, generate a sequence $\{x^k\} \subset V \cap \operatorname{int}(K^n)$ with $g^k \in \partial_{\epsilon_k} f(x^k)$ via the following iterative scheme:

$$x^k := \operatorname{argmin}\left\{\lambda_k f(x) + H(x, x^{k-1}) \mid x \in V\right\} \qquad (3.66)$$

such that

$$\lambda_k g^k + \nabla_1 H(x^k, x^{k-1}) = A^T u^k \quad \text{for some } u^k \in \mathbb{R}^m. \qquad (3.67)$$

The following proposition implies that the IPA is well-defined, and moreover, from its proof we see that the iterative formula (3.66) is equivalent to the iterative scheme (3.65).

When εk > 0 for any k ∈ N (the set of natural numbers), the IPA can be viewed as an approximate interior proximal method, and it becomes exact if εk= 0 for all k ∈ N.
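To make the iteration (3.66)-(3.67) concrete, the following minimal sketch runs the exact IPA (εk = 0) on a toy instance. Everything specific in it is an assumption made for illustration and is not taken from the text: the objective is linear, f(x) = cᵀx with c ∈ int(Kn); there are no linear constraints, so V = IRn and the right-hand side of (3.67) vanishes; H is the log-determinant proximal distance of Example 3.8; and the helper names (`det_soc`, `inv_soc`, `ipa_linear`) and the data are hypothetical. Under these assumptions the stationarity condition can be solved in closed form, because ∇1H(x, y) = −2x⁻¹ + 2y⁻¹ with x⁻¹ = (x1, −x2)/det(x).

```python
import math

def det_soc(x):
    """det(x) = x1^2 - ||x2||^2 for x = (x1, x2) in R x R^(n-1)."""
    return x[0] ** 2 - sum(v * v for v in x[1:])

def inv_soc(x):
    """Jordan-algebra inverse (x1, -x2) / det(x); requires det(x) != 0."""
    d = det_soc(x)
    return [x[0] / d] + [-v / d for v in x[1:]]

def in_interior(x):
    """Membership test for int(K^n): x1 > ||x2||."""
    return x[0] > math.sqrt(sum(v * v for v in x[1:]))

def logdet_distance(x, y):
    """H(x, y) = -ln(det x / det y) + 2 x^T J_n y / det(y) - 2 on int(K^n)."""
    if not (in_interior(x) and in_interior(y)):
        return math.inf
    x_j_y = x[0] * y[0] - sum(a * b for a, b in zip(x[1:], y[1:]))
    return -math.log(det_soc(x) / det_soc(y)) + 2.0 * x_j_y / det_soc(y) - 2.0

def ipa_linear(c, x0, lambdas):
    """Exact IPA steps (eps_k = 0) for f(x) = c^T x, V = R^n, c in int(K^n).

    Setting lam * c + grad_1 H(x, x_prev) = 0 with
    grad_1 H(x, y) = -2 inv(x) + 2 inv(y) gives
    x = inv(inv(x_prev) + (lam / 2) c).
    """
    iterates = [list(x0)]
    for lam in lambdas:
        shifted = [p + 0.5 * lam * ci
                   for p, ci in zip(inv_soc(iterates[-1]), c)]
        iterates.append(inv_soc(shifted))
    return iterates

def objective(c, x):
    """Linear objective f(x) = c^T x."""
    return sum(ci * xi for ci, xi in zip(c, x))

# Hypothetical data, chosen only for illustration.
c = [2.0, 1.0, 0.5]
x0 = [3.0, 1.0, -1.0]
lambdas = [1.0] * 20
iterates = ipa_linear(c, x0, lambdas)
values = [objective(c, x) for x in iterates]
```

With these choices the iterates remain in int(Kn) without any explicit projection, and the objective values decrease toward the infimum 0 of cᵀx over Kn, which is the behavior the convergence results below describe.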

Proposition 3.14. For any given $H \in D(\operatorname{int}(K^n))$ and $y \in \operatorname{int}(K^n)$, consider the problem

$$f_*(y, \tau) = \inf\{\tau f(x) + H(x, y) \mid x \in V\} \quad \text{with } \tau > 0. \qquad (3.68)$$

Then, for each $\epsilon \ge 0$, there exist $x(y, \tau) \in V \cap \operatorname{int}(K^n)$ and $g \in \partial_\epsilon f(x(y, \tau))$ such that

$$\tau g + \nabla_1 H(x(y, \tau), y) = A^T u \qquad (3.69)$$

for some $u \in \mathbb{R}^m$. Moreover, for such $x(y, \tau)$, we have

$$\tau f(x(y, \tau)) + H(x(y, \tau), y) \le f_*(y, \tau) + \epsilon. \qquad (3.70)$$

Proof. Set $F(x, \tau) := \tau f(x) + H(x, y) + \delta_{V \cap K^n}(x)$, where $\delta_{V \cap K^n}(x)$ is the indicator function of the set $V \cap K^n$. Since $\operatorname{dom} H(\cdot, y) = C_1 \subseteq K^n$, it is clear that

$$f_*(y, \tau) = \inf\{F(x, \tau) \mid x \in \mathbb{R}^n\}. \qquad (3.71)$$

Since $f_* > -\infty$, it is easy to verify that for any $\gamma \in \mathbb{R}$ the following relation holds:

$$\{x \in \mathbb{R}^n \mid F(x, \tau) \le \gamma\} \subset \{x \in V \cap K^n \mid H(x, y) \le \gamma - \tau f_*\} \subset \{x \in C_1 \mid H(x, y) \le \gamma - \tau f_*\},$$

which together with (P4) implies that $F(\cdot, \tau)$ has bounded level sets. In addition, by (P1)-(P3), $F(\cdot, \tau)$ is a closed proper and strictly convex function. Hence, the problem (3.71) has a unique solution, say $x(y, \tau)$. From the optimality conditions of (3.71), we get

0 ∈ ∂F (x(y, τ )) = τ ∂f (x(y, τ )) + ∇1H(x(y, τ ), y) + ∂δV∩Kn(x(y, τ ))

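where the equality is due to [131, Theorem 23.8] and $\operatorname{dom} f \cap (V \cap \operatorname{int}(K^n)) \neq \emptyset$. Notice that $\operatorname{dom} \nabla_1 H(\cdot, y) = \operatorname{int}(K^n)$ and $\operatorname{dom} \partial\delta_{V \cap K^n}(\cdot) = V \cap K^n$. Therefore, the last equation implies $x(y, \tau) \in V \cap \operatorname{int}(K^n)$, and there exists $g \in \partial f(x(y, \tau))$ such that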

−τ g − ∇1H(x(y, τ ), y) ∈ ∂δV∩Kn(x(y, τ )).

On the other hand, by the definition of δV∩Kn(·), it is not hard to derive that

∂δV∩Kn(x) = Im(AT), ∀x ∈ V ∩ int(Kn).

The last two equations imply that (3.69) holds for $\epsilon = 0$. When $\epsilon > 0$, (3.69) also holds for such $x(y, \tau)$ and $g$ since $\partial f(x(y, \tau)) \subset \partial_\epsilon f(x(y, \tau))$. Finally, since for each $y \in \operatorname{int}(K^n)$ the function $H(\cdot, y)$ is strictly convex, and since $g \in \partial_\epsilon f(x(y, \tau))$, we have

$$\begin{aligned}
\tau f(x) + H(x, y) &\ge \tau f(x(y, \tau)) + H(x(y, \tau), y) + \langle \tau g + \nabla_1 H(x(y, \tau), y),\, x - x(y, \tau)\rangle - \epsilon \\
&= \tau f(x(y, \tau)) + H(x(y, \tau), y) + \langle A^T u,\, x - x(y, \tau)\rangle - \epsilon \\
&= \tau f(x(y, \tau)) + H(x(y, \tau), y) - \epsilon \quad \text{for all } x \in V,
\end{aligned}$$

where the first equality is from (3.69) and the last one is by $x, x(y, \tau) \in V$. Thus, $f_*(y, \tau) = \inf\{\tau f(x) + H(x, y) \mid x \in V\} \ge \tau f(x(y, \tau)) + H(x(y, \tau), y) - \epsilon$. □

In the following, we focus on the convergence behaviors of the IPA with H from several subclasses of D(int(Kn)), which also satisfy one of the following properties.

(P5) For any $x, y \in \operatorname{int}(K^n)$ and $z \in C_1$, $H(z, y) - H(z, x) \ge \langle \nabla_1 H(x, y), z - x\rangle$;

(P5') For any $x, y \in \operatorname{int}(K^n)$ and $z \in C_2$, $H(y, z) - H(x, z) \ge \langle \nabla_1 H(x, y), z - x\rangle$;

(P6) For each $x \in C_1$, the level sets $\{y \in C_2 \mid H(x, y) \le \gamma\}$ are bounded for all $\gamma \in \mathbb{R}$.

Specifically, we denote by $F_1(\operatorname{int}(K^n))$ and $F_2(\operatorname{int}(K^n))$ the families of functions $H \in D(\operatorname{int}(K^n))$ satisfying (P5) and (P5'), respectively. If $C_1 = K^n$, we denote by $F_1(K^n)$ the family of functions $H \in D(\operatorname{int}(K^n))$ satisfying (P5) and (P6). If $C_2 = K^n$, we write $F_2(\operatorname{int}(K^n))$ as $F_2(K^n)$. It is easy to see that the class of proximal distances $F(\operatorname{int}(K^n))$ (respectively, $F(K^n)$) in  subsumes the pairs $(H, H)$ with $H \in F_1(\operatorname{int}(K^n))$ (respectively, $F_1(K^n)$), but it does not include any $(H, H)$ with $H \in F_2(\operatorname{int}(K^n))$ (respectively, $F_2(K^n)$).

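Proposition 3.15. Let $\{x^k\}$ be the sequence generated by the IPA with $H \in F_1(\operatorname{int}(K^n))$ or $H \in F_2(\operatorname{int}(K^n))$. Set $\sigma_\nu = \sum_{k=1}^{\nu} \lambda_k$. Then, the following results hold.

(a) $f(x^\nu) - f(x) \le \sigma_\nu^{-1} H(x, x^0) + \sigma_\nu^{-1} \sum_{k=1}^{\nu} \sigma_k \epsilon_k$ for any $x \in V \cap C_1$ if $H \in F_1(\operatorname{int}(K^n))$;

$f(x^\nu) - f(x) \le \sigma_\nu^{-1} H(x^0, x) + \sigma_\nu^{-1} \sum_{k=1}^{\nu} \sigma_k \epsilon_k$ for any $x \in V \cap C_2$ if $H \in F_2(\operatorname{int}(K^n))$.

(b) If $\sigma_\nu \to +\infty$ and $\epsilon_k \to 0$, then $\liminf_{\nu \to \infty} f(x^\nu) = f_*$.

(c) The sequence $\{f(x^k)\}$ converges to $f_*$ whenever $\sum_{k=1}^{\infty} \epsilon_k < \infty$.

(d) If $X \neq \emptyset$, then $\{x^k\}$ is bounded with all limit points in $X$ under (d1) or (d2) below:

(d1) $X$ is bounded and $\sum_{k=1}^{\infty} \epsilon_k < \infty$;

(d2) $\sum_{k=1}^{\infty} \lambda_k \epsilon_k < \infty$ and $H \in F_1(K^n)$ (or $H \in F_2(K^n)$).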

Proof. The proofs are similar to those of [10, Theorem 4.1]. For completeness, we here take H ∈ F2(int(Kn)) for example to prove the results.

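(a) Since $g^k \in \partial_{\epsilon_k} f(x^k)$, from the definition of the $\epsilon$-subdifferential it follows that

$$f(x) \ge f(x^k) + \langle g^k, x - x^k\rangle - \epsilon_k, \quad \forall x \in \mathbb{R}^n.$$

This together with equation (3.67) implies that

$$\lambda_k (f(x^k) - f(x)) \le \langle \nabla_1 H(x^k, x^{k-1}), x - x^k\rangle + \lambda_k \epsilon_k, \quad \forall x \in V \cap C_2.$$

Using (P5') with $x = x^k$, $y = x^{k-1}$ and $z = x \in V \cap C_2$, it then follows that

$$\lambda_k (f(x^k) - f(x)) \le H(x^{k-1}, x) - H(x^k, x) + \lambda_k \epsilon_k, \quad \forall x \in V \cap C_2. \qquad (3.72)$$

Summing over $k = 1, 2, \ldots, \nu$ in this inequality yields that

$$-\sigma_\nu f(x) + \sum_{k=1}^{\nu} \lambda_k f(x^k) \le H(x^0, x) - H(x^\nu, x) + \sum_{k=1}^{\nu} \lambda_k \epsilon_k. \qquad (3.73)$$

On the other hand, setting $x = x^{k-1}$ in (3.72), we obtain

$$f(x^k) - f(x^{k-1}) \le \lambda_k^{-1}\left[H(x^{k-1}, x^{k-1}) - H(x^k, x^{k-1})\right] + \epsilon_k \le \epsilon_k. \qquad (3.74)$$

Multiplying the inequality by $\sigma_{k-1}$ (with $\sigma_0 \equiv 0$) and summing over $k = 1, \ldots, \nu$, we get

$$\sum_{k=1}^{\nu} \sigma_{k-1} f(x^k) - \sum_{k=1}^{\nu} \sigma_{k-1} f(x^{k-1}) \le \sum_{k=1}^{\nu} \sigma_{k-1} \epsilon_k.$$

Noting that $\sigma_k = \lambda_k + \sigma_{k-1}$ with $\sigma_0 \equiv 0$, the above inequality reduces to

$$\sigma_\nu f(x^\nu) - \sum_{k=1}^{\nu} \lambda_k f(x^k) \le \sum_{k=1}^{\nu} \sigma_{k-1} \epsilon_k. \qquad (3.75)$$

Adding the inequalities (3.73) and (3.75) and recalling that $\sigma_k = \lambda_k + \sigma_{k-1}$, it follows that

$$f(x^\nu) - f(x) \le \sigma_\nu^{-1}\left[H(x^0, x) - H(x^\nu, x)\right] + \sigma_\nu^{-1} \sum_{k=1}^{\nu} \sigma_k \epsilon_k, \quad \forall x \in V \cap C_2,$$

which immediately implies the desired result due to the nonnegativity of $H(x^\nu, x)$.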

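(b) If $\sigma_\nu \to +\infty$ and $\epsilon_k \to 0$, then applying Lemma 2.2(ii) of  with $a_k = \epsilon_k$ and $b_\nu := \sigma_\nu^{-1} \sum_{k=1}^{\nu} \lambda_k \epsilon_k$ yields $\sigma_\nu^{-1} \sum_{k=1}^{\nu} \lambda_k \epsilon_k \to 0$. From part (a), it then follows that

$$\liminf_{\nu \to \infty} f(x^\nu) \le \inf\{f(x) \mid x \in V \cap \operatorname{int}(K^n)\}.$$

This together with $f(x^\nu) \ge \inf\{f(x) \mid x \in V \cap K^n\}$ implies that

$$\liminf_{\nu \to \infty} f(x^\nu) = \inf\{f(x) \mid x \in V \cap \operatorname{int}(K^n)\} = f_*.$$

(c) From (3.74), $0 \le f(x^k) - f_* \le f(x^{k-1}) - f_* + \epsilon_k$. Using Lemma 2.1 of  with $\gamma_k \equiv 0$ and $v_k = f(x^k) - f_*$, we have that $\{f(x^k)\}$ converges to $f_*$ whenever $\sum_{k=1}^{\infty} \epsilon_k < \infty$.

(d) If the condition (d1) holds, then the sets $\{x \in V \cap K^n \mid f(x) \le \gamma\}$ are bounded for all $\gamma \in \mathbb{R}$, since $f$ is closed proper convex and $X = \{x \in V \cap K^n \mid f(x) \le f_*\}$. Note that (3.74) implies $\{x^k\} \subseteq \{x \in V \cap K^n \mid f(x) \le f(x^0) + \sum_{j=1}^{k} \epsilon_j\}$. Along with $\sum_{k=1}^{\infty} \epsilon_k < \infty$, clearly, $\{x^k\}$ is bounded. Since $\{f(x^k)\}$ converges to $f_*$ and $f$ is l.s.c., passing to the limit and recalling that $\{x^k\} \subset V \cap K^n$ yields that each accumulation point of $\{x^k\}$ is a solution of (3.64).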

Suppose that the condition (d2) holds. If $H \in F_2(K^n)$, then inequality (3.72) holds for each $x \in V \cap K^n$, and particularly for $x \in X$. Consequently,

$$H(x^k, x) \le H(x^{k-1}, x) + \lambda_k \epsilon_k, \quad \forall x \in X. \qquad (3.76)$$

Summing over $k = 1, 2, \ldots, \nu$ for the last inequality, we obtain

$$H(x^\nu, x) \le H(x^0, x) + \sum_{k=1}^{\nu} \lambda_k \epsilon_k.$$

This, by (P4) and $\sum_{k=1}^{\infty} \lambda_k \epsilon_k < \infty$, implies that $\{x^k\}$ is bounded, and hence has an accumulation point. Without loss of generality, let $\hat x \in K^n$ be an accumulation point of $\{x^k\}$. Then there exists a subsequence $\{x^{k_j}\}$ such that $x^{k_j} \to \hat x$ as $j \to \infty$. From the lower semicontinuity of $f$ and part (c), we get $f(\hat x) \le \lim_{j \to +\infty} f(x^{k_j}) = f_*$, which means that $\hat x$ is a solution of (3.64). If $H \in F_1(K^n)$, then the last inequality becomes

$$H(x, x^\nu) \le H(x, x^0) + \sum_{k=1}^{\nu} \lambda_k \epsilon_k.$$

By (P6) and $\sum_{k=1}^{\infty} \lambda_k \epsilon_k < \infty$, we also have that $\{x^k\}$ is bounded, and hence has an accumulation point. Using the same arguments as above, we get the desired result. □

An immediate byproduct of the above analysis is the following global rate of convergence estimate for the IPA with $H \in F_1(K^n)$ or $H \in F_2(K^n)$.

Proposition 3.16. Let $\{x^k\}$ be the sequence given by the IPA with $H \in F_1(K^n)$ or $F_2(K^n)$. If $X \neq \emptyset$ and $\sum_{k=1}^{\infty} \epsilon_k < \infty$, then $f(x^\nu) - f_* = O(\sigma_\nu^{-1})$.

Proof. The result follows directly by setting $x = x^*$ for some $x^* \in X$ in the inequalities of Proposition 3.15(a), and noting that $0 < \sigma_k / \sigma_\nu \le 1$ for all $k = 1, 2, \ldots, \nu$. □

To establish the global convergence of {xk} to an optimal solution of (3.64), we need to make further assumptions on X or the proximal distances in F1(Kn) and F2(Kn).

We denote by $\widehat{F}_1(K^n)$ the family of functions $H \in F_1(K^n)$ satisfying (P7)-(P8) below, by $\widehat{F}_2(K^n)$ the family of functions $H \in F_2(K^n)$ satisfying (P7')-(P8') below, and by $\bar{F}_2(K^n)$ the family of functions $H \in F_2(K^n)$ satisfying (P7')-(P9') below:

(P7) For any {yk} ⊆ int(Kn) converging to y ∈ Kn, we have H(y, yk) → 0;

(P8) For any bounded sequence {yk} ⊆ int(Kn) and any y ∈ Kn with H(y, yk) → 0, there holds that λi(yk) → λi(y) for i = 1, 2;

(P7’) For any {yk} ⊆ int(Kn) converging to y ∈ Kn, we have H(yk, y) → 0;

(P8’) For any bounded sequence {yk} ⊆ int(Kn) and any y ∈ Kn with H(yk, y) → 0, there holds that λi(yk) → λi(y) for i = 1, 2;

(P9’) For any bounded sequence {yk} ⊆ int(Kn) and any y ∈ Kn with H(yk, y) → 0, there holds that yk → y.

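It is easy to see that all previous subclasses of $D(\operatorname{int}(K^n))$ have the following relations:

$$\widehat{F}_1(K^n) \subseteq F_1(K^n) \subseteq F_1(\operatorname{int}(K^n)), \qquad \bar{F}_2(K^n) \subseteq \widehat{F}_2(K^n) \subseteq F_2(K^n) \subseteq F_2(\operatorname{int}(K^n)).$$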

Proposition 3.17. Let $\{x^k\}$ be generated by the IPA with $H \in F_1(\operatorname{int}(K^n))$ or $F_2(\operatorname{int}(K^n))$. Suppose that $X$ is nonempty, $\sum_{k=1}^{\infty} \lambda_k \epsilon_k < \infty$ and $\sum_{k=1}^{\infty} \epsilon_k < \infty$.

(a) If $X$ is a single point set, then $\{x^k\}$ converges to an optimal solution of (3.64).

(b) If $X$ includes at least two elements and for any $x = (x_1, x_2)$, $\bar x = (\bar x_1, \bar x_2) \in X$ with $x \neq \bar x$, it holds that $x_1 \neq \bar x_1$ or $\|x_2\| \neq \|\bar x_2\|$, then $\{x^k\}$ converges to an optimal solution of (3.64) whenever $H \in \widehat{F}_1(K^n)$ (or $H \in \widehat{F}_2(K^n)$).

(c) If $H \in \bar{F}_2(K^n)$, then $\{x^k\}$ converges to an optimal solution of (3.64).

Proof. Part (a) is direct by Proposition 3.15(d1). We next consider part (b). Assume that $H \in \widehat{F}_2(K^n)$. Since $\sum_{k=1}^{\infty} \lambda_k \epsilon_k < \infty$, from (3.76) and Lemma 2.1 of , it follows that the sequence $\{H(x^k, x)\}$ is convergent for any $x \in X$. Let $\bar x$ be the limit of a subsequence $\{x^{k_l}\}$. By Proposition 3.15(d2), $\bar x \in X$. Consequently, $\{H(x^k, \bar x)\}$ is convergent. By (P7'), $H(x^{k_l}, \bar x) \to 0$, and so $H(x^k, \bar x) \to 0$. Along with (P8'), $\lambda_i(x^k) \to \lambda_i(\bar x)$ for $i = 1, 2$, i.e.,

$$x_1^k - \|x_2^k\| \to \bar x_1 - \|\bar x_2\| \quad \text{and} \quad x_1^k + \|x_2^k\| \to \bar x_1 + \|\bar x_2\| \quad \text{as } k \to \infty.$$

This implies that $x_1^k \to \bar x_1$ and $\|x_2^k\| \to \|\bar x_2\|$. Together with the given assumption for $X$, we have that $x^k \to \bar x$. Suppose that $H \in \widehat{F}_1(K^n)$. The inequality (3.76) becomes

$$H(x, x^k) \le H(x, x^{k-1}) + \lambda_k \epsilon_k, \quad \forall x \in X,$$

and using (P7)-(P8) and the same arguments as above then yields the result. Part (c) is direct by the arguments above and the property (P9'). □

When all points in the nonempty $X$ lie on the boundary of $K^n$, we must have $x_1 \neq \bar x_1$ or $\|x_2\| \neq \|\bar x_2\|$ for any $x = (x_1, x_2)$, $\bar x = (\bar x_1, \bar x_2) \in X$ with $x \neq \bar x$, and the assumption for $X$ in (b) is automatically satisfied. Since the solutions of (3.64) are generally on the boundary of $K^n$, the assumption for $X$ in Proposition 3.17(b) is much weaker than the one in Proposition 3.17(a).

Up to now, we have studied two types of convergence results for the IPA according to the class in which the proximal distance $H$ lies. Proposition 3.15 and Proposition 3.16 show that the largest, and least demanding, classes $F_1(\operatorname{int}(K^n))$ and $F_2(\operatorname{int}(K^n))$ provide reasonable convergence properties for the IPA under minimal assumptions on the problem's data. This coincides with interior proximal methods for convex programming over nonnegative orthant cones; see . The smallest subclass $\bar{F}_2(K^n)$ of $F_2(\operatorname{int}(K^n))$ guarantees that $\{x^k\}$ converges to an optimal solution provided that $X$ is nonempty. The smaller class $\widehat{F}_2(K^n)$ may guarantee the global convergence of the sequence $\{x^k\}$ to an optimal solution under an additional assumption besides the nonemptiness of $X$. Moreover, we will illustrate that there are indeed examples for the class $\bar{F}_2(K^n)$. For the smallest subclass $\widehat{F}_1(K^n)$ of $F_1(\operatorname{int}(K^n))$, the analysis shows that it seems hard to find an example, although it guarantees the convergence of $\{x^k\}$ to an optimal solution by Proposition 3.17(b).

Next, we provide three ways to construct a proximal distance w.r.t. $\operatorname{int}(K^n)$ and analyze their respective advantages and disadvantages. All of these ways exploit a l.s.c. (lower semi-continuous) proper univariate function to produce such a proximal distance. In addition, with such a proximal distance and the Euclidean distance, we obtain the regularized ones.

The first way produces the proximal distances for the class F1(int(Kn)). This way is based on the compound of a univariate function φ and the determinant function det(·), where φ : IR → (−∞, ∞] is a l.s.c. proper function satisfying the following conditions:

(B1) domφ ⊆ [0, ∞), int(domφ) = (0, ∞), and φ is continuous on its domain;

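(B2) for any $t_1, t_2 \in \operatorname{dom}\phi$, there holds that

$$\phi(t_1^r t_2^{1-r}) \le r\phi(t_1) + (1 - r)\phi(t_2), \quad \forall r \in [0, 1]; \qquad (3.77)$$

(B3) $\phi$ is continuously differentiable on $\operatorname{int}(\operatorname{dom}\phi)$ with $\operatorname{dom}(\phi') = (0, \infty)$;

(B4) $\phi'(t) < 0$ for all $t \in (0, \infty)$, $\lim_{t \to 0^+} \phi(t) = \infty$, and $\lim_{t \to \infty} t^{-1}\phi(t^2) \ge 0$.

With such a univariate $\phi$, we define the function $H : \mathbb{R}^n \times \mathbb{R}^n \to (-\infty, \infty]$ as in (3.15):

$$H(x, y) := \begin{cases} \phi(\det(x)) - \phi(\det(y)) - \langle \nabla\phi(\det(y)), x - y\rangle, & \forall x, y \in \operatorname{int}(K^n), \\ \infty, & \text{otherwise.} \end{cases} \qquad (3.78)$$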

By the conditions (B1)-(B4), we may prove that H has the following properties.

Proposition 3.18. Let H be defined as in (3.78) with φ satisfying (B1)-(B4). Then, the following hold.

(a) For any fixed y ∈ int(Kn), H(·, y) is strictly convex over int(Kn).

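(b) For any fixed $y \in \operatorname{int}(K^n)$, $H(\cdot, y)$ is continuously differentiable on $\operatorname{int}(K^n)$ with

$$\nabla_1 H(x, y) = 2\phi'(\det(x)) \begin{bmatrix} x_1 \\ -x_2 \end{bmatrix} - 2\phi'(\det(y)) \begin{bmatrix} y_1 \\ -y_2 \end{bmatrix} \qquad (3.79)$$

for all $x \in \operatorname{int}(K^n)$, where $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$.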

(c) H(x, y) ≥ 0 for all x, y ∈ IRn, and H(y, y) = 0 for all y ∈ int(Kn).

(d) For any y ∈ int(Kn), the sets {x ∈ int(Kn) | H(x, y) ≤ γ} are bounded for all γ ∈ IR.

(e) For any $x, y \in \operatorname{int}(K^n)$ and $z \in \operatorname{int}(K^n)$, the following three point identity holds:

$$H(z, y) = H(z, x) + H(x, y) + \langle \nabla_1 H(x, y), z - x\rangle.$$

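Proof. (a) It suffices to prove that $\phi(\det(x))$ is strictly convex on $\operatorname{int}(K^n)$. By Proposition 1.8(a), we have

$$\det(\alpha x + (1 - \alpha) z) > (\det(x))^{\alpha} (\det(z))^{1-\alpha}, \quad \forall \alpha \in (0, 1),$$

for all $x, z \in \operatorname{int}(K^n)$ with $x \neq z$. Since $\phi'(t) < 0$ for all $t \in (0, +\infty)$, we have that $\phi$ is decreasing on $(0, +\infty)$. This, together with the condition (B2), yields that

$$\phi\left[\det(\alpha x + (1 - \alpha) z)\right] < \phi\left[(\det(x))^{\alpha} (\det(z))^{1-\alpha}\right] \le \alpha\phi[\det(x)] + (1 - \alpha)\phi[\det(z)], \quad \forall \alpha \in (0, 1),$$

for any $x, z \in \operatorname{int}(K^n)$ with $x \neq z$. This means that $\phi(\det(x))$ is strictly convex on $\operatorname{int}(K^n)$.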

(b) Since det(x) is continuously differentiable on IRn and φ is continuously differentiable on (0, ∞), we have that φ(det(x)) is continuously differentiable on int(Kn). This means that for any fixed y ∈ int(Kn), H(·, y) is continuously differentiable on int(Kn). By a simple computation, we immediately obtain the formula in (3.79).

(c) Since $\phi(\det(x))$ is strictly convex and continuously differentiable on $\operatorname{int}(K^n)$, we have

$$\phi(\det(x)) > \phi(\det(y)) + \langle \nabla\phi(\det(y)), x - y\rangle$$

for any $x, y \in \operatorname{int}(K^n)$ with $x \neq y$. This implies that $H(x, y) > 0$ whenever $x \neq y$, while $H(y, y) = 0$ for all $y \in \operatorname{int}(K^n)$ by the definition (3.78). In addition, from the inequality and the continuity of $\phi$ on its domain, it follows that

$$\phi(\det(x)) \ge \phi(\det(y)) + \langle \nabla\phi(\det(y)), x - y\rangle$$

for any $x, y \in \operatorname{int}(K^n)$. By the definition of $H$, we have $H(x, y) \ge 0$ for all $x, y \in \mathbb{R}^n$.

(d) Let $\{x^k\} \subseteq \operatorname{int}(K^n)$ be a sequence with $\|x^k\| \to \infty$. For any fixed $y = (y_1, y_2) \in \operatorname{int}(K^n)$, we next prove that the sequence $\{H(x^k, y)\}$ is unbounded by three cases, and then the desired result follows. For convenience, we write $x^k = (x_1^k, x_2^k)$ for each $k$.

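Case 1: the sequence $\{\det(x^k)\}$ has a zero limit point. Without loss of generality, we assume that $\det(x^k) \to 0$ as $k \to \infty$. Together with $\lim_{t \to 0^+} \phi(t) = \infty$, it readily follows that $\lim_{k \to \infty} \phi(\det(x^k)) = \infty$. In addition, for each $k$ we have that

$$\langle \nabla\phi(\det(y)), x^k\rangle = 2\phi'(\det(y))\left(x_1^k y_1 - (x_2^k)^T y_2\right) \le 2\phi'(\det(y))\, y_1 \left(x_1^k - \|x_2^k\|\right) \le 0, \qquad (3.80)$$

where the inequality is true by using $\phi'(t) < 0$ for all $t > 0$, the Cauchy-Schwarz inequality, and $y \in \operatorname{int}(K^n)$. Now from (3.78), it then follows that $\lim_{k \to \infty} H(x^k, y) = \infty$.

Case 2: the sequence $\{\det(x^k)\}$ is unbounded. Noting that $\det(x^k) > 0$ for each $k$, we must have $\det(x^k) \to +\infty$ as $k \to \infty$. Since $\phi$ is decreasing on its domain, we have that

$$\frac{\phi(\det(x^k))}{\|x^k\|} = \frac{\sqrt{2}\,\phi\left(\lambda_1(x^k)\lambda_2(x^k)\right)}{\sqrt{(\lambda_1(x^k))^2 + (\lambda_2(x^k))^2}} \ge \frac{\phi\left[(\lambda_2(x^k))^2\right]}{\lambda_2(x^k)}.$$

Note that $\lambda_2(x^k) \to \infty$ in this case, and from the last inequality and (B4) it follows that

$$\lim_{k \to \infty} \frac{\phi(\det(x^k))}{\|x^k\|} \ge \lim_{k \to \infty} \frac{\phi\left[(\lambda_2(x^k))^2\right]}{\lambda_2(x^k)} \ge 0.$$

In addition, since $\{x^k / \|x^k\|\}$ is bounded, we may assume without loss of generality that $x^k / \|x^k\| \to \hat x = (\hat x_1, \hat x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$. Then, $\hat x \in K^n$, $\|\hat x\| = 1$, and $\hat x_1 > 0$ (if not, $\hat x = 0$), and hence

$$\lim_{k \to \infty} \left\langle \nabla\phi(\det(y)), \frac{x^k}{\|x^k\|} \right\rangle = \langle \nabla\phi(\det(y)), \hat x\rangle = 2\phi'(\det(y))\left(\hat x_1 y_1 - \hat x_2^T y_2\right) \le 2\phi'(\det(y))\, \hat x_1 \left(y_1 - \|y_2\|\right) < 0.$$

The two sides show that $\lim_{k \to \infty} H(x^k, y) / \|x^k\| > 0$, and consequently $\lim_{k \to \infty} H(x^k, y) = \infty$.

Case 3: the sequence $\{\det(x^k)\}$ has some limit point $\omega$ with $0 < \omega < \infty$. Without loss of generality, we assume that $\det(x^k) \to \omega$ as $k \to \infty$. Since $\{x^k\}$ is unbounded and $\{x^k\} \subset \operatorname{int}(K^n)$, we must have $x_1^k \to \infty$. In addition, by (3.80) and $\phi'(t) < 0$ for $t > 0$,

$$-\langle \nabla\phi(\det(y)), x^k\rangle \ge -2\phi'(\det(y))\left(x_1^k y_1 - \|x_2^k\| \|y_2\|\right) \ge -2\phi'(\det(y))\, x_1^k \left(y_1 - \|y_2\|\right).$$

This along with $y \in \operatorname{int}(K^n)$ implies that $-\langle \nabla\phi(\det(y)), x^k\rangle \to +\infty$ as $k \to \infty$. Noting that $\phi(\det(x^k))$ is bounded, from (3.78) it follows that $\lim_{k \to \infty} H(x^k, y) = \infty$.

(e) For any $x, y \in \operatorname{int}(K^n)$ and $z \in \operatorname{int}(K^n)$, from the definition of $H$ it follows that

$$H(z, y) - H(z, x) - H(x, y) = \langle \nabla\phi(\det(x)) - \nabla\phi(\det(y)), z - x\rangle = \langle \nabla_1 H(x, y), z - x\rangle,$$

where the last equality is by part (b). The proof is thus complete. □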

Proposition 3.18 shows that the function $H$ defined by (3.15) with $\phi$ satisfying (B1)-(B4) is a proximal distance w.r.t. $\operatorname{int}(K^n)$ with $\operatorname{dom} H = \operatorname{int}(K^n) \times \operatorname{int}(K^n)$. Also, $H \in F_1(\operatorname{int}(K^n))$. The conditions (B1) and (B3)-(B4) are easy to check, whereas by Lemma 2.2 of  we have the following important characterizations for the condition (B2).

Lemma 3.8. A function φ : (0, ∞) → IR satisfies (B2) if and only if one of the following conditions holds:

(a) the function φ(exp(·)) is convex on IR;

(b) $\phi(t_1 t_2) \le \frac{1}{2}\left(\phi(t_1^2) + \phi(t_2^2)\right)$ for any $t_1, t_2 > 0$;

(c) $\phi'(t) + t\phi''(t) \ge 0$ if $\phi$ is twice differentiable.
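As a quick illustration of criterion (c), consider the two kernels $\phi(t) = -\ln t$ and $\phi(t) = t^{1-q}/(q-1)$ with $q > 1$ (the functions used in Examples 3.8 and 3.9). The following worked computation is a sketch that checks only condition (B2); the other conditions (B1), (B3), (B4) must still be verified separately.

$$\phi(t) = -\ln t: \qquad \phi'(t) + t\phi''(t) = -\frac{1}{t} + t \cdot \frac{1}{t^2} = 0 \ge 0,$$

$$\phi(t) = \frac{t^{1-q}}{q-1}: \qquad \phi'(t) = -t^{-q}, \quad \phi''(t) = q\,t^{-q-1}, \quad \phi'(t) + t\phi''(t) = (q-1)\,t^{-q} \ge 0.$$

The same conclusion follows from criterion (a): $\phi(e^s) = -s$ is linear in the first case, and $\phi(e^s) = e^{(1-q)s}/(q-1)$ is a positive multiple of a convex exponential in the second.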

Example 3.8. Let $\phi : (0, \infty) \to \mathbb{R}$ be

$$\phi(t) = \begin{cases} -\ln t, & \text{if } t > 0, \\ \infty, & \text{otherwise.} \end{cases}$$

Solution. It is easy to verify that $\phi$ satisfies (B1)-(B4). By formula (3.78), the induced proximal distance is

$$H(x, y) := \begin{cases} -\ln\dfrac{\det(x)}{\det(y)} + \dfrac{2x^T J_n y}{\det(y)} - 2, & \forall x, y \in \operatorname{int}(K^n), \\ \infty, & \text{otherwise,} \end{cases}$$

where $J_n$ is a diagonal matrix with the first entry being $1$ and the remaining $(n - 1)$ entries being $-1$. This is exactly the proximal distance given by . Since $H \in F_1(\operatorname{int}(K^n))$, we have the results of Proposition 3.15(a)-(d1) if this proximal distance is used for the IPA. □
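The properties claimed for this log-determinant distance can be probed numerically. The sketch below is only a sanity check on random samples, not a proof; the function names, the sampling scheme, the dimension n = 4 and the random seed are all assumptions introduced here. It evaluates H, the gradient formula (3.79) with φ'(t) = −1/t, and measures how far the three point identity of Proposition 3.18(e) is from holding in floating point.

```python
import math
import random

def det_soc(x):
    """det(x) = x1^2 - ||x2||^2 for x = (x1, x2)."""
    return x[0] * x[0] - sum(v * v for v in x[1:])

def reflect(x):
    """J_n x = (x1, -x2)."""
    return [x[0]] + [-v for v in x[1:]]

def inner(a, b):
    """Euclidean inner product."""
    return sum(p * q for p, q in zip(a, b))

def log_det_distance(x, y):
    """H(x, y) of Example 3.8; both arguments must lie in int(K^n)."""
    return (-math.log(det_soc(x) / det_soc(y))
            + 2.0 * inner(x, reflect(y)) / det_soc(y) - 2.0)

def grad1(x, y):
    """grad_1 H(x, y) from (3.79) with phi'(t) = -1/t."""
    gx = [-2.0 * v / det_soc(x) for v in reflect(x)]
    gy = [-2.0 * v / det_soc(y) for v in reflect(y)]
    return [a - b for a, b in zip(gx, gy)]

def random_interior_point(rng, n):
    """Sample a point with x1 > ||x2|| (the sampling scheme is arbitrary)."""
    tail = [rng.uniform(-1.0, 1.0) for _ in range(n - 1)]
    head = math.sqrt(sum(v * v for v in tail)) + rng.uniform(0.5, 2.0)
    return [head] + tail

rng = random.Random(0)
max_identity_gap = 0.0     # |H(z,y) - H(z,x) - H(x,y) - <grad_1 H(x,y), z-x>|
min_distance = math.inf    # smallest sampled value of H(x, y)
for _ in range(200):
    x, y, z = (random_interior_point(rng, 4) for _ in range(3))
    step = [a - b for a, b in zip(z, x)]
    rhs = (log_det_distance(z, x) + log_det_distance(x, y)
           + inner(grad1(x, y), step))
    max_identity_gap = max(max_identity_gap,
                           abs(log_det_distance(z, y) - rhs))
    min_distance = min(min_distance, log_det_distance(x, y))
self_distance = log_det_distance(y, y)   # should vanish up to rounding
```

On such samples one expects `max_identity_gap` and `self_distance` to be at the level of rounding error and `min_distance` to be positive, consistent with parts (c) and (e) of Proposition 3.18.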

Example 3.9. Take $\phi(t) = t^{1-q}/(q - 1)$ with $q > 1$ if $t > 0$, and otherwise $\phi(t) = \infty$.

Solution. It is not hard to check that $\phi$ satisfies (B1)-(B4). In light of (3.78), we compute that

$$H(x, y) := \begin{cases} \dfrac{(\det(x))^{1-q} - (\det(y))^{1-q}}{q - 1} + \dfrac{2x^T J_n y}{(\det(y))^q} - 2(\det(y))^{1-q}, & \forall x, y \in \operatorname{int}(K^n), \\ \infty, & \text{otherwise,} \end{cases}$$

where $J_n$ is the same diagonal matrix as in Example 3.8. Since $H \in F_1(\operatorname{int}(K^n))$, when using this proximal distance for the IPA, the results of Proposition 3.15(a)-(d1) hold. □

We should emphasize that the first way cannot produce proximal distances of the class $F_1(K^n)$, and hence none of the class $\widehat{F}_1(K^n)$, since the condition $\lim_{t \to 0^+} \phi(t) = \infty$ is necessary to guarantee that $H$ has the property (P4), but it implies that the domain of $H(\cdot, y)$ for any $y \in \operatorname{int}(K^n)$ cannot be continuously extended to $K^n$. Thus, when choosing such proximal distances for the IPA, we cannot apply Proposition 3.15(d2) and Proposition 3.17.

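The other two ways are both based on the composition of the trace function $\operatorname{tr}(\cdot)$ and a vector-valued function induced by a univariate $\phi$ via (1.9). For convenience, in the sequel, for any l.s.c. proper function $\phi : \mathbb{R} \to (-\infty, \infty]$, we write $d : \mathbb{R} \times \mathbb{R} \to (-\infty, \infty]$ as

$$d(s, t) := \begin{cases} \phi(s) - \phi(t) - \phi'(t)(s - t), & \text{if } s \in \operatorname{dom}\phi,\ t \in \operatorname{dom}(\phi'), \\ \infty, & \text{otherwise.} \end{cases} \qquad (3.81)$$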

The second way also produces the proximal distances for the class F1(int(Kn)), which requires φ : IR → (−∞, ∞] to be a l.s.c. proper function satisfying the conditions:

(C1) domφ ⊆ [0, +∞) and int(domφ) = (0, ∞);

(C2) φ is continuous and strictly convex on its domain;

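(C3) $\phi$ is continuously differentiable on $\operatorname{int}(\operatorname{dom}\phi)$ with $\operatorname{dom}(\phi') = (0, \infty)$;

(C4) for any fixed $t > 0$, the sets $\{s \in \operatorname{dom}\phi \mid d(s, t) \le \gamma\}$ are bounded for all $\gamma \in \mathbb{R}$; for any fixed $s \in \operatorname{dom}\phi$, the sets $\{t > 0 \mid d(s, t) \le \gamma\}$ are bounded for all $\gamma \in \mathbb{R}$.

Let $\phi^{soc}$ be the vector-valued function induced by $\phi$ via (1.9) and write $\operatorname{dom}(\phi^{soc}) = C_1$. Clearly, $C_1 \subseteq K^n$ and $\operatorname{int} C_1 = \operatorname{int}(K^n)$. Define the function $H : \mathbb{R}^n \times \mathbb{R}^n \to (-\infty, \infty]$ by

$$H(x, y) := \begin{cases} \operatorname{tr}(\phi^{soc}(x)) - \operatorname{tr}(\phi^{soc}(y)) - \langle \nabla\operatorname{tr}(\phi^{soc}(y)), x - y\rangle, & \forall x \in C_1,\ y \in \operatorname{int}(K^n), \\ \infty, & \text{otherwise.} \end{cases} \qquad (3.82)$$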

Using (1.6), Proposition 1.3, Lemma 3.3, the conditions (C1)-(C4), and similar arguments to [116, Proposition 3.1 and Proposition 3.2] (also see Section 3.1), it is not difficult to argue that H has the following favorable properties.

Proposition 3.19. Let H be defined by (3.82) with φ satisfying (C1)-(C4). Then, the following hold.

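(a) For any fixed $y \in \operatorname{int}(K^n)$, $H(\cdot, y)$ is continuous and strictly convex on $C_1$.

(b) For any fixed $y \in \operatorname{int}(K^n)$, $H(\cdot, y)$ is continuously differentiable on $\operatorname{int}(K^n)$ with

$$\nabla_1 H(x, y) = \nabla\operatorname{tr}(\phi^{soc}(x)) - \nabla\operatorname{tr}(\phi^{soc}(y)) = 2\left[(\phi')^{soc}(x) - (\phi')^{soc}(y)\right].$$

(c) $H(x, y) \ge 0$ for all $x, y \in \mathbb{R}^n$, and $H(y, y) = 0$ for any $y \in \operatorname{int}(K^n)$.

(d) $H(x, y) \ge \sum_{i=1}^{2} d(\lambda_i(x), \lambda_i(y)) \ge 0$ for any $x \in C_1$ and $y \in \operatorname{int}(K^n)$.

(e) For any fixed $y \in \operatorname{int}(K^n)$, the sets $\{x \in C_1 \mid H(x, y) \le \gamma\}$ are bounded for all $\gamma \in \mathbb{R}$; for any fixed $x \in C_1$, the sets $\{y \in \operatorname{int}(K^n) \mid H(x, y) \le \gamma\}$ are bounded for all $\gamma \in \mathbb{R}$.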

(f ) For any x, y ∈ int(Kn) and z ∈ C1, the following three point identity holds:

H(z, y) = H(z, x) + H(x, y) + h∇1H(x, y), z − xi.

Proposition 3.19 shows that the function H defined by (3.82) with φ satisfying (C1)-(C4) is a proximal distance w.r.t. int(Kn) with dom H = C1× int(Kn), and furthermore, such proximal distances belong to the class F1(int(Kn)). In particular, when domφ = [0, ∞), they also belong to the class F1(Kn). We next present some specific examples.

Example 3.10. Let φ(t) = t ln t − t if t ≥ 0, and otherwise φ(t) = ∞, where we stipulate 0 ln 0 = 0.

Solution. It is easy to verify that φ satisfies (C1)-(C4) with domφ = [0, ∞). By formulas (1.9) and (3.82), we compute that H has the following expression:

H(x, y) = tr(x ◦ ln x − x ◦ ln y + y − x), ∀x ∈ Kn, y ∈ int(Kn).

∞, otherwise.
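The entropy-like distance above can be evaluated through the spectral values λ1(x) = x1 − ‖x2‖ and λ2(x) = x1 + ‖x2‖. The following sketch is an illustration under stated assumptions rather than a reference implementation: it takes ∇tr(φsoc(y)) = 2(φ')soc(y) from Proposition 3.19(b), uses hypothetical function names and random test points, and then checks the lower bound of Proposition 3.19(d) on those samples.

```python
import math
import random

def spectral(x):
    """Spectral values lambda_1 <= lambda_2 and the unit direction of x2."""
    norm2 = math.sqrt(sum(v * v for v in x[1:]))
    if norm2 > 0.0:
        w = [v / norm2 for v in x[1:]]
    else:
        w = [1.0] + [0.0] * (len(x) - 2)  # any unit vector is admissible
    return x[0] - norm2, x[0] + norm2, w

def phi(t):
    """Entropy kernel t ln t - t with the convention 0 ln 0 = 0 (t >= 0)."""
    return t * math.log(t) - t if t > 0.0 else 0.0

def d(s, t):
    """Scalar distance (3.81) for the entropy kernel; requires t > 0."""
    return phi(s) - phi(t) - math.log(t) * (s - t)

def log_soc(y):
    """ln(y) = ln(lam1) u1 + ln(lam2) u2 with u_i = (1, -/+ w) / 2."""
    lam1, lam2, w = spectral(y)
    a, b = math.log(lam1), math.log(lam2)
    return [0.5 * (a + b)] + [0.5 * (b - a) * wi for wi in w]

def entropy_distance(x, y):
    """H(x, y) from (3.82): x in K^n, y in int(K^n); +inf otherwise."""
    lx1, lx2, _ = spectral(x)
    ly1, ly2, _ = spectral(y)
    if lx1 < 0.0 or ly1 <= 0.0:
        return math.inf
    grad = [2.0 * v for v in log_soc(y)]  # grad tr(phi^soc)(y) = 2 (phi')^soc(y)
    linear = sum(g * (a - b) for g, a, b in zip(grad, x, y))
    return phi(lx1) + phi(lx2) - phi(ly1) - phi(ly2) - linear

def random_interior_point(rng, n):
    """Sample a point with x1 > ||x2|| (the sampling scheme is arbitrary)."""
    tail = [rng.uniform(-1.0, 1.0) for _ in range(n - 1)]
    return [math.sqrt(sum(v * v for v in tail)) + rng.uniform(0.5, 2.0)] + tail

rng = random.Random(1)
slack = []  # H(x, y) - sum_i d(lambda_i(x), lambda_i(y)); expected >= 0
for _ in range(200):
    x, y = random_interior_point(rng, 3), random_interior_point(rng, 3)
    lx1, lx2, _ = spectral(x)
    ly1, ly2, _ = spectral(y)
    slack.append(entropy_distance(x, y) - d(lx1, ly1) - d(lx2, ly2))
# A boundary point of K^3 as first argument still gives a finite value.
boundary_value = entropy_distance([1.0, 1.0, 0.0], [2.0, 0.0, 1.0])
```

The finite `boundary_value` reflects that the first argument may lie on the boundary of Kn here, which is what distinguishes this construction from the determinant-based one.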



Example 3.11. Let φ(t) = tp − tq if t ≥ 0, and otherwise φ(t) = ∞, where p ≥ 1 and 0 < q < 1.

Solution. We can show that φ satisfies the conditions (C1)-(C4) with dom(φ) = [0, ∞).

When p = 1 and q = 1/2, from formulas (1.9) and (3.82), we derive that

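$$H(x, y) = \begin{cases} \operatorname{tr}\left[y^{1/2} - x^{1/2} + \dfrac{\left(\operatorname{tr}(y^{1/2})\,e - y^{1/2}\right) \circ (x - y)}{2\sqrt{\det(y)}}\right], & \forall x \in K^n,\ y \in \operatorname{int}(K^n), \\ \infty, & \text{otherwise.} \end{cases}$$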



Example 3.12. Let φ(t) = −tq if t ≥ 0, and otherwise φ(t) = ∞, where 0 < q < 1.

Solution. We can show that φ satisfies the conditions (C1)-(C4) with domφ = [0, ∞).

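Now

$$H(x, y) = \begin{cases} (1 - q)\operatorname{tr}(y^q) - \operatorname{tr}(x^q) + \operatorname{tr}(q\,y^{q-1} \circ x), & \forall x \in K^n,\ y \in \operatorname{int}(K^n), \\ \infty, & \text{otherwise.} \end{cases}$$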



Example 3.13. Let φ(t) = − ln t + t − 1 if t > 0, and otherwise φ(t) = ∞.

Solution. It is easy to check that φ satisfies (C1)-(C4) with domφ = (0, ∞). The induced proximal distance is

H(x, y) = tr(ln y) − tr(ln x) + 2hy−1, xi − 2, ∀x, y ∈ int(Kn).

∞, otherwise.

By a simple computation, we obtain that the proximal distance is same as the one given by Example 3.8, and the one induced by φ(t) = − ln t (t > 0) via formula (3.82). 
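The computation can be spelled out as follows. This is a sketch that uses only two standard identities of the Jordan algebra associated with $K^n$: $\operatorname{tr}(\ln x) = \ln\lambda_1(x) + \ln\lambda_2(x) = \ln\det(x)$, and $y^{-1} = J_n y/\det(y)$ for $y \in \operatorname{int}(K^n)$, where $J_n = \operatorname{diag}(1, -1, \ldots, -1)$.

$$\operatorname{tr}(\ln y) - \operatorname{tr}(\ln x) + 2\langle y^{-1}, x\rangle - 2 = \ln\det(y) - \ln\det(x) + \frac{2x^T J_n y}{\det(y)} - 2 = -\ln\frac{\det(x)}{\det(y)} + \frac{2x^T J_n y}{\det(y)} - 2.$$

The affine part $t - 1$ of $\phi$ leaves no trace in the result: it adds only an affine function of $x$ to $\operatorname{tr}(\phi^{soc}(x))$, and a distance of the form (3.82) is unchanged when an affine function is added to its generating function.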

Clearly, the proximal distances in Examples 3.10–3.12 belong to the class F1(Kn).

Also, by Proposition 3.20 below, the proximal distances in Examples 3.10–3.11 also satisfy (P8) since the corresponding φ also satisfies the following condition (C5):

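(C5) For any bounded sequence $\{a_k\} \subseteq \operatorname{int}(\operatorname{dom}\phi)$ and $a \in \operatorname{dom}\phi$ such that $\lim_{k \to \infty} d(a, a_k) = 0$, there holds that $a = \lim_{k \to \infty} a_k$, where $d$ is defined as in (3.81).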

Proposition 3.20. Let H be defined as in (3.82) with φ satisfying (C1)-(C5) and domφ = [0, ∞). Then, for any bounded sequence {yk} ⊆ int(Kn) and y ∈ Kn such that H(y, yk) → 0, we have λi(yk) → λi(y) for i = 1, 2.

Proof. From Proposition 3.19(d) and the nonnegativity of $d$, for each $k$ we have

$$H(y, y^k) \ge d(\lambda_i(y), \lambda_i(y^k)) \ge 0, \quad i = 1, 2.$$

This, together with the given assumption $H(y, y^k) \to 0$, implies that

$$d(\lambda_i(y), \lambda_i(y^k)) \to 0, \quad i = 1, 2.$$

Notice that $\{\lambda_i(y^k)\} \subset \operatorname{int}(\operatorname{dom}\phi)$ and $\lambda_i(y) \in [0, \infty) = \operatorname{dom}\phi$ for $i = 1, 2$ by Property 1.1(c). From the condition (C5), we immediately obtain $\lambda_i(y^k) \to \lambda_i(y)$ for $i = 1, 2$. □

Nevertheless, we should point out that the proximal distance H given by (3.82) with φ satisfying (C1)-(C4) and domφ = [0, ∞) generally does not have the property (P7), even if φ satisfies the condition (C6) below. This fact will be illustrated by Example 3.14.

(C6) For any $\{a_k\} \subseteq (0, \infty)$ converging to $a \in [0, \infty)$, $\lim_{k \to \infty} d(a, a_k) = 0$.

Example 3.14. Let H be the proximal distance induced by the entropy function φ in Example 3.10.

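Solution. It is easy to verify that $\phi$ satisfies the conditions (C1)-(C6). Here we present a sequence $\{y^k\} \subset \operatorname{int}(K^3)$ which converges to $y \in K^3$, but $H(y, y^k) \to \infty$. Let

$$y^k = \begin{bmatrix} \sqrt{2(1 + e^{-k^3})} \\ \sqrt{1 + k^{-1} - e^{-k^3}} \\ \sqrt{1 - k^{-1} + e^{-k^3}} \end{bmatrix} \in \operatorname{int}(K^3) \quad \text{and} \quad y = \begin{bmatrix} \sqrt{2} \\ 1 \\ 1 \end{bmatrix} \in K^3.$$

By the expression of $H(y, y^k)$, i.e., $H(y, y^k) = \operatorname{tr}(y \circ \ln y) - \operatorname{tr}(y \circ \ln y^k) + \operatorname{tr}(y^k - y)$, it suffices to prove that $\lim_{k \to \infty} -\operatorname{tr}(y \circ \ln y^k) = \infty$, since $\lim_{k \to \infty} \operatorname{tr}(y^k - y) = 0$ and $\operatorname{tr}(y \circ \ln y) = \lambda_2(y)\ln(\lambda_2(y)) < \infty$. By the definition of $\ln y^k$, we have

$$\operatorname{tr}(y \circ \ln y^k) = \ln(\lambda_1(y^k))\left(y_1 - y_2^T \bar y_2^k\right) + \ln(\lambda_2(y^k))\left(y_1 + y_2^T \bar y_2^k\right) \qquad (3.83)$$

for $y = (y_1, y_2)$, $y^k = (y_1^k, y_2^k) \in \mathbb{R} \times \mathbb{R}^2$ with $\bar y_2^k = y_2^k / \|y_2^k\|$. By computing,

$$\ln(\lambda_1(y^k)) = \ln\sqrt{2} - \ln\left(1 + \sqrt{1 + e^{-k^3}}\right) - k^3,$$

$$y_1 - y_2^T \bar y_2^k = \frac{1}{\|y_2^k\|}\left(\frac{-k^{-1} + e^{-k^3}}{1 + \sqrt{1 + k^{-1} - e^{-k^3}}} + \frac{k^{-1} - e^{-k^3}}{1 + \sqrt{1 - k^{-1} + e^{-k^3}}}\right).$$

The last two equalities imply that $\lim_{k \to \infty} \ln(\lambda_1(y^k))\left(y_1 - y_2^T \bar y_2^k\right) = -\infty$. In addition, by noting that $y_2^k \neq 0$ for each $k$, we compute that

$$\lim_{k \to \infty} \ln(\lambda_2(y^k))\left(y_1 + y_2^T \bar y_2^k\right) = \ln(\lambda_2(y))\left(y_1 + y_2^T \frac{y_2}{\|y_2\|}\right) = \lambda_2(y)\ln(\lambda_2(y)).$$

From the last two equations, we immediately have $\lim_{k \to \infty} -\operatorname{tr}(y \circ \ln y^k) = \infty$. □

Thus, when the proximal distance in the IPA is chosen as the one given by (3.82) with $\phi$ satisfying (C1)-(C6) and $\operatorname{dom}\phi = [0, \infty)$, Proposition 3.17(b) may not apply, i.e., the global convergence to an optimal solution may not be guaranteed. This is different from interior proximal methods for convex programming over nonnegative orthant cones, by noting that $\phi$ is now a univariate Bregman function. Similarly, it seems hard to find examples for the class $F_+(K^n)$ in  so that Theorem 2.2 therein can apply, since it also requires (P7).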
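The divergence in Example 3.14 can also be observed numerically. The sketch below is illustrative only and rests on assumptions spelled out here: it reads the second and third entries of yk with square roots, as the closed-form expression for y1 − y2ᵀȳ2k requires, and it evaluates ln(λ1(yk)) from that closed form instead of from yk itself, because λ1(yk) is far below double-precision resolution already for moderate k. The function name and the chosen values of k are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def entropy_distance_to_boundary_point(k):
    """H(y, y^k) for y = (sqrt 2, 1, 1) and the sequence y^k of Example 3.14."""
    eta = math.exp(-k ** 3)          # underflows harmlessly to 0.0 for large k
    delta = 1.0 / k - eta
    a = math.sqrt(1.0 + delta)       # second entry of y^k
    b = math.sqrt(1.0 - delta)       # third entry of y^k
    # ||y_2^k|| = sqrt(2) exactly, so y_2^T bar{y}_2^k = (a + b) / sqrt(2).
    y2_dot_dir = (a + b) / SQRT2
    # Closed form of ln(lambda_1(y^k)); avoids catastrophic cancellation.
    log_lam1 = math.log(SQRT2) - math.log(1.0 + math.sqrt(1.0 + eta)) - k ** 3
    log_lam2 = math.log(SQRT2 * (math.sqrt(1.0 + eta) + 1.0))
    # Formula (3.83) with y_1 = sqrt(2).
    tr_y_log_yk = (log_lam1 * (SQRT2 - y2_dot_dir)
                   + log_lam2 * (SQRT2 + y2_dot_dir))
    lam2_y = 2.0 * SQRT2             # lambda_1(y) = 0, lambda_2(y) = 2 sqrt(2)
    tr_y_log_y = lam2_y * math.log(lam2_y)
    tr_yk_minus_y = 2.0 * (SQRT2 * math.sqrt(1.0 + eta) - SQRT2)
    return tr_y_log_y - tr_y_log_yk + tr_yk_minus_y

values = {k: entropy_distance_to_boundary_point(k) for k in (2, 10, 40)}
```

The computed values grow with k even though yk approaches y, which is the failure of (P7) that the example is designed to exhibit. The growth is driven by the product of ln(λ1(yk)), of order −k³, with y1 − y2ᵀȳ2k, of order k⁻².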

The third way will produce the proximal distances for the class F2(int(Kn)), which needs a l.s.c. proper function φ : IR → (−∞, ∞] satisfying the following conditions:

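(D1) $\phi$ is strictly convex and continuous on $\operatorname{dom}\phi$, and $\phi$ is continuously differentiable on a subset of $\operatorname{dom}\phi$, where $\operatorname{dom}(\phi') \subseteq \operatorname{dom}\phi \subseteq [0, \infty)$ and $\operatorname{int}(\operatorname{dom}\phi') = (0, \infty)$;

(D2) $\phi$ is twice continuously differentiable on $\operatorname{int}(\operatorname{dom}\phi)$ and $\lim_{t \to 0^+} \phi''(t) = \infty$;

(D3) $\phi'(t)t - \phi(t)$ is convex on $\operatorname{dom}(\phi')$, and $\phi'$ is strictly concave on $\operatorname{dom}(\phi')$;

(D4) $\phi'$ is SOC-concave on $\operatorname{dom}(\phi')$.

With such a univariate $\phi$, we define the proximal distance $H : \mathbb{R}^n \times \mathbb{R}^n \to (-\infty, \infty]$ by

$$H(x, y) := \begin{cases} \operatorname{tr}(\phi^{soc}(y)) - \operatorname{tr}(\phi^{soc}(x)) - \langle \nabla\operatorname{tr}(\phi^{soc}(x)), y - x\rangle, & \forall x \in C_1,\ y \in C_2, \\ \infty, & \text{otherwise,} \end{cases} \qquad (3.84)$$

where $C_1$ and $C_2$ are the domains of $\phi^{soc}$ and $(\phi')^{soc}$, respectively. By the relation between $\operatorname{dom}(\phi)$ and $\operatorname{dom}(\phi')$, obviously, $C_2 \subseteq C_1 \subseteq K^n$ and $\operatorname{int} C_1 = \operatorname{int} C_2 = \operatorname{int}(K^n)$.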

Lemma 3.9. Let φ : IR → (−∞, ∞] be a l.s.c. proper function satisfying (D1)-(D4).

Then, the following hold.

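(a) $\operatorname{tr}\left[(\phi')^{soc}(x) \circ x - \phi^{soc}(x)\right]$ is convex in $C_1$ and continuously differentiable on $\operatorname{int} C_1$.

(b) For any fixed $y \in \mathbb{R}^n$, $\langle (\phi')^{soc}(x), y\rangle$ is continuously differentiable on $\operatorname{int} C_1$, and moreover, it is strictly concave over $C_1$ whenever $y \in \operatorname{int}(K^n)$.

Proof. (a) Let $\psi(t) := \phi'(t)t - \phi(t)$. Then, by (D2) and (D3), $\psi(t)$ is convex on $\operatorname{dom}\phi'$ and continuously differentiable on $\operatorname{int}(\operatorname{dom}\phi') = (0, +\infty)$. Since $\operatorname{tr}\left[(\phi')^{soc}(x) \circ x - \phi^{soc}(x)\right] = \operatorname{tr}[\psi^{soc}(x)]$, using Lemma 3.3(b) and (c) immediately yields part (a).

(b) From (D2) and Lemma 3.3(a), $(\phi')^{soc}(\cdot)$ is continuously differentiable on $\operatorname{int} C_1$. This implies that $\langle y, (\phi')^{soc}(x)\rangle$ for any fixed $y$ is continuously differentiable on $\operatorname{int} C_1$. We next show that it is also strictly concave in $C_1$ whenever $y \in \operatorname{int}(K^n)$. Note that $\operatorname{tr}[(\phi')^{soc}(\cdot)]$ is strictly concave on $C_1$ since $\phi'$ is strictly concave on $\operatorname{dom}(\phi')$. Consequently,

$$\operatorname{tr}\left[(\phi')^{soc}(\beta x + (1 - \beta) z)\right] > \beta\operatorname{tr}\left[(\phi')^{soc}(x)\right] + (1 - \beta)\operatorname{tr}\left[(\phi')^{soc}(z)\right], \quad \forall\, 0 < \beta < 1,$$

for any $x, z \in C_1$ with $x \neq z$. This implies that

$$(\phi')^{soc}(\beta x + (1 - \beta) z) - \beta(\phi')^{soc}(x) - (1 - \beta)(\phi')^{soc}(z) \neq 0.$$

In addition, since $\phi'$ is SOC-concave on $\operatorname{dom}(\phi')$, it follows that

$$(\phi')^{soc}[\beta x + (1 - \beta) z] - \beta(\phi')^{soc}(x) - (1 - \beta)(\phi')^{soc}(z) \succeq_{K^n} 0.$$

Thus, for any fixed $y \in \operatorname{int}(K^n)$, the last two relations imply that

$$\left\langle y,\ (\phi')^{soc}[\beta x + (1 - \beta) z] - \beta(\phi')^{soc}(x) - (1 - \beta)(\phi')^{soc}(z)\right\rangle > 0.$$

This shows that $\langle y, (\phi')^{soc}(x)\rangle$ for any fixed $y \in \operatorname{int}(K^n)$ is strictly concave on $C_1$. □

Using the conditions (D1)-(D4) and Lemma 3.9, and following the same arguments as [116, Propositions 4.1 and 4.2] (also see Section 3.2, Propositions 3.8-3.9), we may prove the following proposition.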

Proposition 3.21. Let H be defined as in (3.84) with φ satisfying (D1)-(D4). Then, the following hold.

(a) H(x, y) ≥ 0 for any x, y ∈ IRn, and H(y, y) = 0 for any y ∈ int(Kn).

(b) For any fixed y ∈ C2, H(·, y) is continuous in C1, and it is strictly convex on C1 whenever y ∈ int(Kn).

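(c) For any fixed $y \in C_2$, $H(\cdot, y)$ is continuously differentiable on $\operatorname{int}(K^n)$ with

$$\nabla_1 H(x, y) = 2\nabla(\phi')^{soc}(x)(x - y).$$

Moreover, $\operatorname{dom}\nabla_1 H(\cdot, y) = \operatorname{int}(K^n)$ whenever $y \in \operatorname{int}(K^n)$.

(d) $H(x, y) \ge \sum_{i=1}^{2} d(\lambda_i(y), \lambda_i(x)) \ge 0$ for any $x \in C_1$ and $y \in C_2$.

(e) For any fixed $y \in C_2$, the sets $\{x \in C_1 \mid H(x, y) \le \gamma\}$ are bounded for all $\gamma \in \mathbb{R}$.

(f) For all $x, y \in \operatorname{int}(K^n)$ and $z \in C_2$, $H(x, z) - H(y, z) \ge 2\langle \nabla_1 H(y, x), z - y\rangle$.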

Proposition 3.21 demonstrates that the function $H$ defined by (3.84) with $\phi$ satisfying (D1)-(D4) is a proximal distance w.r.t. the cone $\operatorname{int}(K^n)$ and possesses the property (P5'), and therefore belongs to the class $F_2(\operatorname{int}(K^n))$. If, in addition, $\operatorname{dom}\phi = [0, \infty)$, then $H$ belongs to the class $F_2(K^n)$. The conditions (D1)-(D3) are easy to check, and for the condition (D4), we can employ the characterizations in [41, 44] to verify whether $\phi'$ is SOC-concave or not. Some examples are presented as follows.

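Example 3.15. Let $\phi(t) = t \ln t - t + 1$ if $t \ge 0$, and otherwise $\phi(t) = \infty$.

Solution. It is easy to verify that $\phi$ satisfies (D1)-(D3) with $\operatorname{dom}\phi = [0, \infty)$ and $\operatorname{dom}(\phi') = (0, \infty)$. By Example 2.12(c), $\phi'$ is SOC-concave on $(0, \infty)$. Using formulas (1.9) and (3.84), we have

$$H(x, y) = \begin{cases} \operatorname{tr}(y \circ \ln y - y \circ \ln x + x - y), & \forall x \in \operatorname{int}(K^n),\ y \in K^n, \\ \infty, & \text{otherwise.} \end{cases}$$

□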
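The closed form in Example 3.15 can be evaluated numerically through the spectral decomposition of the second-order cone. The following is only a minimal sketch: the helper names (`spectral`, `soc_apply`, `entropy_distance`) are hypothetical, it assumes $n \ge 2$, and it relies on the standard identities $\operatorname{tr}(a \circ b) = 2\langle a, b\rangle$, $\operatorname{tr}(x) = 2x_1$ and $\operatorname{tr}(y \circ \ln y) = \sum_i \lambda_i(y)\ln\lambda_i(y)$, with the convention $0 \ln 0 = 0$.

```python
import math

def spectral(x):
    """Spectral values (lam1, lam2) and vectors (u1, u2) of x = (x1, x2) in R^n, n >= 2."""
    x1, x2 = x[0], x[1:]
    norm = math.sqrt(sum(t * t for t in x2))
    # When x2 = 0 any unit vector may serve as w; the first axis is an arbitrary choice.
    w = [t / norm for t in x2] if norm > 0 else [1.0] + [0.0] * (len(x2) - 1)
    u1 = [0.5] + [-0.5 * t for t in w]
    u2 = [0.5] + [0.5 * t for t in w]
    return (x1 - norm, x1 + norm), (u1, u2)

def soc_apply(g, x):
    """SOC function g^soc(x) = g(lam1) u1 + g(lam2) u2."""
    (l1, l2), (u1, u2) = spectral(x)
    return [g(l1) * a + g(l2) * b for a, b in zip(u1, u2)]

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def entropy_distance(x, y):
    """H(x, y) = tr(y o ln y - y o ln x + x - y) for x in int(K^n), y in K^n; inf otherwise."""
    lam_x, _ = spectral(x)
    lam_y, _ = spectral(y)
    if lam_x[0] <= 0 or lam_y[0] < 0:
        return math.inf
    # tr(y o ln y) = sum_i lam_i(y) ln lam_i(y), with 0 ln 0 = 0.
    tr_y_ln_y = sum(t * math.log(t) for t in lam_y if t > 0)
    # tr(y o ln x) = 2 <y, ln x> and tr(x - y) = 2 (x_1 - y_1).
    return tr_y_ln_y - 2.0 * dot(y, soc_apply(math.log, x)) + 2.0 * x[0] - 2.0 * y[0]

x = [2.0, 0.5, -0.3]   # interior point of K^3
y = [1.5, 0.2, 0.7]    # interior point of K^3
print(entropy_distance(x, y))   # nonnegative, as in Proposition 3.21(a)
print(entropy_distance(y, y))   # zero up to rounding
```

On these sample points the sketch is consistent with Proposition 3.21(a): the value is positive for $x \neq y$ and vanishes (up to rounding) for $x = y$.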

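Example 3.16. Let $\phi(t) = \dfrac{t^{q+1}}{q+1}$ if $t \ge 0$, and otherwise $\phi(t) = \infty$, where $0 < q < 1$.

Solution. It is easy to show that $\phi$ satisfies (D1)-(D3) with $\operatorname{dom}\phi = [0, \infty)$ and $\operatorname{dom}(\phi') = [0, \infty)$. By Example 2.12, $\phi'$ is also SOC-concave on $[0, \infty)$. By (1.9) and (3.84), we compute that

$$H(x, y) = \begin{cases} \dfrac{1}{q+1}\operatorname{tr}(y^{q+1}) + \dfrac{q}{q+1}\operatorname{tr}(x^{q+1}) - \operatorname{tr}(x^{q} \circ y), & \forall x \in \operatorname{int}(K^n),\ y \in K^n, \\ \infty, & \text{otherwise.} \end{cases}$$

□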

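Example 3.17. Let $\phi(t) = (1 + t)\ln(1 + t) + \dfrac{t^{q+1}}{q+1}$ if $t \ge 0$, and otherwise $\phi(t) = \infty$, where $0 < q < 1$.

Solution. We can verify that $\phi$ satisfies (D1)-(D3) with $\operatorname{dom}\phi = \operatorname{dom}(\phi') = [0, \infty)$. From Example 2.12, $\phi'$ is also SOC-concave on $[0, \infty)$. Using (1.9) and (3.84), it is not hard to compute that for any $x, y \in K^n$,

$$H(x, y) = \operatorname{tr}\big[(e + y) \circ (\ln(e + y) - \ln(e + x))\big] - \operatorname{tr}(y - x) + \frac{1}{q+1}\operatorname{tr}(y^{q+1}) + \frac{q}{q+1}\operatorname{tr}(x^{q+1}) - \operatorname{tr}(x^{q} \circ y).$$

□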
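The closed forms in Examples 3.15-3.17 are all consistent with a Bregman-type construction $H(x, y) = \operatorname{tr}\big[\phi^{\mathrm{soc}}(y) - \phi^{\mathrm{soc}}(x) - (\phi')^{\mathrm{soc}}(x) \circ (y - x)\big]$. The sketch below assumes that (3.84) has this form (the formula itself is not restated in this part of the text) and checks numerically that it reproduces the closed form of Example 3.16 for $q = 0.5$. All function names are hypothetical, and $n \ge 2$ is assumed.

```python
import math

def spectral(x):
    """Spectral values and vectors of x in R^n (n >= 2) w.r.t. the second-order cone."""
    x1, x2 = x[0], x[1:]
    norm = math.sqrt(sum(t * t for t in x2))
    w = [t / norm for t in x2] if norm > 0 else [1.0] + [0.0] * (len(x2) - 1)
    u1 = [0.5] + [-0.5 * t for t in w]
    u2 = [0.5] + [0.5 * t for t in w]
    return (x1 - norm, x1 + norm), (u1, u2)

def soc_apply(g, x):
    """g^soc(x) = g(lam1) u1 + g(lam2) u2."""
    (l1, l2), (u1, u2) = spectral(x)
    return [g(l1) * a + g(l2) * b for a, b in zip(u1, u2)]

def tr_soc(g, x):
    """tr[g^soc(x)] = g(lam1) + g(lam2)."""
    (l1, l2), _ = spectral(x)
    return g(l1) + g(l2)

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def bregman_soc(phi, dphi, x, y):
    """Assumed form of (3.84): tr[phi(y) - phi(x) - phi'(x) o (y - x)], using tr(a o b) = 2<a, b>."""
    diff = [b - a for a, b in zip(x, y)]
    return tr_soc(phi, y) - tr_soc(phi, x) - 2.0 * dot(soc_apply(dphi, x), diff)

q = 0.5
phi = lambda t: t ** (q + 1) / (q + 1)
dphi = lambda t: t ** q

def power_closed_form(x, y):
    """Closed form of Example 3.16."""
    p = lambda t: t ** (q + 1)
    return (tr_soc(p, y) / (q + 1)
            + q * tr_soc(p, x) / (q + 1)
            - 2.0 * dot(soc_apply(dphi, x), y))

x = [2.0, 0.5, -0.3]
y = [1.5, 0.2, 0.7]
print(bregman_soc(phi, dphi, x, y), power_closed_form(x, y))  # the two values agree
```

The agreement rests on $\operatorname{tr}(x^{q} \circ x) = \operatorname{tr}(x^{q+1})$, which holds because $x^q$ and $x$ share the same spectral vectors.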

Note that the proximal distances in Example 3.16 and Example 3.17 belong to the class $F_2(K^n)$. By Proposition 3.22 below, they also belong to the class $\widehat{F}_2(K^n)$.

Proposition 3.22. Let $H$ be defined as in (3.84) with $\phi$ satisfying (D1)-(D4). Suppose that $\operatorname{dom}\phi = \operatorname{dom}(\phi') = [0, \infty)$. Then, $H$ possesses the properties (P7') and (P8').

Proof. By the given assumption, $C_1 = C_2 = K^n$. From Proposition 3.21(b), the function $H(\cdot, y)$ is continuous on $K^n$. Consequently, $\lim_{k\to\infty} H(y^k, y) = H(y, y) = 0$ whenever $y^k \to y$.

From Proposition 3.21(d), $H(y^k, y) \ge d(\lambda_i(y), \lambda_i(y^k)) \ge 0$ for $i = 1, 2$. This together with the assumption $H(y^k, y) \to 0$ implies $d(\lambda_i(y), \lambda_i(y^k)) \to 0$ for $i = 1, 2$. From this, we necessarily have $\lambda_i(y^k) \to \lambda_i(y)$ for $i = 1, 2$. Suppose not; then the bounded sequence $\{\lambda_i(y^k)\}$ must have another limit point $\nu_i \ge 0$ such that $\nu_i \neq \lambda_i(y)$. Without loss of generality, we assume that $\lim_{k \in K,\, k\to\infty} \lambda_i(y^k) = \nu_i$. Then, we have

$$d(\nu_i, \lambda_i(y)) = \lim_{k\to\infty} d(\nu_i, \lambda_i(y^k)) = \lim_{k \in K,\, k\to\infty} d(\nu_i, \lambda_i(y^k)) = d(\nu_i, \nu_i) = 0,$$

where the first equality is due to the continuity of $d(s, \cdot)$ for any fixed $s \in [0, \infty)$, and the second one is by the convergence of $\{d(\nu_i, \lambda_i(y^k))\}$ implied by the first equality. This contradicts the fact that $d(\nu_i, \lambda_i(y)) > 0$ since $\nu_i \neq \lambda_i(y)$. □

As illustrated by the following example, the proximal distance generated by (3.84) with $\phi$ satisfying (D1)-(D4) generally does not belong to the class $\bar{F}_2(K^n)$.

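Example 3.18. Let $H$ be the proximal distance as in Example 3.15.

Solution. Let

$$y^k = \begin{pmatrix} \sqrt{2} \\ (-1)^k \frac{k}{k+1} \\ (-1)^k \frac{k}{k+1} \end{pmatrix} \ \text{for each } k \quad \text{and} \quad y = \begin{pmatrix} \sqrt{2} \\ 1 \\ 1 \end{pmatrix}.$$

It is not hard to check that the sequence $\{y^k\} \subseteq \operatorname{int}(K^3)$ satisfies $H(y^k, y) \to 0$. Clearly, $y^k \not\to y$ as $k \to \infty$, but $\lambda_1(y^k) \to \lambda_1(y) = 0$ and $\lambda_2(y^k) \to \lambda_2(y) = 2\sqrt{2}$.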
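The spectral-value claims in Example 3.18 are easy to check numerically. The sketch below (helper names are hypothetical) only evaluates $\lambda_1, \lambda_2$ and the Euclidean distance $\|y^k - y\|$ for a few indices; it does not attempt to verify the statement $H(y^k, y) \to 0$.

```python
import math

def spectral_values(x):
    """lam1 = x1 - ||x2||, lam2 = x1 + ||x2|| for x = (x1, x2)."""
    norm = math.sqrt(sum(t * t for t in x[1:]))
    return x[0] - norm, x[0] + norm

def y_k(k):
    """The k-th term of the sequence in Example 3.18."""
    s = (-1) ** k * k / (k + 1)
    return [math.sqrt(2.0), s, s]

def distance(a, b):
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(a, b)))

y = [math.sqrt(2.0), 1.0, 1.0]
print("limit point:", spectral_values(y))
for k in (10, 11, 1000, 1001):
    lam1, lam2 = spectral_values(y_k(k))
    # The spectral values approach (0, 2*sqrt(2)) for every k, yet for odd k
    # the point y^k stays far from y because the sign of the tail flips.
    print(k, lam1, lam2, distance(y_k(k), y))
```

For odd $k$ the points $y^k$ approach $(\sqrt{2}, -1, -1)^T$, which has the same spectral values as $y$ but a different spectral frame.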

Finally, let $H_1$ be a proximal distance produced via one of the ways above, and define

$$H_\alpha(x, y) := H_1(x, y) + \frac{\alpha}{2}\|x - y\|^2, \tag{3.85}$$

where $\alpha > 0$ is a fixed parameter. Then, by Propositions 3.18, 3.19 and 3.21 and the identity

$$\|z - x\|^2 = \|z - y\|^2 + \|y - x\|^2 + 2\langle z - y, y - x\rangle, \quad \forall x, y, z \in \mathbb{R}^n,$$

it is easily shown that $H_\alpha$ is also a proximal distance w.r.t. $\operatorname{int}(K^n)$. Particularly, when $H_1$ is given by (3.84) with $\phi$ satisfying (D1)-(D4) and $\operatorname{dom}\phi = \operatorname{dom}(\phi') = [0, \infty)$ (for example, the distances in Examples 3.16 and 3.17), the regularized proximal distance $H_\alpha$ satisfies (P7') and (P9'), and hence $H_\alpha \in \bar{F}_2(K^n)$. With such a regularized proximal distance, the sequence generated by the IPA converges to an optimal solution of (3.64) if $X \neq \emptyset$.
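For completeness, the three-point identity used above follows from a one-line expansion; the rearranged second form is only a convenient way to read off how the quadratic term $\frac{\alpha}{2}\|\cdot\|^2$ behaves, and is not taken from the source:

$$\|z - x\|^2 = \|(z - y) + (y - x)\|^2 = \|z - y\|^2 + 2\langle z - y, y - x\rangle + \|y - x\|^2,$$

$$\frac{\alpha}{2}\|z - x\|^2 - \frac{\alpha}{2}\|z - y\|^2 = \frac{\alpha}{2}\|y - x\|^2 + \alpha\langle z - y, y - x\rangle.$$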

To sum up, we may construct a proximal distance w.r.t. the cone $\operatorname{int}(K^n)$ via three ways with an appropriate univariate function. The first way in (3.78) can only produce a proximal distance belonging to $F_1(\operatorname{int}(K^n))$, the second way in (3.82) produces a proximal distance of $F_1(K^n)$ if $\operatorname{dom}\phi = [0, \infty)$, whereas the third way in (3.84) produces a proximal distance of the class $\widehat{F}_2(K^n)$ if $\operatorname{dom}\phi = \operatorname{dom}(\phi') = [0, \infty)$. Particularly, the regularized proximal distances $H_\alpha$ in (3.85) with $H_1$ given by (3.84) with $\operatorname{dom}\phi = \operatorname{dom}(\phi') = [0, \infty)$ belong to the smallest class $\bar{F}_2(K^n)$. With such regularized proximal distances, we have the convergence result of Proposition 3.17(c) for the general convex SOCP with $X \neq \emptyset$.

For the linear SOCP, we will obtain some improved convergence results for the IPA by exploring the relations between the sequence generated by the IPA and the central path associated to the corresponding proximal distances.

Given a l.s.c. proper strictly convex function $\Phi$ with $\operatorname{dom}(\Phi) \subseteq K^n$ and $\operatorname{int}(\operatorname{dom}\Phi) = \operatorname{int}(K^n)$, the central path of (3.64) associated to $\Phi$ is the set $\{x(\tau) \mid \tau > 0\}$ defined by

$$x(\tau) := \operatorname{argmin}\big\{\tau f(x) + \Phi(x) \mid x \in V \cap K^n\big\} \quad \text{for } \tau > 0. \tag{3.86}$$

In what follows, we will focus on the central path of (3.64) w.r.t. a distance-like function $H \in D(\operatorname{int}(K^n))$. From Proposition 3.14, we immediately have the following result.

Proposition 3.23. For any given $H \in D(\operatorname{int}(K^n))$ and $\bar{x} \in \operatorname{int}(K^n)$, the central path $\{x(\tau) \mid \tau > 0\}$ associated to $H(\cdot, \bar{x})$ is well defined and is in $V \cap \operatorname{int}(K^n)$. For each $\tau > 0$, there exists $g_\tau \in \partial f(x(\tau))$ such that $\tau g_\tau + \nabla_1 H(x(\tau), \bar{x}) = A^T y(\tau)$ for some $y(\tau) \in \mathbb{R}^m$.

We next study the favorable properties of the central path associated to $H \in D(\operatorname{int}(K^n))$.

Proposition 3.24. For any given $H \in D(\operatorname{int}(K^n))$ and $\bar{x} \in \operatorname{int}(K^n)$, let $\{x(\tau) \mid \tau > 0\}$ be the central path associated to $H(\cdot, \bar{x})$. Then, the following results hold.

(a) The function $H(x(\tau), \bar{x})$ is nondecreasing in $\tau$.

(b) The set $\{x(\tau) \mid \hat{\tau} \le \tau \le \tilde{\tau}\}$ is bounded for any given $0 < \hat{\tau} < \tilde{\tau}$.

(c) $x(\tau)$ is continuous at any $\tau > 0$.

(d) The set $\{x(\tau) \mid \tau \ge \bar{\tau}\}$ is bounded for any $\bar{\tau} > 0$ if $X \neq \emptyset$ and $\operatorname{dom}H(\cdot, \bar{x}) = K^n$.

(e) All cluster points of $\{x(\tau) \mid \tau > 0\}$ are solutions of (3.64) if $X \neq \emptyset$.

Proof. The proofs are similar to those of Propositions 3–5 of .

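(a) Take $\tau_1, \tau_2 > 0$ and let $x_i = x(\tau_i)$ for $i = 1, 2$. Then, from Proposition 3.23, we know $x_1, x_2 \in V \cap \operatorname{int}(K^n)$ and there exist $g_1 \in \partial f(x_1)$ and $g_2 \in \partial f(x_2)$ such that

$$\nabla_1 H(x_1, \bar{x}) = -\tau_1 g_1 + A^T y_1 \quad \text{and} \quad \nabla_1 H(x_2, \bar{x}) = -\tau_2 g_2 + A^T y_2 \tag{3.87}$$

for some $y_1, y_2 \in \mathbb{R}^m$. This together with the convexity of $H(\cdot, \bar{x})$ yields that

$$\begin{aligned} \tau_1^{-1}\big[H(x_1, \bar{x}) - H(x_2, \bar{x})\big] &\le \tau_1^{-1}\langle \nabla_1 H(x_1, \bar{x}), x_1 - x_2\rangle = \langle g_1, x_2 - x_1\rangle, \\ \tau_2^{-1}\big[H(x_2, \bar{x}) - H(x_1, \bar{x})\big] &\le \tau_2^{-1}\langle \nabla_1 H(x_2, \bar{x}), x_2 - x_1\rangle = \langle g_2, x_1 - x_2\rangle. \end{aligned} \tag{3.88}$$

Adding the two inequalities and using the convexity of $f$, we obtain

$$\big(\tau_1^{-1} - \tau_2^{-1}\big)\big[H(x_1, \bar{x}) - H(x_2, \bar{x})\big] \le \langle g_1 - g_2, x_2 - x_1\rangle \le 0.$$

Thus, $H(x_1, \bar{x}) \le H(x_2, \bar{x})$ whenever $\tau_1 \le \tau_2$. Particularly, from the last two equations,

$$\begin{aligned} 0 &\le \tau_1^{-1}\big[H(x_1, \bar{x}) - H(x_2, \bar{x})\big] \\ &\le \tau_1^{-1}\langle \nabla_1 H(x_1, \bar{x}), x_1 - x_2\rangle \\ &\le \langle g_2, x_2 - x_1\rangle \\ &\le \tau_2^{-1}\big[H(x_1, \bar{x}) - H(x_2, \bar{x})\big], \quad \forall \tau_1 \ge \tau_2 > 0. \end{aligned} \tag{3.89}$$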

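(b) By part (a), $H(x(\tau), \bar{x}) \le H(x(\tilde{\tau}), \bar{x})$ for any $\tau \le \tilde{\tau}$, which implies that

$$\{x(\tau) : \tau \le \tilde{\tau}\} \subseteq L_1 := \{x \in \operatorname{int}(K^n) \mid H(x, \bar{x}) \le H(x(\tilde{\tau}), \bar{x})\}.$$

Noting that $\{x(\tau) : \hat{\tau} \le \tau \le \tilde{\tau}\} \subseteq \{x(\tau) : \tau \le \tilde{\tau}\} \subseteq L_1$, the desired result follows by (P4).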

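(c) Fix $\bar{\tau} > 0$. To prove that $x(\tau)$ is continuous at $\bar{\tau}$, it suffices to prove that $\lim_{k\to\infty} x(\tau_k) = x(\bar{\tau})$ for any sequence $\{\tau_k\}$ such that $\lim_{k\to\infty} \tau_k = \bar{\tau}$. Given such a sequence $\{\tau_k\}$, take $\hat{\tau}, \tilde{\tau}$ such that $0 < \hat{\tau} < \bar{\tau} < \tilde{\tau}$. Then, $\{x(\tau) : \hat{\tau} \le \tau \le \tilde{\tau}\}$ is bounded by part (b), and $\tau_k \in (\hat{\tau}, \tilde{\tau})$ for sufficiently large $k$. Consequently, the sequence $\{x(\tau_k)\}$ is bounded. Let $\bar{y}$ be a cluster point of $\{x(\tau_k)\}$, and without loss of generality assume that $\lim_{k\to\infty} x(\tau_k) = \bar{y}$.

Let $K_1 := \{k : \tau_k \le \bar{\tau}\}$ and take $k \in K_1$. Then, from (3.89) with $\tau_1 = \bar{\tau}$ and $\tau_2 = \tau_k$,

$$\begin{aligned} 0 &\le \bar{\tau}^{-1}\big[H(x(\bar{\tau}), \bar{x}) - H(x(\tau_k), \bar{x})\big] \\ &\le \bar{\tau}^{-1}\langle \nabla_1 H(x(\bar{\tau}), \bar{x}), x(\bar{\tau}) - x(\tau_k)\rangle \\ &\le \tau_k^{-1}\big[H(x(\bar{\tau}), \bar{x}) - H(x(\tau_k), \bar{x})\big]. \end{aligned}$$

If $K_1$ is infinite, taking the limit $k \to \infty$ with $k \in K_1$ in the last inequality and using the continuity of $H(\cdot, \bar{x})$ on $\operatorname{int}(K^n)$ yields that

$$H(x(\bar{\tau}), \bar{x}) - H(\bar{y}, \bar{x}) = \langle \nabla_1 H(x(\bar{\tau}), \bar{x}), x(\bar{\tau}) - \bar{y}\rangle.$$

This together with the strict convexity of $H(\cdot, \bar{x})$ implies $x(\bar{\tau}) = \bar{y}$. If $K_1$ is finite, then $K_2 := \{k : \tau_k \ge \bar{\tau}\}$ must be infinite. Using the same arguments, we also have $x(\bar{\tau}) = \bar{y}$.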

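(d) By (P3) and Proposition 3.23, there exists $g_\tau \in \partial f(x(\tau))$ such that for any $z \in V \cap K^n$,

$$\tau^{-1}\big[H(x(\tau), \bar{x}) - H(z, \bar{x})\big] \le \tau^{-1}\langle \nabla_1 H(x(\tau), \bar{x}), x(\tau) - z\rangle = \langle g_\tau, z - x(\tau)\rangle. \tag{3.90}$$

In particular, taking $z = x^* \in X$ in the last relation and using the fact

$$0 \ge f(x^*) - f(x(\tau)) \ge \langle g_\tau, x^* - x(\tau)\rangle,$$

we have $H(x(\tau), \bar{x}) - H(x^*, \bar{x}) \le 0$. Hence, $\{x(\tau) \mid \tau > \bar{\tau}\} \subset \{x \in \operatorname{int}(K^n) \mid H(x, \bar{x}) \le H(x^*, \bar{x})\}$. By (P4), the latter is bounded, and the desired result then follows.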

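(e) Let $\hat{x}$ be a cluster point of $\{x(\tau)\}$ and $\{\tau_k\}$ be a sequence such that $\lim_{k\to\infty} \tau_k = \infty$ and $\lim_{k\to\infty} x(\tau_k) = \hat{x}$. Write $x^k := x(\tau_k)$ and take $x^* \in X$ and $z \in V \cap \operatorname{int}(K^n)$. Then, for any $\varepsilon \in (0, 1)$, we have $x(\varepsilon) := (1 - \varepsilon)x^* + \varepsilon z \in V \cap \operatorname{int}(K^n)$. From the property (P3),

$$\langle \nabla_1 H(x(\varepsilon), \bar{x}) - \nabla_1 H(x^k, \bar{x}), x^k - x(\varepsilon)\rangle \le 0.$$

On the other hand, taking $z = x(\varepsilon)$ in (3.90), we readily have $\tau_k^{-1}\langle \nabla_1 H(x^k, \bar{x}), x^k - x(\varepsilon)\rangle = \langle g^k, x(\varepsilon) - x^k\rangle$ with $g^k \in \partial f(x^k)$. Combining the last two relations, we obtain

$$\tau_k^{-1}\langle \nabla_1 H(x(\varepsilon), \bar{x}), x^k - x(\varepsilon)\rangle \le \langle g^k, x(\varepsilon) - x^k\rangle.$$

Since the subdifferential set $\partial f(x^k)$ for each $k$ is compact and $g^k \in \partial f(x^k)$, the sequence $\{g^k\}$ is bounded. Taking the limit in the last inequality yields $0 \le \langle \hat{g}, x(\varepsilon) - \hat{x}\rangle$, where $\hat{g}$ is a limit point of $\{g^k\}$, and by [131, Theorem 24.4], $\hat{g} \in \partial f(\hat{x})$. Taking the limit $\varepsilon \to 0$ in the inequality, we get $0 \le \langle \hat{g}, x^* - \hat{x}\rangle$. This implies that $f(\hat{x}) \le f(x^*)$ since $x^* \in X$ and $\hat{g} \in \partial f(\hat{x})$. Consequently, $\hat{x}$ is a solution of the CSOCP (3.64). □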

Particularly, from the following proposition, we also have that the central path is convergent if $H \in D(\operatorname{int}(K^n))$ satisfies $\operatorname{dom}H(\cdot, \bar{x}) = K^n$, where $\bar{x} \in \operatorname{int}(K^n)$ is a given point. Notice that $H(\cdot, \bar{x})$ is continuous on $\operatorname{dom}H(\cdot, \bar{x})$ by (P2), and hence the assumption for $H$ is equivalent to saying that $H(\cdot, \bar{x})$ is continuous at the boundary of the cone $K^n$.

Proposition 3.25. For any given $\bar{x} \in \operatorname{int}(K^n)$ and $H \in D(\operatorname{int}(K^n))$ with $\operatorname{dom}H(\cdot, \bar{x}) = K^n$, let $\{x(\tau) : \tau > 0\}$ be the central path associated to $H(\cdot, \bar{x})$. If $X$ is nonempty, then $\lim_{\tau\to\infty} x(\tau)$ exists and is the unique solution of $\min\{H(x, \bar{x}) \mid x \in X\}$.

Proof. Let $\hat{x}$ be a cluster point of $\{x(\tau)\}$ and $\{\tau_k\}$ be such that $\lim_{k\to\infty} \tau_k = \infty$ and $\lim_{k\to\infty} x(\tau_k) = \hat{x}$. Then, for any $x \in X$, using (3.90) with $\tau = \tau_k$ and $z = x$, we obtain

$$H(x(\tau_k), \bar{x}) - H(x, \bar{x}) \le \tau_k\langle g^k, x - x(\tau_k)\rangle \le \tau_k\big[f(x) - f(x(\tau_k))\big] \le 0,$$

where the second inequality holds since $g^k \in \partial f(x(\tau_k))$, and the last one is due to $x \in X$. Taking the limit $k \to \infty$ in the last inequality and using the continuity of $H(\cdot, \bar{x})$, we have $H(\hat{x}, \bar{x}) \le H(x, \bar{x})$ for all $x \in X$. Since $\hat{x} \in X$ by Proposition 3.24(e), this shows that any cluster point of $\{x(\tau) \mid \tau > 0\}$ is a solution of $\min\{H(x, \bar{x}) \mid x \in X\}$. By the uniqueness of the solution of $\min\{H(x, \bar{x}) \mid x \in X\}$, we have $\lim_{\tau\to\infty} x(\tau) = x^*$, where $x^*$ denotes this unique solution. □

For the linear SOCP, we may establish the relations between the sequence generated by the IPA and the central path associated to the corresponding distance-like functions.

Proposition 3.26. For the linear SOCP, let $\{x^k\}$ be the sequence generated by the IPA with $H \in D(\operatorname{int}(K^n))$, $x^0 \in V \cap \operatorname{int}(K^n)$ and $\varepsilon_k \equiv 0$, and $\{x(\tau) \mid \tau > 0\}$ be the central path associated to $H(\cdot, x^0)$. Then, $x^k = x(\tau_k)$ for $k = 1, 2, \ldots$ under either of the conditions:

(a) $H$ is constructed via (3.78) or (3.82), and $\{\tau_k\}$ is given by $\tau_k = \sum_{j=0}^{k} \lambda_j$ for $k = 1, 2, \ldots$;

(b) $H$ is constructed via (3.84), the mapping $\nabla(\phi')^{\mathrm{soc}}(\cdot)$ defined on $\operatorname{int}(K^n)$ maps any vector in $\mathbb{R}^n$ into $\operatorname{Im}A^T$, and the sequence $\{\tau_k\}$ is given by $\tau_k = \lambda_k$ for $k = 1, 2, \ldots$.

Moreover, for any positive increasing sequence $\{\tau_k\}$, there exists a positive sequence $\{\lambda_k\}$ with $\sum_{k=1}^{\infty} \lambda_k = \infty$ such that the proximal sequence $\{x^k\}$ satisfies $x^k = x(\tau_k)$.

Proof. (a) Suppose that $H$ is constructed via (3.78). From (3.67) and Proposition 3.18(b), we have

$$\lambda_j c + \nabla\phi(\det(x^j)) - \nabla\phi(\det(x^{j-1})) = A^T u^j \quad \text{for } j = 0, 1, 2, \ldots. \tag{3.91}$$

Summing the equality from $j = 0$ to $k$ and taking $\tau_k = \sum_{j=0}^{k} \lambda_j$, $y^k = \sum_{j=0}^{k} u^j$, we get

$$\tau_k c + \nabla\phi(\det(x^k)) - \nabla\phi(\det(x^0)) = A^T y^k.$$

This means that $x^k$ satisfies the optimality conditions of the problem

$$\min\big\{\tau_k f(x) + H(x, x^0) \mid x \in V \cap \operatorname{int}(K^n)\big\}, \tag{3.92}$$

and so $x^k = x(\tau_k)$. Now let $\{x(\tau) : \tau > 0\}$ be the central path. Take a positive increasing sequence $\{\tau_k\}$ and let $x^k \equiv x(\tau_k)$. Then from Proposition 3.23 and Proposition 3.18(b), it follows that

$$\tau_k c + \nabla\phi(\det(x^k)) - \nabla\phi(\det(x^0)) = A^T y^k \quad \text{for some } y^k \in \mathbb{R}^m.$$

Setting $\lambda_k = \tau_k - \tau_{k-1}$ and $u^k = y^k - y^{k-1}$, from the last equality it follows that

$$\lambda_k c + \nabla\phi(\det(x^k)) - \nabla\phi(\det(x^{k-1})) = A^T u^k.$$

This shows that $\{x^k\}$ is the sequence generated by the IPA with $\varepsilon_k \equiv 0$. If $H$ is given by (3.82), the same arguments together with Proposition 3.19(b) show that the result also holds.
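The telescoping argument in part (a) can be seen in a one-dimensional analogue. The sketch below is an illustration under explicit assumptions, not the SOC construction (3.78): it takes $f(x) = cx$ on $x > 0$ with no equality constraint and uses the classical scalar Bregman distance $D(x, y) = x\ln(x/y) - x + y$, whose stationarity condition $\lambda c + \ln x - \ln x^{k-1} = 0$ has the same gradient-difference structure as (3.91). Function names are hypothetical, and the steps are indexed from $j = 1$ (so $\tau_k = \sum_{j=1}^{k}\lambda_j$), since $x^0$ is the starting point.

```python
import math

def proximal_steps(c, x0, lambdas):
    """Exact proximal steps for min c*x over x > 0 with D(x, y) = x*ln(x/y) - x + y.

    Each step solves lam*c + ln(x) - ln(x_prev) = 0, i.e. x = x_prev * exp(-lam*c).
    """
    xs = [x0]
    for lam in lambdas:
        xs.append(xs[-1] * math.exp(-lam * c))
    return xs

def central_path(c, x0, tau):
    """Minimizer of tau*c*x + D(x, x0): solves tau*c + ln(x) - ln(x0) = 0."""
    return x0 * math.exp(-tau * c)

c, x0 = 0.7, 2.0
lambdas = [0.5, 1.0, 0.25, 2.0]
xs = proximal_steps(c, x0, lambdas)

# Cumulative parameters tau_k = lambda_1 + ... + lambda_k.
taus = [sum(lambdas[:k]) for k in range(1, len(lambdas) + 1)]
for k, tau in enumerate(taus, start=1):
    print(k, xs[k], central_path(c, x0, tau))  # iterate and central-path point coincide

# Conversely, an increasing sequence of taus determines the step sizes.
recovered = [taus[0]] + [taus[k] - taus[k - 1] for k in range(1, len(taus))]
```

Summing the stationarity conditions telescopes the logarithms, which is why the $k$-th iterate coincides with the central-path point at the cumulative parameter; the differences $\tau_k - \tau_{k-1}$ recover the step sizes, as in the second half of the proof.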

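(b) Under this case, by Proposition 3.21(c), the above (3.91) becomes

$$\lambda_j c + \nabla(\phi')^{\mathrm{soc}}(x^j)(x^j - x^{j-1}) = A^T u^j \quad \text{for } j = 0, 1, 2, \ldots.$$

Since $\phi''(t) > 0$ for all $t \in (0, \infty)$ by (D1) and (D2), from [63, Proposition 5.2] it follows that $\nabla(\phi')^{\mathrm{soc}}(x)$ is positive definite on $\operatorname{int}(K^n)$. Thus, the last equality is equivalent to

$$\big[\nabla(\phi')^{\mathrm{soc}}(x^j)\big]^{-1}\lambda_j c + (x^j - x^{j-1}) = \big[\nabla(\phi')^{\mathrm{soc}}(x^j)\big]^{-1}A^T u^j \quad \text{for } j = 0, 1, 2, \ldots. \tag{3.93}$$

Summing the equality (3.93) from $j = 0$ to $k$ and making suitable arrangement, we get

$$\lambda_k c + \nabla(\phi')^{\mathrm{soc}}(x^k)(x^k - x^0) = A^T u^k + \nabla(\phi')^{\mathrm{soc}}(x^k)\sum_{j=0}^{k-1}\big[\nabla(\phi')^{\mathrm{soc}}(x^j)\big]^{-1}(A^T u^j - \lambda_j c),$$

which, using the given assumptions and setting $\tau_k = \lambda_k$, reduces to

$$\tau_k c + \nabla(\phi')^{\mathrm{soc}}(x^k)(x^k - x^0) = A^T \bar{y}^k \quad \text{for some } \bar{y}^k \in \mathbb{R}^m.$$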

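This means that $x^k$ is the unique solution of (3.92), and hence $x^k = x(\tau_k)$ for any $k$.

Let $\{x(\tau) : \tau > 0\}$ be the central path. Take a positive increasing sequence $\{\tau_k\}$ and define the sequence $x^k = x(\tau_k)$. Then, from Proposition 3.23 and Proposition 3.21(c),

$$\tau_k c + \nabla(\phi')^{\mathrm{soc}}(x^k)(x^k - x^0) = A^T y^k \quad \text{for some } y^k \in \mathbb{R}^m,$$

which, by the positive definiteness of $\nabla(\phi')^{\mathrm{soc}}(\cdot)$ on $\operatorname{int}(K^n)$, implies (subtracting the relation for $k-1$ from the one for $k$) that

$$\big[\nabla(\phi')^{\mathrm{soc}}(x^k)\big]^{-1}(\tau_k c - A^T y^k) - \big[\nabla(\phi')^{\mathrm{soc}}(x^{k-1})\big]^{-1}(\tau_{k-1} c - A^T y^{k-1}) + (x^k - x^{k-1}) = 0.$$

Consequently,

$$\tau_k c + \nabla(\phi')^{\mathrm{soc}}(x^k)(x^k - x^{k-1}) = A^T y^k + \nabla(\phi')^{\mathrm{soc}}(x^k)\big[\nabla(\phi')^{\mathrm{soc}}(x^{k-1})\big]^{-1}(\tau_{k-1} c - A^T y^{k-1}).$$

Using the given assumptions and setting $\lambda_k = \tau_k$, we have

$$\lambda_k c + \nabla(\phi')^{\mathrm{soc}}(x^k)(x^k - x^{k-1}) = A^T u^k \quad \text{for some } u^k \in \mathbb{R}^m.$$

This implies that $\{x^k\}$ is the sequence generated by the IPA, and the sequence $\{\lambda_k\}$ satisfies $\sum_{k=1}^{\infty} \lambda_k = \infty$ since $\{\tau_k\}$ is a positive increasing sequence. □

From Proposition 3.25 and Proposition 3.26, we readily have the following improved convergence results of the sequence generated by the IPA for the linear SOCP.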

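Proposition 3.27. For the linear SOCP, let $\{x^k\}$ be the sequence generated by the IPA with $H \in D(\operatorname{int}(K^n))$, $x^0 \in V \cap \operatorname{int}(K^n)$ and $\varepsilon_k \equiv 0$. If one of the conditions is satisfied:

(a) $H$ is constructed via (3.82) with $\operatorname{dom}H(\cdot, x^0) = K^n$ and $\sum_{k=0}^{\infty} \lambda_k = \infty$;

(b) $H$ is constructed via (3.84) with $\operatorname{dom}H(\cdot, x^0) = K^n$, the mapping $\nabla(\phi')^{\mathrm{soc}}(\cdot)$ defined on $\operatorname{int}(K^n)$ maps any vector in $\mathbb{R}^n$ into $\operatorname{Im}A^T$, and $\lim_{k\to\infty} \lambda_k = \infty$;

and $X \neq \emptyset$, then $\{x^k\}$ converges to the unique solution of $\min\{H(x, x^0) \mid x \in X\}$.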

## SOC means and SOC inequalities

In this chapter, we present some other types of applications of the aforementioned SOC-functions, SOC-convexity, and SOC-monotonicity. These include so-called SOC means, SOC weighted means, and a few SOC trace versions of the Young, Hölder, and Minkowski inequalities, and of the Powers-Størmer inequality. We believe that these results will be helpful in the convergence analysis of optimization problems involving SOC. Much of the material in this chapter is extracted from [36, 77, 78]; readers can consult these references for more details.
