
to appear in Journal of Global Optimization, 2011

A continuation approach for the capacitated multi-facility Weber problem based on nonlinear SOCP reformulation

Jein-Shan Chen 1 Department of Mathematics National Taiwan Normal University

Taipei 11677, Taiwan

Shaohua Pan2

School of Mathematical Sciences South China University of Technology

Guangzhou 510640, China

Chun-Hsu Ko 3

Department of Electrical Engineering I-Shou University

Kaohsiung 840, Taiwan

September 4, 2009

(revised on July 21, 2010, December 2, 2010)

Abstract. We propose a primal-dual continuation approach for the capacitated multi-facility Weber problem (CMFWP) based on its nonlinear second-order cone program (SOCP) reformulation. The main idea of the approach is to reformulate the CMFWP as a nonlinear SOCP with a nonconvex objective function, and then introduce a logarithmic barrier term and a quadratic proximal term into the objective to construct a sequence of convexified subproblems. In this way, this class of nondifferentiable and nonconvex optimization problems is converted into the solution of a sequence of nonlinear convex SOCPs. In this paper, we employ the semismooth Newton method proposed in [17] to solve the KKT system of the resulting convex SOCPs. Preliminary numerical results are reported for eighteen test

1Member of Mathematics Division, National Center for Theoretical Sciences, Taipei Office. The author’s work is partially supported by National Science Council of Taiwan. E-mail: jschen@math.ntnu.edu.tw, FAX: 886-2-29332342.

2The author’s work is supported by the Fundamental Research Funds for the Central Universities (SCUT), Guangdong Natural Science Foundation (No. 9251802902000001) and National Young Natural Science Foundation (No. 10901058). E-mail: shhpan@scut.edu.cn

3E-mail: chko@isu.edu.tw


instances, which indicate that the continuation approach is promising for finding a satisfactory suboptimal solution, and even a global optimal solution for some test problems.

Key words: Capacitated multi-facility Weber problem, nondifferentiable, nonconvex, second-order cone program, semismooth Newton method.

1 Introduction

There are many different kinds of facility location problems for which various methods have been proposed; see [10, 15, 21, 20, 22, 23, 26, 32, 35, 36] and references therein. The web-site maintained by EWGLA (Euro Working Group on Location Analysis) also serves as a good resource for related literature, including survey papers, books, and journals. In this paper, we consider the newer but more difficult capacitated multi-facility Weber problem (CMFWP), which plays an important role in operations and management science.

The CMFWP, also called the capacitated Euclidean distance location-allocation problem in other contexts, is concerned with locating a set of facilities and allocating their capacity to satisfy the demand of a set of customers with known locations so that the total transportation cost is minimized. Supply centers such as plants and warehouses may constitute the facilities, while retailers and dealers may be considered as customers. The mathematical model of the CMFWP can be stated as follows:

\[
\begin{array}{rl}
\min & \displaystyle\sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij} w_{ij} \|x_i - a_j\| \\[1ex]
\mathrm{s.t.} & \displaystyle\sum_{j=1}^{n} w_{ij} = s_i, \quad i = 1, 2, \ldots, m \\[1ex]
 & \displaystyle\sum_{i=1}^{m} w_{ij} = d_j, \quad j = 1, 2, \ldots, n \\[1ex]
 & w_{ij} \ge 0, \quad i = 1, 2, \ldots, m; \; j = 1, 2, \ldots, n \\[1ex]
 & x_i \in \mathbb{R}^2, \quad i = 1, 2, \ldots, m.
\end{array}
\tag{1}
\]

In the formulation of the CMFWP, m is the number of facilities to be located, n is the number of customers, s_i is the capacity of facility i, and d_j is the demand of customer j. Throughout this paper, we assume without loss of generality that the total capacity of all facilities equals the total demand of all customers, i.e.,

\[
\sum_{i=1}^{m} s_i = \sum_{j=1}^{n} d_j.
\tag{2}
\]

In addition, the allocations w_ij are unknown variables denoting the amount to be shipped from facility i to customer j, with c_ij being the unit shipment cost per unit distance. If all w_ij are known, the CMFWP reduces to the traditional convex multi-facility location problem, for which many efficient algorithms have been proposed (see [10, 13, 22, 33, 34, 26]), whereas if all x_i are fixed, it reduces to the ordinary transportation problem. To fix notation, in the sequel we denote by x_i = (x_i1, x_i2) the unknown coordinates of facility i, and by a_j = (a_j1, a_j2) the given coordinates of customer j.

We see that the objective function of (1) is nondifferentiable at any point where x_i = a_j for some i ∈ {1, 2, . . . , m} and j ∈ {1, 2, . . . , n}, which precludes a direct application of effective gradient-based methods for finding its solution. It is also nonconvex since the w_ij are unknown decision variables. Therefore, the CMFWP belongs to a class of nondifferentiable nonconvex optimization problems subject to (m + n) linear constraints and mn nonnegativity constraints. In fact, Sherali and Nordai [30] have shown that this class of problems is NP-hard, even if all demand points a_j are located on a straight line.
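Although no algorithm has been introduced at this point, the objective of (1) is straightforward to evaluate numerically for a given placement and allocation. The sketch below is our own illustration with made-up toy data; the helper name `cmfwp_objective` is not from the paper.

```python
import numpy as np

def cmfwp_objective(c, w, x, a):
    """Total cost sum_i sum_j c_ij * w_ij * ||x_i - a_j|| of the CMFWP (1)."""
    # x: (m, 2) facility coordinates, a: (n, 2) customer coordinates
    dist = np.linalg.norm(x[:, None, :] - a[None, :, :], axis=2)  # (m, n)
    return float(np.sum(c * w * dist))

# Toy instance: m = 1 facility at the origin, n = 2 customers.
c = np.array([[1.0, 2.0]])
w = np.array([[3.0, 4.0]])          # feasible if s_1 = 7 and d = (3, 4)
x = np.array([[0.0, 0.0]])
a = np.array([[1.0, 0.0], [0.0, 1.0]])
print(cmfwp_objective(c, w, x, a))  # 11.0, i.e. 1*3*1 + 2*4*1
```

Note that the cost is linear in w for fixed x (the transportation subproblem) and convex in x for fixed w (the multi-facility location subproblem), but not jointly convex.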

For this class of problems, Cooper in his seminal work [7] first proposed an exact solution method by explicit enumeration of all extreme points of the transportation polytope defined by the first three groups of constraints of (1). Selim [29] later presented a biconvex cutting plane procedure in his unpublished dissertation. This exact solution method, similar to Cooper's complete enumeration, can deal effectively with very small instances only.

Recently, Sherali and Nordai [28] developed a branch-and-bound algorithm which is based on a partitioning of the allocation space and finitely converges to a global optimum within a specified percentage tolerance. Apart from these exact methods, some heuristic methods have also been proposed, based on the reformulation-linearization technique [27] or an approximating mixed-integer linear programming formulation [1]. We observe that most of these methods are combinatorial primal ones, designed by exploiting the structure of the problem itself, and do not provide any information about the dual solution.

In this paper, we propose a continuous primal-dual approach by converting the CMFWP into the solution of a sequence of nonlinear convex second-order cone programs (SOCPs).

Specifically, we first reformulate the CMFWP as a nonlinear SOCP with a nonconvex cost function, and then introduce a logarithmic barrier term and a quadratic proximal term into the objective to circumvent the nonconvexity. In particular, the strict convexity of the logarithmic function and the strong convexity of the quadratic proximal term are fully exploited to convexify the objective function of the resulting SOCP. Such a technique is not new and is often used in the literature; see [4] for example. The SOCP reformulation has recently attracted much attention for engineering and operations research problems. However, to the best of our knowledge, these problems are all formed into linear SOCPs, for which software based on interior point methods [18, 31] can be applied. In contrast, the nonlinear SOCP reformulation has seldom been used, since the aforementioned software packages are only able to solve linear SOCPs. This paper is concerned with the application of the nonlinear SOCP reformulation to the CMFWP, and its main purpose is to propose an alternative continuous approximate method to handle this class of difficult problems, rather than to introduce a highly specialized method competing for the best solution and the fastest computation time.

This paper is organized as follows. In Section 2, we review the general convex SOCP and the semismooth Newton method [17] for solving it. Section 3 presents a detailed process of reformulating (1) as a nonlinear SOCP. Section 4 proposes a primal-dual continuation approach for the CMFWP by approximately solving a sequence of convexified SOCPs. In Section 5, we report preliminary numerical results for some test problems from [3, 28] with the continuation approach, and compare the final objective values with those yielded by the global optimization methods of [28]. Finally, we conclude this paper.

Throughout this paper, ∥ · ∥ denotes the Euclidean norm, IR^n denotes the space of n-dimensional real column vectors, and IR^{n_1} × · · · × IR^{n_m} is identified with IR^{n_1+···+n_m}. Thus, (x_1, . . . , x_m) ∈ IR^{n_1} × · · · × IR^{n_m} is viewed as a column vector in IR^{n_1+···+n_m}. The notations I and 0 denote an identity matrix and a zero matrix of suitable dimension, respectively, and diag(x_1, . . . , x_n) means a diagonal matrix with x_1, . . . , x_n as the diagonal elements. Given a finite number of square matrices Q_1, . . . , Q_n, we denote by diag(Q_1, . . . , Q_n) the block diagonal matrix with these matrices as diagonal blocks. For a differentiable function f, we denote by ∇f(x) and ∇²_{xx} f(x) the gradient and the Hessian matrix of f at x, respectively.

For a differentiable mapping G : IR^n → IR^m, we denote by G′(x) ∈ IR^{m×n} the Jacobian of G at x. Let O be an open set in IR^n. If G : O → IR^m is locally Lipschitz continuous, then

\[
\partial_B G(x) := \left\{ H \in \mathbb{R}^{m\times n} \;\middle|\; \exists \{x^k\} \subseteq D_G : x^k \to x, \; G'(x^k) \to H \right\}
\]

is nonempty and called the B-subdifferential of G at x ∈ O, where D_G denotes the set of points at which G is differentiable. We assume that the reader is familiar with the concepts of (strongly) semismooth functions, and refer to [24, 25] for details.

2 Preliminaries

The convex SOCP is to minimize a convex function over the intersection of an affine linear manifold with the Cartesian product of second-order cones, which can be described as

\[
\begin{array}{rl}
\text{minimize} & g(x) \\
\text{subject to} & Ax = b, \quad x \in \mathcal{K},
\end{array}
\tag{3}
\]

where g : IRn → IR is a twice continuously differentiable convex function, A is an m × n matrix with full row rank, b ∈ IRm and K is the Cartesian product of second-order cones (SOCs), also called Lorentz cones. In other words,

\[
\mathcal{K} = \mathcal{K}^{n_1} \times \mathcal{K}^{n_2} \times \cdots \times \mathcal{K}^{n_q},
\tag{4}
\]


where q, n_1, . . . , n_q ≥ 1, n_1 + · · · + n_q = n, and K^{n_i} denotes the SOC in IR^{n_i} defined by

\[
\mathcal{K}^{n_i} := \left\{ x_i = (x_{i1}, x_{i2}, \ldots, x_{in_i}) \in \mathbb{R}^{n_i} \;\middle|\; \sqrt{x_{i2}^2 + \cdots + x_{in_i}^2} \le x_{i1} \right\},
\tag{5}
\]

with K^1 denoting the set of nonnegative real numbers IR_+. A special case of (4) corresponds to the nonnegative orthant cone IR^n_+, i.e., q = n and n_1 = · · · = n_q = 1. When g is linear, (3) clearly becomes a linear SOCP, which has been investigated in many previous works; the interested reader is referred to the survey papers [2, 19] and the books [5, 6] for many important applications and theoretical properties.

The treatment of nonlinear convex SOCPs is much more recent and mainly focuses on the development of effective solution methods. Notice that K is a closed convex cone which is self-dual in the sense that K equals its dual cone K^* := {y ∈ IR^n | ⟨y, x⟩ ≥ 0 for all x ∈ K}.

Thus, it is easy to write the Karush-Kuhn-Tucker (KKT) optimality conditions of (3) as

\[
\begin{array}{l}
\nabla g(x) - A^T y - \lambda = 0 \\
Ax - b = 0 \\
\langle x, \lambda\rangle = 0, \quad x \in \mathcal{K}, \; \lambda \in \mathcal{K}.
\end{array}
\tag{6}
\]

These conditions are also sufficient for optimality since g is convex. Based on the KKT system (6), there have been several methods proposed for solving (3), which include the smoothing Newton methods [8, 12, 16], the merit function method [9], and the semismooth Newton method [17]. As mentioned in the introduction, this paper is concerned with the application of the nonlinear SOCP in the CMFWP.

Since the resulting convex SOCPs in Section 4 will be solved with the semismooth Newton method in [17], we next review it. Let P_K : IR^n → IR^n denote the Euclidean projection operator onto the cone K, i.e., P_K(z) := argmin_{y∈K} ∥z − y∥ for any z ∈ IR^n. Then, from [12, Prop. 4.1], we have that

\[
x - P_{\mathcal{K}}(x - \lambda) = 0 \iff x \in \mathcal{K}, \; \lambda \in \mathcal{K}, \; \langle x, \lambda\rangle = 0.
\tag{7}
\]

Consequently, solving the convex SOCP (3) is equivalent to finding the zeros of

\[
\Phi(\omega) := \Phi(x, y, \lambda) :=
\begin{pmatrix}
\nabla g(x) - A^T y - \lambda \\
Ax - b \\
x - P_{\mathcal{K}}(x - \lambda)
\end{pmatrix}.
\tag{8}
\]
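As a small sanity check of the characterization (7)–(8), the residual Φ can be assembled directly. The sketch below is our own illustration using the special case K = IR^n_+, where P_K reduces to the componentwise maximum with zero; it verifies that a hand-computed KKT triple makes Φ vanish.

```python
import numpy as np

def Phi(x, y, lam, grad_g, A, b):
    """KKT residual (8) for K = the nonnegative orthant, where the
    projection P_K is the componentwise max with 0."""
    return np.concatenate([grad_g(x) - A.T @ y - lam,
                           A @ x - b,
                           x - np.maximum(0.0, x - lam)])

# min 0.5*||x - c||^2  s.t.  x1 + x2 = 1, x >= 0, with c = (2, -1).
# The solution is x* = (1, 0), with multipliers y* = -1, lam* = (0, 2).
c = np.array([2.0, -1.0])
A = np.array([[1.0, 1.0]]); b = np.array([1.0])
x_star = np.array([1.0, 0.0])
y_star = np.array([-1.0]); lam_star = np.array([0.0, 2.0])
res = Phi(x_star, y_star, lam_star, lambda x: x - c, A, b)
print(np.allclose(res, 0.0))   # True: (x*, y*, lam*) solves the KKT system
```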

Since P_K is strongly semismooth by [16, Prop. 4.5], applying [11, Theorem 19] yields that the operator Φ is at least semismooth; furthermore, it is strongly semismooth if ∇²_{xx}g(x) is locally Lipschitz continuous at any x ∈ IR^n. The semismooth Newton method in [17] finds a zero of Φ by applying the nonsmooth Newton method [24, 25] to the semismooth system Φ(ω) = 0. In other words, it generates the iterate sequence {ω^k = (x^k, y^k, λ^k)} by

\[
\omega^{k+1} := \omega^k - W_k^{-1}\Phi(\omega^k),
\tag{9}
\]
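The iteration (9) can be illustrated on a one-dimensional instance of (3) with K = IR_+ and no equality constraint, so that Φ(x, λ) = (x − c − λ, x − max{0, x − λ}). The sketch below is our own toy rendering, not the authors' implementation; the B-subdifferential element of the max function is v ∈ {0, 1}.

```python
import numpy as np

def semismooth_newton_1d(c, x=1.0, lam=0.0, tol=1e-10, maxit=50):
    """Semismooth Newton for the KKT system of min 0.5*(x-c)^2 s.t. x >= 0:
       Phi(x, lam) = (x - c - lam, x - max(0, x - lam)) = 0."""
    for _ in range(maxit):
        phi = np.array([x - c - lam, x - max(0.0, x - lam)])
        if np.linalg.norm(phi) < tol:
            break
        v = 1.0 if x - lam > 0 else 0.0       # element of d/dz max(0, z)
        W = np.array([[1.0, -1.0], [1.0 - v, v]])   # analogue of W_k
        d = np.linalg.solve(W, phi)
        x, lam = x - d[0], lam - d[1]
    return x, lam

x_star, lam_star = semismooth_newton_1d(-2.0)
print(float(x_star), float(lam_star))   # 0.0 2.0 (x* = 0, lambda* = -c = 2)
```

On this tiny example the method terminates after finitely many steps, reflecting the piecewise-linear structure of the residual.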


where W_k is an arbitrary element of the B-subdifferential ∂_B Φ(ω^k) and has the form

\[
W_k =
\begin{pmatrix}
\nabla^2_{xx} g(x^k) & -A^T & -I \\
A & 0 & 0 \\
I - V_k & 0 & V_k
\end{pmatrix}
\]

for a suitable block diagonal matrix V_k = diag(V_1^k, . . . , V_q^k) with V_i^k ∈ ∂_B P_{K^{n_i}}(x_i^k − λ_i^k).

The following two technical lemmas provide, respectively, the formula for computing the value of P_K at any point and the representation of each element V ∈ ∂_B P_{K^n}(z).

Lemma 2.1 [17, Lemma 2.2] For any given z = (z_1, z_2) ∈ IR × IR^{n−1}, it holds that

\[
P_{\mathcal{K}^n}(z) = \max\{0, \mu_1(z)\}\, u_z^{(1)} + \max\{0, \mu_2(z)\}\, u_z^{(2)},
\]

where μ_1(z), μ_2(z) and u_z^{(1)}, u_z^{(2)} are the spectral values and the spectral vectors of z, respectively, given by

\[
\mu_1(z) = z_1 - \|z_2\|, \quad \mu_2(z) = z_1 + \|z_2\|; \qquad
u_z^{(1)} = \tfrac{1}{2}\left(1, -\bar z_2\right), \quad u_z^{(2)} = \tfrac{1}{2}\left(1, \bar z_2\right),
\]

with \(\bar z_2 = z_2/\|z_2\|\) if z_2 ≠ 0, and otherwise \(\bar z_2\) being any vector in IR^{n−1} satisfying ∥\(\bar z_2\)∥ = 1.
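Lemma 2.1 translates directly into code. The helper below is our own sketch (it assumes n ≥ 2, and picks the first unit vector in the ambiguous case z_2 = 0):

```python
import numpy as np

def proj_soc(z):
    """Projection onto the second-order cone K^n via the spectral
    decomposition of Lemma 2.1: P(z) = max(0,mu1)*u1 + max(0,mu2)*u2."""
    z1, z2 = z[0], z[1:]
    nz2 = np.linalg.norm(z2)
    if nz2 > 0.0:
        zbar = z2 / nz2
    else:
        zbar = np.zeros_like(z2)
        zbar[0] = 1.0                 # any unit vector is admissible here
    mu1, mu2 = z1 - nz2, z1 + nz2
    u1 = 0.5 * np.concatenate(([1.0], -zbar))
    u2 = 0.5 * np.concatenate(([1.0], zbar))
    return max(0.0, mu1) * u1 + max(0.0, mu2) * u2

# A point already in K^3 is left unchanged; a point in -K^3 maps to 0.
print(proj_soc(np.array([5.0, 3.0, 4.0])))    # [5. 3. 4.]
print(proj_soc(np.array([-5.0, 3.0, 4.0])))   # [0. 0. 0.]
```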

Lemma 2.2 [17, Lemma 2.6] Given a general point z = (z_1, z_2) ∈ IR × IR^{n−1}, each element V ∈ ∂_B P_{K^n}(z) has the following representation:

(a) If z_1 ≠ ±∥z_2∥, then P_{K^n} is continuously differentiable at z with

\[
V = \nabla P_{\mathcal{K}^n}(z) =
\begin{cases}
0 & \text{if } z_1 < -\|z_2\|, \\[0.5ex]
I & \text{if } z_1 > \|z_2\|, \\[0.5ex]
\dfrac{1}{2}\begin{pmatrix} 1 & \bar z_2^T \\ \bar z_2 & H \end{pmatrix} & \text{if } -\|z_2\| < z_1 < \|z_2\|,
\end{cases}
\]

where

\[
\bar z_2 := \frac{z_2}{\|z_2\|}, \qquad
H := \left(1 + \frac{z_1}{\|z_2\|}\right) I - \frac{z_1}{\|z_2\|}\, \bar z_2 \bar z_2^T.
\]

(b) If z_2 ≠ 0 and z_1 = ∥z_2∥, then

\[
V \in \left\{ I, \; \frac{1}{2}\begin{pmatrix} 1 & \bar z_2^T \\ \bar z_2 & H \end{pmatrix} \right\},
\quad \text{where } \bar z_2 := \frac{z_2}{\|z_2\|} \text{ and } H := 2I - \bar z_2 \bar z_2^T.
\]

(c) If z_2 ≠ 0 and z_1 = −∥z_2∥, then

\[
V \in \left\{ 0, \; \frac{1}{2}\begin{pmatrix} 1 & \bar z_2^T \\ \bar z_2 & H \end{pmatrix} \right\},
\quad \text{where } \bar z_2 := \frac{z_2}{\|z_2\|} \text{ and } H := \bar z_2 \bar z_2^T.
\]

(d) If z = 0, then either V = 0, or V = I, or V belongs to the set

\[
\left\{ \frac{1}{2}\begin{pmatrix} 1 & \bar z_2^T \\ \bar z_2 & H \end{pmatrix} \;\middle|\; H = (w_0 + 1)I - w_0 \bar z_2 \bar z_2^T \text{ for some } |w_0| \le 1 \text{ and } \|\bar z_2\| = 1 \right\}.
\]
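The differentiable case (a) of Lemma 2.2 can be checked numerically against a finite-difference Jacobian of the projection of Lemma 2.1. The sketch below is our own illustration at a point with −∥z_2∥ < z_1 < ∥z_2∥:

```python
import numpy as np

def proj_soc(z):
    # Spectral-decomposition projection onto K^n (Lemma 2.1).
    z1, z2 = z[0], z[1:]
    nz2 = np.linalg.norm(z2)
    if nz2 == 0.0:
        return np.concatenate(([max(0.0, z1)], z2))
    zbar = z2 / nz2
    u1 = 0.5 * np.concatenate(([1.0], -zbar))
    u2 = 0.5 * np.concatenate(([1.0], zbar))
    return max(0.0, z1 - nz2) * u1 + max(0.0, z1 + nz2) * u2

def jac_proj_soc_case_a(z):
    """V = nabla P_{K^n}(z) in the middle case -||z2|| < z1 < ||z2||
    of Lemma 2.2(a)."""
    z1, z2 = z[0], z[1:]
    nz2 = np.linalg.norm(z2)
    zbar = z2 / nz2
    H = (1.0 + z1 / nz2) * np.eye(len(z2)) - (z1 / nz2) * np.outer(zbar, zbar)
    return 0.5 * np.block([[np.ones((1, 1)), zbar[None, :]],
                           [zbar[:, None], H]])

z = np.array([0.2, 1.0, 0.0])       # -1 < z1 = 0.2 < 1 = ||z2||
V = jac_proj_soc_case_a(z)
eps = 1e-6                           # central finite differences
Vfd = np.column_stack([(proj_soc(z + eps * e) - proj_soc(z - eps * e))
                       / (2 * eps) for e in np.eye(3)])
print(np.max(np.abs(V - Vfd)) < 1e-6)   # True
```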

3 Nonlinear SOCP reformulation

In this section, we will present the detailed process of reformulating the capacitated multi- facility Weber problem (1) as a nonlinear SOCP with the form of (3). First, by introducing mn new variables tij for i = 1, 2, . . . , m, j = 1, 2, . . . , n, we can transform (1) into

\[
\begin{array}{rl}
\min & \displaystyle\sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij} w_{ij} t_{ij} \\[1ex]
\mathrm{s.t.} & \|x_i - a_j\| \le t_{ij}, \quad i = 1, 2, \ldots, m; \; j = 1, 2, \ldots, n \\[1ex]
 & \displaystyle\sum_{j=1}^{n} w_{ij} = s_i, \quad i = 1, 2, \ldots, m \\[1ex]
 & \displaystyle\sum_{i=1}^{m} w_{ij} = d_j, \quad j = 1, 2, \ldots, n \\[1ex]
 & w_{ij} \ge 0, \quad i = 1, 2, \ldots, m; \; j = 1, 2, \ldots, n \\[1ex]
 & x_i \in \mathbb{R}^2, \quad i = 1, 2, \ldots, m.
\end{array}
\tag{10}
\]

We write out all the constraints of ∥xi− aj∥ ≤ tij, i = 1, 2, . . . , m; j = 1, 2, . . . , n as below:

\[
\begin{array}{ll}
\sqrt{(x_{i1} - a_{11})^2 + (x_{i2} - a_{12})^2} \le t_{i1}, & i = 1, 2, \ldots, m, \\[0.5ex]
\sqrt{(x_{i1} - a_{21})^2 + (x_{i2} - a_{22})^2} \le t_{i2}, & i = 1, 2, \ldots, m, \\[0.5ex]
\qquad\qquad\qquad \vdots & \qquad \vdots \\[0.5ex]
\sqrt{(x_{i1} - a_{n1})^2 + (x_{i2} - a_{n2})^2} \le t_{in}, & i = 1, 2, \ldots, m.
\end{array}
\tag{11}
\]

Let

\[
u_{ij} := x_{i1} - a_{j1}, \quad v_{ij} := x_{i2} - a_{j2}, \qquad i = 1, 2, \ldots, m, \; j = 1, 2, \ldots, n.
\tag{12}
\]

Then, the constraints in (11) turn into

\[
\begin{array}{lll}
(u_{i1})^2 + (v_{i1})^2 \le (t_{i1})^2, & t_{i1} \ge 0, & i = 1, 2, \ldots, m, \\[0.5ex]
(u_{i2})^2 + (v_{i2})^2 \le (t_{i2})^2, & t_{i2} \ge 0, & i = 1, 2, \ldots, m, \\[0.5ex]
\qquad\qquad \vdots & \quad \vdots & \qquad \vdots \\[0.5ex]
(u_{in})^2 + (v_{in})^2 \le (t_{in})^2, & t_{in} \ge 0, & i = 1, 2, \ldots, m.
\end{array}
\tag{13}
\]


Let

\[
\hat x_{ij} := \begin{pmatrix} t_{ij} \\ u_{ij} \\ v_{ij} \end{pmatrix} \in \mathbb{R}^3, \qquad i = 1, 2, \ldots, m, \; j = 1, 2, \ldots, n.
\tag{14}
\]

From (13) and the definition of K^3, we have \(\hat x_{ij}\) ∈ K^3 for i = 1, 2, . . . , m and j = 1, 2, . . . , n.
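The equivalence between ∥x_i − a_j∥ ≤ t_ij and \(\hat x_{ij}\) ∈ K^3 is easy to verify numerically. The following toy check is our own, with made-up coordinates:

```python
import numpy as np

def in_K3(xhat, tol=1e-12):
    """Membership test for K^3: sqrt(u^2 + v^2) <= t."""
    t, u, v = xhat
    return np.hypot(u, v) <= t + tol

# Hedged toy data: facility x_i = (3, 4), customer a_j = (0, 0),
# so ||x_i - a_j|| = 5 and (t, u, v) = (t_ij, 3, 4).
xi, aj = np.array([3.0, 4.0]), np.array([0.0, 0.0])
u, v = xi - aj
print(in_K3(np.array([5.0, u, v])))   # True:  t_ij = 5 = ||x_i - a_j||
print(in_K3(np.array([4.9, u, v])))   # False: t_ij too small
```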

It should be pointed out that we have created additional constraints through the above reformulation procedure. In particular, from (12), it follows that

\[
\begin{array}{ll}
u_{11} - u_{1k} = a_{k1} - a_{11}, & k = 2, 3, \ldots, n, \\
\qquad \vdots & \\
u_{m1} - u_{mk} = a_{k1} - a_{11}, & k = 2, 3, \ldots, n,
\end{array}
\tag{15}
\]

and

\[
\begin{array}{ll}
v_{11} - v_{1k} = a_{k2} - a_{12}, & k = 2, 3, \ldots, n, \\
\qquad \vdots & \\
v_{m1} - v_{mk} = a_{k2} - a_{12}, & k = 2, 3, \ldots, n.
\end{array}
\tag{16}
\]

We will see that (15)–(16) can be recast as a linear system. Note that (15) and (16) are equivalent to

\[
\begin{array}{ll}
[0 \;\; 1 \;\; 0]\begin{pmatrix} t_{11} \\ u_{11} \\ v_{11} \end{pmatrix} + [0 \;\; {-1} \;\; 0]\begin{pmatrix} t_{1k} \\ u_{1k} \\ v_{1k} \end{pmatrix} = a_{k1} - a_{11}, & k = 2, 3, \ldots, n, \\[2ex]
\qquad\qquad \vdots & \\[0.5ex]
[0 \;\; 1 \;\; 0]\begin{pmatrix} t_{m1} \\ u_{m1} \\ v_{m1} \end{pmatrix} + [0 \;\; {-1} \;\; 0]\begin{pmatrix} t_{mk} \\ u_{mk} \\ v_{mk} \end{pmatrix} = a_{k1} - a_{11}, & k = 2, 3, \ldots, n,
\end{array}
\]

and

\[
\begin{array}{ll}
[0 \;\; 0 \;\; 1]\begin{pmatrix} t_{11} \\ u_{11} \\ v_{11} \end{pmatrix} + [0 \;\; 0 \;\; {-1}]\begin{pmatrix} t_{1k} \\ u_{1k} \\ v_{1k} \end{pmatrix} = a_{k2} - a_{12}, & k = 2, 3, \ldots, n, \\[2ex]
\qquad\qquad \vdots & \\[0.5ex]
[0 \;\; 0 \;\; 1]\begin{pmatrix} t_{m1} \\ u_{m1} \\ v_{m1} \end{pmatrix} + [0 \;\; 0 \;\; {-1}]\begin{pmatrix} t_{mk} \\ u_{mk} \\ v_{mk} \end{pmatrix} = a_{k2} - a_{12}, & k = 2, 3, \ldots, n.
\end{array}
\]

We can simplify (15)–(16) by introducing the following notations. More specifically, let

\[
A_u :=
\begin{pmatrix}
0 \; 1 \; 0 & 0 \; {-1} \; 0 & & \\
0 \; 1 \; 0 & & 0 \; {-1} \; 0 & \\
\;\;\vdots & & & \ddots \\
0 \; 1 \; 0 & & & 0 \; {-1} \; 0
\end{pmatrix}
\in \mathbb{R}^{(n-1)\times 3n},
\tag{17}
\]


\[
A_l :=
\begin{pmatrix}
0 \; 0 \; 1 & 0 \; 0 \; {-1} & & \\
0 \; 0 \; 1 & & 0 \; 0 \; {-1} & \\
\;\;\vdots & & & \ddots \\
0 \; 0 \; 1 & & & 0 \; 0 \; {-1}
\end{pmatrix}
\in \mathbb{R}^{(n-1)\times 3n},
\tag{18}
\]

and denote

\[
b_u := \begin{pmatrix} a_{21} - a_{11} \\ a_{31} - a_{11} \\ \vdots \\ a_{n1} - a_{11} \end{pmatrix} \in \mathbb{R}^{n-1},
\qquad
b_l := \begin{pmatrix} a_{22} - a_{12} \\ a_{32} - a_{12} \\ \vdots \\ a_{n2} - a_{12} \end{pmatrix} \in \mathbb{R}^{n-1}.
\]

Then, equations (15)–(16) can be recast as the following system of linear constraints:

\[
\begin{pmatrix}
\mathrm{diag}(A_u, \ldots, A_u) \\
\mathrm{diag}(A_l, \ldots, A_l)
\end{pmatrix}
\begin{pmatrix}
\hat x_{11} \\ \vdots \\ \hat x_{1n} \\ \vdots \\ \hat x_{m1} \\ \vdots \\ \hat x_{mn}
\end{pmatrix}
=
\begin{pmatrix}
b_u \\ \vdots \\ b_u \\ b_l \\ \vdots \\ b_l
\end{pmatrix},
\tag{19}
\]

with m copies of A_u and of A_l on the diagonals and m copies of b_u and b_l on the right-hand side, where the dimensions in the linear system are 2m(n−1) × 3mn, 3mn × 1 and 2m(n−1) × 1 for the matrix, the column of variables and the column of constants, respectively.

We next look into the constraints on demand and capacity, namely the two groups of equality constraints in (10). We notice that they can be recast as the linear system

\[
\begin{pmatrix}
\mathbf{1}_n^T & & & \\
& \mathbf{1}_n^T & & \\
& & \ddots & \\
& & & \mathbf{1}_n^T \\
I_n & I_n & \cdots & I_n
\end{pmatrix}
\begin{pmatrix}
w_{11} \\ \vdots \\ w_{1n} \\ \vdots \\ w_{m1} \\ \vdots \\ w_{mn}
\end{pmatrix}
=
\begin{pmatrix}
s_1 \\ \vdots \\ s_m \\ d_1 \\ \vdots \\ d_n
\end{pmatrix},
\tag{20}
\]

where \(\mathbf{1}_n^T = [1 \; 1 \; \cdots \; 1]\) denotes the n-dimensional all-ones row vector and I_n the n × n identity matrix.

Again, we point out that the dimensions of the system (20) are (m + n) × mn, mn × 1 and (m + n) × 1 for the matrix, the column of variables and the column of constants, respectively. Moreover, the coefficient matrix does not have full row rank due to assumption (2), but any m + n − 1 of its rows are linearly independent. For convenience, in the rest of this paper, A_w denotes the matrix composed of the first m + n − 1 rows of the coefficient matrix of (20).
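The rank deficiency noted above is easy to confirm numerically. The sketch below (our own illustration) assembles the coefficient matrix of (20) with Kronecker products and checks that its rank is m + n − 1:

```python
import numpy as np

def transport_matrix(m, n):
    """Coefficient matrix of (20): m capacity rows then n demand rows."""
    top = np.kron(np.eye(m), np.ones((1, n)))      # sum_j w_ij = s_i
    bottom = np.kron(np.ones((1, m)), np.eye(n))   # sum_i w_ij = d_j
    return np.vstack([top, bottom])

m, n = 3, 4
M = transport_matrix(m, n)
print(M.shape)                       # (7, 12), i.e. (m + n) x (m n)
print(np.linalg.matrix_rank(M))      # 6 = m + n - 1: one redundant row
```

The redundancy is exactly assumption (2): the sum of the capacity rows equals the sum of the demand rows.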

In summary, we reformulate the CMFWP (1) as the following nonlinear SOCP:

\[
\begin{array}{rl}
\text{minimize} & \displaystyle\sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij} w_{ij} t_{ij} \\[1ex]
\text{subject to} & Ax = b, \\
& x \in (\mathcal{K}^3)^{mn} \times (\mathcal{K}^1)^{mn},
\end{array}
\tag{21}
\]

where (K^3)^{mn} × (K^1)^{mn} denotes the Cartesian product of mn copies of K^3 and mn copies of K^1, and

\[
x := \left(\hat x_{11}, \ldots, \hat x_{1n}, \ldots, \hat x_{m1}, \ldots, \hat x_{mn}, w_{11}, \ldots, w_{1n}, \ldots, w_{m1}, \ldots, w_{mn}\right),
\tag{22}
\]

\[
A :=
\begin{pmatrix}
\mathrm{diag}(A_u, \ldots, A_u) & 0 \\
\mathrm{diag}(A_l, \ldots, A_l) & 0 \\
0 & A_w
\end{pmatrix},
\tag{23}
\]

with m copies of A_u and of A_l on the diagonals, and


\[
b := (b_u, b_u, \ldots, b_u, b_l, b_l, \ldots, b_l, s_1, \ldots, s_m, d_1, \ldots, d_{n-1}).
\tag{24}
\]

Notice that, by the expression (22) of x, the objective function of (21) can be rewritten as a quadratic function x^T Q x, where Q = [Q_{kl}]_{4mn×4mn} is a nonsymmetric matrix with

\[
Q_{kl} =
\begin{cases}
c_{ij} & \text{if } k = 3(i-1)n + 3(j-1) + 1 \text{ and } l = 3mn + (i-1)n + j, \\
0 & \text{otherwise},
\end{cases}
\tag{25}
\]

for k, l = 1, 2, . . . , 4mn and i = 1, 2, . . . , m; j = 1, 2, . . . , n. Hence, (21) is equivalent to

\[
\begin{array}{rl}
\text{minimize} & x^T Q x \\
\text{subject to} & Ax = b, \\
& x \in (\mathcal{K}^3)^{mn} \times (\mathcal{K}^1)^{mn},
\end{array}
\tag{26}
\]

where Q is the 4mn × 4mn matrix given by (25), A is a (2mn − m + n − 1) × 4mn matrix with full row rank 2mn − m + n − 1, and the dimension of b is (2mn − m + n − 1) × 1.

Since x^T Q x = x^T Q^T x for any x ∈ IR^{4mn}, the SOCP (21) is further equivalent to

\[
\begin{array}{rl}
\text{minimize} & f(x) := \frac{1}{2} x^T (Q + Q^T) x \\[0.5ex]
\text{subject to} & Ax = b, \\
& x \in (\mathcal{K}^3)^{mn} \times (\mathcal{K}^1)^{mn}.
\end{array}
\tag{27}
\]

The SOCP (27) has an advantage over (26) in that its Hessian matrix is symmetric, although its objective function is still nonconvex. In the next section, we develop a primal-dual algorithm for solving the CMFWP based on the nonlinear SOCP reformulation (27).

We should point out that the Hessian matrix of the function f, i.e., \(\bar Q = Q + Q^T\), and the constraint coefficient matrix A of (27) are both sparse. For example, when m = n = 2, \(\bar Q = [\bar Q_{kl}]_{16\times 16}\) has only eight nonzero entries: \(\bar Q_{1,13} = \bar Q_{13,1} = c_{11}\), \(\bar Q_{4,14} = \bar Q_{14,4} = c_{12}\), \(\bar Q_{7,15} = \bar Q_{15,7} = c_{21}\), \(\bar Q_{10,16} = \bar Q_{16,10} = c_{22}\), whereas A = [A_{ij}]_{7×16} has fourteen nonzero entries: A_{1,2} = 1, A_{1,5} = −1, A_{2,8} = 1, A_{2,11} = −1, A_{3,3} = 1, A_{3,6} = −1, A_{4,9} = 1, A_{4,12} = −1, A_{5,13} = A_{5,14} = 1, A_{6,15} = A_{6,16} = 1, A_{7,13} = A_{7,15} = 1. In addition, we observe that each row of \(\bar Q\) has at most one nonzero entry c_{i1,j1} with c_{i1,j1} ∈ {c_{11}, . . . , c_{1n}, . . . , c_{m1}, . . . , c_{mn}}, and all its diagonal entries are zero.
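The index pattern (25) can be verified by assembling Q for random data and comparing x^T Q x with the objective of (21). The sketch below is our own illustration; the code uses 0-based indices, unlike the 1-based indices of (25).

```python
import numpy as np

def build_Q(c):
    """Q of (25): Q_kl = c_ij at k = 3(i-1)n + 3(j-1) + 1 (the t_ij slot)
    and l = 3mn + (i-1)n + j (the w_ij slot), in 1-based indices."""
    m, n = c.shape
    Q = np.zeros((4 * m * n, 4 * m * n))
    for i in range(m):
        for j in range(n):
            k = 3 * i * n + 3 * j          # 0-based position of t_ij
            l = 3 * m * n + i * n + j      # 0-based position of w_ij
            Q[k, l] = c[i, j]
    return Q

rng = np.random.default_rng(0)
m, n = 2, 2
c = rng.random((m, n)); t = rng.random((m, n)); w = rng.random((m, n))
# x stacks the (t_ij, u_ij, v_ij) blocks, then all w_ij; u, v do not
# enter the objective, so random values suffice here.
x = np.concatenate([np.column_stack([t.ravel(),
                                     rng.random(m * n),
                                     rng.random(m * n)]).ravel(), w.ravel()])
Q = build_Q(c)
print(np.isclose(x @ Q @ x, np.sum(c * w * t)))   # True
```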

4 Continuation approach for the CMFWP

This section designs a primal-dual approximate algorithm for the nonconvex SOCP (27).

The main idea is to transform (27) into the solution of a sequence of convexified SOCP subproblems.


First, note that the variables w_ij are nonnegative and restricted by the linear constraints \(\sum_{j=1}^{n} w_{ij} = s_i\) and \(\sum_{i=1}^{m} w_{ij} = d_j\), and hence the nonconvex SOCP (27) is equivalent to

\[
\begin{array}{rl}
\min & f(x) \\
\mathrm{s.t.} & Ax = b, \\
& 0 \le w_{ij} \le \min\{s_i, d_j\}, \quad i = 1, \ldots, m; \; j = 1, \ldots, n, \\
& x \in (\mathcal{K}^3)^{mn} \times (\mathcal{K}^1)^{mn}.
\end{array}
\tag{28}
\]

(28)

Since the logarithmic barrier function \(\sum_{i=1}^{m}\sum_{j=1}^{n}\left[\ln(w_{ij}) + \ln(\min\{s_i, d_j\} - w_{ij})\right]\) is well defined when 0 < w_ij < min{s_i, d_j} for all i = 1, 2, . . . , m and j = 1, 2, . . . , n, and moreover it tends to −∞ whenever some w_ij tends to 0 or to min{s_i, d_j}, we can dispense with the bound constraints on w_ij in (28) and obtain the following transformed problem:

\[
\begin{array}{rl}
\min & f(x) - \tau \displaystyle\sum_{i=1}^{m}\sum_{j=1}^{n}\left[\ln(w_{ij}) + \ln(\min\{s_i, d_j\} - w_{ij})\right] \\[1ex]
\mathrm{s.t.} & Ax = b, \\
& x \in (\mathcal{K}^3)^{mn} \times (\mathcal{K}^1)^{mn},
\end{array}
\tag{29}
\]

where τ > 0 is a barrier parameter that eventually tends to 0. This operation may seem to make the original problem (27) more complex. However, we will see that the introduction of the logarithmic barrier term helps in locating a feasible solution of (27), and also plays a certain role in convexifying f(x).

Now, we introduce a quadratic proximal term \(\frac{1}{2}\|x - z\|^2\), with z ∈ IR^{4mn} a given vector, into the objective of (29) to further convexify f(x). Define

\[
g(x, z, \tau, \varepsilon) := f(x) - \tau \sum_{i=1}^{m}\sum_{j=1}^{n}\left[\ln(w_{ij}) + \ln(\min\{s_i, d_j\} - w_{ij})\right] + \frac{\varepsilon}{2}\|x - z\|^2,
\tag{30}
\]

where ε > 0 is a proximal parameter that will be increased until it exceeds a certain threshold.

Proposition 4.1 Given a vector z ∈ IR^{4mn}, let g be defined as in (30). Then, there exists ε_0 > 0 such that g(·, z, τ, ε) is strictly convex for any ε > ε_0 and τ > 0.

Proof. For given z ∈ IR^{4mn} and any τ, ε > 0, we compute the Hessian matrix of g as

\[
\nabla^2_{xx} g(x, z, \tau, \varepsilon) = Q + Q^T + \mathrm{diag}(\underbrace{\varepsilon, \ldots, \varepsilon}_{3mn}, P_{11} + \varepsilon, \ldots, P_{1n} + \varepsilon, \ldots, P_{m1} + \varepsilon, \ldots, P_{mn} + \varepsilon),
\]

with P_ij for i = 1, 2, . . . , m and j = 1, 2, . . . , n given by

\[
P_{ij} = \frac{\tau}{(w_{ij})^2} + \frac{\tau}{(\min\{s_i, d_j\} - w_{ij})^2} > 0.
\]


Since all diagonal entries of Q + Q^T are zero and each row of Q + Q^T has at most one nonzero entry c_{i1,j1} with c_{i1,j1} ∈ {c_{11}, . . . , c_{1n}, c_{21}, . . . , c_{2n}, . . . , c_{m1}, . . . , c_{mn}}, we have that

\[
\left[\nabla^2_{xx} g(x, z, \tau, \varepsilon)\right]_{kk} > \sum_{l=1,\, l\neq k}^{4mn} \left|\left[\nabla^2_{xx} g(x, z, \tau, \varepsilon)\right]_{kl}\right|
\quad \text{for all } k = 1, 2, \ldots, 4mn
\]

whenever ε > ε_0 with

\[
\varepsilon_0 := \max_{1\le i\le m,\; 1\le j\le n} \{c_{ij}\}.
\tag{31}
\]

In other words, the matrix ∇²_{xx} g(x, z, τ, ε) is strictly diagonally dominant for any ε > ε_0 and τ > 0. From Corollary 7.2.3 of [14], it then follows that ∇²_{xx} g(x, z, τ, ε) is positive definite, and consequently g(x, z, τ, ε) is strictly convex, for any ε > ε_0 and τ > 0. □

Proposition 4.1 states that for a given z ∈ IR^{4mn}, the function g(x, z, τ, ε) is strictly convex in x for any ε > ε_0 and τ > 0. In fact, it is also strongly convex when ε > ε_0. In view of this, our continuation approach seeks an approximate optimal solution of the CMFWP by solving a sequence of subproblems of the form

\[
\begin{array}{rl}
\min & g(x, \hat x^k, \tau_k, \varepsilon_k) \\
\mathrm{s.t.} & Ax = b, \\
& x \in (\mathcal{K}^3)^{mn} \times (\mathcal{K}^1)^{mn},
\end{array}
\tag{32}
\]

with a decreasing sequence {τ_k} and an increasing sequence {ε_k}, where \(\hat x^k\) is the vector given by the solution of the previous subproblem. For fixed k, since the subproblem (32) is a convex SOCP whenever ε_k > ε_0, it is easy to solve; by Section 2, solving it is equivalent to solving

\[
\begin{array}{l}
\nabla_x g(x, \hat x^k, \tau_k, \varepsilon_k) - A^T y - \lambda = 0 \\
Ax - b = 0 \\
x - P_{\mathcal{K}}(x - \lambda) = 0
\end{array}
\tag{33}
\]

with K = (K^3)^{mn} × (K^1)^{mn}. Define the mapping Φ_k : IR^{10mn−m+n−1} → IR^{10mn−m+n−1} by

\[
\Phi_k(\omega) = \Phi_k(x, y, \lambda) :=
\begin{pmatrix}
\nabla_x g(x, \hat x^k, \tau_k, \varepsilon_k) - A^T y - \lambda \\
Ax - b \\
x - P_{\mathcal{K}}(x - \lambda)
\end{pmatrix}.
\tag{34}
\]

Then, solving the nonsmooth system (33) is equivalent to finding the zero of the operator Φk(ω). We will attain this goal by using the semismooth Newton method in [17].

Next, we describe the iteration steps of the continuation approach, in which two fixed constants c_1 ∈ (0, 1) and c_2 > 1 are used to reduce and increase the dynamic parameters τ and ε, respectively. Let Ψ_k(ω) := \(\frac{1}{2}\|\Phi_k(\omega)\|^2\) denote the natural merit function of the system Φ_k(ω) = 0.

Algorithm 4.1 (Continuation Approach)


(S.0) Given the constants c_1 ∈ (0, 1), c_2 > 1 and τ̂, ε̂ > 0. Select τ_0, ε_0 > 0 and a starting point ω̄^0 = (x̄^0, ȳ^0, λ̄^0) with the last mn elements w̄^0_ij of x̄^0 satisfying 0 < w̄^0_ij < min{s_i, d_j}. Let x̂^0 := x̄^0, and set k := 0.

(S.1) Compute (by Algorithm 4.2 below) an approximate optimal solution ω^k = (x^k, y^k, λ^k) of (33) with the starting point ω̄^k = (x̄^k, ȳ^k, λ̄^k).

(S.2) If τ_k < τ̂ and ε_k > ε̂, then stop; otherwise go to (S.3).

(S.3) Update the parameters by τ_{k+1} := c_1 τ_k and ε_{k+1} := c_2 ε_k, and let x̄^{k+1} := x^k, ȳ^{k+1} := y^k, λ̄^{k+1} := λ^k, x̂^{k+1} := x^k.

(S.4) Set k := k + 1, and go to (S.1).
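The outer loop of Algorithm 4.1 can be rendered schematically as follows; the inner subproblem solver of step (S.1) is stubbed out, so only the parameter schedule and warm-starting are exercised. This is our own sketch, and all names and default values are made up.

```python
def continuation_loop(solve_subproblem, omega0, tau0=1.0, eps0=10.0,
                      tau_hat=1e-4, eps_hat=1e4, c1=0.5, c2=2.0):
    """Outer loop of Algorithm 4.1: shrink the barrier parameter tau by c1,
    grow the proximal parameter eps by c2, warm-starting each subproblem
    at the previous solution. solve_subproblem is a user-supplied stub
    standing in for the semismooth Newton solver of (33)."""
    omega, xhat, tau, eps = omega0, omega0, tau0, eps0
    k = 0
    while not (tau < tau_hat and eps > eps_hat):    # stopping rule (S.2)
        omega = solve_subproblem(omega, xhat, tau, eps)   # step (S.1)
        tau, eps = c1 * tau, c2 * eps                     # step (S.3)
        xhat = omega                                      # warm start
        k += 1
    return omega, k

# With a trivial stub, only the parameter schedule runs:
omega, k = continuation_loop(lambda om, xh, tau, eps: om, 0.0)
print(k)   # 14: the number of outer iterations until tau < 1e-4 and eps > 1e4
```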

Note that the continuation approach is a primal-dual one: the last mn components of the final iterate y^k provide an approximate dual solution of the CMFWP, which usually has a certain economic meaning associated with this class of transportation problems. In addition, the main computational work of Algorithm 4.1 lies in seeking an approximate solution of (33). Such a solution can be obtained easily when ε_k is large enough since, on the one hand, the mapping ∇_x g(x, x̂^k, τ_k, ε_k) is strongly monotone in this case, which together with Lemma 3.1 of [16] implies that the function Ψ_k(ω) has bounded level sets; and on the other hand, from Section 2, the operator Φ_k is at least semismooth, which guarantees in theory a fast algorithm with a superlinear (or quadratic) convergence rate for seeking the solution of (33). Of course, we should emphasize that, when computing the approximate solution of (33), we must restrict the maximum steplength of the iterates (see Algorithm 4.2 below).

We next describe the specific iteration steps of the semismooth Newton method [17] when applying it to solve the semismooth system (33). For a given k, from Section 2 it follows that the main iteration step of the method is

\[
\omega^{l+1} := \omega^l - W_l^{-1}\Phi_k(\omega^l), \qquad W_l \in \partial_B \Phi_k(\omega^l), \qquad l = 0, 1, 2, \ldots,
\]

where ∂_B Φ_k(ω^l) denotes the B-subdifferential of the semismooth mapping Φ_k at the point ω^l, and any element W_l of ∂_B Φ_k(ω^l) has the expression

\[
W_l :=
\begin{pmatrix}
\nabla^2_{xx} g(x^l, \hat x^k, \tau_k, \varepsilon_k) & -A^T & -I \\
A & 0 & 0 \\
I - V_l & 0 & V_l
\end{pmatrix}
\tag{35}
\]

for a suitable block diagonal matrix V_l = diag(V_1^l, . . . , V_{2mn}^l) with V_i^l ∈ ∂_B P_{K^3}((x^l − λ^l)_i) for i = 1, . . . , mn and V_i^l ∈ ∂_B P_{K^1}((x^l − λ^l)_i) for i = mn + 1, . . . , 2mn, where (x^l − λ^l)_i denotes the i-th block of x^l − λ^l under the partition induced by (K^3)^{mn} × (K^1)^{mn}.

Algorithm 4.2 (Solving the subproblem (33) for Step (S.1) of Algorithm 4.1)
