Partition-optimization with Schur convex sum objective functions


PARTITION-OPTIMIZATION WITH SCHUR CONVEX SUM OBJECTIVE FUNCTIONS∗

FRANK K. HWANG† AND URIEL G. ROTHBLUM‡

Vol. 18, No. 3, pp. 512–524

Abstract. We study optimization problems over partitions of the finite set $N = \{1, \dots, n\}$, where each element $i$ in the partitioned set $N$ is associated with a real number $\theta_i$ and the objective associated with a partition $\pi = (\pi_1, \dots, \pi_p)$ has the form $F(\pi) = f(\theta_\pi)$, where $\theta_\pi = \bigl(\sum_{i \in \pi_1} \theta_i, \dots, \sum_{i \in \pi_p} \theta_i\bigr)$. When $F$ is to be either maximized or minimized, we obtain conditions that allow for simple constructions of partitions that are uniformly optimal for all Schur convex functions $f$.

Key words. partitions, optimization, Schur-convexity

AMS subject classifications. 90C27, 26B25, 90C25

DOI. 10.1137/S0895480198347167

1. Introduction. We consider partitions of the finite set $N = \{1, \dots, n\}$ into nonempty parts. When a partition $\pi$ has $p$ parts, we refer to it as a $p$-partition and denote it by $\pi = (\pi_1, \dots, \pi_p)$; also, we refer to the vector $(|\pi_1|, \dots, |\pi_p|)$ as the shape of the partition $\pi$.

Throughout, we assume that each element i in the partitioned set N is associated with a real number θi and, by possibly permuting the elements of N , we may assume that θ1 ≤ θ2 ≤ · · · ≤ θn. A partition is called consecutive if (after the possible permutation of N ) the elements in each part are consecutive integers.

We consider optimization problems (maximization and minimization) over families of partitions where the objective value $F(\pi)$ associated with a partition $\pi$ is given through a real-valued function $f$ that is defined on $\mathbb{R}^p$ and $F(\pi) = f\bigl(\sum_{i \in \pi_1} \theta_i, \dots, \sum_{i \in \pi_p} \theta_i\bigr)$; such partitioning problems are called sum partitioning problems. Of particular interest are constrained-shape, bounded-shape, and single-shape problems, where the underlying sets of partitions are defined, respectively, by restrictions, bounds, and specification on the shape of partitions. For many applications of partitioning problems see, for example, [1, 2, 3, 4].

An important tool for studying optimization problems is the identification of properties that are satisfied by optimal solutions. In particular, determining the existence of optimal solutions with a particular property allows one to restrict the search for an optimal solution to a smaller class of feasible solutions, namely, those that satisfy the property. For partitioning problems, consecutiveness is a particularly valuable property, as the number of $p$-partitions with prescribed shape is exponential in $n$, while the number of consecutive $p$-partitions is $p!$. Conditions on the function $f$ that suffice for the optimality of consecutive partitions have been studied extensively in the literature. Hwang and Rothblum [3] introduced a class of functions called asymmetric Schur convex functions, unifying classical (quasi) convexity and Schur convexity; asymmetric Schur convexity was shown in Gao, Hwang, Li, and Rothblum [1] to be sufficient for optimality of consecutive partitions, generalizing many earlier results.

∗ Received by the editors November 12, 1998; accepted for publication (in revised form) December 1, 2003; published electronically February 25, 2005.
http://www.siam.org/journals/sidma/18-3/34716.html

† Department of Applied Mathematics, Chiaotung University, Hsinchu, 30045 Taiwan, Republic of China (fhwang@math.nctu.edu.tw).

‡ Faculty of Industrial Engineering and Management, Technion-Israel Institute of Technology, Haifa 32000, Israel (rothblum@ie.technion.ac.il). The research of this author was supported by a grant from the Israel Science Foundation and by the E. and J. Bishop Research Fund at Technion.

The goal of the current paper is to study bounded-shape partitioning problems where the function $f$ is Schur convex and the objective is to either maximize $F$ or to minimize it. We identify conditions that allow for explicit solution of such problems without the need to scan through all consecutive partitions. Under these conditions, optimality turns out to be invariant of the particular (Schur convex) function $f$. It follows that, depending on whether the objective function is to be maximized or minimized, the vector associated with an invariant optimal partition must majorize or be majorized by the vectors associated with all other feasible partitions (see section 2 for formal definitions). For bounded-shape maximization problems, we explicitly construct an invariant consecutive optimal partition when the ranking of the coordinates of the lower bounds on the part-sizes is consistent with that of the upper bounds and, in addition, the $\theta_i$'s have a uniform sign; further, we demonstrate that if either of these two conditions is dropped, an invariant optimal partition need not exist. For bounded-shape minimization problems, we explicitly construct an invariant solution when all the $\theta_i$'s are 1, that is, when the vector associated with a partition is the shape of the partition; further, we show via an example that this restriction cannot be relaxed. Our proof for minimization problems first identifies a vector which is majorized by all vectors that satisfy prescribed lower and upper bounds and have a prescribed coordinate-sum. We then show that when the bounds and the prescribed coordinate-sum are integers, the majorized vector can be rounded up/down to an integer vector that is majorized by all corresponding integer vectors. Results of Veinott [7] concern the construction of majorized vectors in a more general context of network flows, and his proofs depend on yet unpublished results in [8]. The proofs we derive herein are self-contained and simpler.

2. Preliminaries. Throughout, we let $n$ be a positive integer and $N \equiv \{1, \dots, n\}$. A partition (of $N$) is an ordered collection of sets $\pi = (\pi_1, \dots, \pi_p)$, where $\pi_1, \dots, \pi_p$ are disjoint nonempty subsets of $N$ whose union is $N$. In this case we refer to $p$ as the size of $\pi$ and to the sets $\pi_1, \dots, \pi_p$ as the parts of $\pi$. Also, if the numbers of elements in the parts of the partition $\pi = (\pi_1, \dots, \pi_p)$ are $n_1, \dots, n_p$, respectively, we refer to $(n_1, \dots, n_p)$ as the shape of $\pi$; of course, in this case $\sum_{j=1}^{p} n_j = |N| = n$. We sometimes refer to $p$-partitions or to $(n_1, \dots, n_p)$-partitions as partitions of size $p$ or of shape $(n_1, \dots, n_p)$, respectively. A partition is called consecutive if its parts consist of consecutive integers, that is, if there is an enumeration of its parts, say, $\pi_{j_1}, \dots, \pi_{j_p}$, such that for $t = 1, \dots, p$ and corresponding positive integers $n_{j_1}, \dots, n_{j_p}$,

\[ \pi_{j_t} = \Bigl\{ \sum_{s=1}^{t-1} n_{j_s} + 1, \dots, \sum_{s=1}^{t} n_{j_s} \Bigr\}. \]

We assume that each element $i$ in the given partitioned set $N$ is associated with a real number $\theta_i$ and, without loss of generality,

\[ \theta_1 \le \theta_2 \le \dots \le \theta_n. \tag{2.1} \]

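We denote by $\theta$ the vector $(\theta_1, \dots, \theta_n) \in \mathbb{R}^n$. Also, for a subset $S \subseteq \{1, \dots, n\}$ we define the $S$-summation scalar $\theta_S$ by $\theta_S \equiv \sum_{i \in S} \theta_i$. For a $p$-partition $\pi = (\pi_1, \dots, \pi_p)$ we define the $\pi$-summation-vector $\theta_\pi$ by $\theta_\pi \equiv (\theta_{\pi_1}, \dots, \theta_{\pi_p}) \in \mathbb{R}^p$.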
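The definitions of shape, consecutiveness, and the $\pi$-summation-vector can be illustrated with a minimal Python sketch. The function names and the sample data below are illustrative assumptions, not notation from the paper; elements are numbered from 1 as in the text, and the input is assumed to be a valid partition with nonempty parts.

```python
from typing import Sequence, Tuple


def shape(partition: Sequence[Sequence[int]]) -> Tuple[int, ...]:
    """Return (|pi_1|, ..., |pi_p|)."""
    return tuple(len(part) for part in partition)


def summation_vector(theta: Sequence[float],
                     partition: Sequence[Sequence[int]]) -> Tuple[float, ...]:
    """Return theta_pi; element i of N uses theta[i - 1] (1-based labels)."""
    return tuple(sum(theta[i - 1] for i in part) for part in partition)


def is_consecutive(partition: Sequence[Sequence[int]]) -> bool:
    """True if every (nonempty) part is a run of consecutive integers."""
    return all(sorted(part) == list(range(min(part), max(part) + 1))
               for part in partition)


# Hypothetical data: theta is sorted as in (2.1).
theta = [-2.0, -1.0, 0.5, 3.0, 4.0]
pi = [[1, 2], [3], [4, 5]]
print(shape(pi), summation_vector(theta, pi), is_consecutive(pi))
```

A sum-partitioning objective is then simply `f(summation_vector(theta, pi))` for whatever function `f` is chosen.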

Throughout this paper we let $p$ be a fixed positive integer. Given a real-valued function $F$ over a set $\Pi$ of $p$-partitions, we consider the problem of maximizing $F$ over $\Pi$. The problem is called sum-partitioning if there is a function $f : \mathbb{R}^p \to \mathbb{R}$ such that

\[ F(\pi) = f(\theta_\pi) \quad \text{for each } p\text{-partition } \pi. \tag{2.2} \]

We refer to single-shape, bounded-shape, and constrained-shape problems as partitioning problems with $\Pi$ as the set of partitions with a prescribed shape, with a shape that satisfies prescribed lower and upper bounds, and with a shape in a prescribed set, respectively. For constrained-shape problems the set of partitions is defined through a set $\Gamma$ of positive integer $p$-vectors with coordinate-sum $n$. For bounded-shape problems, $\Gamma$ is defined by two positive integer $p$-vectors $L$ and $U$ satisfying $L \le U$ and $\sum_{j=1}^{p} L_j \le |N| \le \sum_{j=1}^{p} U_j$; we then write $\Gamma^{(L,U)}$ for $\Gamma$ and $\Pi^{(L,U)}$ for the corresponding set of partitions. Finally, for single-shape problems, $\Gamma$ is defined by a single positive integer $p$-vector $(n_1, \dots, n_p)$ satisfying $\sum_{j=1}^{p} n_j = |N|$; we then write $\Gamma^{(n_1, \dots, n_p)}$ for $\Gamma$ and $\Pi^{(n_1, \dots, n_p)}$ for the corresponding set of partitions.

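For a vector $x \in \mathbb{R}^n$ and $k = 1, \dots, n$, let $x_{[k]}$ be the $k$th largest coordinate of $x$. We say that a vector $a \in \mathbb{R}^p$ majorizes a vector $b \in \mathbb{R}^p$, written $a \succ b$, if

\[ \sum_{i=1}^{k} a_{[i]} \ge \sum_{i=1}^{k} b_{[i]} \quad \text{for all } k = 1, \dots, p \tag{2.3} \]

and

\[ \sum_{i=1}^{p} a_{[i]} = \sum_{i=1}^{p} b_{[i]}; \tag{2.4} \]

we note that (2.3) and (2.4) are, respectively, equivalent to

\[ \max_{|I|=k} \sum_{i \in I} a_i \ge \max_{|I|=k} \sum_{i \in I} b_i \quad \text{for all } k = 1, \dots, p \tag{2.3$'$} \]

and

\[ \sum_{i=1}^{p} a_i = \sum_{i=1}^{p} b_i. \tag{2.4$'$} \]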

We say that $a$ strictly majorizes $b$ if $a$ majorizes $b$ but $b$ does not majorize $a$.

A real-valued function $f$ on a subset $B$ of $\mathbb{R}^p$ is called Schur convex if $f(a) \ge f(b)$ for all $a, b \in B$ satisfying $a \succ b$, that is, if $f$ is order-preserving with respect to the partial order majorization. The function $f$ is called strictly Schur convex if it is Schur convex and $f(a) > f(b)$ for all $a, b \in B$ for which $a$ strictly majorizes $b$. For example, a real-valued function $f$ on $\mathbb{R}^p$ with $f(x) = \sum_{j=1}^{p} g(x_j)$, where $g$ is a (strictly) convex real-valued function on $\mathbb{R}$, is known to be (strictly) Schur convex (see [6]); such functions are called separable (strictly) Schur convex. We say that $f$ is (strictly) Schur concave if $-f$ is (strictly) Schur convex.
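The majorization test of (2.3)–(2.4) and the separable construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`majorizes`, `strictly_majorizes`, `separable`) and the tolerance parameter are not from the paper, and the sample vectors are arbitrary.

```python
from itertools import accumulate


def majorizes(a, b, tol=1e-12):
    """True if a majorizes b: prefix sums of the decreasingly sorted
    coordinates of a dominate those of b, with equal totals."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    pa = list(accumulate(sorted(a, reverse=True)))
    pb = list(accumulate(sorted(b, reverse=True)))
    if abs(pa[-1] - pb[-1]) > tol:
        return False
    return all(x >= y - tol for x, y in zip(pa, pb))


def strictly_majorizes(a, b):
    """a majorizes b, but b does not majorize a."""
    return majorizes(a, b) and not majorizes(b, a)


def separable(g):
    """Build f(x) = sum_j g(x_j); f is Schur convex when g is convex."""
    return lambda x: sum(g(xj) for xj in x)


f = separable(lambda t: t * t)      # g(t) = t^2 is strictly convex
a, b = (5, 2, 2), (3, 3, 3)
print(majorizes(a, b), f(a), f(b))  # a majorizes b, and f(a) >= f(b)
```

Majorization is only a partial comparison: for instance, neither of $(5, 2, 2)$ and $(1, 4, 4)$ majorizes the other, which the helper reports as `False` in both directions.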

We say that a $p$-vector $z$ is a majorizing vector in a finite set $\Lambda \subseteq \mathbb{R}^p$ if $z \in \Lambda$ and $z$ majorizes every vector in $\Lambda$; we say that $z$ is a minorizing vector in $\Lambda$ if $z \in \Lambda$ and $z$ is majorized by every vector in $\Lambda$. Since majorization is a partial order that does not provide comparisons for all pairs of vectors, majorizing and minorizing vectors need not exist.

For $j = 1, \dots, p-1$, let $f^{(j)}$ be the real-valued function on $\mathbb{R}^p$ with $f^{(j)}(x) = \max_{\{I \subseteq \{1, \dots, p\} : |I| = j\}} \sum_{u \in I} x_u$ for each $x \in \mathbb{R}^p$ (these functions are convex as the maximum of linear functions). The characterization of majorization through (2.3)–(2.4) shows that a finite set $\Lambda \subseteq \mathbb{R}^p$ contains a majorizing/minorizing vector if and only if the functions $f^{(1)}, \dots, f^{(p-1)}$ are simultaneously maximized/minimized over $\Lambda$ and, in addition, all vectors in $\Lambda$ have a common coordinate-sum.
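The characterization above suggests a direct search for a majorizing or minorizing vector in a small finite set: compute the extreme value of each $f^{(j)}$ over the set and look for a single vector attaining all of them. The sketch below is an illustration only; the function names are assumptions, exact (integer) coordinates are assumed so that equality tests are safe, and the sample set is an arbitrary collection of integer vectors with coordinate-sum 9.

```python
def top_sum(x, j):
    """f^(j)(x): the sum of the j largest coordinates of x."""
    return sum(sorted(x, reverse=True)[:j])


def extreme_vector(vectors, pick):
    """Return a vector attaining pick (max or min) of every f^(j),
    j = 1..p-1, or None if no such vector exists."""
    vectors = [tuple(v) for v in vectors]
    p = len(vectors[0])
    if len({sum(v) for v in vectors}) != 1:
        return None  # a common coordinate-sum is required
    targets = [pick(top_sum(v, j) for v in vectors) for j in range(1, p)]
    for v in vectors:
        if all(top_sum(v, j) == targets[j - 1] for j in range(1, p)):
            return v
    return None


def majorizing_vector(vectors):
    return extreme_vector(vectors, max)


def minorizing_vector(vectors):
    return extreme_vector(vectors, min)


sample = [(5, 2, 2), (4, 3, 2), (3, 3, 3), (1, 4, 4)]
print(majorizing_vector(sample), minorizing_vector(sample))
```

In this sample no vector maximizes both $f^{(1)}$ and $f^{(2)}$, so no majorizing vector exists, while $(3, 3, 3)$ is minorizing.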

3. Maximization problems with $f$ Schur convex. In this section we focus on maximization problems where the function $f$ is Schur convex.

Let Π be a set of partitions. We say that a partition π∗ is shape-majorizing in Π if π∗ ∈ Π and the shape of π∗ majorizes the shape of every other partition in Π; when Π is deﬁned as the set of partitions with its shape in a prescribed set Γ, π∗ is shape-majorizing if and only if its shape is a majorizing vector in Γ. The next result shows that if Γ has a majorizing vector, a shape-majorizing partition exists.

Proposition 3.1. Suppose $\Gamma$ is a set of positive integer $p$-vectors with coordinate-sum $n$ and $\Pi$ is the set of partitions with shape in $\Gamma$. If $(n_1, \dots, n_p)$ is a majorizing vector in $\Gamma$, then there exists a consecutive shape-majorizing partition in $\Pi$.

Proof. The conclusion of the proposition follows from the existence of consecutive partitions with any prescribed shape (in fact, the consecutive partitions with prescribed shape are in one-to-one correspondence with the permutations over $\{1, \dots, p\}$).

We say that θ is sign-uniform if it is either nonpositive or nonnegative. The next result shows that this condition together with the assumptions of Proposition 3.1 facilitate a uniform solution for sum-partitioning problems under all Schur convex functions f . This is accomplished by ﬁrst determining a majorizing shape and then assigning the elements to parts greedily (where greedily has diﬀerent meanings for the case where θ≤ 0 and for the case where θ ≥ 0).

Theorem 3.2. Suppose $f$ is Schur convex, $\Gamma$ is a set of positive integer $p$-vectors with coordinate-sum $n$, $(n_1, \dots, n_p)$ is a majorizing vector in $\Gamma$ with $n_1 \le \dots \le n_p$, and $\Pi$ is the (constrained-shape) set of partitions with shape in $\Gamma$.

(i) If $\theta \le 0$, then the (consecutive) $p$-partition $\pi^-$ with $\pi^-_j = \bigl\{ n - \sum_{u=1}^{j} n_u + 1, \dots, n - \sum_{u=1}^{j-1} n_u \bigr\}$ for $j = 1, \dots, p$ is in $\Pi$ and maximizes $F(\cdot)$ over $\Pi$.

(ii) If $\theta \ge 0$, then the (consecutive) $p$-partition $\pi^+$ with $\pi^+_j = \bigl\{ \sum_{u=1}^{j-1} n_u + 1, \dots, \sum_{u=1}^{j} n_u \bigr\}$ for $j = 1, \dots, p$ is in $\Pi$ and maximizes $F(\cdot)$ over $\Pi$.

Further, if $f$ is strictly Schur convex, the inequalities of (2.1) hold strictly, and the $\theta_i$'s are nonzero, then $\pi^-$ and $\pi^+$ are, respectively, the only optimal partitions.

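Proof. We first consider the case where $\theta \ge 0$. Since the shape of $\pi^+$ is $(n_1, \dots, n_p) \in \Gamma$, $\pi^+$ is shape-majorizing in $\Pi$. Also, from $n_1 \le \dots \le n_p$ we have that $|\pi^+_1| \le \dots \le |\pi^+_p|$. These properties of $\pi^+$ ensure that for each $\pi \in \Pi$, $j \in \{1, \dots, p\}$ and enumeration $u_1, \dots, u_p$ of the elements $1, \dots, p$,

\[ \sum_{s=1}^{j} |\pi_{u_s}| \le \max_{\{I \subseteq \{1, \dots, p\} : |I| = j\}} \sum_{u \in I} |\pi_u| \le \max_{\{I \subseteq \{1, \dots, p\} : |I| = j\}} \sum_{u \in I} |\pi^+_u| = \sum_{u=p-j+1}^{p} |\pi^+_u| = \sum_{u=p-j+1}^{p} n_u. \tag{3.1} \]

We conclude from (3.1), (2.1), the nonnegativity of the $\theta_i$'s, and the definition of $\pi^+$ that

\[ \sum_{s=1}^{j} (\theta_\pi)_{u_s} = \sum_{i \in \pi_{u_1} \cup \dots \cup \pi_{u_j}} \theta_i \le \sum_{i = n_1 + \dots + n_{p-j} + 1}^{n} \theta_i = \sum_{u=p-j+1}^{p} (\theta_{\pi^+})_u, \tag{3.2} \]

with equality holding when $j = p$. Since $\pi^+$ is in $\Pi$, it also satisfies (3.2). Applying (3.2) to $\pi^+$ and to $\pi$, we conclude that

\[ \max_{\{I \subseteq \{1, \dots, p\} : |I| = j\}} \sum_{u \in I} (\theta_\pi)_u \le \sum_{i = n_1 + \dots + n_{p-j} + 1}^{n} \theta_i = \max_{\{I \subseteq \{1, \dots, p\} : |I| = j\}} \sum_{u \in I} (\theta_{\pi^+})_u \tag{3.3} \]

with equality holding when $j = p$. Thus, $\theta_{\pi^+}$ majorizes $\theta_\pi$ and, therefore, the Schur convexity of $f$ implies that $F(\pi^+) = f(\theta_{\pi^+}) \ge f(\theta_\pi) = F(\pi)$.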

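Next, assume that $\theta \le 0$. Since the shape of $\pi^-$ is $(n_1, \dots, n_p) \in \Gamma$, $\pi^-$ is also shape-majorizing in $\Pi$. Also, from $n_1 \le \dots \le n_p$ we have that $|\pi^-_1| \le \dots \le |\pi^-_p|$. These properties of $\pi^-$ ensure that for each $\pi \in \Pi$, $j \in \{1, \dots, p\}$ and enumeration $u_1, \dots, u_p$ of the elements $1, \dots, p$,

\[ \sum_{s=j+1}^{p} |\pi_{u_s}| \le \max_{\{I \subseteq \{1, \dots, p\} : |I| = p-j\}} \sum_{u \in I} |\pi_u| \le \max_{\{I \subseteq \{1, \dots, p\} : |I| = p-j\}} \sum_{u \in I} |\pi^-_u| = \sum_{u=j+1}^{p} |\pi^-_u| = \sum_{u=j+1}^{p} n_u, \tag{3.4} \]

and, therefore,

\[ \sum_{s=1}^{j} |\pi_{u_s}| = n - \sum_{s=j+1}^{p} |\pi_{u_s}| \ge n - \sum_{u=j+1}^{p} |\pi^-_u| = \sum_{u=1}^{j} n_u. \tag{3.5} \]

From (2.1), (3.5), the nonpositivity of the $\theta_i$'s, and the definition of $\pi^-$, we see that

\[ \sum_{s=1}^{j} (\theta_\pi)_{u_s} = \sum_{i \in \pi_{u_1} \cup \dots \cup \pi_{u_j}} \theta_i \le \sum_{i = n - (n_1 + \dots + n_j) + 1}^{n} \theta_i = \sum_{u=1}^{j} (\theta_{\pi^-})_u, \tag{3.6} \]

with equality holding when $j = p$. Since $\pi^-$ is in $\Pi^{(n_1, \dots, n_p)}$, it also satisfies (3.6). Applying (3.6) to $\pi^-$ and to $\pi$, we conclude that

\[ \max_{\{I \subseteq \{1, \dots, p\} : |I| = j\}} \sum_{u \in I} (\theta_\pi)_u \le \sum_{i = n - (n_1 + \dots + n_j) + 1}^{n} \theta_i = \max_{\{I \subseteq \{1, \dots, p\} : |I| = j\}} \sum_{u \in I} (\theta_{\pi^-})_u \tag{3.7} \]

with equality holding when $j = p$. Thus, $\theta_{\pi^-}$ majorizes $\theta_\pi$ and, therefore, the Schur convexity of $f$ implies that $F(\pi^-) = f(\theta_{\pi^-}) \ge f(\theta_\pi) = F(\pi)$.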

Finally, if the inequalities of (2.1) hold strictly and the $\theta_i$'s are nonzero, then for each $\pi \ne \pi^+$, (3.1) implies that (3.2) holds as a strict inequality for at least one $j$; thus, $\theta_{\pi^+}$ strictly majorizes $\theta_\pi$. Consequently, if $f$ is strictly Schur convex, we have that $F(\pi^+) = f(\theta_{\pi^+}) > f(\theta_\pi) = F(\pi)$. A similar argument shows that if the inequalities of (2.1) hold strictly, the $\theta_i$'s are nonzero, and $f$ is strictly Schur convex, then $F(\pi^-) = f(\theta_{\pi^-}) > f(\theta_\pi) = F(\pi)$.

Solution of constrained-shape partitioning problems with $f$ Schur convex, sign-uniform $\theta$, and given majorizing shape. Let $\Gamma$ be a set of positive integer $p$-vectors with coordinate-sum $n$ and let $(n_1, \dots, n_p)$ be a majorizing vector in $\Gamma$ with $n_1 \le \dots \le n_p$. Also, assume that $\theta_1, \dots, \theta_n$ are given and satisfy (2.1). Of course, if the $\theta_i$'s and/or the $n_u$'s are not ranked a priori, one can sort them and renumber indices in time $O[n(\lg n)]$ and/or $O[p(\lg p)]$, respectively. Once the indices are renumbered, Theorem 3.2 provides an explicit solution of the partitioning problem when either $\theta \ge 0$ or $\theta \le 0$; only the partial sums of the $n_j$'s are needed, and these can be determined with $p$ additions and the associated vector can be determined with, at most, $n$ additions.
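The construction of Theorem 3.2 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `consecutive_maximizer` is an assumption, the input `theta` is assumed to be already sorted as in (2.1), and `shape` is assumed to be a majorizing shape given in nondecreasing order (the code does not verify that it is majorizing in any particular $\Gamma$).

```python
def consecutive_maximizer(theta, shape):
    """Return pi^+ (if theta >= 0) or pi^- (if theta <= 0) of Theorem 3.2
    as a list of parts with 1-based element labels."""
    n = len(theta)
    if sum(shape) != n:
        raise ValueError("shape must have coordinate-sum n")
    if any(theta[i] > theta[i + 1] for i in range(n - 1)):
        raise ValueError("theta must satisfy (2.1)")
    if list(shape) != sorted(shape):
        raise ValueError("shape must be nondecreasing")

    if all(t >= 0 for t in theta):
        # pi^+_j = {n_1+...+n_{j-1}+1, ..., n_1+...+n_j}
        parts, start = [], 0
        for size in shape:
            parts.append(list(range(start + 1, start + size + 1)))
            start += size
        return parts
    if all(t <= 0 for t in theta):
        # pi^-_j = {n-(n_1+...+n_j)+1, ..., n-(n_1+...+n_{j-1})}
        parts, end = [], n
        for size in shape:
            parts.append(list(range(end - size + 1, end + 1)))
            end -= size
        return parts
    raise ValueError("theta must be sign-uniform")


pi_plus = consecutive_maximizer([1, 2, 3, 4, 5, 6], (1, 2, 3))
pi_minus = consecutive_maximizer([-6, -5, -4, -3, -2, -1], (1, 2, 3))
print(pi_plus, pi_minus)
```

In both cases the largest part collects the elements of largest absolute value, which is the "greedy" assignment referred to before Theorem 3.2; apart from the validation of (2.1), only running partial sums of the part sizes are used.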

Next we explain how the "expensive" sorting of the $\theta_i$'s can be reduced. Suppose a sorting of $n_1, \dots, n_p$ is executed if needed (requiring $O[p(\lg p)]$ comparisons), and an index-enumeration $j_1, \dots, j_p$ satisfying $n_{j_1} \le n_{j_2} \le \dots \le n_{j_p}$ becomes available. It is then not necessary to fully sort $\theta_1, \dots, \theta_n$ in order to determine the optimal partition; all that is needed is to determine the set of $n_{j_1}$-smallest coordinates of $\theta$, the next $n_{j_2}$-smallest coordinates, and so on. This block-sorting can be executed with $O(pn)$ comparisons [5], yielding an improved complexity bound of $O(pn)$. If the data is given with (2.1) in force, Theorem 3.2 provides an explicit solution of the partitioning problem requiring only the sorting of $n_1, \dots, n_p$; so, in this case the problem is solvable in time $O[p(\lg p)]$.

Theorem 3.2 yields an explicit solution to partitioning problems when a majorizing shape within the set of allowable shapes $\Gamma$ is available. Such a shape is trivially available when $\Gamma$ contains a single shape, e.g., when either $\sum_{j=1}^{p} L_j = n$ or $\sum_{j=1}^{p} U_j = n$. Next we obtain a sufficient condition for the existence of a majorizing shape in nondegenerate bounded-shape problems; further, under this condition the majorizing shape is easily computable.

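Lemma 3.3. Let $L$ and $U$ be positive integer $p$-vectors satisfying $L \le U$ and $\sum_{j=1}^{p} L_j < n < \sum_{j=1}^{p} U_j$. Then there exists an index $j \in \{1, \dots, p\}$ with $\sum_{u=1}^{j} L_u + \sum_{u=j+1}^{p} U_u = \sum_{u=1}^{p} U_u - \sum_{u=1}^{j} (U_u - L_u) \le n$; further, if $j^*$ is the first such index and $\mu^* \equiv n - \sum_{u=1}^{j^*-1} L_u - \sum_{u=j^*+1}^{p} U_u$, then $(n^*_1, \dots, n^*_p) \equiv (L_1, \dots, L_{j^*-1}, \mu^*, U_{j^*+1}, \dots, U_p) \in \Gamma^{(L,U)}$, and

\[ \sum_{u=1}^{k} n^*_u = \max\Bigl\{ \sum_{u=1}^{k} L_u,\; n - \sum_{u=k+1}^{p} U_u \Bigr\} \quad \text{for } k = 1, \dots, p. \tag{3.8} \]

Moreover, if

\[ L_1 \le L_2 \le \dots \le L_p \tag{3.9} \]

and

\[ U_1 \le U_2 \le \dots \le U_p, \tag{3.10} \]

then $n^*_1 \le \dots \le n^*_p$ and $(n^*_1, \dots, n^*_p)$ majorizes every vector in $\Gamma^{(L,U)}$.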

Proof. The existence of an index $j \in \{1, \dots, p\}$ with $\sum_{u=1}^{j} L_u + \sum_{u=j+1}^{p} U_u = \sum_{u=1}^{p} U_u - \sum_{u=1}^{j} (U_u - L_u) \le n$ is immediate from the fact that $\sum_{u=1}^{p} U_u > n$ and $\sum_{u=1}^{p} U_u - \sum_{u=1}^{p} (U_u - L_u) = \sum_{u=1}^{p} L_u < n$. With $j^*$ as the first such index and with the definition of $\mu^*$ and $(n^*_1, \dots, n^*_p)$ as in the statement of the lemma, we clearly have that $L_{j^*} \le \mu^* < U_{j^*}$ and $(n^*_1, \dots, n^*_p) \in \Gamma^{(L,U)}$. Also, from the definition of $j^*$ and the $n^*_j$'s we have that

\[ \sum_{u=1}^{k} n^*_u = \begin{cases} \sum_{u=1}^{k} L_u > n - \sum_{u=k+1}^{p} U_u & \text{if } k < j^*, \\[4pt] n - \sum_{u=k+1}^{p} U_u \ge \sum_{u=1}^{k} L_u & \text{if } k \ge j^*. \end{cases} \]

In either case, (3.8) holds.

Next, assume that (3.9) and (3.10) hold. To verify that the coordinates of $(n^*_1, \dots, n^*_p)$ are nondecreasing, observe that if $t < j^*$, we have $n^*_t = L_t \le L_{t+1} \le n^*_{t+1}$, and if $t \ge j^*$, we have $n^*_t \le U_t \le U_{t+1} = n^*_{t+1}$. Next, let $I$ be a subset of $\{1, \dots, p\}$ and let $(n_1, \dots, n_p)$ be a vector in $\Gamma^{(L,U)}$. The complement of $I$ within $\{1, \dots, p\}$ will be denoted $I^c$. Since $\sum_{u \in I} n_u \le \sum_{u \in I} U_u$ and $n - \sum_{u \in I} n_u = \sum_{u \in I^c} n_u \ge \sum_{u \in I^c} L_u$, we have that

\[ \sum_{u \in I} n_u \le \min\Bigl\{ n - \sum_{u \in I^c} L_u,\; \sum_{u \in I} U_u \Bigr\} \le \min\Bigl\{ n - \sum_{u=1}^{p-|I|} L_u,\; \sum_{u=p-|I|+1}^{p} U_u \Bigr\}, \tag{3.11} \]

where (3.9)–(3.10) are used for the second inequality in (3.11). Also, for each $j = 1, \dots, p-1$, we get from (3.8) (with $k = p - j$) that

\[ \sum_{u=p-j+1}^{p} n^*_u = n - \sum_{u=1}^{p-j} n^*_u = n - \max\Bigl\{ \sum_{u=1}^{p-j} L_u,\; n - \sum_{u=p-j+1}^{p} U_u \Bigr\} = \min\Bigl\{ n - \sum_{u=1}^{p-j} L_u,\; \sum_{u=p-j+1}^{p} U_u \Bigr\}. \tag{3.12} \]

Since $(n^*_1, \dots, n^*_p) \in \Gamma^{(L,U)}$, (3.11) applies to $(n^*_1, \dots, n^*_p)$. It follows from (3.11) applied to $(n_1, \dots, n_p)$ and to $(n^*_1, \dots, n^*_p)$ and from (3.12) that, for $j = 1, \dots, p-1$,

\[ \max_{\{I \subseteq \{1, \dots, p\} : |I| = j\}} \sum_{u \in I} n_u \le \min\Bigl\{ n - \sum_{u=1}^{p-j} L_u,\; \sum_{u=p-j+1}^{p} U_u \Bigr\} = \sum_{u=p-j+1}^{p} n^*_u = \max_{\{I \subseteq \{1, \dots, p\} : |I| = j\}} \sum_{u \in I} n^*_u, \]

verifying that $(n^*_1, \dots, n^*_p)$ majorizes $(n_1, \dots, n_p)$.
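The shape $(n^*_1, \dots, n^*_p)$ of Lemma 3.3 can be computed in a single pass over the bounds. The sketch below is illustrative only: the function names and the sample bounds are assumptions, and the brute-force comparison at the end is a sanity check on one small instance, not a proof.

```python
from itertools import product


def majorizing_shape(L, U, n):
    """Return (L_1, ..., L_{j*-1}, mu*, U_{j*+1}, ..., U_p) from Lemma 3.3."""
    p = len(L)
    if len(U) != p or any(l > u for l, u in zip(L, U)) or not sum(L) < n < sum(U):
        raise ValueError("need L <= U and sum(L) < n < sum(U)")
    running = sum(U)
    for j in range(p):            # 0-based j corresponds to index j+1 in the lemma
        running -= U[j] - L[j]    # now L_1+...+L_{j+1} + U_{j+2}+...+U_p
        if running <= n:          # first such index is j*
            mu = n - (running - L[j])
            return tuple(L[:j]) + (mu,) + tuple(U[j + 1:])
    raise AssertionError("unreachable because sum(L) < n")


def majorizes(a, b):
    sa, sb = sorted(a, reverse=True), sorted(b, reverse=True)
    return sum(sa) == sum(sb) and all(
        sum(sa[:k]) >= sum(sb[:k]) for k in range(1, len(sa)))


# Hypothetical bounds satisfying (3.9) and (3.10).
L, U, n = (1, 2, 2), (3, 4, 5), 9
n_star = majorizing_shape(L, U, n)
gamma = [s for s in product(*(range(l, u + 1) for l, u in zip(L, U)))
         if sum(s) == n]
print(n_star, all(majorizes(n_star, s) for s in gamma))
```

For these bounds the first index with $\sum_{u \le j} L_u + \sum_{u > j} U_u \le n$ is $j^* = 2$, giving $\mu^* = 3$ and the shape $(1, 3, 5)$.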

Next we state two immediate conclusions from Theorem 3.2 and Lemma 3.3.

Theorem 3.4. Suppose $f$ is Schur convex and $L$ and $U$ are positive integer $p$-vectors satisfying $L \le U$, $\sum_{j=1}^{p} L_j < n < \sum_{j=1}^{p} U_j$, (3.9), and (3.10). Let $(n^*_1, \dots, n^*_p)$ be as in Lemma 3.3.

(i) If $\theta \le 0$, then the (consecutive) $p$-partition $\pi^-$ with $\pi^-_j = \bigl\{ n - \sum_{u=1}^{j} n^*_u + 1, \dots, n - \sum_{u=1}^{j-1} n^*_u \bigr\}$ for $j = 1, \dots, p$ is in $\Pi^{(L,U)}$ and maximizes $F(\cdot)$ over $\Pi^{(L,U)}$.

(ii) If $\theta \ge 0$, then the (consecutive) $p$-partition $\pi^+$ with $\pi^+_j = \bigl\{ \sum_{u=1}^{j-1} n^*_u + 1, \dots, \sum_{u=1}^{j} n^*_u \bigr\}$ for $j = 1, \dots, p$ is in $\Pi^{(L,U)}$ and maximizes $F(\cdot)$ over $\Pi^{(L,U)}$.

Further, if $f$ is strictly Schur convex, the inequalities of (2.1) hold strictly, and the $\theta_i$'s are nonzero, then $\pi^-$ and $\pi^+$ are, respectively, the only optimal partitions.

Under the assumptions of Theorem 3.4, the solution method discussed following Theorem 3.2 applies; further, Lemma 3.3 shows that the computation of the majorizing-shape vector $(n^*_1, \dots, n^*_p)$ is available with $O(p)$ arithmetic operations.

We say that two vectors, $L$ and $U$, in $\mathbb{R}^p$ are consistent if there exists a permutation $(u_1, \dots, u_p)$ of $(1, \dots, p)$ such that the vectors $(L_{u_1}, \dots, L_{u_p})$ and $(U_{u_1}, \dots, U_{u_p})$ satisfy (3.9)–(3.10). Theorem 3.4 implies that when $f$ is Schur convex, $L$ and $U$ are consistent positive integer $p$-vectors satisfying $L \le U$ and $\sum_{j=1}^{p} L_j < n < \sum_{j=1}^{p} U_j$, and $\theta$ is sign-uniform, there exists a majorizing vector in $\Gamma^{(L,U)}$ and a (consecutive, shape-majorizing) partition in $\Pi^{(L,U)}$ which is optimal uniformly under all Schur convex functions $f$. Further, such a partition is easily computable by first (jointly) sorting the $L_u$'s and $U_u$'s and then selecting, according to the sign of $\theta$, one of the two partitions constructed in Theorem 3.2.

Two important cases for which the assumptions of Lemma 3.3 and Theorem 3.4 apply are as follows:

(i) single-shape problem, where the coordinates of a single prescribed shape, say, (n1, . . . , np), can be ranked and permuted to satisfy the monotonicity assumption (3.9)–(3.10) with L = U = (n1, . . . , np), and

(ii) uniform bounded-shape problem, where the $L_u$'s and $U_u$'s are, respectively, independent of $u$.

The next two examples demonstrate, respectively, that neither the consistency of $L$ and $U$ nor the sign-uniformity of $\theta$ can be removed from the above consequence of Theorem 3.4.

Example I. Suppose $p = 3$, $n = 9$, $L_1 = 1$, $L_2 = L_3 = 2$, $U_1 = 5$, $U_2 = U_3 = 4$, and $\theta_i = 1$ for $i = 1, \dots, 9$. With $\Pi \equiv \Pi^{(L,U)}$, $\max_{\pi \in \Pi} \max_u (\theta_\pi)_u = 5$, and the maximum is realized by exactly the partitions with shape $(5, 2, 2)$. However, $\max_{\pi \in \Pi} \max_{u \ne v} [(\theta_\pi)_u + (\theta_\pi)_v] = 8$, and the maximum is realized by exactly the partitions with shape $(1, 4, 4)$. Thus, there is no shape-majorizing partition in $\Pi^{(L,U)}$. It is easily noted that $\Gamma^{(L,U)}$ does not have a vector which majorizes all other vectors in the set.

To see that no partition is optimal uniformly under all (separable) Schur convex functions $f$, let $f_1$ and $f_2$ be the (separable, strictly Schur convex) functions with $f_1(x) = \sum_{u=1}^{3} |x_u|^3$ and $f_2(x) = \sum_{u=1}^{3} |x_u - 4|^3$. The shapes in $\Gamma^{(L,U)}$ are $(5, 2, 2)$, $(4, 3, 2)$, $(4, 2, 3)$, $(3, 4, 2)$, $(3, 3, 3)$, $(3, 2, 4)$, $(2, 4, 3)$, $(2, 3, 4)$, and $(1, 4, 4)$; the values of these vectors under $(f_1, f_2)$ are, respectively, $(141, 17)$, $(99, 9)$, $(99, 9)$, $(99, 9)$, $(81, 3)$, $(99, 9)$, $(99, 9)$, $(99, 9)$, and $(129, 27)$. So, the optimal partitions with the objective defined by $f_1$ and $f_2$ are, respectively, those with shape $(5, 2, 2)$ and those with shape $(1, 4, 4)$.
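The numbers in Example I are small enough to recompute by enumeration. The following Python sketch lists the shapes in $\Gamma^{(L,U)}$ and evaluates $f_1$ and $f_2$ on them; since every $\theta_i$ equals 1, the vector associated with a partition is its shape. The variable names are illustrative assumptions.

```python
from itertools import product

L, U, n = (1, 2, 2), (5, 4, 4), 9

# All integer shapes within the bounds that have coordinate-sum n.
shapes = [s for s in product(*(range(l, u + 1) for l, u in zip(L, U)))
          if sum(s) == n]


def f1(x):
    return sum(abs(v) ** 3 for v in x)


def f2(x):
    return sum(abs(v - 4) ** 3 for v in x)


for s in shapes:
    print(s, f1(s), f2(s))

best_f1 = max(shapes, key=f1)   # maximizer of f1 over the shapes
best_f2 = max(shapes, key=f2)   # maximizer of f2 over the shapes
print(best_f1, best_f2)
```

The two maximizers differ, which is the point of the example: with inconsistent bounds, no single shape is optimal for every Schur convex objective.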

Example II. Suppose $p = 3$, $n = 6$, $n_j = j$ for $j = 1, 2, 3$, $\theta_i = -1$ for $i = 1, 2, 3$, and $\theta_i = 1$ for $i = 4, 5, 6$. With $\Pi \equiv \Pi^{(1,2,3)}$, $\max_{\pi \in \Pi} \max_u (\theta_\pi)_u = 3$, and the maximum is realized by the partitions with $\pi_3 = \{4, 5, 6\}$ and only by those. However, $\max_{\pi \in \Pi} \max_{u \ne v} [(\theta_\pi)_u + (\theta_\pi)_v] = 3$, and the maximum is realized by the partitions with $\pi_3 = \{1, 2, 3\}$ and only by those. Thus, there is no partition $\pi$ in $\Pi$ with $\theta_\pi$ majorizing each of the vectors associated with a partition in $\Pi$. To see that no partition is optimal uniformly under all Schur convex functions $f$, let $f_1$ and $f_2$ be the (separable, strictly Schur convex) functions with $f_1(x) = \sum_{u=1}^{3} |x_u + 3|^3$ and $f_2(x) = \sum_{u=1}^{3} |x_u - 3|^3$; the optimal partitions with $f_1$ and $f_2$ are, respectively, precisely the partitions $\pi$ with $\pi_3 = \{4, 5, 6\}$ and those with $\pi_3 = \{1, 2, 3\}$.
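Example II can likewise be checked by brute force over all 60 partitions of shape $(1, 2, 3)$. The sketch below is an illustration; the generator `partitions_with_shape` and the other names are assumptions introduced here, not part of the paper.

```python
from itertools import combinations

theta = {1: -1, 2: -1, 3: -1, 4: 1, 5: 1, 6: 1}


def partitions_with_shape(elements, sizes):
    """Yield every ordered partition of `elements` whose parts have the
    given sizes, each part as a sorted tuple."""
    if not sizes:
        yield ()
        return
    for part in combinations(elements, sizes[0]):
        chosen = set(part)
        rest = [e for e in elements if e not in chosen]
        for tail in partitions_with_shape(rest, sizes[1:]):
            yield (part,) + tail


def theta_pi(pi):
    return tuple(sum(theta[i] for i in part) for part in pi)


def f1(x):
    return sum(abs(v + 3) ** 3 for v in x)


def f2(x):
    return sum(abs(v - 3) ** 3 for v in x)


parts = list(partitions_with_shape([1, 2, 3, 4, 5, 6], (1, 2, 3)))


def maximizers(f):
    best = max(f(theta_pi(pi)) for pi in parts)
    return [pi for pi in parts if f(theta_pi(pi)) == best]


opt1, opt2 = maximizers(f1), maximizers(f2)
print(opt1)
print(opt2)
```

The maximizers of $f_1$ all have third part $\{4, 5, 6\}$ and those of $f_2$ all have third part $\{1, 2, 3\}$, so no partition is optimal for both objectives.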

4. Minimization problems with $f$ Schur convex. In this section we focus on minimization problems where the function $f$ is Schur convex. The main result of this section can be derived from more general results of Veinott [7, Theorem 2, p. 554] which depend on (yet unpublished) results of [8]; the proofs provided herein are self-contained and more elementary.

Let $\Pi$ be a set of partitions. We say that a partition $\pi^*$ is shape-minorizing in $\Pi$ if $\pi^* \in \Pi$ and the shape of $\pi^*$ is majorized by the shape of every other partition in $\Pi$; when $\Pi$ is defined as the set of partitions with shape in a prescribed set $\Gamma$, $\pi^*$ is shape-minorizing if and only if its shape is a minorizing vector in $\Gamma$. The next result shows that if $\Gamma$ has a minorizing vector, a shape-minorizing partition exists.


Proposition 4.1. Suppose $\Gamma$ is a set of positive integer $p$-vectors with coordinate-sum $n$ and $\Pi$ is the set of partitions with shape in $\Gamma$. If $(n_1, \dots, n_p)$ is a minorizing vector in $\Gamma$, then there exists a consecutive shape-minorizing partition in $\Pi$.

Proof. As for Proposition 3.1, the conclusion follows from the existence of consecutive partitions with any prescribed shape.

The next result is in the spirit of Theorem 3.2 with minimization replacing maximization; it provides conditions for the existence of a uniform solution to constrained-shape partitioning problems under the assumptions of Proposition 4.1. But here, more restrictive conditions than sign-uniformity of $\theta$ are required.

Theorem 4.2. Suppose that $\theta_i = 1$ for each $i$ (that is, the objective function is a function of the shape of a partition). Then any shape-minorizing partition is optimal (minimizing) uniformly under all Schur convex functions $f$.

Proof. The assumptions of the theorem imply that for each partition π, θπ is the shape of π, and the conclusion of the theorem follows from the deﬁnition of Schur convexity.

The next example demonstrates that sign-uniformity of θ is not suﬃcient for the set of vectors associated with partitions having a prescribed shape to contain a minorizing vector, nor is it suﬃcient for the existence of a uniformly minimizing partition under all Schur convex functions. So, in general, the conclusions of Theorem 3.2 do not generalize when minorization replaces majorization. It is noted that the example concerns a single-shape problem.

Example III. Let $n = 11$, $p = 3$, $n_1 = 2$, $n_2 = 4$, $n_3 = 5$, $\theta_i = 1$ for $i = 1, 2, 3, 4$, $\theta_i = 2$ for $i = 5, 6, 7, 8$, and $\theta_i = 6$ for $i = 9, 10, 11$. Let $X$ be the set of positive integer 3-vectors with coordinate-sum 30. All vectors associated with feasible partitions are in $X$. Now, $x^1 \equiv (10, 10, 10)$ is majorized by all vectors in $X$ and $x^2 \equiv (11, 10, 9)$ is majorized by all vectors in $X$ except for $x^1$. But neither $x^1$ nor $x^2$ is realizable by a feasible partition because neither 9 nor 10 nor 11 is the sum of two elements among $\{1, 2, 6\}$. Next we observe that $x^3 = (11, 11, 8)$ and $x^4 = (12, 9, 9)$ are majorized by all vectors in $X \setminus \{x^1, x^2, x^3, x^4\}$, but neither majorizes the other. Representing parts of partitions by the multiset of the $\theta_i$'s, we observe that $(11, 11, 8)$ is realizable by the partition $\pi^3 = (\{5, 9\}, \{1, 6, 7, 10\}, \{2, 3, 4, 8, 11\})$ and $(12, 9, 9)$ is realizable by the partition $\pi^4 = (\{10, 11\}, \{1, 2, 3, 9\}, \{4, 5, 6, 7, 8\})$.

For real $t$, let $f_t : \mathbb{R}^3 \to \mathbb{R}$ be given by $f_t(x) = \sum_{j=1}^{3} |x_j - 10 - t|^3$ for each $x \in \mathbb{R}^3$. These functions are separable and strictly Schur convex; further, for all sufficiently small positive $t$, $f_t(x^3) > f_t(x^4)$, and the reverse inequality holds for all sufficiently large negative $t$. Since every vector in $X \setminus \{x^1, x^2, x^3, x^4\}$ majorizes either $x^3$ or $x^4$, the Schur convexity of the $f_t$'s implies that $\pi^4$ is optimal for all sufficiently small positive $t$, and $\pi^3$ is optimal for all sufficiently large negative $t$.
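The claims of Example III can be checked numerically by enumerating all partitions of shape $(2, 4, 5)$ and collecting the vectors $\theta_\pi$ they realize (up to reordering of coordinates). This is a sanity check only; the helper names and the particular test values $t = \pm 0.1$ are assumptions chosen for illustration.

```python
from itertools import combinations

theta = [1] * 4 + [2] * 4 + [6] * 3   # theta_1, ..., theta_11
sizes = (2, 4, 5)


def partitions_with_shape(elements, sizes):
    """Yield every ordered partition of `elements` with the given part sizes."""
    if not sizes:
        yield ()
        return
    for part in combinations(elements, sizes[0]):
        chosen = set(part)
        rest = [e for e in elements if e not in chosen]
        for tail in partitions_with_shape(rest, sizes[1:]):
            yield (part,) + tail


def majorizes(a, b):
    sa, sb = sorted(a, reverse=True), sorted(b, reverse=True)
    return sum(sa) == sum(sb) and all(
        sum(sa[:k]) >= sum(sb[:k]) for k in range(1, len(sa)))


# Realizable summation vectors, stored with coordinates in decreasing order
# (elements are indexed from 0 here, which does not affect the sums).
realizable = {
    tuple(sorted((sum(theta[i] for i in part) for part in pi), reverse=True))
    for pi in partitions_with_shape(range(11), sizes)
}


def f(t, x):
    return sum(abs(v - 10 - t) ** 3 for v in x)


best_pos = min(realizable, key=lambda x: f(0.1, x))
best_neg = min(realizable, key=lambda x: f(-0.1, x))
print(best_pos, best_neg)
```

With a small positive $t$ the minimizer is $x^4 = (12, 9, 9)$, and with a negative $t$ it is $x^3 = (11, 11, 8)$, in line with the discussion above.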

We next show that every set of bounded shapes contains a minorizing shape, without the restriction concerning the consistency of the lower bound and the upper bound. Of course, Example III demonstrates that shape-minorization does not yield uniform optimality as does shape-majorization with sign-uniform θ. Our ﬁrst step considers noninteger vectors.

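Theorem 4.3. Let $L$ and $U$ be $p$-vectors satisfying $L \le U$ and $\sum_{j=1}^{p} L_j < n < \sum_{j=1}^{p} U_j$. For every real $\beta > 0$ define $x(\beta)$ as the $p$-vector with

\[ x(\beta)_j \equiv \begin{cases} L_j & \text{if } \beta \le L_j, \\ \beta & \text{if } L_j < \beta < U_j, \\ U_j & \text{if } \beta \ge U_j. \end{cases} \tag{4.1} \]

Then $x(\cdot)$ is nondecreasing and continuous, and $\{x(\beta) \in \mathbb{R}^p : \sum_{j=1}^{p} x(\beta)_j = n\}$ contains a single vector, say, $x^*$, which is majorized by every vector in $\{x \in \mathbb{R}^p : L \le x \le U \text{ and } \sum_{j=1}^{p} x_j = n\}$.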

Proof. The fact that $x(\cdot)$ is nondecreasing and continuous is immediate from (4.1). Further, since $\sum_{j=1}^{p} x(\beta)_j = \sum_{j=1}^{p} L_j < n$ for $\beta \le \min_j L_j$, and $\sum_{j=1}^{p} x(\beta)_j = \sum_{j=1}^{p} U_j > n$ for $\beta \ge \max_j U_j$, continuity arguments assure that $\sum_{j=1}^{p} x(\beta)_j = n$ for some $\min_j L_j < \beta < \max_j U_j$. Since $x(\cdot)$ is nondecreasing, $\sum_{j=1}^{p} x(\beta)_j = \sum_{j=1}^{p} x(\beta')_j$ if and only if $x(\beta) = x(\beta')$. So, $\{x(\beta) \in \mathbb{R}^p : \sum_{j=1}^{p} x(\beta)_j = n\}$ contains a single element, say, $x^*$. We note that $\{\beta \in \mathbb{R} : x(\beta) = x^*\}$ is a nonempty closed interval, which can be nondegenerate only when $\{j = 1, \dots, p : L_j < x^*_j < U_j\} = \emptyset$.

Let $N_- \equiv \{j = 1, \dots, p : x^*_j = L_j\}$, $N_0 \equiv \{j = 1, \dots, p : L_j < x^*_j < U_j\}$, $N_+ \equiv \{j = 1, \dots, p : x^*_j = U_j > L_j\}$, $v_- \equiv |N_-|$, $v_0 \equiv |N_0|$, and $v_+ \equiv |N_+|$. Of course, $v_- + v_0 + v_+ = p$. Select $\beta^*$ such that $x(\beta^*) = x^*$ ($\beta^*$ is unique when $N_0 \ne \emptyset$). We then have that $x^*_j = L_j \ge \beta^*$ for $j \in N_-$, $x^*_j = \beta^*$ for $j \in N_0$, and $x^*_j = U_j \le \beta^*$ for $j \in N_+$. It follows that by possibly permuting indices, we can assume that $x^*$'s coordinates are nonincreasing, all elements in $N_-$ precede all elements in $N_0$, and all elements in $N_0$ precede all elements in $N_+$; in particular, $N_- = \{1, \dots, v_-\}$, $N_0 = \{v_- + 1, \dots, v_- + v_0\}$, and $N_+ = \{v_- + v_0 + 1, \dots, p\}$.

Let $X \equiv \{x \in \mathbb{R}^p : L \le x \le U \text{ and } \sum_{j=1}^{p} x_j = n\}$. Also, for $k = 1, \dots, p$, let $W_k \equiv \{w \in \mathbb{R}^p : 0 \le w \le \mathbf{1} \text{ and } \sum_{j=1}^{p} w_j = k\}$ (with $\mathbf{1}$ representing the vector $(1, \dots, 1)^T$ in $\mathbb{R}^p$), and let $h_k : X \to \mathbb{R}$ with $h_k(x)$ for $x$ in $X$ being the sum of the $k$ largest coordinates of $x$. We observe that the functions $h_k$ have representations

\[ h_k(x) = \sum_{u=1}^{k} x_{[u]} = \max_{|I| = k} \sum_{u \in I} x_u = \max_{w \in W_k} \sum_{u=1}^{p} w_u x_u = \max_{w \in W_k} w^T x. \tag{4.2} \]

The claim that x∗ ∈ X is majorized by all vectors x in X means that x∗ minimizes each hk over X. We consider three ranges for k.

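$1 \le k \le v_-$: In this case for each $x \in X$,

\[ h_k(x^*) = \sum_{u=1}^{k} x^*_{[u]} = \sum_{u=1}^{k} x^*_u = \sum_{u=1}^{k} L_u \le \sum_{u=1}^{k} x_u \le \sum_{u=1}^{k} x_{[u]} = h_k(x). \tag{4.3} \]

$p - v_+ \le k \le p$: In this case for each $x \in X$,

\[ h_k(x^*) = \sum_{u=1}^{k} x^*_{[u]} = \sum_{u=1}^{k} x^*_u = n - \sum_{u=k+1}^{p} x^*_u = n - \sum_{u=k+1}^{p} U_u \le \sum_{u=1}^{p} x_u - \sum_{u=k+1}^{p} x_u = \sum_{u=1}^{k} x_u \le \sum_{u=1}^{k} x_{[u]} = h_k(x). \tag{4.4} \]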

$v_- < k < p - v_+$: We will construct a vector $w^*$ in $W_k$ that satisfies

\[ w^T x^* \le (w^*)^T x^* \le (w^*)^T x \quad \text{for each } x \in X \text{ and } w \in W_k. \tag{4.5} \]

It will then follow from (4.2) that for every $x \in X$, $h_k(x^*) = \max_{w \in W_k} w^T x^* = (w^*)^T x^* \le (w^*)^T x \le h_k(x)$. (In fact, a variant of the classic minmax theorem of game theory ensures that the existence of such a vector $w^*$ is necessary and sufficient for $x^*$ to minimize $h_k$ over $X$.) Specifically, let $\omega \equiv (k - v_-)/v_0$, and let $w^*$ be the $p$-vector with

\[ w^*_u \equiv \begin{cases} 1 & \text{for } u = 1, \dots, v_-, \\ \omega & \text{for } u = v_- + 1, \dots, v_- + v_0, \\ 0 & \text{for } u = v_- + v_0 + 1, \dots, p. \end{cases} \tag{4.6} \]

Since $v_- < k < p - v_+ = v_- + v_0$, we have that $v_0 = p - v_- - v_+ > 0$ and $0 < \omega < 1$; in particular, $w^* \in W_k$.

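For $z \in \mathbb{R}^p$ and $j = 0, 1, \dots, p$, let $\bar{z}_j = \sum_{u=1}^{j} z_u$; in particular, $\bar{x}_p = n$ and $\bar{w}_p = k$ for each $x \in X$ and $w \in W_k$. Further,

\[ (w^*)^T x = \sum_{u=1}^{p} w^*_u x_u = \sum_{u=1}^{p} w^*_u (\bar{x}_u - \bar{x}_{u-1}) = \sum_{u=1}^{p-1} (w^*_u - w^*_{u+1}) \bar{x}_u + w^*_p n \quad \text{for each } x \in X \tag{4.7} \]

and

\[ w^T x^* = \sum_{u=1}^{p} w_u x^*_u = \sum_{u=1}^{p} (\bar{w}_u - \bar{w}_{u-1}) x^*_u = \sum_{u=1}^{p-1} \bar{w}_u (x^*_u - x^*_{u+1}) + k x^*_p \quad \text{for each } w \in W_k. \tag{4.8} \]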

Applying (4.7) to x∗ and to arbitrary x∈ X, we observe that (w∗)Tx∗− (w∗)Tx = p₋₁ u=1 (w∗u− w∗u+1)(¯x∗u− ¯xu) (4.9) = (1− ω)(¯x∗_v₋− ¯xv₋) + ω(¯x∗v−+v0− ¯xv−+v0)

(the cases where v₋ = 0 and/or v+ = 0 require special attention). From (4.3) with

k = v₋, we have that ¯x∗_v

− ≤ ¯xv−, and from (4.4) with k = v−+ v0= p− v+, we have that ¯x∗_v₋_+v₀ ≤ ¯xv₋+v0; since 0≤ ω ≤ 1, we conclude from (4.9) that (w∗)

T_x∗_{≤ w}∗T_x, establishing the right-hand side inequalities of (4.5). Next, by applying (4.8) to w∗ and to arbitrary w∈ Wk, we observe that

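\[
\begin{aligned}
(w^*)^T x^* - w^T x^* &= \sum_{u=1}^{p-1} (\bar w^*_u - \bar w_u)(x^*_u - x^*_{u+1}) \\
&= \sum_{u=1}^{v_-} (u - \bar w_u)(x^*_u - x^*_{u+1}) + \sum_{u=v_-+1}^{v_-+v_0-1} (\bar w^*_u - \bar w_u)(\beta^* - \beta^*) + \sum_{u=v_-+v_0}^{p-1} (k - \bar w_u)(x^*_u - x^*_{u+1})
\end{aligned} \tag{4.10}
\]

(here again, the cases where $v_- = 0$ and/or $v_+ = 0$ require special attention). The middle sum vanishes because $x^*_u = x^*_{u+1} = \beta^*$ for consecutive indices in $N_0$. Since $\bar w_u \le u$ and $\bar w_u \le k$ for each $w \in W_k$ and $u = 1, \dots, p$ and since $x^*_1 \ge x^*_2 \ge \cdots \ge x^*_p$, we conclude from (4.10) that $(w^*)^T x^* \ge w^T x^*$ for every $w \in W_k$, completing the proof of (4.5).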
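The construction of $w^*$ in (4.6) can be checked numerically on a small instance. The sketch below is illustrative only: the data `L`, `U`, `n` and the level `beta` are hypothetical, and it assumes that $x^*$ is obtained by clamping a common level $\beta^*$ to each interval $[L_u, U_u]$, as the sets $N_-$, $N_0$, $N_+$ suggest. The right-hand inequality of (4.5) is tested only on the integer points of $X$, not on all of $X$.

```python
from fractions import Fraction
from itertools import product

def h(k, x):
    """h_k(x): sum of the k largest coordinates of x, as in (4.2)."""
    return sum(sorted(x, reverse=True)[:k])

def dot(w, x):
    """Inner product w^T x."""
    return sum(wi * xi for wi, xi in zip(w, x))

# Hypothetical instance (not from the paper): p = 4 and n = 14.
L = [5, 1, 1, 0]
U = [7, 6, 6, 2]
n = 14
beta = Fraction(7, 2)  # assumed level beta*, chosen so that x* sums to n

# Assumed form of x*: beta* clamped to [L_u, U_u] in each coordinate.
# The instance is ordered so that N_- precedes N_0, which precedes N_+.
x_star = [min(max(beta, l), u) for l, u in zip(L, U)]
assert sum(x_star) == n

v_minus = sum(1 for l in L if l >= beta)   # |N_-|
v_plus = sum(1 for u in U if u <= beta)    # |N_+|
v_zero = len(L) - v_minus - v_plus         # |N_0|

def w_star(k):
    """The vector of (4.6), valid for v_- < k < p - v_+."""
    omega = Fraction(k - v_minus, v_zero)
    return [1] * v_minus + [omega] * v_zero + [0] * v_plus

# All integer points of X = {x : L <= x <= U, sum(x) = n}.
integer_points = [x for x in product(*(range(l, u + 1) for l, u in zip(L, U)))
                  if sum(x) == n]

k = 2  # the only k with v_- < k < p - v_+ in this instance
w = w_star(k)

# Left inequality of (4.5): by (4.2) it holds for all w in W_k exactly
# when (w*)^T x* attains the maximum h_k(x*).
left_is_tight = dot(w, x_star) == h(k, x_star)

# Right inequality of (4.5), checked on the integer points of X only.
right_holds = all(dot(w, x) >= dot(w, x_star) for x in integer_points)

print(w, left_is_tight, right_holds)
```

Exact rational arithmetic via `fractions.Fraction` is used so that the equality test in `left_is_tight` is not affected by rounding.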

In the next result, we use the notation $\|\cdot\|_\infty$ for the $\ell_\infty$ norm in $\mathbb{R}^p$, defined for $x \in \mathbb{R}^p$ by $\|x\|_\infty = \max_{u\in\{1,\dots,p\}} |x_u|$.


Theorem 4.4. Let $L$ and $U$ be positive integer $p$-vectors satisfying $L \le U$ and $\sum_{j=1}^{p} L_j < n < \sum_{j=1}^{p} U_j$, and let $x^*$ be as in Theorem 4.3. Then there exists an integer $p$-vector $z^*$ with $\|z^* - x^*\|_\infty < 1$, and each such vector is majorized by every integer vector in $\{x \in \mathbb{R}^p : L \le x \le U \text{ and } \sum_{j=1}^{p} x_j = n\}$.

Proof. The conclusion of this theorem is trivial when $x^*$ is integral, so assume that this is not the case. Let $N_-$, $N_0$, $N_+$, $v_-$, $v_0$, and $v_+$ be as in the proof of Theorem 4.3, and as in that proof assume that the coordinates of $x^*$ are nonincreasing, all elements in $N_-$ precede all elements in $N_0$, and all elements in $N_0$ precede all elements in $N_+$; in particular, $N_- = \{1, \dots, v_-\}$, $N_0 = \{v_- + 1, \dots, v_- + v_0\}$, and $N_+ = \{v_- + v_0 + 1, \dots, p\}$. The assertion that $x^*$ is not integral means that $N_0 \ne \emptyset$ and the unique $\beta^*$ with $x(\beta^*) = x^*$ is not integral.

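Let $X \equiv \{x \in \mathbb{R}^p : L \le x \le U \text{ and } \sum_{j=1}^{p} x_j = n\}$, let $\lfloor\beta^*\rfloor$ be the largest integer less than $\beta^*$, and let $\lceil\beta^*\rceil \equiv \lfloor\beta^*\rfloor + 1$. The integrality of $L$ and $U$ ensures that $L_u \le \lfloor\beta^*\rfloor < \beta^* < \lceil\beta^*\rceil \le U_u$ for $u \in N_0$. Further, we observe that $v_0\beta^* = n - \sum_{u\in N_-} L_u - \sum_{u\in N_+} U_u$ is an integer and $v_0\lfloor\beta^*\rfloor < v_0\beta^* < v_0\lceil\beta^*\rceil$, implying that $\mu \equiv v_0\beta^* - v_0\lfloor\beta^*\rfloor$ is an integer satisfying $1 \le \mu < v_0$ and

\[
\mu\lceil\beta^*\rceil + (v_0 - \mu)\lfloor\beta^*\rfloor = v_0\lfloor\beta^*\rfloor + \mu(\lceil\beta^*\rceil - \lfloor\beta^*\rfloor) = v_0\lfloor\beta^*\rfloor + \mu = v_0\beta^*.
\]

It follows that the $p$-vector $z^*$ given by

\[
z^*_u \equiv
\begin{cases}
x^*_u & \text{if } u \in N_- \cup N_+, \\
\lceil\beta^*\rceil & \text{if } u = v_- + 1, \dots, v_- + \mu, \\
\lfloor\beta^*\rfloor & \text{if } u = v_- + \mu + 1, \dots, v_- + v_0
\end{cases} \tag{4.11}
\]

is integral, is in $X$, and satisfies $\|z^* - x^*\|_\infty < 1$. We will show that $z^*$ is majorized by any integer vector $z$ in $X$ by showing that $h_k(z) \ge h_k(z^*)$ for $k = 1, \dots, p$, where $h_k(\cdot)$ is the function assigning to each $p$-vector the sum of its $k$ largest coordinates (see the proof of Theorem 4.3).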

Let $z$ be an integer vector in $X$. For $u \in N_-$, $L_u \ge \beta^*$, and the integrality of $L_u$ implies that $L_u \ge \lceil\beta^*\rceil$. Similarly, for $u \in N_+$, $U_u \le \beta^*$, and the integrality of $U_u$ implies that $U_u \le \lfloor\beta^*\rfloor$. Consequently, the coordinates of $z^*$ are nonincreasing and, therefore, $h_k(z^*) = \sum_{j=1}^{k} z^*_{[j]} = \sum_{j=1}^{k} z^*_j$ for $k = 1, \dots, p$. From Theorem 4.3, $h_k(z) \ge h_k(x^*) = h_k(z^*)$ for $1 \le k \le v_-$ and for $v_- + v_0 \le k \le p$. Further, as Theorem 4.3 ensures that $h_{v_-+1}(z) \ge h_{v_-+1}(x^*) = h_{v_-}(x^*) + \beta^*$, the integrality of $h_{v_-+1}(z)$ and $h_{v_-}(x^*)$ implies that $h_{v_-+1}(z) \ge h_{v_-}(x^*) + \lceil\beta^*\rceil = h_{v_-+1}(z^*)$. To prepare for an inductive argument, assume that $h_k(z) \ge h_k(z^*)$ and $h_{k+1}(z) < h_{k+1}(z^*)$ for some $v_- + 1 \le k < v_- + v_0 - 1$. Then $h_k(z^*) + z^*_{k+1} = h_{k+1}(z^*) > h_{k+1}(z) = h_k(z) + z_{[k+1]}$, implying that $z_{[k+1]} < h_k(z^*) + z^*_{k+1} - h_k(z) \le z^*_{k+1} \le \lceil\beta^*\rceil$. Since $z_{[k+1]}$ and $\lceil\beta^*\rceil$ are integral, we conclude that $z_{[k+1]} \le \lceil\beta^*\rceil - 1 = \lfloor\beta^*\rfloor$ and, therefore, $z_{[j]} \le \lfloor\beta^*\rfloor$ for $j = k + 2, \dots, v_- + v_0$ (recall that the coordinates $z_{[j]}$ are nonincreasing in $j$). It follows that

\[
\begin{aligned}
h_{v_-+v_0}(z) &= h_{k+1}(z) + \sum_{u=k+2}^{v_-+v_0} z_{[u]} < h_{k+1}(z^*) + (v_- + v_0 - k - 1)\lfloor\beta^*\rfloor \\
&= \sum_{u=1}^{k+1} z^*_u + (v_- + v_0 - k - 1)\lfloor\beta^*\rfloor \le \sum_{u=1}^{v_-+v_0} z^*_u = h_{v_-+v_0}(x^*).
\end{aligned}
\]

This inequality contradicts the conclusion of Theorem 4.3, asserting that $x^*$ is majorized by $z$, and thereby completes an inductive proof that $h_k(z) \ge h_k(z^*)$ for $k \in \{v_- + 1, \dots, v_- + v_0\}$.


We finally observe that an integer vector $z$ is in $X$ and satisfies $\|z - x^*\|_\infty < 1$ if and only if $z_u = x^*_u$ for $u \in N_- \cup N_+$ (as each such $x^*_u$ is integral), exactly $\mu$ of the $v_0$ coordinates $z_u$ indexed by $u \in N_0$ equal $\lceil\beta^*\rceil$, and the remaining $v_0 - \mu$ coordinates indexed by $u \in N_0$ equal $\lfloor\beta^*\rfloor$. It follows that each such $z$ is a coordinate permutation of $z^*$, implying that $h_k(z) = h_k(z^*)$ for each $k = 1, \dots, p$; in particular, such $z$, like $z^*$, is majorized by all integer vectors in $X$.
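The rounding step (4.11) can be sketched in a few lines. The code below is a minimal illustration rather than the authors' procedure: the function name `round_x_star` and the instance data are hypothetical, and it assumes that every nonintegral coordinate of $x^*$ equals the common level $\beta^*$, as in the proof. It rounds $\mu$ of those coordinates up and the rest down, then verifies majorization by brute force over the integer points of $X$.

```python
from fractions import Fraction
from itertools import product
import math

def h(k, x):
    """h_k(x): sum of the k largest coordinates of x."""
    return sum(sorted(x, reverse=True)[:k])

def round_x_star(x_star):
    """Build z* as in (4.11): keep integral coordinates of x*, and replace
    the fractional ones (all equal to beta*) by mu ceilings and v0 - mu floors."""
    frac = [u for u, v in enumerate(x_star) if Fraction(v).denominator != 1]
    z = list(x_star)
    if frac:
        beta = Fraction(x_star[frac[0]])
        # Assumption taken from the proof: fractional coordinates share beta*.
        assert all(x_star[u] == beta for u in frac)
        lo, hi = math.floor(beta), math.ceil(beta)
        mu = len(frac) * (beta - lo)  # v0*beta* - v0*floor(beta*)
        assert mu.denominator == 1, "v0 * beta* must be an integer"
        for rank, u in enumerate(frac):
            z[u] = hi if rank < mu else lo
    return [int(v) for v in z]

# Hypothetical instance (not from the paper): p = 4 and n = 14.
L = [5, 1, 1, 0]
U = [7, 6, 6, 2]
n = 14
x_star = [5, Fraction(7, 2), Fraction(7, 2), 2]  # assumed x* for this data

z_star = round_x_star(x_star)
distance = max(abs(a - b) for a, b in zip(z_star, x_star))  # ell-infinity distance

# Brute-force check that z* is majorized by every integer point of X.
integer_points = [z for z in product(*(range(l, u + 1) for l, u in zip(L, U)))
                  if sum(z) == n]
majorized = all(h(k, z) >= h(k, z_star)
                for z in integer_points
                for k in range(1, len(L) + 1))

print(z_star, distance, majorized)
```

The enumeration of integer points grows exponentially with $p$, so this check is only practical for small instances; the theorem itself is what guarantees the property in general.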

REFERENCES

[1] B. Gao, F. K. Hwang, W. W-C. Li, and U. G. Rothblum, Partition-polytopes over 1-dimensional points, Math. Program., 85 (1999), pp. 335–362.

[2] F. K. Hwang, S. Onn, and U. G. Rothblum, A polynomial time algorithm for shaped partition problems, SIAM J. Optim., 10 (1999), pp. 70–81.

[3] F. K. Hwang and U. G. Rothblum, Directional-quasi-convexity, asymmetric Schur-convexity and optimality of consecutive partitions, Math. Oper. Res., 21 (1996), pp. 540–554.

[4] F. K. Hwang and U. G. Rothblum, Partitions: Optimality and Clustering, World Scientific, to appear.

[5] D. Knuth, The Art of Computer Programming, 2nd ed., Addison-Wesley, Reading, MA, 1981.

[6] A. W. Marshall and I. Olkin, Inequalities: Theory of Majorization and Its Applications, Academic Press, New York, 1979.

[7] A. F. Veinott, Jr., Least d-majorized network flows with inventory and statistical applications, Management Sci., 17 (1971), pp. 547–567.

[8] A. F. Veinott, Jr., On d-majorization and d-Schur convexity, to appear.