
Academic year: 2021


Multiprocessor Energy-Efficient Scheduling with Task Migration Considerations

Jian-Jia Chen, Heng-Ruey Hsu, Kai-Hsiang Chuang, Chia-Lin Yang, Ai-Chun Pang, and Tei-Wei Kuo

Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan 106, ROC.

E-Mails: {r90079, b89108, b89109, yangc, acpang, ktw}@csie.ntu.edu.tw

Abstract

This paper targets energy-efficient scheduling of tasks over multiple processors, where tasks share a common deadline. Distinct from many research results on heuristics-based energy-efficient scheduling, we propose approximation algorithms with different approximation bounds for processors with/without constraints on the maximum processor speed, where no task migration is allowed. When there is no constraint on processor speeds, we propose an approximation algorithm for two-processor scheduling to provide trade-offs among the specified error, the running time, the approximation ratio, and the memory space complexity. An approximation algorithm with a 1.13-approximation ratio for M-processor systems is also derived (M > 2). When there is an upper bound on processor speeds, an artificial-bound approach is taken to minimize the energy consumption with a 1.13-approximation ratio. An optimal scheduling algorithm is then proposed for the minimization of the energy consumption when task migration is allowed.

Keywords: Energy-Efficient Scheduling, Real-Time Task Scheduling, Power Management, Real-Time Systems, Multiprocessor Scheduling.

1. Introduction

While energy-efficient design has become a focus in various systems, voltage-scaling CPUs and power-aware subsystems are now adopted in many modern computer systems. CPU circuitry is usually designed such that a higher supply voltage results in a higher execution speed (or higher frequency). An example energy consumption function [1, 14], as follows, shows the energy consumption of a processor as a function of the processor speed:

$P(s) = C_{ef} V_{dd}^2 s$, (1)

where $s = k \frac{(V_{dd} - V_t)^2}{V_{dd}}$, and $P$, $s$, $C_{ef}$, $V_t$, $V_{dd}$, and $k$ denote the energy consumption, the processor speed, the effective switch capacitance, the threshold voltage, the supply voltage, and a hardware-design-specific constant, respectively ($V_{dd} \geq V_t \geq 0$, $k > 0$, and $C_{ef} > 0$). The energy consumption function of a processor is usually a convex function of the processor speed, and each specific function is highly dependent on the design of the corresponding processor.¹

∗ Supported in part by research grants from ROC National Science Council (091, NSC-92-2213-E-002-092, and NSC-92-2220-E-002-013).
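Formula (1) gives the energy consumption in terms of the supply voltage, while scheduling decisions are made in terms of the speed. The following Python sketch inverts the speed relation to evaluate $P(s)$ directly. The function names and the parameter values used with them are illustrative assumptions, not part of the paper.

```python
import math


def supply_voltage(s, k, v_t):
    """Return the V_dd >= V_t that yields speed s = k * (V_dd - V_t)**2 / V_dd.

    The speed relation is a quadratic in V_dd; its two roots multiply to
    V_t**2, so the larger root is the one that satisfies V_dd >= V_t.
    """
    a = 2.0 * v_t + s / k
    return (a + math.sqrt(a * a - 4.0 * v_t * v_t)) / 2.0


def power(s, c_ef, k, v_t):
    """Evaluate P(s) = C_ef * V_dd**2 * s, as in Formula (1)."""
    v_dd = supply_voltage(s, k, v_t)
    return c_ef * v_dd ** 2 * s
```

With $V_t = 0$ the sketch reduces to $P(s) = C_{ef} s^3 / k^2$, the cubic form used in the later analysis.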

Energy-efficient scheduling has been an active research topic in the past decade. In particular, Yao, et al. [15] proposed an off-line scheduling algorithm and an on-line competitive algorithm to minimize the energy consumption of task executions in a uniprocessor environment, where the processor under consideration has an infinite number of continuous processor speeds. In [10], Ishihara and Yasuura showed that there exists an optimal schedule for the minimization of energy consumption that uses only two processor speeds when the processor has only a finite number of discrete processor speeds, and all tasks are ready at time 0 and have a common deadline. Note that the results could only be applied to processors with an energy consumption function equal to Formula (1). Chen, et al. [3] showed that the result in [10] still holds when the energy consumption function is any convex function.

Although many excellent results were proposed for uniprocessor energy-efficient scheduling, little work has been done for multiprocessor environments. In recent years, energy-efficient design has been outlined as a critical issue by the industry in business operations, e.g., [2], where various configurations of server farms are adopted. Unfortunately, multiprocessor energy-efficient scheduling is often NP-hard under various application constraints. Gruian [7] proposed a simulated annealing (SA) approach to multiprocessor energy-efficient scheduling with the considerations of precedence constraints and a predictable execution time for each task. In [8], a power-aware scheduling algorithm based on a list heuristic with a dynamic priority assignment was proposed to determine the amount of time allocated to each task. Zhang, et al. [16] proposed a heuristic algorithm in which each task was first assigned to a proper processor, and the processor speed in executing each task was then chosen without violating the precedence and timing constraints. Mishra, et al. [12] explored scheduling issues on the communication delay of tasks. Zhu, et al. [17] explored on-line scheduling for a set of independent/dependent frame-based tasks, where all tasks in a frame-based task set are ready at time 0 and share a common deadline. Given an off-line schedule with worst-case task execution times, on-line strategies were proposed to reclaim the slack resulting from the early completion of tasks observed at run time. Although some work has been done on multiprocessor energy-efficient scheduling, most previous results are heuristics-based.

¹ $f(x)$ is a convex function if $f(\alpha x + (1 - \alpha) y) \leq \alpha f(x) + (1 - \alpha) f(y)$ for any $\alpha \in (0, 1)$ and any $x, y$ [5].

Distinct from the past work, the objective of this paper is to propose approximation algorithms with different approximation bounds for processors with/without constraints on the maximum processor speed, where task migration is considered. We first show that there does not exist any polynomial-time approximation algorithm with an approximation bound $(1 + \epsilon)$ in the minimization of energy consumption for multiprocessor scheduling over processors with an upper bound on the processor speed, where $\epsilon$ could be any positive real. When there is no constraint on processor speeds, we propose an approximation algorithm for two-processor scheduling to provide trade-offs among the specified error, the running time, the approximation ratio, and the memory space complexity. An approximation algorithm with a 1.13-approximation ratio for M-processor systems is also derived (M > 2). When there is an upper bound on processor speeds, an artificial-bound approach is taken to minimize the energy consumption with a 1.13-approximation ratio.² An optimal scheduling algorithm is then proposed for the minimization of the energy consumption when task migration is allowed.

² Such a constraint violation study was first introduced in [11].

The rest of this paper is organized as follows: Section 2 formally defines the multiprocessor energy-efficient scheduling problems and shows their hardness. Section 3 presents approximation algorithms for multiprocessor energy-efficient scheduling for the two-processor case and general cases, where no task migration is allowed. In Section 4, an optimal polynomial-time scheduling algorithm is proposed for the minimization of the energy consumption when task migration is allowed. Section 5 is the conclusion.

2. Problem Definitions and NP-Hardness

2.1. Problem Definitions

This paper is interested in multiprocessor energy-efficient scheduling with/without constraints on the maximum processor speed and with task migration considerations. We assume a homogeneous multiprocessor environment, where each of the $M$ identical processors has the same energy consumption function $P(s)$ of a given processor speed $s$. In this paper, $P(s)$ is assumed to be a convex and increasing function. An example energy consumption function in [1, 14] is

$P(s) = C_{ef}\, s \left( \left( \frac{s}{k} + 2V_t \right) \frac{2V_t + \frac{s}{k} + \sqrt{\frac{4 V_t s}{k} + \frac{s^2}{k^2}}}{2} - V_t^2 \right)$,

which is a reformulation of Formula (1). Let $U_{max}$ denote the maximum available processor speed for the processors under consideration such that tasks could be executed at any processor speed in $[0, U_{max}]$. When $V_t = 0$, $P(s) = \alpha s^3$, where $\alpha = \frac{C_{ef}}{k^2}$. It is reasonable to consider only cases for a given energy consumption function $P(s)$ where $P(s_1) > P(s_2)$ for $s_1 > s_2$. Let the energy consumed by a processor in the execution of tasks at the processor speed $s$ for $t$ time units be $P(s)t$. We assume that the number of CPU cycles executed in a time interval is linearly proportional to the processor speed. The number of CPU cycles completed by a task running at a speed $s$ for $t$ time units is therefore the product of $s$ and $t$.

Task migration might or might not be allowed in the exploration of energy-efficient scheduling in this paper. When task migration is allowed, the migration cost is assumed to be negligible. No task can execute simultaneously on more than one processor. For the rest of Section 2, we first formally define the multiprocessor scheduling problems with the minimization of energy consumption with/without task migration. We then show the NP-hardness of the problem without task migration.

Definition 1 Multiprocessor Scheduling with the Minimization of Energy Consumption with Task Migration (MMEM):

Consider a set $T$ of independent tasks over $M$ identical processors with an energy consumption function $P(s)$, where all tasks in $T$ are ready at time 0 and share a common deadline $D$. Each task $\tau_i \in T$ is associated with a computation requirement equal to $c_i$ CPU cycles. The problem is to minimize the energy consumption in the scheduling of tasks in $T$ without missing the common deadline $D$, where task migration is allowed.

A variation of the MMEM problem without task migration could be defined similarly as follows:

Definition 2 Multiprocessor Scheduling with the Minimization of Energy Consumption without Task Migration (MME):

The input and output of the MME problem are the same as their counterparts of the MMEM problem, except that no task migration is allowed.

A schedule of a task set is a mapping of the executions of the tasks in the set to processors in the system with an assignment of processor speeds for the corresponding time intervals of the tasks. A schedule is feasible if all processor speeds assigned for its time intervals are valid, no task misses the deadline $D$, and the given task migration constraint is satisfied. The energy consumption of a schedule $SC$ is denoted as $\Phi(SC)$ (please see the first paragraph of this section for the definition of the energy consumption). A schedule is optimal if it is feasible and its energy consumption is equal to the minimum energy consumption of all feasible schedules. If there does not exist any feasible schedule for an input instance, then the minimum energy consumption is denoted as $\infty$.

2.2. Hardness of the MME Problem

We shall show the NP-hardness of the MME problem in this section and then propose an optimal algorithm for the MMEM problem in a later section:

Lemma 1 (Chen, Kuo, and Yang [3]) There exists an optimal schedule for any task set $T$ executing on a single processor at the single processor speed $\frac{\sum_{\tau_i \in T} c_i}{D}$, where the processor under consideration has an infinite number of continuous processor speeds, and all tasks in $T$ are ready at time 0 and have a common deadline $D$.
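The intuition behind Lemma 1 can be checked numerically: for a convex energy function, spreading the work evenly over $[0, D]$ never costs more than running part of it faster and part of it slower. The sketch below assumes the cubic function $P(s) = \alpha s^3$; the function names and the two-phase comparison schedule are illustrative assumptions rather than constructions from the paper.

```python
def energy_constant_speed(cycles, deadline, alpha=1.0):
    """Energy when all cycles run at the single speed cycles / deadline."""
    speed = cycles / deadline
    return alpha * speed ** 3 * deadline


def energy_two_speeds(cycles, deadline, time_share, work_share, alpha=1.0):
    """Energy when a fraction work_share of the cycles runs during the first
    time_share * deadline time units and the rest runs in the remaining time."""
    t1 = time_share * deadline
    t2 = deadline - t1
    s1 = work_share * cycles / t1
    s2 = (1.0 - work_share) * cycles / t2
    return alpha * (s1 ** 3 * t1 + s2 ** 3 * t2)
```

For 10 cycles and a deadline of 5, the constant-speed schedule runs at speed 2, while pushing 70% of the work into the first half of the interval forces speeds 2.8 and 1.2 and strictly increases the energy.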

Although multiprocessor scheduling is NP-hard [4] when no task migration is allowed, this does not imply the NP-hardness of the MME problem directly. For example, when $P()$ is a linear function ($P(s) \propto s$), any feasible schedule is an optimal solution.

Theorem 1 The MME problem is NP-hard when $M \geq 2$.

Proof: The NP-hardness is proved by a reduction from the 3-PARTITION problem [4], where $P()$ is a strictly convex and increasing function, and follows from Lemma 1.³

A polynomial-time $(1 + \epsilon)$-approximation algorithm must have a time complexity polynomial in the input size and derive a solution with a bound $(1 + \epsilon)$ on the given objective function [13]. That is, when $E(OPT)$ represents the value of an optimal solution for the objective function, any solution derived from a $(1 + \epsilon)$-approximation algorithm should have a value of the objective function no more than $(1 + \epsilon) E(OPT)$ (for minimization problems).

Theorem 2 There does not exist a polynomial-time $(1 + \epsilon)$-approximation algorithm for the MME problem ($\epsilon > 0$) when $U_{max} \neq \infty$, unless P = NP.

Proof: This theorem can be proved by contradiction. Suppose that there exists a polynomial-time $(1 + \epsilon)$-approximation algorithm for the MME problem, called ALG. Given an instance of the PARTITION problem [4] (which is NP-complete), the problem is to find a subset $A'$ of a given set $A$ such that $\sum_{a_i \in A'} w(a_i) = \sum_{a_i \in A - A'} w(a_i)$, where each element $a_i$ in $A$ is associated with a size $w(a_i) \in Z^+$. Let $U_{max}$ be an arbitrary positive real with $U_{max} \neq \infty$. The instance of the PARTITION problem could be reduced to an instance of the MME problem such that a unique task $\tau_i$ is created for each element $a_i \in A$, and the required CPU cycles for $\tau_i$ are $w(a_i) \cdot U_{max}$. All tasks are ready at time 0, and the common deadline is set as $D = \frac{\sum_{a_i \in A} w(a_i)}{2}$. Let the number $M$ of processors be 2. By applying the approximation algorithm ALG to the resulting instance of the MME problem, the energy consumption of the derived schedule would be bounded by the product of $(1 + \epsilon)$ and the energy consumption of an optimal schedule if there exists any feasible schedule. However, no feasible schedule can execute tasks at a speed over $U_{max}$. Because the total demand is $U_{max} \sum_{a_i \in A} w(a_i) = 2 U_{max} D$, both processors must then run at the speed $U_{max}$ throughout $[0, D]$. In other words, if there exists a feasible schedule, then ALG must already identify a subset $A'$ of $A$ such that $\sum_{a_i \in A'} w(a_i) = \sum_{a_i \in A - A'} w(a_i)$. If there does not exist a feasible schedule, then ALG would report the failure by returning $\infty$. Since ALG is a polynomial-time algorithm, such a conclusion contradicts the NP-completeness of the PARTITION problem (unless P = NP).

³ $f()$ is strictly convex if $f(\alpha x + (1 - \alpha) y) < \alpha f(x) + (1 - \alpha) f(y)$ for any $\alpha \in (0, 1)$ and any $x \neq y$. For example, $P(s) \propto s^3$ when $s \geq 0$.

Theorem 2 implies that there does not exist any polynomial-time approximation algorithm for the MME problem when $U_{max} \neq \infty$ unless P = NP, since $\epsilon$ could be any positive real.
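The reduction used in the proof of Theorem 2 can be sketched as follows. The dictionary layout and the function names are illustrative assumptions, and the feasibility check is a brute-force enumeration that is only meant for tiny instances. It shows that the constructed two-processor instance has a feasible schedule exactly when the weights can be split into two equal halves.

```python
from fractions import Fraction
from itertools import combinations


def partition_to_mme(weights, u_max):
    """Build the two-processor MME instance used in the reduction."""
    return {
        "cycles": [w * u_max for w in weights],   # c_i = w(a_i) * U_max
        "deadline": Fraction(sum(weights), 2),    # D = (sum of w(a_i)) / 2
        "processors": 2,
        "u_max": u_max,
    }


def feasible_without_migration(instance):
    """Exponential-time check: can the tasks be split over two processors
    so that neither load exceeds U_max * D?"""
    cycles = instance["cycles"]
    capacity = instance["u_max"] * instance["deadline"]
    total = sum(cycles)
    n = len(cycles)
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            load = sum(cycles[i] for i in subset)
            if load <= capacity and total - load <= capacity:
                return True
    return False
```

For the weights 3, 1, 1, 2, 2, 1 a perfect split exists (for example 3 + 2 against 1 + 1 + 2 + 1), so the instance is feasible; for 1, 1, 4 no such split exists and no feasible schedule exists either.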

3. Multiprocessor Scheduling without Task Migration

In this section, we present approximation algorithms for the MME problem. We first consider the case in which $U_{max} = \infty$ for two and an arbitrary number of processors, respectively. We then show that the proposed algorithms exceed the maximum processor speed by at most constant factors when $U_{max} \neq \infty$. Note that, since all tasks are ready at time 0 and share a common deadline, the tasks assigned to a processor can be executed in any order. That is, the execution order of the tasks assigned to a processor does not affect the feasibility and the energy consumption of any feasible schedule. The following formula, which results from the convexity of the energy consumption function, is used in this section:

$P(\frac{c_1}{D}) D + P(\frac{c_2}{D}) D \geq P(\frac{c_3}{D}) D + P(\frac{c_4}{D}) D$, (2)

when $c_1 + c_2 = c_3 + c_4$ and $0 \leq c_1 < c_3 < c_4 < c_2$. Based on Formula (2) and Lemma 1, it is clear that executing each task $\tau_i$ on the processor $i$ from time 0 to $D$ at the speed $\frac{c_i}{D}$ results in an optimal schedule when $|T| \leq M$. In the rest of this section, only non-trivial cases, $|T| > M$, are considered.

3.1. Multiprocessor Scheduling over Two Identical Processors When $U_{max} = \infty$

We shall show how to obtain a fully polynomial-time approximation scheme (FPTAS) for two identical processors by a reduction to the Maximum Subset Sum problem [4].⁴ Given a set $A$ of positive numbers $a_1, a_2, \cdots, a_{|A|}$ and an arbitrary number $W$, the Maximum Subset Sum problem [4] (which is NP-hard) is to find a subset $A'$ of $A$ such that $\sum_{a_i \in A'} a_i \leq W$ and $\sum_{a_i \in A'} a_i$ is maximized.

Lemma 2 (Ibarra and Kim [9]) The Maximum Subset Sum problem admits a fully polynomial-time $\frac{1}{1 - \delta}$-approximation algorithm subset() for any $0 < \delta < 1$, where the time complexity is $O(|A| (\frac{3}{\delta})^2)$ and the space complexity is $O(|A| + (\frac{3}{\delta})^3)$.

⁴ An algorithm $\mathcal{A}$ for a minimization problem is an FPTAS if $\mathcal{A}$ is executed in polynomial time in the size of the input and $\frac{1}{\epsilon}$, and the approximation ratio of algorithm $\mathcal{A}$ is $1 + \epsilon$ [13], where $0 < \epsilon$ is a user input parameter. Note that the approximation ratio is $\frac{1}{1 - \epsilon}$ for a maximization problem, where $0 < \epsilon < 1$.

Due to Lemma 1 and the non-migration of tasks, there must exist an optimal schedule for the MME problem with two processors which assigns two subsets $T_1$ and $T_2$ ($T_1 \cup T_2 = T$) of tasks to the processors 1 and 2 at the speeds $\frac{\sum_{\tau_i \in T_1} c_i}{D}$ and $\frac{\sum_{\tau_i \in T_2} c_i}{D}$, respectively. Without loss of generality, let $\sum_{\tau_i \in T_1} c_i \leq \sum_{\tau_i \in T_2} c_i$. Because of the convexity of the energy consumption function in Formula (2), achieving the optimal schedule for the MME problem is to generate a subset $T_1$ of $T$ such that $\sum_{\tau_i \in T_1} c_i \leq \frac{1}{2} \sum_{\tau_i \in T} c_i$ and $\sum_{\tau_i \in T_1} c_i$ is maximized. We develop Algorithm basic for the MME problem with two processors by applying the subset routine in Lemma 2 with a proper $\delta$. The input parameter $\epsilon$ in Algorithm basic is the error tolerance specified by the user, which is a necessary requirement for an FPTAS. Algorithm basic always returns a feasible schedule, since every task is assigned to exactly one processor and finishes by $D$. In the following theorems, we show that setting $\delta = \sqrt{\epsilon / 2}$ makes Algorithm basic a fully polynomial-time $(1 + \epsilon)$-approximation algorithm for the MME problem when the energy consumption function satisfies Formula (1).

Algorithm 1 : basic

Input: $(T, D, \epsilon)$;

Output: A feasible schedule $SC$ with minimal energy consumption;

1: let $W = \frac{\sum_{\tau_i \in T} c_i}{2}$;
2: $C$ = subset$(c_1, c_2, \cdots, c_{|T|}, W, \delta)$ with $\delta = \sqrt{\epsilon / 2}$; let $T_1$ be the corresponding task set of $C$;
3: output the schedule $SC$ which executes all tasks in $T_1$ at the speed $\frac{\sum_{\tau_i \in T_1} c_i}{D}$ on the processor 1 and all tasks in $T - T_1$ at the speed $\frac{\sum_{\tau_i \in T - T_1} c_i}{D}$ on the processor 2;
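The structure of Algorithm basic can be sketched in Python as follows. As a loudly stated assumption, the FPTAS routine subset() of Lemma 2 is replaced here by an exact pseudo-polynomial dynamic program, so the sketch illustrates the reduction to Maximum Subset Sum rather than the approximation scheme itself; all function names are hypothetical.

```python
def max_subset_sum(values, bound):
    """Exact stand-in for subset(): indices of a subset whose sum is as large
    as possible without exceeding bound (not the FPTAS of Ibarra and Kim)."""
    reachable = {0: []}                      # achievable sum -> chosen indices
    for idx, value in enumerate(values):
        # Iterate over a snapshot so each item is used at most once.
        for total, chosen in list(reachable.items()):
            new_total = total + value
            if new_total <= bound and new_total not in reachable:
                reachable[new_total] = chosen + [idx]
    return reachable[max(reachable)]


def basic_two_processor(cycles, deadline):
    """Return [(tasks, speed), (tasks, speed)] for processors 1 and 2."""
    half = sum(cycles) / 2                   # W in line 1 of Algorithm 1
    t1 = max_subset_sum(cycles, half)
    in_t1 = set(t1)
    t2 = [i for i in range(len(cycles)) if i not in in_t1]
    load1 = sum(cycles[i] for i in t1)
    load2 = sum(cycles[i] for i in t2)
    return [(t1, load1 / deadline), (t2, load2 / deadline)]


def energy(schedule, deadline, alpha=1.0):
    """Energy of a schedule under the assumed function P(s) = alpha * s**3."""
    return sum(alpha * speed ** 3 * deadline for _, speed in schedule)
```

For the cycle counts 4, 3, 3 and a deadline of 2, the best subset below half the total is {4}, so processor 1 runs at speed 2 and processor 2 runs the two remaining tasks at speed 3.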

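Lemma 3 $f(x, \gamma) = \frac{(\gamma x)^3 + (1 - \gamma x)^3}{x^3 + (1 - x)^3} \leq 2\gamma^2 - 4\gamma + 3$ for any fixed $\gamma$, where $1 \geq \gamma > 0$ and $\frac{1}{2} \geq x > 0$.

Proof: It is solved when $\frac{\partial f(x, \gamma)}{\partial x} = 0$.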

Theorem 3 Algorithm basic is a $(1 + \epsilon)$-approximation algorithm for the MME problem for any $0 < \epsilon < 2$ when $P(s) \propto s^3$, i.e., $V_t = 0$ in Formula (1).

Proof: Let $OPT$ denote a subset of $T$, where $\sum_{\tau_i \in OPT} c_i \leq W$ and $\sum_{\tau_i \in OPT} c_i$ is maximized. For simplicity of representation, we use $C(X)$ to denote $\sum_{\tau_i \in X} c_i$ for any subset $X$ of tasks. Let $SC_{opt}$ be the schedule which executes the tasks in $OPT$ at the speed $\frac{C(OPT)}{D}$ on the processor 1 and the tasks in $T - OPT$ at the speed $\frac{C(T - OPT)}{D}$ on the processor 2. $SC_{opt}$ is an optimal solution for the MME problem and

$\Phi(SC_{opt}) = \left( P(\frac{C(OPT)}{D}) + P(\frac{C(T) - C(OPT)}{D}) \right) D$.

Without loss of generality, let $C(OPT) = C(T) \cdot x$ and $C(T - OPT) = C(T) \cdot (1 - x)$. Since $C(OPT) \leq C(T - OPT)$, we have $0 < x \leq \frac{1}{2}$. We know $C(T_1) = \gamma \cdot C(OPT)$, where $1 \geq \gamma \geq 1 - \delta$ due to the approximation ratio of Algorithm subset. The ratio of the energy consumption of $SC$ to that of $SC_{opt}$ is defined as a function $f()$:

$f(x, \gamma) = \frac{\Phi(SC)}{\Phi(SC_{opt})} = \frac{(\gamma x)^3 + (1 - \gamma x)^3}{x^3 + (1 - x)^3} \leq 2\gamma^2 - 4\gamma + 3$,

where the inequality comes from Lemma 3. Note that both $\gamma$ and $x$ are unknown during the calculation. $f(x, 1 - \sqrt{\epsilon / 2}) \leq 1 + \epsilon$ by solving $1 + \epsilon = 2\gamma^2 - 4\gamma + 3$. Since $2\gamma^2 - 4\gamma + 3$ is a decreasing function of $\gamma$ for any $0 < \gamma \leq 1$, $f(x, \gamma) \leq 1 + \epsilon$ if $\delta = \sqrt{\epsilon / 2}$.

Therefore, by setting $\delta = \sqrt{\epsilon / 2}$, we conclude that Algorithm basic is a $(1 + \epsilon)$-approximation algorithm for the MME problem. The time complexity of Algorithm basic is $O(|T| \frac{18}{\epsilon})$, and the space complexity is $O(|T| + (\frac{18}{\epsilon})^{1.5})$.

We can also prove that Algorithm basic is an FPTAS even when $V_t \neq 0$ in Formula (1) in the following theorem.

Theorem 4 Algorithm basic is a $(1 + \epsilon)$-approximation algorithm for the MME problem for any $0 < \epsilon < 2$ when $P(s) = C_{ef} \left( \frac{s^3}{2k^2} + \frac{2 V_t s^2}{k} + s V_t^2 + \frac{s^2}{k} \sqrt{\frac{V_t s}{k} + \frac{s^2}{4k^2}} + V_t s \sqrt{\frac{4 V_t s}{k} + \frac{s^2}{k^2}} \right)$, i.e., $V_t \neq 0$ in Formula (1).

3.2. Multiprocessor Scheduling over an Arbitrary Number of Processors When $U_{max} = \infty$

In this section, we present a scheduling algorithm with a 1.13-approximation ratio for the MME problem when the maximum available processor speed is infinite. Our proposed algorithm shown in Algorithm 2 (Algorithm ltf) adopts the Largest-Task-First strategy. That is, tasks are considered in a non-increasing order of their computation requirements.

Let $p_m$ denote the load on the processor $m$. The load of a processor is defined as the total amount of the computation requirements of the tasks assigned to that processor. Let $T_m$ denote the set of the tasks assigned to the processor $m$. Note that the task set $T$ is sorted in a non-increasing order of the computation requirement of each task, i.e., $c_i \geq c_j$ if $i < j$. Algorithm ltf assigns each task, following the task order in $T$, to the processor with the smallest load. To achieve the minimal energy consumption, based on Lemma 1, each task on the processor $m$ should be executed at the speed $\frac{\sum_{\tau_i \in T_m} c_i}{D}$. The time complexity of Algorithm ltf is $O(|T| \log |T|)$, which is dominated by the sorting of the tasks. Since each task is assigned to one processor without missing the common deadline, the correctness of Algorithm ltf is guaranteed. For simplicity of representation, the schedule derived from Algorithm ltf is denoted as $SC_{T,LTF}$.

Algorithm 2 : ltf

Input: $(T, D, M)$;

Output: A feasible schedule $SC_{T,LTF}$ with minimal energy consumption;

1: sort all tasks in a non-increasing order of the computation requirement of each task;
2: set $p_1, p_2, \cdots, p_M$ to 0, and $T_1, T_2, \cdots, T_M$ to $\emptyset$;
3: for $i = 1$ to $|T|$ do
4:     find the smallest $p_m$; (break ties arbitrarily)
5:     $T_m \leftarrow T_m \cup \{\tau_i\}$ and $p_m \leftarrow p_m + c_i$;
6: return the schedule $SC_{T,LTF}$ which executes all of the tasks in $T_m$ ($1 \leq m \leq M$) at the speed $\frac{p_m}{D}$ on the processor $m$;
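Algorithm 2 translates almost line by line into Python. In the sketch below the function name, the zero-based task and processor indices, and the use of a binary heap to find the least-loaded processor are implementation assumptions; ties are broken by processor index, which the algorithm permits since ties may be broken arbitrarily.

```python
import heapq


def ltf(cycles, deadline, num_processors):
    """Largest-Task-First assignment; returns (assignment, speeds)."""
    # Line 1: consider tasks in non-increasing order of computation requirement.
    order = sorted(range(len(cycles)), key=lambda i: cycles[i], reverse=True)
    # Line 2: all loads start at 0 and all task sets start empty.
    heap = [(0, m) for m in range(num_processors)]   # (load p_m, processor m)
    assignment = [[] for _ in range(num_processors)]
    # Lines 3-5: give each task to the processor with the smallest load.
    for i in order:
        load, m = heapq.heappop(heap)
        assignment[m].append(i)
        heapq.heappush(heap, (load + cycles[i], m))
    # Line 6: each processor runs at the single speed p_m / D.
    speeds = [sum(cycles[i] for i in tasks) / deadline for tasks in assignment]
    return assignment, speeds
```

For the cycle counts 7, 5, 4, 3, 1 on two processors with a deadline of 10, the tasks are split into loads of 10 and 10, so both processors run at speed 1.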

Lemma 4 Algorithm ltf is an optimal algorithm if $|T| \leq 2M$ and $c_{i+M} \geq \frac{1}{2} c_{M-i+1}$ for all $1 \leq i \leq |T| - M$.

Proof: It can be proved by transforming any feasible solution into $SC_{T,LTF}$ without increasing the energy consumption.

The next step is to derive a lower bound for the MME problem by relaxing the problem constraint. Let $k$ be the largest index satisfying $M \leq k \leq 2M$ and $c_{i+M} \geq \frac{1}{2} c_{M-i+1}$ for all $1 \leq i \leq k - M$. $T'$ represents the set of the first $k$ tasks of $T$. Note that, if $|T'| < 2M$ and $T - T' \neq \emptyset$, we know $c_{|T'|+1} < \frac{1}{2} c_{2M-|T'|}$. We relax the constraint of the MME problem so that any task in $T - T'$ could be executed on more than one processor simultaneously. Below, we describe the scheduling method for the relaxed MME problem. We assign the tasks in $T'$ according to Algorithm ltf. Let $p_m$ denote the load of the processor $m$ after performing the task assignment. There exists a positive value $P_{min}$ that satisfies the following equation:

$\sum_{m=1}^{M} (P_{min} - p_m) \delta_m = \sum_{\tau_i \in T - T'} c_i$, (3)

where $\delta_m$ is 1 if $P_{min} > p_m$ and 0 otherwise. Since task migration and simultaneous execution of a task on multiple processors are allowed for the tasks in $T - T'$, we can distribute the computation of these tasks among the processors. If $P_{min} > p_m$, $(P_{min} - p_m)$ CPU cycles of $T - T'$ are distributed to the processor $m$. Each processor $m$ then performs computation at the speed $\frac{p_m}{D}$ if $p_m > P_{min}$ and $\frac{P_{min}}{D}$ otherwise. Let $SC'_{T,LTF}$ denote the resulting schedule. Figure 1 illustrates the loads of processors in $SC'_{T,LTF}$ for the relaxed MME problem. Due to the optimality provided in Lemma 4, it is clear that $SC'_{T,LTF}$ consumes no more energy than $SC_{T,opt}$, where $SC_{T,opt}$ is an optimal schedule for the MME problem.

[Figure 1: loads $p_1, \ldots, p_8$ of eight processors holding the tasks $\tau_1, \ldots, \tau_{12}$, with the level $P_{min}$ and the cycles of $T - T'$ marked.]

Figure 1. The task assignment of $SC'_{T,LTF}$ for $M = 8$ and $|T'| = 12$. The computation requirements of the tasks in $T - T'$ are distributed over the processors 3, 4, and 7 (the patterned regions).

Lemma 5 $\frac{\Phi(SC_{T,LTF})}{\Phi(SC_{T,opt})} \leq \frac{\Phi(SC_{T,LTF})}{\Phi(SC'_{T,LTF})} \leq R^*$, where $R^* = \max \left\{ \frac{\sum_{l_i \in L} P(\frac{l_i}{D})}{M \cdot P(\frac{S}{MD})} \right\}$ for any positive integer $M$, positive reals $S$ and $D$, and any set $L$ of $M$ positive reals that satisfy $\sum_{l_i \in L} l_i = S$ and $\max_{l_i \in L} l_i \leq \frac{3}{2} \min_{l_i \in L} l_i$.

Proof: Since $\Phi(SC'_{T,LTF}) \leq \Phi(SC_{T,opt})$, the first inequality is proved. Let $o_1, o_2, \cdots, o_M$ denote the load on each processor for the task assignment generated by Algorithm ltf for $T'$ and $p_1, p_2, \cdots, p_M$ for $T$. $max_p$, $min_p$, and $min_o$ are the values $\max\{p_i\}$, $\min\{p_i\}$, and $\min\{o_i\}$, respectively. It is clear that $max_p \geq P_{min}$, $min_p \leq P_{min}$, and $\Phi(SC'_{T,LTF}) \geq MD \cdot P(\frac{\sum_{\tau_i \in T} c_i}{MD})$. If $max_p = P_{min}$, then $min_p = P_{min}$ and Algorithm ltf generates an optimal solution. Therefore, we only consider the condition where $max_p > P_{min} \geq min_p$ and $T - T' \neq \emptyset$. We now prove the second inequality. We first consider the case where $o_m \leq P_{min}$ for each processor $m$. Let $m^*$ be the processor with the largest load in $SC_{T,LTF}$, i.e., $p_{m^*} = max_p$, and $\tau_k$ be the last task added into $T_{m^*}$ in $SC_{T,LTF}$. Once a processor $m$ satisfies $p_m \geq P_{min}$, Algorithm ltf will not assign any more tasks to the processor $m$. Therefore, we have $\sum_{\tau_i \in T_{m^*} - \{\tau_k\}} c_i < P_{min}$. It is clear that $min_p \geq \sum_{\tau_i \in T_{m^*} - \{\tau_k\}} c_i$; otherwise, Algorithm ltf would not assign the task $\tau_k$ to the processor $m^*$. Therefore, $max_p - min_p \leq c_k$. Because $o_m \leq P_{min}$ for each processor $m$, we have $c_k \leq c_{|T'|+1}$. We know $min_o \leq min_p$ since Algorithm ltf adds the tasks in $T - T'$ to the processor with the minimal load. Due to the definition of $T'$, $c_{|T'|+1} \leq \frac{1}{2} min_o$.⁵ Combining all inequality relations mentioned above, we have

$max_p - min_p \leq c_k \leq c_{|T'|+1} \leq \frac{1}{2} min_o \leq \frac{1}{2} min_p$.

Therefore, $max_p \leq \frac{3}{2} min_p$, which proves the second inequality.

We now consider the case where some $o_m > P_{min}$. Let the set of these processors be $J$, where $|J| \geq 1$. For each processor $m$ in $J$, $SC_{T,LTF}$ and $SC'_{T,LTF}$ assign the same tasks to the processor $m$, e.g., the processors 1, 2, 5, 6, and 8 in Figure 1. By assuming $\frac{\Phi(SC_{T,LTF})}{\Phi(SC'_{T,LTF})} > R^*$, we conclude that $\frac{\Phi(SC_{T,LTF}) - \sum_{j \in J} P(o_j / D) D}{\Phi(SC'_{T,LTF}) - \sum_{j \in J} P(o_j / D) D} > R^*$. This contradicts the previous case applied to the reduced instance with $M - |J|$ processors and the task set $T - \cup_{j \in J} T_j$. Therefore, the approximation ratio for Algorithm ltf is $R^*$.
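Equation (3) defines $P_{min}$ only implicitly. One way to compute it, sketched below under the assumption that the loads after assigning $T'$ and the total number of cycles in $T - T'$ are already known, is a water-filling pass over the sorted loads. The function name and the argument layout are hypothetical.

```python
def water_level(loads, extra):
    """Return P_min such that sum(max(0, P_min - p) for p in loads) == extra.

    loads: the loads p_1..p_M after assigning T'.
    extra: total cycles of the tasks in T - T' (assumed positive).
    """
    sorted_loads = sorted(loads)
    prefix = 0.0
    for count, load in enumerate(sorted_loads, start=1):
        prefix += load
        # Level reached if the extra cycles are spread over the `count`
        # least-loaded processors only.
        level = (prefix + extra) / count
        is_last = count == len(sorted_loads)
        if is_last or level <= sorted_loads[count]:
            return level
```

With loads 4, 4, 1, 2 and 3 extra cycles, the two least-loaded processors are raised to the level 3, which stays below the other two loads, so $P_{min} = 3$.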

We now derive the value of $R^*$ when the energy consumption function satisfies Formula (1).

Theorem 5 Algorithm ltf is a 1.13-approximation algorithm for the MME problem when $P(s) \propto s^3$, i.e., $V_t = 0$ in Formula (1).

Proof: We prove this theorem by showing that $R^* \leq 1.13$. By the definition of $R^*$ in Lemma 5, there must exist at least one real number $x$ for a set $L$, where $2x \leq l_i \leq 3x$ for all $l_i \in L$. It is clear that $2xM \leq S \leq 3xM$. For each element $l_i \in L$, we know $P(\frac{l_i}{D}) \leq \frac{3x - l_i}{x} P(\frac{2x}{D}) + \frac{l_i - 2x}{x} P(\frac{3x}{D})$.⁶ Therefore, we have $\sum_{l_i \in L} P(\frac{l_i}{D}) \leq k P(\frac{3x}{D}) + (M - k) P(\frac{2x}{D})$ for a real $k$ satisfying $k = \frac{S - 2Mx}{x}$ (a rephrasing of $3x \cdot k + 2x \cdot (M - k) = S$). $R^*$ is obtained by finding the proper $x$ which maximizes the function $f(x) = k P(\frac{3x}{D}) + (M - k) P(\frac{2x}{D})$. Without loss of generality, let $P(s) = \alpha s^3$, where $\alpha$ is a constant. By solving $f'(x) = 0$ and showing $f''(x) < 0$ for all $x$ satisfying $\frac{S}{2M} \geq x \geq \frac{S}{3M}$, $f(x)$ is maximized when $x = \frac{19S}{45M}$ and the maximum value of $f(x)$ is $\frac{\alpha 19^3 S^3 / M^2}{3 \cdot 45^2 D^3} \approx 1.13 \alpha \frac{S^3}{M^2 D^3}$. Therefore, $R^* \leq 1.13$.
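The constant 1.13 can be re-derived with exact rational arithmetic. Substituting $P(s) = \alpha s^3$ and $k = \frac{S - 2Mx}{x}$ gives $f(x) = \frac{\alpha}{D^3}(19 S x^2 - 30 M x^3)$; dividing by $M \cdot P(\frac{S}{MD}) = \frac{\alpha S^3}{M^2 D^3}$ and writing $y = \frac{xM}{S}$ leaves the one-variable expression $19y^2 - 30y^3$ on $\frac{1}{3} \leq y \leq \frac{1}{2}$. The sketch below, whose names are illustrative assumptions, evaluates this expression at the stationary point and on a grid.

```python
from fractions import Fraction


def normalized_bound(y):
    """f(x) / (M * P(S / (M * D))) for P(s) = alpha * s**3, with y = x * M / S.

    The constraint S/(3M) <= x <= S/(2M) becomes 1/3 <= y <= 1/2.
    """
    return 19 * y ** 2 - 30 * y ** 3


# Stationary point of 19*y**2 - 30*y**3: 38*y - 90*y**2 = 0, so y = 19/45.
y_star = Fraction(19, 45)
r_star = normalized_bound(y_star)

# Exact grid over [1/3, 1/2] as a sanity check on the maximum.
grid_max = max(
    normalized_bound(Fraction(1, 3) + Fraction(i, 6000)) for i in range(1001)
)
```

The value at $y = \frac{19}{45}$ is $\frac{19^3}{3 \cdot 45^2} = \frac{6859}{6075}$, and the expression equals 1 at both endpoints of the interval, where all loads coincide.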

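Corollary 1 Algorithm ltf is a 1.13-approximation algorithm for the MME problem when $P(s) = C_{ef} \left( \frac{s^3}{2k^2} + \frac{2 V_t s^2}{k} + s V_t^2 + \frac{s^2}{k} \sqrt{\frac{V_t s}{k} + \frac{s^2}{4k^2}} + V_t s \sqrt{\frac{4 V_t s}{k} + \frac{s^2}{k^2}} \right)$.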

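⁵ There are two cases: 1. If $min_o = c_{2M-|T'|}$, then $c_{|T'|+1} < \frac{1}{2} c_{2M-|T'|} = \frac{1}{2} min_o$; 2. If $min_o = c_i + c_j$ for some $i, j > 2M - |T'|$, then the relations $c_{|T'|+1} \leq c_i$ and $c_{|T'|+1} \leq c_j$ result in $c_{|T'|+1} \leq \frac{1}{2} min_o$.

⁶ The inequality comes from the fact that $P(\alpha x + (1 - \alpha) y) \leq \alpha P(x) + (1 - \alpha) P(y)$ for any $\alpha \in (0, 1)$ and any $x, y$. The coefficients of $P(\frac{2x}{D})$ and $P(\frac{3x}{D})$ are the two weights of the convex combination that expresses $\frac{l_i}{D}$ in terms of $\frac{2x}{D}$ and $\frac{3x}{D}$.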

3.3. Multiprocessor Scheduling When $U_{max} \neq \infty$

By adopting the constraint-violation approach [11], we propose an artificial-bound approach which first sets an artificial upper bound on the processor speed and then derives feasible schedules for the minimization of energy consumption. We show that Algorithms basic and ltf exceed the maximum processor speed by at most the factors $(1 + \sqrt{\epsilon / 2})$ and $(\frac{4}{3} - \frac{1}{3M})$, respectively. For simplicity of representation, we assume that tasks in $T$ are sorted in a non-increasing order of their computation requirements. We first prove that Algorithm ltf could derive a schedule with a 1.13-approximation ratio without violating the maximum processor speed for certain input instances (please see Theorem 6):

Theorem 6 Algorithm ltf is a 1.13-approximation algorithm if the given input instance satisfies $\frac{\sum_{\tau_i \in T} c_i}{M} + c_{M+1} \leq U_{max} D$ and $c_1 \leq U_{max} D$.

Proof: This theorem is proved by contradiction. We assume that there exists a processor $m$ in $SC_{T,LTF}$, where $\sum_{\tau_i \in T_m} c_i > U_{max} D$. Let $\tau_k$ be the last task added into $T_m$ in Algorithm ltf. Two cases are considered. If $k \leq M$, we know $c_1 \geq c_k > U_{max} D$. This contradicts the assumption. If $k > M$, we have $\sum_{\tau_i \in T_m - \{\tau_k\}} c_i > U_{max} D - c_k \geq U_{max} D - c_{M+1}$. In Algorithm ltf, once $p_m \geq \frac{\sum_{\tau_i \in T} c_i}{M}$, no more tasks can be assigned to the processor $m$. Therefore, $\frac{\sum_{\tau_i \in T} c_i}{M} > \sum_{\tau_i \in T_m - \{\tau_k\}} c_i$. Based on the above inequalities, we have $\frac{\sum_{\tau_i \in T} c_i}{M} + c_{M+1} > U_{max} D$. This contradicts our assumption.
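The sufficient condition of Theorem 6 is cheap to test before running Algorithm ltf. The following sketch checks it for a given instance; the function name is hypothetical, and treating $c_{M+1}$ as 0 when there are at most $M$ tasks is an assumption made here for the degenerate case.

```python
def satisfies_theorem6(cycles, deadline, num_processors, u_max):
    """Check sum(c_i)/M + c_{M+1} <= U_max * D and c_1 <= U_max * D."""
    ordered = sorted(cycles, reverse=True)        # c_1 >= c_2 >= ...
    # c_{M+1} is the (M+1)-th largest task; assumed 0 if it does not exist.
    c_next = ordered[num_processors] if len(ordered) > num_processors else 0
    budget = u_max * deadline
    average_load = sum(ordered) / num_processors
    return average_load + c_next <= budget and ordered[0] <= budget
```

For the cycle counts 7, 5, 4, 3, 1 on two processors with a deadline of 10, the left-hand side is 10 + 4 = 14, so the condition holds for $U_{max} = 1.5$ and fails for $U_{max} = 1.2$.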

We now show that Algorithms basic and ltf exceed the maximum processor speed by at most the factors $(1 + \sqrt{\epsilon / 2})$ and $(\frac{4}{3} - \frac{1}{3M})$, respectively.

Theorem 7 Given an input instance with a feasible schedule for the MME problem, no schedule derived from Algorithm ltf uses any processor speed larger than $(\frac{4}{3} - \frac{1}{3M}) U_{max}$.

Proof: Let $O$ be a feasible schedule for the input instance. Without loss of generality, $O$ partitions $T$ into $M$ disjoint subsets of tasks. We assume that $m$ is the processor with the largest load in $O$. Since $O$ is a feasible schedule, the load on the processor $m$, say $p_m$, must be no more than $U_{max} D$. Let $n$ be the processor with the largest load in $SC_{T,LTF}$. By rephrasing the processing time as the computation requirement in the Makespan problem, we know that $p_n \leq (\frac{4}{3} - \frac{1}{3M}) p_m \leq (\frac{4}{3} - \frac{1}{3M}) U_{max} D$, since the Longest-Processing-Time-First algorithm was proved to be a $(\frac{4}{3} - \frac{1}{3M})$-approximation algorithm for the Makespan problem in [6].⁷ We complete the proof.

Theorem 8 Given an input instance with a feasible schedule for the MME problem over two processors, no schedule derived from Algorithm basic uses any processor speed larger than $(1 + \sqrt{\epsilon / 2}) U_{max}$.

Proof: It comes from the setting of $\delta$ in Algorithm basic.

4. Task Migration: An Optimal Algorithm

Algorithm 3 : ltf-m

Input: $(T, D, M)$;

Output: An optimal schedule with minimum energy consumption;

1: sort $T$ in a non-increasing order of the computation requirement of each task; let $C \leftarrow \sum_{\tau_i \in T} c_i$;
2: if $\frac{C}{MD} > U_{max}$ or $\exists \tau_i \in T$ such that $\frac{c_i}{D} > U_{max}$ then
3:     return non-existence of any feasible schedule;
4: let $i \leftarrow 1$;
5: while $i \leq |T|$ do
6:     if $c_i > \frac{C}{M}$ then
7:         schedule $\tau_i$ to be executed at the speed $\frac{c_i}{D}$ on the processor $M$ from time 0 to $D$;
8:         $C \leftarrow C - c_i$, $i \leftarrow i + 1$, and $M \leftarrow M - 1$;
9:     else
10:         break;
11: let $S \leftarrow \frac{C}{MD}$ and $t \leftarrow 0$;
12: while $i \leq |T|$ do
13:     if $t + \frac{c_i}{S} > D$ then
14:         schedule $\tau_i$ to be executed at the speed $S$ on the processor $M - 1$ from time 0 to $t + \frac{c_i}{S} - D$ and on the processor $M$ from time $t$ to $D$; $M \leftarrow M - 1$;
15:     else
16:         schedule $\tau_i$ to be executed on the processor $M$ at the speed $S$ from time $t$ to $t + \frac{c_i}{S}$;
17:     $i \leftarrow i + 1$ and $t \leftarrow (t + \frac{c_i}{S}) \bmod D$;
18: return the schedule of all tasks;
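Algorithm 3 can be sketched in Python as follows. The sketch assumes positive cycle counts, numbers the processors from 1 to $M$, and uses exact fractions so that the wrap-around test in line 13 is not disturbed by rounding. The function names, the piece tuple layout, and the explicit handling of a task that ends exactly at $D$ (the sketch then moves on to the next processor) are assumptions that the listing does not spell out.

```python
from fractions import Fraction


def ltf_m(cycles, deadline, num_processors, u_max=None):
    """Return a list of (task, processor, start, end, speed) pieces, or None
    when the speed bound u_max makes the instance infeasible."""
    c = [Fraction(x) for x in cycles]
    d = Fraction(deadline)
    order = sorted(range(len(c)), key=lambda i: c[i], reverse=True)
    remaining = sum(c, Fraction(0))
    if u_max is not None:
        if remaining / (num_processors * d) > u_max or any(x / d > u_max for x in c):
            return None
    pieces = []
    m = num_processors            # processors m, m-1, ..., 1 are still unused
    pos = 0
    # Lines 5-10: a task above the average remaining load gets its own processor.
    while pos < len(order) and c[order[pos]] > remaining / m:
        task = order[pos]
        pieces.append((task, m, Fraction(0), d, c[task] / d))
        remaining -= c[task]
        m -= 1
        pos += 1
    if pos == len(order):
        return pieces
    # Lines 11-17: run the rest at one common speed, wrapping across processors.
    speed = remaining / (m * d)
    t = Fraction(0)
    for task in order[pos:]:
        end = t + c[task] / speed
        if end > d:
            pieces.append((task, m, t, d, speed))              # tail of processor m
            m -= 1
            pieces.append((task, m, Fraction(0), end - d, speed))
            t = end - d
        else:
            pieces.append((task, m, t, end, speed))
            t = end
            if t == d:            # processor exactly full (assumed handling)
                m -= 1
                t = Fraction(0)
    return pieces


def schedule_energy(pieces, alpha=1):
    """Energy under the assumed function P(s) = alpha * s**3."""
    return sum(alpha * speed ** 3 * (end - start) for _, _, start, end, speed in pieces)
```

For the cycle counts 10, 2, 2 with a deadline of 2 on two processors, the first task exceeds the average load of 7 and occupies one processor at speed 5, while the two small tasks share the other processor at speed 2.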

In this section, an efficient optimal algorithm is proposed for the MMEM problem, where task migration is allowed. If $|T| \leq M$, based on Formula (2) shown in Section 3, it is clear that executing each task $\tau_i$ on the processor $i$ from time 0 to $D$ at the speed $\frac{c_i}{D}$ results in an optimal schedule. Our proposed Algorithm ltf-m (Algorithm 3) adopts the Largest-Task-First strategy again, and the time complexity $O(|T| \log |T|)$ comes from the sorting of $T$ in line 1. We can prove the following two lemmas.

⁷ The Makespan problem is as follows: Given processing times for $n$ tasks, find an assignment of the tasks to $M$ identical processors so that the completion time for these tasks is minimized.

Lemma 6 If $c_1 > SD$ and $|T| \geq M$, then there exists an optimal schedule which executes only $\tau_1$ on a processor at the speed $\frac{c_1}{D}$ from 0 to $D$, where $S = \frac{\sum_{\tau_i \in T} c_i}{MD}$.

Lemma 7 If $c_1 \leq SD$ and $|T| \geq M$, then there exists an optimal schedule which executes each task in $T$ on at most two processors at the speed $S$, where $S = \frac{\sum_{\tau_i \in T} c_i}{MD}$.

Theorem 9 Any schedule derived from Algorithm ltf-m is an optimal schedule.

Proof: If $c_1 > SD$ where $S = \frac{\sum_{\tau_i \in T} c_i}{MD}$, then Algorithm ltf-m executes $\tau_1$ on the processor $M$ at the speed $\frac{c_1}{D}$, and the remaining tasks $T - \{\tau_1\}$ and $M - 1$ processors form a subproblem of the MMEM problem; otherwise, Algorithm ltf-m executes each task in $T$ over at most two processors at the speed $S$. Based on Lemmas 6 and 7, we conclude this proof by repeating the above procedure in solving the MMEM subproblems.

5. Conclusion

This paper targets energy-efficient scheduling of tasks over multiple processors, where tasks share a common deadline. Distinct from the past work, this paper proposes approximation algorithms with different approximation bounds for processors with/without constraints on the maximum processor speed. We show the non-existence of polynomial-time approximation algorithms in the minimization of energy consumption for multiprocessor scheduling over processors with an upper bound on the processor speed, unless P = NP. When there is no constraint on processor speeds, we propose an approximation algorithm for two-processor scheduling to provide trade-offs among the specified error, the running time, the approximation ratio, and the memory space complexity. An approximation algorithm with a 1.13-approximation ratio for M-processor systems is also derived (M > 2). When there is an upper bound on processor speeds, an artificial-bound approach is taken to minimize the energy consumption with a 1.13-approximation ratio. Furthermore, an optimal polynomial-time scheduling algorithm is proposed for the minimization of the energy consumption when task migration is allowed.

For future research, we shall explore multiprocessor energy-efficient scheduling for task sets with arbitrary deadlines and arrival times.

References

[1] A. Chandrakasan, S. Sheng, and R. Broderson. Lower-Power CMOS digital design. IEEE Journal of of Solid-State Circuit, 27(4):473–484, 1992.

[2] J. S. Chase, D. C. Anderson, P. N. Thakar, A. Vahdat, and R. P. Doyle. Managing energy and server resources in hosting centres. In Symposium on Operating Systems Principles, pages 103–116. ACM Press, 2001.

[3] J.-J. Chen, T.-W. Kuo, and C.-L. Yang. Profit-driven uniprocessor scheduling with timing and energy con-straints. In ACM Symposium on Applied Computing, pages 834–840. ACM Press, 2004.

[4] M. R. Garey and D. S. Johnson. Computers and in-tractability: A guide to the theory of NP-completeness. W.H. Freeman and Co, 1979.

[5] G. Golub and J. Ortega. Scientific Computing and Dif-ferential Equations. Academic Press, 1992.

[6] R. Graham. Bounds on multiprocessing timing anoma-lies. SIAM Journal on Applied Mathematics, 17:263– 269, 1969.

[7] F. Gruian. System-level design methods for low-energy architectures containing variable voltage processors. In Power-Aware Computing Systems, pages 1–12, 2000. [8] F. Gruian and K. Kuchcinski. Lenes: Task scheduling

for low energy systems using variable supply voltage pro-cessors. In Proceedings of Asia South Pacific Design Au-tomation Conference, pages 449–455, 2001.

[9] O. H. Ibarra and C. E. Kim. Fast approximation algo-rithms for the knapsack and sum of subsets problems. Journal of the ACM, 22(4):463–468, 1975.

[10] T. Ishihara and H. Yasuura. Voltage scheduling prob-lems for dynamically variable voltage processors. In Pro-ceedings of the International Symposium on Low Power Electroncs and Design, pages 197–202, 1998.

[11] J.-H. Lin and J. S. Vitter. -approximations with min-imum packing constraint violation. In Symposium on Theory of Computing, pages 771–782. ACM Press, 1992. [12] R. Mishra, N. Rastogi, D. Zhu, D. Mosse, and R. Mel-hem. Energy aware scheduling for distributed real-time systems. In International Parallel and Distributed Pro-cessing Symposium, page 21, 2003.

[13] V. V. Vazirani. Approximation Algorithms. Springer, 2001.

[14] M. Weiser, B. Welch, A. Demers, and S. Shenker. Scheduling for reduced CPU energy. In Proceedings of Symposium on Operating Systems Design and Imple-mentation, pages 13–23, 1994.

[15] F. Yao, A. Demers, and S. Shankar. A scheduling model for reduced CPU energy. In Proceedings of the 36th An-nual Symposium on Foundations of Computer Science, pages 374–382. IEEE, 1995.

[16] Y. Zhang, X. Hu, and D. Z. Chen. Task scheduling and voltage selection for energy minimization. In Annual ACM IEEE Design Automation Conference, pages 183– 188, 2002.

[17] D. Zhu, R. Melhem, and B. Childers. Scheduling with dynamic voltage/speed adjustment using slack reclama-tion in multi-processor real-time systems. In Proceed-ings of IEEE 22th Real-Time System Symposium, pages 84–94, 2001.


Figure 1. The task assignment of SC T,LT F for M = 8 and |T | = 12. The computation requirements of the tasks in T − T are distributed over the processors 3, 4, and 7 (the patterned regions).
