Multiprocessor Energy-Efficient Scheduling for Real-Time Tasks
with Different Power Characteristics
∗Jian-Jia Chen and Tei-Wei Kuo
Department of Computer Science and Information Engineering
Graduate Institute of Networking and Multimedia
National Taiwan University, Taipei, Taiwan, ROC
Email:
{r90079, ktw}@csie.ntu.edu.tw
Abstract
In the past decades, a number of research results have been reported for energy-efficient scheduling over uniprocessor and multiprocessor environments. Unlike many past results, which assume that all tasks share the same power characteristics, we consider real-time scheduling of tasks with different power characteristics. The objective is to minimize the energy consumption of task executions under the given deadline constraint. When tasks have a common deadline and are ready at time 0, we propose an optimal real-time task scheduling algorithm for multiprocessor environments in which task migration is allowed. When no task migration is allowed, a 1.412-approximation algorithm for task scheduling is proposed for different settings of power characteristics. The performance of the approximation algorithm was evaluated by an extensive set of experiments, in which it performed very close to the optimal schedule with task migration.
1
Introduction
With advances in VLSI circuit design, many modern processors can now operate at various supply voltages, where different supply voltages lead to different processing speeds. Many computer systems, especially embedded systems, adopt not only voltage-scaling processors but also various energy-efficient strategies in managing their subsystems intelligently. Besides energy-efficient designs for battery-powered systems, how to reduce energy consumption for multiprocessor systems, such as server farms, has also received a lot of attention in the past decade. As pointed out in [1], multiprocessor implementations of real-time systems could be more energy-efficient than uniprocessor implementations, due to the convex power consumption functions. Energy-efficient scheduling is to derive a schedule for real-time tasks that minimizes the energy consumption such that the timing constraints can be met. The consideration of timing constraints in task scheduling significantly complicates the problems in energy-efficient scheduling. Uniprocessor energy-efficient scheduling problems have been widely explored, e.g., [2, 3, 10–12, 19]. The energy-efficient real-time task scheduling problems over multiprocessors are often NP-hard. When all of the power consumption functions of tasks are the same, Chen et al. [5, 18] proposed approximation algorithms to schedule frame-based tasks over multiprocessors with and without independent voltage scaling, where all the tasks share a common deadline and arrive at the same time. In [7, 8, 20], energy-efficient scheduling algorithms based on list heuristics were proposed to schedule real-time tasks with precedence constraints. Mishra et al. [13] explored energy-efficient scheduling issues with the consideration of task communication delay.
∗Supported in part by research grants from ROC National Science Council NSC-93-2752-E-002-008-PAE and NSC-93-2218-E-002-140.
This work is motivated by the energy-efficient scheduling of tasks in practice, where tasks often have different power characteristics [2, 11, 12]. The power consumption function of a task running at the processor speed $s$ is of the form $h s^{\alpha}$, where $h$ and $\alpha$ are task-dependent and hardware-dependent power characteristics, respectively ($0 < h$ and $2 \le \alpha \le 3$). The parameter $h$ in $h s^{\alpha}$ might depend on the software implementation and the execution path of each task, whereas the value of $\alpha$ might depend on the hardware design of the processors under consideration. The objective of this paper is to minimize the energy consumption of task executions under the given deadline constraint. We are interested in the scheduling of frame-based task sets, in which all the tasks are ready at time 0 and share a common deadline. When task migration is allowed, we propose an optimal real-time task scheduling algorithm with time complexity $O(|T| \log |T|)$ for multiprocessor environments, where $T$ is the set of real-time tasks under consideration. When task migration is not allowed, a polynomial-time approximation algorithm is proposed, whose approximation ratio is shown to be 1.412. The performance of the approximation algorithm was evaluated by an extensive set of experiments, in which it performed very close to the optimal schedule with task migration.
The rest of this paper is organized as follows: In Section 2, we define the system models and formulate the problem. Section 3 presents an algorithm for multiprocessor energy-efficient scheduling when task migration is allowed. Our approximation algorithm for systems that do not allow task migration is then presented in Section 4. Simulation results are shown in Section 5. Section 6 concludes this paper.
2
Formal Models and Problem Definitions
We are interested in energy-efficient scheduling of real-time tasks that are ready at time 0 and share a common deadline $D$ over multiple homogeneous processors. Each task $\tau_i$ is characterized by its worst-case execution CPU cycles $c_i$ and power consumption function $P_i()$:
$$P_i(s) = C_i V_{dd}^2 s, \qquad (1)$$
where $s = \beta \frac{(V_{dd} - V_t)^2}{V_{dd}}$, and $s$, $C_i$, $V_t$, $V_{dd}$, and $\beta$ denote the processor speed, the effective switch capacitance, the threshold voltage (the minimum voltage that can be supplied to the processor for correct functionality), the supply voltage, and a hardware-design-specific constant, respectively ($V_{dd} \ge V_t \ge 0$, $\beta > 0$, and $C_i > 0$) [4, 17]. The value of the effective switch capacitance is highly related to the software implementation and the execution path of each task (which can usually be derived by profiling). Each power consumption function $P_i(s)$ can be phrased as $h_i \cdot s^{\alpha}$, where $\alpha$ is a hardware-dependent factor, and $h_i$ is a parameter related to the corresponding task execution [3, 9, 12, 15, 19]. For example, when $V_t$ is 0, $h_i$ is $C_i/\beta^2$, and $\alpha$ is 3. $h_i$ is a positive real number, and $\alpha$ is usually a real number between 2 and 3 [12, 15]. It is clear that the power consumption function is a strictly convex and increasing function of the processor speed when the processor speeds are non-negative numbers. In this paper, we assume that $P_i(s)$ is second-order differentiable.
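As a brief check of the example with a zero threshold voltage, the following worked derivation (a sketch that uses only Equation (1) and the definition of $s$) shows how the cubic form arises:

```latex
\begin{align*}
V_t = 0 \;&\Rightarrow\; s = \beta \frac{V_{dd}^2}{V_{dd}} = \beta V_{dd}
  \;\Rightarrow\; V_{dd} = \frac{s}{\beta}, \\
P_i(s) &= C_i V_{dd}^2 s = C_i \left(\frac{s}{\beta}\right)^2 s
  = \frac{C_i}{\beta^2}\, s^3 .
\end{align*}
```

Matching this against $h_i \cdot s^{\alpha}$ gives $h_i = C_i/\beta^2$ and $\alpha = 3$.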
Suppose that each processor can operate at a speed in $[0, \infty]$, and that the speed of each processor can be adjusted independently of the others. We assume that the number of CPU cycles executed in a time interval is linearly proportional to the processor speed, and that the energy consumed by a processor in the execution of a task at the processor speed $s$ for $t$ time units is the product of its corresponding power consumption at the speed $s$ and $t$. Let the amount of CPU cycles completed for a task running at a speed $s$ for $t$ time units be the product of $s$ and $t$. Assume that the time and energy overheads of speed/voltage switching are negligible. Since $P_i(s)/s$ is also a strictly convex and increasing function, an optimal schedule must execute each task $\tau_i \in T$ entirely at a selected speed $s_i$ [2, 6]. Specifically, executing task $\tau_i$ at the speed $s$ consumes $P_i(s) \frac{c_i}{s}$ amount of energy. The energy consumption function $E_i()$ of $\tau_i$ is defined as a function of the execution time $t_i$ of $\tau_i$: $E_i(t_i) = P_i(\frac{c_i}{t_i}) t_i = h_i \cdot c_i^{\alpha} / t_i^{\alpha-1}$. Note that the energy consumption function of the execution time of $\tau_i$ is a strictly convex and decreasing function.
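To make the energy model concrete, the following minimal Python sketch evaluates $E_i(t_i)$ for a single task. The function name `energy` and the sample parameter values are illustrative assumptions, not part of the paper.

```python
def energy(h: float, c: float, t: float, alpha: float = 3.0) -> float:
    """Energy for running c cycles in t time units at constant speed c / t.

    Hypothetical helper: implements E(t) = h * c**alpha / t**(alpha - 1).
    """
    if t <= 0:
        raise ValueError("execution time must be positive")
    speed = c / t                 # constant speed s = c / t
    power = h * speed ** alpha    # P(s) = h * s**alpha
    return power * t              # energy = power * time


if __name__ == "__main__":
    # Hypothetical task: h = 2, c = 40 cycles, alpha = 3.
    for t in (10.0, 20.0, 40.0):
        print(t, energy(2.0, 40.0, t))
```

With $\alpha = 3$, doubling the execution time divides the energy by four, which illustrates why a longer execution time is always preferable when the deadline permits.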
Problem Definition We consider energy-efficient scheduling with and without task migration in this work, where the migration cost is assumed to be negligible. No task is allowed to execute simultaneously on more than one processor. A schedule of a task set $T$ is a mapping of the executions of the tasks in $T$ to processors in the system with an assignment of processor speeds for the corresponding execution intervals of the tasks. A schedule is feasible if no task misses its deadline $D$, and no task is executed simultaneously on more than one processor. The energy consumption of a schedule $S$ is denoted as $\Phi(S)$, which is the sum of the energy consumption of task executions in $S$. A schedule is optimal if it is feasible, and its energy consumption is equal to the minimum energy consumption of all feasible schedules. Two energy-efficient scheduling problems are defined as follows:
Definition 1 Multiprocessor Energy-Efficient Scheduling with Task Migration (MEESM)
Consider a set $T$ of independent tasks over $M$ identical processors, where all tasks in $T$ are ready at time 0 and share a common deadline $D$. Each task $\tau_i \in T$ is associated with a computation requirement equal to $c_i$ CPU cycles and a power consumption function $P_i()$ of a given processor speed. The objective is to derive a schedule for $T$ such that all of the tasks in $T$ complete before $D$ and the total energy consumption is minimized, where task migration among processors is allowed.
A variation of the MEESM problem without task migration could be defined similarly as follows:
Definition 2 Multiprocessor Energy-Efficient Scheduling without Task Migration (MEES)
The input, output, and objective of this problem are the same as their counterparts of the MEESM problem, except that no task migration among processors is allowed.
If the number of tasks in $T$ is no more than the number of processors in the system, executing each task $\tau_i$ on a different processor from time 0 to $D$ at the speed $c_i/D$ is an optimal schedule. For the rest of this paper, we focus only on the other case, in which the number of tasks in $T$ is more than the number of processors in the system. Since the MEES problem is NP-hard even when all of the power consumption functions of the tasks are the same [5], we propose an efficient approximation algorithm [16], which finds an approximate solution with a worst-case guarantee on the energy consumption.
3
An Optimal Algorithm When Task Migration is Allowed
In this section, we present an optimal algorithm for the MEESM problem. Since $P_i(s)/s$ is a convex and increasing function of the processor speed for every task $\tau_i$ in a given task set $T$, there exists an optimal schedule that executes each task $\tau_i \in T$ entirely at some speed $s_i$ [2, 6]. In the following discussions, we only consider schedules in which each task executes at the same speed for its entire duration.
Let $V = (t_1, t_2, \ldots, t_{|T|})$ be an assignment of execution times of tasks in $T$, where $t_i$ is a positive real number for every task $\tau_i \in T$. The energy consumption of an assignment $V$ of task execution times of $T$ is defined as $\sum_{\tau_i \in T} E_i(t_i)$. $V$ is said to be feasible for $T$ on $M$ processors if the sum of execution times of all of the tasks in $T$ is no greater than $M \cdot D$ (i.e., $\sum_{\tau_i \in T} t_i \le M \cdot D$), and $t_i$ is no greater than $D$ for every task $\tau_i$ in $T$. Given a feasible schedule $S_V$ of $T$, it is clear that a feasible assignment $V$ of task execution times can be derived by setting the execution time of task $\tau_i$ in $V$ as that of $\tau_i$ in $S_V$. The energy consumption of $V$ is equal to that of $S_V$. In the following lemma, we show that we can efficiently derive a feasible schedule $S_V$ with the same energy consumption as that of a given feasible assignment $V$ of execution times of tasks in $T$. In other words, a feasible assignment $V$ of execution times of tasks in $T$ with the minimum energy consumption leads to an optimal schedule of $T$ for the MEESM problem.
Lemma 1 Given a feasible assignment $V$ of execution times of tasks in $T$, a feasible schedule $S_V$ can be derived in $O(|T|)$ time such that the energy consumption of $S_V$ is equal to that of $V$.
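Proof. We prove this lemma by constructing a feasible schedule $S_V$ according to $V = (t_1, t_2, \ldots, t_{|T|})$ (processors are indexed from 0 to $M-1$ in this construction):
• Case 1: If $\left\lfloor \frac{\sum_{j=1}^{i-1} t_j}{D} \right\rfloor = \left\lfloor \frac{\sum_{j=1}^{i} t_j}{D} \right\rfloor$, then execute $\tau_i$ on the $\left\lfloor \frac{\sum_{j=1}^{i-1} t_j}{D} \right\rfloor$-th processor from time $(\sum_{j=1}^{i-1} t_j \bmod D)$ to $(\sum_{j=1}^{i} t_j \bmod D)$.
• Case 2: If $\left\lfloor \frac{\sum_{j=1}^{i-1} t_j}{D} \right\rfloor \ne \left\lfloor \frac{\sum_{j=1}^{i} t_j}{D} \right\rfloor$, then execute $\tau_i$ on the $\left\lfloor \frac{\sum_{j=1}^{i-1} t_j}{D} \right\rfloor$-th processor from time $(\sum_{j=1}^{i-1} t_j \bmod D)$ to $D$ and on the $\left\lfloor \frac{\sum_{j=1}^{i} t_j}{D} \right\rfloor$-th processor from time 0 to $(\sum_{j=1}^{i} t_j \bmod D)$.
Since $0 < t_i \le D$ for all $\tau_i$ in $T$ and $\sum_{\tau_i \in T} t_i = MD$, the resulting schedule $S_V$ above is a feasible schedule for the MEESM problem. Besides, it is clear that the energy consumption of $S_V$ is equal to the energy consumption of $V$, since the execution time of task $\tau_i$ in $T$ is $t_i$ both in $V$ and $S_V$. The time complexity is $O(|T|)$ by computing the summations incrementally.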
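The construction in the proof of Lemma 1 can be sketched in a few lines of Python. This is a minimal illustration under two assumptions that are ours rather than the paper's: processors are indexed from 0, and the two cases are distinguished by comparing the floors of the running sums divided by $D$. The name `wrap_around_schedule` and the tuple layout are hypothetical.

```python
from typing import List, Tuple

# (task index, processor index, start time, end time)
Piece = Tuple[int, int, float, float]


def wrap_around_schedule(times: List[float], D: float, M: int) -> List[Piece]:
    """Lay tasks end to end on a timeline of length M * D and cut it every D units."""
    if any(t <= 0 or t > D for t in times):
        raise ValueError("each execution time must lie in (0, D]")
    if sum(times) > M * D:
        raise ValueError("total execution time exceeds M * D")

    pieces: List[Piece] = []
    start = 0.0                               # running sum before the current task
    for task, t in enumerate(times):
        end = start + t                       # running sum including the current task
        p0, p1 = int(start // D), int(end // D)
        if p0 == p1:                          # Case 1: the task fits on one processor
            pieces.append((task, p0, start - p0 * D, end - p0 * D))
        else:                                 # Case 2: the task wraps to the next processor
            pieces.append((task, p0, start - p0 * D, D))
            if end - p1 * D > 0:              # skip an empty second piece
                pieces.append((task, p1, 0.0, end - p1 * D))
        start = end
    return pieces
```

Because every $t_i$ is at most $D$, the two pieces of a split task never overlap in time, so the task is never executed on two processors simultaneously.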
Moreover, the strict convexity of the energy consumption functions of tasks does not allow any processor to be idle between time 0 and time $D$ in an optimal schedule. In other words, an optimal schedule will always have some task executing between time 0 and time $D$ on each of the $M$ processors.
Lemma 2 When $|T| > M$, there exists an optimal schedule which executes some task at any time instant between time 0 and time $D$ on each of the $M$ processors.
Proof. We prove this lemma by contradiction. Let $S_V$ be an optimal schedule that does not execute any task at some time instant between 0 and $D$ on at least one of the $M$ processors. Let $V = (t_1, t_2, \ldots, t_{|T|})$ be the assignment of execution times of $T$ for schedule $S_V$. Therefore, we know that $\sum_{\tau_i \in T} t_i < M \cdot D$. Since $|T| > M$, there must be a task $\tau_j$ whose associated $t_j$ is less than $D$ in $V$. Stretching the execution time of $\tau_j$ to $\min\{D, MD - \sum_{\tau_i \in T \setminus \{\tau_j\}} t_i\}$ results in a feasible assignment $V'$ of execution times of tasks in $T$. Because $E_j()$ is a strictly convex and decreasing function of the execution time of $\tau_j$, the energy consumption of $V'$ is less than that of $V$. By Lemma 1, there exists a feasible schedule whose energy consumption is less than that of $S_V$, which contradicts the optimality of $S_V$.
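Taking both Lemmas 1 and 2 into consideration, the MEESM problem can be formulated as a convex programming problem as follows:
$$\begin{array}{ll} \text{minimize} & \sum_{\tau_i \in T} E_i(t_i) \\ \text{subject to} & \sum_{\tau_i \in T} t_i = M \cdot D \ \text{ and } \ 0 < t_i \le D, \ \forall \tau_i \in T. \end{array} \qquad (2)$$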
For the rest of this section, we will show that the optimal assignment of execution times of tasks in $T$ described by Equation (2) can be determined in $O(|T| \log |T|)$ time by applying the Karush-Kuhn-Tucker optimality condition [14, §14] and a binary search strategy. After the optimal assignment of execution times of tasks in $T$ is determined, Lemma 1 is applied to derive an optimal schedule for the MEESM problem.
In the following, we first obtain an optimal solution by ignoring the condition $t_i > 0$. After that, we show that $t_i > 0$ is satisfied for every task $\tau_i$ in $T$ for the solution. To apply the Karush-Kuhn-Tucker optimality condition for concave programming, Equation (2) is reformulated as a concave programming problem as follows:
$$\begin{array}{ll} \text{maximize} & \sum_{\tau_i \in T} \bar{E}_i(t_i) \\ \text{subject to} & \sum_{\tau_i \in T} t_i = M \cdot D \ \text{ and } \ t_i \le D, \ \forall \tau_i \in T, \end{array} \qquad (3)$$
where $\bar{E}_i(t_i)$ is defined as $-E_i(t_i)$. The Karush-Kuhn-Tucker optimality condition for Equation (3) is to find a vector $(\lambda_1, \lambda_2, \ldots, \lambda_{|T|})$, a vector $(t_1^*, t_2^*, \ldots, t_{|T|}^*)$, and a constant $\lambda$ such that
$$\begin{array}{l} \bar{E}_i'(t_i^*) - \lambda_i = \lambda, \quad t_i^* \le D, \quad (t_i^* - D)\lambda_i = 0, \quad \lambda_i \ge 0, \quad \forall \tau_i \in T, \\ \text{and } \sum_{\tau_i \in T} t_i^* = MD, \end{array} \qquad (4)$$
where $\bar{E}_i'()$ is the derivative of $\bar{E}_i()$. Since $\sum_{\tau_i \in T} \bar{E}_i(t_i)$ is a concave function, and $\{t_j\}$ is a quasiconvex set for every $\tau_j \in T$, setting $(t_1, t_2, \ldots, t_{|T|})$ as $(t_1^*, t_2^*, \ldots, t_{|T|}^*)$ is an optimal solution for the concave programming in Equation (3) [14, §14]. We will show that such a vector $(t_1^*, t_2^*, \ldots, t_{|T|}^*)$ could be determined by the Lagrange multiplier technique. Let $\ell$ be an index, where $0 \le \ell < M$. If the execution time of $\tau_i$ is set as $D$ for $i = 1, 2, \ldots, \ell$, the concave programming in Equation (3) could be rephrased as
$$\begin{array}{ll} \text{maximize} & \sum_{i=\ell+1}^{|T|} \bar{E}_i(t_i) \\ \text{subject to} & \sum_{i=\ell+1}^{|T|} t_i = (M - \ell) \cdot D, \end{array} \qquad (5)$$
by further ignoring the inequalities $t_i \le D$ for $i = \ell+1, \ldots, |T|$.
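Equation (5) could be solved by applying the Lagrange multiplier technique. Since $\bar{E}_i'(t_i) = (\alpha - 1) h_i c_i^{\alpha} t_i^{-\alpha}$, given an index $\ell$, the conditions $\bar{E}_i'(t_i) = \bar{E}_j'(t_j)$ for all $\ell < i, j \le |T|$ for the Lagrange multiplier technique lead to
$$\frac{t_i^{\alpha}}{t_j^{\alpha}} = \frac{(\alpha-1) h_i c_i^{\alpha}}{(\alpha-1) h_j c_j^{\alpha}}, \ \forall \ell < i, j \le |T|, \qquad \sum_{j=\ell+1}^{|T|} t_j = (M - \ell) \cdot D.$$
Therefore, the optimal solution for Equation (5) is to assign $(t_{\ell+1}, t_{\ell+2}, \ldots, t_{|T|})$ as $(t_{\ell+1}^*, t_{\ell+2}^*, \ldots, t_{|T|}^*)$, where
$$\sum_{j=\ell+1}^{|T|} t_{\ell+1}^* \frac{c_j}{c_{\ell+1}} \left(\frac{h_j}{h_{\ell+1}}\right)^{1/\alpha} = (M - \ell) D \qquad (6a)$$
$$t_j^* = t_{\ell+1}^* \frac{c_j}{c_{\ell+1}} \left(\frac{h_j}{h_{\ell+1}}\right)^{1/\alpha}, \ \forall \ell + 1 < j \le |T|, \qquad (6b)$$
and the Lagrange multiplier $\lambda^*$ is $\bar{E}_{\ell+1}'(t_{\ell+1}^*)$. As a result, the time complexity to derive the optimal solution of Equation (5) for an index $\ell$ is $O(|T| - \ell)$.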
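The closed form in Equation (6) can be evaluated directly for a given index. The sketch below is a minimal illustration, assuming tasks are supplied as `(c, h)` pairs already in the order used by the text; the function name `solve_equation_6` and its signature are hypothetical.

```python
from typing import List, Sequence, Tuple


def solve_equation_6(tasks: Sequence[Tuple[float, float]], ell: int,
                     M: int, D: float, alpha: float = 3.0) -> Tuple[List[float], float]:
    """Solve Equation (6) for a given index ell.

    tasks[j] = (c_j, h_j). The first `ell` tasks are assumed fixed at execution
    time D; the remaining ones share (M - ell) * D. Returns the execution times
    of the remaining tasks and the Lagrange multiplier.
    """
    if not 0 <= ell < M or ell >= len(tasks):
        raise ValueError("need 0 <= ell < M and at least one unfixed task")
    rest = tasks[ell:]
    c_ref, h_ref = rest[0]                    # the task with index ell + 1 in the text
    # Coefficients from (6b): t_j = t_ref * (c_j / c_ref) * (h_j / h_ref)**(1 / alpha)
    coeff = [(c / c_ref) * (h / h_ref) ** (1.0 / alpha) for c, h in rest]
    t_ref = (M - ell) * D / sum(coeff)        # Equation (6a)
    times = [t_ref * k for k in coeff]
    # Multiplier: derivative of -E at t_ref for the reference task
    lam = (alpha - 1.0) * h_ref * c_ref ** alpha / t_ref ** alpha
    return times, lam
```

For instance, with tasks `[(8, 1), (1, 1), (1, 1)]`, `M = 2`, and `D = 10`, setting `ell = 0` yields an execution time of 16 for the first task, which exceeds $D$ and signals that a larger index is needed; `ell = 1` yields 5 for each of the two remaining tasks.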
It is clear that every $t_j^*$ in $(t_{\ell+1}^*, t_{\ell+2}^*, \ldots, t_{|T|}^*)$ derived from Equation (6) is greater than 0. Therefore, if each $t_j^*$ in $(t_{\ell+1}^*, t_{\ell+2}^*, \ldots, t_{|T|}^*)$ is no greater than $D$ when $\ell = 0$, then assigning $t_i$ as $t_i^*$ for $\tau_i$ in $T$ is an assignment of execution times with the minimum energy consumption. Hence, we only have to consider the other case. For the rest of this section, let $T$ be sorted in a non-decreasing order of $E_i'(D)$, where $E_i'()$ is the derivative of the energy consumption function $E_i()$. The following lemma helps to construct an assignment of execution times of tasks in $T$ with the minimum energy consumption.
Lemma 3 Suppose that every $t_j^*$ in $(t_{\ell^*+1}^*, t_{\ell^*+2}^*, \ldots, t_{|T|}^*)$ derived from Equation (6) is less than $D$ for an index $\ell^*$, and $\bar{E}_{\ell^*}'(D)$ is no less than $\bar{E}_{\ell^*+1}'(t_{\ell^*+1}^*)$, where $1 \le \ell^* < M$. The assignment of $t_i$ as $D$, for $i = 1, 2, \ldots, \ell^*$, and $t_j$ as $t_j^*$, for $j = \ell^*+1, \ell^*+2, \ldots, |T|$, yields an assignment of task execution times with the minimum energy consumption.
Proof. We prove this lemma by showing that all of the conditions in Equation (4) hold. It is clear that such an assignment satisfies $\sum_{\tau_i \in T} t_i^* = MD$ and $0 < t_i^* \le D$ for all $\tau_i$ in $T$. Let $\lambda$ be $\bar{E}_{\ell^*+1}'(t_{\ell^*+1}^*)$. For $j = \ell^*+1, \ell^*+2, \ldots, |T|$, let $\lambda_j$ be 0. For $i = 1, 2, \ldots, \ell^*$, let $\lambda_i$ be $\bar{E}_i'(D) - \bar{E}_{\ell^*+1}'(t_{\ell^*+1}^*)$. As a result, the equality $\bar{E}_i'(t_i^*) - \lambda_i = \lambda$ holds for every task $\tau_i$ in $T$. Since $T$ is sorted in a non-decreasing order of $E_i'(D)$, and $\bar{E}_i()$ is defined as $-E_i()$, we know that $\bar{E}_i'(D) \ge \bar{E}_j'(D)$ when $i < j$. Because of the condition $\bar{E}_{\ell^*}'(D) \ge \bar{E}_{\ell^*+1}'(t_{\ell^*+1}^*)$, we have
$$\bar{E}_i'(D) \ge \bar{E}_{\ell^*}'(D) \ge \bar{E}_{\ell^*+1}'(t_{\ell^*+1}^*),$$
for any $i = 1, 2, \ldots, \ell^*$. As a result, $\lambda_i = \bar{E}_i'(D) - \bar{E}_{\ell^*+1}'(t_{\ell^*+1}^*) \ge 0$ for $i = 1, 2, \ldots, \ell^*$. It is clear that all of the conditions in Equation (4) hold.
Algorithm 1 : BIN
Input: $(T, D, M)$;
Output: An optimal schedule for the MEESM problem;
1: if $|T| \le M$ then
2:   return the schedule by executing each task $\tau_i$ in $T$ at the speed $c_i/D$ on the $i$-th processor from time 0 to $D$;
3: sort $T$ in a non-decreasing order of $E_i'(D)$;
4: left ← 0 and right ← $M$;
5: while left < right − 1 do
6:   $\hat{\ell}$ ← $\lfloor$(left + right)/2$\rfloor$;
7:   apply Equation (6) by setting $\ell$ as $\hat{\ell}$;
8:   if $\lambda > \bar{E}_{\hat{\ell}}'(D)$ then
9:     right ← $\lfloor$(left + right)/2$\rfloor$;
10:   else
11:     left ← $\lfloor$(left + right)/2$\rfloor$;
12: $\ell^*$ ← left;
13: $V$ ← $(t_1, t_2, \ldots, t_{|T|})$ by setting $t_i$ as $D$ for $i = 1, 2, \ldots, \ell^*$ and $t_j$ as $t_j^*$ derived from Equation (6) for $j = \ell^*+1, \ell^*+2, \ldots, |T|$;
14: return the schedule by applying Lemma 1 on $V$;
By Lemma 3, an assignment of execution times with the minimum energy consumption for the MEESM problem can be derived in $O(M|T| + |T| \log |T|)$ time by setting $\ell$ from 0 to $M - 1$ sequentially. Moreover, the following lemma helps reduce the time complexity to $O(|T| \log |T|)$ by a binary search on the setting of $\ell$.
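Lemma 4 Suppose that every $t_j^*$ in $(t_{\ell^*+1}^*, t_{\ell^*+2}^*, \ldots, t_{|T|}^*)$ derived from Equation (6) is less than $D$ for an index $\ell^*$, and $\bar{E}_{\ell^*}'(D)$ is no less than $\bar{E}_{\ell^*+1}'(t_{\ell^*+1}^*)$, where $1 \le \ell^* < M$. If $\ell^* < \hat{\ell} < M$, the Lagrange multiplier for Equation (5) by setting $\ell$ as $\hat{\ell}$ is strictly greater than $\bar{E}_{\hat{\ell}}'(D)$. If $\hat{\ell} \le \ell^*$, the Lagrange multiplier for Equation (5) by setting $\ell$ as $\hat{\ell}$ is no greater than $\bar{E}_{\hat{\ell}}'(D)$.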
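Proof. For notational brevity, let $\bar{t}_{\hat{\ell}+1}$ and $\bar{t}_{\hat{\ell}}$ be the values of $t_{\hat{\ell}+1}^*$ and $t_{\hat{\ell}}^*$ derived from Equation (6) by setting $\ell$ as $\ell^*$, respectively. We consider the case when $\ell^* < \hat{\ell} < M$. As shown in Lemma 3, we know $\bar{E}_{\hat{\ell}+1}'(\bar{t}_{\hat{\ell}+1}) = \bar{E}_{\hat{\ell}}'(\bar{t}_{\hat{\ell}})$. When $\ell$ is set as $\hat{\ell}$, one can verify that $t_{\hat{\ell}+1}^*$ derived from Equation (6) is strictly less than $\bar{t}_{\hat{\ell}+1}$. Since $\bar{E}_j'()$ is a decreasing function of the execution time for any task $\tau_j$ in $T$, we know
$$\lambda^* = \bar{E}_{\hat{\ell}+1}'(t_{\hat{\ell}+1}^*) > \bar{E}_{\hat{\ell}+1}'(\bar{t}_{\hat{\ell}+1}) = \bar{E}_{\hat{\ell}}'(\bar{t}_{\hat{\ell}}) > \bar{E}_{\hat{\ell}}'(D),$$
where $\lambda^*$ is the Lagrange multiplier for Equation (5) by setting $\ell$ as $\hat{\ell}$, and the last inequality comes from the condition $\bar{t}_{\hat{\ell}} < D$. The other case can be proved in a similar manner.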
Our proposed algorithm, denoted as Algorithm BIN (shown in Algorithm 1), adopts the binary search strategy. We conclude this section with the following theorem.
Theorem 1 Algorithm BIN can derive an optimal schedule for the MEESM problem in $O(|T| \log |T|)$ time.
Proof. It follows directly from Lemmas 1, 2, and 4.
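The binary search of Algorithm BIN can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the name `bin_execution_times` is hypothetical, tasks are assumed to be `(c, h)` pairs, and the test on the Lagrange multiplier is rewritten in terms of the weights $w_i = c_i h_i^{1/\alpha}$, a simplification derived from Equation (6) for this sketch rather than a step stated in the text.

```python
from typing import List, Sequence, Tuple


def bin_execution_times(tasks: Sequence[Tuple[float, float]], D: float, M: int,
                        alpha: float = 3.0) -> List[float]:
    """Execution times for tasks given as (c_i, h_i) pairs, in the input order."""
    n = len(tasks)
    if n <= M:
        return [D] * n                        # one task per processor, speed c_i / D

    # Weight w_i = c_i * h_i**(1/alpha). Sorting by non-decreasing E_i'(D) equals
    # sorting by non-increasing w_i, since E_i'(D) = -(alpha-1) * (w_i / D)**alpha.
    w = [c * h ** (1.0 / alpha) for c, h in tasks]
    order = sorted(range(n), key=lambda i: -w[i])
    ws = [w[i] for i in order]

    # suffix[l] = sum of ws[l:], the total weight of the tasks not fixed at D.
    suffix = [0.0] * (n + 1)
    for l in range(n - 1, -1, -1):
        suffix[l] = suffix[l + 1] + ws[l]

    def multiplier_exceeds(l: int) -> bool:
        # Equation (6) gives t_j = (M - l) * D * w_j / suffix[l] for j > l, so the
        # multiplier test reduces to suffix[l] / (M - l) > w_l (1-based index l).
        return suffix[l] / (M - l) > ws[l - 1]

    left, right = 0, M
    while left < right - 1:
        mid = (left + right) // 2
        if multiplier_exceeds(mid):
            right = mid
        else:
            left = mid
    ell = left

    times = [0.0] * n
    for pos, i in enumerate(order):
        times[i] = D if pos < ell else (M - ell) * D * ws[pos] / suffix[ell]
    return times
```

The suffix sums make each evaluation of the multiplier test constant-time after sorting, so the sort dominates the running time of this sketch. The resulting execution times can be turned into a schedule by the construction of Lemma 1.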
4
An Approximation Algorithm When Task
Migration is not Allowed
In this section, we present an approximation algorithm for the MEES problem. Since allowing task migration relaxes the MEES problem, the energy consumption of the optimal schedule for the MEESM problem is no more than that of the optimal schedule for the MEES problem for the same task set $T$ on $M$ processors. Our proposed approximation algorithm first estimates a lower bound on the minimum energy consumption for the MEES problem by applying Algorithm BIN (presented in Section 3). Then, a feasible schedule of the MEES problem is derived by referring to the optimal schedule of the MEESM problem.
For the rest of this paper, let $t_i^*$ denote the estimated execution time of task $\tau_i$ in $T$, which is defined as the execution time of $\tau_i$ in the optimal solution derived from Algorithm BIN when task migration is allowed. The estimated execution times of tasks in $T$ are then used to assign tasks onto the $M$ processors. Let $e_i^*$ be the estimated energy consumption of task $\tau_i$ when the execution time of $\tau_i$ is the estimated execution time of $\tau_i$, i.e., $e_i^* = E_i(t_i^*)$. Let $p_m$ denote the load on the $m$-th processor. The load of a processor is defined as the total amount of estimated execution time of the tasks assigned onto this processor. For notational brevity, let $T_m$ denote the set of the tasks assigned onto the $m$-th processor. Our proposed algorithm, shown in Algorithm 2 (denoted as Algorithm LEET), adopts the Largest-Estimated-Execution-Time-First strategy. That is, tasks are considered in a non-increasing order of their estimated execution times.
For notational brevity, for the rest of this paper, let $T$ be a sorted set in a non-increasing order of the estimated execution time, i.e., $t_i^* \ge t_j^*$ if $i < j$. Algorithm LEET considers the tasks in the sorted order from $\tau_1$ to $\tau_{|T|}$. Once task $\tau_i$ is considered, $\tau_i$ is assigned onto the $m$-th processor whose current load is the smallest. (For simplicity of presentation, we break ties by choosing the smallest index $m$. The following analysis still holds when ties are broken arbitrarily.) After the assignment of the tasks onto the $M$ processors is done, we have to assign the execution times of these tasks to meet the timing constraint. For every task $\tau_i$ assigned onto the $m$-th processor, the execution time of $\tau_i$ is set as $t_i^* \frac{D}{p_m}$. It is then clear that the total execution time of the tasks assigned onto each processor is exactly equal to $D$. Since the processor speed is in $[0, \infty]$, executing the tasks assigned onto each processor one after another is a feasible schedule of the MEES problem. The time complexity of Algorithm LEET is $O(|T| \log |T|)$, which is dominated by applying Algorithm BIN, the sorting of the tasks, and the procedure to find the minimum $p_m$.
Algorithm 2 : LEET
Input: $(T, D, M)$;
Output: A feasible schedule for the MEES problem;
1: let $(t_1^*, t_2^*, \ldots, t_{|T|}^*)$ be the assignment of execution times of $T$ by applying Algorithm BIN($T, D, M$);
2: sort all tasks in $T$ in a non-increasing order of their estimated execution times;
3: set $p_1, p_2, \ldots, p_M$ as 0, and $T_1, T_2, \ldots, T_M$ as $\emptyset$;
4: for $i = 1$ to $|T|$ do
5:   find the smallest $p_m$; (break ties by choosing the smallest index $m$)
6:   $T_m \leftarrow T_m \cup \{\tau_i\}$ and $p_m \leftarrow p_m + t_i^*$;
7: for $m = 1$ to $M$ do
8:   for each task $\tau_i \in T_m$ do
9:     $t_i \leftarrow t_i^* \times \frac{D}{p_m}$;
10: return the schedule $S_{LEET}$ which executes all of the tasks $\tau_i$ in $T_m$ ($1 \le m \le M$) at the speed $c_i/t_i$ on the $m$-th processor one after another;
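Steps 2 to 9 of Algorithm LEET can be sketched in Python as follows. This is a minimal illustration that takes the estimated execution times as input rather than calling Algorithm BIN; the name `leet_assign`, the 0-based processor indices, and the use of a binary heap to find the least-loaded processor are assumptions of the sketch.

```python
import heapq
from typing import List, Sequence, Tuple


def leet_assign(t_star: Sequence[float], D: float, M: int) -> Tuple[List[List[int]], List[float]]:
    """Assign tasks to M processors by largest estimated execution time first.

    Returns (task indices per processor, final execution time of every task).
    """
    order = sorted(range(len(t_star)), key=lambda i: -t_star[i])
    heap = [(0.0, m) for m in range(M)]       # (load p_m, processor index m)
    heapq.heapify(heap)
    groups: List[List[int]] = [[] for _ in range(M)]
    load = [0.0] * M

    for i in order:
        p, m = heapq.heappop(heap)            # smallest load; ties go to the smallest index
        groups[m].append(i)
        load[m] = p + t_star[i]
        heapq.heappush(heap, (load[m], m))

    times = [0.0] * len(t_star)
    for m in range(M):
        for i in groups[m]:
            times[i] = t_star[i] * D / load[m]  # rescale so the processor finishes at D
    return groups, times
```

Comparing `(load, index)` tuples in the heap reproduces the tie-breaking rule of Step 5, since equal loads are ordered by processor index.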
In the following, we analyze the approximation ratio of Algorithm LEET. For simplicity of presentation, the schedule derived from Algorithm LEET is denoted as $S_{LEET}$ for task set $T$. For notational brevity, let $T'$ be a subset of $T$, where $T'$ consists of those tasks whose estimated execution times are strictly less than $D$. That is, $T' = \{\tau_i \mid t_i^* < D, \tau_i \in T\}$. Moreover, let $\hat{T}$ be the difference set of $T'$ from $T$, i.e., $\hat{T} = T \setminus T'$. Note that the analysis in the rest of this section only focuses on the case that $|T| > M$, since Algorithm LEET is guaranteed to derive an optimal schedule for the other case. Furthermore, this also implies that $T'$ is not empty in the rest of the discussion. We will show that the approximation ratio of Algorithm LEET is $\frac{(\alpha-1)^{\alpha-1}(2^{\alpha}-1)^{\alpha}}{\alpha^{\alpha}(2^{\alpha}-2)^{\alpha-1}}$. Since the value of $\alpha$ is at most 3, and the approximation ratio is an increasing function of $\alpha$, the approximation ratio of Algorithm LEET is at most 1.412.
Before we proceed to prove the approximation ratio of Algorithm LEET, we first introduce some properties of the estimated energy consumptions and the estimated execution times of the tasks under consideration. In the following lemma, we show that the ratio of the estimated energy consumption to the estimated execution time of task $\tau_i$ is equal to that of task $\tau_j$ if both $\tau_i$ and $\tau_j$ are elements of $T'$.
Lemma 5 For any two tasks $\tau_i, \tau_j \in T'$, the ratio of the estimated energy consumption to the estimated execution time of $\tau_i$ is equal to that of $\tau_j$, i.e., $\frac{e_i^*}{t_i^*} = \frac{e_j^*}{t_j^*}$.
Proof. It follows directly from the property of the Lagrange multiplier. By the definition of the task set $T'$, $t_i^*$ and $t_j^*$ are both less than $D$. By Lemma 3 and Algorithm BIN, we have $-E_i'(t_i^*) = -E_j'(t_j^*)$. Therefore, we conclude this lemma by showing that
$$-(\alpha-1) h_i \frac{c_i^{\alpha}}{(t_i^*)^{\alpha}} = -(\alpha-1) h_j \frac{c_j^{\alpha}}{(t_j^*)^{\alpha}} \ \Rightarrow \ \frac{t_i^*}{t_j^*} = \frac{(t_j^*)^{\alpha-1} h_i c_i^{\alpha}}{(t_i^*)^{\alpha-1} h_j c_j^{\alpha}} = \frac{e_i^*}{e_j^*}.$$
If two processors are assigned only tasks in $T'$ after all of the tasks are assigned in Algorithm LEET, we show that the ratio of the loads between these two processors is at most 2.
Lemma 6 Suppose that the $m^*$-th and the $\hat{m}$-th processors are assigned with some tasks in $T'$ such that $p_{m^*}$ is the maximum and $p_{\hat{m}}$ is the minimum after all of the tasks are assigned onto processors in Algorithm LEET. Then $p_{m^*}$ is at most twice $p_{\hat{m}}$.
Proof. Since tasks are assigned onto the processor with the smallest load, and $|T| > M$, it is clear that both $p_{m^*}$ and $p_{\hat{m}}$ are greater than 0. For each task $\tau_i$ in $\hat{T}$, Algorithm LEET assigns only $\tau_i$ onto a processor. Therefore, $\sum_{\tau_i \in T'} t_i^* = (M - |\hat{T}|) D$. Since $|T'| > (M - |\hat{T}|)$, we know $p_{m^*} \ge D$ and $p_{\hat{m}} \le D$ by the pigeonhole principle. Because of the condition $t_i^* < D$ for every task $\tau_i$ in $T'$, $T_{m^*}$ consists of at least two tasks. Let the last task inserted into $T_{m^*}$ be $\tau_r$. When $\tau_r$ is considered in the first loop in Algorithm LEET (i.e., the for loop from Steps 4 to 6), there must be at least one task assigned onto the $\hat{m}$-th processor already, since $\tau_r$ is assigned onto the processor whose current load is the minimum. Let $\tau_q$ be the first task assigned onto the $\hat{m}$-th processor. Because Algorithm LEET assigns the tasks in a non-increasing order of the estimated execution times, we have $t_r^* \le t_q^* \le p_{\hat{m}}$. Furthermore, since $\tau_r$ is assigned onto the processor whose current load is the minimum, we know $p_{m^*} - t_r^* \le p_{\hat{m}}$. By combining the above inequalities, we know $p_{m^*} \le 2 p_{\hat{m}}$.
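Besides, we need the following lemma to prove the approximation ratio of Algorithm LEET.
Lemma 7 Suppose $f(x) = k \cdot (2x)^{\alpha} + (\hat{M} - k) x^{\alpha}$ for a positive number $\hat{M}$ and a non-negative number $k$, where $0 \le k \le \hat{M}$ and $2k \cdot x + (\hat{M} - k) \cdot x = \hat{M}$. Then
$$f(x) \le \frac{(\alpha-1)^{\alpha-1}(2^{\alpha}-1)^{\alpha}}{\alpha^{\alpha}(2^{\alpha}-2)^{\alpha-1}} \hat{M}.$$
Proof. Since $2k \cdot x + (\hat{M} - k) \cdot x = \hat{M}$, we know $k = \frac{\hat{M} - \hat{M} x}{x}$. Therefore,
$$f(x) = \hat{M}\left(x^{\alpha-1}(2^{\alpha}-1) + x^{\alpha}(2 - 2^{\alpha})\right),$$
and the derivative of $f(x)$ is
$$f'(x) = \hat{M}\left((\alpha-1) x^{\alpha-2}(2^{\alpha}-1) + \alpha x^{\alpha-1}(2 - 2^{\alpha})\right).$$
$f(x)$ is maximized at $x^*$ when $f'(x^*) = 0$. By solving $f'(x^*) = 0$, we have $x^* = \frac{(\alpha-1)(2^{\alpha}-1)}{\alpha(2^{\alpha}-2)}$. As a result, we conclude that
$$f(x) \le f(x^*) = \frac{(\alpha-1)^{\alpha-1}(2^{\alpha}-1)^{\alpha}}{\alpha^{\alpha}(2^{\alpha}-2)^{\alpha-1}} \hat{M}.$$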
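We conclude this section by showing that Algorithm LEET is a $\frac{(\alpha-1)^{\alpha-1}(2^{\alpha}-1)^{\alpha}}{\alpha^{\alpha}(2^{\alpha}-2)^{\alpha-1}}$-approximation algorithm for the MEES problem.
Theorem 2 The approximation ratio of Algorithm LEET is $\frac{(\alpha-1)^{\alpha-1}(2^{\alpha}-1)^{\alpha}}{\alpha^{\alpha}(2^{\alpha}-2)^{\alpha-1}}$.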
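Proof. Let $O^*$ be the energy consumption of an optimal schedule for $T$ of the MEES problem. Since the sum of the estimated energy consumption of all of the tasks in $T$ is a lower bound on $O^*$, we know that $O^* \ge \sum_{\tau_i \in T} e_i^*$. Let $\tau_r$ be some task in $T'$. By Lemma 5, we have
$$O^* \ge \sum_{\tau_i \in T} e_i^* = \frac{e_r^*}{t_r^*}(M - |\hat{T}|) D + \sum_{\tau_i \in \hat{T}} e_i^*.$$
For each task $\tau_i$ in $\hat{T}$, Algorithm LEET assigns only $\tau_i$ onto a processor. Since we break ties by choosing the smallest index $m$ in Algorithm LEET, the $i$-th processor is assigned with only task $\tau_i$, where $0 < i \le |\hat{T}|$. For the $m$-th processor, where $m > |\hat{T}|$, the energy consumption to execute task $\tau_i$ in $T_m$ is equal to $e_i^* (\frac{p_m}{D})^{\alpha-1}$ in $S_{LEET}$, and the sum of the estimated energy consumption of the tasks in $T_m$ is equal to $\frac{e_r^*}{t_r^*} p_m$ by applying Lemma 5. Therefore, we have
$$\Phi(S_{LEET}) = \sum_{\tau_i \in \hat{T}} e_i^* + \sum_{m=|\hat{T}|+1}^{M} \frac{e_r^*}{t_r^*} p_m \left(\frac{p_m}{D}\right)^{\alpha-1} = \sum_{\tau_i \in \hat{T}} e_i^* + \sum_{m=|\hat{T}|+1}^{M} \frac{e_r^*}{t_r^*} \left(\frac{p_m}{D}\right)^{\alpha} D.$$
The approximation ratio $A$ of Algorithm LEET can be phrased as
$$A = \frac{\Phi(S_{LEET})}{O^*} \le \frac{\sum_{m=|\hat{T}|+1}^{M} (\frac{p_m}{D})^{\alpha}}{M - |\hat{T}|}, \qquad (7)$$
where the inequality comes from the fact $\frac{a + b_1}{a + b_2} \le \frac{b_1}{b_2}$ when $b_1 \ge b_2 > 0$ and $a \ge 0$. It remains to show that
$$\frac{\sum_{m=|\hat{T}|+1}^{M} (\frac{p_m}{D})^{\alpha}}{M - |\hat{T}|} \le \frac{(\alpha-1)^{\alpha-1}(2^{\alpha}-1)^{\alpha}}{\alpha^{\alpha}(2^{\alpha}-2)^{\alpha-1}}. \qquad (8)$$
Suppose that the $m^*$-th and the $\hat{m}$-th processors are assigned with some tasks in $T'$ such that $p_{m^*}$ is the maximum, and $p_{\hat{m}}$ is the minimum. By Lemma 6, we have $\frac{2 p_{\hat{m}}}{D} \ge \frac{p_{m^*}}{D} \ge \frac{p_m}{D} \ge \frac{p_{\hat{m}}}{D}$, for all $|\hat{T}| < m \le M$. Besides, by the convexity of $(\frac{p_m}{D})^{\alpha}$ in $\frac{p_m}{D}$ (i.e., the second-order derivative of $(\frac{p_m}{D})^{\alpha}$ with respect to $\frac{p_m}{D}$ is non-negative when $\frac{p_m}{D} \ge 0$) and the fact $p_{\hat{m}} \ge 2 p_{\hat{m}} - p_m \ge 0$ for all $|\hat{T}| < m \le M$, we have
$$\left(\frac{p_m}{D}\right)^{\alpha} \le \frac{2 p_{\hat{m}} - p_m}{p_{\hat{m}}} \left(\frac{p_{\hat{m}}}{D}\right)^{\alpha} + \left(1 - \frac{2 p_{\hat{m}} - p_m}{p_{\hat{m}}}\right) \left(\frac{2 p_{\hat{m}}}{D}\right)^{\alpha},$$
since $\frac{2 p_{\hat{m}} - p_m}{p_{\hat{m}}} (\frac{p_{\hat{m}}}{D}) + (1 - \frac{2 p_{\hat{m}} - p_m}{p_{\hat{m}}}) (\frac{2 p_{\hat{m}}}{D})$ is equal to $\frac{p_m}{D}$. Therefore,
$$\sum_{m=|\hat{T}|+1}^{M} \left(\frac{p_m}{D}\right)^{\alpha} \le \sum_{m=|\hat{T}|+1}^{M} \left[\frac{2 p_{\hat{m}} - p_m}{p_{\hat{m}}} \left(\frac{p_{\hat{m}}}{D}\right)^{\alpha} + \left(1 - \frac{2 p_{\hat{m}} - p_m}{p_{\hat{m}}}\right) \left(\frac{2 p_{\hat{m}}}{D}\right)^{\alpha}\right] = k \cdot \left(\frac{2 p_{\hat{m}}}{D}\right)^{\alpha} + (M - |\hat{T}| - k) \left(\frac{p_{\hat{m}}}{D}\right)^{\alpha},$$
where $2k \frac{p_{\hat{m}}}{D} + (M - |\hat{T}| - k) \frac{p_{\hat{m}}}{D} = (M - |\hat{T}|)$, i.e., $k = \sum_{m=|\hat{T}|+1}^{M} (1 - \frac{2 p_{\hat{m}} - p_m}{p_{\hat{m}}})$. By applying Lemma 7 with the setting of $\hat{M}$ as $M - |\hat{T}|$ and $x$ as $\frac{p_{\hat{m}}}{D}$, we reach the conclusion.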
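Corollary 1 The approximation ratio of Algorithm LEET is 1.412.
Proof. Since $\alpha$ is no greater than 3, and $\frac{(\alpha-1)^{\alpha-1}(2^{\alpha}-1)^{\alpha}}{\alpha^{\alpha}(2^{\alpha}-2)^{\alpha-1}}$ is an increasing function of $\alpha$, the approximation ratio is bounded by its value at $\alpha = 3$, which is at most 1.412.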
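The bound of Theorem 2 is easy to evaluate numerically. The following Python sketch computes it for a given $\alpha$; the function name `leet_ratio_bound` is hypothetical.

```python
def leet_ratio_bound(alpha: float) -> float:
    """(alpha-1)^(alpha-1) (2^alpha - 1)^alpha / (alpha^alpha (2^alpha - 2)^(alpha-1))."""
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    two_a = 2.0 ** alpha
    numerator = (alpha - 1.0) ** (alpha - 1.0) * (two_a - 1.0) ** alpha
    denominator = alpha ** alpha * (two_a - 2.0) ** (alpha - 1.0)
    return numerator / denominator


if __name__ == "__main__":
    for a in (2.0, 2.5, 3.0):
        print(a, round(leet_ratio_bound(a), 4))
```

At $\alpha = 3$ the expression equals $\frac{4 \cdot 343}{27 \cdot 36} = \frac{1372}{972} \approx 1.4115$, and at $\alpha = 2$ it equals $\frac{9}{8} = 1.125$, which is consistent with the bound being increasing in $\alpha$ over the range considered.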
5
Performance Evaluation
In this section, we evaluate the energy consumption of Algorithm LEET. Another algorithm, denoted as Algorithm RAND, which is very similar to Algorithm LEET, was simulated for comparison. The only difference between Algorithm RAND and Algorithm LEET is that tasks are not sorted before the assignment procedure in Algorithm RAND.
Workload Parameters and Performance Metrics The common deadlineD of the tasks in a task set was set as 100 units of time in the simulations. For each task τi inT, τi was characterized by two different parameters: the number of execution CPU cyclesci and the coefficienthiof the power consumption function of τi. ci was generated uniformly in the range (0, D]. hiwas uniformly distributed in the range of 2 and 10. The exponent of the power consumption functions of the processor speeds was set as 3, i.e., Pi(s) = his3. We simulated two cases for different numbers of processors with different numbers of tasks. For the first case, we evaluated the algorithms for the effects on the ratio of the number of tasks to the number of processors. For a given ratioη of the number of tasks to the number of processors, the number of processors
M was an integral random variable between 10 and 30, and
[Figure 1. (a) and (b): maximum and average relative energy consumption ratios, respectively, of Algorithms LEET and RAND when $\alpha = 3$. Horizontal axis: ratio of the number of tasks to the number of processors.]
[Figure 2. (a) and (b): maximum and average relative energy consumption ratios, respectively, of Algorithms LEET and RAND when $\alpha = 3$. Axes: number of processors and number of tasks.]
the number of tasks was set to the floor of the product of $\eta$ and $M$, i.e., $\lfloor \eta \cdot M \rfloor$. In the other case, the number of processors ranged from 2 to 20, and the task-set size ranged from 21 to 60. The results for each parameter configuration were obtained from 512 independent experiments.
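The workload of the first case can be sketched as follows. This is an illustrative generator under the stated parameters, not the simulator used for the experiments; the function name `generate_task_set`, the tuple layout `(c, h)`, and the use of Python's `random` module are assumptions.

```python
import math
import random


def generate_task_set(eta, rng, deadline=100.0):
    """Draw one task set for a given task-to-processor ratio eta.

    Returns (M, tasks), where each task is a pair (c, h):
    c = execution cycles, h = coefficient of the power function h * s**3.
    """
    num_processors = rng.randint(10, 30)          # integer in [10, 30]
    num_tasks = math.floor(eta * num_processors)  # floor(eta * M)
    tasks = []
    for _ in range(num_tasks):
        # rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1].
        cycles = deadline * (1.0 - rng.random())  # uniform in (0, D]
        coeff = rng.uniform(2.0, 10.0)            # uniform between 2 and 10
        tasks.append((cycles, coeff))
    return num_processors, tasks


rng = random.Random(2024)  # fixed seed (an assumption) for repeatability
m, tasks = generate_task_set(2.0, rng)
print(m, len(tasks))
```

Repeating the call 512 times per value of $\eta$ would mirror the number of independent experiments per configuration described above.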
The relative energy consumption ratio was adopted as the performance metric in our experiments. The relative energy consumption ratio for an input instance was defined as the ratio of the energy consumption of the schedule derived by the algorithm to that of an optimal schedule with the allowance of task migration. As shown in Section 3, the energy consumption of an optimal schedule with the allowance of task migration can be derived in an efficient manner. Since the problem is NP-hard, the relative energy consumption ratio was intended to provide an approximate index. For the average relative energy consumption ratio, the ratios over the experiments of a configuration were averaged. For the maximum relative energy consumption ratio, the maximum value was reported.
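The metric and its two aggregates can be written down directly. The sketch below assumes the per-instance energy values are already available; the function names and the sample numbers are hypothetical and are not taken from the experiments.

```python
def relative_energy_ratios(algorithm_energy, optimal_migration_energy):
    """Per-instance ratio: energy of the algorithm's schedule divided by the
    energy of an optimal schedule that allows task migration."""
    if len(algorithm_energy) != len(optimal_migration_energy):
        raise ValueError("both lists must cover the same instances")
    if any(e <= 0 for e in optimal_migration_energy):
        raise ValueError("reference energies must be positive")
    return [a / o for a, o in zip(algorithm_energy, optimal_migration_energy)]


def summarize(ratios):
    """Return (maximum ratio, average ratio) over one parameter configuration."""
    return max(ratios), sum(ratios) / len(ratios)


# Hypothetical energies for three instances of one configuration.
ratios = relative_energy_ratios([110.0, 105.0, 100.0], [100.0, 100.0, 100.0])
maximum, average = summarize(ratios)
print(maximum, average)
```

Because the reference schedule may migrate tasks, its energy is a lower bound on what any non-migrating schedule can achieve, so every ratio is at least 1.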
Experimental Results. For the evaluation of the effect of the ratio of the number of tasks to the number of processors, Figures 1(a) and 1(b) present the maximum and average relative energy consumption ratios for the simulated algorithms. The performance of Algorithm LEET was very close to that of the optimal solutions. The maximum and average relative energy consumption ratios for Algorithm LEET were less than 1.11 and 1.01, respectively, whereas those for Algorithm RAND were less than 1.82 and 1.46, respectively. When the ratio of the number of tasks to the number of processors was small, both Algorithm LEET and Algorithm RAND might assign a task to a processor together with unsuitable tasks. Such an assignment might significantly increase the energy consumption of these tasks, while the energy consumption of the other tasks remained almost the same as that in the optimal schedule. This observation explains why the maximum relative energy consumption ratios in Figure 1(a) decreased as the ratio of the number of tasks to the number of processors increased. However, when the ratio of the number of tasks to the number of processors was small, in most cases most processors were assigned only one task, and the assignment was almost the same as that of an optimal schedule. Therefore, the average relative energy consumption ratio was relatively small when the ratio of the number of tasks to the number of processors was less than 1.5.
When the number of processors ranged from 2 to 20 and the task-set size ranged from 21 to 60, Figures 2(a) and 2(b) present the maximum and average relative energy consumption ratios, respectively, for the simulated algorithms. The performance of Algorithm LEET was again very close to the optimal solution. The maximum and average relative energy consumption ratios for Algorithm LEET were less than 1.084 and 1.01, respectively, whereas those for Algorithm RAND were less than 1.941 and 1.485, respectively. The trends of the simulation results were similar to those in Figures 1(a) and 1(b). The results indicate that Algorithm LEET could derive effective schedules for the MEES problem.
6 Conclusion
This paper targets energy-efficient scheduling problems over homogeneous processors for real-time tasks with a common deadline. Different from past work, we consider different parameter settings for the power consumption function $P_i(s)$ of the processor speed $s$ for each task $\tau_i$, i.e., $P_i(s) = h_i s^{\alpha}$, where the value of $h_i$ depends upon the power characteristics of task $\tau_i$ and $\alpha$ is a hardware-specific constant ($2 \le \alpha \le 3$). We propose an optimal algorithm for the case in which task migration is permitted, and an approximation algorithm for the case in which task migration is not allowed. We show that the approximation ratio of the proposed approximation algorithm is $\frac{(\alpha-1)^{\alpha-1}(2^{\alpha}-1)^{\alpha}}{\alpha^{\alpha}(2^{\alpha}-2)^{\alpha-1}}$. Since the value of $\alpha$ is at most 3, the approximation ratio is at most 1.412. The proposed algorithm is evaluated by a series of simulation experiments, in which it is compared against a lower bound obtained by allowing task migration. The performance of Algorithm LEET is very close to that of the optimal solutions, and the results indicate that Algorithm LEET could derive effective schedules for the MEES problem.
For future research, we will explore energy-efficient scheduling over multiple processors for periodic real-time tasks or tasks with arbitrary deadlines and arrival times.
References
[1] J. H. Anderson and S. K. Baruah. Energy-efficient synthesis of periodic task systems upon identical multiprocessor platforms. In Proceedings of the 24th International Conference on Distributed Computing Systems, pages 428–435, 2004.
[2] H. Aydin, R. Melhem, D. Mossé, and P. Mejía-Alvarez. Determining optimal processor speeds for periodic real-time tasks with different power characteristics. In Proceedings of the IEEE EuroMicro Conference on Real-Time Systems, page 225, 2001.
[3] N. Bansal, T. Kimbrel, and K. Pruhs. Dynamic speed scaling to manage energy and temperature. In Proceedings of the 2004 Symposium on Foundations of Computer Science, pages 520–529, 2004.
[4] A. Chandrakasan, S. Sheng, and R. Brodersen. Low-power CMOS digital design. IEEE Journal of Solid-State Circuits, 27(4):473–484, 1992.
[5] J.-J. Chen, H.-R. Hsu, K.-H. Chuang, C.-L. Yang, and A.-C. P. T.-W. Kuo. Multiprocessor energy-efficient scheduling with task migration considerations. In EuroMicro Conference on Real-Time Systems (ECRTS’04), pages 101–108, 2004.
[6] J.-J. Chen, T.-W. Kuo, and C.-L. Yang. Profit-driven uniprocessor scheduling with energy and timing constraints. In ACM Symposium on Applied Computing, pages 834–840. ACM Press, 2004.
[7] F. Gruian. System-level design methods for low-energy architectures containing variable voltage processors. In Power-Aware Computing Systems, pages 1–12, 2000.
[8] F. Gruian and K. Kuchcinski. Lenes: Task scheduling for low energy systems using variable supply voltage processors. In Proceedings of Asia South Pacific Design Automation Conference, pages 449–455, 2001.
[9] S. Irani, S. Shukla, and R. Gupta. Algorithms for power savings. In Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 37–46. Society for Industrial and Applied Mathematics, 2003.
[10] T. Ishihara and H. Yasuura. Voltage scheduling problems for dynamically variable voltage processors. In Proceedings of the International Symposium on Low Power Electronics and Design, pages 197–202, 1998.
[11] W.-C. Kwon and T. Kim. Optimal voltage allocation techniques for dynamically variable voltage processors. In Proceedings of the 40th Design Automation Conference, pages 125–130, 2003.
[12] P. Mejía-Alvarez, E. Levner, and D. Mossé. Adaptive scheduling server for power-aware real-time tasks. ACM Transactions on Embedded Computing Systems, 3(2):284–306, 2004.
[13] R. Mishra, N. Rastogi, D. Zhu, D. Mossé, and R. Melhem. Energy aware scheduling for distributed real-time systems. In International Parallel and Distributed Processing Symposium, page 21, 2003.
[14] R. L. Rardin. Optimization in Operations Research. Prentice Hall, 1998.
[15] Y. Shin and K. Choi. Power conscious fixed priority scheduling for hard real-time systems. In Proceedings of the 36th ACM/IEEE Conference on Design Automation Conference, pages 134–139, 1999.
[16] V. V. Vazirani. Approximation Algorithms. Springer, 2001.
[17] M. Weiser, B. Welch, A. Demers, and S. Shenker. Scheduling for reduced CPU energy. In Proceedings of Symposium on Operating Systems Design and Implementation, pages 13–23, 1994.
[18] C.-Y. Yang, J.-J. Chen, and T.-W. Kuo. An approximation algorithm for energy-efficient scheduling on a chip multiprocessor. In Proceedings of the 8th Conference of Design, Automation, and Test in Europe (DATE), pages 468–473, 2005.
[19] F. Yao, A. Demers, and S. Shenker. A scheduling model for reduced CPU energy. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science, pages 374–382. IEEE, 1995.
[20] Y. Zhang, X. Hu, and D. Z. Chen. Task scheduling and voltage selection for energy minimization. In Annual ACM IEEE Design Automation Conference, pages 183–188, 2002.