Chapter 2 Preliminaries
2.3 Low Power Work-Demand Analysis (lpWDA)
The lpWDA [22], which was originally designed for real-time systems with periodic tasks only, is an efficient on-line slack estimation heuristic for RM scheduling. Its slack estimation procedure uses short-term work-demand analysis. The goal of the lpWDA is to extend the slack time available to the scheduled task by delaying the execution of lower-priority tasks in the near future as late as possible.
The slack time $slack_i(t)$ of a periodic task $T_i$ at time $t$ can be computed as $D_i - t - load_i(t)$, where $D_i$ is the deadline of $T_i$ and $load_i(t)$ is the amount of work that must be processed in $[t, D_i]$. $load_i(t)$ consists of three types of work: (1) $w_i^{rem}(t)$: the remaining WCET of $T_i$ itself at time $t$, (2) $H_i(t)$: the work from the higher-priority tasks, and (3) $L_i(t)$: the work from the lower-priority tasks. $w_i^{rem}(t)$ is a known value at each scheduling point, but the exact values of $H_i(t)$ and $L_i(t)$ would have to be computed by a complex analysis. Instead, the lpWDA computes approximate estimates $\tilde{H}_i(t) \ge H_i(t)$ and $\tilde{L}_i(t) \ge L_i(t)$, which give a safe estimate of the available slack time. Therefore, $slack_i(t)$ can be computed from an approximate value of $load_i(t)$, $\widetilde{load}_i(t) = w_i^{rem}(t) + \tilde{H}_i(t) + \tilde{L}_i(t)$.
In the lpWDA algorithm, $w^{done}$ is the amount of work done between $T_i$ being scheduled for execution and being preempted, $H_i^{past}$ of $T_i$ is the work required by uncompleted higher-priority tasks before $t$, and CalcSlackTime() is the procedure of the lpWDA used to calculate the slack time $slack_i(t)$ of $T_i$ [22].
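The slack definition above can be sketched in a few lines of Python. The function and parameter names are illustrative assumptions and are not taken from [22]. The estimates of the higher- and lower-priority work are passed in as plain numbers, because deriving them is the task of the analysis itself.

```python
def estimated_load(w_rem: float, h_est: float, l_est: float) -> float:
    """Approximate load: own remaining WCET plus the estimated
    higher-priority and lower-priority work."""
    return w_rem + h_est + l_est


def slack_time(deadline: float, t: float, load: float) -> float:
    """slack_i(t) = D_i - t - load_i(t), as defined in the text."""
    return deadline - t - load
```

For a task with deadline 6, a remaining WCET of 1, no higher-priority work, and an estimated lower-priority work of 1, the sketch gives a load of 2 and a slack of 4 at t = 0.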
For instance, Table 1 lists two periodic tasks whose periods are 6 and 8 and whose WCETs are 1 and 2 time units, respectively. Fig. 2 (a) shows the non-DVS scheme, and Fig. 2 (b) shows the load estimation of each task at t = 0. When the first instance of T1, T1,1, is scheduled for execution, slack1(0) has to be calculated. The analysis scope of the lpWDA is [0, 8], according to the latest upcoming deadline, and $\tilde{H}_1(0) = 0$ and $\tilde{H}_2(0) = 2$ can be derived. Now $w_1^{rem}(0)$ and $\tilde{H}_1(0)$ are known values, but $\tilde{L}_1(0)$ has to be determined from T2, which has the earliest upcoming deadline among the tasks whose priorities are lower than that of T1. According to the lpWDA, $\tilde{L}_1(0) = 1$, so that $\widetilde{load}_1(0) = 1 + 0 + 1 = 2$ and $slack_1(0) = 6 - 0 - 2 = 4$.
Therefore, the available execution time A1(0) for T1 can be estimated as
$A_1(0) = \max(0, slack_1(0)) + w_1^{rem}(0) = 5$
and the clock speed can be adjusted to
$S_{clk} = (w_1^{rem}(0) / A_1(0)) \times S_{max} = 1/5 \times S_{max}$
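The two formulas above translate directly into code. The following is a minimal sketch with hypothetical function names. It assumes a remaining WCET greater than zero and a speed expressed as a fraction of the maximum clock speed.

```python
def available_time(slack: float, w_rem: float) -> float:
    """A(t) = max(0, slack(t)) + w_rem(t)."""
    return max(0.0, slack) + w_rem


def scaled_clock_speed(w_rem: float, slack: float, s_max: float) -> float:
    """S_clk = (w_rem / A) * S_max; w_rem must be positive."""
    if w_rem <= 0:
        raise ValueError("w_rem must be positive")
    return w_rem / available_time(slack, w_rem) * s_max
```

With the values of the example (a slack of 4 and a remaining WCET of 1), the available time is 5 and the speed is one fifth of the maximum. A negative slack is clamped to zero, so the task then runs at the maximum speed.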
lpWDA Algorithm:
IF system start THEN
    FOR each task Tx
        IF Tx is completed or preempted THEN
            Call UpdateLoadInfo()
        END IF
        IF Tx is scheduled for execution THEN
            Call CalcSlackTime() to get slack time slackx(t)
            Set the clock frequency and voltage accordingly.
        END IF
END IF

Function CalcSlackTime()
    Identify the task Ty that has the earliest upcoming deadline among tasks whose priorities are not higher than that of Tx
    Ly(t) = CalcLowerPriorityWork(Ty)
    loady(t) = wremy(t) + Hy(t) + Ly(t)
    slackx(t) = max(0, udy - t - loady(t))
    return (slackx(t))

Function CalcLowerPriorityWork()
    IF Ty is identical to Tn THEN
        return 0
    END IF
    Identify the task Tz that has the earliest upcoming deadline among tasks whose priorities are lower than that of Ty
    Lz(t) = CalcLowerPriorityWork(Tz)
    loadz(t) = wremz(t) + Hz(t) + Lz(t)
Fig. 1. lpWDA algorithm [22]
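The CalcSlackTime() step of Fig. 1 can be sketched as follows. This is a hedged illustration, not the implementation of [22]: the TaskState type and its field names are assumptions, ties between equal deadlines are broken by task index (also an assumption), and the lower-priority work is supplied by the caller as a function because the listing of CalcLowerPriorityWork() above is incomplete.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class TaskState:
    index: int                # 1 = highest priority (RM order)
    upcoming_deadline: float  # ud in Fig. 1
    w_rem: float              # remaining WCET
    h_est: float              # estimated higher-priority work


def calc_slack_time(tasks: Sequence[TaskState], x: int, t: float,
                    lower_priority_work: Callable[[TaskState], float]) -> float:
    """Slack of task Tx at time t, following CalcSlackTime() in Fig. 1."""
    # Tasks whose priority is not higher than that of Tx.
    candidates = [task for task in tasks if task.index >= x]
    # Ty: earliest upcoming deadline (ties broken by index, an assumption).
    ty = min(candidates, key=lambda task: (task.upcoming_deadline, task.index))
    load_y = ty.w_rem + ty.h_est + lower_priority_work(ty)
    return max(0.0, ty.upcoming_deadline - t - load_y)
```

For the task set of Table 1 at t = 0, passing a lower-priority work of 1 for T1 yields a slack of 4, which matches the available execution time of 5 given in the text.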
Table 1. Example task set
Task   WCET(ms)   Period(ms)
T1     1          6
T2     2          8
Fig. 2. Short-term work-demand analysis example: (a) shows the non-DVS scheme before using lpWDA. (b) shows the load estimation of each task at t = 0 using lpWDA. (The gray boxes represent the amount of work from the higher or same priority tasks, and the black boxes represent the amount of work from the lower priority tasks.)
Chapter 3
Related Work
3.1 Existing DVS Algorithms for Mixed Workload Real-Time Systems
Several researchers have recently proposed DVS algorithms for mixed workload real-time systems. Under the EDF (Earliest Deadline First) or EDF* scheduling policy, most of these algorithms integrate bandwidth preserving servers with a priority-based slack stealing strategy.
Doh et al. [23] proposed an approach that allocates energy budgets to hard periodic and soft aperiodic real-time tasks. Given an energy budget, it computes a voltage setting that improves the performance of aperiodic tasks while meeting the deadline requirements of periodic tasks. It uses the Total Bandwidth Server (TBS) [12], which is a kind of bandwidth preserving server, and focuses only on the off-line static scheduling problem.
Aydin et al. proposed three separate on-line schemes for mixed workloads under a power consumption constraint. They also use the TBS, together with the Dynamic Reclaiming Algorithm (DRA) [20], under the EDF* scheduling policy. In the Basic Reclaiming Scheme (BRS) [24], the earliness of aperiodic tasks can be reclaimed only by upcoming aperiodic tasks, and the earliness of periodic tasks can be reclaimed only by upcoming periodic tasks. The Mutual Reclaiming Scheme (MRS) [24] was developed from the BRS. The main difference is that in the MRS periodic and aperiodic tasks can mutually reclaim their unused computation times. The Bandwidth Sharing Scheme (BSS) [24] addresses the case in which the actual aperiodic workload is lower than the predicted aperiodic workload. In the BSS, when the TBS is idle, the algorithm creates a ghost job J to produce more earliness, so that the clock speed can be reduced aggressively. However, this increases the response time of aperiodic tasks if an actual aperiodic task arrives right after the ghost job is created.
In [26], Shin et al. merged the TBS and two DVS algorithms, lppsEDF [9] and DRA, respectively, under the EDF* scheduling policy. They also proposed an enhanced approach called Workload-based Slack Estimation (WSE) [8], which integrates Constant Bandwidth Servers (CBS) [16] and DRA. The WSE is almost the same as the MRS (as indicated in [8]).
All of the above approaches focus on the EDF (or EDF*) scheduling policy. Few existing DVS algorithms for mixed workload real-time systems have been proposed under the RM scheduling policy, even though the RM scheduling policy has been adopted in most real-time schedulers of practical interest due to its low overhead and predictability [25].
Under the RM scheduling policy, most of these algorithms integrate bandwidth preserving servers with the stretching-to-NTA strategy. The Stretching-To-NRT (SNRT) [26] scheme, proposed by Shin et al., is the first DVS algorithm for mixed workload real-time systems under the RM scheduling policy. The SNRT is a modified stretching-to-NTA algorithm that works with the Deferrable Server (DS) and the Sporadic Server (SS). Because the arrival time of an aperiodic task is unknown, the stretching rules are applied only when the budget of the bandwidth preserving server is exhausted. As a result, the SNRT is inefficient when the workload of aperiodic tasks is small and the budget remains larger than 0. They also proposed the Bandwidth-Based Slack-Stealing (BSS) [8] scheme, which considers the bandwidth of a scheduling server and identifies the maximum available time (MAT) for a periodic task. In the remainder of this thesis, BSS refers to this scheme rather than to the Bandwidth Sharing Scheme of [24]. Even when the execution budget is larger than 0, the MAT can be calculated before the arrival time of the next periodic task. Therefore, the clock speed of the periodic task can be slowed down based on the MAT.
One problem of the existing approaches is that they calculate the MAT only from the scheduling point to the NTA. To obtain a longer MAT, in this thesis we propose an on-line DVS algorithm named the Work-demand-based Slack-Stealing (WSS) scheme for mixed workload real-time systems under the RM scheduling policy. The WSS integrates bandwidth preserving servers with the lpWDA [22] to compute the MAT by delaying the execution of lower-priority periodic tasks. As a result, the MAT of higher-priority periodic tasks can be extended and the overall energy consumption can be reduced. Another problem of the existing approaches is that the highest-priority periodic task may still run slowly even when aperiodic tasks are waiting in the ready queue and no execution budget is left. The WSS uses the concept of slack stealing, which was originally used in mixed workload real-time systems without DVS, to service aperiodic tasks in this situation.
3.2 Comparison of Existing Inter-Task DVS Algorithms for Mixed Workload Real-time Systems
Table 2 gives a qualitative comparison of several existing inter-task DVS algorithms for mixed workload real-time systems, along with the proposed WSS. The scheduling policy indicates whether a DVS algorithm uses the RM or the EDF scheduling policy, which are the two most popular real-time scheduling policies. EDF* is almost the same as EDF. The difference is that, in EDF*, among the tasks whose deadlines are the same, the task with the earliest arrival time has the highest priority, and among the tasks whose deadlines and arrival times are both the same, the task with the lowest index has the highest priority. The scaling decision indicates whether the clock speed is calculated on-line or off-line. The on-line DVS strategy indicates which of the on-line DVS strategies described in Chapter 2 the algorithm belongs to. The “bandwidth-preserving servers” entry indicates which bandwidth-preserving servers the algorithm uses to schedule aperiodic tasks. The energy consumption metric is the CPU energy consumption of each DVS algorithm. The response time metric is the time interval between the arrival and the completion of an aperiodic task. We compare the proposed WSS quantitatively with the non-DVS scheme, SNRT, and BSS in Chapter 5.
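The EDF* tie-breaking rule described above amounts to a lexicographic ordering on deadline, arrival time, and task index. A minimal sketch, in which the Job type and its field names are illustrative assumptions:

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Job:
    index: int
    arrival: float
    deadline: float


def edf_star_order(jobs: Sequence[Job]) -> List[Job]:
    """Highest priority first: earliest deadline, then earliest
    arrival time, then lowest index."""
    return sorted(jobs, key=lambda j: (j.deadline, j.arrival, j.index))
```

Plain EDF leaves the order of jobs with equal deadlines unspecified, whereas this key makes the order fully deterministic.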
Table 2. Qualitative comparison of existing on-line DVS algorithms for mixed workload real-time systems.
Chapter 4
Proposed WSS Algorithm
4.1 System Model, Assumptions and Notations
The target processor can change its supply voltage (V) and clock speed (Sclk) (or frequency) continuously within its operational ranges, [Vmin, Vmax] and [Smin, Smax]. A mixed workload real-time system has two components: a set T = {T1, …, Tn} of n periodic tasks with hard deadlines, and a set J of aperiodic tasks that arrive randomly and have soft deadlines.
Ji is the ith aperiodic task. Ti has a higher priority than Tj if i < j. A periodic task Ti is specified as Ti(Pi, Wi), where Pi and Wi are the period and the WCET of Ti. As in related work [8][12][14][15][16][24][26], the arrival time, period, and WCET of periodic tasks are known in advance, but those of aperiodic tasks become available only when they arrive. The relative deadline (Di) of each periodic task instance is assumed to be equal to its period. All tasks are assumed to be independent.
4.2 Basic Idea
The basic idea of the proposed WSS (Work-demand-based Slack-Stealing) is to compute the slack time in the presence of bandwidth preserving servers by using short-term work-demand analysis. Moreover, the WSS uses two modes of operation to reduce the response time of aperiodic tasks, and it uses the concept of slack stealing to service aperiodic tasks. If aperiodic tasks are waiting in the ready queue, no execution budget is left, and the slack time is larger than 0, the WSS services the aperiodic tasks first, without causing any deadline miss of periodic tasks.
4.3 Two Modes of the WSS
There are two modes of operation in the WSS: power saving mode and non-power saving mode. When the execution budget is exhausted and there are still some aperiodic tasks in the ready queue, the mode is set to the non-power saving mode. When no aperiodic task is in the ready queue or the execution budget is replenished, the mode is switched back to the power saving mode.
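The mode rule can be expressed as a function of the current server state. The sketch below assumes that a positive budget is equivalent to the budget having been replenished, which allows the rule to be written without remembering the previous mode. The names are illustrative.

```python
from enum import Enum


class Mode(Enum):
    POWER_SAVING = "power saving"
    NON_POWER_SAVING = "non-power saving"


def select_mode(budget: float, pending_aperiodic: int) -> Mode:
    """Non-power saving mode only while the execution budget is
    exhausted and aperiodic tasks are still waiting."""
    if budget <= 0 and pending_aperiodic > 0:
        return Mode.NON_POWER_SAVING
    return Mode.POWER_SAVING
```

A scheduler would call select_mode at every event that changes the budget or the aperiodic ready queue, such as an arrival, a completion, or a replenishment.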
4.4 The Stretching Rules in Power Saving Mode
In real-time systems with mixed workloads, the slack time can still be computed by treating the bandwidth preserving server as an additional periodic task TBPS. The WCET of TBPS is Qs and its period is Ts. The priority of TBPS is assigned according to Ts. However, TBPS differs from the other periodic tasks: when there is no aperiodic task in the ready queue, TBPS is not executed even if its remaining execution budget is not zero. To be applicable to mixed workload real-time systems, the lpWDA with the DS/SS has to be modified as follows:
• When a periodic task Ti finishes, Hi(t) has to be recalculated in the lpWDA. If the bandwidth preserving server (BPS), which has the remaining execution budget qs, has a higher priority than Ti, the calculation of Hi(t) must include qs.
• When an aperiodic task is completed, it is simply treated as a preemption of TBPS.
• At the deadline DBPS of TBPS, Hi(t) of each lower-priority task Ti (for i > BPS) has to be recomputed as Hi(t) = Hi(t) - qs. This implies that TBPS is completed at DBPS.
• When an aperiodic task is scheduled for execution, no computation of slack time is needed.
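The first and third modifications in the list above adjust Hi(t) by the remaining budget qs. The following is a minimal sketch. The function names, the dictionary representation of Hi(t) keyed by task index, and the argument h_periodic (the higher-priority work recalculated from the periodic tasks and future server instances, without the current remaining budget) are assumptions made for illustration.

```python
from typing import Dict


def h_after_periodic_completion(h_periodic: float, task_index: int,
                                bps_index: int, q_rem: float) -> float:
    """When Ti finishes, include the remaining budget q_rem in H_i
    if the server has a higher priority (a smaller index) than Ti."""
    return h_periodic + (q_rem if bps_index < task_index else 0.0)


def h_at_server_deadline(h_est: Dict[int, float], bps_index: int,
                         q_rem: float) -> Dict[int, float]:
    """At D_BPS, remove the unused budget from H_i of every task
    with a lower priority than the server (i > BPS)."""
    return {i: (h - q_rem if i > bps_index else h) for i, h in h_est.items()}
```

Removing the unused budget at DBPS reflects that the server instance counts as completed at its deadline, so that work is no longer charged to the lower-priority tasks.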
Therefore, in the power saving mode, we calculate the clock speed by following two stretching rules:
• Stretching rule for a periodic task T: The clock speed of T can be calculated as follows:
• Stretching rule for an aperiodic task: If there is no periodic task in the ready queue, execute the aperiodic task at the clock speed of
) 0
where the next periodic task arrival time (NTA), the next replenishment time (R) of the bandwidth preserving server, and the deadline (DBPS) of TBPS are known in advance and t is the start time of the aperiodic task.
If there is any periodic task in the ready queue, the clock speed of aperiodic task is S0.
In this way, Hi(t) of a periodic task Ti is still an overestimate, which keeps the estimation of the available slack time safe, because the execution budget consumed by the bandwidth preserving server cannot exceed Qs within a period Ts. Fig. 3 shows the UpdateLoadInfo procedure of Fig. 1 as modified for the WSS, where Δbudget is the amount of consumed execution budget.
We give an example to illustrate the operation of the WSS. Assume there are two periodic tasks Ta(6,1) and Tb(8,2). Fig. 4 (a) shows two aperiodic tasks serviced by a DS(5,1) without DVS. At t = 1, an aperiodic task J1 arrives. Because the execution budget is larger than zero and the DS has the highest priority, J1 can be serviced immediately, and the execution budget of the DS is consumed at a rate of the clock speed per unit time. At t = 5, the execution budget is replenished. If any aperiodic task arrives between t = 2 and t = 5, it is serviced at the background priority, as described in Chapter 2. Fig. 4 (b) and (c) show how to use the short-term work-demand analysis with a DS. When a periodic task is scheduled for execution, the calculation of the slack time has to consider the execution budget of the DS. Because Ts of the DS is 5, TBPS has the highest priority. Therefore, we let TBPS = T1, Ta = T2, and Tb = T3. At t = 0, H1(0) = 0, H2(0) = 2, and H3(0) = 4, and then slack2(0) = 2 can be derived by the CalcSlackTime() of Fig. 1. The clock speed of T2 is therefore 1 / (2 + 1) × S0 = 1/3 × S0. If T2,1 completes at t = 1, H2(1) and H3(1) are recalculated: H2(1) = 2 + 1 = 3 (1 from qs) and H3(1) = 3 according to Fig. 3. At this time, an aperiodic task J1 arrives, and it completes at t = 2, as shown in Fig. 4 (d). Then H2(2) and H3(2) are both 2 according to Fig. 3.
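The higher-priority work at t = 0 in this example can be reproduced by counting how many instances of each higher-priority task are released before the first deadline of the task under analysis. This release-counting rule is an assumption that matches the numbers of the example. It is not claimed to be the exact bookkeeping of the lpWDA, which updates these values incrementally.

```python
import math
from typing import List, Sequence, Tuple


def higher_priority_work_at_zero(tasks: Sequence[Tuple[float, float]]) -> List[float]:
    """tasks: (period, WCET) pairs in priority order, highest first.
    Returns the estimated higher-priority work of each task in
    [0, D_i), with the deadline D_i equal to the period."""
    result = []
    for i, (deadline, _) in enumerate(tasks):
        work = sum(math.ceil(deadline / p) * w for p, w in tasks[:i])
        result.append(work)
    return result
```

For the server and the two periodic tasks of the example, [(5, 1), (6, 1), (8, 2)], the sketch returns 0, 2 and 4. With slack2(0) = 2 and a WCET of 1, the speed ratio is 1 / (2 + 1) = 1/3, as stated above.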
Function UpdateLoadInfo()
IF (PERIODIC TASK COMPLETION) THEN
    udx = udx + Px
ELSE IF (PERIODIC TASK PREEMPTION) THEN
    wremx = wremx - wdone
    LOOP each task Ti, i = x + 1 until i = n
        Hi(t) = Hi(t) - wdone
    END LOOP
ELSE IF(APERIODIC TASK PREEMPTION OR COMPLETION) THEN
rem
Fig. 3. The UpdateLoadInfo procedure in the WSS.
Fig. 4. An example of the work-demand-based slack-stealing scheme: (a) shows that tasks are scheduled with a DS without DVS. (b) shows the analysis scope with a DS. (c) shows the load estimation of each task at t = 0. (The gray boxes represent the amount of work from higher or same priority tasks, and the black boxes represent the amount of work from lower priority tasks.) (d) shows an aperiodic task completed at t = 2.
4.5 The Service Rules in Non-Power Saving Mode
If an aperiodic task arrives and no other aperiodic task is in the ready queue in the non-power saving mode, the WSS computes the slack time of a periodic task which has the highest priority in the ready queue and services the aperiodic task according to the slack time.
For instance, if Jy arrives at time t when there is no execution budget and no other aperiodic task is in the ready queue, slacki(t) of the periodic task Ti that has the highest priority in the ready queue has to be calculated. If slacki(t) is greater than zero, the schedulability of Ti will not be affected even if the execution of Ti is deferred by the amount of slacki(t). Hence, we set [t, min(t + slacki(t), NRT)] as a time interval in which aperiodic tasks are allowed to execute. If no aperiodic task is in the ready queue, slacki(t) can be reclaimed for Ti and the mode can be switched back to the power saving mode. In other words, in the non-power saving mode aperiodic tasks can be executed first if slacki(t) is larger than zero, and in this way the response time can be reduced when the actual workload of aperiodic tasks is larger than the execution budget. On the other hand, if slacki(t) is smaller than or equal to zero, Ti is executed at clock speed S0. If Ti completes early, Jy has an opportunity to be serviced according to the slack time of the newly scheduled periodic task.
If Jy arrives at time t without execution budget but another aperiodic task is already in the ready queue, then task Jy will be placed in a queue of pending tasks according to the FIFO (first in, first out) scheduling policy. In addition, if another higher priority task Tj (j < i) arrives during a time interval allowed to execute aperiodic tasks, this time interval has to be recalculated.
Moreover, if an aperiodic task is serviced in the non-power saving mode, the Hi(t)’s of the other lower priority tasks ( BPS < i ) need not be modified. The period of servicing aperiodic tasks looks like an idle period of the schedule in the non-power saving mode.
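The time interval allowed for aperiodic execution in the non-power saving mode can be sketched as below. The function name is hypothetical, and NRT is taken as a given input exactly as it appears in the interval [t, min(t + slacki(t), NRT)].

```python
from typing import Optional, Tuple


def aperiodic_service_window(t: float, slack: float,
                             nrt: float) -> Optional[Tuple[float, float]]:
    """Return [t, min(t + slack, NRT)] when the slack is positive.
    Return None otherwise, meaning that the periodic task runs first."""
    if slack <= 0:
        return None
    return (t, min(t + slack, nrt))
```

When a higher-priority periodic task arrives inside the window, the text requires the interval to be recalculated. In this sketch that corresponds to calling the function again with the slack of the newly arrived task.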
Chapter 5
Simulation Results and Discussion
5.1 Simulation Model
Aperiodic tasks were generated with exponentially distributed interarrival times (mean 1/λ) and service times (mean 1/µ). We used a fixed value of µ and varied λ to control the workload (ρ = λ/µ) of aperiodic tasks under a fixed utilization Up of periodic tasks [8]. Table 3 lists the three periodic tasks. Their periods are 6, 8, and 14, and their WCETs are 0.5, 1.0, and 1.283, respectively. The utilization Up of the periodic tasks is 0.3 ((0.5 / 6) + (1.0 / 8) + (1.283 / 14)) [8].
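The workload model above can be sketched as follows. This is an illustration and not the simulator used for the experiments: the function names, the finite horizon, and the fixed random seed are assumptions.

```python
import random
from typing import List, Sequence, Tuple


def periodic_utilization(tasks: Sequence[Tuple[float, float]]) -> float:
    """Sum of WCET / period over (period, WCET) pairs."""
    return sum(wcet / period for period, wcet in tasks)


def generate_aperiodic_jobs(lam: float, mu: float, horizon: float,
                            seed: int = 0) -> List[Tuple[float, float]]:
    """(arrival time, service time) pairs with exponential interarrival
    times of mean 1/lam and exponential service times of mean 1/mu."""
    rng = random.Random(seed)
    t, jobs = 0.0, []
    while True:
        t += rng.expovariate(lam)  # expovariate takes the rate, not the mean
        if t >= horizon:
            return jobs
        jobs.append((t, rng.expovariate(mu)))
```

For the task set of Table 3, periodic_utilization gives approximately 0.29998, which the text rounds to 0.3. The aperiodic workload is ρ = lam / mu.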
The actual execution time of each periodic task instance was generated by a normal distribution function in the range of [BCET, WCET], where BCET is the best-case execution time. The mean and the standard deviation were set to (WCET+BCET)/2 and (WCET-BCET)/6, respectively [24]. In the experiments, the voltage scaling overhead is assumed to be negligible in both time delay and power consumption [8]. The total amount of Us and Up must be smaller than Ulub, which is the least upper bound of schedulable utilization. Ulub is 1 with EDF scheduling and $n(2^{1/n} - 1)$ for n tasks with RM scheduling.
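Both the RM utilization bound and the execution-time model can be written in a few lines. In the sketch below, restricting the normal distribution to [BCET, WCET] is done by resampling. That is one possible reading of "in the range of" and is an assumption, as are the function names.

```python
import random


def rm_utilization_bound(n: int) -> float:
    """Least upper bound n * (2^(1/n) - 1) for n tasks under RM."""
    return n * (2 ** (1 / n) - 1)


def sample_execution_time(bcet: float, wcet: float,
                          rng: random.Random) -> float:
    """Normal sample with mean (WCET+BCET)/2 and standard deviation
    (WCET-BCET)/6, redrawn until it falls inside [BCET, WCET]."""
    mean = (wcet + bcet) / 2
    std = (wcet - bcet) / 6
    while True:
        x = rng.gauss(mean, std)
        if bcet <= x <= wcet:
            return x
```

Because the interval spans three standard deviations on either side of the mean, a sample is rejected only rarely, so the loop almost always terminates on the first draw.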
In order to evaluate the performance of the proposed algorithm, WSS, experimentally, we implemented the following existing schemes for comparison:
(1) PD scheme [8]: Aperiodic tasks were assumed to be serviced by the SS. It was also assumed that if the system is idle, it enters into the power-down mode (PD). The power consumption in the PD mode is assumed to be zero [8].
(2) Stretching-to-NRT (SNRT) scheme [25]: It was described in Chapter 3.
(3) Bandwidth-Based Slack-Stealing (BSS) scheme [8]: It was described in Chapter 3.
For all experiments, all tasks were assigned an initial clock speed S0 = (Up + Us) Sm / Ulub, where Sm is the maximum clock speed [8]. In the following, we evaluated the performance of each scheme in terms of energy consumption and response time. The energy consumption and response time are normalized to those of PD [8].
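The initial clock speed is a direct evaluation of the formula above. The sketch passes Ulub in as a parameter, because the text does not state the number of tasks n used to evaluate the RM bound in the experiments. The function name is hypothetical.

```python
def initial_clock_speed(u_p: float, u_s: float, s_max: float,
                        u_lub: float) -> float:
    """S0 = (Up + Us) * Sm / Ulub, for a total utilization below Ulub."""
    total = u_p + u_s
    if total >= u_lub:
        raise ValueError("Up + Us must be smaller than Ulub")
    return total * s_max / u_lub
```

A larger server utilization Us raises S0 for every task, since S0 grows linearly with Up + Us.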
Table 3. Periodic task set description.
Task Set (millisecond)
Task Period WCET
T1 6 0.5
T2 8 1.0
T3 14 1.283
5.2 Effects of Different Workloads of Aperiodic Tasks on Performance
BCET is assumed to be 10% of WCET, and ρ ranges from 0.05 to 0.25 (λ = 0.05 ~ 0.25 and µ = 1.0) [8]. The server utilization Us is varied from 10% to 35%, where Us is controlled by changing the value of Ts with a fixed Qs value [8]. The WSS is compared with the other three schemes under different workloads (server utilizations) of aperiodic tasks.
Fig. 5 shows the effects of different workloads (server utilizations) on (a) the normalized energy consumption, (b) the normalized response time, and (c) the normalized energy × response time product of all schemes. From the simulation results, we make the following observations:
• As the server utilization (Us) increases, the energy consumption of all schemes increases because the initial clock speed S0 increases.
• When the actual workload of aperiodic tasks is close to the server utilization, the energy consumption of the WSS is close to that of the BSS. The reason is that the slack time of the WSS is used to service aperiodic tasks when the execution budget is exhausted.
• When the workload of aperiodic tasks is close to the server utilization, the response time of all schemes increases.
• The WSS reduces the energy consumption, on average, by 58%, 22%, and 12% compared with the PD, SNRT, and BSS, respectively.
• The WSS reduces the response time, on average, by 38% and 31% compared with the SNRT and BSS, respectively. The PD has the smallest response time.