

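National Science Council, Executive Yuan: Research Project Report

A Study on Applying Slack-Time Analysis to Embedded Real-Time Systems

Research report (condensed version)

Project type: Individual

Project number: NSC 99-2221-E-146-011-

Project period: August 1, 2010 (ROC year 99) to July 31, 2011 (ROC year 100)

Executing institution: Department of Information Management, Hwa Hsia Institute of Technology

Principal investigator: 陳大仁

Co-principal investigators: 陳祐祥, 謝衛民

Project participants: Master's student, part-time assistant: 劉佑玫

Undergraduate student, part-time assistant: 林秉毅

Undergraduate student, part-time assistant: 施淳仁

Attachments: report on attending an international conference, and the published paper

Disposition: this project is open to public inquiry

October 24, 2011 (ROC year 100)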


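National Science Council, Executive Yuan: Subsidized Research Project

■ Final report

□ Interim progress report

A Study on Applying Slack-Time Analysis to Embedded Real-Time Systems

Project type: ■ Individual project

□ Integrated project

Project number: NSC 99-2221-E-146-011-

Project period: August 1, 2010 (ROC year 99) to July 31, 2011 (ROC year 100)

Executing institution and department: Department of Information Management, Hwa Hsia Institute of Technology

Principal investigator: 陳大仁

Co-principal investigators: 謝樹明, 陳祐祥, 謝衛民

Project participants: 劉佑玫, 林秉毅, 施淳仁

Report type (submitted as specified in the approved budget list): ■ Condensed report

□ Full report

In addition to the final report, the following overseas-travel report must be submitted for this project:

□ Report on an overseas business trip or training

□ Report on a business trip or training in mainland China

■ Report on attending an international academic conference

□ Overseas research report for an international cooperative research project

Disposition:

Except for controlled projects and the cases below, the report may be opened to public inquiry immediately

□ Involves patents or other intellectual property rights; open to public inquiry after □ one year □ two years

October 20, 2011 (ROC year 100)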


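National Science Council, Executive Yuan: Subsidized Research Project Report

Project title: A Study of Transition-Aware Real-Time Scheduling with On-Line Dynamic Voltage Scaling

Project number: NSC 99-2221-E-146-011-

Project period: August 1, 2010 (ROC year 99) to July 31, 2011 (ROC year 100)

Principal investigator: 陳大仁, Department of Information Management, Hwa Hsia Institute of Technology

Project participants: 謝樹明, 陳祐祥, 謝衛民, 劉佑玫, 林秉毅, 施淳仁

Department of Information Management, Hwa Hsia Institute of Technology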

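I. Abstract (Chinese)

Heavy power consumption in a computer system raises both operating cost and operating temperature, which in turn increases the probability of system failure and lowers reliability. In portable systems, limited battery capacity also shortens operating time. Reducing processor power consumption has therefore become an important research topic in computer system development. For hard real-time systems with fixed-priority tasks, this project proposes a new way to compute slack time and designs a scheduling algorithm that reduces the CPU power consumed by periodic tasks. The slack computation is built on a new concept called low-power fluid slack analysis (lpFSA). A scheduler based on this concept obtains more slack time and uses dynamic voltage scaling (DVS) to lower the supply voltage of the current job. Unlike general slack reclamation methods, lpFSA has the following properties: (1) it can cooperate with existing scheduling algorithms (lpWDA, lpLDAT, ...) to further improve energy efficiency; (2) it is independent of the existing methods, so only minor adjustment and configuration are needed for the scheduler to work; (3) its time complexity and run-time overhead are low. Experimental results show that the new method saves up to 15%-21% more power than the original lpWDA and lpLDAT.

Keywords: dynamic voltage scaling, slack time analysis, power efficiency, real-time system scheduling.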

II. Abstract

The power consumption of a computer system raises not only its cost but also its operating temperature, which increases the chance of system failure and lowers system reliability. In devices with a limited power supply, it also shortens operating time. Reducing processor power consumption has therefore become an important research topic in modern computer development. In the first part of this project, we developed a scheduling algorithm that reduces the energy consumption of hard real-time tasks with fixed priorities assigned by the rate-monotonic (RM) policy. We consider sets of independent tasks running periodically on a processor with dynamic voltage scaling (DVS). The proposed on-line approach can cooperate with many slack-time analysis methods based on low-power work demand analysis (lpWDA) without increasing the computational complexity of the DVS algorithms. It introduces a novel technique called low-power fluid slack analysis (lpFSA), which extends the analysis interval produced by its cooperative methods (i.e., the lpWDA-based methods) and computes the slack available in the extended interval. lpFSA regards the additional slack as fluid and computes its length so that it can be moved to the current job. The proposed approach thus provides the cooperative methods with additional slack. Experimental results show that the proposed approach combined with lpWDA-based algorithms achieves greater energy reductions than the initial algorithms alone.

Keywords: DVS, slack time analysis, energy efficiency, real-time scheduling


1 Introduction

In this one-year project, we focus on the study and design of energy-efficient real-time scheduling with slack time analysis. This report summarizes the major results we have achieved. In recent years, computation and communication have moved steadily toward mobile and portable devices with a limited power supply. Many major IC producers have therefore developed processors with dynamic voltage scaling (DVS) [?], including Intel's XScale® [?], the mobile Athlon® by AMD [?] and Samsung's Cortex® [?].

Many previous studies have investigated slack time analysis [?, ?, ?, ?, ?, ?] while assuming a feasible schedule. Pillai and Shin [?] proposed a cycle-conserving rate-monotonic (ccRM) scheduling scheme that contains off-line and on-line algorithms. The off-line algorithm computes the worst-case response time of each task and derives the maximum speed needed to meet all task deadlines. The scheme recomputes the utilization by comparing the actual time of completed tasks with the WCET schedule (also called the canonical schedule [?]), whose length could be the least common multiple of the task periods. When a task completes early, the processor cycles actually used must be compared with a pre-computed worst-case execution time schedule. ccRM only considers possible slack time before the next task arrival (NTA) of the current job. Gruian proposed a DVS method for off-line task stretching and on-line slack distribution [?]. The off-line part of this method consists of two separate techniques. The first focuses on intra-task stochastic voltage scheduling that employs a task-execution-length probability function. The second computes stretching factors by using a response time analysis. It is similar to Pillai and Shin's off-line technique, but instead of adopting one stretching factor for all tasks before the NTA, Gruian assigns a different stretching factor to each individual task within the longest task period. Kim et al. [?] proposed a greedy on-line algorithm called low-power work-demand analysis (lpWDA) that derives slack from low-priority tasks, as opposed to the methods in [?, ?], which gain slack time from high-priority tasks. This algorithm also balances the gap in voltage levels between high-priority and low-priority tasks. Its analysis interval, limited by the longest task period, is longer than the NTA, so it saves more energy than the previous RM DVS schemes that apply the NTA. In this project, we propose an on-line slack-time computation scheme called low-power fluid slack analysis (lpFSA), which computes the length of potential slack in an interval longer than the longest task period. With minor modification, lpFSA can be applied to many RM DVS scheduling schemes with various assumptions, including transition and preemption criteria [?, ?, ?, ?]. Additionally, it does not increase the computational complexity of the existing on-line DVS algorithms. Experimental results indicate that existing RM DVS algorithms combined with the proposed method reduce energy consumption by 5-25% compared with the initial algorithms such as lpWDA and lpLDAT [?].

The remainder of this report is organized as follows. Section 2 presents the system model and problem definition. Section 3 proposes the concept of fluid slack analysis and presents the properties of the method. Section 4 reports the performance of these methods. Section 5 is the conclusion.

2 System Model and Notations

The DVS processor used in the model operates at a finite set of supply voltage levels $V = \{v_1, \ldots, v_{max}\}$, each with an associated speed. Processor speed is normalized so that $S_{max} = 1$ corresponds to $v_{max} = 1$, yielding a set $S = \{s_1, \ldots, 1\}$ of speed levels. A set of $n$ preemptive periodic tasks is denoted by $T = \{\tau_1, \tau_2, \ldots, \tau_n\}$, where the tasks are assumed mutually independent. Each task $\tau_i$ is described by its worst-case execution cycles $wc_i$ and average-case execution cycles $ac_i$ ($wc_i \ge ac_i$). Throughout this report, the execution cycles of a task are called its work for short. Additionally, each task $\tau_i$ has a shorter period $p_i$ (i.e., a higher priority) than $\tau_j$ when $i < j$, and $p_n$ is the longest task period. The relative deadline $d_i$ of $\tau_i$ is assumed equal to its period $p_i$. Each task is invoked periodically as a job, and the $k$-th job of task $\tau_i$ is $\tau_{i,k}$. The first job of each task is assumed to be activated at time $t = 0$. Each job is described by a release time $r_{i,k}$, a deadline $d_{i,k}$, and the number of cycles $ex^{k}_i$ that have been executed. The utilization $U$ of a task set $T$ is

$$U = \sum_{\tau_i \in T} \frac{wc_i}{p_i}.$$

During run time, we refer to the earliest uncompleted job of each task as the current job of that task, and that job is indexed with $cur$. The deadline of the current job of task $\tau_i$ is $d^{cur}_i$, and $ex^{cur}_i$ denotes the number of cycles that the current job of $\tau_i$ has executed. Without loss of generality, when $\tau_i$ ($i \ne n$) is the first scheduled task after time $r_{n,k-1}$, the border is the next release time of $\tau_n$ (i.e., $r_{n,k}$). In this project, the available slack in the interval $[border, r_{n,k+1})$ is computed, and the original techniques such as lpWDA [?], lpWDA-DP [?] and lpLDAT [?] are called the host algorithms of the proposed method.
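As a small illustration of the utilization formula, the sketch below evaluates $U$ for the three-task set that this report lists in Table 1. The function name `utilization` and the `(period, wcet)` tuple layout are assumptions made for this example only.

```python
from fractions import Fraction


def utilization(tasks):
    """U = sum of wc_i / p_i over all tasks; tasks are (period, wcet) pairs."""
    return sum(Fraction(wc) / Fraction(p) for p, wc in tasks)


# Task set of Table 1: (p_i, wc_i) for tau_1, tau_2, tau_3.
TABLE_1 = [(3, 1), (4, 1), (6, 2)]
U = utilization(TABLE_1)
print(U)  # exact rational value of the utilization
```

Exact fractions are used so that the result can be compared with the slack counts in the later examples without rounding error.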


3 Slack Time Analysis for Energy-Efficient Real-Time Scheduling

Table 1: An example of a real-time task set T.

Task   Period (p_i)   WCET (wc_i)   ACET (ac_i)
τ1     3              1.0           0.5
τ2     4              1.0           0.5
τ3     6              2.0           1.0

3.1 The concept of the fluid slack

Let $r_{n,k}$ be the border of $\tau_i$, which is the first scheduled job at time $t$ where $t \ge r_{n,k-1}$. We compute the length of additional slack in the interval $[border, r_{n,k+1})$. For instance, Figure 1 presents the WCET schedule of the tasks listed in Table 1. When job $\tau_{1,1}$ is ready at time $t = 0$, the current border is at $r_{3,2} = 6$ and the target interval for extracting additional slack time is $[border, r_{3,3})$. In this case, the period of $\tau_{2,2}$ spans the border, while the periods of $\tau_{1,2}$ and $\tau_{3,1}$ end at the border. To compute how much additional slack can be transferred like liquid from the interval $[border, r_{n,k+1})$ to $[r_{n,k-1}, border)$, we take the following two phases.

Phase 1: In the interval $[border, r_{n,k+1})$, we compute the minimum available slack that can be shifted to approach the right side of the border.

Phase 2: Analyze the length of slack that can be moved across the border.

Once the length of the slack that can be moved before the border is derived, an lpWDA-based scheme can use it to improve the energy efficiency of its schedules.

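Algorithm 1: lpWDA (lines marked *** are added for lpFSA; lines marked NNN are added for lpLDAT)

Compute the available execution time and set the voltage/speed for $\tau_\alpha$

1. set $ud_\alpha := p_\alpha$, $e^{Exchange} := 0$ and $\tau_{asyn} := \emptyset$;

2. Compute $H_\alpha(t) := \sum_{\tau_k \in \Gamma^{ACT}_\alpha(t)} w^{rem}_k(t) + \sum_{i=1}^{\alpha-1} \left( \lfloor \frac{ud_\alpha - \varepsilon}{p_i} \rfloor - \lceil \frac{t + \varepsilon}{p_i} \rceil + 1 \right) \cdot wc_i$;

3. NNN $A_i := \sum_{j=0}^{i-1} \lceil \frac{d^{cur}_i}{p_j} \rceil \times ac_j$;

4. When a job $\tau_\alpha$ is activated, set $w^{rem}_\alpha(t) := wc_\alpha$;

5. When a job $\tau_\alpha$ is completed or preempted, UpdateLoadInfo($w^{rem}_\alpha(t)$, $ud_\alpha$, $H_\alpha(t)$);

6. When a job $\tau_\alpha$ is scheduled for execution

7. *** $e^{Exchange}$ := lpFSA($t$, $T$); // get additional slack

8. *** if $\tau_\alpha = \tau_{asyn}$ and $\ell^{right}_\alpha < e^{Exchange}$ then $\tau_\alpha$ has the lowest priority in readyQ($t$);

9. $slack_\alpha(t)$ := CalcSlackTime($e^{Exchange}$); // get slack time

10. set the clock frequency as $f_{clk} := \frac{w^{rem}_\alpha(t)}{slack_\alpha(t) + w^{rem}_\alpha(t)} \cdot f_{max}$;

11. NNN $f_{ACL} := \max \{ \frac{A_i + ac_\alpha - ex^{cur}_\alpha(t)}{d^{cur}_i - t} \mid i = 1, \ldots, n \}$;

12. NNN $f_{clk} := \max \{ f_{clk}, f_{ACL} \}$;

13. Set the voltage accordingly;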

Algorithm 2: CalcSlackTime(additional slack $e^{Exchange}$)

Input: the active task $\tau_\alpha$, readyQ and the current time $t$

Output: the slack time $slack_\alpha(t)$ for $\tau_\alpha$

14. Identify the task $\tau_\beta$ that has the earliest upcoming deadline among tasks whose priorities are not higher than that of $\tau_\alpha$;

15. $L_\beta(t)$ := CalcLowerPriorityWork($\tau_\beta$, $t$, $e^{Exchange}$);

16. $load_\beta(t) := w^{rem}_\beta(t) + H_\beta(t) + L_\beta(t)$;

17. $slack_\alpha(t) := \max(0, ud_\beta - t - load_\beta)$;

18. return $slack_\alpha(t)$;
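The two arithmetic steps that turn a load estimate into a speed (line 17 of Algorithm 2 and line 10 of Algorithm 1) can be sketched as follows. This is a minimal illustration, not the report's implementation: the function names are assumptions, and `pick_level` is a hypothetical helper that rounds the ideal frequency up to a discrete speed level, which the report does not spell out.

```python
from fractions import Fraction


def slack_time(ud_beta, t, load_beta):
    """Line 17 of Algorithm 2: slack is clamped at zero."""
    return max(Fraction(0), Fraction(ud_beta) - Fraction(t) - Fraction(load_beta))


def clock_frequency(w_rem, slack, f_max=1):
    """Line 10 of Algorithm 1: stretch the remaining work over work plus slack."""
    w_rem, slack = Fraction(w_rem), Fraction(slack)
    return w_rem / (slack + w_rem) * f_max


def pick_level(f_required, levels):
    """Hypothetical helper: lowest available level that is not below f_required."""
    candidates = [s for s in levels if s >= f_required]
    return min(candidates) if candidates else max(levels)


# One unit of remaining work and one unit of slack halve the normalized speed.
f = clock_frequency(w_rem=1, slack=1)
levels = [Fraction(k, 10) for k in range(1, 11)]  # assumed normalized speed levels
print(f, pick_level(f, levels))
```

Rounding up rather than down is the conservative choice here, since a lower level than the ideal frequency could cause the job to overrun its slack.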

3.2 Low-Power Work Demand Analysis (lpWDA)

In line 2 of Algorithm 1*, $\varepsilon$ is an infinitesimal, readyQ contains the currently activated tasks, and its subset $\Gamma^{ACT}_\alpha(t)$ of active tasks is

$$\Gamma^{ACT}_\alpha(t) := \{ \tau_\kappa \mid \kappa < \alpha \text{ and } \tau_\kappa \in readyQ \}. \quad (2.1)$$

In lpWDA, the tasks in readyQ are scheduled according to RM priority. When a task is activated, its job $\tau_\alpha$ is moved to readyQ and the job's remaining WCET is set to $wc_\alpha$, i.e., $w^{rem}_\alpha(t) = wc_\alpha$. When $\tau_\alpha$ is executed at time $t$, $load_\alpha(t)$ is the amount of work required to be processed in $[t, d_\alpha)$.

In Algorithms 1 and 2, lpWDA proceeds in the following steps. First, the system is initialized by setting the initial upcoming deadline ($ud$) and remaining worst-case execution ($w^{rem}$) of each task. When $\tau_\alpha$ is active at time $t$, the upcoming deadline $ud_k$ of each task $\tau_k$ is defined as follows [?]:

$$ud_k = \left\lceil \frac{t + \varepsilon}{p_k} \right\rceil \times p_k. \quad (2.2)$$

The jobs that are active during $[t, \max\{ud_k(t)\}]$ are examined for slack estimation. $H_\alpha(t)$ denotes the estimate of higher-priority work that must be executed before $ud_\alpha$ (lines 1-2). Whenever a job $\tau_\alpha$ is completed or preempted at time $t$, the remaining work $w^{rem}_\alpha(t)$, the upcoming deadline $ud_\alpha$ and the high-priority work $H_\alpha(t)$ are updated in line 5. In lines 6-9, when a job $\tau_\alpha$ is scheduled for execution at time $t$, Algorithm 2 computes the available slack for $\tau_\alpha$ according to $H_\beta(t)$ and $L_\beta(t)$ (see lines 15 and 16), where $ud_\beta$ is the earliest upcoming deadline with respect to $\tau_\alpha$. Notably, the function $L_\beta(t)$, which computes the amount of low-priority work, is evaluated recursively until it reaches $\tau_\gamma$, the task with the longest period and the lowest priority with respect to $\tau_\alpha$. As defined in Section 2, the length of the interval $[0, border)$ is $p_\gamma$. lpWDA then computes the length of slack time stolen from low-priority tasks in the interval $[r_\alpha, border)$ and applies the slack to the current job. To describe the slack analysis of lpWDA formally, the following notations are defined.

$load_\alpha(t)$: the amount of work required to be processed in the interval $[t, d_\alpha)$.

$slack_\alpha(t)$: the available slack for $\tau_\alpha$ scheduled at time $t$, computed as

$$slack_\alpha = d_\alpha - t - load_\alpha(t). \quad (2.3)$$

* Procedures UpdateLoadInfo() and CalcLowerPriorityWork(), which provide the detailed slack computation [?, ?, ?], are abridged in this report.
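Eq. (2.2) can be evaluated without an explicit $\varepsilon$: for an infinitesimal $\varepsilon > 0$, $\lceil (t+\varepsilon)/p_k \rceil$ equals $\lfloor t/p_k \rfloor + 1$. The sketch below relies on that identity; the function name is an assumption for this example.

```python
from fractions import Fraction
from math import floor


def upcoming_deadline(t, p):
    """Eq. (2.2): ud = ceil((t + eps) / p) * p, with eps an infinitesimal.

    ceil((t + eps) / p) is floor(t / p) + 1, so a job released exactly at t
    gets the deadline one full period later.
    """
    t, p = Fraction(t), Fraction(p)
    return (floor(t / p) + 1) * p


# Upcoming deadlines of the Table 1 tasks at time 0.
print([upcoming_deadline(0, p) for p in (3, 4, 6)])
```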

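In Eq. (2.3), $load_\alpha(t)$ consists of three types of work: (1) $w^{rem}_\alpha(t)$, (2) $H_\alpha(t)$ from the higher-priority tasks, and (3) $L_\alpha(t)$ from the lower-priority tasks. The work required by higher-priority tasks is derived as follows:

$$H_\alpha(t) = H^{past}_\alpha(t) + H^{future}_\alpha(t), \quad (2.4)$$

where $H^{past}_\alpha(t)$ denotes the work required by uncompleted tasks released before $t$ and $H^{future}_\alpha(t)$ the work released during $[t, d_\alpha]$. We compute $H^{past}_\alpha(t)$ and $H^{future}_\alpha(t)$ as follows:

$$H^{past}_\alpha(t) = \sum_{\tau_\kappa \in \Gamma^{ACT}_\alpha(t)} w^{rem}_\kappa(t) \quad (2.5)$$

and

$$H^{future}_\alpha(t) = \sum_{i=1}^{\alpha-1} \left( \left\lfloor \frac{d_\alpha - \varepsilon}{p_i} \right\rfloor - \left\lceil \frac{t + \varepsilon}{p_i} \right\rceil + 1 \right) \cdot wc_i. \quad (2.6)$$

According to the above statements, the amount of work required by the scheduled task $\tau_\alpha$ can be formulated as

$$H_\alpha(t) = \sum_{\tau_\kappa \in \Gamma^{ACT}_\alpha(t)} w^{rem}_\kappa(t) + \sum_{i=1}^{\alpha-1} \left( \left\lfloor \frac{d_\alpha - \varepsilon}{p_i} \right\rfloor - \left\lceil \frac{t + \varepsilon}{p_i} \right\rceil + 1 \right) \cdot wc_i, \quad (2.7)$$

$$L_\alpha(t) = \left( load_\beta(t) - w^{rem}_\alpha(t) - H_\alpha(t) - (ud_\beta(t) - d_\alpha) \right)^+ \quad (2.8)$$

and

$$load_\alpha(t) = H_\alpha(t) + L_\alpha(t) + w^{rem}_\alpha(t), \quad (2.9)$$

where the notation $x^+$ stands for $\max(x, 0)$. Eqs. (2.7), (2.8) and (2.9) are evaluated iteratively until $\tau_\alpha$ is the lowest-priority task in $T$ (i.e., $L_\alpha(t) = 0$). lpWDA uses this linear-time heuristic to estimate the available slack in an interval up to the upcoming deadline of lower-priority tasks.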
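To make Eqs. (2.3), (2.7) and (2.9) concrete, the following sketch evaluates them for the lowest-priority task of Table 1 at $t = 0$, where $L_\alpha(t) = 0$. Tasks are indexed from 0 in RM order, and the function names and argument layout are assumptions for this illustration; the recursion of Eq. (2.8) over lower-priority tasks is not reproduced.

```python
from fractions import Fraction
from math import ceil, floor


def releases_between(t, d, p):
    """Number of releases of a period-p task strictly inside (t, d).

    Uses floor((d - eps)/p) == ceil(d/p) - 1 and
    ceil((t + eps)/p) == floor(t/p) + 1 for an infinitesimal eps > 0.
    """
    t, d, p = Fraction(t), Fraction(d), Fraction(p)
    return (ceil(d / p) - 1) - (floor(t / p) + 1) + 1


def higher_priority_work(alpha, t, d_alpha, periods, wc, w_rem, ready):
    """Eq. (2.7): remaining work of ready higher-priority jobs plus future releases."""
    h_past = sum(w_rem[k] for k in ready if k < alpha)
    h_future = sum(releases_between(t, d_alpha, periods[i]) * wc[i]
                   for i in range(alpha))
    return h_past + h_future


periods, wc = [3, 4, 6], [1, 1, 2]   # Table 1
w_rem = list(wc)                     # all jobs just released at t = 0
H3 = higher_priority_work(2, 0, 6, periods, wc, w_rem, ready={0, 1, 2})
load3 = H3 + 0 + w_rem[2]            # Eq. (2.9) with L = 0 for the lowest-priority task
slack3 = 6 - 0 - load3               # Eq. (2.3)
print(H3, load3, slack3)
```

The zero slack obtained here agrees with the statement in Example 1 that the interval [0, 6) has no slack under the WCET schedule.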

Example 1. Consider the periodic task set $T$ in Table 1, which lists the period, WCET and ACET of each task. Figure 2(a) presents the execution schedule under the worst-case workload in the first hyperperiod. Figure 2(b) shows the speed schedule produced by the lpWDA algorithm for task set $T$, assuming that the actual work of each task equals its ACET. Before assigning $\tau_{1,1}$ at time $t = 0$, lpWDA computes the available slack time in an interval up to $d_{3,1} = 6$ by calling CalcLowerPriorityWork() in line 15 recursively. However, the interval $[0, 6)$ has no slack time under the WCET schedule. If the length of the analysis interval is extended to $2 \times p_n$, one unit of slack time is derived from

$$2 \times p_n - \sum_{i=1}^{n} \left\lfloor \frac{2 p_n}{p_i} \right\rfloor \times wc_i.$$

One can imagine the slack in $[11, 12)$ as fluid, exchanging it with earlier work and moving it backward to the current scheduling point. For instance, in Figure 2(a), the slack in interval $[11, 12)$ can be exchanged with the work in interval $[7, 8)$, then the slack in interval $[7, 8)$ can be exchanged with the work in interval $[4, 5)$, and it can be exchanged once again with the work in interval $[2, 3)$. Finally, the slack in interval $[2, 3)$ can be exchanged with the work in interval $[1, 2)$. Therefore, $\tau_{1,1}$ is scheduled with speed $S_{1,1} = \frac{wc_1}{wc_1 + 1}$ (Fig. 2(c)). This example shows that additional future slack can be used by the current job while the deadlines of the subsequent jobs are kept. Unfortunately, this straightforward idea does not work in actual situations. For example, in Figure 2(d), when $p_2$ is increased to 6, the slack in the interval $[11, 12)$ cannot be transferred before $t = 6$. In fact, jobs $\tau_{1,3}$, $\tau_{2,2}$ and $\tau_{3,2}$ are released simultaneously at time 6. The slack in interval $[11, 12)$ cannot be exchanged with the work of $\tau_{1,2}$, $\tau_{2,1}$ or $\tau_{3,1}$, because one of these three jobs is likely to miss its deadline. Thus, this slack cannot be shifted to an earlier time in the schedule to improve power efficiency. This project devises a method that obtains additional slack in a lengthened analysis interval.

Figure 2. The inter-task voltage scheduling examples of (a) worst-case scheduling, (b) lpWDA, (c) lpWDA+lpFSA and (d) a modified worst-case schedule.
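The slack count in Example 1 can be checked with a small unit-slot simulation of the RM schedule under worst-case execution times. This is a sketch under simplifying assumptions that are not part of the report: integer periods and WCETs, all tasks released at time 0, and a feasible task set so that each job finishes before its next release. The function names are chosen for this illustration.

```python
def rm_wcet_idle_slots(tasks, horizon):
    """Simulate RM at full speed with WCETs; return the idle unit slots.

    tasks: list of (period, wcet) pairs sorted by period (RM priority order).
    """
    remaining = [0] * len(tasks)
    idle = []
    for t in range(horizon):
        for i, (p, wc) in enumerate(tasks):
            if t % p == 0:              # new job released at time t
                remaining[i] = wc
        for i in range(len(tasks)):     # run the highest-priority pending job
            if remaining[i] > 0:
                remaining[i] -= 1
                break
        else:
            idle.append(t)              # no pending work: slot [t, t+1) is slack
    return idle


def extended_slack(tasks):
    """Closed form from Example 1: 2*p_n - sum(floor(2*p_n / p_i) * wc_i)."""
    p_n = max(p for p, _ in tasks)
    return 2 * p_n - sum((2 * p_n // p) * wc for p, wc in tasks)


TABLE_1 = [(3, 1), (4, 1), (6, 2)]
print(rm_wcet_idle_slots(TABLE_1, 12), extended_slack(TABLE_1))
```

For the Table 1 task set the only idle slot in [0, 12) is [11, 12), which is the slack that the example moves backward to the current job.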


3.3 Low-Power Fluid Slack Analysis (lpFSA)

Before presenting the slack computation method, the following notations are introduced.

$$T^b = T - \{\tau_n\},$$

where $b$ denotes the number of tasks in $T^b$, $b < n$, and $\tau_n$ is the task with the longest period in $T$. In an extended analysis interval $[border, r_{n,k+1})$, the number of synchronization points of the tasks in $T^b$ is computed as follows:

$$Syn(T^b, k) = \left\lfloor \frac{r_{n,k+1}}{LCM(T^b)} \right\rfloor - \left\lfloor \frac{r_{n,k}}{LCM(T^b)} \right\rfloor, \quad (3.1)$$

where $LCM(T^b)$ is the least common multiple (LCM) of the task periods in $T^b$. The tasks are defined as synchronous at time $t$ if their jobs are released at time $t$. Therefore, the first synchronous point of $T^b$ within the interval $[border, r_{n,k+1})$ is derived as

$$t(T^b, k) = \left\lceil \frac{r_{n,k+1}}{LCM(T^b)} \right\rceil \times LCM(T^b). \quad (3.2)$$

According to Eq. (3.1), we define two situations in the interval $[border, r_{n,k+1})$:

Slack Transmission Obstacle (STO): $Syn(T^b, k) > 0$, $b = n - 1$.

Slack Transmission Bottleneck (STB): $Syn(T^b, k) = 0$, $b = n - 1$.

When an STO appears in the additional analysis interval, slack time is likely to be blocked or shrunk by the STO. For example, the tasks except $\tau_n$ are synchronized at time $t$ (Fig. 3). If slack exists after time $t$, it cannot move backward to the left side of $t$. In this case, slack can still be shifted by exchanging it with the work of $\tau_n$; this case is discussed later in Phase 2.

Figure 3. An example of STO.

When $Syn(T^b, k) = 0$ and $b = n - 1$, the amount of fluid slack approaching the current border can be estimated by applying Phase 1. For example, $\tau_i$ does not synchronize with the other tasks in $T^b$ (Fig. 4). Therefore, one can compute the value of $Syn(T^x = T^b - \tau_i, k)$ for each $\tau_i$, where $x = n - 2$ and $b = n - 1$. Suppose $Syn(T^b - \tau_i, k) > 0$; the earliest synchronization point of the tasks in $T^b - \tau_i$ is then derived using Eq. (3.2). In the interval $[border, r_{n,k+1})$, we define the earliest slack transmission bottleneck incurred by task set $T^b - \tau_i$ as follows:

$$STB(T^b - \tau_i, k) = t(T^b - \tau_i, k), \quad i \ne n \text{ and } b = n - 2. \quad (3.3)$$

Figure 4. An example of STB.

The release time of the job of $\tau_i$ whose period spans $STB(T^b - \tau_i, k)$ is

$$r^i_k = \left\lfloor \frac{STB(T^b - \tau_i, k)}{p_i} \right\rfloor \times p_i, \quad \tau_i \notin T^b. \quad (3.4)$$

The difference between Eqs. (3.3) and (3.4) is defined as

$$diff^i_k = \max\{ STB(T^b - \tau_i, k) - r^i_k, 0 \}. \quad (3.5)$$

In Figure 4, suppose an initial slack lies in the period of at least one task in $T^b$. The amount of slack that can be shifted across $STB(T^b - \tau_i, k)$ depends on the length of the work of $\tau_i$ within the interval $[r^i_k, r^i_k + diff^i_k)$. However, the high-priority tasks in $T^b - \tau_i$ may interfere with the work of $\tau_i$ in this interval. To estimate precisely the amount of work of $\tau_i$ in the interval $[border, r_{n,k+1})$, the high-priority work is classified into two parts. The first part is provided by the tasks with a period shorter than or equal to $diff^i_k$. Thus,

$$H^{short}_i(\ell) = \sum_{\tau_k \in T^b - \tau_i} \left\lfloor \frac{\ell}{p_k} \right\rfloor \times wc_k, \quad \ell > p_{i-1}, \quad (3.6)$$

where $\ell$ denotes the length of $diff^i_k$. For example, in Figure 4, the value of $diff^i_k$ is $\ell$ and $p_{i-2}$ is shorter than $\ell$. The worst-case execution cycles of job $\tau_{i-2}$ must be included in $H^{short}_i(\ell)$, because $\tau_{i-2}$ does not span the border and has a higher priority than $\tau_i$. The second part is the additional work

$$H^{long}_i(\ell) = \max\{ (WR_{i-1} + r_{i-1}) - r_i, 0 \}, \quad \ell \le p_{i-1}, \quad (3.7)$$

where $WR_{i-1}$ denotes the worst-case response time of $\tau_{i-1}$. Because of RM scheduling, a job $\tau_h$ with $\ell \le p_h < p_i$ has a higher priority than $\tau_i$. Moreover, since the period of $\tau_h$ does not span the border, its work cannot be exchanged with the slack located on the right side of the border. By Eqs. (3.6) and (3.7), the amount of high-priority work required in the interval $[r^i_k, r^i_k + diff^i_k)$ can be expressed as

$$H^{exec}_i = \begin{cases} H^{short}_i(\ell) + H^{long}_i(\ell), & \ell \le p_{i-1}, \\ H^{short}_i(\ell), & \text{otherwise.} \end{cases} \quad (3.8)$$

The value of $H^{exec}_i(\ell)$ denotes the length of work in $diff^i_k$ that cannot be exchanged with the slack on the right side of the border. In the worst case, $\tau_i$ is the only asynchronous job in $T^b$ and $p_{i-1} \ge diff^i_k$. We derive the work of the tasks $\tau_b$ ($b \le i - 1$) whose periods cross $r_i$ by computing the worst-case response time [?] of $\tau_{i-1}$. Therefore, the estimated work of $\tau_i$ in the interval $[r^i_k, r^i_k + diff^i_k)$ is

$$e^{Exchange}_i = \min\{ \max\{ diff^i_k - H^{exec}_i(diff^i_k), 0 \}, wc_i \}. \quad (3.9)$$

After completing Phase 1, Phase 2 computes the length of slack that can be exchanged across the border.
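The arithmetic of Eqs. (3.4), (3.5) and (3.9) can be sketched as below. The bottleneck values 8 and 9 are the ones this report quotes in Example 2 for the Table 1 task set; passing `h_exec = 0` in the last call is an assumption made for the illustration, since the detailed computation of $H^{exec}_i$ is abridged. Function names and integer time units are likewise assumptions.

```python
def release_before(stb, p_i):
    """Eq. (3.4): release time of the job of tau_i whose period contains stb."""
    return (stb // p_i) * p_i


def diff(stb, p_i):
    """Eq. (3.5): distance from that release time to the bottleneck."""
    return max(stb - release_before(stb, p_i), 0)


def e_exchange_i(diff_i, h_exec, wc_i):
    """Eq. (3.9): work of tau_i that can be exchanged, capped by wc_i."""
    return min(max(diff_i - h_exec, 0), wc_i)


# tau_1 (p = 3) against the bottleneck at 8, tau_2 (p = 4) against the one at 9.
d1, d2 = diff(8, 3), diff(9, 4)
print(d1, d2, e_exchange_i(d1, h_exec=0, wc_i=1))
```

The larger of the two differences belongs to $\tau_1$, which matches the choice of the asynchronous task described in Example 2.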

Before transferring the slack, we continue with the case of $Syn(T^b, k) = 1$ and $b = n - 1$, as derived by Eq. (3.1). When the worst-case response time of $\tau_n$ is not greater than $t(T^b, k)$, the slack situated after $t(T^b, k)$ can be shifted to approach the right side of the border by exchanging itself with a part of $wc_n$; that is,

$$e^{Exchange}_n = \begin{cases} wc_n, & R_n \le t(T^b, k) \text{ and } b = n - 1, \\ wc_n - (R_n - t(T^b, k)), & R_n > t(T^b, k), \\ 0, & \text{otherwise.} \end{cases} \quad (3.10)$$

When $e^{Exchange}_i = 0$ in Eq. (3.9), one can use Eq. (3.10) to move slack using $e^{Exchange}_n$. The next step is to compute the amount of slack that can be transferred across the border. Let $T^{border}$ denote the set of tasks whose periods span the border. For $\tau_i \in T^{border}$, the lengths of the left and right portions of $p_i$ split by the border are defined as $\ell^{left}_i$ and $\ell^{right}_i$, respectively. The longest $\ell^{left}_i$ and $\ell^{right}_i$ are defined as $\ell^{left}_{max}$ and $\ell^{right}_{max}$, respectively. Additionally, we define

$$accu^{border} = \sum_{\tau_i \in T^{border}} wc_i$$

as the total amount of work in $T^{border}$. As shown in Figure 5, the lengths $\ell^{left}_{max}$, $\ell^{right}_{max}$ and $accu^{border}$ limit the maximum length of slack that can be transferred across the border. Consequently, the restriction on slack length in Phase 2 can be described as

$$e^{border} = \min\{ \ell^{left}_{max}, \ell^{right}_{max}, accu^{border} \}. \quad (3.11)$$

Figure 5. The task periods span astride the border.

After completing Phase 1 and Phase 2, the length of additional slack that can approach and cross the border is derived. In the WCET schedule, the total slack in the interval $[r_{n,k-1}, r_{n,k+1})$ can be estimated as

$$e^{slack} = 2 \times p_n - \sum_{\tau_i \in T} \left\lceil \frac{2 \times p_n}{p_i} \right\rceil wc_i. \quad (3.12)$$

First, lpFSA estimates the length of additional slack based on Eqs. (3.9)-(3.12). Then, it changes the priority of a job that spans the border when this job is moved to readyQ according to RM scheduling. In line 1 of Procedure lpFSA, $\varepsilon$ denotes an infinitesimal value.
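The two Phase 2 bounds of Eqs. (3.11) and (3.12) are simple enough to sketch directly. The function names are assumptions for this illustration, integer periods are assumed, and the arguments passed to `e_border` below are hypothetical numbers rather than values taken from the report.

```python
def e_border(l_left_max, l_right_max, accu_border):
    """Eq. (3.11): slack that can cross the border is limited by all three terms."""
    return min(l_left_max, l_right_max, accu_border)


def e_slack(tasks):
    """Eq. (3.12): total WCET-schedule slack over two periods of tau_n.

    tasks: list of (period, wcet) pairs with integer periods.
    """
    p_n = max(p for p, _ in tasks)
    # -(-a // b) is an integer ceiling division.
    return 2 * p_n - sum(-(-2 * p_n // p) * wc for p, wc in tasks)


TABLE_1 = [(3, 1), (4, 1), (6, 2)]
print(e_slack(TABLE_1))
print(e_border(3, 2, 4))  # hypothetical portion lengths and accumulated work
```

For the Table 1 task set the estimate from Eq. (3.12) is one unit, the same value quoted for $e^{slack}$ in Example 2.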

Procedure: lpFSA(time $t$, task set $T$)

Input: $t$: present time, $T^b = T - \tau_n$

01. set $b = n - 1$, $e^{Exchange}_{min} \leftarrow \infty$, $k = \lceil \frac{t + \varepsilon}{p_n} \rceil$, $\ell \leftarrow 0$, $T^{border} = \{ \tau_i \mid i < n \text{ and } p_i \text{ spans astride the border} \}$;

(Phase 1)

02. if $Syn(T^b, k) == 0$

03. for $i := 1$ to $n - 1$

04. if $Syn(T^b - \tau_i, k) == 0$ then continue;

05. if $\ell < diff^i_k$ then

06. $\ell := diff^i_k$ and $\tau_{asyn} := \tau_i$;

07. if $\ell \le 0$ then $\tau_{asyn} := \emptyset$;

08. Compute the value of $e^{Exchange}_{asyn}$;

(Phase 2)

09. else if $Syn(T^b, k) > 0$ or $e^{Exchange}_{asyn} \le 0$

10. then Compute the values of $e^{Exchange}_n$, $e^{border}$ and $e^{slack}$;

11. if $e^{Exchange}_{asyn} \le 0$ then $e^{Exchange}_{min} := \min\{ e^{Exchange}_n, e^{border}, e^{slack} \}$;

12. else $e^{Exchange}_{min} := \min\{ e^{Exchange}_n + e^{Exchange}$ …

13. Choose a job $\tau_\delta \in T^{border}$ with $\ell^{right}_\delta \ge e^{Exchange}_{min}$ and $wc_\delta \ge e^{Exchange}_{min}$;

14. When $\tau_\delta \in$ readyQ, change the priority of job $\tau_\delta$ to be lower than that of job $\tau_n$ in readyQ;

15. Return $e^{Exchange}_{min}$;

Table 2: Scheduling parameters in Example 2.

time   ud_y   e^Exchange   slack   voltage   wc     ac
0      6      1            1       0.5       2      1
1      6      1            1       0.5       2      1
2      6      1            1       0.67      3      1.5
3      6      1            1.5     0.4       2.5    1.25
4      6      1            0.5     0.4       2.5    1.25
4.25   6      1            1       0.67      1      0.5
4.75   12     0            1.25    0.44      2.25   1.125

Example 2. Consider the WCET schedule shown in Figure 2(a). Before assigning $\tau_{1,1}$ at time $t = 0$, we derive $border = 6$ and $T^{border} = \{\tau_{2,2}\}$ from the task periods in $T$. Procedure lpFSA estimates the length of fluid slack from the interval $[6, 12)$ as follows. For the task set $T^2 = \{\tau_1, \tau_2\}$, Procedure lpFSA computes $Syn(T^2, 1) = 0$. Therefore, in Phase 1, the bottlenecks caused by $\tau_1$ and $\tau_2$ are $STB(T^2 - \tau_2, 1) = 9$ and $STB(T^2 - \tau_1, 1) = 8$, respectively. Line 6 derives $\ell = 2$ and $\tau_{asyn} = \tau_1$. Equations (3.7)-(3.9) give $e^{Exchange}_{asyn} = 1$. In line 10, the values of $e^{Exchange}_n$, $e^{border}$, $e^{slack}$ and $accu^{border}$ are 1, 1, 1 and 2, respectively. The value of $e^{Exchange}_{min}$ is 1 by line 12. Therefore, Procedure lpFSA returns $e^{Exchange}_{min} = 1$ to the lpWDA algorithm, which passes the additional slack $e^{Exchange}$ to CalcLowerPriorityWork() in Algorithm 2. Notably, the tasks using lpFSA still execute under the RM priority policy, except for one of the jobs whose periods span the border. When jobs $\tau_{1,1}$, $\tau_{2,1}$ and $\tau_{3,1}$ enter readyQ at time $t = 0$, $\tau_{1,1}$ has the highest priority and uses the additional slack $e^{Exchange}_{min}$ estimated by lpFSA. Therefore, job $\tau_{1,1}$ obtains one unit of slack time and changes its voltage level from 1 to 0.5. In contrast, if primitive lpWDA schedules $\tau_{1,1}$ at time $t = 0$, $\tau_{1,1}$ cannot obtain any slack. When lpWDA executes iteratively, the value of $e^{Exchange}$ does not change until $\tau_{1,1}$ is completed. Figure 2(c) presents the scheduling result obtained using Procedure lpFSA. After $\tau_{1,1}$ completes, the $e^{Exchange}_{min}$ units of slack have been used up, and primitive lpWDA continues to perform voltage scaling on the jobs that follow $\tau_{1,1}$. Job $\tau_{2,1}$ begins after $\tau_{1,1}$ ($t = 1$) and obtains one unit of slack time from primitive lpWDA. Therefore, its WCET under voltage $v = 0.5$ becomes $wc_{2,1} = 2$ and its actual execution time is $ac_{2,1} = 1$. At time $t = 4$, job $\tau_{2,2}$ is released and moved to readyQ. By executing line 14 of Procedure lpFSA, its priority is changed to be lower than that of the remaining work of $\tau_{3,1}$. Therefore, job $\tau_{2,2}$ begins its work after the remaining work of $\tau_{3,1}$ completes. Notably, lpFSA only changes the priority of jobs in $T^{border}$ and does not affect the feasibility of the lpWDA schedule. Table 2 shows the values of the scheduling parameters.

3.4 Properties

Let $WR_i$ denote the worst-case response time (WCRT) of $\tau_i$. Without loss of generality, the higher-priority tasks have simultaneous release times with a job of $\tau_i$, and the LCM of the task periods in $T$ exists. This section uses WCRT analysis to prove the schedulability of lpFSA†.

Lemma 1. When a task set $T$ contains only one task $\tau_\gamma$, the available slack produced by lpWDA for $\tau_\gamma$ is

$$slack_\gamma(t) = d_\gamma - t - w^{rem}_\gamma(t). \quad (4.1)$$

Lemma 2. When a task set $T$ contains $n$ tasks where $n \ge 2$, the amount of work required to be processed in $[t, d_\alpha]$ ($\alpha \le n$) for the highest-priority job $\tau_\alpha$ is

$$load_\alpha(t) = w^{rem}_n(t) + H_n(t) - (ud_n - d_\alpha). \quad (4.2)$$

Lemma 3. The length of slack that is provided by lpWDA for the highest-priority task in readyQ is at most

$$slack_\alpha(t) = ud_n(t) - t - w^{rem}_n(t) - H_n(t). \quad (4.3)$$

Proof. Assuming $\tau_\alpha$ has the highest priority in readyQ, the result follows directly from Eqs. (2.3), (4.1) and (4.2).

The following theorem establishes the schedulability of lpWDA by using worst-case response time analysis. Its proof appears in [?, ?].

Theorem 1. Given a set $T$ of tasks that is feasible under RM scheduling, the maximum response time of task $\tau_\gamma$ under lpWDA is less than or equal to its deadline.

Corollary 1. For some task $\tau_i \in T^{border}$ with $i < \gamma$ and $d_\gamma = border$, where $p_\gamma$ is not a multiple of these $p_i$, the difference between $WR^{new}_\gamma$ and $d_\gamma$ is formulated as

$$d_\gamma - WR^{new}_\gamma = \sum_{k < \gamma} \left( \left\lfloor \frac{d_\gamma - \varepsilon}{p_k} \right\rfloor + 1 \right) \cdot wc_k - \sum_{k < \gamma} \left\lfloor \frac{p_\gamma}{p_k} \right\rfloor \cdot wc_k = \sum_{\tau_i \in T^{border}} wc_i. \quad (4.4)$$

Notably, $WR^{new}_\gamma$ represents the length of the WCRT under lpWDA, and therefore the slack between $WR^{new}_\gamma$ and $d_\gamma$ could be used by lpFSA. Consider the example shown in Figure 6(a). The value of $load_{3,1}(0)$ is set to the sum of $wc_3$ and $H_{3,1}(0)$, which is shown in the gray box of Figure 6(b). Six time units must be processed before $ud_{3,1} = 8$. To guarantee a feasible schedule for the higher-priority jobs whose periods span $ud_{3,1}$ (i.e., $\tau_{1,2}$ and $\tau_{2,2}$), lpWDA estimates how much time should be reserved for the higher-priority jobs. In this case, $wc_{1,2} + wc_{2,2} = 2$ is derived from Eq. (4.4). We investigate the difference between $d_\gamma$ and $WR^{new}_\gamma$ to keep the deadlines of lpFSA jobs.

Figure 6. The slack reclamation in the lpWDA algorithm.

† The detailed proofs of some lemmas are presented in [?] and abridged in this report.

Lemma 4. The lpWDA algorithm selects an effective feasible speed for the active job in the analysis scope generated by the upcoming deadlines.

Proof. This follows directly from Theorem 1 and Corollary 1.

Lemma 5. Let $\gamma > \delta$, $d_\gamma = border$ and $\tau_\delta \in T^{border}$. When job $\tau_\gamma$ is feasible under lpFSA, $\tau_\delta$ also keeps its deadline.

Lemma 5 shows that the additional slack produced by lpFSA is shorter than the right part of $p_\delta$ split by the border. Therefore, the deadline of job $\tau_\delta$ is kept after the priority of $\tau_\delta$ is changed to the lowest priority in readyQ. After $\tau_\delta$ completes, the schedule continues under lpWDA. The schedulability proof in the interval $[border, r_{n,k+1})$ is similar to Theorem 1 except for the additional work $e^{Exchange}$ in the WCET schedule.

Lemma 6. An lpWDA schedule remains feasible when $e^{Exchange}$ units of work are moved in the interval $[border, r_{\gamma,k+1})$ by using Procedure lpFSA. Procedure lpFSA focuses on providing a potential and appropriate length of slack to the lpWDA-based algorithms.

Theorem 2. Procedure lpFSA provides additional slack that guarantees all task deadlines in the lpWDA schedule.

Proof. In the interval $[r_{n,k-1}, border)$: suppose job $\tau_\alpha$ is being executed in $[r_{n,k-1}, border)$. By executing line 9 in Algorithm 1, which passes the additional slack $e^{Exchange}$ to the function CalcSlackTime() in the lpWDA algorithm, Algorithm 2 computes the length of the slack that $\tau_\alpha$ can use by calling CalcLowerPriorityWork() (line 15) recursively. When job $\tau_\alpha$ uses all of $e^{Exchange}$ and all of its subsequent jobs execute for their WCET, job $\tau_{n,k-1}$ is likely to miss its deadline. However, line 14 in Procedure lpFSA solves this problem by changing the priority of job $\tau_\delta$ to the lowest priority in readyQ. Because $\ell^{right}_\delta \ge e^{Exchange}$, according to Lemma 5, the deadlines of jobs $\tau_\delta$ and $\tau_n$ are guaranteed in the interval $[r_{n,k-1}, border)$. By Lemma 6, the additional work does not affect the feasibility of the lpWDA schedule, which completes the proof.

Theorem 3. Procedure lpFSA is still a valid slack-estimation algorithm with lpLDA.

lpLDAt (as well as lpLDAT), which considers voltage transition time in the schedule, is based on a feasible lpLDA schedule. The feasibility proofs are given in [?].

Theorem 4. The lpFSA algorithm has a computational complexity of $O(n)$ per scheduling point, where $n$ denotes the number of tasks in the system.

4 Performance Evaluation

The simulations also consider the following algorithms which are modified to account for transition overhead:

ccRM: TheccRMalgorithm from [?].

lpWDA: ThelpWDAalgorithm from [?].

lpLDAT: ThelpLDATalgorithm from [?].

lpWDA-lpFSA: ThelpWDAis the host algorithm oflpFSA.

lpLDAT-lpFSA: ThelpLDATis the host algorithm oflpFSA.

The following four parameters are varied in the simulations: (1) the number of tasks totaltask in T is varied from 2 to 18 in increments of two tasks; (2) the utilization U of the task set is varied from 0.1 to 0.9; (3) the bc/wc ratio of BCET to WCET is varied from 0.1 to 0.9; and (4) the analytical interval in bound, expressed as a multiple of $p_n$, is denoted $M^{\mathrm{analysis}}_{p_n}$. Before performing these experiments, 10000 task sets were generated randomly, including the number of tasks in each set, the task period lengths and their worst-case execution requirements, in accordance with a uniform distribution function. The early completion time of each job in simulations (1), (2) and (4) was randomly drawn from a Gaussian distribution in the range [BCET, WCET], where BCET/WCET = 0.1. In simulation (3), each experiment was performed by varying BCET from 10% to 90% of WCET. The processor model we assume is based on the ARM8 microprocessor core. For all experiments, we assume that 10 frequency levels are available in the range of 10-100 MHz, with corresponding voltage levels of 1-3.3 V. The energy consumed by memory accesses and cache misses is ignored, and all experimental results are normalized against the same processor running at maximum speed without a DVS technique (non-DVS for short). More assumptions and experiments‡ are presented in [?, ?].
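The workload generation described above can be sketched roughly as follows. This is only an illustration, not the simulator used in the report: the function names, the way the target utilization is split across tasks, and the choice of Gaussian mean and standard deviation are assumptions, since the report states only that periods and execution requirements follow a uniform distribution and that actual execution times are Gaussian within [BCET, WCET].

```python
import random


def make_task_set(n_tasks, utilization, rng, period_range=(1.0, 100.0)):
    """Return (period_ms, wcet_ms) pairs whose total utilization equals `utilization`.

    Periods are drawn uniformly from `period_range`; the target utilization is
    split across tasks with uniformly drawn weights (an assumed scheme).
    """
    periods = [rng.uniform(*period_range) for _ in range(n_tasks)]
    weights = [rng.uniform(0.0, 1.0) + 1e-9 for _ in range(n_tasks)]
    total = sum(weights)
    return [(p, p * utilization * w / total) for p, w in zip(periods, weights)]


def draw_execution_time(wcet, bc_wc_ratio, rng):
    """Draw an actual execution time from a Gaussian clipped to [BCET, WCET]."""
    bcet = bc_wc_ratio * wcet
    mean = (bcet + wcet) / 2.0
    sigma = (wcet - bcet) / 6.0  # assumption: +/- 3 sigma spans the whole range
    return min(wcet, max(bcet, rng.gauss(mean, sigma)))


rng = random.Random(0)
tasks = make_task_set(n_tasks=6, utilization=0.7, rng=rng)
samples = [draw_execution_time(wcet, 0.1, rng) for _, wcet in tasks]
```

Seeding the generator makes a run repeatable, which is convenient when the same task sets have to be replayed under several scheduling algorithms.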

Figures 7, 8, 9 and 10 show the energy consumption of each method together with the results for a clairvoyant algorithm, named bound, which knows the actual execution cycles of each task beforehand and adopts the optimal speed accordingly. The energy consumption accounts for the execution time of lpFSA and its host algorithms (i.e., lpWDA and lpLDA) as well as the context-switch time required to switch to and from other real-time tasks. Since the range of task periods has been shortened to [1, 100] ms, the differences between task periods and context-switch or transition times are smaller than those assumed in [?, ?, ?, ?]. As shown in Figure 7, the lpWDA-lpFSA and lpLDAT-lpFSA methods reduce energy consumption by at least 11% and 4% relative to lpWDA and lpLDAT alone, respectively. The value of U of each task set is assigned randomly at 10-70% by a uniform probability distribution function. The value of U and the bc/wc ratio for each task set are set to 0.8 and 0.5, respectively. With a large U, lpFSA outperforms its host algorithms when totaltask is small, consuming up to 26% and 23% less energy than lpWDA and lpLDAT, respectively.

[Figure 7. Energy consumption under different totaltask. x-axis: number of tasks (2-18); y-axis: normalized energy consumption; series: bound, ccRM, lpWDA, lpWDA-lpFSA, lpLDAT, lpLDAT-lpFSA.]

[Figure 8. Energy consumption under different U. x-axis: utilization (0.1-0.9); y-axis: normalized energy consumption.]

[Figure 9. Energy consumption under different bc/wc ratio. x-axis: bc/wc (0.1-0.9); y-axis: normalized energy consumption.]

[Figure 10. Energy consumption under different values of $M^{\mathrm{analysis}}_{p_n}$. x-axis: length of the analytical interval in bound, in multiples of $p_n$ (2-18); y-axis: normalized energy consumption.]

‡ Many real-world overheads and experiments, including program execution time/energy, voltage transition time/energy and context-switch time/energy, are presented in [?, ?].
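To make the processor model and the normalization concrete, the following sketch picks the lowest of ten frequency levels that still finishes a job in time and reports the resulting energy per cycle relative to full speed. It rests on assumptions that the report does not state: the levels are taken to be evenly spaced between the quoted end points, and energy per cycle is taken to be proportional to the square of the supply voltage. The names `LEVELS`, `pick_level` and `normalized_energy` are hypothetical.

```python
F_MAX_MHZ = 100.0

# Assumption: ten evenly spaced (frequency MHz, voltage V) pairs between
# 10 MHz / 1.0 V and 100 MHz / 3.3 V; the report gives only the ranges.
LEVELS = [(10.0 * (i + 1), 1.0 + (3.3 - 1.0) * i / 9) for i in range(10)]


def pick_level(work_ms_at_fmax, available_ms):
    """Return the lowest (freq, volt) level that completes the work in time.

    `work_ms_at_fmax` is the execution time the job would need at full speed.
    If even the highest level is too slow, the highest level is returned.
    """
    needed_mhz = F_MAX_MHZ * work_ms_at_fmax / available_ms
    for freq, volt in LEVELS:
        if freq >= needed_mhz - 1e-9:
            return freq, volt
    return LEVELS[-1]


def normalized_energy(volt):
    """Energy per cycle relative to full speed, assuming E is proportional to V^2."""
    v_max = LEVELS[-1][1]
    return (volt / v_max) ** 2


# A job needing 2 ms at full speed with 4 ms available can run at half speed.
freq, volt = pick_level(2.0, 4.0)
ratio = normalized_energy(volt)
```

A clairvoyant scheme such as bound would call this kind of selection with the actual execution demand, whereas an online scheme has to use the worst case plus whatever slack it can prove is available.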

Experimental results in Figure 8 indicate that lpWDA-lpFSA and lpLDAT-lpFSA save up to 21% and 4% more energy than their host algorithms. In this experiment, the totaltask and the bc/wc ratio of each task set are 10 and 0.5, respectively. Increasing the value of U in Figure 8 increases the energy consumption of lpFSA and its host algorithms. With a small U, the gain from lpFSA is modest, with 1% and 4% savings compared to the initial lpLDAT and lpWDA algorithms, respectively. Additionally, with these methods, U is an important factor when computing the slack used to decide processor speeds. With a moderate U value, lpWDA-lpFSA and lpLDAT-lpFSA consume at most 14% and 3% less energy than the initial lpWDA and lpLDAT algorithms, respectively. Therefore, lpFSA exploits not only the advantages of its host algorithms but also as much as possible of the slack corresponding to 1 − U, shifting that slack to the current job.
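As a rough illustration of the slack associated with 1 − U, consider an interval of length L that spans a whole number of periods of every task, and assume every job runs for its WCET $C_i$ at full speed. The symbols L, W and S are introduced here only for this sketch and do not appear in the report.

```latex
W(L) = \sum_{i=1}^{n} \frac{L}{p_i}\, C_i = U L,
\qquad
S(L) = L - W(L) = (1 - U)\, L
```

For example, with U = 0.8 and L = 100 ms, the processor is idle for 20 ms at full speed; any slack beyond this static amount must come from jobs that finish earlier than their WCET.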


In Figure 9, the experiments vary the bc/wc ratio from 0.1 to 0.9, with U and totaltask set to 0.8 and 10, respectively. The energy consumed by lpFSA is positively correlated with the bc/wc ratio, while its host algorithms are not sensitive to the bc/wc ratio. With a low bc/wc ratio, lpFSA performs best, consuming up to 25% and 15% less energy than lpWDA and lpLDAT, respectively. With a bc/wc ratio of 0.9, lpFSA combined with lpLDAT consumes slightly more energy than the initial lpLDAT algorithm. The likely reason is that the additional savings gained by the lpFSA algorithm are offset by its execution overhead.

In Figure 10, the analytical interval in bound is exactly $M^{\mathrm{analysis}}_{p_n}$ times the length of $p_n$. The values of $M^{\mathrm{analysis}}_{p_n}$ are varied by the simulation from 2 to 18, and the totaltask of each task set is assigned randomly between 2 and 20 tasks. For simplicity, the length of each schedule is also set to $M^{\mathrm{analysis}}_{p_n} \times p_n$. As $M^{\mathrm{analysis}}_{p_n}$ increases, the energy savings of bound do not change noticeably, and the energy consumption of the schemes with lpFSA is not sensitive to $M^{\mathrm{analysis}}_{p_n}$. Notably, when $M^{\mathrm{analysis}}_{p_n} = 2$, lpFSA and bound have analytical intervals of equal length, and the energy savings of bound and the proposed schemes differ by at most 32%. Therefore, extending the analysis interval so that it is several times longer than $p_n$ does not increase an already substantial energy saving but rather increases the computing overhead during slack-time analysis. Additional simulations are presented in [?]§.
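The growth in analysis overhead with the interval length can be seen from a simple count of the jobs that fall inside the interval. The sketch below is illustrative only: it assumes that $p_n$ is the longest period in the task set and that all tasks are released together at time 0, and the function names are hypothetical.

```python
import math


def analysis_interval(periods_ms, m=2):
    """Length of the analysis interval: m times the longest period (assumed p_n)."""
    return m * max(periods_ms)


def jobs_in_interval(periods_ms, m=2):
    """Number of jobs each task releases in [0, m * p_n), assuming release at time 0."""
    horizon = analysis_interval(periods_ms, m)
    return [math.ceil(horizon / p) for p in periods_ms]


periods = [10.0, 25.0, 40.0]
short = jobs_in_interval(periods, m=2)   # interval of 80 ms
long_ = jobs_in_interval(periods, m=18)  # interval of 720 ms
```

With these periods the interval for m = 2 contains 14 jobs in total, while m = 18 contains 119, so a slack analysis that inspects every job in the interval does roughly nine times the work.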

5 Conclusions

This report explores slack computation methods for DVS real-time scheduling with the aim of decreasing energy consumption. We propose a slack-estimation algorithm, called lpFSA, based on the concept of fluid slack analysis. This method is the first in its class for which many existing RM DVS methods can serve as host algorithms. lpFSA, cooperating with its host algorithms, can further decrease energy consumption without increasing their time complexities. Experimental results indicate that lpFSA can reduce overall energy consumption by up to 25% compared with the initial schemes.

For future research, we shall explore the minimization of energy consumption under other scheduling policies such as earliest-deadline-first (EDF) or EDF* [?]. Additionally, the existence of STOs in lpFSA hampers the transmission of slack within an analysis interval. Future work will try to prevent the STOs by relaxing job release times, thereby increasing the available slack. Because the off-line slack-stretching technique proposed by Gruian [?] benefits the slack prediction of lpFSA, future work will also decrease the computational complexity of this technique and combine it with lpFSA.

§ It also considers the min/max variation in energy consumption of lpFSA and its host algorithms.

References

[1] AMD, “Mobile AMD Athlon 4 Processor Model 6 CPGA Data Sheet, Rev. G,” Advanced Micro Devices, Technical Report 24332, October 2003.

[2] Hakan Aydin, Rami Melhem, Daniel Mosse and Pedro Mejia-Alvarez, “Power-Aware Scheduling for Periodic Real-Time Tasks,” IEEE Transactions on Computers, vol. 53, no. 5, pp. 584-600, May 2004.

[3] Da-Ren Chen, “Slack Computation for DVS Algorithms in Fixed-priority Real-time Systems Using Fluid Slack Analysis,” Journal of Systems Architecture, vol. 57, issue 9, pp. 850-865, Oct. 2011.

[4] Da-Ren Chen, Chiun-Chieh Hsu, You-Shyang Chen, Chi-Jung Kuo and Lin-Chih Chen, “Transition-Aware DVS Algorithm for Real-Time Systems Using Tree Structure Analysis,” Journal of Systems Architecture, vol. 56, pp. 352-367, 2009.

[5] Flavius Gruian, “Hard Real-Time Scheduling for Low-Energy Using Stochastic Data and DVS Processors,” in Proceedings of the 2001 International Symposium on Low Power Electronics and Design (ISLPED’01). CA: ACM Press, Aug. 2001, pp. 46-51.

[6] XiaoChuan He and Yan Jia, “Energy-Efficient Scheduling Fixed-Priority Tasks with Preemption Thresholds on Variable Voltage Processors,” Lecture Notes in Computer Science, Springer Berlin/Heidelberg, vol. 4672, pp. 133-142, 2008.

[7] Intel Corporation, “The Intel XScale Microarchitecture,” Intel Corporation, Technical Report, 2000.

[8] Woonseok Kim, Jihong Kim and Sang Lyul Min, “A Dynamic Voltage Scaling Algorithm for Dynamic-Priority Hard Real-Time Systems Using Slack Time Analysis,” in Proceedings of the 2002 Design, Automation and Test in Europe (DATE’02), Aug. 2002, pp. 788-797.

[9] Woonseok Kim, Jihong Kim and Sang Lyul Min, “Dynamic Voltage Scaling Algorithm for Fixed-Priority Real-Time Systems Using Work-Demand Analysis,” in Proceedings of the 2003 International Symposium on Low Power Electronics and Design (ISLPED’03). New York, NY: ACM Press, Aug. 2003, pp. 396-401.

[10] Woonseok Kim, Jihong Kim and Sang Lyul Min, “Preemption-Aware Dynamic Voltage Scaling in Hard Real-Time Systems,” in Proceedings of the 2004 International Symposium on Low Power Electronics and Design (ISLPED’04). New York, NY: ACM Press, 2004, pp. 393-398.

[11] Jane W. S. Liu, Real-Time Systems. Upper Saddle River, NJ: Prentice Hall, 2000.

[12] Bren Mochocki, Xiaobo S. Hu and Gang Quan, “Transition-Overhead-Aware Voltage Scheduling for Fixed-Priority Real-Time Systems,” ACM Transactions on Design Automation of Electronic Systems, vol. 12, issue 2, no. 11, 2007.

[13] Padmanabhan Pillai and Kang G. Shin, “Real-Time Dynamic Voltage Scaling for Low-Power Embedded Operating Systems,” in Proceedings of the Eighteenth ACM Symposium on Operating Systems Principles (SOSP’01). New York, NY: ACM Press, 2001, pp. 89-102.

[14] Samsung Corporation, “Samsung and Intrinsity Jointly Develop the World’s Fastest ARM® Cortex™-A8 Processor Based Mobile Core In 45 Nanometer Low Power Processor,” http://www.samsung.com/.

[15] Mark Weiser, Brent Welch, Alan Demers and Scott Shenker, “Scheduling for Reduced CPU Energy,” in Proceedings of the 1st USENIX Conference on Operating Systems Design and Implementation, 1994, vol. 1, pp. 13-23.


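NSC-Funded Research Project Final Report: Self-Evaluation Form

Please give an overall assessment of how well the research content matches the original plan, whether the expected goals were achieved, the academic or practical value of the results (briefly describing their significance, value, impact, or potential for further development), whether they are suitable for publication in academic journals or for patent application, and the main findings or other relevant value.

1. Overall assessment of how well the research content matches the original plan and whether the expected goals were achieved:

■ Goals achieved

□ Goals not achieved (please explain, within 100 characters)

□ Experiment failed

□ Experiment interrupted

□ Other reasons

Explanation:

2. Publication of the research results in academic journals, patent applications, and related outcomes:

Papers: ■ Published ■ Unpublished manuscript ■ In preparation □ None

Patents: □ Granted □ Pending □ None

Technology transfer: □ Transferred □ Under negotiation □ None

Other: (within 100 characters)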


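3. Please assess the academic or practical value of the research results in terms of academic achievement, technical innovation, and social impact (briefly describing the significance, value, impact, or potential for further development of the results; within 500 characters).

Academically, this project investigates the effect of dynamic voltage scaling and transition overhead on real-time system scheduling; its algorithm design and scheduling application techniques fall within the scope of academic research. The project first successfully developed a new slack-time analysis method and dynamic voltage scaling technique, called lpFSA, which builds a general-purpose real-time voltage scheduling method on top of rate-monotonic (RM) scheduling. These methods can be applied on top of existing slack-time analysis techniques, predict more available slack time, and distribute it to more tasks, further reducing CPU power consumption. The features are as follows:

(1) A slack-time analysis method based on static-priority scheduling and compatible with existing methods

This DVS technique builds on existing methods such as ccRM and lpWDA. The existing techniques need only slight modification to be combined with the new method, which improves their energy efficiency without increasing their time complexity.

(2) Fewer and smaller voltage adjustments in the schedule for the methods proposed so far

With the slack time obtained by the method above, the slack allocated to each task can be adjusted quickly so as to reduce the number and magnitude of voltage adjustments.

[...] volume 57, and in a technical report (hal-00546926, RTNS2010) of the French research institute HAL-INRIA in October 2010. In addition, follow-up results have been accepted by the international journal Journal of Information Processing Systems, and another part has been submitted to Information Sciences (SCI) and has entered the second round of review.

In terms of technical innovation, the results of computer simulation and practical verification help reduce the actual power consumption of embedded real-time systems and extend system operating time, enabling systems to provide more diverse services. In addition, the rate-monotonic real-time scheduling used here can also be applied to multimedia data transmission and broadband network technology, where it can both guarantee quality of service and reduce the power required for network transmission.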


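NSC-Funded Project: Derived R&D Results Promotion Form

Date: October 20, 2011 (ROC year 100)

NSC-funded project

Project title: A Study on Applying Slack-Time Analysis to Embedded Systems

Principal investigator: Da-Ren Chen (陳大仁)

Project number: NSC 99-2221-E-146-011-

Field: Computer Science and Information Engineering

Title of R&D result

(Chinese, translated) Fluid Slack-Time Analysis for Dynamic Voltage Scaling Methods

(English) Low-power Fluid Slack Analysis for Dynamic Voltage Scaling Methods

Owning institution: Department of Information Management, Hwa Hsia Institute of Technology (華夏技術學院)

Inventor (creator): Da-Ren Chen

Technical description

(Chinese, translated) The idea of low-power fluid slack analysis is to analyze in advance how all the slack time within twice the longest task period can be moved, so that the system can quickly determine how much slack is available to extend the execution time of the task about to run while lowering its execution voltage and speed. Because most existing static-priority DVS scheduling methods are conservative in lowering task execution speed, this concept reduces wasted slack time and effectively lowers the system's power consumption. Moreover, fluid slack analysis can be applied on top of many existing DVS scheduling methods, which need only minor modification to work with it.

(English) Low-power fluid slack analysis pre-analyzes the length of “movable” slack time within an interval twice the length of the longest task period, so that the slack can be used when a new task is dispatched from the ready queue. It can decrease the execution speed and voltage of the current tasks while guaranteeing their timing constraints. Additionally, low-power fluid slack analysis can cooperate with many static-priority DVS scheduling methods with minor revision and does not increase their computational complexity.

Industry sector: Embedded operating system software development; power-saving real-time kernel design for systems-on-chip; real-time scheduling design with dynamic voltage scaling.

Scope of technology/product application: This concept can be implemented in the dynamic voltage scaling methods used by many current real-time schedulers and operating systems, and will effectively assist the development of products based on power-saving systems-on-chip.

Feasibility of technology transfer and expected benefits: Without requiring changes to the existing real-time scheduling method, this technique can be delivered as a module that further strengthens a system's dynamic voltage scaling capability and shortens system upgrade and testing schedules. Applied in industries related to embedded systems, the results of this project help extend product operating time or reduce power consumption, enabling systems to provide more diverse services.

Note: If no patent application has yet been filed for this R&D result, please do not disclose the main patentable content.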


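NSC-Funded Research Project: Report on Attendance at an International Academic Conference

Date: October 20, 2011 (ROC year 100)

Project number: NSC 99-2221-E-146-011-

Project title: A Study on Applying Slack-Time Analysis to Embedded Systems

Traveler: Da-Ren Chen (陳大仁)

Affiliation and title: Assistant Professor, Hwa Hsia Institute of Technology

Conference dates: November 4 to November 5, 2010 (ROC year 99)

Conference venue: Toulouse, France

Conference name: 18th International Conference on Real-Time and Network Systems (RTNS2010)

Paper presented: A practical slack-time analysis method for DVS real-time scheduling

1. Conference participation

The International Conference on Real-Time and Network Systems (RTNS 2010) is held once a year and is organized by the French chapter of the IEEE Computer Society and IRIT (Institut de Recherche en Informatique de Toulouse). Its participants are mainly scholars and researchers from universities and industry worldwide who work on embedded systems and networks. This year's conference was the eighteenth. It is held every year in Toulouse, a major industrial city in southern France, and is an important venue among international conferences on real-time systems.

The author departed from Taipei by air in the early morning of November 3, transferred in Bangkok, Thailand, arrived at Charles de Gaulle Airport in France on November 4, and took a TGV train to Toulouse, followed by a taxi to the conference center at the University of Toulouse. Before the conference the author studied several papers in related areas; during the sessions the author listened attentively and raised questions for discussion with the presenters. On November 5 the author presented his own paper. After the conference ended that day, the author returned to Taiwan on November 10.

2. Impressions of the conference

About twenty-three papers were presented orally in eight sessions, most of which were chaired by well-known scholars in the field. The topics were as follows:

Session 1. Uniprocessor scheduling 1 (chair: Alan Burns)

Session 2. Uniprocessor scheduling 2 (chair: Joel Goossens)

Session 3. Networks (chair: Zoubir Mammeri)

Session 4. Verification and timing analysis (chair: Serge Midonnet)

Session 5. Dynamic voltage and frequency scheduling (chair: Sathish Gopalakrishnan)

Session 6. Fault tolerance (chair: Rob Davis)

Session 7. Multiprocessor scheduling 1 (chair: Laurent George)

Session 8. Multiprocessor scheduling 2 (chair: Pascal Richard)

During the conference the author presented his paper, “A practical slack-time analysis method for DVS real-time scheduling”. Scholars from other countries asked several questions and offered their views, from which the author benefited greatly. From the papers presented across the sessions, the author observed that future research in real-time and embedded systems will lean toward (1) wireless sensor networks, (2) multicore processor scheduling, and (3) dynamic power management techniques.

Besides recharging professionally, the author met many international colleagues previously known only as names on papers, and took the opportunity for some “real-time” academic exchange. The author was also deeply impressed by France's long history and its religious and cultural architecture. Toulouse, the host city, has an urban population of more than 800,000 and is a center of aviation and space exploration for France and indeed for all of Europe. Airbus has a large assembly plant there, responsible for the final assembly of aircraft. The city also has a space base dedicated to launching satellites, called the Cité de l'espace. It is also a well-known university town in France, with 110,000 students among its 800,000-plus residents; its historical and cultural atmosphere lends Toulouse a touch of sensibility amid its rationality.

3. Suggestions

(1) Before attending, the author learned that the number of papers was not large, so the authors presented sequentially in a single track. This made it possible to hear most of the papers from beginning to end. Papers of a different type are also worth hearing; they can spark ideas by analogy and help one get the most out of the conference.

(2) While listening to the talks, the author noticed that some presenters whose native language is not English spoke with a rather heavy accent, which may be a common problem among scholars from non-English-speaking countries. The author suggests that the NSC offer more incentives and subsidies for overseas research by domestic scholars, so that everyone has more opportunities to practice presenting in English.

(3) From discussions with scholars from various countries at this conference, the author learned that universities everywhere are training considerably more information technology personnel in response to future trends. Domestically, talent in the information field is also in short supply, and talent in communication networks is scarcer still. Cultivating talent is a very important issue for developing a knowledge economy and deserves careful thought and early planning by the relevant authorities. Finally, the author thanks the NSC for funding attendance at this conference; the opportunity to exchange experience and research insights with scholars from around the world will greatly benefit the author's future research.

4. Materials brought back

Printed conference proceedings and CD-ROM.

5. Other

Certificate of attendance and presentation (next page).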


Da-Ren Chen

From: Ki-seo Park [[email protected]]

Sent: Friday, September 2, 2011, 12:51 PM

To: [email protected]

Subject: [KIPS] JIPS - Review Result[11E03-030]

Journal of Information Processing Systems

Paper number: 11E03-030

Paper Title: Efficient Algorithms for Pinwheel Tasks to Discrete Voltage Schedules

Author(s): Da-Ren Chen

The first round review for the paper above has been finished, and your paper has

been accepted for the final edition. Congratulations!

The deadline for the camera-ready paper (MS Word version only) is September 23.

You will need to revise your paper according to the Journal of Information

Processing System format.

When you submit the final paper, please attach KIPS copyright transfer form. We

cannot guarantee to include your paper in final edition, in case of late submission

of the final camera-ready paper.

The reviews for your paper are attached in this email. We hope that the review

feedback is helpful to you.

I appreciate your contribution to the journal.

---Doo-Soon Park

Editor-in-Chief of JIPS

Professor at Soonchunhyang University


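Report on Attendance at an International Academic Conference

Project number: NSC 99-2221-E-146-011

Project title: A Study on Applying Slack-Time Analysis to Embedded Systems

Traveler: Da-Ren Chen (陳大仁)

Affiliation and title: Assistant Professor, Department of Information Management, Hwa Hsia Institute of Technology

Conference dates and venue: Toulouse, France, November 4 to November 5, 2010 (ROC year 99)

Conference name: 18th International Conference on Real-Time and Network Systems (RTNS 2010)

Title of paper presented: A Practical Slack-time Analysis Method for DVS Real-time Scheduling

Main content

1. Conference participation

The International Conference on Real-Time and Network Systems (RTNS 2010) is held once a year and is organized by the French chapter of the IEEE Computer Society and IRIT (Institut de Recherche en Informatique de Toulouse). Its participants are mainly scholars and researchers from universities and industry worldwide who work on embedded systems and networks. It is held every year in Toulouse, a major industrial city in southern France, and is an important venue among international conferences on real-time systems.

The author departed from Taipei by air in the early morning of November 3, transferred in Bangkok, Thailand, arrived at Charles de Gaulle Airport in France on November 4, and took a TGV train to Toulouse, followed by a taxi to the conference center at the University of Toulouse. Before the conference the author studied several papers in related areas; during the sessions the author listened attentively and raised questions for discussion with the presenters. On November 5 the author presented his own paper. After the conference ended that day, the author returned to Taiwan on November 10.

2. Impressions of the conference

About twenty-three papers were presented orally in eight sessions, each chaired by a well-known scholar in the field. The topics were as follows:

Session 1. Uniprocessor scheduling 1 (chair: Alan Burns)

Session 2. Uniprocessor scheduling 2 (chair: Joel Goossens)

Session 3. Networks (chair: Zoubir Mammeri)

Session 4. Verification and timing analysis (chair: Serge Midonnet)

Session 5. Dynamic voltage and frequency scheduling (chair: Sathish Gopalakrishnan)

Session 6. Fault tolerance (chair: Rob Davis)

Session 7. Multiprocessor scheduling 1 (chair: Laurent George)

Session 8. Multiprocessor scheduling 2 (chair: Pascal Richard)

During the conference the author presented his paper, “A practical slack-time analysis method for DVS real-time scheduling”. Scholars from other countries asked several questions and offered their views, from which the author benefited greatly. From the papers presented across the sessions, the author observed that future research in real-time and embedded systems will lean toward (1) wireless sensor networks, (2) multicore processor scheduling, and (3) dynamic power management techniques.

Besides recharging professionally, the author met many international colleagues previously known only as names on papers, and took the opportunity for some “real-time” academic exchange. The author was also deeply impressed by France's long history and its religious and cultural architecture. Toulouse, the host city, has an urban population of more than 800,000 and is a center of aviation and space exploration for France and indeed for all of Europe. Airbus has a large assembly plant there, responsible for the final assembly of aircraft. The city also has a space base dedicated to launching satellites, called the Cité de l'espace. It is also a well-known university town in France, with 110,000 students among its 800,000-plus residents; its historical and cultural atmosphere lends Toulouse a touch of sensibility amid its rationality.

3. Suggestions

(1) Before attending, the author learned that the number of papers was not large, so the authors presented sequentially in a single track. This made it possible to hear most of the papers from beginning to end. Papers of a different type are also worth hearing; they can spark ideas by analogy and help one get the most out of the conference.

(2) While listening to the talks, the author noticed that some presenters whose native language is not English spoke with a rather heavy accent, which may be a common problem among scholars from non-English-speaking countries. The author suggests that the NSC offer more incentives and subsidies for overseas research by domestic scholars, so that everyone has more opportunities to practice presenting in English.

(3) From discussions with scholars from various countries at this conference, the author learned that universities everywhere are training considerably more information technology personnel in response to future trends. Domestically, talent in the information field is also in short supply, and talent in communication networks is scarcer still. Cultivating talent is a very important issue for developing a knowledge economy and deserves careful thought and early planning by the relevant authorities. Finally, the author thanks the NSC for funding attendance at this conference; the opportunity to exchange experience and research insights with scholars from around the world will greatly benefit the author's future research.

4. Other

Materials brought back: complete conference proceedings and CD-ROM.

Certificate of attendance and presentation: last page.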


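NSC-Funded Project: Derived R&D Results Promotion Form

Date: 2011/09/20

NSC-funded project

Project title: A Study on Applying Slack-Time Analysis to Embedded Real-Time Systems

Principal investigator: Da-Ren Chen (陳大仁)

Project number: 99-2221-E-146-011-

Discipline: Theory of Computation and Algorithms

No R&D results promotion data.

Summary Table of Research Results for Research Projects, ROC Year 99

Principal investigator: Da-Ren Chen

Project number: 99-2221-E-146-011-

Project title: A Study on Applying Slack-Time Analysis to Embedded Real-Time Systems

Result item | Quantified: number actually achieved (accepted or published) | Quantified: expected total (including the number actually achieved)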
