GSR: A global seek-optimizing real-time disk-scheduling algorithm


Hsung-Pin Chang a,*, Ray-I Chang b, Wei-Kuan Shih c, Ruei-Chuan Chang d

a Department of Computer Science, National Chung Hsing University, 250 Kuo Kuang Road, Taichung 402, Taiwan, ROC

b Institute of Engineering Science and Ocean Engineering, National Taiwan University, Taipei, Taiwan, ROC

c Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, ROC

d Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan, ROC

Received 22 August 2005; received in revised form 31 March 2006; accepted 31 March 2006

Available online 22 May 2006

Abstract

Earliest-deadline-first (EDF) is good at scheduling real-time tasks to meet timing constraints. However, it is not good enough for scheduling real-time disk tasks to achieve high disk throughput. In contrast, although SCAN can maximize disk throughput, its schedule results may violate real-time requirements. Thus, during the past few years, various approaches have been proposed that combine EDF and SCAN (e.g., SCAN-EDF and RG-SCAN) to resolve the real-time disk-scheduling problem. However, in previous schemes, real-time tasks can only be rescheduled by SCAN within a local group. This restriction limits the obtained data throughput. In this paper, we propose a new globally rescheduling scheme for real-time disk scheduling. First, we formulate the relations between the EDF schedule and the SCAN schedule of input tasks as an EDF-to-SCAN mapping (ESM). Then, on the basis of ESM, we propose a new real-time disk-scheduling algorithm: the globally seek-optimizing rescheduling (GSR) scheme. Unlike previous approaches, a task in GSR may be rescheduled to anywhere in the input schedule to optimize data throughput. Owing to this globally rescheduling characteristic, GSR obtains a higher disk throughput than previous approaches. Furthermore, we also extend GSR to serve non-real-time tasks fairly. Experiments show that, given 15 real-time tasks, our data throughput is 1.1 times that of RG-SCAN. In addition, in a mixed workload, compared with RG-SCAN, our GSR achieves over 7% improvement in data throughput and 33% improvement in average response time.

Keywords: Real-time disk scheduling; Disk scheduling; Operating systems

1. Introduction

Recent advances in hardware technology and network communications have increased the popularity of data services. In some applications, the data services must be provided with timing characteristics. For example, media data must be accessed under real-time constraints to guarantee jitter-free playback. Furthermore, media data are often large in volume and consume significant disk bandwidth. As a result, the performance of multimedia applications depends heavily on the real-time disk-scheduling algorithm applied (Lougher and Shepherd, 1993; Steinmetz, 1995). A well-behaved real-time disk-scheduling algorithm should maximize data throughput while guaranteeing real-time constraints.

The earliest time at which a disk task can start is defined as its ready time (or release time). The latest time at which a disk task must be completed is its deadline. The actual times at which a disk task is started and completed are its start-time and fulfill-time, respectively. To meet the timing constraints of real-time data services, each disk task must be scheduled so that its start-time is not earlier than its ready time and its fulfill-time is not later than its deadline (Stankovic and Buttazzo, 1995). Unlike the conventional disk-scheduling problem, timing constraints on accessing real-time data are crucial for supporting timing-critical applications (Anderson et al., 1991, 1992; Chang et al., 1997). Although the well-known SCAN algorithm, which scans the disk surface back and forth to retrieve the data under the disk head, has been shown to be the best algorithm for maximizing disk throughput (Chen and Yang, 1992; Chen et al., 1992), its schedule may not meet the timing constraints of real-time disk tasks. Therefore, real-time data retrieved by SCAN may be meaningless and even harmful to systems (Gemmell et al., 1995; Gemmell and Christodoulakis, 1992).

* Corresponding author. Tel.: +886 4 22852106; fax: +886 4 22853869. E-mail address: [email protected] (H.-P. Chang).

In contrast, earliest-deadline-first (EDF), which serves tasks in deadline order, is one of the best-known schemes for scheduling real-time tasks (Liu and Layland, 1973; Lehoczky, 1990). However, EDF is optimal only under the assumption that tasks are independent. In real-time disk scheduling, tasks are non-preemptive and inter-dependent. Once a disk task is issued to a disk drive, its service cannot be interrupted. Moreover, the service time of each disk task depends not only on the location of the data block retrieved, but also on the current location of the disk head. As a result, by taking only deadlines into account and ignoring service time, EDF incurs excessive seek-time costs and results in poor disk throughput (Yee and Varaiya, 1991; Reddy and Wyllie, 1994). Owing to the non-preemptive property and the non-prespecified service time of disk tasks, it is very hard to optimize the schedule result of real-time disk tasks. This problem has been proved to be NP-complete (Wong, 1980).

Previous studies have examined heuristic methods for combining the features of SCAN-type seek-optimizing algorithms with EDF-type real-time scheduling algorithms. In 1993, Reddy and Wyllie proposed the SCAN-EDF method, which first sorts input tasks in EDF order and then reschedules tasks with the same deadline by SCAN. Experiments show that the results obtained depend highly on the probability that tasks have the same deadline. To increase the probability of employing SCAN to reschedule tasks, DM-SCAN (deadline-modification-SCAN) and RG-SCAN (reschedulable-group-SCAN) were proposed to automatically select contiguous tasks that can be rescheduled by SCAN (Chang et al., 1998, 2002). In other words, these contiguous tasks can be viewed as having the "same deadline" in SCAN-EDF. However, previous approaches are locally seek-optimizing schemes; i.e., tasks can only be rescheduled by SCAN within a local group. Note that each group is a set of consecutive tasks that can be rescheduled by SCAN without missing their respective timing constraints. For example, in SCAN-EDF, a group is made up of tasks having the same deadline. Similarly, given a set of EDF tasks, DM-SCAN automatically selects groups of consecutive tasks, and these groups are named MSGs (maximum-scannable-groups). RG-SCAN has its own group definition, called the R-Group (reschedulable-group). However, in SCAN-EDF, DM-SCAN, and RG-SCAN alike, once a task belongs to a certain group, it cannot be rescheduled to a different group, even if such a rescheduling would yield better performance. The detailed operations of DM-SCAN and RG-SCAN with their proposed MSG and R-Group concepts are introduced in Section 2.

To resolve the drawback of previous approaches, we propose herein a globally seek-optimizing scheduling approach: the GSR (globally seek-optimizing rescheduling) scheme. First, a graph of EDF-to-SCAN mapping (ESM) is introduced to explore relations between the EDF schedule and the SCAN schedule of input tasks. Given a set of real-time disk tasks, the schedule results of EDF and SCAN are simply two permutations of the input tasks. By representing each task as a vertex and connecting each task in the EDF schedule to the same task in the SCAN schedule with an edge, we obtain a bipartite mapping, which is called ESM in this paper. On the basis of this ESM mapping, our algorithm then identifies scan-groups, where each scan-group contains the maximum number of contiguous tasks that are in the same SCAN direction (left-to-right or right-to-left). Now, the input schedule can be viewed as a piecewise-SCAN schedule. After that, input tasks are tested for being rescheduled into suitable scan-groups to achieve the highest improvement of disk throughput while guaranteeing real-time requirements. Thus, our scheme provides a good combination of the EDF scheme and the SCAN scheme.

Note that, since there are at most $n$ scan-groups, where $n$ is the number of input tasks, a naive algorithm takes $O(n^2)$ time to decide the best reschedule result for the selected task. To speed up this computation, we introduce the concept of a schedulable-region for each input task. With the help of the pre-computed schedulable-regions, the best-fit scan-group for rescheduling each input task can be decided in $O(n)$ time. In addition, we extend GSR to serve mixed real-time/non-real-time disk tasks such that non-real-time tasks can be served to minimize response time while guaranteeing the timing constraints of real-time tasks. Compared with DM-SCAN, experiments show that our GSR algorithm can support over 11% data throughput improvement in a real-time system. Moreover, in a mixed workload, our GSR achieves over 7% improvement compared with SCAN scheme in data throughput and offers 33% improvement compared with RG-SCAN in terms of average response time of non-real-time tasks.

The remainder of this paper is organized as follows. Section 2 gives mathematical definitions for real-time disk scheduling and reviews related work. The EDF-to-SCAN mapping and our proposed GSR algorithm are introduced in Section 3. In Section 4, we present the definition of the schedulable-region and propose a speed-up method for scheduling. Section 5 demonstrates how GSR is extended to efficiently serve mixed real-time/non-real-time disk tasks. Finally, Sections 6 and 7 present the experimental results and concluding remarks, respectively.


2. Problem descriptions and related work

2.1. Real-time disk-scheduling problem

The problem input considered in this paper is a set of real-time disk tasks $T = \{T_0, T_1, \ldots, T_n\}$, where $n$ is the number of tasks. The $i$th task is represented by $T_i = (r_i, d_i, a_i, l_i, b_i)$, where $r_i$ is its ready time, $d_i$ is its deadline, $a_i$ is its track location, $l_i$ is its sector number and $b_i$ is its data size. To serve disk task $T_i$, the disk head must first move from the current track to the target track $a_i$, which costs a seek time. Then, a rotational latency is incurred while the desired sector $l_i$ rotates under the disk head. Finally, the data of size $b_i$ under the disk head are retrieved, which costs a transfer time. The first task $T_0$ is a special task that represents the initial location of the disk head. Without loss of generality, it can be assumed to be at the outermost track (track 0). Assume that the schedule sequence is $T_j T_i$ ($T_i$ is served after $T_j$). The service time of task $T_i$ is calculated as

$$c_{j,i} = \text{seek\_time}(|a_i - a_j|) + \text{rotational\_latency}(l_i) + \text{transfer\_time}(b_i). \tag{1}$$

Clearly, the service time depends not only on the issued disk task itself, but also on the previous one. For example, in an HP 97560 hard disk (Ruemmler and Wilkes, 1994), the seek time $\text{seek\_time}_{j,i}$ with moving distance $D_{j,i} = |a_j - a_i|$ can be modeled by

$$\text{seek\_time}_{j,i} = \begin{cases} 3.24 + 0.4\sqrt{D_{j,i}}, & D_{j,i} \le 383, \\ 8.00 + 0.008\,D_{j,i}, & D_{j,i} > 383. \end{cases} \tag{2}$$

Since $T_i$ is a real-time task, the ready time $r_i$ and deadline $d_i$ are used to characterize its timing constraint (Stankovic and Buttazzo, 1995). Because disk service is non-preemptive, the related start-time and fulfill-time are $e_i = \max\{r_i, f_j\}$ and $f_i = e_i + c_{j,i}$. A simple example $T = \{T_0, T_1, T_2, T_3\}$ demonstrating the terminology used in this paper is shown in Fig. 1. A schedule result of real-time disk tasks $T = \{T_0, T_1, \ldots, T_n\}$ is called feasible if all input tasks $T_i$, for $i = 0$ to $n$, satisfy the real-time requirements $r_i \le e_i$ and $f_i \le d_i$ (Lehoczky, 1990; Stankovic and Buttazzo, 1995). To measure the efficiency of a real-time disk-scheduling algorithm, given a set of real-time disk tasks, the applied disk-scheduling algorithm should serve as many tasks as possible under the tasks' timing constraints. If the same number of tasks is feasibly served, the applied disk-scheduling algorithm needs to maximize data throughput.
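As a minimal sketch of Eqs. (1) and (2), the following Python fragment evaluates the service time of a task for a given previous head position. The function names `seek_time_ms` and `service_time_ms` are illustrative, the millisecond unit is an assumption (no unit is stated above), and the rotational latency and transfer time are passed in as plain numbers rather than modeled.

```python
import math


def seek_time_ms(distance):
    """Eq. (2): seek time as a function of the seek distance in tracks.

    The unit (milliseconds) is an assumption; the text gives no unit.
    """
    if distance <= 383:
        return 3.24 + 0.4 * math.sqrt(distance)
    return 8.00 + 0.008 * distance


def service_time_ms(prev_track, track, rotational_latency_ms, transfer_time_ms):
    """Eq. (1): seek time + rotational latency + transfer time.

    prev_track is a_j (track of the previously served task) and track is a_i.
    The latency and transfer terms are supplied by the caller (hypothetical
    values), since only the seek-time model is given in Eq. (2).
    """
    distance = abs(track - prev_track)
    return seek_time_ms(distance) + rotational_latency_ms + transfer_time_ms


if __name__ == "__main__":
    # The same target track costs more when the head starts farther away.
    near = service_time_ms(150, 100, rotational_latency_ms=1.0, transfer_time_ms=2.0)
    far = service_time_ms(500, 100, rotational_latency_ms=1.0, transfer_time_ms=2.0)
    print(near, far)
```

The two calls differ only in the previous track, which is the sense in which the service time of a task is related to the task served before it.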

To determine the data throughput improvement, we define the schedule fulfill-time as the time at which all input tasks have been served according to their respective timing constraints. Clearly, this is the fulfill-time of the last task, $f_n$. Since the disk throughput is inversely related to the schedule fulfill-time, the objective of maximizing disk throughput can be restated as minimizing the schedule fulfill-time. In real-time disk scheduling, the system time required for serving each task is determined by its schedule sequence (Peterson and Silberschatz, 1985). However, the schedule sequence of a task in turn depends on the service time it requires. Thus, it is hard to decide the optimal schedule result that maximizes the disk throughput without violating real-time requirements.

2.2. Related work

Because the problem is NP-complete, previous real-time disk-scheduling algorithms heuristically apply the seek-optimizing SCAN scheme to an EDF schedule to reduce the disk service time. For example, the well-known SCAN-EDF scheme reschedules tasks that have the same deadline in an EDF schedule to reduce their service times. However, since only tasks having the same deadline are seek-optimized, the reduction in schedule fulfill-time compared with EDF is not significant. To increase the probability of employing the SCAN scheme to reschedule input tasks, DM-SCAN (deadline-modification-SCAN) proposed the concept of the maximum-scannable-group (MSG) (Chang et al., 1998). Given an EDF schedule, consecutive tasks that can be rescheduled by SCAN without missing their respective timing constraints can be directly derived by the concept of MSG. Given a set of real-time disk tasks in EDF order $T = T_1 T_2 \ldots T_n$, the MSG $G_i$ starting from $T_i$ is defined as the sequence of tasks $G_i = T_i T_{i+1} T_{i+2} \ldots T_{i+m}$ in which each task $T_k$, for $k = i$ to $i + m$, satisfies $f_k \le d_i$ and $r_k \le s_i$. However, DM-SCAN requires that the input tasks be EDF-ordered. Therefore, its authors proposed a deadline-modification scheme that transforms a non-EDF schedule into EDF order by modifying the tasks' deadlines. Unfortunately, in order to guarantee real-time constraints, the modified deadlines are earlier than the original ones. As a result, the deadline-modification scheme has a negative impact on the number of tasks supported by DM-SCAN.

request   r_i   d_i   a_i   b_i   c_{j,0}   c_{j,1}   c_{j,2}   c_{j,3}
T0        0     0     0     0     -         3         5         6
T1        1     11    2     1     3         -         3         4
T2        0     5     4     1     5         3         -         2
T3        3     12    5     1     6         4         2         -

[Fig. 1: example task set T = {T0, T1, T2, T3}. The row for request Tj lists the service times c_{j,i}. The accompanying timing diagram (axes: time, disk track), which marks each task's ready time and deadline, is not reproduced here.]

To relax this constraint, RG-SCAN (reschedulable-group-SCAN) was proposed with the concept of the R-Group (reschedulable group). Given a set of real-time disk tasks $T = T_1 T_2 \ldots T_n$, the R-Group $G_i$ starting from task $T_i$ is defined as the maximum number of consecutive tasks $G_i = T_i T_{i+1} \ldots T_{i+m}$ in which each task $T_k$, for $k = i$ to $i + m$, satisfies the following criteria:

$$f_{i+m} \le \min_{k=i}^{i+m} \{d_k\} \quad \text{and} \quad \max_{k=i}^{i+m} \{r_k\} \le s_i. \tag{3}$$

RG-SCAN also shows that after seek-optimizing tasks within an R-Group, it can obtain more data throughput while guaranteeing real-time requirements.
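To make the criteria of Eq. (3) concrete, the sketch below grows an R-Group from a given position of the EDF schedule of the Fig. 1 example. It is an illustration under stated assumptions rather than the RG-SCAN implementation: $s_i$ is taken to be the start-time of $T_i$ in the input schedule, and the helper names `start_fulfill` and `r_group` are hypothetical.

```python
# Fig. 1 data, indexed by task number: ready times, deadlines, and the
# service-time matrix C[j][i] for serving Ti right after Tj.
R = [0, 1, 0, 3]
D = [0, 11, 5, 12]
C = [[0, 3, 5, 6],
     [3, 0, 3, 4],
     [5, 3, 0, 2],
     [6, 4, 2, 0]]


def start_fulfill(order):
    """Start and fulfill times of every task served in the given order."""
    s, f = {order[0]: 0}, {order[0]: 0}
    for prev, cur in zip(order, order[1:]):
        s[cur] = max(R[cur], f[prev])      # non-preemptive: wait for prev
        f[cur] = s[cur] + C[prev][cur]
    return s, f


def r_group(order, i):
    """Longest run order[i..i+m] that satisfies both criteria of Eq. (3).

    Assumption: s_i is the start-time of the first task of the group in the
    input schedule. Both criteria can only get harder to satisfy as the run
    grows, so the scan stops at the first failure.
    """
    s, f = start_fulfill(order)
    end = i
    for j in range(i, len(order)):
        window = order[i:j + 1]
        fits = (f[order[j]] <= min(D[k] for k in window)
                and max(R[k] for k in window) <= s[order[i]])
        if not fits:
            break
        end = j
    return order[i:end + 1]


if __name__ == "__main__":
    edf = [0, 2, 1, 3]                     # T0 T2 T1 T3
    for i in range(1, len(edf)):
        print("group starting at position", i, "->", r_group(edf, i))
```

Under these assumptions every group in this small example contains a single task, so a scheme restricted to rescheduling within groups has nothing to reorder here.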

However, previous approaches are locally seek-optimizing algorithms: the SCAN scheme is applied only to a set of consecutive tasks. Thus, a task can only be rescheduled by SCAN within its own group. In other words, a task belonging to a group $i$, say an R-Group, cannot be rescheduled to another group $j$, since this would violate the constraints of Eq. (3), even if such a rescheduling would yield a higher disk throughput. Therefore, in this paper, we propose a globally seek-optimizing real-time disk-scheduling algorithm, GSR, to overcome the limitations of previous approaches. In GSR, a task belonging to a group $i$ can be "globally" rescheduled to another group $j$ as long as the rescheduled result obtains a higher data throughput while guaranteeing real-time constraints.

3. GSR: Globally seek-optimizing rescheduling scheme

In this section, we present the design of our proposed GSR real-time disk-scheduling algorithm. Section 3.1 first shows the construction of the EDF-to-SCAN mapping (ESM) graph, which relates an EDF schedule to a SCAN schedule. Then, we present the idea of scan-groups, which is derived from an ESM graph. On the basis of scan-groups, our proposed GSR algorithm is described in Section 3.2.

3.1. EDF-to-SCAN mapping

As described in Section 1, although SCAN can maximize data throughput, its schedule result may not meet the timing constraints of real-time tasks. In contrast, the EDF schedule is good for real-time requirements, but its disk throughput is low. Fig. 2 demonstrates the SCAN schedule and the EDF schedule of the example shown in Fig. 1. In Fig. 2(a), the SCAN schedule derives a shorter schedule fulfill-time but is not feasible. In contrast, the EDF schedule shown in Fig. 2(b) is feasible but results in a longer schedule fulfill-time. There is a tradeoff between these two extreme cases. It motivates us to construct a graph of EDF-to-SCAN mapping (ESM) to develop a new scheme for real-time disk scheduling.

[Figure: (a) timeline of T_SCAN = T0 T1 T2 T3, in which T2 misses its deadline d2; (b) timeline of T_EDF = T0 T2 T1 T3; horizontal axis: time t from 0 to 12.]

Fig. 2. For the example presented in Fig. 1, the schedule results obtained by (a) the SCAN approach and (b) the EDF approach are demonstrated. Note that the schedule result obtained by SCAN is not feasible.

Given a set of real-time disk tasks $T = \{T_0, T_1, \ldots, T_n\}$, their SCAN schedule is $T_{SCAN} = T_{S(0)} T_{S(1)} \ldots T_{S(n)}$ with data locations $a_{S(0)} \le a_{S(1)} \le \cdots \le a_{S(n)}$, where $S(i)$, for $i = 0$ to $n$, is a permutation of the indexes $\{0, 1, \ldots, n\}$. We can represent $T_{SCAN}$ by a one-dimensional graph $G_{SCAN} = (V_{SCAN}, E_{SCAN})$, where the vertex set is $V_{SCAN} = T$ and the edge set is $E_{SCAN} = \{(T_{S(i-1)}, T_{S(i)}) \mid i = 1 \text{ to } n\}$. The same idea can be applied to the tasks' EDF schedule $T_{EDF} = T_{E(0)} T_{E(1)} \ldots T_{E(n)}$ with deadlines $d_{E(0)} \le d_{E(1)} \le \cdots \le d_{E(n)}$, where $E(i)$, for $i = 0$ to $n$, is also a permutation of the indexes $\{0, 1, \ldots, n\}$. The related graph is $G_{EDF} = (V_{EDF}, E_{EDF})$, where the vertex set is $V_{EDF} = T$ and the edge set is $E_{EDF} = \{(T_{E(i-1)}, T_{E(i)}) \mid i = 1 \text{ to } n\}$. Since $V_{SCAN} = V_{EDF} = T$, there is a bipartite mapping between $G_{SCAN}$ and $G_{EDF}$.

Definition 1. [EDF-to-SCAN mapping (ESM)] The EDF-to-SCAN mapping of real-time disk tasks $T = \{T_0, T_1, \ldots, T_n\}$ is a bipartite graph $G_{ESM} = (V_{ESM}, E_{ESM})$ that satisfies (1) the vertex set $V_{ESM} = V_{EDF} \cup V_{SCAN}$, and (2) the edge set $E_{ESM} = \{(T_{E(j)}, T_{S(i)}) \mid T_{E(j)} \in V_{EDF},\ T_{S(i)} \in V_{SCAN} \text{ and } T_{E(j)} = T_{S(i)}\}$.

Fig. 3(a) shows an example of ESM for the input tasks in Fig. 1. Using this mapping, we can investigate the possible transformations from the real-time EDF schedule to the seek-optimized SCAN schedule to find a good real-time disk schedule.
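The mapping of Definition 1 can be sketched in a few lines of Python for the Fig. 1 tasks. The function name `esm` and the representation of an edge as a pair of positions (position in the EDF schedule, position in the SCAN schedule) are choices made for this illustration only.

```python
def esm(tasks):
    """Build the EDF and SCAN permutations and the ESM edges.

    tasks maps a task name to (deadline, track). An edge (j, i) connects
    position j of the EDF schedule to position i of the SCAN schedule
    whenever both positions hold the same task.
    """
    edf = sorted(tasks, key=lambda t: tasks[t][0])    # by deadline
    scan = sorted(tasks, key=lambda t: tasks[t][1])   # by track location
    edges = [(j, scan.index(t)) for j, t in enumerate(edf)]
    return edf, scan, edges


if __name__ == "__main__":
    # (deadline, track) of each task, taken from Fig. 1.
    fig1 = {"T0": (0, 0), "T1": (11, 2), "T2": (5, 4), "T3": (12, 5)}
    edf, scan, edges = esm(fig1)
    print(edf)    # ['T0', 'T2', 'T1', 'T3']
    print(scan)   # ['T0', 'T1', 'T2', 'T3']
    print(edges)  # [(0, 0), (1, 2), (2, 1), (3, 3)]
```

The crossing pair of edges (1, 2) and (2, 1) is where the two schedules disagree: T2 precedes T1 in deadline order but follows it in track order.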

To mimic the behavior of the SCAN schedule, we first track the scan directions of tasks in the input schedule (usually, but not necessarily, an EDF schedule) to decompose the input schedule into a sequence of scan-groups, where each scan-group contains the maximum number of contiguous tasks with the same SCAN direction. Therefore, the input schedule can be represented by a piecewise-SCAN schedule. Given an input schedule $T_0 T_1 T_2 \ldots T_n$, an $O(n)$ algorithm to identify all these scan-groups is shown as follows:

Algorithm 1 (Scan-groups identification (SGI))

/* INPUT: an EDF schedule T_0 T_1 T_2 ... T_n. OUTPUT: a set of scan-groups. */
Initialize the 1st scan-group S_1 = T_0 T_1;
Initialize direction = +1; /* from location a_0 = 0 to location a_1, where a_0 <= a_1 */
i = 1; /* the index of scan-group */
for k = 2 to n do begin /* for each task */
    if (a_{k-1} <= a_k) then new_direction = +1;
    else new_direction = -1;
    if (direction = new_direction) then S_i = S_i + T_k; /* in the same scan-group */
    else begin /* new scan-group */
        i = i + 1;
        Initialize the ith scan-group S_i = T_{k-1} T_k;
        direction = new_direction;
    end /* else */
end /* for */

[Figure: (a) ESM graph linking T_EDF = T0 T2 T1 T3 to T_SCAN = T0 T1 T2 T3; (b) scan-groups S1 = T0 T2, S2 = T2 T1, S3 = T1 T3.]

Fig. 3. (a) The EDF-to-SCAN mapping (ESM) graph for the tasks presented in Fig. 1 is shown. (b) Based on this mapping graph, we can identify scan-groups for the EDF schedule.

As shown in the example of Fig. 3(b), by mapping the input EDF schedule $T_0 T_2 T_1 T_3$ to the SCAN schedule $T_0 T_1 T_2 T_3$, we obtain a piecewise-SCAN schedule with three scan-groups $S_1 = T_0 T_2$, $S_2 = T_2 T_1$ and $S_3 = T_1 T_3$. This introduces a new point of view for analyzing the relations between the input schedule and the SCAN schedule, from which different heuristic methods can be developed for real-time disk scheduling. For example, we may try to minimize the number of scan-groups or to maximize the sizes of scan-groups under the real-time requirements of such a piecewise-SCAN schedule, since the more tasks that can be seek-optimized, the higher the disk throughput obtained. In this paper, on the basis of ESM, an effective and efficient real-time disk-scheduling method, GSR, is proposed. Note that although the same idea can be employed to represent a SCAN schedule by a piecewise-EDF schedule, its schedule result usually violates real-time requirements.
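The scan-group identification described above can be sketched in Python as follows. The function name `scan_groups` and the representation of a scan-group as a list of task numbers are choices made for this illustration; the single pass over the schedule mirrors the $O(n)$ loop of Algorithm 1.

```python
def scan_groups(order, track):
    """Split a schedule into maximal runs with the same scan direction.

    order is the schedule as a list of task numbers with order[0] the
    head-position task T0; track[t] is the track location a_t. Adjacent
    scan-groups share their boundary task, as in S1 = T0 T2, S2 = T2 T1.
    """
    if len(order) < 2:
        return [list(order)]
    groups = [[order[0], order[1]]]
    direction = +1                       # from a_0 = 0 towards a_1 >= a_0
    for k in range(2, len(order)):
        rising = track[order[k - 1]] <= track[order[k]]
        new_direction = +1 if rising else -1
        if new_direction == direction:
            groups[-1].append(order[k])  # same scan-group
        else:
            groups.append([order[k - 1], order[k]])  # new scan-group
            direction = new_direction
    return groups


if __name__ == "__main__":
    track = {0: 0, 1: 2, 2: 4, 3: 5}     # track locations from Fig. 1
    print(scan_groups([0, 2, 1, 3], track))  # [[0, 2], [2, 1], [1, 3]]
    print(scan_groups([0, 1, 2, 3], track))  # [[0, 1, 2, 3]]
```

For the EDF schedule of Fig. 1 the sketch yields the three scan-groups quoted above, and for the SCAN schedule it yields a single group, as expected for a schedule that is already one sweep.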

3.2. GSR algorithm

In this subsection, we describe our proposed GSR algorithm. Without loss of generality, we assume that the input is a feasible schedule $T_{EDF}$ that meets the basic timing constraints. Our algorithm then selects an input task in first-in-first-serve (FIFS) order and tries to reschedule the selected task into the best-fit scan-group to maximize disk throughput under real-time requirements. For example, given the input tasks shown in Fig. 1, we can improve disk throughput by rescheduling task $T_3$ into scan-group $T_0 T_2$. The new scan-groups are $S_1 = T_0 T_2 T_3$ and $S_2 = T_3 T_1$, as shown in Fig. 4. Note that this reduces the number of scan-groups by 1 ($S_3$ is removed) and increases the size of scan-group $S_1$ from 2 to 3. In this paper, a rescheduled result is accepted only when it is feasible and the disk throughput is improved. Considering the input tasks shown in Fig. 4, $T_2$ will miss its deadline if $T_1$ is rescheduled into the scan-group $T_0 T_2$. The rescheduled result $T_0 T_1 T_2 T_3$ is not acceptable as it violates real-time constraints. A detailed description of the proposed GSR algorithm is given as follows.

Algorithm 2 (GSR real-time disk-scheduling algorithm)

/* INPUT: a feasible EDF schedule T_0 T_1 T_2 ... T_n. OUTPUT: an improved schedule result. */
Identify all scan-groups {S_1, S_2, ...} of input schedule T_0 T_1 T_2 ... T_n;
for i = 2 to n do begin /* for all tasks T_i */
    Assume that T_i is in the jth scan-group S_j;
    Initialize the index of scan-group tested by task T_i as p = j;
    The initial value of the improvement of data throughput is O_p = 0;
    for q = j-1 down to 1 do begin /* test all scan-groups S_q before S_j */
        Try to reschedule task T_i into scan-group S_q as a new schedule;
        Compute the improvement of data throughput O_q after rescheduling;
        if ((the new schedule is feasible) and (O_p <= O_q))
            then the new index of the best-fit scan-group is p = q;
    end /* for */
    if (O_p > 0) then do begin
        Reschedule T_i into S_p as the rescheduled result T_resch;
        Identify all scan-groups {S_1, S_2, ...} of T_resch;
    end /* if */
end /* for */

[Figure: timeline of the rescheduled result T_new = T0 T2 T3 T1; horizontal axis: time t from 0 to 12.]

Fig. 4. For the example presented in Fig. 1, we can select a suitable task T3 and reschedule it into the suitable scan-group T0 T2 to improve the disk throughput.

After identifying all scan-groups, GSR tries to reschedule each task into every scan-group before its own scan-group and computes the improvement of data throughput of each rescheduling result. The feasible result with the largest throughput improvement is selected for rescheduling (the task is moved from its own scan-group to the new one). Notably, the above algorithm assumes that the input schedule is feasible. However, even if an infeasible input schedule is given, GSR may still produce a feasible output schedule, although this is not guaranteed.
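The rescheduling loop can be sketched in Python on the Fig. 1 example as follows. This is a hedged illustration, not the authors' implementation: the text does not fix where inside a scan-group a task is placed, so the sketch assumes it is inserted at the position that keeps the group's track order monotone, a task on a group boundary is assigned to the earlier group, and the improvement is measured as the relative reduction in schedule fulfill-time. All function names are hypothetical.

```python
# Fig. 1 data, indexed by task number.
R = [0, 1, 0, 3]                  # ready times
D = [0, 11, 5, 12]                # deadlines
A = [0, 2, 4, 5]                  # track locations
C = [[0, 3, 5, 6], [3, 0, 3, 4], [5, 3, 0, 2], [6, 4, 2, 0]]  # C[j][i]


def evaluate(order):
    """Return (schedule fulfill-time, feasible?) for a schedule."""
    f_prev, feasible = 0, True
    for prev, cur in zip(order, order[1:]):
        f_prev = max(R[cur], f_prev) + C[prev][cur]
        feasible = feasible and f_prev <= D[cur]
    return f_prev, feasible


def scan_groups(order):
    """Scan-groups as [first position, last position, direction]."""
    groups = [[0, 1, +1]]
    for k in range(2, len(order)):
        direction = +1 if A[order[k - 1]] <= A[order[k]] else -1
        if direction == groups[-1][2]:
            groups[-1][1] = k
        else:
            groups.append([k - 1, k, direction])
    return groups


def insert_into_group(order, x, group):
    """Move task x into an earlier group, keeping that group monotone (assumed rule)."""
    start, end, direction = group
    rest = [t for t in order if t != x]
    pos = end + 1
    for p in range(start, end + 1):
        if direction * (A[rest[p]] - A[x]) > 0:
            pos = p
            break
    pos = max(pos, 1)             # T0 (initial head position) stays first
    return rest[:pos] + [x] + rest[pos:]


def gsr(edf_order):
    order = list(edf_order)
    for x in edf_order[2:]:       # tasks in input (FIFS) order
        groups = scan_groups(order)
        i = order.index(x)
        j = next(g for g, (s, e, _) in enumerate(groups) if s <= i <= e)
        base, _ = evaluate(order)
        best_gain, best_order = 0.0, None
        for q in range(j - 1, -1, -1):        # every group before x's own
            candidate = insert_into_group(order, x, groups[q])
            fulfill, feasible = evaluate(candidate)
            gain = (base - fulfill) / base
            if feasible and gain >= best_gain:
                best_gain, best_order = gain, candidate
        if best_gain > 0:
            order = best_order
    return order


if __name__ == "__main__":
    print(evaluate([0, 2, 1, 3]))   # (12, True): the EDF input
    print(evaluate([0, 1, 2, 3]))   # (9, False): SCAN order, T2 misses d2
    print(gsr([0, 2, 1, 3]))        # [0, 2, 3, 1], fulfill-time 11
```

On this input the sketch rejects moving $T_1$ into $T_0 T_2$ (infeasible) and accepts moving $T_3$ into $T_0 T_2$, which reproduces the schedule $T_0 T_2 T_3 T_1$ of Fig. 4.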

Example 3.1. We show an example to clarify the GSR algorithm. Fig. 5 shows a set of five tasks with their timing attributes, track numbers, and requested data sizes. $T_0$ is added as a special task to represent the initial location of the disk head and is assumed to be at track 0.

(1) Suppose that the input schedule is EDF-ordered and $T_{EDF} = T_0 T_2 T_1 T_3 T_4 T_5$. After applying Algorithm 1, we have three scan-groups: $S_1 = T_0 T_2$, $S_2 = T_2 T_1$, and $S_3 = T_3 T_4 T_5$. Furthermore, from Fig. 5, the schedule fulfill-time of $T_{EDF}$ is 19.

(2) $i = 2$: since $T_1$ is in scan-group $S_2$, $p = j = 2$ and $O_p = 0$.

[1] $q = 1$: reschedule $T_1$ into scan-group $S_1$; the new schedule is $T_{new} = T_0 T_1 T_2 T_3 T_4 T_5$. The schedule fulfill-time of $T_{new}$ is 16. Since the new schedule is feasible and $O_q = 21\% > 0$, $p = 1$. GSR has found a new reschedule result which is better than the input schedule.

[2] Since $O_p = 21\% > 0$, the new schedule is $T_{resch} = T_0 T_1 T_2 T_3 T_4 T_5$. Furthermore, the new scan-groups of $T_{resch}$ are $S_1 = T_0 T_1 T_2 T_3$, $S_2 = T_3 T_4$, and $S_3 = T_4 T_5$.

(3) $i = 3$: since $T_3$ is in scan-group $S_1$, $p = j = 1$ and $O_p = 0$. Because $q = 0 < 1$, the second for loop is not executed.

(4) $i = 4$: since $T_4$ is in scan-group $S_2$, $p = j = 2$ and $O_p = 0$.

[1] $q = 1$: reschedule $T_4$ into scan-group $S_1$; the new schedule is $T_{new} = T_0 T_1 T_4 T_2 T_3 T_5$. The schedule fulfill-time of $T_{new}$ is 13. However, the new schedule is infeasible since $f_2\,(= 8) > d_2\,(= 6)$. Actually, the $T_{new}$ schedule is SCAN-ordered. Although it obtains the largest data throughput, it violates the tasks' timing constraints and is not acceptable.

(5) $i = 5$: since $T_5$ is in scan-group $S_3$, $p = j = 3$ and $O_p = 0$.

[1] $q = 2$: reschedule $T_5$ into scan-group $S_2$; the new schedule is $T_{new} = T_0 T_1 T_2 T_4 T_3 T_5$. The schedule fulfill-time of $T_{new}$ is 14. Since the new schedule is feasible and $O_q = 12.5\% > 0$, $p = 2$. GSR has found a new reschedule result which is better than the previous reschedule result.

[2] $q = 1$: reschedule $T_5$ into scan-group $S_1$; the new schedule is $T_{new} = T_0 T_1 T_2 T_4 T_3 T_5$. The schedule fulfill-time of $T_{new}$ is 15. The new schedule is feasible and $O_q = 6.25\% > 0$. However, $O_q < O_p$; thus, $p$ is not updated.

[3] Since $O_p = 12.5\% > 0$, the new schedule is $T_{resch} = T_0 T_1 T_2 T_4 T_3 T_5$. Furthermore, the new scan-groups of $T_{resch}$ are $S_1 = T_0 T_1 T_2$, $S_2 = T_2 T_4$, and $S_3 = T_4 T_3 T_5$.

From Example 3.1, the input schedule has a schedule fulfill-time of 19. After the completion of Algorithm 2, GSR derives a rescheduled result that has a schedule fulfill-time of 15. As a result, the data throughput of the rescheduled result is improved by 21% compared with the original input schedule.

task   r_i   d_i   a_i   b_i   c_{j,0}   c_{j,1}   c_{j,2}   c_{j,3}   c_{j,4}   c_{j,5}
T0     0     0     0     0     -         3         5         6         4         7
T1     1     11    2     1     3         -         3         4         2         5
T2     0     7     4     1     5         3         -         2         2         3
T3     3     12    5     1     6         4         2         -         3         2
T4     3     14    3     1     4         2         2         3         -         4
T5     4     15    6     1     7         5         3         2         4         -

Fig. 5. The four-tuple (r_i, d_i, a_i, b_i) of the set of five tasks used in Example 3.1, together with c_{j,i} for all combinations of the schedule sequence T_j T_i (the row for task T_j lists c_{j,i}). For simplicity, we ignore the rotational latency.


4. Speed-up method

In Algorithm 2, when rescheduling a task into a tested scan-group, a naive algorithm takes $O(n)$ time to verify the feasibility of the rescheduled result and to measure the improvement of data throughput. Since there are at most $n$ scan-groups to be tested, it takes $O(n^2)$ time in total to decide the best-fit scan-group for each task. To accelerate the testing process, we introduce the schedulable-region concept, which reduces the time complexity from $O(n^2)$ to $O(n)$.

In this section, we show how to speed up the testing process by the concept of the schedulable-region. Section 4.1 first introduces the definition of the schedulable-region. After that, a fast algorithm involving the schedulable-region is presented in Section 4.2.

4.1. Schedulable-region

Before defining the schedulable-region, for each task $T_k$ in the input schedule $T_0 T_1 \ldots T_n$, we first introduce the minimal schedulable fulfill-time $f^L_k$, the minimal schedulable start-time $e^L_k$, the maximal schedulable fulfill-time $f^R_k$ and the maximal schedulable start-time $e^R_k$. The superscripts "L" and "R" represent "Left-most" and "Right-most", respectively.

Definition 2. [Minimal schedulable start-time/fulfill-time] For each task in the input schedule, the minimal schedulable start-time/fulfill-time is the earliest (Left-most) start-time/fulfill-time to serve the task without violating real-time requirements.

Given $e^L_0 = f^L_0 = 0$ to represent the initial disk head by $T_0 = (0, 0, 0, 0, 0)$, the minimal schedulable start-time $e^L_k$ and the minimal schedulable fulfill-time $f^L_k$ of task $T_k$, for $k = 1$ to $n$, can be computed by the aggressive scheduling scheme (Chang et al., 1997) as follows:

$$e^L_k = \max\{r_k, f^L_{k-1}\}, \qquad f^L_k = e^L_k + c_{k-1,k}. \tag{4}$$

Obviously, the start-time $e^L_k$ and fulfill-time $f^L_k$ obtained are minimized to serve task $T_k$ as early as possible under a feasible schedule sequence $T_0 T_1 \ldots T_n$. (The real-time requirements $f^L_k \le d_k$, for $k = 1$ to $n$, are guaranteed.) Following a similar idea, we can define the maximal schedulable start-time and fulfill-time as follows.

Definition 3. [Maximal schedulable start-time/fulfill-time] For each task in an input schedule, the maximal schedulable start-time/fulfill-time is the latest (Right-most) start-time/fulfill-time for serving the task without violating real-time requirements.

According to the above definition, tasks are served as late as possible under a feasible schedule sequence $T_0 T_1 \ldots T_n$. To guarantee a feasible schedule, the maximal schedulable fulfill-time of the last task $T_n$ is its deadline, i.e., $f^R_n = d_n$. Then, its maximal schedulable start-time can be decided by $e^R_n = f^R_n - c_{n-1,n} = d_n - c_{n-1,n}$. Using the same method, the maximal schedulable start-time and fulfill-time of task $T_k$, for $k = n - 1$ down to 0, can be computed by the lazy scheduling scheme (Chang et al., 1997) as follows:

$$f^R_k = \min\{d_k, e^R_{k+1}\}, \qquad e^R_k = f^R_k - c_{k-1,k}, \tag{5}$$

where the service time $c_{-1,0}$ is assigned as 0. From Eqs. (4) and (5), $f^L_k - e^L_k = f^R_k - e^R_k = c_{k-1,k}$. Furthermore, we can prove that $e^L_k \le e^R_k$ and $f^L_k \le f^R_k$.
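The two recurrences can be evaluated in one forward pass and one backward pass, as in the Python sketch below. The function name `schedulable_times` is illustrative, the lists are indexed by position in the schedule, and the second call uses a relaxed deadline for $T_3$ that is a hypothetical value chosen only to make some slack visible.

```python
def schedulable_times(order, r, d, c):
    """Return (eL, fL, eR, fR), each indexed by position k in the schedule.

    order lists task numbers with order[0] = T0; r and d are ready times
    and deadlines per task; c[j][i] is the service time of Ti after Tj.
    """
    n = len(order)
    eL, fL = [0] * n, [0] * n            # e^L_0 = f^L_0 = 0 for T0
    for k in range(1, n):                # Eq. (4): as early as possible
        eL[k] = max(r[order[k]], fL[k - 1])
        fL[k] = eL[k] + c[order[k - 1]][order[k]]
    eR, fR = [0] * n, [0] * n
    for k in range(n - 1, -1, -1):       # Eq. (5): as late as possible
        later = eR[k + 1] if k + 1 < n else float("inf")  # f^R_n = d_n
        fR[k] = min(d[order[k]], later)
        service = c[order[k - 1]][order[k]] if k > 0 else 0  # c_{-1,0} = 0
        eR[k] = fR[k] - service
    return eL, fL, eR, fR


if __name__ == "__main__":
    r = [0, 1, 0, 3]
    c = [[0, 3, 5, 6], [3, 0, 3, 4], [5, 3, 0, 2], [6, 4, 2, 0]]
    edf = [0, 2, 1, 3]                   # T0 T2 T1 T3 from Fig. 1

    # Fig. 1 deadlines: the EDF schedule has no slack at all.
    print(schedulable_times(edf, r, [0, 11, 5, 12], c))

    # Hypothetical relaxed deadline d_3 = 14: T1 and T3 gain slack.
    eL, fL, eR, fR = schedulable_times(edf, r, [0, 11, 5, 14], c)
    print(eL, eR)   # [0, 0, 5, 8] [0, 0, 7, 10]
    print(fL, fR)   # [0, 5, 8, 12] [0, 5, 10, 14]
```

In both runs every left-most time is no later than the corresponding right-most time, and $f_k - e_k$ equals the service time $c_{k-1,k}$ on both sides.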

Lemma 1. By the definitions of $e^L_k$, $e^R_k$, $f^L_k$ and $f^R_k$ in Eqs. (4) and (5), we have $e^L_k \le e^R_k$ and $f^L_k \le f^R_k$.

Proof. We prove it by induction.

(1) When $k = 1$, $e^L_1 = \max\{r_1, f^L_0\} = r_1$ (since $f^L_0 = 0$). Thus, $f^L_1 = e^L_1 + c_{0,1} = r_1 + c_{0,1}$. In order to guarantee a feasible schedule, $r_1 + c_{0,1}$ must be smaller than or equal to $d_1$. Thus,

$$f^L_1 = r_1 + c_{0,1} \le d_1. \tag{6}$$

From Eq. (5), $f^R_1 = \min\{d_1, e^R_2\}$; we discuss it in the following two cases.

(a) Suppose $d_1 \le e^R_2$; then $f^R_1 = d_1$. Thus, from Eq. (6),

$$f^L_1 = r_1 + c_{0,1} \le d_1 = f^R_1. \tag{7}$$

From Eqs. (5) and (6), $e^R_1 = f^R_1 - c_{0,1} = d_1 - c_{0,1}$. To guarantee a feasible schedule, $d_1 - c_{0,1}$ must be larger than or equal to $r_1$. Thus,

$$e^R_1 = f^R_1 - c_{0,1} = d_1 - c_{0,1} \ge r_1 = e^L_1. \tag{8}$$

From Eqs. (7) and (8), we have $e^L_1 \le e^R_1$ and $f^L_1 \le f^R_1$.

(b) Suppose $e^R_2 \le d_1$; then $f^R_1 = e^R_2$. The proof can be derived in the same way as the proof in (a).

From (a) and (b), we derive that when $k = 1$, we have $e^L_1 \le e^R_1$ and $f^L_1 \le f^R_1$.

(2) Suppose that when $k = n$, we have $e^L_k \le e^R_k$ and $f^L_k \le f^R_k$. Then, when $k = n + 1$, we can prove that $e^L_{k+1} \le e^R_{k+1}$ and $f^L_{k+1} \le f^R_{k+1}$ in the same way as the proof in (1).

As a result, from (1) and (2), we prove that $e^L_k \le e^R_k$ and $f^L_k \le f^R_k$. □

With the minimal schedulable start-time/fulfill-time and the maximal schedulable start-time/fulfill-time defined, an arbitrary schedulable-region $R_{i,j}$, for $0 \le i \le j \le n$, can be defined as follows.

Definition 4. [Schedulable-region] For a set of contiguous tasks $T_i T_{i+1} \ldots T_j$ in the input schedule $T_0 T_1 \ldots T_n$, the schedulable-region $R_{i,j}$ is the time region in which $T_i T_{i+1} \ldots T_j$ can be served without violating real-time requirements.

From the definitions of the minimal/maximal schedulable start-time/fulfill-time, the schedulable-region of tasks $T_iT_{i+1}\ldots T_j$ extends from the minimal schedulable start-time $e^L_i$ of the first task $T_i$ to the maximal schedulable fulfill-time $f^R_j$ of the last task $T_j$. It is denoted as $R_{i,j} = [e^L_i, f^R_j]$.
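The forward and backward passes behind these quantities can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes Eq. (4) has the form used in the proof of Lemma 1 ($e^L_k = \max\{r_k, f^L_{k-1}\}$, $f^L_k = e^L_k + c_{k-1,k}$), and it assumes that no task follows the last one, so that $e^R$ of the non-existent successor is treated as infinity. The function names and the sample numbers are hypothetical.

```python
from math import inf


def schedulable_times(r, d, c):
    """Return (eL, fL, eR, fR) for tasks served in list order.

    r[k], d[k]: ready time and deadline of the k-th task.
    c[k]: service time c_{k-1,k} of the k-th task given its predecessor.
    """
    n = len(r)
    eL, fL = [0] * n, [0] * n
    prev_finish = 0                      # f^L before the first task is 0
    for k in range(n):                   # forward pass, assumed form of Eq. (4)
        eL[k] = max(r[k], prev_finish)
        fL[k] = eL[k] + c[k]
        prev_finish = fL[k]

    eR, fR = [0] * n, [0] * n
    next_start = inf                     # assumption: nothing follows the last task
    for k in range(n - 1, -1, -1):       # backward pass, Eq. (5)
        fR[k] = min(d[k], next_start)
        eR[k] = fR[k] - c[k]
        next_start = eR[k]
    return eL, fL, eR, fR


def schedulable_region(eL, fR, i, j):
    """R_{i,j} = [eL_i, fR_j] for the contiguous tasks i..j."""
    return (eL[i], fR[j])


# Hypothetical three-task schedule (times in arbitrary units).
eL, fL, eR, fR = schedulable_times(r=[0, 2, 4], d=[5, 9, 12], c=[2, 3, 2])
region = schedulable_region(eL, fR, 1, 2)
```

For this sample the schedule is feasible, and the inequalities of Lemma 1 ($e^L_k \le e^R_k$, $f^L_k \le f^R_k$) hold for every task.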

4.2. A fast algorithm

Assume that the task $T_x$ ($0 < x \le n$) of the input schedule $T = T_0T_1\ldots T_n$ is selected for rescheduling. To record $T$'s schedule fulfill-time, we add a null task $T_{n+1}$ with $r_{n+1} = 0$ and $d_{n+1} = f^L_n$ to the end of the schedule. The service time $c_{i,n+1} = 0$ for all $i$. Removing the selected task $T_x$ from the input schedule, the new schedule is $T_A = T_{A(0)}T_{A(1)}\ldots T_{A(n)} = T_0T_1\ldots T_{x-1}T_{x+1}\ldots T_{n+1}$. By applying Eqs. (4) and (5), we can pre-compute the minimal schedulable fulfill-time $f^L_{A(j)}$ and the maximal schedulable fulfill-time $f^R_{A(j)}$, for $j = 0$ to $n$, in $O(n)$ time. Note that the schedulable-region of $T_A = T_{A(0)}T_{A(1)}\ldots T_{A(n)}$, i.e., $R_{A(0),A(n)}$, should be upper bounded by the original schedule fulfill-time $f^L_n$. That is, the schedulable-region is $R_{A(0),A(n)} = [0, f^R_{A(n)}]$ with $f^R_{A(n)} = f^L_n$. As shown in Section 3.2, the GSR tries to reschedule task $T_x$ into different scan-groups. Assume that $T_x$ is rescheduled into a scan-group and is served before $T_{A(k)}$. The new start-time and fulfill-time of task $T_i$, for $i = 0$ to $n + 1$, in the new rescheduled result $T_{A(0)}T_{A(1)}\ldots T_{A(k-1)}T_xT_{A(k)}\ldots T_{A(n)}$ are denoted as $e^{A(k)}_i$ and $f^{A(k)}_i$, respectively. Since the original schedulable region $R_{A(k),A(k)}$ is known and the new schedulable region of $T_xT_{A(k)}$ can be decided in $O(1)$ time, the feasibility of the rescheduled result can be verified in $O(1)$ time.

Theorem 1. By applying the pre-computed minimal schedulable fulfill-time $f^L_{A(j)}$ and maximal schedulable fulfill-time $f^R_{A(j)}$ (for $j = 0$ to $n$) for the input schedule $T_{A(0)}T_{A(1)}\ldots T_{A(n)}$, the feasibility of the schedule $T_{A(0)}T_{A(1)}\ldots T_{A(k-1)}T_xT_{A(k)}\ldots T_{A(n)}$ can be verified in $O(1)$ time.

Proof. Divide the rescheduled result into three sub-schedules: $T_{A(0)}T_{A(1)}\ldots T_{A(k-1)}$, $T_xT_{A(k)}$, and $T_{A(k+1)}T_{A(k+2)}\ldots T_{A(n)}$.

(a) We can serve the sub-schedule $T_{A(0)}T_{A(1)}\ldots T_{A(k-1)}$ by the aggressive scheduling scheme shown in Eq. (3). Since $T_{A(0)}T_{A(1)}\ldots T_{A(k-1)} = T_0T_1\ldots T_{k-1}$, the start-time and the fulfill-time are $e^{A(k)}_{A(j)} = e^L_j$ and $f^{A(k)}_{A(j)} = f^L_j$, respectively, for $j = 0$ to $k - 1$. The feasibility of the sub-schedule $T_{A(0)}T_{A(1)}\ldots T_{A(k-1)}$ (that is, $f^{A(k)}_{A(j)} = f^L_j \le d_{A(j)} = d_j$ for $j = 0$ to $k - 1$) is guaranteed.

(b) In the schedule sequence $T_xT_{A(k)}$, we can compute the minimal schedulable fulfill-times $f^{A(k)}_x$ and $f^{A(k)}_{A(k)}$ as follows:

$$f^{A(k)}_x = e^{A(k)}_x + c_{A(k-1),x} = \max\{r_x, f^L_{A(k-1)}\} + c_{A(k-1),x}, \tag{9}$$

$$f^{A(k)}_{A(k)} = e^{A(k)}_{A(k)} + c_{x,A(k)} = \max\{r_{A(k)}, f^{A(k)}_x\} + c_{x,A(k)} = \max\{r_{A(k)}, \max\{r_x, f^L_{A(k-1)}\} + c_{A(k-1),x}\} + c_{x,A(k)}. \tag{10}$$

Since $f^L_{A(k-1)}$ is pre-computed, the real-time requirements $f^{A(k)}_x \le d_x$ and $f^{A(k)}_{A(k)} \le d_{A(k)}$ can be verified in $O(1)$ time.

(c) From the definition of schedulable-region, the real-time requirements ($f^{A(k)}_{A(j)} \le d_{A(j)}$ for $j = k + 1$ to $n$) of the remainder schedule $T_{A(k+1)}T_{A(k+2)}\ldots T_{A(n)}$ are guaranteed if $f^{A(k)}_{A(k)} \le f^R_{A(k)}$. Since $f^R_{A(k)}$ is pre-computed and $f^{A(k)}_{A(k)}$ can be computed from Eq. (10), the feasibility of the sub-schedule $T_{A(k+1)}T_{A(k+2)}\ldots T_{A(n)}$ can be verified in $O(1)$ time.

(d) According to the pre-computed $f^L_{A(j)}$ and $f^R_{A(j)}$ (for $j = 0$ to $n$), the feasibility of the rescheduled result $T_{A(0)}T_{A(1)}\ldots T_{A(k-1)}T_xT_{A(k)}\ldots T_{A(n)}$ can be verified in $O(1)$ time. □
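The constant-time check of Theorem 1 can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: tasks are hypothetical `(ready, deadline, track)` tuples, `cost` is a made-up service-time function (one time unit plus the track distance), the first task is given service time 0 in line with $c_{-1,0} = 0$, and the schedule $T_A$ is assumed to be feasible before the insertion.

```python
from math import inf


def cost(a, b):
    """Hypothetical service time: 1 time unit plus the track distance."""
    return 1 + abs(a[2] - b[2])


def precompute(tasks):
    """O(n) pre-computation of fL and fR for the schedule T_A."""
    n = len(tasks)
    c = [0] + [cost(tasks[j - 1], tasks[j]) for j in range(1, n)]
    fL, fR = [0] * n, [0] * n
    finish = 0
    for j in range(n):                   # forward pass: earliest fulfill-times
        finish = max(tasks[j][0], finish) + c[j]
        fL[j] = finish
    latest_start = inf                   # assumption: nothing follows the last task
    for j in range(n - 1, -1, -1):       # backward pass: latest fulfill-times, Eq. (5)
        fR[j] = min(tasks[j][1], latest_start)
        latest_start = fR[j] - c[j]
    return fL, fR


def feasible_insert(tasks, fL, fR, x, k):
    """O(1) test: can task x be served immediately before tasks[k] (k >= 1)?"""
    assert 1 <= k < len(tasks)
    f_x = max(x[0], fL[k - 1]) + cost(tasks[k - 1], x)   # Eq. (9)
    f_k = max(tasks[k][0], f_x) + cost(x, tasks[k])      # Eq. (10)
    # fR[k] <= deadline of tasks[k], so f_k <= fR[k] covers steps (b) and (c).
    return f_x <= x[1] and f_k <= fR[k]


tasks = [(0, 100, 0), (0, 10, 5), (0, 35, 9), (0, 40, 0)]
fL, fR = precompute(tasks)
ok = feasible_insert(tasks, fL, fR, (0, 20, 7), 2)
```

Only `fL[k - 1]`, `fR[k]`, and two service-time evaluations are needed per candidate position, which is what makes the check independent of $n$.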


Fig. 6 shows an example that demonstrates the schedulable-region concept (the input schedule is $T_0T_2T_1T_3$, as shown in Fig. 1). The null task $T_4$ is added to the end of the schedule. Assume that task $T_3$ is selected and removed from the input schedule for rescheduling. Using our algorithm, we can reschedule task $T_3$ in front of $T_1$.

Note that, in the proposed GSR algorithm, tasks are selected and rescheduled into the best-fit scan-groups to minimize the schedule fulfill-time under real-time requirements. We need to calculate not only the feasibility, but also the schedule fulfill-time of the rescheduled result (to determine whether the new rescheduled result obtains a better data throughput than other schedules). Assume that the rescheduled result is $T_{A(0)}T_{A(1)}\ldots T_{A(k-1)}T_xT_{A(k)}\ldots T_{A(n)}$ as described above. The improvement of schedule fulfill-time $v^{A(k)}_{A(n)}$ can be calculated in $O(1)$ time.

Theorem 2. With the pre-computed $W_{A(i)} = \sum_{j=i+1}^{n} (e^R_{A(j)} - f^R_{A(j-1)})$ and $V_{A(i)} = \min\{e^R_{A(j)} - r_{A(j)} + W_{A(j)} : j = i + 1, \ldots, n\}$, for $i = 0$ to $n$, the improvement of schedule fulfill-time $v^{A(k)}_{A(n)}$ for the rescheduled result $T_{A(0)}T_{A(1)}\ldots T_{A(k-1)}T_xT_{A(k)}\ldots T_{A(n)}$ can be computed by $v^{A(k)}_{A(n)} = \min\{V_{A(k)}, f^R_{A(k)} - f^{A(k)}_{A(k)} + W_{A(k)}\}$ in $O(1)$ time.

Proof. Divide this rescheduled result $T_{A(0)}T_{A(1)}\ldots T_{A(k-1)}T_xT_{A(k)}\ldots T_{A(n)}$ into three sub-schedules: $T_{A(0)}T_{A(1)}\ldots T_{A(k-1)}$, $T_xT_{A(k)}$, and $T_{A(k+1)}T_{A(k+2)}\ldots T_{A(n)}$.

(a) Since the first sub-schedule $T_{A(0)}T_{A(1)}\ldots T_{A(k-1)} = T_0T_1\ldots T_{k-1}$ is not changed, we have $e^{A(k)}_{A(i)} = e^L_i$ and $f^{A(k)}_{A(i)} = f^L_i$ for $i = 0$ to $k - 1$.

(b) Define the improvement of schedule fulfill-time for the sub-schedule $T_{A(0)}T_{A(1)}\ldots T_{A(i)}$ as $v^{A(k)}_{A(i)} = f^R_{A(i)} - f^{A(k)}_{A(i)}$. In the sub-schedule $T_xT_{A(k)}$, the fulfill-time $f^{A(k)}_{A(k)}$ can be computed by Eq. (10). The improvement in schedule fulfill-time

$$v^{A(k)}_{A(k)} = f^R_{A(k)} - f^{A(k)}_{A(k)} \tag{11}$$

is obtained in $O(1)$ time.

(c) As shown in Fig. 7, for task $T_{A(i)}$ in the remainder sub-schedule $T_{A(k+1)}T_{A(k+2)}\ldots T_{A(n)}$, the improvement in schedule fulfill-time $v^{A(k)}_{A(i)}$ (for $i = k + 1$ to $n$) can be computed by

$$v^{A(k)}_{A(i)} = \min\{u_{A(i)},\; v^{A(k)}_{A(i-1)} + w_{A(i)}\}, \tag{12}$$

where the parameters $u_{A(i)} = e^R_{A(i)} - r_{A(i)}$ and $w_{A(i)} = e^R_{A(i)} - f^R_{A(i-1)}$ denote the upper bound and the lower bound of the improvement $v^{A(k)}_{A(i)}$, respectively.

(d) The improvement in schedule fulfill-time $v^{A(k)}_{A(n)}$ for an arbitrary task $T_{A(k)}$ can be defined as the following recursive function:

$$v^{A(k)}_{A(n)} = \begin{cases} \min\{u_{A(n)},\; v^{A(k)}_{A(n-1)} + w_{A(n)}\} & \text{if } n > k, \\ f^R_{A(n)} - f^{A(k)}_{A(n)} & \text{if } n = k. \end{cases} \tag{13}$$

[Figure: timeline from 0 to 12 with deadlines $d_0$ to $d_3$. Steps shown: input schedule $T_0T_2T_1T_3$; remove the selected request $T_3$; add the last null request $T_4$; identify schedulable-regions; insert request $T_3$ into scan-group $T_0T_2$; serve $T_2T_3T_1$ in time region $R_{2,1}$; rescheduled result $T_0T_2T_3T_1$.]

Fig. 6. A simple example to demonstrate the schedulable-region concept. Task $T_3$ can be served before task $T_1$ if $T_2T_3T_1$ can be served in the schedulable-region $R_{2,1}$.

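Define $W_{A(i)} = \sum_{j=i+1}^{n} w_{A(j)}$, $U_{A(i)} = u_{A(i)} + W_{A(i)}$, and $V_{A(i)} = \min\{U_{A(n)}, U_{A(n-1)}, \ldots, U_{A(i+1)}\}$ for $i = 0$ to $n$. The above recursive function can be rewritten as follows:

$$\begin{aligned}
v^{A(k)}_{A(n)} &= \min\{u_{A(n)},\; v^{A(k)}_{A(n-1)} + w_{A(n)}\} \\
&= \min\{u_{A(n)},\; \min\{u_{A(n-1)},\; v^{A(k)}_{A(n-2)} + w_{A(n-1)}\} + w_{A(n)}\} \\
&= \min\{u_{A(n)},\; u_{A(n-1)} + w_{A(n)},\; v^{A(k)}_{A(n-2)} + w_{A(n-1)} + w_{A(n)}\} \\
&= \min\{u_{A(n)},\; u_{A(n-1)} + W_{A(n-1)},\; \ldots,\; u_{A(k+1)} + W_{A(k+1)},\; v^{A(k)}_{A(k)} + W_{A(k)}\} \\
&= \min\{U_{A(n)},\; U_{A(n-1)},\; \ldots,\; U_{A(k+1)},\; v^{A(k)}_{A(k)} + W_{A(k)}\} \\
&= \min\{\min\{U_{A(n)}, U_{A(n-1)}, \ldots, U_{A(k+1)}\},\; v^{A(k)}_{A(k)} + W_{A(k)}\} \\
&= \min\{V_{A(k)},\; f^R_{A(k)} - f^{A(k)}_{A(k)} + W_{A(k)}\}.
\end{aligned} \tag{14}$$

(e) With the pre-computed $W_{A(i)}$ and $V_{A(i)}$ (for $i = 0$ to $n$), the improvement in schedule fulfill-time $v^{A(k)}_{A(n)}$ for an arbitrary task $T_{A(k)}$ can be obtained in $O(1)$ time. □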
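The equivalence between the recursion of Eqs. (12) and (13) and the closed form of Eq. (14) can be checked numerically. The sketch below is only an illustration: the arrays `u` and `w` hold made-up values of $u_{A(i)}$ and $w_{A(i)}$, `v_k` stands for $v^{A(k)}_{A(k)}$ from Eq. (11), and the function names are hypothetical.

```python
def precompute_W_V(u, w):
    """Suffix sums W and suffix minima V for i = 0..n (one O(n) pass).

    W[i] = w[i+1] + ... + w[n]
    V[i] = min(u[j] + W[j] for j = i+1..n), infinity when the range is empty.
    Entry w[0] is unused.
    """
    n = len(u) - 1
    W = [0] * (n + 1)
    V = [float("inf")] * (n + 1)
    for i in range(n - 1, -1, -1):
        W[i] = W[i + 1] + w[i + 1]
        V[i] = min(V[i + 1], u[i + 1] + W[i + 1])
    return W, V


def improvement_closed_form(v_k, k, W, V):
    """Eq. (14): O(1) per candidate position k."""
    return min(V[k], v_k + W[k])


def improvement_recursive(v_k, k, u, w):
    """Eqs. (12) and (13): O(n - k) reference computation."""
    v = v_k
    for i in range(k + 1, len(u)):
        v = min(u[i], v + w[i])
    return v


# Hypothetical bounds for a schedule with n = 4.
u = [9, 4, 7, 3, 8]
w = [0, 1, 2, 0, 3]
W, V = precompute_W_V(u, w)
fast = improvement_closed_form(2, 1, W, V)
slow = improvement_recursive(2, 1, u, w)
```

Both computations agree for every position $k$; the closed form simply trades the per-position loop for one backward pass shared by all candidate positions.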

As there are at most $n$ scan-groups to be considered, the GSR takes $O(n)$ time to find the best rescheduled result $T_{A(0)}T_{A(1)}\ldots T_{A(y-1)}T_xT_{A(y)}\ldots T_{A(n)}$ that satisfies $v^{A(y)}_{A(n)} = \max\{v^{A(k)}_{A(n)}\}$, where the maximum is taken over the positions $k$ at which $T_x$ is served before task $T_{A(k)}$ in the different scan-groups.

5. Supporting non-real-time tasks

In a real-time system, although most disk accesses are timing critical, there are still a few disk tasks for non-real-time data access. For example, in a Video-on-Demand (VoD) system, users may first browse the archive to select the desired video. After that, a continuous real-time retrieval of the selected video should be guaranteed by the applied real-time disk-scheduling scheme for jitter-free playback. Although the non-real-time browsing task has no deadline constraints, a reasonable response time should be offered to provide a comfortable service to users. Of course, such a reasonable response time must be offered under the timing requirements of real-time tasks.

Intuitively, non-real-time tasks would be served after the completion of all real-time tasks. However, such an approach would cause an undesirably large response time and, at worst, starve the non-real-time tasks of service. Assume that $T_0T_1\ldots T_n$ is the original schedule and $T_{n+1}$ is a newly added non-real-time task. Another naive approach would set the ready time $r_{n+1} = 0$ and the deadline $d_{n+1} = \infty$; the non-real-time task $T_{n+1}$ can then be viewed as a real-time task and scheduled by previous real-time scheduling algorithms. For example, as shown in Fig. 8, the non-real-time task will be placed at the end of the input schedule and rescheduled with the last task (by SCAN-EDF) or the last R-Group (by RG-SCAN). Nevertheless, such an approach still results in an undesirably long response time. Although we may assign an earlier deadline to the non-real-time task, the choice of a proper deadline is not easy.

[Figure: two cases on a time axis marked with $r_{A(i)}$, $f^R_{A(i-1)}$, and $e^R_{A(i)}$: $v^{A(k)}_{A(i)} = v^{A(k)}_{A(i-1)} + w_{A(i)}$ and $v^{A(k)}_{A(i)} = u_{A(i)}$, where $w_{A(i)} = e^R_{A(i)} - f^R_{A(i-1)}$ and $u_{A(i)} = e^R_{A(i)} - r_{A(i)}$.]

Fig. 7. A simple example to illustrate the recursive relation between the improvement $v^{A(k)}_{A(i-1)}$ and the improvement $v^{A(k)}_{A(i)}$. The upper bound and lower bound of the related improvement are shown.

[Fig. 8: two panels, SCAN-EDF and RG-SCAN, each showing the input schedule followed by the non-real-time tasks.]

In this paper, we extend the GSR algorithm described in Section 3.2 to serve non-real-time tasks fairly. Let $T_{A(0)}T_{A(1)}\ldots T_{A(n)} = T_0T_1\ldots T_n$ be the schedule of real-time tasks. The non-real-time task $T_{n+1}$, just as $T_i$ described in Algorithm 2, is selected from the task queue for rescheduling into one of the scan-groups. Note that, as our previous algorithm is designed to maximize disk throughput, the best-fit scan-group is selected for rescheduling. As a result, $T_i = T_{n+1}$ will not be served as soon as possible. However, for serving non-real-time tasks, the response time should be minimized. Consequently, for each non-real-time task, we try to reschedule it into its first-fit scan-group. To serve $T_{n+1}$ as soon as possible, the schedulable-region of $T_{A(0)}T_{A(1)}\ldots T_{A(n)}$ should be upper bounded by the deadline $d_n$ of the last task $T_n$ (it is upper bounded by $f^L_n$ in Algorithm 2). Moreover, it is not necessary to add a null task with deadline $f^L_n$ to the end of the input schedule. By applying operation steps similar to those in Algorithm 2, we try to reschedule the non-real-time task $T_x$ ($= T_{n+1}$) into the first-fit scan-group $S_k$ (it may not be the best-fit one in terms of maximizing data throughput). The index $k$ is as small as possible to offer the non-real-time task $T_x$ a short response time without causing the real-time tasks to violate their timing requirements. As shown in the proposed fast algorithm, after pre-computing the problem parameters $W_{A(i)}$ and $V_{A(i)}$ (for $i = 0$ to $n$), the feasibility of each rescheduled result $T_{A(0)}T_{A(1)}\ldots T_{A(k-1)}T_xT_{A(k)}\ldots T_{A(n)}$ can be verified in constant time. It takes a total of $O(n)$ time to decide the best schedule result that serves non-real-time tasks fairly without violating the timing requirements of the original schedule. Its time complexity is the same as that of conventional methods.
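The difference between the two selection rules can be sketched in a few lines. This is a hedged illustration, not Algorithm 2 itself: each candidate is a hypothetical tuple `(k, feasible, improvement)` that is assumed to have been produced by the constant-time checks described above, with candidates listed in increasing order of `k`.

```python
def best_fit(candidates):
    """Real-time task: feasible position with the largest improvement."""
    feasible = [c for c in candidates if c[1]]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c[2])[0]


def first_fit(candidates):
    """Non-real-time task: earliest feasible position (smallest k)."""
    for k, ok, _ in candidates:
        if ok:
            return k
    return None


# Hypothetical candidate positions: (k, feasible, improvement in ms).
candidates = [(1, False, 0.0), (2, True, 3.5), (3, True, 6.0), (4, True, 1.0)]
chosen_rt = best_fit(candidates)
chosen_nrt = first_fit(candidates)
```

With these sample values the real-time rule picks position 3 (largest improvement), whereas the non-real-time rule picks position 2 (earliest feasible), trading some throughput for a shorter response time.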

6. Experimental results

In this section, the experimental results of our proposed GSR algorithm are compared with those of previous approaches. Table 1 shows the important parameters of the HP 97560, which is used as the disk model in our experiments (Reddy and Wyllie, 1993). The seek time cost is defined in Eq. (2). The rotational latency is assumed to be half of the time of a full track revolution. Each real-time task is assumed to request one track of data (36 KB in the HP 97560). Ready times of tasks are uniformly distributed between 0 and 240 ms. The related deadline is the sum of the ready time and a period that varies from 120 to 480 ms. Non-real-time tasks are assumed to arrive with a Poisson distribution. The mean inter-arrival time between non-real-time tasks is described in the related experiments. Non-real-time tasks are queued in FIFO order for simplicity and fairness. The size of data accessed by each non-real-time task is assumed to be 4 KB. The workloads of both real-time and non-real-time tasks are uniformly distributed over the disk surface. In all following experiments, 100 runs are conducted with different seeds for random number generation, and the average value is used for performance evaluation.
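A workload of this kind could be generated roughly as follows. This is a sketch under stated assumptions, not the authors' generator: the period is assumed to be uniform on [120, 480] ms (the text only says it "varies" in that range), the position on the disk is reduced to a uniformly drawn cylinder number, and the dictionary keys and function name are hypothetical. Poisson arrivals are modelled by exponentially distributed inter-arrival times.

```python
import random


def make_workload(num_rt, num_nrt, mean_interarrival_ms, seed=0, cylinders=1972):
    """Return (real_time_tasks, non_real_time_tasks) as lists of dicts."""
    rng = random.Random(seed)            # one seed per experiment run

    real_time = []
    for _ in range(num_rt):
        ready = rng.uniform(0.0, 240.0)
        period = rng.uniform(120.0, 480.0)   # assumption: uniform period
        real_time.append({
            "ready": ready,
            "deadline": ready + period,
            "cylinder": rng.randrange(cylinders),
            "size_kb": 36,                   # one track
        })

    non_real_time = []
    arrival = 0.0
    for _ in range(num_nrt):
        # Poisson process: exponential inter-arrival times with the given mean.
        arrival += rng.expovariate(1.0 / mean_interarrival_ms)
        non_real_time.append({
            "ready": arrival,
            "cylinder": rng.randrange(cylinders),
            "size_kb": 4,
        })
    return real_time, non_real_time


rt_tasks, nrt_tasks = make_workload(num_rt=30, num_nrt=5, mean_interarrival_ms=10.1, seed=1)
```

Repeating the call with 100 different seeds and averaging the measured metric mirrors the experimental procedure described above.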

6.1. Number of supported tasks

The number of tasks supported is one of the most important factors in measuring the performance of a real-time disk-scheduling algorithm. In Table 2, we summarize the minimum, the maximum, and the average number of real-time disk tasks that can be supported by different methods, such as GSR, RG-SCAN, and SCAN-EDF. For fair comparison, we apply the same one hundred test examples to the different test methods. In each test example, a set of 30 real-time disk tasks is given and the number of feasibly completed tasks is counted.

According to the experimental results, our GSR method can support more tasks than conventional schemes. On average, the number of tasks supported by our method is about 50% larger than that of the conventional SCAN-EDF method. Compared with RG-SCAN, our method shows around 10% improvement. Notably, based on the above experiments, test examples that are not schedulable by conventional methods can be successfully scheduled by our GSR method. This is because the GSR scheme is a global seek-optimizing algorithm. As a result, compared to previous local seek-optimizing schemes, the service times of input tasks are further reduced after rescheduling and thus more tasks can be served before their deadlines.

Table 1
Disk parameters of HP 97560

| Parameter | Value |
|---|---|
| No. of cylinders per disk | 1972 |
| No. of tracks per cylinder | 19 |
| No. of sectors per track | 72 |
| Sector size | 512 bytes |
| Seek time function (ms) | $\mathrm{Seek}(d) = 3.24 + 0.4\sqrt{d}$ for $d \le 383$; $8.00 + 0.008d$ for $d > 383$ |
| Revolution speed | 4002 RPM |
| Transfer rate | 10 MBps |
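The disk model of Table 1 can be written down directly. The sketch below follows the table literally (including the constant term for a zero-distance seek, which a real model may treat differently) and uses the stated assumption that the rotational latency is half a revolution; the function and constant names are hypothetical.

```python
from math import sqrt

RPM = 4002
SECTORS_PER_TRACK = 72
SECTOR_BYTES = 512


def seek_ms(d):
    """Seek time in ms for a seek distance of d cylinders (Table 1)."""
    if d <= 383:
        return 3.24 + 0.4 * sqrt(d)
    return 8.00 + 0.008 * d


def revolution_ms():
    """Time of one full revolution in ms."""
    return 60_000.0 / RPM


def rotational_latency_ms():
    """Assumed rotational latency: half of a full revolution."""
    return revolution_ms() / 2.0


def track_kb():
    """Capacity of one track in KB."""
    return SECTORS_PER_TRACK * SECTOR_BYTES / 1024


short_seek = seek_ms(100)     # 3.24 + 0.4 * 10
long_seek = seek_ms(400)      # 8.00 + 0.008 * 400
```

One track holds 72 sectors of 512 bytes, that is 36 KB, which matches the request size assumed for each real-time task.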

6.2. Improvement of disk throughput

Note that, if the same input tasks are given, a well-behaved real-time disk-scheduling algorithm should finish the schedule as quickly as possible to maximize data throughput. In this paper, test workloads with different numbers of input tasks are employed to measure the disk throughput obtained under different disk-scheduling schemes. To compare the results with SCAN-EDF, we require the test workloads to be feasibly schedulable by the EDF method. The result is shown in Fig. 9. Note that the improvement in data throughput is measured relative to that achieved by SCAN-EDF. Experiments show that the data throughput obtained by our GSR method is always better than that obtained by SCAN-EDF and DM-SCAN, regardless of the problem size of the test workload. Table 3 summarizes the minimum, the maximum, and the average schedule fulfill-time, and the improvement in data throughput obtained with 15 input real-time tasks for detailed comparison. Table 3 shows that the data throughput achieved by GSR is 1.1 times that of RG-SCAN. Table 4 presents the same performance metrics but assumes 20 input real-time tasks. Likewise, our GSR scheme achieves 11% improvement when compared with the RG-SCAN scheme. As stated in Section 2, previous schemes are locally seek-optimizing; a task can only be rescheduled to the locally best position (within a limited group) in terms of data throughput improvement. In contrast, GSR is a globally rescheduling scheme; that is, a task is rescheduled to the globally best position. As a result, the further reduction in the service time of tasks leads to the higher disk throughput achieved by the GSR scheme.

Table 2
The minimal, maximal, and average number of supported real-time tasks under different scheduling policies

| Algorithm | Minimum | Maximum | Average |
|---|---|---|---|
| GSR | 18 | 28 | 23 |
| RG-SCAN | 18 | 27 | 21 |
| SCAN-EDF | 14 | 23 | 15 |

[Figure: improvement (%) versus number of real-time tasks (6 to 20) for DM-SCAN, RG-SCAN, and GSR.]

Fig. 9. The data throughput improvement of various real-time disk-scheduling algorithms under different numbers of input real-time tasks.

Table 3
Given 15 real-time tasks, the minimum, maximum, average schedule fulfill-time, and the obtained data throughput improvement of different disk scheduling approaches

| Algorithm | Minimum (ms) | Maximum (ms) | Average (ms) | Improvement (%) |
|---|---|---|---|---|
| GSR | 139.23 | 207.58 | 172.46 | 22.60 |
| RG-SCAN | 146.34 | 261.64 | 189.35 | 15.02 |
| DM-SCAN | 154.44 | 261.64 | 191.98 | 13.84 |

Furthermore, we present the data throughput improvement under a mixed workload. RG-SCAN has also been extended to serve non-real-time tasks fairly. Thus, we compare our performance results with those obtained by the RG-SCAN scheme. Assume that the mean inter-arrival time of non-real-time tasks is 10.1 ms. Fig. 10 shows the data throughput improvement under different scheduling schemes when each input problem consists of a number of real-time tasks and three non-real-time tasks. For example, the bar at 10 real-time tasks actually represents 10 real-time tasks and three non-real-time tasks. As stated before, many systems have a mixed workload with both real-time and non-real-time traffic. Thus, the applied disk-scheduling scheme should maximize the disk throughput under guaranteed real-time constraints while minimizing the response time of non-real-time tasks. As seen in Fig. 10, GSR still outperforms the conventional schemes in data throughput improvement in a mixed environment. For example, given 15 real-time tasks and three non-real-time tasks, our data throughput is 1.1 times that of RG-SCAN. The effectiveness of our scheme with respect to the performance of non-real-time tasks is shown in Section 6.3. Fig. 11 shows the same performance metrics, but each problem includes five non-real-time disk tasks.

[Figure: improvement (%) versus number of real-time tasks (10 to 20) for RG-SCAN and GSR.]

Fig. 10. Given three non-real-time tasks, the data throughput improvement of GSR and RG-SCAN under different numbers of input real-time tasks.

Table 4
Given 20 real-time tasks, the minimum, maximum, average schedule fulfill-time, and the obtained data throughput improvement of different disk scheduling approaches

| Algorithm | Minimum (ms) | Maximum (ms) | Average (ms) | Improvement (%) |
|---|---|---|---|---|
| GSR | 180.75 | 277.47 | 221.31 | 23.91 |
| RG-SCAN | 195.15 | 318.54 | 248.90 | 14.43 |
| DM-SCAN | 195.15 | 318.54 | 254.88 | 12.37 |
| SCAN-EDF | 236.52 | 338.36 | 290.86 | 0 |

[Fig. 11: improvement (%) versus number of real-time tasks (10 to 20) for DM-SCAN, RG-SCAN, and GSR.]
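The Improvement column of Table 4 appears to be consistent with the relative reduction of the average schedule fulfill-time with respect to SCAN-EDF. That reading is an inference from the numbers, not a definition stated in the text; the short sketch below recomputes the column under that assumption, with a hypothetical helper name.

```python
def improvement_pct(baseline_ms, scheme_ms):
    """Relative reduction of the average fulfill-time, in percent."""
    return 100.0 * (baseline_ms - scheme_ms) / baseline_ms


# Average schedule fulfill-times (ms) taken from Table 4.
table4_avg = {"GSR": 221.31, "RG-SCAN": 248.90, "DM-SCAN": 254.88, "SCAN-EDF": 290.86}

baseline = table4_avg["SCAN-EDF"]
table4_improvement = {
    name: round(improvement_pct(baseline, avg), 2)
    for name, avg in table4_avg.items()
}
```

Under this assumption the recomputed values agree with the reported 23.91%, 14.43%, and 12.37% to within rounding.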

6.3. Response time of non-real-time task

A real system would have a mixed workload and would need to serve both real-time and non-real-time tasks. As a result, given a real-time schedule and a number of non-real-time tasks, the original real-time schedule should be adjusted to serve the non-real-time tasks fairly without violating timing constraints. For a non-real-time task, the response time, i.e., the difference between its fulfill-time and its ready time, is an important factor in measuring the effectiveness of a disk-scheduling algorithm. Given 10 real-time tasks, Fig. 12 shows the average response time obtained by GSR and RG-SCAN under different numbers of non-real-time tasks. The mean inter-arrival time of non-real-time tasks is assumed to be 10.1 ms, which saturates the queue of non-real-time tasks and avoids the occurrence of an empty queue. To show the effectiveness of our scheme in supporting non-real-time tasks under different non-real-time workloads, Figs. 13 and 14 show the same performance metric with the mean inter-arrival time of non-real-time tasks set to 5.1 and 20.1 ms, respectively. Figs. 15–17 show the corresponding experimental results when the problem consists of 20 real-time tasks.

As stated in Section 2, RG-SCAN partitions the input tasks into a set of R-Groups. After rescheduling tasks by the seek-optimizing SCAN scheme within an R-Group, the finish-time of the R-Group is improved. As a result, a slack is derived between the advanced finish-time and the original one. RG-SCAN uses this slack to serve non-real-time tasks. If the slack derived in an R-Group is not large enough to sustain the execution of a non-real-time task, RG-SCAN continues to the next R-Group and adds its slack to the previous one, and so on, until the non-real-time task can be served. In other words, in RG-SCAN, a non-real-time task waits behind the real-time tasks until enough slack has been accumulated. In contrast, the GSR treats a non-real-time task in the same manner as a real-time task, but uses the first-fit selection scheme and imposes no real-time requirements on it. Thus, a non-real-time task in GSR is served at its earliest possible point as long as the schedule result is feasible. In addition, as shown in Section 6.1, owing to the superiority of our proposed GSR scheme in serving real-time tasks, real-time tasks are served more quickly than by the RG-SCAN scheme. As a result, in a mixed workload, a non-real-time task can also be served quickly by the GSR scheme since the real-time tasks ahead of it finish sooner. Consequently, by replacing the best-fit selection scheme with the first-fit selection criterion for non-real-time tasks, and owing to its superiority in serving real-time tasks, our proposed GSR scheme offers a shorter response time than RG-SCAN for serving non-real-time tasks.

[Figure: response time (ms) versus number of non-real-time tasks (3 to 6) for RG-SCAN and GSR.]

Fig. 12. Mean inter-arrival time = 10.1 ms, given 10 real-time tasks: the response time of GSR and RG-SCAN under different numbers of input non-real-time tasks.

[Figure: response time (ms) versus number of non-real-time tasks (3 to 6) for RG-SCAN and GSR.]

Fig. 13. Mean inter-arrival time = 5.1 ms, given 10 real-time tasks: the response time of GSR and RG-SCAN under different numbers of input non-real-time tasks.
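The slack accumulation attributed to RG-SCAN above can be sketched as a simple running sum. This is only an illustration of the behaviour described in the text, not RG-SCAN's actual implementation; the per-group slack values, the service time, and the function name are hypothetical.

```python
def rgscan_serving_group(slacks_ms, service_ms):
    """Index of the first R-Group after which the accumulated slack
    covers the service time of a non-real-time task, or None."""
    accumulated = 0.0
    for index, slack in enumerate(slacks_ms):
        accumulated += slack
        if accumulated >= service_ms:
            return index
    return None


# Hypothetical slacks (ms) gained in three consecutive R-Groups.
served_after = rgscan_serving_group([2.0, 3.0, 6.0], service_ms=10.0)
```

In this sample the task has to wait until the third R-Group (index 2), which illustrates why its response time grows when the individual slacks are small.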

Table 5 summarizes the minimum, maximum, and average schedule fulfill-time and the minimum, maximum, and average response time of non-real-time tasks for 10 real-time tasks and three non-real-time tasks with a 10.1 ms mean inter-arrival time, for detailed comparison. Note that the average schedule fulfill-time includes the execution of both real-time tasks and non-real-time tasks. Table 6 shows the same performance metrics but with 20 real-time tasks and five non-real-time tasks. As seen in Tables 5 and 6, our proposed GSR scheme not only offers a shorter response time for non-real-time tasks, but also provides a larger data throughput (i.e., a shorter schedule fulfill-time) for the total schedule results. For example, Table 6 shows that, compared with the RG-SCAN scheme, our GSR scheme achieves over 7% improvement in data throughput and 33% improvement in the average response time of non-real-time tasks. Notably, the extra non-real-time tasks supported by GSR and RG-SCAN, together with their real-time tasks, still have a shorter schedule fulfill-time than SCAN-EDF, which only counts the real-time tasks, i.e., the input workload to SCAN-EDF does not include the non-real-time tasks. This further demonstrates the effectiveness of our proposed GSR scheme.

[Figure: response time (ms) versus number of non-real-time tasks (3 to 6) for RG-SCAN and GSR.]

Fig. 14. Mean inter-arrival time = 20.1 ms, given 10 real-time tasks: the response time of GSR and RG-SCAN under different numbers of input non-real-time tasks.

[Figure: response time (ms) versus number of non-real-time tasks (3 to 6) for RG-SCAN and GSR.]

Fig. 15. Mean inter-arrival time = 10.1 ms, given 20 real-time tasks: the response time of GSR and RG-SCAN under different numbers of input non-real-time tasks.

[Fig. 16: response time (ms) versus number of non-real-time tasks (3 to 6) for RG-SCAN and GSR.]
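The two percentages quoted from Table 6 can be reproduced from the average values in that table, assuming that each improvement is the relative reduction with respect to RG-SCAN (an interpretation of the numbers, not a definition given in the text):

```latex
\[
\frac{332.42 - 308.39}{332.42} \approx 0.072 \quad (\text{schedule fulfill-time, about } 7.2\%),
\qquad
\frac{99.61 - 66.40}{99.61} \approx 0.333 \quad (\text{average response time, about } 33.3\%).
\]
```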

7. Conclusion

In order to improve data throughput, the seek-optimizing SCAN scheme should be employed to reschedule the input tasks as much as possible. However, previous approaches are limited in flexibility and efficiency in that a task can only be rescheduled in a seek-optimizing manner together with tasks that have the same deadline or that lie within the same local group (a set of contiguous tasks). In other words, in conventional schemes, a task can only be rescheduled by SCAN to a locally best position. In this paper, we propose a globally seek-optimizing disk-scheduling scheme called GSR. In GSR, a task can be rescheduled to the globally best position, i.e., the position with the maximal data throughput while guaranteeing a feasible schedule.

[Fig. 17: response time (ms) versus number of non-real-time tasks (3 to 6) for RG-SCAN and GSR.]

Table 5
Given 10 real-time tasks and three non-real-time tasks, the schedule fulfill-time and the response time of different real-time disk scheduling approaches

| Algorithm | Fulfill-time min (ms) | Fulfill-time max (ms) | Fulfill-time avg (ms) | Improvement (%) | Response min (ms) | Response max (ms) | Response avg (ms) |
|---|---|---|---|---|---|---|---|
| GSR | 136.63 | 184.53 | 164.86 | 9.72 | 13.85 | 113.88 | 59.61 |
| RG-SCAN | 134.23 | 195.77 | 173.61 | 4.93 | 19.85 | 114.55 | 70.93 |
| DM-SCAN | 151.41 | 204.79 | 177.19 | 2.97 | 19.85 | 114.55 | 70.93 |
| SCAN-EDF | 160.18 | 249.34 | 199.75 | 0 | N.A. | N.A. | N.A. |

Table 6
Given 20 real-time tasks and five non-real-time tasks, the schedule fulfill-time and the response time of different real-time disk scheduling approaches

| Algorithm | Fulfill-time min (ms) | Fulfill-time max (ms) | Fulfill-time avg (ms) | Improvement (%) | Response min (ms) | Response max (ms) | Response avg (ms) |
|---|---|---|---|---|---|---|---|
| GSR | 267.78 | 377.53 | 308.39 | 10.85 | 26.64 | 122.47 | 66.40 |
| RG-SCAN | 273.35 | 390.33 | 332.42 | 3.90 | 39.79 | 167.12 | 99.61 |
| DM-SCAN | 276.98 | 398.36 | 337.69 | 2.38 | 39.79 | 167.12 | 99.61 |

In addition, we extend the GSR scheme to serve mixed workloads that consist of both real-time and non-real-time traffic. Instead of the best-fit rescheduling policy used for real-time tasks, the GSR reschedules a non-real-time task into the first-fit scan-group to minimize the response time. The experimental results show that our proposed GSR scheme is better than the conventional methods not only in improving the data throughput, but also in shortening the response time.

References

Anderson, D.P., Osawa, Y., Govindan, R., 1991. Real-time disk storage and retrieval of digital audio/video data. Technical Report 1991, Department of Computer Science, University of California, Berkeley.

Anderson, D.P., Osawa, Y., Govindan, R., 1992. A ﬁle system for continuous media. ACM Trans. Computer Systems 10 (4), 311–337.

Chang, R.I., Chen, M.C., Ho, J.M., Ko, M.T., 1997. Designing the ON-OFF CBR transmission schedule for jitter-free VBR media playback in real-time networks. Proc. IEEE RTCSA, 2–9.

Chang, R.I., Shih, W.K., Chang, R.C., 1998. Deadline-modiﬁcation-scan with maximum scannable-groups for multimedia real-time disk scheduling. In: Proceedings of the 19th IEEE Real-Time Systems Symposium, pp. 40–49.

Chang, H.P., Chang, R.I., Shih, W.K., Chang, R.C., 2002. Reschedulable-Group-SCAN Scheme for Mixed Real-Time/Non-Real-Time Disk Scheduling in a Multimedia System. J. Syst. Software 59 (2), 143–152.

Chen, T.S., Yang, W.P., 1992. Amortized analysis of disk scheduling algorithm V(R)*. J. Inf. Sci. Eng. 8, 223–242.

Chen, T.S., Yang, W.P., Lee, R.C.T., 1992. Amortized analysis of some disk scheduling algorithms: SSTF, SCAN, and N-Step SCAN. BIT 32, 546–558.

Gemmell, D.J., Christodoulakis, S., 1992. Principles of delay sensitive multimedia data storage and retrieval. ACM Trans. Information Systems 10 (1), 51–90.

Gemmell, D.J., Vin, H.M., Kandlur, D.D., Rangan, P.V., Rowe, L.A., 1995. Multimedia storage servers: a tutorial. IEEE Comput., 40–49.

Lehoczky, J.P., 1990. Fixed priority scheduling of periodic task sets with arbitrary deadlines. In: Proceedings of the Real-Time Systems Symposium, pp. 201–212.

Liu, C.L., Layland, J.W., 1973. Scheduling algorithms for multiprogramming in a hard real-time environment. J. ACM, 46–61.

Lougher, P., Shepherd, D., 1993. The design of a storage server for continuous media. Comput. J. 36 (1), 32–42.

Reddy, A.L.N., Wyllie, J., 1993. Disk scheduling in a multimedia I/O system. In: Proceedings of the ACM Multimedia Conference, pp. 225–233.

Reddy, A.L.N., Wyllie, J., 1994. I/O issues in a multimedia system. IEEE Comput., 69–74.

Ruemmler, C., Wilkes, J., 1994. An introduction to disk drive modeling. IEEE Comput., 16–28.

Stankovic, J.A., Buttazzo, G.C., 1995. Implications of classical scheduling results for real-time systems. IEEE Comput., 16–25.

Steinmetz, R., 1995. Multimedia ﬁle systems survey: approaches for continuous media disk scheduling. Comput. Commun. 18 (3), 133–144.

Wong, C.K., 1980. Minimizing expected head movement in one dimension and two dimension mass storage system. Comput. Survey 12 (2), 167–178.

Yee, J., Varaiya, P., 1991. Disk scheduling policies for real-time multimedia applications. Technical Report, Department of Computer Science, University