Backfilling Algorithm - Background and Related Work

Chapter 2: Background and Related Work

2.3. Backfilling Algorithm

The backfilling algorithm, first introduced by Lifka [16], balances two goals: keeping system resources utilized and preserving the FCFS (first come, first served) order of job execution [15]. The idea behind the algorithm is to allow small jobs from the back of the waiting queue to be processed before previously submitted jobs that are delayed because too few resources are currently available. This exploits idle resources by backfilling them with suitable jobs, thereby increasing system utilization and throughput. Figure 1 illustrates the difference between traditional FCFS scheduling and backfilling scheduling. In Figure 1, backfilling scheduling allows Job B to be processed ahead of Job A; as a result, job waiting time and resource idle time are reduced significantly in comparison with FCFS. Backfilling scheduling might, however, lead to “starvation”, a phenomenon in which some jobs never obtain sufficient resources and hence never start processing, because they are constantly delayed by newly arrived jobs that are granted resources ahead of those already waiting in the queue. To prevent starvation, a backfilling algorithm needs to reserve resources in advance, at some future time, for some of the waiting jobs. Note that backfilling algorithms require job service times to be known in advance; in practice, the service time is often specified as an upper bound.

Figure 1: Illustration of how backfilling can reduce job waiting time and idle time of resources.
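The effect described above can be reproduced with a small simulation. The following is only a minimal sketch under stated assumptions: all jobs are queued at time 0, runtimes equal their estimates, and the backfill test is a simplified one (a later job may jump ahead only if it finishes before the head job's reserved start). The names Job, simulate, and the workload (jobs R, A, B on a 4-node machine) are hypothetical and are not taken from Figure 1.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int    # nodes requested
    runtime: int  # user-supplied upper bound on the service time


def simulate(queue, total_nodes, backfill):
    """Return {job name: start time} for jobs that are all queued at t = 0."""
    if any(job.nodes > total_nodes for job in queue):
        raise ValueError("a job requests more nodes than the machine has")
    waiting = list(queue)
    running = []  # (end_time, nodes) of jobs in execution
    starts, now, free = {}, 0, total_nodes

    def launch(job):
        nonlocal free
        starts[job.name] = now
        running.append((now + job.runtime, job.nodes))
        free -= job.nodes
        waiting.remove(job)

    while waiting:
        # FCFS part: start jobs strictly in arrival order while they fit.
        while waiting and waiting[0].nodes <= free:
            launch(waiting[0])
        if not waiting:
            break
        if backfill:
            # Reservation for the head job: the earliest time at which
            # enough nodes will have been released by running jobs.
            avail, reserved_start = free, None
            for end, nodes in sorted(running):
                avail += nodes
                if avail >= waiting[0].nodes:
                    reserved_start = end
                    break
            # Later jobs may jump ahead only if they fit now and finish
            # before the reservation, so the head job is never delayed.
            for job in waiting[1:]:
                if job.nodes <= free and now + job.runtime <= reserved_start:
                    launch(job)
        # Advance to the next completion and release its nodes.
        running.sort()
        now, released = running.pop(0)
        free += released
    return starts


jobs = [Job("R", 2, 10), Job("A", 4, 5), Job("B", 2, 8)]
print(simulate(jobs, total_nodes=4, backfill=False))  # {'R': 0, 'A': 10, 'B': 15}
print(simulate(jobs, total_nodes=4, backfill=True))   # {'R': 0, 'B': 0, 'A': 10}
```

In this toy workload Job B starts 15 time units earlier with backfilling, while Job A still starts at time 10 in both runs.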

Aggressive vs. Conservative

There are several variants of backfilling algorithms. The most popular one is aggressive backfilling [15, 16], in which only the first job in the queue receives a resource reservation. In other words, if an arriving job is first in the queue and cannot be processed immediately, the algorithm calculates the earliest possible starting time for this job from its resource requirement and service time; the scheduler then makes a reservation for the job at this pre-calculated time. Other jobs are allowed to backfill only if they do not violate this reservation. The core problem of aggressive backfilling is its unpredictability: because waiting jobs other than the first do not get reservations, the algorithm cannot give every job in the queue a guaranteed starting time.

In contrast to aggressive backfilling, conservative backfilling [8, 9] makes a reservation for every queued job that cannot be executed at a given moment. A job can therefore be backfilled only if it does not delay any job ahead of it in the queue. This reduces the number of jobs that can use idle resources, so its performance tends to be inferior to that of aggressive backfilling: Mu’alem and Feitelson [8] showed that aggressive backfilling outperforms conservative backfilling in most cases. However, conservative backfilling removes the weakness of aggressive backfilling described above, because reserving resources for every waiting job lets it guarantee each job’s starting time.
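The reservation test used by aggressive backfilling can be sketched as two small functions. This is an illustrative sketch, not the implementation from the cited works: the function names and the example numbers are hypothetical, and the terms "shadow time" (the head job's reserved start) and "extra nodes" (nodes left over once the head job starts) follow common usage in the backfilling literature.

```python
def head_reservation(head_nodes, free_nodes, running):
    """Return (shadow_time, extra_nodes) for the first job in the queue.

    running is a list of (expected_end_time, nodes) for executing jobs.
    """
    avail = free_nodes
    for end, nodes in sorted(running):
        avail += nodes
        if avail >= head_nodes:
            # The head job can start once this running job has finished.
            return end, avail - head_nodes
    raise ValueError("head job needs more nodes than the machine has")


def can_backfill(now, job_nodes, job_runtime, free_nodes, shadow_time, extra_nodes):
    """True if starting the job now cannot delay the head job's reservation."""
    if job_nodes > free_nodes:
        return False
    finishes_in_time = now + job_runtime <= shadow_time
    uses_spare_nodes = job_nodes <= extra_nodes
    return finishes_in_time or uses_spare_nodes


# Hypothetical 10-node machine: 4 nodes idle, two running jobs of 3 nodes each.
shadow, extra = head_reservation(head_nodes=8, free_nodes=4, running=[(4, 3), (9, 3)])
print(shadow, extra)                             # 9 2
print(can_backfill(0, 4, 5, 4, shadow, extra))   # True: finishes before t = 9
print(can_backfill(0, 4, 12, 4, shadow, extra))  # False: would hold nodes the head job needs
print(can_backfill(0, 2, 12, 4, shadow, extra))  # True: uses only the spare nodes
```

A full scheduler would also have to decrease the number of extra nodes each time a job is admitted through the spare-node rule; the sketch leaves this bookkeeping out.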

Variants that lie between aggressive and conservative backfilling, for example making reservations for the first few jobs in the waiting queue, have also been introduced [17, 18]. The idea of using an adaptive number of reservations was presented by the authors of [17]. In this strategy, a job is not necessarily given a reservation until its expected turnaround time exceeds some threshold, whereupon it receives one. Chiang et al. [18] suggested that four reservations is a good compromise between aggressive and conservative backfilling.
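The range between the two extremes can be illustrated with a single planning function that takes the number of reservations as a parameter. This is a minimal sketch under stated assumptions: the function plan, its depth parameter, and the workload are hypothetical, runtimes are taken to equal their estimates, and every job is assumed to request no more nodes than the machine has. With depth=1 the function behaves like aggressive backfilling; with depth equal to the queue length it behaves like conservative backfilling.

```python
def plan(queue, running, total_nodes, now=0, depth=1):
    """Plan start times at time `now`.

    queue:   list of (name, nodes, runtime) in arrival order
    running: list of (end_time, nodes) for jobs already executing
    depth:   maximum number of waiting jobs that receive a reservation
    Returns {name: start} for jobs started now or holding a reservation.
    """
    # Every allocation (running job, started job, or reservation)
    # is stored as (start, end, nodes).
    allocs = [(now, end, nodes) for end, nodes in running]

    def free_at(t):
        return total_nodes - sum(n for s, e, n in allocs if s <= t < e)

    def fits(t, nodes, runtime):
        # Usage only rises where an allocation starts, so checking t and
        # the allocation starts inside (t, t + runtime) is sufficient.
        points = [t] + [s for s, _, _ in allocs if t < s < t + runtime]
        return all(free_at(p) >= nodes for p in points)

    def earliest_start(nodes, runtime):
        # Capacity only rises where an allocation ends, so the earliest
        # feasible start after `now` is one of those end times.
        candidates = sorted({e for _, e, _ in allocs if e > now})
        return next(t for t in candidates if fits(t, nodes, runtime))

    planned, reservations = {}, 0
    for name, nodes, runtime in queue:
        if fits(now, nodes, runtime):
            start = now                      # start (or backfill) immediately
        elif reservations < depth:
            start = earliest_start(nodes, runtime)
            reservations += 1                # this job holds a reservation
        else:
            continue                         # waits without any guarantee
        allocs.append((start, start + runtime, nodes))
        planned[name] = start
    return planned


queue = [("A", 4, 5), ("B", 2, 8), ("C", 2, 12), ("D", 2, 2)]
running = [(10, 2)]  # one job holds 2 of the 4 nodes until t = 10
print(plan(queue, running, total_nodes=4, depth=1))           # {'A': 10, 'B': 0}
print(plan(queue, running, total_nodes=4, depth=len(queue)))  # {'A': 10, 'B': 0, 'C': 15, 'D': 8}
```

With a single reservation, jobs C and D receive no start time at all; with one reservation per job, every job has a guaranteed start, which is the predictability argument made for conservative backfilling.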

Slack-based Backfilling

In the original backfilling algorithm, a newly arriving job can be backfilled as long as it does not delay any existing reservation. To make backfilling scheduling more flexible and increase resource utilization, slack-based backfilling algorithms [19, 20, 21, 22] introduce the concept of a slack factor, by which the actual starting time of reserved jobs can be relaxed up to a certain slack. In other words, a newly submitted job can move to the head of the waiting queue on the condition that it does not delay existing reservations by more than a specific slack. In these algorithms, the system’s slack factor controls how long jobs may have to wait before they start executing.
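The relaxed admission test can be sketched for the simplest case of a single reservation. This is an illustration only and does not reproduce any of the cited algorithms: the function names are hypothetical, slack is assumed to be an absolute amount of time, and shadow_time and extra_nodes are assumed to describe the existing reservation (its planned start and the nodes left over once it starts).

```python
def delay_bound(now, job_nodes, job_runtime, shadow_time, extra_nodes):
    """Upper bound on how long the reserved job could be delayed."""
    if job_nodes <= extra_nodes:
        return 0  # the job only uses nodes the reservation does not need
    # Otherwise the reserved job may have to wait until this job ends.
    return max(0, now + job_runtime - shadow_time)


def can_backfill_with_slack(now, job_nodes, job_runtime, free_nodes,
                            shadow_time, extra_nodes, slack):
    """Admit the job if it fits now and delays the reservation by at most `slack`."""
    if job_nodes > free_nodes:
        return False
    return delay_bound(now, job_nodes, job_runtime, shadow_time, extra_nodes) <= slack


# A 4-node job with runtime 12 would overrun a reservation at t = 9 by up to 3.
print(can_backfill_with_slack(0, 4, 12, 4, shadow_time=9, extra_nodes=2, slack=0))  # False
print(can_backfill_with_slack(0, 4, 12, 4, shadow_time=9, extra_nodes=2, slack=5))  # True
```

Setting slack to 0 recovers the strict rule of the original algorithm, in which no existing reservation may be delayed.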

The idea of a slack factor has already been applied in real-time scheduling, parallel scheduling, and grid scheduling environments, and has been confirmed to be effective [15]. Dynamic backfilling allows the scheduler to overrule a previous reservation by a slight delay if doing so improves system utilization considerably [19]. To enhance backfilling and support priority scheduling, Talby and Feitelson [20] combined three parameters (the target job’s individual priority, a tunable system slack factor, and the average job waiting time) to assign each waiting job a slack value. The authors also provided several heuristics to reduce the search space when looking for the least costly schedule profile among all possible candidates. The cost of a schedule is the sum of the costs of all its jobs, and the cost of a job is calculated from its delay and resource requirements. In [21], Ward et al. suggested a relaxed backfilling strategy in which a backfill candidate is selected from the waiting queue by considering its waiting time, estimated service time, and resource requirement together. Bo Li et al. [22] introduced an approach that differs from previous algorithms in that the slack factor is calculated from each job’s service time and slack-based backfilling with more than one reservation is supported.

Other Variants of Backfilling

Lawson and Smirni [23] introduced multiple-queue backfilling, which divides the system resources into multiple disjoint partitions. Each partition is associated with its own queue, and a submitted job is assigned to a partition, and hence to the associated queue, based on its estimated service time. The approach aims to reduce fragmentation of system resources by reducing the likelihood that a short job is queued behind a long job. Backfilling-with-lookahead algorithms [24] make scheduling decisions by considering a set of jobs at once: they look ahead into the job queue and use dynamic programming to find a packing of jobs that maximizes the scheduler’s objective.
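The packing step of a lookahead scheduler can be illustrated with a standard subset-sum dynamic program. This is a simplified sketch and not the algorithm of [24]: it assumes the objective is the number of nodes put to use right now, and it ignores runtimes and reservations. The function name pick_jobs and the example requests are hypothetical.

```python
def pick_jobs(requests, free_nodes):
    """Choose waiting jobs that together use as many free nodes as possible.

    requests: node counts of the waiting jobs, in queue order
    Returns (indices of the chosen jobs, number of nodes used).
    """
    # best[c] holds one set of job indices that uses exactly c nodes,
    # or None if no such set has been found yet.
    best = [None] * (free_nodes + 1)
    best[0] = []
    for i, need in enumerate(requests):
        # Go through capacities downwards so each job is used at most once.
        for c in range(free_nodes, need - 1, -1):
            if best[c] is None and best[c - need] is not None:
                best[c] = best[c - need] + [i]
    used = max(c for c in range(free_nodes + 1) if best[c] is not None)
    return best[used], used


# Taking jobs in queue order would start only the 6-node job (6 of 9 nodes);
# looking at the whole queue finds the 5-node and 4-node jobs (9 of 9 nodes).
print(pick_jobs([6, 5, 4], free_nodes=9))  # ([1, 2], 9)
```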
