Background - A Study on Applying Online Scheduling to Mixed-Parallel Workflows

In this chapter, we describe the application model and computing platform, and survey related workflow scheduling algorithms. Sections 2-1 and 2-2 describe the application model and the computing platform, respectively. Section 2-3 reviews static workflow scheduling algorithms. Section 2-4 surveys concurrent and online workflow scheduling algorithms.

2-1 Application Model

A scientific workflow application can be modeled as a Directed Acyclic Graph (DAG) that represents the tasks and their order. A DAG is usually defined as a pair (V, E), where V and E are finite sets. V = {t_i | i = 1, …, n} denotes the set of n individual rigid tasks [15], each of which uses a fixed number of resources. E denotes the set of edges {e_{i,j} | 1 ≤ i, j ≤ n}, where e_{i,j}, an arc from t_i to t_j, represents that t_i is a pre-task of t_j, i.e., t_j cannot start until t_i has finished and its output has been transferred. Data transfer between two tasks assigned to the same processor incurs no communication cost. In a workflow application, a task without any ancestor is called an entry task and a task without any descendant is called an exit task. It is assumed that there is only one entry task and one exit task in a workflow application.
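The model above can be made concrete with a small sketch. The function name `entry_and_exit_tasks` and the four-task workflow are hypothetical and serve only as an illustration; the DAG is represented as a list of task identifiers plus a list of (pre-task, task) edges.

```python
def entry_and_exit_tasks(tasks, edges):
    """Return (entry_tasks, exit_tasks) for a DAG given as task ids and (pred, succ) edges."""
    has_pred = {succ for _, succ in edges}   # tasks with at least one pre-task
    has_succ = {pred for pred, _ in edges}   # tasks with at least one descendant
    entry = [t for t in tasks if t not in has_pred]
    exit_ = [t for t in tasks if t not in has_succ]
    return entry, exit_


# Hypothetical workflow: t1 -> {t2, t3} -> t4
tasks = ["t1", "t2", "t3", "t4"]
edges = [("t1", "t2"), ("t1", "t3"), ("t2", "t4"), ("t3", "t4")]
entry, exit_ = entry_and_exit_tasks(tasks, edges)
```

For this example the single-entry, single-exit assumption holds: `entry` contains only `t1` and `exit_` contains only `t4`.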

2-2 Computing Platform Model and Workflow Scheduling

A High Performance Computing Cloud (HPC Cloud) can be implemented with a multi-cluster platform [16], which consists of k heterogeneous clusters C_i, i = 1, …, k, that can be geographically distributed and vary in both performance and architecture. Each cluster C_i contains P_i processors of the same type and speed (homogeneous), while different clusters may differ in the number of processors. All of the clusters are fully connected through heterogeneous network links with different bandwidths and latencies.

In general, scheduling parallel and distributed applications is known to be an NP-complete problem. In past years, much research effort has been devoted to optimizing workflow scheduling by minimizing the overall execution time, or makespan, of the workflow application. The scheduling methods that have been proposed can be classified into three categories [17]: full-ahead planning, just-in-time, and hybrid.

A full-ahead planning scheduling algorithm (static planning) assumes that the scheduler has sufficient knowledge of the workflows and resources at the very beginning. A static planner makes task assignments according to this knowledge and the machine status before the workflow application starts to execute. HEFT (Heterogeneous Earliest Finish Time) [2] is one of the most popular static heuristics and has been shown to perform better than a number of other heuristics. Static planning is not suitable for some situations, e.g., when an individual resource fails or when the costs of tasks cannot easily be estimated accurately.

In contrast, a just-in-time scheduling algorithm (dynamic planning) allocates tasks using the available tasks and free resources while an application is running. Dynamic planning is usually applied when it is difficult to estimate the costs of tasks, or when the workflow applications are submitted at different times (which is also called online scheduling). For example, RANK_HYBD, a planner-guided scheduling strategy presented in [9], is designed to deal with the multiple online workflow scheduling problem.

A hybrid (adaptive) approach presumes that enough information is known at the beginning, and a task assignment decision is made before the workflow applications execute. However, it also makes reassignments at runtime when any of the following circumstances occurs: (1) a prediction turns out to be inaccurate, (2) the resource status changes, or (3) another workflow application is submitted. For example, Z. Yu et al. [5] proposed a HEFT-based adaptive rescheduling algorithm, AHEFT. An adaptive approach seems to take full advantage of both the static and the dynamic approaches. However, it might introduce additional overhead because both kinds of information must be considered.

2-3 Static Workflow Scheduling

The taxonomy proposed in [18] classified workflow scheduling algorithms into two groups: heuristics-based and meta-heuristics-based.

Figure 2-1 A taxonomy of heuristics-based workflow scheduling algorithms

Heuristics-based scheduling algorithms fall into several categories, including (1) immediate task scheduling, (2) list-based scheduling, (3) cluster-based scheduling, and (4) duplication-based scheduling, as shown in Figure 2-1. Immediate task scheduling is the simplest heuristic for workflow applications; it makes scheduling decisions based only on the availability of tasks. One example, the Myopic algorithm [19], has been implemented in some Grid systems such as Condor DAGMan [20].

A list-based scheduling algorithm comprises two phases: the task prioritizing phase and the resource selection phase. The task prioritizing phase sets the priority of each task and generates a scheduling list by sorting the tasks according to their priorities. The resource selection phase selects tasks in order and maps each task to its optimal resource. List-based heuristics, which are generally accepted as the best overall approach, can be further divided into three subclasses according to the task parallelism [2][4][21].

HEFT [2] is a well-known list-based algorithm for heterogeneous environments. HEFT first traverses the DAG from bottom to top in order to calculate an upward rank value for each task. The tasks are then sorted in non-ascending order of their ranks. Following this order, each task is assigned to the resource that minimizes the Earliest Finish Time (EFT) of the task. Many heuristics have been developed based on HEFT [3][5][6]. Figure 2-2 shows an example of HEFT.
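The two phases of HEFT can be sketched as follows. This is a simplified illustration, not the full algorithm of [2]: it omits the insertion-based slot search, uses the average computation cost over all processors in the upward rank, takes the given communication cost as-is, and assumes all computation costs are positive so that the rank order respects precedence. The function name `heft`, its parameters, and the four-task example are hypothetical.

```python
from functools import lru_cache


def heft(succ, comp, comm):
    """Simplified HEFT sketch (no insertion policy).

    succ: task -> list of successor tasks
    comp: task -> list of run times, one entry per processor (assumed positive)
    comm: (pred, succ) -> transfer cost when the two tasks run on different processors
    """
    n_proc = len(next(iter(comp.values())))
    pred = {t: [] for t in succ}
    for t, children in succ.items():
        for c in children:
            pred[c].append(t)

    @lru_cache(maxsize=None)
    def rank_u(t):
        # Upward rank: average cost of t plus the longest remaining path to the exit task.
        avg = sum(comp[t]) / n_proc
        return avg + max((comm[(t, s)] + rank_u(s) for s in succ[t]), default=0.0)

    # Task prioritizing phase: non-ascending order of upward rank.
    order = sorted(succ, key=rank_u, reverse=True)

    # Resource selection phase: pick the processor with the earliest finish time.
    proc_free = [0.0] * n_proc
    placed = {}  # task -> (processor, start, finish)
    for t in order:
        best = None
        for p in range(n_proc):
            # Data from a pre-task on the same processor incurs no communication cost.
            ready = max(
                (placed[q][2] + (0.0 if placed[q][0] == p else comm[(q, t)])
                 for q in pred[t]),
                default=0.0,
            )
            start = max(ready, proc_free[p])
            finish = start + comp[t][p]
            if best is None or finish < best[2]:
                best = (p, start, finish)
        placed[t] = best
        proc_free[best[0]] = best[2]
    return order, placed


# Hypothetical workflow on two processors: A -> {B, C} -> D
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
comp = {"A": [2, 3], "B": [3, 2], "C": [4, 4], "D": [2, 1]}
comm = {("A", "B"): 1, ("A", "C"): 2, ("B", "D"): 1, ("C", "D"): 1}
order, placed = heft(succ, comp, comm)
makespan = max(finish for _, _, finish in placed.values())
```

In this example the upward ranks are 11.0, 6.5, 5.0 and 1.5 for A, C, B and D, so the scheduling order is A, C, B, D and the resulting makespan is 8 time units.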

Figure 2-2 An example of HEFT

Both cluster-based heuristics and duplication-based heuristics are designed to reduce the communication costs between data-interdependent tasks [22][23][24][25]. In cluster-based heuristics, the tasks in the same group (cluster) are assigned to the same resource, while duplication-based heuristics use the idle time of a resource to duplicate some parent tasks that have already been scheduled on other resources.

A meta-heuristics-based scheduling algorithm provides both a general structure and strategy guidelines for developing a heuristic that fits a particular kind of problem. A meta-heuristics-based algorithm, which is generally applied to a large and complicated problem, provides an efficient way of moving quickly toward a very good solution. Three such algorithms are the Greedy Randomized Adaptive Search Procedure (GRASP) [26], the Genetic Algorithm [27], and Simulated Annealing [28]. However, the scheduling time of meta-heuristics-based algorithms is significantly higher than that of heuristics-based algorithms.

The heuristics-based and meta-heuristics-based approaches have been compared in [18][29]. The results show that the meta-heuristics-based approach usually performs better than the heuristics-based one, since a meta-heuristics-based approach can produce an optimized solution based on the performance of the entire workflow. However, the time complexity of a meta-heuristics-based algorithm grows more rapidly than that of a heuristics-based algorithm as the number of tasks in the workflow increases.

2-4 Scheduling Multiple Workflows

The scheduling algorithms mentioned above usually consider a single workflow only. In recent years, a few methods have been proposed for dealing with multiple workflows. Zhao and Sakellariou [8] presented three different approaches to scheduling multiple workflows at the same time:

(1) Scheduling the workflows one after the other with any single-workflow scheduling algorithm.

(2) Scheduling the workflows in sequence with backfilling.

(3) Merging multiple workflows into a single workflow.
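As an illustration of the third approach, one simple way to merge several workflows is to connect them through a common entry node and a common exit node, so that any single-workflow algorithm can be applied to the result. The sketch below assumes these two extra nodes have zero cost; the function name `merge_workflows`, the node names `ENTRY` and `EXIT`, and the `w<index>.` prefix are hypothetical, and the composition schemes actually studied in [8] are not reproduced here.

```python
def merge_workflows(workflows, entry="ENTRY", exit_="EXIT"):
    """Merge several DAGs, each given as a list of (pred, succ) edges, into one edge list.

    Tasks are prefixed with their workflow index to keep names unique. Every original
    entry task is linked from a common entry node and every original exit task is
    linked to a common exit node (both assumed to have zero cost).
    """
    merged = []
    for idx, edges in enumerate(workflows):
        tasks = {t for edge in edges for t in edge}
        has_pred = {s for _, s in edges}
        has_succ = {p for p, _ in edges}

        def name(t, i=idx):
            return f"w{i}.{t}"

        merged += [(name(p), name(s)) for p, s in edges]
        merged += [(entry, name(t)) for t in sorted(tasks - has_pred)]
        merged += [(name(t), exit_) for t in sorted(tasks - has_succ)]
    return merged


# Two hypothetical workflows: a -> b, and {a, b} -> c
merged = merge_workflows([[("a", "b")], [("a", "c"), ("b", "c")]])
```

Workflows are described by their edges only, so a workflow consisting of a single isolated task would need separate handling in this sketch.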

However, the approaches mentioned above are infeasible when multiple workflows arrive at different times. Thus, RANK_HYBD [9] has been proposed to support online workflow scheduling. The task scheduling approach of RANK_HYBD repeatedly re-prioritizes the tasks in the waiting queue by the following rules:

(1) If all the tasks in the waiting queue come from a single workflow, it prioritizes the tasks in non-ascending order of the task ranking value described in HEFT [2].

(2) Otherwise, it prioritizes the tasks in the opposite order, i.e., non-descending order of the ranking value.
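The two rules can be sketched as a single sorting step. The tuple layout `(workflow_id, task_id, rank)` and the function name `prioritize` are assumptions made for this illustration; the rank values are taken as already computed.

```python
def prioritize(waiting_queue):
    """Re-order the waiting queue following the two rules above.

    waiting_queue: list of (workflow_id, task_id, rank) tuples.
    """
    single_workflow = len({wf for wf, _, _ in waiting_queue}) <= 1
    # Rule (1): one workflow -> highest rank first; rule (2): several -> lowest rank first.
    return sorted(waiting_queue, key=lambda item: item[2], reverse=single_workflow)


# Hypothetical queues with precomputed rank values
one_workflow = prioritize([("w1", "a", 5), ("w1", "b", 9)])
two_workflows = prioritize([("w1", "a", 5), ("w2", "x", 3), ("w1", "b", 9)])
```

With several workflows in the queue, tasks with a low rank, which are close to the end of their workflow, are served first.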

However, RANK_HYBD does not consider mixed-parallel workflows, in which an application has more than one task that can execute concurrently and a single task can run on more than one resource simultaneously. Online Workflow Management (OWM) [12] has been proposed for online mixed-parallel workflows.

In OWM, there are four processes: Critical Path Workflow Scheduling (CPWS), Task Scheduling, Task Rearrangement, and Adaptive Allocation (AA). Figure 2-3 shows the structure of OWM. CPWS manages the task interdependence and submits tasks into the waiting queue according to the critical path of each workflow. The task scheduling process in OWM sorts the waiting queue in the same way as RANK_HYBD. When parallel tasks are scheduled, slack may arise among the tasks when the free processors are not enough for the first task in the waiting queue. The multi-processor task rearrangement process works to minimize this slack by using later tasks in the queue, which improves utilization. When there are free resources, AA takes the highest-priority task in the waiting queue and selects the required resources to execute the task.
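The idea of filling slack with later tasks can be illustrated with a greedy first-fit pass over the waiting queue. This is only a sketch of the general idea under simplifying assumptions (each task needs a fixed number of processors, and run times are ignored); it is not the rearrangement rule defined in [12], and the name `select_runnable` is hypothetical.

```python
def select_runnable(waiting_queue, free_processors):
    """Pick tasks that fit into the currently free processors.

    waiting_queue: list of (task_id, required_processors) in priority order.
    Returns the selected task ids and the number of processors left idle.
    """
    selected = []
    for task_id, required in waiting_queue:
        # A task that does not fit is skipped so that later, smaller tasks can use the slack.
        if required <= free_processors:
            selected.append(task_id)
            free_processors -= required
    return selected, free_processors


# Hypothetical queue: the first task needs 4 processors but only 3 are free
selected, idle = select_runnable([("t1", 4), ("t2", 2), ("t3", 1), ("t4", 2)], 3)
```

Here `t1` has to wait, while `t2` and `t3` together occupy the three free processors, leaving none idle.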

Figure 2-3 Online Workflow Management (OWM)

Chapter 3 Mixed-Parallel Online Workflow
