Task Ranking in Clustering-based Workflow Scheduling

Chapter 3. Task Ranking and Task Group Allocation

3.1 Task Ranking in Clustering-based Workflow Scheduling


HEFT [10] is one of the best-known list-based workflow scheduling approaches. Many later list-based and clustering-based approaches [1, 21] follow its task ranking and allocation mechanisms. In HEFT [10], the priority of each task is calculated in a way similar to the bottom-level calculation for DAGs in [12]. Several other possible task-ranking methods were also mentioned in [10], but without further discussion or evaluation. In the following, we describe several alternative task-ranking methods and illustrate their potential to outperform the task-ranking mechanism of HEFT [10] under certain circumstances.

In the task-ranking approach of HEFT [10], called bottom rank hereafter in this thesis, the bottom rank of a node represents the length of the longest path starting with it and is defined as follows. The bottom ranks of the tasks in a directed acyclic graph can be computed recursively by traversing the graph upward starting from the exit nodes of the graph.

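\[
\operatorname{brank}(n_i) =
\begin{cases}
w_i, & \text{if } n_i \text{ is an exit node} \\
w_i + \max_{n_j \in \operatorname{child}(n_i)} \left( c_{i,j} + \operatorname{brank}(n_j) \right), & \text{otherwise}
\end{cases}
\]

where w_i is the computation cost of node n_i, n_j represents each child of n_i, child(n_i) is the set of children of n_i, and c_{i,j} represents the communication cost between n_i and n_j.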
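As an illustration, the recursive computation of the bottom rank can be sketched in Python as follows. The four-node fork-join graph, its node names, and all cost values are hypothetical and chosen only for this example; the dictionaries w, c, and children are assumed representations of the computation costs, the communication costs, and the child sets.

```python
from functools import lru_cache

# Hypothetical fork-join DAG: A forks to B and C, which join at D.
# All names and cost values are illustrative assumptions.
w = {"A": 2, "B": 3, "C": 4, "D": 1}                          # computation costs
c = {("A", "B"): 1, ("A", "C"): 2, ("B", "D"): 2, ("C", "D"): 1}  # communication costs
children = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}


@lru_cache(maxsize=None)
def bottom_rank(n):
    """Length of the longest path starting at n, including n's own cost."""
    if not children[n]:          # exit node
        return w[n]
    return w[n] + max(c[(n, m)] + bottom_rank(m) for m in children[n])


bottom_ranks = {n: bottom_rank(n) for n in w}
```

For this graph, bottom_rank("A") evaluates to 10, which is the length of the path through A, C, and D (2 + 2 + 4 + 1 + 1).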

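Another possible task-ranking approach discussed in this section is called top rank, which is similar to the top-level calculation for DAGs in [12]. The top rank of a node n_i represents the length of the longest path ending in it, not counting the computation cost of n_i itself, and is defined as follows. The top ranks of the tasks in a DAG can be computed recursively by traversing the graph downward starting from the entry nodes of the graph.

\[
\operatorname{trank}(n_i) =
\begin{cases}
0, & \text{if } n_i \text{ is an entry node} \\
\max_{n_j \in \operatorname{parent}(n_i)} \left( \operatorname{trank}(n_j) + w_j + c_{j,i} \right), & \text{otherwise}
\end{cases}
\]

where parent(n_i) is the set of parents of n_i.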
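The downward traversal can be sketched in the same way. This sketch assumes that the top rank of a node excludes the node's own computation cost, as the downward rank in HEFT [10] does, so that entry nodes have top rank 0. The graph, the cost values, and the dictionary names are hypothetical.

```python
from functools import lru_cache

# Hypothetical fork-join DAG; all names and cost values are illustrative.
w = {"A": 2, "B": 3, "C": 4, "D": 1}                          # computation costs
c = {("A", "B"): 1, ("A", "C"): 2, ("B", "D"): 2, ("C", "D"): 1}  # communication costs
parents = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}


@lru_cache(maxsize=None)
def top_rank(n):
    """Length of the longest path leading to n, excluding n's own cost (assumed)."""
    if not parents[n]:           # entry node
        return 0
    return max(top_rank(p) + w[p] + c[(p, n)] for p in parents[n])


top_ranks = {n: top_rank(n) for n in w}
```

Here top_rank("D") evaluates to 9, reached through A and C (2 + 2 + 4 + 1).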

In the following, we first describe another task-ranking method, called bottom amount, for workflows of fork-join structure [28]. As defined below, the bottom amount rank differs from the existing bottom rank in that it sums up the bottom ranks of all of a task's children instead of choosing the largest one, as the bottom rank approach does. The bottom amount rank therefore represents the amount of remaining workload that depends on a task better than the bottom rank does. This feature is especially important for scheduling workflows of fork-join structure. The bottom amount rank of a node is defined as follows. Computing the bottom amount rank requires the bottom ranks to be calculated first.

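\[
\operatorname{bamount}(n_i) =
\begin{cases}
w_i, & \text{if } n_i \text{ is an exit node} \\
w_i + \sum_{n_j \in \operatorname{child}(n_i)} \left( c_{i,j} + \operatorname{brank}(n_j) \right), & \text{otherwise}
\end{cases}
\]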
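A minimal Python sketch of the bottom amount rank is given below. It first computes the bottom ranks and then sums over the children instead of taking the maximum. The graph, the cost values, and the function names are hypothetical, and the sketch assumes that the communication cost to each child is included in the sum.

```python
from functools import lru_cache

# Hypothetical fork-join DAG; all names and cost values are illustrative.
w = {"A": 2, "B": 3, "C": 4, "D": 1}                          # computation costs
c = {("A", "B"): 1, ("A", "C"): 2, ("B", "D"): 2, ("C", "D"): 1}  # communication costs
children = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}


@lru_cache(maxsize=None)
def bottom_rank(n):
    """Longest path starting at n (needed before the bottom amount rank)."""
    if not children[n]:
        return w[n]
    return w[n] + max(c[(n, m)] + bottom_rank(m) for m in children[n])


def bottom_amount(n):
    """Own cost plus the summed (communication + bottom rank) of all children."""
    return w[n] + sum(c[(n, m)] + bottom_rank(m) for m in children[n])


bottom_amounts = {n: bottom_amount(n) for n in w}
```

For the fork node A, the bottom rank is 10 while the bottom amount rank is 17, because both branches (through B and through C) contribute to the sum.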

In the following, we propose a new task-ranking approach for clustering-based workflow scheduling. In contrast to previous task-ranking approaches, which rank the tasks based on a single criterion, e.g., bottom rank or bottom amount rank, this new approach is a dual mechanism: it ranks the tasks on the allocated critical paths by their top+bottom rank, i.e., the sum of top rank and bottom rank, and ranks the other tasks by their bottom rank or bottom amount rank. Therefore, there are two variants of the new approach: allocated top+bottom and bottom rank (ATBBR) and allocated top+bottom and bottom amount rank (ATBBAR). Since the top rank represents the length of the longest path ending in a node and the bottom rank represents the length of the longest path starting from a node, the top+bottom rank of a node is the length of the longest path going through the node. As a critical path is the longest path within a DAG, the nodes with the largest top+bottom rank are those on the critical path.
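The relation between the top+bottom rank and the critical path can be checked on a small example. The sketch below covers only the static case, before any task has been allocated, and assumes that the top rank excludes the node's own computation cost so that the sum counts each node once. The graph, the cost values, and all names are hypothetical.

```python
from functools import lru_cache

# Hypothetical fork-join DAG; all names and cost values are illustrative.
w = {"A": 2, "B": 3, "C": 4, "D": 1}                          # computation costs
c = {("A", "B"): 1, ("A", "C"): 2, ("B", "D"): 2, ("C", "D"): 1}  # communication costs
children = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
# Derive the parent sets from the child sets.
parents = {n: [p for p in w if n in children[p]] for n in w}


@lru_cache(maxsize=None)
def bottom_rank(n):
    """Longest path starting at n, including n's own cost."""
    if not children[n]:
        return w[n]
    return w[n] + max(c[(n, m)] + bottom_rank(m) for m in children[n])


@lru_cache(maxsize=None)
def top_rank(n):
    """Longest path leading to n, excluding n's own cost (assumed)."""
    if not parents[n]:
        return 0
    return max(top_rank(p) + w[p] + c[(p, n)] for p in parents[n])


# Length of the longest path passing through each node.
top_bottom = {n: top_rank(n) + bottom_rank(n) for n in w}
longest = max(top_bottom.values())
critical_nodes = [n for n in w if top_bottom[n] == longest]
```

In this example the nodes A, C, and D share the largest top+bottom rank of 10 and form the critical path, while B has a top+bottom rank of 9.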

The proposed approach tries to strike a balance between two philosophies. The first is giving higher priority to the tasks on the allocated critical paths, which are the tasks with the highest top+bottom rank. The second is ranking a task according to the amount of remaining workload depending on it, e.g., by its bottom rank or bottom amount rank. The approach aims to give higher priority to the tasks on the allocated critical paths [12], which might differ from the critical paths determined solely from the structural properties of the workflow.

The allocated critical paths in our approach are determined based on the actually allocated partial schedule, which might change during the scheduling process. Therefore, the proposed approach adopts a dynamic mechanism which updates the top rank of each remaining task after a task is scheduled. This dynamic mechanism enables it to more accurately find the tasks on the actual critical paths and give them higher ranks, aiming to reduce the makespan of overall workflow execution effectively.
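One possible reading of this dynamic update is sketched below. It is an assumption made for illustration and not necessarily the exact procedure of the proposed approach: for a parent that has already been scheduled, its actual finish time in the partial schedule replaces the estimated top rank plus computation cost. The sketch also ignores that communication costs may vanish when two tasks share a resource. The graph, the cost values, and the function name allocated_top_ranks are hypothetical.

```python
# Hypothetical fork-join DAG; all names and cost values are illustrative.
w = {"A": 2, "B": 3, "C": 4, "D": 1}                          # computation costs
c = {("A", "B"): 1, ("A", "C"): 2, ("B", "D"): 2, ("C", "D"): 1}  # communication costs
parents = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}


def allocated_top_ranks(finish_time):
    """Top ranks of the remaining tasks given a partial schedule.

    finish_time maps each already scheduled task to its actual finish time.
    """
    rank = {}

    def visit(n):
        if n in rank:
            return rank[n]
        best = 0
        for p in parents[n]:
            if p in finish_time:
                done = finish_time[p]        # actual finish of a scheduled parent
            else:
                done = visit(p) + w[p]       # estimate for an unscheduled parent
            best = max(best, done + c[(p, n)])
        rank[n] = best
        return best

    return {n: visit(n) for n in w if n not in finish_time}


before = allocated_top_ranks({})          # nothing scheduled yet
after = allocated_top_ranks({"A": 5})     # A was delayed and finishes at time 5
```

With an empty schedule the result equals the static top ranks. Once A is recorded as finishing at time 5 instead of 2, the top ranks of B, C, and D grow from 3, 4, and 9 to 6, 7, and 12, which shifts their top+bottom ranks accordingly.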

In a clustering-based workflow scheduling approach, there are in general two steps where task ranking information is needed: choosing the start task of a new group and selecting the next task to join a group. PCH [4] adopts the bottom rank when choosing the start task of a new group and uses the top+bottom rank for selecting the next task in a group. In this section, we propose that the workflow execution performance can be further improved by adopting different task ranking information at these two steps. Figure 3.1 is an illustrative example where the upper schedule is produced by PCH [4] and the lower schedule is generated by our approach, which adopts the allocated top+bottom and bottom amount rank (ATBBAR) when choosing the start task of a new group. The example in Figure 3.1 shows that our new task ranking method can significantly improve the overall workflow execution performance in terms of makespan.

Fig. 3.1 Comparison of the PCH and our new ranking mechanism

3.2 Adaptive Task Allocation for Clustering-based
