Performance Evaluation - A Study of Task Ranking, Grouping, and Allocation Methods for Workflow Scheduling Problems

In this section, we evaluate the proposed approaches through a series of simulation experiments that compare them with previous methods in the literature in terms of the average makespan of all workflows. Here, the makespan of a workflow is defined as the time between its arrival and the completion of its execution.

4.1 Task Ranking in Clustering-based Workflow Scheduling

This section evaluates the proposed task-ranking methods for clustering-based workflow scheduling. The performance metric used in the experiments is makespan, which represents the total execution time for a workflow application. It is used to measure the performance of a scheduling algorithm from the perspective of workflow applications. In each experiment, the average makespan of 100 different workflows is used to evaluate different scheduling methods.
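As a minimal sketch of this metric, assuming each simulated workflow is recorded as a hypothetical (arrival, finish) pair in seconds, the average makespan could be computed as follows. The function names are illustrative and are not taken from the simulator used in the experiments.

```python
from statistics import mean

def makespan(arrival, finish):
    """Time between a workflow's arrival and the end of its execution."""
    if finish < arrival:
        raise ValueError("finish time precedes arrival time")
    return finish - arrival

def average_makespan(records):
    """Average makespan over (arrival, finish) pairs, in seconds."""
    return mean(makespan(a, f) for a, f in records)
```

In the experiments, such an average would be taken over the 100 workflows of one run.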

We implemented a DAG generator to randomly generate synthetic workflows with fork-join DAG structure for the following simulation experiments. Fork-join DAGs are generated as follows:

1. The generator generates a DAG with one entry node and one exit node.

2. Each DAG contains 1–2 fork-join structures randomly.

3. Each fork operation produces 2–10 branches randomly.

4. Each branch contains 2–6 nodes randomly.

5. Each node has the computation cost ranging from 1 to 100 s.
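The five steps above can be sketched as follows. This is not the generator used in the experiments: it assumes that the fork-join structures are chained in series, that the entry node acts as the first fork and the last join acts as the exit node, and that costs are uniform integers. The name `generate_fork_join_dag` is hypothetical.

```python
import random

def generate_fork_join_dag(rng):
    """Return (costs, edges) for one random fork-join DAG.

    costs maps node id -> computation cost in seconds;
    edges is a list of (parent, child) pairs.
    """
    costs, edges = {}, []

    def new_node():
        node = len(costs)
        costs[node] = rng.randint(1, 100)       # step 5: cost in 1..100 s
        return node

    current = new_node()                        # step 1: single entry node
    for _ in range(rng.randint(1, 2)):          # step 2: 1-2 fork-join structures
        join = new_node()
        for _ in range(rng.randint(2, 10)):     # step 3: 2-10 branches per fork
            prev = current
            for _ in range(rng.randint(2, 6)):  # step 4: 2-6 nodes per branch
                node = new_node()
                edges.append((prev, node))
                prev = node
            edges.append((prev, join))          # every branch ends at the join
        current = join                          # assumed: the join feeds the next fork
    return costs, edges                         # the last join is the single exit
```

Calling `generate_fork_join_dag(random.Random(seed))` with 100 different seeds would give one reproducible set of 100 workflows.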

Figures 4.1 to 4.3 show the experimental results comparing PCH [4] and our new task ranking method for three different values of the Communication-to-Computation Ratio (CCR) [12]. When CCR is one, PCH [4] and our method achieve quite similar performance. However, as CCR grows, our method delivers an increasing performance improvement over PCH [4].
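For reference, CCR is commonly computed as the average communication cost of the edges divided by the average computation cost of the nodes. The sketch below follows that common definition. How edge costs were actually assigned in the experiments is not described here, so the uniform rescaling in `scale_to_ccr` is an assumption, and both function names are hypothetical.

```python
def ccr(comp_costs, comm_costs):
    """Average edge (communication) cost over average node (computation) cost."""
    avg_comm = sum(comm_costs.values()) / len(comm_costs)
    avg_comp = sum(comp_costs.values()) / len(comp_costs)
    return avg_comm / avg_comp

def scale_to_ccr(comp_costs, comm_costs, target):
    """Rescale all edge costs by one factor so the DAG has the target CCR."""
    factor = target / ccr(comp_costs, comm_costs)
    return {edge: cost * factor for edge, cost in comm_costs.items()}
```

With this kind of helper, the same DAG structure could be reused for CCR = 0.1, 1, and 10 by changing only the edge costs.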

Figure 4.1: Ranking fork-join workflows on 30 resources (CCR = 0.1)

Figure 4.2: Ranking fork-join workflows on 30 resources (CCR = 1)

Figure 4.3: Ranking fork-join workflows on 30 resources (CCR = 10)

4.2 Adaptive Task Allocation for Clustering-based Workflow Scheduling

This section presents a series of simulation experiments that evaluate the proposed adaptive subgroup allocation method in terms of makespan. The adaptive subgroup allocation method is compared to the typical allocation mechanism in PCH [4, 5]. The experimental results are shown in Figure 4.4 for workflows of different CCR properties. The performance shown in each experiment is the average makespan of 100 different workflows scheduled by the evaluated methods. Our adaptive subgroup allocation method outperforms PCH significantly in all three experiments, and the performance improvement rises as CCR increases.

Figure 4.4: Experimental results for workflows of different CCR

Figure 4.5 is an example illustrating how our adaptive subgroup allocation method can outperform PCH. Our method adaptively partitions a task group into several subgroups for individual allocation based on each join node. This arrangement gives flexibility to the allocation of join nodes while retaining most benefits of clustering-based methods, and thus achieves better workflow execution performance.

[Bar chart: Avg-Makespans of PCH and ICEAI (NEW RANK) for CCR0.1, CCR1, and CCR10; y-axis from 500 to 3000; data labels as printed: 660.51, 773.4, 2972.22, 650.05, 766.94, 2462.12]

Figure 4.5: Schedules of different allocation methods: (a) example workflow, (b) PCH allocation, (c) adaptive subgroup allocation


4.3 Task Group Allocation for Online Multi-workflow Scheduling

This section evaluates the proposed adaptive dual-criteria task group allocation method for clustering-based multiple workflow scheduling. The proposed method is compared with previous approaches, including the best-fit heuristic in [16], the EFT heuristic in PCH [4] [18], and the EST + fitness approach [17].

Figures 4.6, 4.7, and 4.8 compare the adaptive dual-criteria task group allocation method with previous approaches in terms of average makespan under different CCR values. In the figures, adjustable allocation denotes an approach that adopts only the adjustable idle time gap selection mechanism, while adaptive dual-criteria denotes an approach that adopts both adjustable idle time gap selection and adaptive task group rearrangement. The parentheses next to adjustable allocation give the best weight of fitness in score function (1), and the parentheses next to adaptive dual-criteria give the weights of fitness and EFT, i.e., α and β in score function (2), that lead to the best performance in that case. The inter-arrival time between two consecutive workflows is determined by a random number within a range. In the experiments of Figures 4.6, 4.7, and 4.8, the range of inter-arrival time is 30 seconds. The experimental results indicate that adaptive dual-criteria outperforms the other methods only when CCR is one.
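The arrival process described above can be sketched as follows. The text only states that each inter-arrival time is a random number within a range, so the uniform distribution, the lower bound of zero, and the name `arrival_times` are assumptions made for illustration.

```python
import random

def arrival_times(n_workflows, iat_range, rng):
    """Arrival times whose consecutive gaps are drawn uniformly from [0, iat_range]."""
    times, now = [], 0.0
    for _ in range(n_workflows):
        times.append(now)
        now += rng.uniform(0.0, iat_range)  # assumed: uniform inter-arrival time
    return times
```

Under these assumptions, `arrival_times(100, 30, random.Random(1))` would correspond to a run with a 30-second range, and a range of 500 gives a much sparser workload.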

Fig. 4.6 CCR = 0.1 and inter-arrival time range = 30 seconds

Fig. 4.7 CCR = 1 and inter-arrival time range = 30 seconds

Fig. 4.8 CCR = 10 and inter-arrival time range = 30 seconds

Figures 4.9, 4.10, and 4.11 present the evaluation of the proposed adaptive dual-criteria method when the range of inter-arrival time between workflows is 500 seconds, much longer than in Figures 4.6, 4.7, and 4.8. In this scenario, our adaptive dual-criteria method outperforms the previous approaches under all three CCR values. Again, the adaptive dual-criteria approach achieves the largest performance improvement when CCR is one. Its improvement is smallest for a large CCR value, i.e., CCR = 10.

Fig. 4.10 CCR = 1 and inter-arrival time range = 500 seconds

Fig. 4.11 CCR = 10 and inter-arrival time range = 500 seconds

Figures 4.13 and 4.14 show the experimental results based on the structure and properties of a real workflow application, LIGO [19], as shown in Figure 4.12. The experimental results show that our adaptive dual-criteria approach can achieve better performance than previous methods in [16][18][20] for larger inter-arrival time.


Fig. 4.12 DAG structure of a real workflow application LIGO

Fig. 4.13 inter-arrival time range = 30 seconds for LIGO

[Bar chart (IAT 30): Avg-Makespans, y-axis from 3850 to 3910; legend: EFT, EST+fitness, best-fit, adjustable allocation (0.1), adaptive dual-criteria (0.7 0.1); data labels as printed: 3874.3, 3874, 3905.3, 3873.3, 3877.8]

Fig. 4.14 inter-arrival time range = 500 seconds for LIGO

The time complexity of choosing a good gap for task group allocation depends on several factors, including the number of resources, the number of gaps on each resource in the partial schedule, and the score evaluation function. In our implementation of the methods evaluated in the experiments, each gap is checked and given a score before the best gap for allocation is determined. Therefore, the time complexity of the different methods differs mainly in the score function. Among the task group allocation methods evaluated in the experiments, EFT has the lowest time complexity since it does not have to calculate the fitness value of each gap.

All the other methods, EST+fitness, best-fit, and our adaptive dual-criteria method, have to calculate the fitness value and even perform further computation in the score function, resulting in higher time complexity. Our adaptive dual-criteria method has the most complex score evaluation function, as shown in equation (2), and thus has the highest time complexity.
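The exhaustive scan described above can be sketched as follows. This is an illustration rather than the implementation used in the experiments: the data layout, the name `best_gap`, and the convention that a higher score is better are all assumptions, and the score function is passed in as a parameter because score functions (1) and (2) are defined elsewhere.

```python
def best_gap(gaps_by_resource, score):
    """Scan every idle gap on every resource and return the best-scoring one.

    gaps_by_resource maps a resource id to a list of (start, end) idle gaps;
    score is any callable mapping (resource, gap) to a number.
    Assumed convention: a higher score is better.
    """
    best, best_score = None, float("-inf")
    for resource, gaps in gaps_by_resource.items():
        for gap in gaps:
            s = score(resource, gap)  # one score evaluation per gap
            if s > best_score:
                best, best_score = (resource, gap), s
    return best
```

Since the loop evaluates the score exactly once per gap, the total cost is the number of gaps multiplied by the cost of one score evaluation, which is why the methods differ mainly in their score functions.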

Figure 4.15 compares the scheduling overheads of different methods measured in the

allocation approach requires the longest computation time. However, compared to the long execution time of scientific workflow applications, usually in hours or even more, the scheduling overhead, in milliseconds, is negligible.

Fig. 4.15 Scheduling overheads in milliseconds

Table 4.1 shows the resulting average makespan of applying our adaptive dual-criteria task group allocation approach to the set of workflows in Figure 4.11 with different α, β values.

The results indicate that careful selection of appropriate α, β values is important for achieving good performance. The difference between the best, (α=0, β=0.8), and the worst, (α=0.9, β=0.1), performance is enormous. The performance achieved by (α=0.9, β=0.1) is even worse than that of previous methods according to the data in Figure 4.11. Choosing the best α, β values is not easy. Exhaustive experiments, such as the one in Table 4.1, based on historical workload data can help to find good α, β values for a specific system. The experimental results presented in this section also shed some light on how to choose appropriate α, β values based on information such as CCR values and average inter-arrival time.

Table 4.1. An exhaustive experiment on α, β values

α     β     1-α-β   makespans
0.0   0.1   0.9     3829.6
0.6   0.4   0.0     3910.4
0.7   0.1   0.2     4364.9
0.7   0.2   0.1     4108.3
0.7   0.3   0.0     4023.3
0.8   0.1   0.1     4430.2
0.8   0.2   0.0     4201.8
0.9   0.1   0.0     4521.3
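An exhaustive search of this kind can be sketched as follows. The grid step of 0.1 matches the values listed in the table, but including zero weights in the grid is an assumption, and `evaluate` is a hypothetical callback that would run the simulation for one (α, β) pair and return the average makespan.

```python
def weight_grid(step=0.1):
    """All (alpha, beta) pairs on a grid with alpha + beta <= 1."""
    n = round(1 / step)
    return [(i / n, j / n) for i in range(n + 1) for j in range(n + 1 - i)]

def best_weights(evaluate, step=0.1):
    """Return the (alpha, beta) pair with the smallest average makespan."""
    return min(weight_grid(step), key=lambda w: evaluate(*w))
```

With a step of 0.1, this grid contains 66 pairs, so each search requires 66 simulation runs over the historical workload.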
