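Performance Evaluation - A Study of Moldable Parallel Job Scheduling Methods for Revenue Maximization under Deadline Constraints on High-Performance-Computing-as-a-Service Platforms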

This section evaluates the proposed reservation-based dynamic scheduling approach and compares it with two previous methods: the moldable EDF method in [27] and Algorithm 3 in [35]. Note that moldable EDF was not developed for dynamic scheduling; the EDF + ASAP + AFAP combination in our reservation-based dynamic scheduling framework is in effect the dynamic version of moldable EDF.

Since our goal is revenue maximization, the evaluation first requires a profit model, a charge model, and a penalty model. The profit of an HPCaaS provider is defined as the income minus the penalty. The income is determined by the charge model. In traditional HPC systems dealing with rigid jobs, the number of processors used to run a job is specified by the user, and thus the cost of running a job is simple to calculate: it is usually proportional to the number of processors used multiplied by the job’s parallel runtime, i.e. the period of wall-clock time during which the processors are occupied. However, for moldable jobs, where the number of processors used is determined by the scheduler rather than the user, there is not yet a commonly adopted charge model. One complication is that, under the traditional charge model, using different numbers of processors for the same job leads to different costs, since the efficiency of parallel applications is usually below 100%. Although using more processors usually shortens a job’s turnaround time, it also costs more. Therefore, an HPCaaS provider might tend to use more processors for a job to maximize its income, which could be unfair from the users’ point of view. Since there is no commonly agreed charge model for moldable jobs yet, in this study we evaluated the proposed scheduling approach with two possible charge models. The first model, called the parallel-runtime model, is similar to the traditional charge model: the fee for running a job is proportional to the number of processors used multiplied by the job’s parallel runtime. In the second model, called the sequential-runtime model, the cost of running a job is determined by its equivalent sequential runtime, i.e. the execution time when running on one processor. We assume that a job’s sequential runtime can be obtained by execution-time conversion based on a parallel speedup model, e.g. the Amdahl’s Law model [34].
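The two charge models can be contrasted with a minimal sketch. It assumes Amdahl’s Law as the speedup model, with alpha denoting the parallelizable fraction of a job; the function names and the unit price `rate` are hypothetical and are not taken from the thesis.

```python
def amdahl_speedup(n: int, alpha: float) -> float:
    """Amdahl's Law speedup on n processors; alpha is the parallelizable fraction."""
    return 1.0 / ((1.0 - alpha) + alpha / n)


def parallel_runtime(t_seq: float, n: int, alpha: float) -> float:
    """Wall-clock runtime on n processors, converted from the sequential runtime."""
    return t_seq / amdahl_speedup(n, alpha)


def sequential_runtime(t_par: float, n: int, alpha: float) -> float:
    """Inverse conversion: equivalent one-processor runtime of an n-processor run."""
    return t_par * amdahl_speedup(n, alpha)


def parallel_runtime_charge(t_seq: float, n: int, alpha: float, rate: float = 1.0) -> float:
    """Parallel-runtime model: fee proportional to processors x parallel runtime.

    `rate` (price per processor-time unit) is an assumed parameter.
    """
    return rate * n * parallel_runtime(t_seq, n, alpha)


def sequential_runtime_charge(t_seq: float, rate: float = 1.0) -> float:
    """Sequential-runtime model: fee depends only on the sequential runtime."""
    return rate * t_seq


if __name__ == "__main__":
    # Compare both fees for one job as the processor count grows.
    for n in (1, 4, 16):
        print(n, round(parallel_runtime_charge(100.0, n, 0.6), 2),
              sequential_runtime_charge(100.0))
```

In this sketch, a job with a sequential runtime of 100 time units and alpha = 0.6 is charged 100 on one processor and 220 on four processors under the parallel-runtime model, while the sequential-runtime fee stays at 100 regardless of the allocation.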

For the penalty model, we explored two types of deadline-constrained jobs in the evaluation: soft-deadline and hard-deadline. In the soft-deadline scenario, all submitted jobs are executed. Some of them meet their deadlines, while the others miss them. No penalty is attributed to jobs meeting their deadlines, and the penalty of a job missing its deadline is defined as its finish time minus its deadline. In the hard-deadline scenario, the jobs missing their deadlines are further divided into two groups. The first group contains the jobs that are found to be unable to meet their deadlines right at submission. Those jobs are viewed as being rejected immediately after submission, and thus no penalties are assumed for them; the HPCaaS provider only loses some potential income. The second group contains the jobs that are found to be unable to meet their deadlines some time after submission, due to the partial re-scheduling triggered by a new job’s arrival. Jobs in the second group incur penalties determined by their equivalent sequential runtime. Therefore, for the second group, the HPCaaS provider not only loses some potential income but also pays the penalties.
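The penalty rules above can be written down as a short sketch. The text states only that the hard-deadline penalty is determined by the equivalent sequential runtime, so the linear form with `penalty_rate` below is an assumption, as are the `Outcome` names.

```python
from enum import Enum


class Outcome(Enum):
    """Possible fates of a hard-deadline job (names are illustrative)."""
    MET = "met"
    REJECTED_ON_SUBMISSION = "rejected"   # first group: no penalty
    MISSED_AFTER_ADMISSION = "missed"     # second group: penalized


def soft_deadline_penalty(finish: float, deadline: float) -> float:
    """Soft deadline: penalty is the over-deadline time, zero if the deadline is met."""
    return max(0.0, finish - deadline)


def hard_deadline_penalty(outcome: Outcome, t_seq: float, penalty_rate: float = 1.0) -> float:
    """Hard deadline: only jobs that miss after being admitted are penalized.

    The penalty is assumed proportional to the sequential runtime t_seq.
    """
    if outcome is Outcome.MISSED_AFTER_ADMISSION:
        return penalty_rate * t_seq
    return 0.0


def profit(income: float, penalty: float) -> float:
    """Profit of the provider: income minus penalty."""
    return income - penalty
```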

The experiments simulate a 128-processor homogeneous cluster and an online workload based on a public workload log [13]. The workload log contains 73496 records collected on a 128-node IBM SP2 machine at the San Diego Supercomputer Center (SDSC) from May 1998 to April 2000. After excluding some problematic records based on the completed field [13] in the log, the simulation experiments in this thesis use 56490 job records as the input workload. The speedup of a job with different numbers of processors is calculated using Amdahl’s Law [34].

In Amdahl’s Law [34], the parameter α denotes the fraction of a job’s computation that is parallelizable. We set α = 0.6 in the following experiments. The deadline of each job is given by the following formula.

D(i) = Tsub(i) + k * Texec(1,i)

where i is the job index; Tsub(i) is the submission time of job i; Texec(n,i) is the execution time of job i on n processors, so Texec(1,i) is its sequential runtime; and k is a random number drawn from a specified range, 0.1 to kmax. In the following experiments, kmax is set to 2.
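As an illustration of the deadline formula, the sketch below draws k and computes D(i). The text does not state the distribution of k, so the uniform draw is an assumption; the function names are hypothetical. The helper `min_feasible_k` follows directly from Amdahl’s Law: a job started at submission on n processors meets its deadline only if k is at least 1 divided by the speedup.

```python
import random
from typing import Optional


def assign_deadline(t_sub: float, t_seq: float, k_max: float = 2.0,
                    k_min: float = 0.1,
                    rng: Optional[random.Random] = None) -> float:
    """D(i) = Tsub(i) + k * Texec(1, i), with k drawn from [k_min, k_max].

    A uniform distribution is assumed here.
    """
    rng = rng or random.Random()
    k = rng.uniform(k_min, k_max)
    return t_sub + k * t_seq


def min_feasible_k(n: int, alpha: float) -> float:
    """Smallest k for which a job started at submission on n processors
    can still meet its deadline: k >= Texec(n, i) / Texec(1, i)."""
    return (1.0 - alpha) + alpha / n


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    print(assign_deadline(t_sub=1000.0, t_seq=500.0, rng=rng))
    print(min_feasible_k(128, 0.6))
```

Under these assumptions, with α = 0.6 and all 128 processors, `min_feasible_k` is about 0.405, so a job whose k falls below that value cannot meet its deadline even if it starts immediately on the whole machine.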

In the following experiments, we evaluated three different waiting queue sequencing policies. The first is Earliest Deadline First (EDF) [25], where jobs with earlier deadlines have higher priority. The second policy is Shortest Job First (SJF) [33], where jobs with shorter sequential runtime have higher priority. In contrast to SJF, the third policy, Largest Income First (LIF), tends to give higher priority to jobs with longer sequential runtime, since they are likely to bring more income.
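The three sequencing policies amount to different sort keys over the waiting queue, as the following sketch shows. The `Job` fields are hypothetical, and LIF is approximated by sorting on sequential runtime in descending order, following the description above rather than an actual income computation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Job:
    job_id: int
    deadline: float
    t_seq: float  # equivalent sequential runtime


# Each policy maps a job to a sort key; smaller key means higher priority.
POLICIES: Dict[str, Callable[[Job], float]] = {
    "EDF": lambda j: j.deadline,  # earliest deadline first
    "SJF": lambda j: j.t_seq,     # shortest sequential runtime first
    "LIF": lambda j: -j.t_seq,    # longest sequential runtime first (income proxy)
}


def sequence(queue: List[Job], policy: str) -> List[Job]:
    """Return the waiting queue ordered by the chosen policy (stable sort)."""
    return sorted(queue, key=POLICIES[policy])


if __name__ == "__main__":
    queue = [Job(1, 300.0, 50.0), Job(2, 100.0, 200.0), Job(3, 200.0, 10.0)]
    for name in POLICIES:
        print(name, [j.job_id for j in sequence(queue, name)])
```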

For the temporal allocation policies, we evaluated As Soon As Possible (ASAP) and As Late As Possible (ALAP). With the ASAP policy, the scheduler will check the resource reservation profile in a non-decreasing order of time from the earliest time instant. On the other hand, for the ALAP policy, the scheduler starts from the latest time instant in the profile, and then proceeds in a non-increasing order of time.

For the spatial allocation policies, As Few As Possible (AFAP) and As Many As Possible (AMAP) were evaluated in the experiments. With AFAP, the scheduler looks for a feasible processor allocation at a specific time instant in the profile that allows the job to meet its deadline, trying processor counts in increasing order starting from one. With AMAP, on the other hand, the scheduler checks the possible processor counts in decreasing order starting from a pre-defined threshold value. The threshold is set to prevent the system from being trapped in a purely serial execution scenario, where jobs are executed one by one with each job exclusively using all processors in the system. As indicated by Amdahl’s Law [34], the efficiency of a parallel application usually declines as the number of processors used increases. Therefore, the purely serial execution scenario would lead to poor overall system performance and should be avoided.
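The temporal and spatial policies together define the order in which (start time, processor count) candidates are tried. The sketch below shows only that search order; the reservation profile is abstracted into a caller-supplied `feasible` predicate, and all names, as well as the choice to let AFAP range up to the full machine size, are assumptions.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple


def candidates(instants: Iterable[float], total_procs: int, temporal: str,
               spatial: str, threshold: int) -> Iterator[Tuple[float, int]]:
    """Yield (time instant, processor count) pairs in policy order.

    temporal: "ASAP" scans instants earliest first, "ALAP" latest first.
    spatial:  "AFAP" tries 1, 2, ...; "AMAP" tries threshold, threshold-1, ...
    """
    if temporal not in ("ASAP", "ALAP") or spatial not in ("AFAP", "AMAP"):
        raise ValueError("unknown policy")
    times = sorted(instants, reverse=(temporal == "ALAP"))
    if spatial == "AFAP":
        counts = range(1, total_procs + 1)
    else:
        counts = range(min(threshold, total_procs), 0, -1)
    for t in times:
        for n in counts:
            yield t, n


def first_feasible(instants: Iterable[float], total_procs: int, temporal: str,
                   spatial: str, threshold: int,
                   feasible: Callable[[float, int], bool]) -> Optional[Tuple[float, int]]:
    """Return the first candidate accepted by `feasible`, or None."""
    for t, n in candidates(instants, total_procs, temporal, spatial, threshold):
        if feasible(t, n):
            return t, n
    return None


if __name__ == "__main__":
    # Toy job: sequential runtime 40, alpha = 0.6, deadline 45, all processors free.
    def meets_deadline(t: float, n: int) -> bool:
        return t + 40.0 * ((1.0 - 0.6) + 0.6 / n) <= 45.0

    for temporal in ("ASAP", "ALAP"):
        for spatial in ("AFAP", "AMAP"):
            print(temporal, spatial,
                  first_feasible([0.0, 10.0, 20.0], 8, temporal, spatial, 4, meets_deadline))
```

In the toy run, ASAP + AFAP picks one processor at time 0, whereas ALAP + AFAP picks three processors at time 20, the latest start that still meets the deadline with the fewest processors.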

The proposed approach was evaluated and compared with previous methods in different scenarios in terms of several performance metrics: deadline-miss rate, completion rate, average turnaround time, penalty, income, and profit. Among them, the completion rate is defined as the number of jobs meeting their deadlines divided by the total number of jobs submitted to the system. The deadline-miss rate is defined as the number of jobs that are unable to meet their deadlines and incur penalties, divided by the total number of jobs submitted.

Average turnaround time is one of the most commonly used performance metrics for comparing different batch job scheduling approaches, where turnaround time is calculated by subtracting the job submission time from the job finish time.
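The rate and turnaround metrics defined above can be computed as in the following sketch. The `JobRecord` fields are hypothetical, and averaging turnaround time only over jobs that actually ran (those with a finish time) is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class JobRecord:
    submit: float
    finish: Optional[float]  # None for a job rejected at submission
    met_deadline: bool
    penalized: bool          # missed its deadline and incurred a penalty


def completion_rate(jobs: List[JobRecord]) -> float:
    """Jobs meeting their deadlines divided by all submitted jobs."""
    return sum(j.met_deadline for j in jobs) / len(jobs)


def deadline_miss_rate(jobs: List[JobRecord]) -> float:
    """Penalty-incurring deadline misses divided by all submitted jobs."""
    return sum(j.penalized for j in jobs) / len(jobs)


def average_turnaround(jobs: List[JobRecord]) -> float:
    """Mean of (finish time - submission time) over jobs that were executed."""
    times = [j.finish - j.submit for j in jobs if j.finish is not None]
    return sum(times) / len(times)
```

Note that in hard-deadline scenarios the two rates need not sum to one, because jobs rejected at submission neither complete nor incur a penalty.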

Figures 5.1 and 5.2 are representative experimental results comparing the different combinations of temporal and spatial allocation policies of the reservation-based dynamic scheduling approach with the sequential-runtime charge model. For waiting queue sequencing, SJF was used in this scenario. In soft-deadline scenarios, all jobs are executed, and thus the incomes of the different methods are equal. Therefore, the profit is determined by the incurred penalty: lower penalties imply higher profits. The experimental results indicate that ASAP + AFAP leads to the largest profit.

Figures 5.3 and 5.4 compare different waiting queue sequencing policies, using the ASAP + AFAP combination for temporal and spatial allocation. The results show that SJF delivers the best performance. This is because the SJF policy has the potential to achieve the shortest average turnaround time for job scheduling, as illustrated in [33], which in turn tends to yield the shortest over-deadline periods. Figure 5.5 compares the average turnaround time resulting from the three waiting queue sequencing policies and confirms that SJF leads to the shortest average turnaround time.

Fig. 5.1 Penalty for soft-deadline jobs with sequential-runtime charge model and SJF policy


Fig. 5.2 Profit for soft-deadline jobs with sequential-runtime charge model and SJF policy

Fig. 5.3 Penalty for soft-deadline jobs with sequential-runtime charge model and ASAP+AFAP policy

Fig. 5.4 Profit for soft-deadline jobs with sequential-runtime charge model and ASAP+AFAP policy


Fig. 5.5 Average turnaround time for soft-deadline jobs with sequential-runtime charge model and ASAP+AFAP policy

For soft-deadline jobs with the parallel-runtime charge model, the experimental results show that SJF is still the best waiting queue sequencing policy. However, in contrast to the scenario with the sequential-runtime charge model, AMAP is now a better spatial allocation policy than AFAP: as shown in Figures 5.6 and 5.7, ASAP + AMAP achieves the largest profit. This implies that, as the number of processors used increases, the income grows more quickly than the penalty, because the degraded parallel efficiency at large processor counts, as reflected by Amdahl’s Law [34], increases the charged processor time.

Fig. 5.6 Penalty for soft-deadline jobs with parallel-runtime charge model and SJF policy

Fig. 5.7 Profit for soft-deadline jobs with parallel-runtime charge model and SJF policy


The following experiments deal with hard-deadline scenarios. Figures 5.8 to 5.12 are representative experimental results comparing the different combinations of temporal and spatial allocation policies of the reservation-based dynamic scheduling approach with the sequential-runtime charge model. For waiting queue sequencing, EDF was used in this scenario.

The results show that ALAP + AFAP achieves the highest completion rate, allowing the most jobs to finish execution before deadline, and thus leads to the largest income. In addition, ALAP + AFAP results in the lowest deadline miss rate, i.e. zero percent as shown in Figure 5.8, and thus incurs the least penalty. Therefore, ALAP + AFAP achieves the largest profit among all methods.

Figures 5.13 and 5.14 give an illustrative example showing why ALAP can achieve better performance than ASAP in hard-deadline scenarios. Table 5.1 shows the detailed attributes of the jobs in the example. In this example, ASAP leads to two more deadline misses, jobs 5.10 and 5.12. Comparing the resultant schedules in Figures 5.13 and 5.14, we can see that under ALAP jobs 6, 7, and 11 are reserved at later time periods than in the ASAP schedule, leaving room for jobs 12 and 14, which arrive later, to meet their deadlines and thus accommodating more jobs for execution.

Fig. 5.8 Deadline miss rate for hard-deadline jobs with sequential-runtime charge model and the EDF policy (x-axis: ASAP+AFAP, ASAP+AMAP, ALAP+AFAP, ALAP+AMAP; y-axis: deadline miss rate; bar labels in the extracted text: 0.004%, 0.147%, 0.000%, 2.305%)

Fig. 5.9 Completion rate for hard-deadline jobs with sequential-runtime charge model and the EDF policy

Fig. 5.10 Penalty for hard-deadline jobs with sequential-runtime charge model and the EDF policy

Fig. 5.11 Income for hard-deadline jobs with sequential-runtime charge model and the EDF policy

Fig. 5.12 Profit for hard-deadline jobs with sequential-runtime charge model and the EDF policy

Fig. 5.13 An example with ALAP

Fig. 5.14 An example with ASAP

[Figure residue: profit by allocation policy (x-axis: ASAP+AFAP, ASAP+AMAP, ALAP+AFAP, ALAP+AMAP; y-axis: profit; bar labels in the extracted text: 468060487, 444738911, 468094136, 436389391)]

Table 5.1 Job attributes and miss counts

Figures 5.15 to 5.17 compare the waiting queue sequencing policies, using the ALAP + AFAP combination, for hard-deadline scenarios with the sequential-runtime charge model. The experimental results show that EDF leads to the highest completion rate and the lowest rate of penalty-incurring deadline misses, resulting in the largest profit.


Fig. 5.15 Deadline miss rate for hard-deadline jobs with sequential-runtime charge model and ALAP+AFAP policy

Fig. 5.16 Completion rate for hard-deadline jobs with sequential-runtime charge model and ALAP+AFAP policy

Fig. 5.17 Profit for hard-deadline jobs with sequential-runtime charge model and ALAP+AFAP policy

For scenarios with the parallel-runtime charge model, as shown in Figures 5.18 to 5.22, the ALAP + AMAP combination in general achieves the largest income among all temporal and spatial allocation methods, although it leads to neither the highest completion rate nor the lowest deadline miss rate. As in the soft-deadline scenario with the parallel-runtime charge model, this is because, as the number of processors used increases, the income grows more quickly than the penalty, owing to the degraded parallel efficiency at large processor counts reflected by Amdahl’s Law [34]. The highest income of ALAP + AMAP gives it the largest profit.


Fig. 5.18 Deadline miss rate for hard-deadline jobs with parallel-runtime charge model and the LIF policy

Fig. 5.19 Completion rate for hard-deadline jobs with parallel-runtime charge model and the LIF policy

Fig. 5.20 Penalty for hard-deadline jobs with parallel-runtime charge model and the LIF policy

Fig. 5.21 Income for hard-deadline jobs with parallel-runtime charge model and the LIF policy

Fig. 5.22 Profit for hard-deadline jobs with parallel-runtime charge model and the LIF policy

For the waiting queue sequencing policies, the Largest Income First (LIF) policy is effective as shown in Figures 5.23 to 5.27. It leads to the largest income, shown in Figure 5.26, although with the lowest completion rate, shown in Figure 5.24. This implies that LIF can effectively maximize the income by favoring larger jobs. Since it also leads to the lowest deadline miss rate and thus the lowest penalty, shown in Figures 5.23 and 5.25, the LIF policy finally achieves the largest profit as shown in Figure 5.27.


Fig. 5.23 Deadline miss rate for hard-deadline jobs with parallel-runtime charge model and the ALAP+AMAP policy

Fig. 5.24 Completion rate for hard-deadline jobs with parallel-runtime charge model and the ALAP+AMAP policy

Fig. 5.25 Penalty for hard-deadline jobs with parallel-runtime charge model and the ALAP+AMAP policy


Fig. 5.26 Income for hard-deadline jobs with parallel-runtime charge model and the ALAP+AMAP policy

Fig. 5.27 Profit for hard-deadline jobs with parallel-runtime charge model and the ALAP+AMAP policy

The following experiments compare the proposed reservation-based dynamic scheduling approach with previous methods, including Algorithm 3 in [35] and a dynamic version of the moldable EDF method in [27]. Since the two previous methods were developed for hard-deadline jobs, the comparisons were conducted with two hard-deadline scenarios based on the sequential-runtime and parallel-runtime charge models, respectively. The experimental results for the sequential-runtime charge model in Figures 5.28 to 5.30 show that our approach, AFAP+ALAP+EDF, achieves the lowest deadline-miss penalty and the highest income, thus leading to the largest profit. Our approach achieves 7% and 20% increases in profit compared to moldable EDF and Algorithm 3, respectively.

Fig. 5.28 Comparing different methods by penalty for hard-deadline jobs with the sequential-runtime charge model

Fig. 5.29 Comparing different methods by income for hard-deadline jobs with the sequential-runtime charge model

Fig. 5.30 Comparing different methods by profit for hard-deadline jobs with the sequential-runtime charge model

For the parallel-runtime charge model, although our approach, AMAP+ALAP+LIF, incurs more deadline-missed penalties compared to moldable EDF, shown in Figure 5.31, it still achieves the largest profit as shown in Figure 5.33. This is because our approach brings a larger income than other methods. The increase in profit achieved by our approach is 72% and 85%, compared to moldable EDF and Algorithm 3, respectively.

Fig. 5.31 Comparing different methods by penalty for hard-deadline jobs with the parallel-runtime charge model

Fig. 5.32 Comparing different methods by income for hard-deadline jobs with the parallel-runtime charge model

Fig. 5.33 Comparing different methods by profit for hard-deadline jobs with the parallel-runtime charge model

Figure 5.34 compares the scheduling overhead of the evaluated methods in terms of their execution time for scheduling all 56490 jobs in the experiments. Algorithm 3 in [35] has the least overhead, since it does not perform resource reservation for each job in the waiting queue. Our AMAP+ALAP+LIF approach for the parallel-runtime charge model has lower overhead than both moldable EDF [27] and our other method for the sequential-runtime charge model, because the AMAP policy tries processor counts starting from the largest value and can thus find a count that meets the deadline more quickly than the AFAP policy, which starts from one processor. Our AFAP+ALAP+EDF approach for the sequential-runtime charge model has the largest overhead, since the ALAP policy, which starts from the latest time instant, might need to try more time instants than the ASAP policy before finding an appropriate one, although it has the advantage of a lower deadline-miss rate.

Fig. 5.34 Comparison of scheduling overhead of different methods

In the proposed reservation-based dynamic scheduling approach described in Algorithm 1, the scheduler performs partial re-scheduling when a new job arrives. Partial re-scheduling has both pros and cons. It might cause some jobs that already hold resource reservations sufficient to meet their deadlines to miss those deadlines. On the other hand, it raises the probability that the new job can reserve appropriate resources to meet its deadline. In the following, we present a series of experiments, Figures 5.35 to 5.40, to evaluate the effectiveness of partial re-scheduling on job arrival in hard-deadline scenarios. In the figures, "with commitment" means that no partial re-scheduling is performed, and "without commitment" indicates that partial re-scheduling is enabled. Under the "with commitment" policy, once a job obtains a resource reservation that meets its deadline upon submission, the reservation is guaranteed not to change. Therefore, there are no deadline-miss penalties, as shown in Figures 5.35 and 5.38. This is its potential benefit.


The experimental results show that, in general, the "with commitment" policy achieves larger profits, as shown in Figures 5.37 and 5.40. However, its completion rate is lower than that of the "without commitment" policy for the sequential-runtime charge model, shown in Figure 5.36, while being higher for the parallel-runtime charge model, shown in Figure 5.39.

Fig. 5.35 Evaluation of deadline-miss rate for partial re-scheduling in hard-deadline scenarios with the sequential-runtime charge model

Fig. 5.36 Evaluation of completion rate for partial re-scheduling in hard-deadline scenarios with the sequential-runtime charge model

Fig. 5.37 Evaluation of profit for partial re-scheduling in hard-deadline scenarios with the sequential-runtime charge model

Fig. 5.38 Evaluation of deadline-miss rate for partial re-scheduling in hard-deadline scenarios with the parallel-runtime charge model

Fig. 5.39 Evaluation of completion rate for partial re-scheduling in hard-deadline scenarios with the parallel-runtime charge model

Fig. 5.40 Evaluation of profit for partial re-scheduling in hard-deadline scenarios with the parallel-runtime charge model
