Conclusions - A Study of Moldable Job Scheduling on High-Performance Computing as a Service Platforms

Traditional supercomputing centers usually adopt rigid job scheduling, which requires users to specify the number of processors upon job submission and then allocates computing resources to each job according to that requirement. However, if the specified number of processors exceeds the currently available resources, the job has to wait while the available resources sit idle, which degrades both resource utilization and job turnaround time. Most modern parallel applications, e.g. those written with MPI, have the moldable property, which allows them to execute with different degrees of parallelism. In such cases, traditional rigid job scheduling is inappropriate. Moldable job scheduling has therefore become an important research topic, aiming to improve overall system performance through adaptive processor allocation that takes advantage of the moldable property.

Moreover, with the emergence of cloud computing, the concept of HPC as a Service (HPCaaS) was recently proposed to transform HPC facilities and applications into a more convenient and accessible service model. HPCaaS users simply want their jobs finished as soon as possible; they may not want to specify, or may not know how to specify, an appropriate number of processors for an application's execution. Moldable job scheduling can therefore contribute to HPCaaS in two important ways. First, it relieves users of the burden of selecting an appropriate number of processors upon job submission, leading to an easier and more convenient user experience. Second, it has the potential to improve the average turnaround time of parallel applications and the overall resource utilization, benefiting both HPCaaS users and providers.

In this thesis, we classify previous research on moldable job scheduling into four quadrants according to two aspects: whether the allocation decision is made at submit time or at schedule time, and whether job runtime information is available. Based on this classification, we make three contributions to moldable job scheduling, all of which take advantage of information from applications' speedup models. The first contribution, called auxiliary moldable job scheduling, is a feasible extension to the usage model of most current HPC centers. It improves overall system performance by limiting the maximum number of processors allowed for each job to the knee value calculated from the application's speedup model. In the second contribution, we propose a moldable job scheduling approach for HPCaaS that automatically selects the most appropriate number of processors for a job's execution based on the application's speedup model and the workload conditions at that moment, relieving users of the burden of selecting a processor count upon job submission. The approaches in these first two contributions do not require job runtime information. In the third contribution, we propose an advanced moldable job scheduling approach that takes advantage of job runtime information to further improve overall system performance.
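The idea of capping a job's processor request at the knee of its speedup curve can be illustrated with a minimal sketch. It rests on two assumptions that are not stated in this chapter: the speedup model is Amdahl's law with a known serial fraction, and the knee is taken to be the processor count that maximizes the ratio of efficiency to execution time, which is proportional to S(n)^2 / n. The function names `amdahl_speedup`, `knee`, and `cap_request` are hypothetical and serve illustration only.

```python
from typing import Callable


def amdahl_speedup(n: int, serial_fraction: float) -> float:
    """Speedup on n processors under Amdahl's law (assumed model)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)


def knee(speedup: Callable[[int], float], max_procs: int) -> int:
    """Processor count in 1..max_procs that maximizes S(n)^2 / n.

    Assumption: the knee is the point where efficiency divided by
    execution time is largest; with E(n) = S(n)/n and T(n) = T(1)/S(n),
    that ratio is proportional to S(n)^2 / n.
    """
    return max(range(1, max_procs + 1), key=lambda n: speedup(n) ** 2 / n)


def cap_request(requested: int, knee_value: int) -> int:
    """Limit a user's processor request to the knee value."""
    return min(requested, knee_value)


if __name__ == "__main__":
    # Hypothetical job with a 5% serial fraction on a 128-processor system.
    k = knee(lambda n: amdahl_speedup(n, 0.05), 128)
    print("knee:", k)
    print("granted for a request of 64:", cap_request(64, k))
```

Under Amdahl's law this criterion has the closed form n = (1 - f) / f for serial fraction f, so the search is only needed for speedup models without a convenient analytical maximum.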

The proposed moldable job scheduling approaches were evaluated through a series of simulation experiments and compared with previous methods in the literature. The experimental results indicate that our approaches significantly outperform existing methods, improving average turnaround time by up to 83%, 78%, and 89% for the three proposed approaches, respectively.

Based on the results and experience from this thesis, two research directions are worth further exploration. The first is to evaluate the effects of inaccurate runtime estimation on the proposed moldable job scheduling methods, since job runtime plays an important role in allocation decisions. Although accurate runtime estimation is more likely in HPCaaS usage scenarios than on traditional HPC platforms, 100% accuracy is still not possible. It is therefore desirable to evaluate the influence of runtime estimation accuracy.

The second future research direction is to extend the proposed moldable job scheduling approach, when the parallel allocation policy is adopted, by exploring different resource partitioning strategies. In this thesis, a simple equal-partition strategy is used under the parallel allocation policy: if there are n jobs in the queue, each job is allocated 1/n of the resources. Other partitioning strategies that account for workload differences among the jobs are worth further exploration and evaluation.
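The equal-partition strategy described above can be sketched as follows. This is a minimal illustration rather than the scheduler used in the experiments: the function name `equal_partition` is hypothetical, and the handling of a processor count that does not divide evenly (leftover processors go to the jobs at the head of the queue) is an assumption, since the rounding rule is not specified in this chapter.

```python
def equal_partition(total_procs: int, num_jobs: int) -> list[int]:
    """Split total_procs as evenly as possible among num_jobs queued jobs.

    Assumption: when total_procs is not a multiple of num_jobs, the
    leftover processors are handed out one each to the first jobs
    in queue order.
    """
    if num_jobs <= 0:
        return []
    base, extra = divmod(total_procs, num_jobs)
    return [base + (1 if i < extra else 0) for i in range(num_jobs)]


if __name__ == "__main__":
    # Hypothetical example: 10 processors shared by 3 queued jobs.
    print(equal_partition(10, 3))
```

A workload-aware strategy would replace the uniform share with weights derived from each job's expected work, which is the direction suggested for future study.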

