This thesis proposes and evaluates four dynamic scheduling approaches for online moldable jobs with deadlines. We investigated two key design aspects of these approaches. The first is how to order the jobs in the waiting queue when arranging resource allocation; three waiting queue sequencing policies, EDF, LLF, and SJF, are evaluated in the thesis. The second is how to allocate resources, for which we propose and evaluate three allocation mechanisms: SNP, TLNP, and EST. The four approaches are explored thoroughly in combination with the different waiting queue sequencing policies and allocation mechanisms, towards the goal of HPC as a Service.
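As an illustration of how the three waiting queue sequencing policies differ, the following minimal Python sketch orders the same queue under EDF (earliest deadline first), LLF (least laxity first), and SJF (shortest job first). The `Job` fields, the `sequence` helper, and the sample values are hypothetical and are not taken from the thesis. In particular, the sketch assumes a single estimated runtime per job, whereas the runtime of a moldable job depends on the number of processors it is eventually allocated.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    deadline: float  # absolute deadline
    runtime: float   # estimated runtime at a reference allocation (assumption)


def edf_key(job: Job, now: float) -> float:
    # Earliest Deadline First: smaller deadline goes first.
    return job.deadline


def llf_key(job: Job, now: float) -> float:
    # Least Laxity First: laxity = time left until deadline minus runtime.
    return job.deadline - now - job.runtime


def sjf_key(job: Job, now: float) -> float:
    # Shortest Job First: smaller estimated runtime goes first.
    return job.runtime


POLICIES = {"EDF": edf_key, "LLF": llf_key, "SJF": sjf_key}


def sequence(queue, policy, now=0.0):
    """Return the waiting queue ordered by the chosen policy."""
    key = POLICIES[policy]
    return sorted(queue, key=lambda job: key(job, now))


# Hypothetical sample queue.
queue = [
    Job("A", deadline=100, runtime=80),
    Job("B", deadline=60, runtime=10),
    Job("C", deadline=90, runtime=30),
]

for policy in POLICIES:
    print(policy, [job.name for job in sequence(queue, policy)])
```

With these sample values the three policies produce three different orders, which is why the choice of sequencing policy interacts with the allocation mechanism applied afterwards.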
Online scheduling of moldable jobs with deadlines is an important research issue from the following two perspectives. From the perspective of parallel job scheduling, scheduling moldable jobs is becoming more important than ever, since most modern parallel application programs, such as the well-known HPC benchmark program HPL, can be designed to have the moldable property. This property can greatly help schedulers deliver efficient schedules by taking advantage of the flexibility in jobs’ parallelism. However, most previous research on parallel job scheduling focuses on rigid jobs. Comparatively little attention has been paid to moldable job scheduling, and even fewer works deal with moldable jobs with deadlines. Therefore, this thesis explores the issues of scheduling moldable jobs with deadlines.
From the perspective of HPC as a Service, it is no longer enough for an HPC system to serve only best-effort rigid jobs as before. Flexibility and QoS play an important role in service-based computing, so support for moldable jobs with deadlines becomes an inevitable requirement in HPCaaS systems. Moreover, new performance metrics are needed for evaluating different scheduling approaches from a service-oriented perspective. Therefore, in the simulation experiments, in addition to the traditional average turnaround time, we compared the proposed scheduling approaches in terms of three other performance metrics: deadline miss rate, total profit, and lag of discard. We also evaluated the proposed scheduling approaches with a mix of best-effort jobs and deadline-constrained jobs, which is likely to be a common scenario in real HPCaaS systems. The experimental results indicate that no single approach wins in all scenarios. It is therefore important to select a scheduling approach carefully according to the target performance metrics, QoS concerns, and workload characteristics.
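To make two of these metrics concrete, the sketch below computes average turnaround time and deadline miss rate over a mixed log of best-effort and deadline-constrained jobs. It is only an illustration under stated assumptions: the `Finished` record, its field names, and the sample log are hypothetical, turnaround is taken as finish time minus submit time, and the miss rate is taken as the fraction of deadline-constrained jobs that finish after their deadline. The thesis's own definitions, for example how discarded jobs are counted, may differ. Total profit and lag of discard are left out because their definitions are not restated here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finished:
    submit: float
    finish: float
    deadline: Optional[float] = None  # None marks a best-effort job (assumption)


def average_turnaround(jobs):
    """Mean of (finish - submit) over all jobs."""
    return sum(job.finish - job.submit for job in jobs) / len(jobs)


def deadline_miss_rate(jobs):
    """Fraction of deadline-constrained jobs that finished after their deadline."""
    constrained = [job for job in jobs if job.deadline is not None]
    if not constrained:
        return 0.0
    missed = sum(1 for job in constrained if job.finish > job.deadline)
    return missed / len(constrained)


# Hypothetical simulation log: three deadline jobs and one best-effort job.
log = [
    Finished(submit=0, finish=10, deadline=15),
    Finished(submit=0, finish=30, deadline=20),   # misses its deadline
    Finished(submit=5, finish=25),                # best-effort
    Finished(submit=10, finish=20, deadline=20),  # finishes exactly on time
]

print(average_turnaround(log))
print(deadline_miss_rate(log))
```

Excluding best-effort jobs from the miss-rate denominator is a design choice of this sketch; it keeps the metric comparable across workloads with different mixes of the two job types.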
Several promising research directions might further improve the performance of scheduling mechanisms for moldable jobs with deadlines. For example, there might be an effective compromise between the greedy FM and conservative RB approaches that resolves both the starvation of the first job in the queue and unnecessary resource idling. Another valuable direction is to develop new scheduling algorithms that eliminate, or at least minimize, the lag of discard, which is important for the quality of service of HPCaaS environments.