In this thesis, we propose a loading-driven adaptive algorithm, named LASA, which dynamically schedules real-time tasks with fault-tolerant capability, based on the Primary Backup model, in a heterogeneous multiprocessor system. Through simulations, we have evaluated the performance of the proposed algorithm in comparison with FTMA and DNA. In this chapter, we draw conclusions and outline future work.
5.1 Conclusion
In summary, our new algorithm has the following main features and contributions:
(1) In our simulation results, the integrated heuristic function used for task selection in our algorithm achieves a higher Guarantee Ratio than the other functions evaluated. This function selects a task that can be finished earlier and has a smaller deadline.
(2) When scheduling backup copies, we have shown that the ALAP strategy works more effectively with backup deallocation than the ASAP and MNO strategies.
(3) By adding a waiting queue, tasks with larger deadlines have more opportunities to utilize the reclaimed backup slots under heavy system loading. Thus the Guarantee Ratio is improved. Meanwhile, we use the latest start time to limit the waiting time. If a deferred task ultimately cannot be feasibly scheduled, it is rejected well before its deadline so that the system still has enough time to execute error handling routines.
(4) The loading-driven adaptation strategy allows the scheduler to stop scheduling backup copies temporarily when the loading exceeds a predetermined threshold. Because this reduces the resources reserved for backup copies that might never be used, the Guarantee Ratio is increased with only a minor degradation of reliability.
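To make features (3) and (4) concrete, the following is a minimal sketch of the per-task decision they describe. It is an illustration under stated assumptions, not the LASA implementation: the names `Task`, `min_exec_time`, and `decide` are hypothetical, and the latest start time is assumed here to be the deadline minus the shortest execution time over all processors.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    deadline: float
    # Assumption: shortest execution time of the task over all processors.
    min_exec_time: float

    @property
    def latest_start_time(self) -> float:
        # Assumed definition: starting later than this misses the deadline.
        return self.deadline - self.min_exec_time


def decide(task: Task, now: float, loading: float, threshold: float,
           primary_ok: bool, backup_ok: bool) -> str:
    """Return 'accept', 'defer', or 'reject' for one scheduling attempt.

    primary_ok / backup_ok say whether a feasible slot was found for the
    primary and backup copy; how slots are searched is outside this sketch.
    """
    # Feature (4): backups are skipped while loading exceeds the threshold.
    need_backup = loading <= threshold
    if primary_ok and (backup_ok or not need_backup):
        return "accept"
    # Feature (3): keep the task in the waiting queue only while it can
    # still start in time; otherwise reject it before its deadline.
    if now < task.latest_start_time:
        return "defer"
    return "reject"
```

In this sketch a deferred task would be retried whenever a backup slot is reclaimed, and the rejection happens at the latest start time rather than at the deadline, which is what leaves time for error handling.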
5.2 Future work
Beyond the features summarized above, several issues remain that are worth further investigation.
(1) In our task model, we assume that all tasks have the same importance. However, it would be more desirable to allow tasks to have different levels of importance. More critical tasks should have higher priority to be accepted with both primary and backup copies scheduled, even if some less important tasks must be deallocated from the current schedule.
Moreover, we can schedule more than two copies for a task that has a higher level of importance. In this model, defining the importance level of tasks would be a difficult problem; it may have to be derived from the workload of real-world applications. In addition, the performance metric requires a more sophisticated design than the Guarantee Ratio, because the importance level of each individual task should be taken into consideration.
(2) We assume that each task has a deadline large enough for its primary and backup slots to be scheduled without overlapping in time. The model would be more general if the primary and backup copies were allowed to overlap in time. Even for a task with a large deadline, this approach can still be used if the only two available time slots happen to overlap. Further, if more than two copies are required for reliability, some of them will have to overlap in time. Although this scheme requires more computing resources, we can design an effective adaptation mechanism to dynamically switch between the scheduling strategies.
(3) We only monitor the system loading for switching scheduling strategies, but many other kinds of system status may also be useful. As mentioned in Section 2.2, the concept of monitoring the fault rate was proposed in the feedback-based adaptive scheduling scheme [14]. This can be extended to an adaptation strategy. In addition, the number of tasks in the task queue is not considered in our algorithm. The total resource requirements of all tasks in the task queue may also be taken into consideration. We expect that a scheduler whose adaptation strategy integrates more than one kind of system status, such as system loading, failure rate, and number of incoming tasks, would be more effective.
(4) The threshold values are constant in our algorithm. They could also be made adaptive to different workload scenarios, which would require a more sophisticated system model and a more complex performance metric. As mentioned in Section 4.2.2, because we want our algorithm to still tolerate most task failures and we assume that CF ≅ CR, we simply assign larger values to LA and LR. In the future, we may define a more precise system model in which the tasks have different priorities, contributions, and penalties. Then, we can develop a method that dynamically adjusts LA and LR according to the statistics of the penalty and the fault rate.
References
[1] K. Ramamritham, J. A. Stankovic, and Perng-fei Shiah, “Efficient Scheduling Algorithms for Real-time Multiprocessor Systems”, IEEE Trans. on Parallel and Distributed Systems, Vol. 1, No. 2, pp. 184-194, April 1990.
[2] B. Hamidzadeh and Y. Atif, “Dynamic Scheduling of Real-time Tasks, by Assignment”, IEEE Concurrency, Vol. 6, Issue 4, pp. 14-25, Oct.-Dec. 1998.
[3] J. A. Stankovic and K. Ramamritham, “The Spring Kernel: A New Paradigm for Real-Time Systems”, IEEE Trans. Software Eng., Vol. 8, Issue 3, pp. 62-72, May 1991.
[4] S. Ghosh, R. Melhem, and D. Mosse, “Fault-Tolerance Through Scheduling of Aperiodic Tasks in Hard Real-Time Multiprocessor Systems”, IEEE Trans. on Parallel and Distributed Systems, Vol. 8, No. 3, pp. 272-284, March 1997.
[5] G. Manimaran and C. S. R. Murthy, “A Fault-Tolerant Dynamic Scheduling Algorithm for Multiprocessor Real-Time Systems and Its Analysis”, IEEE Trans. on Parallel and Distributed Systems, Vol. 9, No. 11, pp. 1137-1152, Nov. 1998.
[6] R. Al-Omari, G. Manimaran, and A. K. Somani, “An Efficient Backup-overloading for Fault-tolerant Scheduling of Real-time Tasks”, Proc. of IEEE Workshop on Fault-tolerant Parallel and Distributed Systems, pp. 1291-1295, 2000.
[7] R. Al-Omari, A. K. Somani, and G. Manimaran, “A New Fault-tolerant Technique for Improving Schedulability in Multiprocessor Real-time systems”, Proc. of 15th International Parallel and Distributed Processing Symposium, April 2001.
[8] R. Al-Omari, A. K. Somani, and G. Manimaran, “Efficient Overloading Techniques for Primary-Backup Scheduling in Real-Time Systems”, Journal of Parallel and Distributed Computing, Vol. 64, No. 1, pp. 629-648, Jan. 2004.
[9] C. Shen, K. Ramamritham, and J. A. Stankovic, “Resource Reclaiming in Multiprocessor Real-Time Systems”, IEEE Trans. on Parallel and Distributed Systems, Vol. 4, No. 4, pp. 382-397, April 1993.
[10] X. Qin, H. Jiang, and D. R. Swanson, “An Efficient Fault-tolerant Scheduling Algorithm for Real-time Tasks with Precedence Constraints in Heterogeneous Systems”, Proc. of the 31st International Conference on Parallel Processing (ICPP 2002), pp. 360-368, Vancouver, British Columbia, Canada, Aug. 18-21, 2002.
[11] Y. H. Lee, M. D. Chang, and C. Chen, “Effective Fault-tolerant Scheduling Algorithms for Real-time Tasks on Heterogeneous Systems”, Proc. of National Computer Symposium, Dec. 2003.
[12] M. D. Chang, A Fault-tolerant Dynamic Scheduling Algorithm for Real-time Systems on Heterogeneous Multiprocessor, Master Thesis, National Chiao-Tung University, June 2004.
[13] S. Swaminathan and G. Manimaran, “A Value-based Scheduler Capturing Schedulability-Reliability Tradeoff in Multiprocessor Real-time Systems”, Journal of Parallel and Distributed Computing, Vol. 64, No. 5, pp. 629-648, May 2004.
[14] R. Al-Omari, A. K. Somani, and G. Manimaran, “An Adaptive Scheme for Fault-Tolerant Scheduling of Soft Real-Time Tasks in Multiprocessor Systems”, Proc. Intl. Conference on High Performance Computing (HiPC), Hyderabad, India, Dec. 2001.
[15] T. Tsuchiya, Y. Kakuda, and T. Kikuno, “A New Fault-Tolerant Scheduling Technique for Real-Time Multiprocessor Systems”, Proceedings of Second International Workshop on Real-Time Computing Systems and Applications, pp. 197-202, 1995.
[16] M. L. Dertouzos and A. K. Mok, “Multiprocessor On-Line Scheduling of Hard Real-Time Tasks”, IEEE Trans. Software Eng., Vol. 15, No. 12, pp. 1479-1506, Dec. 1989.
[17] J. W. S. Liu, W. K. Shih, K. J. Lin, R. Bettati, and J. Y. Chung, “Imprecise Computations”, Proc. IEEE, Vol. 82, No. 1, pp. 83-94, Jan. 1994.
[18] L. V. Mancini, “Modular Redundancy in a Message Passing System”, IEEE Trans. Software Eng., Vol. 12, No. 1, pp. 79-86, Jan. 1986.
[19] K. G. Shin and P. Ramanathan, “Real-Time Computing: A New Discipline of Computer Science and Engineering”, Proc. IEEE, Vol. 82, No. 1, pp. 6-24, Jan. 1994.