Abstract
Cloud computing has become popular among small businesses because it is cost-effective and provides on-demand access to software, hardware, and network services at any time from anywhere in the world. Efficient job scheduling in the Cloud is essential to optimize operational costs in data centers. A scheduler should therefore assign tasks to Virtual Machines (VMs) in a manner that speeds up execution, maximizes resource utilization, and meets users’ service-level agreements (SLAs) and other constraints such as deadlines. For this purpose, tasks can be prioritized based on their deadlines and lengths, and resources can be provisioned and released as needed. Moreover, to cope with unexpected execution situations or hardware failures, a fault-tolerance mechanism based on a hybrid of replication and re-submission can be employed. Most existing techniques improve performance but fall short in at least one respect: they prioritize tasks on a single criterion (usually the deadline), rely on a single fault-tolerance mechanism, or release resources immediately, which causes additional overhead. This research work proposes a new scheduler called the Deadline and fault-aware task Adjusting and Resource Managing (DFARM) scheduler. It dynamically acquires resources and schedules deadline-constrained tasks by considering both their lengths and deadlines, while providing fault tolerance through the hybrid replication–resubmission method. Besides acquiring resources, it also releases resources based on their boot time to lessen the costs due to reboots. The performance of the DFARM scheduler is compared to that of other scheduling algorithms, such as Random Selection, Round Robin, Minimum Completion Time, RALBA, and OG-RADL.
With comparable execution performance, the proposed DFARM scheduler reduces task-rejection rates by 2.34–9.53 times compared to the state-of-the-art schedulers on two benchmark datasets.
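To make the idea of deadline- and length-aware prioritization concrete, the following minimal sketch shows one generic way such a scheduler could order tasks and count rejections. It is not the DFARM algorithm: the abstract does not specify the priority rule, so the ordering used here (earliest deadline first, shorter task as tie-breaker), the earliest-finish VM choice, and all names (`Task`, `schedule`, `length_mi`, `deadline_s`, `vm_mips`) are assumptions made purely for illustration. Fault tolerance and dynamic resource provisioning are omitted.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    length_mi: float   # task length in million instructions (assumed unit)
    deadline_s: float  # absolute deadline in seconds, measured from time 0


def schedule(tasks, vm_mips):
    """Greedy deadline-aware mapping; returns (assignments, rejected)."""
    ready_at = [0.0] * len(vm_mips)  # time at which each VM becomes free
    assignments, rejected = {}, []
    # Assumed priority rule: earliest deadline first, shorter task breaks ties.
    for task in sorted(tasks, key=lambda t: (t.deadline_s, t.length_mi)):
        # Estimated completion time of this task on every VM.
        finish = [ready_at[i] + task.length_mi / mips
                  for i, mips in enumerate(vm_mips)]
        best = min(range(len(vm_mips)), key=finish.__getitem__)
        if finish[best] <= task.deadline_s:
            assignments[task.name] = best
            ready_at[best] = finish[best]
        else:
            # Even the fastest-finishing VM would miss the deadline.
            rejected.append(task.name)
    return assignments, rejected
```

In this simplified model, the task-rejection rate reported in the abstract would correspond to `len(rejected) / len(tasks)`; a scheduler that can also acquire additional VMs on demand would have a further option before rejecting a task.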
Data availability
No datasets were generated or analysed during the current study.
References
Hussain, A., Aleem, M., Khan, A., Iqbal, M., Islam, A.: RALBA: a computation-aware load balancing scheduler for cloud computing. Clust. Comput. 21, 09 (2018)
Hussain, A., Aleem, M., Iqbal, M., Islam, A.: SLA-RALBA: cost-efficient and resource-aware load balancing algorithm for cloud computing. J. Supercomput. 75, 10 (2019)
Marahatta, A., Wang, Y., Zhang, F., Kumar, A., Sah Tyagi, S., Liu, Z.: Energy-aware fault-tolerant dynamic task scheduling scheme for virtualized cloud data centers. Mob. Netw. Appl. 24, 1–15 (2019)
Zhou, A., Wang, S., Cheng, B., Zheng, Z., Yang, F., Chang, R.N., Lyu, M.R., Buyya, R.: Cloud service reliability enhancement via virtual machine placement optimization. IEEE Trans. Serv. Comput. 10(6), 902–913 (2017)
Saidi, K., Bardou, D.: Task scheduling and VM placement to resource allocation in cloud computing: challenges and opportunities. Clust. Comput. 26(5), 3069–3087 (2023)
AbdElfattah, E., Elkawkagy, M., El-Sisi, A.: A reactive fault tolerance approach for cloud computing. In: 2017 13th International Computer Engineering Conference (ICENCO), pp. 190–194 (2017)
Arabnejad, V., Bubendorfer, K., Ng, B.: Scheduling deadline constrained scientific workflows on dynamically provisioned cloud resources. Future Gener. Comput. Syst. 75, 348–364 (2017)
Kumar, M., Sharma, S.: Deadline constrained based dynamic load balancing algorithm with elasticity in cloud environment. Comput. Electr. Eng. 69, 395–411 (2018)
Nabi, S., Ahmed, M.: OG-RADL: overall performance-based resource-aware dynamic load-balancer for deadline constrained cloud tasks. J. Supercomput. 77, 07 (2021)
Garg, N., Singh, D., Singh Goraya, M.: Deadline aware energy-efficient task scheduling model for a virtualized server. SN Comput. Sci. 2(3), 169 (2021). https://doi.org/10.1007/s42979-021-00571-2
Chinnathambi, S., Santhanam, A., Rajarathinam, J., Maruthamuthu, S.: Scheduling and checkpointing optimization algorithm for byzantine fault tolerance in cloud clusters. Clust. Comput. 22, 11 (2019)
Adhikari, M., Amgoth, T.: Heuristic-based load-balancing algorithm for IaaS cloud. Future Gener. Comput. Syst. 81, 156–165 (2018)
Iftikhar, S., Ahmad, M.M.M., Tuli, S., Chowdhury, D., Xu, M., Gill, S.S., Uhlig, S.: HunterPlus: AI based energy-efficient task scheduling for cloud-fog computing environments. Internet Things 21, 100667 (2023)
Nazeri, M., Khorsand, R.: Energy aware resource provisioning for multi-criteria scheduling in cloud computing. Cybern. Syst. (2022). https://doi.org/10.1080/01969722.2022.2071409
Alaei, M., Khorsand, R., Ramezanpour, M.: An adaptive fault detector strategy for scientific workflow scheduling based on improved differential evolution algorithm in cloud. Appl. Soft Comput. 99, 106895 (2021)
Hussain, A., Aleem, M.: GoCJ: Google cloud jobs dataset for distributed and cloud computing infrastructures. Data 3(4), 38 (2018)
Braun, T.D., Siegel, H.J., Beck, N., Bölöni, L.L., Maheswaran, M., Reuther, A.I., Robertson, J.P., Theys, M.D., Yao, B., Hensgen, D., Freund, R.F.: A comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing systems. J. Parallel Distrib. Comput. 61(6), 810–837 (2001)
Zhu, X., Yang, L.T., Chen, H., Wang, J., Yin, S., Liu, X.: Real-time tasks oriented energy-aware scheduling in virtualized clouds. IEEE Trans. Cloud Comput. 2(2), 168–180 (2014)
Calheiros, R.N., Ranjan, R., Beloglazov, A., De Rose, C.A.F., Buyya, R.: CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms. Softw.: Pract. Exp. 41(1), 23–50 (2011). https://doi.org/10.1002/spe.995
Acknowledgements
The European Union (Horizon Europe Graph-Massivizer, 101093202) and the Austrian Research Promotion Agency (FFG Kärntner Fog, 888098) funded this work.
Author information
Contributions
Ahmad Awan: literature review, implementation, detailed solution design, experimentation. Muhammad Aleem: idea formulation, supervision, detailed solution design. Altaf Hussain: proposed architecture, draft review, validation of experiments. Radu Prodan: proposed architecture, experimental plan and design, write-up review.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Awan, A., Aleem, M., Hussain, A. et al. DFARM: a deadline-aware fault-tolerant scheduler for cloud computing. Cluster Comput 27, 9323–9344 (2024). https://doi.org/10.1007/s10586-024-04419-1
DOI: https://doi.org/10.1007/s10586-024-04419-1