DFARM: a deadline-aware fault-tolerant scheduler for cloud computing

Ahmad Awan¹,
Muhammad Aleem¹,
Altaf Hussain² &
Radu Prodan³

Abstract

Cloud computing has become popular among small businesses because it is cost-effective and provides on-demand services, including software, hardware, and networking, at any time and from anywhere in the world. Efficient job scheduling in the Cloud is essential to optimize operational costs in data centers. Scheduling should therefore assign tasks to Virtual Machines (VMs) in a way that speeds up execution, maximizes resource utilization, and meets users’ service-level agreements (SLAs) and other constraints such as deadlines. For this purpose, tasks can be prioritized based on their deadlines and lengths, and resources can be provisioned and released as needed. Moreover, to cope with unexpected execution situations or hardware failures, a fault-tolerance mechanism based on hybrid replication and resubmission can be employed. Most existing techniques improve performance but share certain pitfalls: they prioritize tasks based on a single value (usually the deadline), use only a single fault-tolerance mechanism, or release resources immediately, which causes more overhead. This work proposes a new scheduler, the Deadline and fault-aware task Adjusting and Resource Managing (DFARM) scheduler. DFARM dynamically acquires resources and schedules deadline-constrained tasks by considering both their lengths and deadlines, while providing fault tolerance through the hybrid replication–resubmission method. Besides acquiring resources, it also releases them based on their boot time to reduce costs due to reboots. The performance of the DFARM scheduler is compared with that of other scheduling algorithms, namely Random Selection, Round Robin, Minimum Completion Time, RALBA, and OG-RADL. With comparable execution performance, the proposed DFARM scheduler reduces task-rejection rates by 2.34–9.53 times compared to the state-of-the-art schedulers on two benchmark datasets.
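
The abstract names two ideas without giving their formulas: ordering tasks by both deadline and length, and combining replication with resubmission. The Python sketch below illustrates one possible reading of each and is not the authors' algorithm. The slack-based ordering, the tie-break on length, the units, and every name (`Task`, `slack`, `prioritize`, `run_with_fault_tolerance`, `run_once`) are assumptions introduced here for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Task:
    task_id: str
    length_mi: float   # task length in million instructions (assumed unit)
    deadline_s: float  # deadline in seconds after submission (assumed unit)


def slack(task: Task, vm_mips: float) -> float:
    """Time left before the deadline if the task runs on a VM of the given speed."""
    return task.deadline_s - task.length_mi / vm_mips


def prioritize(tasks: list[Task], vm_mips: float) -> list[Task]:
    """Order tasks by ascending slack; ties go to the longer task.

    This combines deadline and length into one key. It is an illustrative
    rule, not the priority function defined in the paper.
    """
    return sorted(tasks, key=lambda t: (slack(t, vm_mips), -t.length_mi))


def run_with_fault_tolerance(
    task: Task,
    run_once: Callable[[Task, int, int], bool],
    replicas: int = 2,
    max_resubmissions: int = 1,
) -> bool:
    """Hybrid replication-resubmission, in a simplified sequential form.

    Each attempt runs `replicas` copies of the task. If every copy fails,
    the task is resubmitted, up to `max_resubmissions` extra attempts.
    `run_once(task, attempt, replica)` is a hypothetical callback that
    returns True when that copy finishes successfully.
    """
    for attempt in range(max_resubmissions + 1):
        results = [run_once(task, attempt, r) for r in range(replicas)]
        if any(results):
            return True
    return False


if __name__ == "__main__":
    tasks = [
        Task("a", length_mi=1000, deadline_s=10),
        Task("b", length_mi=4000, deadline_s=6),
        Task("c", length_mi=500, deadline_s=2),
    ]
    # The task with the least slack is scheduled first.
    print([t.task_id for t in prioritize(tasks, vm_mips=1000)])
```

A task whose slack is negative on every available VM cannot meet its deadline. In a fuller model, that condition is where a scheduler would either acquire a faster VM or reject the task.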

Data availability

No datasets were generated or analysed during the current study.

Acknowledgements

The European Union (Horizon Europe Graph-Massivizer, 101093202) and the Austrian Research Promotion Agency (FFG Kärntner Fog, 888098) funded this work.

Author information

Authors and Affiliations

Department of Computer Science, National University of Computer and Emerging Sciences, Islamabad, Pakistan
Ahmad Awan & Muhammad Aleem
Department of Computer Science, KICSIT Kahuta Campus, Institute of Space Technology, Islamabad, Pakistan
Altaf Hussain
Institute of Information Technology, Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria
Radu Prodan

Contributions

Ahmad Awan: literature review, implementation, detailed solution design, experimentation. Muhammad Aleem: idea formulation, supervision, detailed solution design. Altaf Hussain: proposed architecture, draft review, experiment validation. Radu Prodan: proposed architecture, experimental plan and design, write-up review.

Corresponding author

Correspondence to Muhammad Aleem.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Awan, A., Aleem, M., Hussain, A. et al. DFARM: a deadline-aware fault-tolerant scheduler for cloud computing. Cluster Comput 27, 9323–9344 (2024). https://doi.org/10.1007/s10586-024-04419-1

Received: 22 November 2023
Revised: 11 February 2024
Accepted: 05 March 2024
Published: 20 April 2024
Issue Date: October 2024
DOI: https://doi.org/10.1007/s10586-024-04419-1
