Machine Learning-Based Scheduling and Resources Allocation in Distributed Computing

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13353)

Included in the following conference series:

International Conference on Computational Science

Abstract

In this work we study a promising approach to efficient online scheduling of job-flows in high-performance and distributed parallel computing. Most job-flow optimization approaches, including backfilling and microscheduling, require a priori knowledge of the full job queue to make optimization decisions. In the more general scenario, where user jobs are submitted individually, resource selection and allocation must be performed immediately, in online mode. We consider a prototype neural network model trained to make online optimization decisions based on a known optimal solution. For this purpose, we designed the MLAK algorithm, which solves the 0–1 knapsack problem with an a priori unknown utility function. In dedicated simulation experiments with different utility functions, MLAK provides resource selection efficiency comparable to that of a classical greedy algorithm.
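
The abstract compares MLAK against a classical greedy algorithm for the 0–1 knapsack problem. As a minimal sketch of that baseline only, the Python below selects items by decreasing utility-to-weight ratio and checks the result against an exact dynamic-programming solution. The `(weight, utility)` item representation, the function names, and the sample data are illustrative assumptions; the abstract does not describe MLAK's model or the utility functions used in the experiments, and they are not reproduced here.

```python
from typing import List, Tuple

# Hypothetical item representation: (integer weight > 0, utility).
Item = Tuple[int, float]


def greedy_knapsack(items: List[Item], capacity: int) -> List[int]:
    """Pick items in decreasing utility/weight order while they still fit."""
    order = sorted(
        range(len(items)),
        key=lambda i: items[i][1] / items[i][0],
        reverse=True,
    )
    chosen: List[int] = []
    remaining = capacity
    for i in order:
        weight = items[i][0]
        if weight <= remaining:
            chosen.append(i)
            remaining -= weight
    return sorted(chosen)


def optimal_knapsack(items: List[Item], capacity: int) -> float:
    """Exact 0-1 knapsack value via dynamic programming over capacities."""
    best = [0.0] * (capacity + 1)
    for weight, utility in items:
        # Iterate capacities downwards so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + utility)
    return best[capacity]


# Illustrative data only (not taken from the paper).
items = [(10, 60.0), (20, 100.0), (30, 120.0)]
picked = greedy_knapsack(items, 50)
greedy_value = sum(items[i][1] for i in picked)
exact_value = optimal_knapsack(items, 50)
```

On this sample the greedy choice is suboptimal (it takes the two items with the best ratios and leaves no room for the third), which illustrates why a known optimal solution is a stronger reference than the greedy heuristic when one is available.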

Acknowledgments

This work was supported by the Russian Science Foundation (project no. 22-21-00372).

Author information

Authors and Affiliations

National Research University “MPEI”, Moscow, Russia
Victor Toporkov, Dmitry Yemelyanov & Artem Bulkhak

Corresponding author

Correspondence to Victor Toporkov.

Editor information

Editors and Affiliations

Brunel University London, London, UK
Derek Groen
University of Amsterdam, Amsterdam, The Netherlands
Clélia de Mulatier
AGH University of Science and Technology, Krakow, Poland
Maciej Paszynski
University of Amsterdam, Amsterdam, The Netherlands
Valeria V. Krzhizhanovskaya
University of Tennessee at Knoxville, Knoxville, TN, USA
Jack J. Dongarra
University of Amsterdam, Amsterdam, The Netherlands
Peter M. A. Sloot

About this paper

Cite this paper

Toporkov, V., Yemelyanov, D., Bulkhak, A. (2022). Machine Learning-Based Scheduling and Resources Allocation in Distributed Computing. In: Groen, D., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2022. ICCS 2022. Lecture Notes in Computer Science, vol 13353. Springer, Cham. https://doi.org/10.1007/978-3-031-08760-8_1

DOI: https://doi.org/10.1007/978-3-031-08760-8_1
Published: 15 June 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08759-2
Online ISBN: 978-3-031-08760-8
eBook Packages: Computer Science, Computer Science (R0)
