Abstract
Microservices are containerized, loosely coupled, interacting units of an application that can be deployed, reused, and maintained independently. In a microservices-based application, allocating the right computing resources to each containerized microservice is important for meeting specific performance requirements while minimizing infrastructure cost. Microservices-based applications are easy to scale automatically based on incoming workload and resource demand. However, it is challenging to identify the right amount of resources for the containers hosting microservices and to allocate them dynamically during auto-scaling. Existing auto-scaling solutions for microservices focus on identifying the appropriate time and the number of containers to add or remove dynamically for an application. They do not address the issue of selecting the right amount of resources, such as CPU cores, for individual containers during each scaling event. This paper presents a novel approach to dynamically allocating CPU resources to containerized microservices during auto-scaling events. The proposed approach is based on a machine learning method that identifies the right amount of CPU resources for each container dynamically spawned for the microservices over time, so as to satisfy the application’s response time requirements. The proposed solution is evaluated using a benchmark microservices-based application under real-world workloads on a Kubernetes cluster. The experimental results show that the proposed solution outperforms the state-of-the-art baseline methods, yielding a 40% to 60% reduction in response time requirement violations at 0.5\(\times\) to 1.5\(\times\) lower cost.
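To make the idea of choosing a per-container CPU allocation concrete, the following is a minimal sketch and not the authors' method. It assumes a hypothetical response-time model of the form rt = a + b * (workload / cpu), fitted by ordinary least squares on synthetic samples, and then picks the smallest candidate CPU allocation whose predicted response time meets the target. The function names, the model form, the candidate CPU sizes, and all numbers are illustrative assumptions.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical model: rt_ms = intercept + slope * (workload / cpu).
# This form is an assumption made for illustration only.
Model = Tuple[float, float]


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_rt(model: Model, workload: float, cpu: float) -> float:
    """Predicted response time (ms) for a workload and a CPU allocation."""
    intercept, slope = model
    return intercept + slope * (workload / cpu)


def smallest_cpu_meeting_slo(
    model: Model, workload: float, slo_ms: float, candidates: Sequence[float]
) -> Optional[float]:
    """Return the smallest candidate CPU allocation whose predicted
    response time is within the target, or None if none qualifies."""
    for cpu in sorted(candidates):
        if predict_rt(model, workload, cpu) <= slo_ms:
            return cpu
    return None


# Synthetic samples (requests/s, CPU cores, response time in ms).
# These are made-up numbers, not measurements from the paper.
samples = [
    (100.0, 1.0, 70.0),
    (200.0, 1.0, 120.0),
    (200.0, 2.0, 70.0),
    (400.0, 2.0, 120.0),
    (300.0, 0.5, 320.0),
]
model = fit_line([w / c for w, c, _ in samples], [rt for _, _, rt in samples])

candidate_cpus = [0.5, 1.0, 2.0, 4.0]  # assumed container sizes, in cores
choice = smallest_cpu_meeting_slo(model, 400.0, 100.0, candidate_cpus)
print("chosen CPU cores:", choice)
```

Picking the smallest allocation that satisfies the target reflects the cost and performance trade-off described in the abstract; a real system would need a model trained on monitored data and a fallback policy for the case where no candidate qualifies.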
Data availability
Not applicable
Funding
No funding sources are applicable for this research.
Author information
Contributions
Conceptualization: NS, MA, & WI; methodology: NS, MA, WI, & AE; formal analysis and investigation: NS, MA, & WI; writing, original draft preparation: NS, MA, WI, & FB; writing, review and editing: MA, WI, FB, & AE; supervision: MA, WI, & FB.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest
About this article
Cite this article
Shafi, N., Abdullah, M., Iqbal, W. et al. Cdascaler: a cost-effective dynamic autoscaling approach for containerized microservices. Cluster Comput 27, 5195–5215 (2024). https://doi.org/10.1007/s10586-023-04228-y
DOI: https://doi.org/10.1007/s10586-023-04228-y