research-article

Humas: A Heterogeneity- and Upgrade-Aware Microservice Auto-Scaling Framework in Large-Scale Data Centers

Published: 27 November 2024

Abstract

An effective auto-scaling framework is essential for microservices to ensure performance stability and resource efficiency under dynamic workloads. As many prior studies have shown, the key to efficient auto-scaling lies in accurately learning performance patterns, i.e., the relationships between performance metrics and workloads, in data-driven schemes. However, characterizing performance patterns for large-scale microservices poses two significant challenges. First, diverse microservices exhibit varying sensitivities to heterogeneous machines, making it difficult to quantify performance differences in a fixed manner. Second, frequent version upgrades of microservices cause uncertain changes in performance patterns, known as pattern drifts, which lead to imprecise resource capacity estimates. To address these challenges, we propose Humas, a heterogeneity- and upgrade-aware auto-scaling framework for large-scale microservices. Humas quantifies the differences in resource efficiency among heterogeneous machines for each microservice online and normalizes their resources into standard units. It also develops a least-squares density-difference (LSDD) based algorithm to identify pattern drifts caused by upgrades. Finally, Humas generates capacity adjustment plans for microservices based on the latest performance patterns and predicted workloads. Experiments conducted on 50 real microservices with over 11,000 containers demonstrate that Humas improves resource efficiency and performance stability by approximately 30.4% and 48.0%, respectively, compared with state-of-the-art approaches.
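The least-squares density-difference idea mentioned in the abstract can be sketched as follows. This is a minimal illustration of LSDD estimation between two windows of metric samples, not Humas's actual implementation; the Gaussian kernel model, the kernel width `sigma`, the regularizer `lam`, and the use of all samples as kernel centres are assumptions made for the sketch. A larger score indicates a larger estimated gap between the two distributions, which a drift detector would compare against a threshold.

```python
import numpy as np

def lsdd_score(X, Y, sigma=1.0, lam=1e-3):
    """Least-squares density-difference score between an old sample window X
    and a new window Y (rows are observations). Approximates the integrated
    squared difference of the two underlying densities, ∫ (p(x) - q(x))^2 dx,
    using a Gaussian kernel model fitted in closed form."""
    C = np.vstack([X, Y])  # kernel centres: pool both windows
    d = C.shape[1]

    def K(A, B):
        # Gaussian kernel matrix k(a, b) = exp(-||a - b||^2 / (2 sigma^2))
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2.0 * sigma ** 2))

    # Closed form for H_{ll'} = ∫ k(x, c_l) k(x, c_l') dx with Gaussian kernels
    sq = ((C[:, None, :] - C[None, :, :]) ** 2).sum(-1)
    H = (np.pi * sigma ** 2) ** (d / 2.0) * np.exp(-sq / (4.0 * sigma ** 2))

    # h_l = E_p[k(x, c_l)] - E_q[k(x, c_l)], estimated from the two windows
    h = K(X, C).mean(axis=0) - K(Y, C).mean(axis=0)

    # Ridge-regularized least-squares fit of the density-difference model
    theta = np.linalg.solve(H + lam * np.eye(len(C)), h)

    # Plug-in estimate of ∫ (p - q)^2 dx
    return float(2.0 * h @ theta - theta @ H @ theta)
```

In a drift-detection setting, the score for two windows drawn from the same workload regime stays near zero, while a post-upgrade window whose performance pattern has shifted yields a clearly larger score.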




Published In

IEEE Transactions on Computers, Volume 74, Issue 3, March 2025, 360 pages

Publisher

IEEE Computer Society

United States


