DOI: 10.1145/3472883.3487013

Tell me when you are sleepy and what may wake you up!

Published: 01 November 2021

Abstract

Nowadays, there is a shift in the deployment model of Cloud and Edge applications. Applications are now deployed as sets of small units that communicate with each other: the microservice model. Moreover, each unit (a microservice) may be implemented as a virtual machine, container, function, etc., spanning the different Cloud and Edge service models, including IaaS, PaaS, and FaaS. A microservice is instantiated upon reception of a request (e.g., an HTTP packet or a trigger), and a rack-level or data-center-level scheduler decides the placement of each such unit of execution, considering, for example, data locality and load balancing. With such a configuration, it is common for different units, as well as multiple instances of the same unit, to run on a single server at the same time.
When multiple microservices run on the same server, not all of them are necessarily doing actual processing; some may be busy-waiting, i.e., waiting for events (or requests) sent by other units. However, these "idle" units consume CPU time that could be used by other running units or by cloud utility functions on the server (e.g., monitoring daemons). In a controlled experiment, we observe that units can spend up to 20% - 55% of their CPU time waiting, so a large amount of CPU time is wasted; these values grow significantly when CPU resources are overcommitted (i.e., when the units' CPU reservations exceed the server's CPU capacity), where we observe up to 69% - 75%. This results from the server CPU scheduler's lack of information or context about what is running in each unit.
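To illustrate why a busy-waiting unit looks "busy" to the CPU scheduler, the following minimal sketch compares the CPU time consumed while polling for an event with the CPU time consumed while blocking on it. It is not taken from the paper: the helper names (`cpu_share_while_waiting`, `busy_wait`, `blocking_wait`) and the 0.2 s wait are assumptions chosen for illustration, and a Python thread only stands in for a microservice unit.

```python
import threading
import time


def cpu_share_while_waiting(wait, duration=0.2):
    """Return the fraction of wall-clock time spent on CPU while waiting.

    `wait` is a callable that returns once the event is set. A timer thread
    stands in for another unit that sends the awaited event (an assumption).
    """
    event = threading.Event()
    timer = threading.Timer(duration, event.set)
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time of the whole process
    timer.start()
    wait(event)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    timer.join()
    return cpu_used / wall_used


def busy_wait(event):
    # Polls the flag in a loop: the waiter stays runnable and burns CPU.
    while not event.is_set():
        pass


def blocking_wait(event):
    # Blocks in the runtime/OS: the waiter is descheduled until the event fires.
    event.wait()


busy_share = cpu_share_while_waiting(busy_wait)
blocking_share = cpu_share_while_waiting(blocking_wait)
print(f"busy-waiting CPU share: {busy_share:.0%}")
print(f"blocking CPU share:     {blocking_share:.0%}")
```

The exact percentages depend on the machine and its load, but the polling variant is expected to use most of its waiting time on the CPU, whereas the blocking variant uses almost none. From the scheduler's point of view, nothing distinguishes the polling loop from useful work.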
In this paper, we first provide evidence of the problem and discuss several research questions. Then, we propose a handful of solutions worth exploring, which consist in revisiting hypervisor and host OS scheduler designs to reduce the CPU time wasted on idle units. Our proposal leverages the concepts of informed scheduling and of monitoring for internal and external events. Based on these solutions, we present our initial implementation on Linux/KVM.
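The idea of informed scheduling can be pictured with a toy model in which each unit tells the scheduler when it becomes idle and which events may wake it. The sketch below is purely hypothetical and is not the authors' Linux/KVM implementation: the `Unit` and `InformedScheduler` classes, the round-robin policy, and the event names are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Unit:
    name: str
    # Events that may wake the unit; empty means the unit has work to do.
    waiting_on: set = field(default_factory=set)

    @property
    def runnable(self):
        return not self.waiting_on


class InformedScheduler:
    """Toy round-robin scheduler that skips units known to be idle."""

    def __init__(self, units):
        self.units = {u.name: u for u in units}
        self.queue = deque(u.name for u in units)

    def sleep(self, name, events):
        # The unit reports that it is idle and what may wake it up.
        self.units[name].waiting_on = set(events)

    def notify(self, event):
        # An internal or external event occurred: wake every unit waiting on it.
        for unit in self.units.values():
            if event in unit.waiting_on:
                unit.waiting_on.clear()

    def pick_next(self):
        # Visit each unit at most once, returning the first runnable one.
        for _ in range(len(self.queue)):
            name = self.queue[0]
            self.queue.rotate(-1)
            if self.units[name].runnable:
                return name
        return None  # every unit is idle


sched = InformedScheduler([Unit("frontend"), Unit("backend")])
sched.sleep("frontend", {"reply"})  # frontend waits for a reply
picks_while_idle = [sched.pick_next() for _ in range(3)]
sched.notify("reply")  # the awaited event arrives
picks_after_wake = [sched.pick_next() for _ in range(2)]
print(picks_while_idle, picks_after_wake)
```

In this model, the idle unit receives no CPU time until one of the events it registered is observed, which is the behavior a scheduler cannot achieve when it has no information about what runs inside each unit.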



Cited By

  • (2023) Towards Latency-Aware Linux Scheduling for Serverless Workloads. Proceedings of the 1st Workshop on SErverless Systems, Applications and MEthodologies, 19-26. DOI: 10.1145/3592533.3592807. Online publication date: 8-May-2023


Published In

SoCC '21: Proceedings of the ACM Symposium on Cloud Computing
November 2021
685 pages
ISBN: 9781450386388
DOI: 10.1145/3472883
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SoCC '21: ACM Symposium on Cloud Computing
November 1 - 4, 2021
Seattle, WA, USA

Acceptance Rates

Overall Acceptance Rate 169 of 722 submissions, 23%

Article Metrics

  • Downloads (last 12 months): 56
  • Downloads (last 6 weeks): 4
Reflects downloads up to 23 Nov 2024

