DOI: 10.1145/3357223.3362730
Research article · Public Access

Pufferfish: Container-driven Elastic Memory Management for Data-intensive Applications

Published: 20 November 2019

Abstract

Data-intensive applications often suffer from significant memory pressure, resulting in excessive garbage collection (GC) and out-of-memory (OOM) errors that harm system performance and reliability. In this paper, we demonstrate how lightweight virtualization via OS containers opens up opportunities to address memory pressure and realize memory elasticity: 1) tasks running in a container can be given a large heap size to avoid OOM errors, and 2) tasks that are under memory pressure and incur significant swapping activity can be temporarily "suspended" by depriving the hosting containers of resources, and "resumed" when resources become available. We propose and develop Pufferfish, an elastic memory manager that leverages containers to flexibly allocate memory for tasks. The memory elasticity achieved by Pufferfish can be exploited by a cluster scheduler to improve cluster utilization and task parallelism. We implement Pufferfish on the cluster scheduler Apache YARN. Experiments with Spark and MapReduce on real-world traces show that Pufferfish avoids OOM errors, improves cluster memory utilization by 2.7x, and improves the median job runtime by 5.5x compared to a memory over-provisioning solution.
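The suspend/resume idea from the abstract can be sketched as a simple policy loop: a container under heavy swapping is deprived of memory (its pages become reclaimable for other tasks, while the JVM inside keeps its large heap and is never OOM-killed), and gets a full allotment back once cluster memory frees up. Everything below — the `Container` record, the thresholds, and `rebalance` — is a hypothetical illustration, not Pufferfish's actual API, which operates on container (cgroup) resource limits via the cluster scheduler.

```python
# Illustrative sketch of container-driven memory elasticity.
# All names and thresholds are hypothetical, not Pufferfish's real interface.
from dataclasses import dataclass

@dataclass
class Container:
    name: str
    memory_mb: int        # current container memory limit
    swap_rate: float      # observed swap activity (pages/sec)
    suspended: bool = False

SWAP_THRESHOLD = 100.0    # swap activity above this signals memory pressure
MIN_FOOTPRINT_MB = 64     # a suspended container keeps only a tiny reservation
FULL_ALLOTMENT_MB = 1024  # memory returned to a container on resume

def rebalance(containers, cluster_free_mb):
    """Suspend containers that swap heavily; resume them when memory frees up."""
    for c in containers:
        if not c.suspended and c.swap_rate > SWAP_THRESHOLD:
            # "Suspend": shrink the container so its memory can be
            # reclaimed for other tasks instead of thrashing.
            cluster_free_mb += c.memory_mb - MIN_FOOTPRINT_MB
            c.memory_mb = MIN_FOOTPRINT_MB
            c.suspended = True
    for c in containers:
        if (c.suspended and c.swap_rate <= SWAP_THRESHOLD
                and cluster_free_mb >= FULL_ALLOTMENT_MB):
            # "Resume": restore a full memory allotment once available
            # and pressure has subsided.
            cluster_free_mb -= FULL_ALLOTMENT_MB - MIN_FOOTPRINT_MB
            c.memory_mb = FULL_ALLOTMENT_MB
            c.suspended = False
    return cluster_free_mb
```

In a real deployment the "suspend" step would translate into lowering the container's cgroup memory (and CPU) limits through the container runtime, which is the lightweight-virtualization lever the paper exploits.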




Published In

SoCC '19: Proceedings of the ACM Symposium on Cloud Computing
November 2019, 503 pages
ISBN: 9781450369732
DOI: 10.1145/3357223
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. cloud computing
  2. cluster scheduling
  3. containerization
  4. memory management

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SoCC '19: ACM Symposium on Cloud Computing
November 20-23, 2019
Santa Cruz, CA, USA

Acceptance Rates

SoCC '19 paper acceptance rate: 39 of 157 submissions, 25%.
Overall acceptance rate: 169 of 722 submissions, 23%.

Article Metrics

  • Downloads (last 12 months): 166
  • Downloads (last 6 weeks): 17
Reflects downloads up to 19 Nov 2024

Cited By

  • (2024) Emma: Elastic Multi-Resource Management for Realtime Stream Processing. IEEE INFOCOM 2024, pp. 1581-1590. DOI: 10.1109/INFOCOM52122.2024.10621313
  • (2023) Let It Go: Relieving Garbage Collection Pain for Latency Critical Applications in Golang. Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing (HPDC '23), pp. 169-180. DOI: 10.1145/3588195.3592998
  • (2023) Adapt Burstable Containers to Variable CPU Resources. IEEE Transactions on Computers, 72(3):614-626. DOI: 10.1109/TC.2022.3174480
  • (2023) Container Restart Reduction Technique in Kubernetes Using Memory Oversubscription. 2023 IEEE 20th International Conference on Mobile Ad Hoc and Smart Systems (MASS), pp. 606-607. DOI: 10.1109/MASS58611.2023.00081
  • (2023) Latency-Oriented Elastic Memory Management at Task-Granularity for Stateful Streaming Processing. IEEE INFOCOM 2023, pp. 1-10. DOI: 10.1109/INFOCOM53939.2023.10228963
  • (2022) Improving Concurrent GC for Latency Critical Services in Multi-tenant Systems. Proceedings of the 23rd ACM/IFIP International Middleware Conference, pp. 43-55. DOI: 10.1145/3528535.3531515
  • (2022) Holmes. Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing (HPDC '22), pp. 110-121. DOI: 10.1145/3502181.3531464
  • (2021) Memory at your service. Proceedings of the 22nd International Middleware Conference, pp. 185-197. DOI: 10.1145/3464298.3493394
  • (2021) FlashByte: Improving Memory Efficiency with Lightweight Native Storage. 2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet Computing (CCGrid), pp. 61-70. DOI: 10.1109/CCGrid51090.2021.00016
  • (2021) Adaptive Online Estimation of Thrashing-Avoiding Memory Reservations for Long-Lived Containers. Collaborative Computing: Networking, Applications and Worksharing, pp. 620-639. DOI: 10.1007/978-3-030-67537-0_37
