NumaGiC: a Garbage Collector for Big Data on Big NUMA Machines

Published: 14 March 2015 Publication History

Abstract

On contemporary cache-coherent Non-Uniform Memory Access (ccNUMA) architectures, applications with a large memory footprint suffer from the cost of the garbage collector (GC): as the GC scans the reference graph, it makes many remote memory accesses, which saturate the interconnect between memory nodes. We address this problem with NumaGiC, a GC with a mostly-distributed design. To maximise memory access locality during collection, a GC thread avoids accessing a different memory node and instead notifies a remote GC thread with a message; nonetheless, NumaGiC avoids the drawbacks of a pure distributed design, which tends to decrease parallelism. We compare NumaGiC with Parallel Scavenge and NAPS on two different ccNUMA architectures, running on the HotSpot Java Virtual Machine of OpenJDK 7. On Spark and Neo4j, two industry-strength analytics applications, with heap sizes ranging from 160GB to 350GB, and on SPECjbb2013 and SPECjbb2005, NumaGiC improves overall performance by up to 45% over NAPS (up to 94% over Parallel Scavenge), and increases the performance of the collector itself by up to 3.6x over NAPS (up to 5.4x over Parallel Scavenge).
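
The message-passing idea in the abstract can be illustrated with a toy marking loop. This is only a sketch under simplifying assumptions, not NumaGiC's implementation: the names `Obj` and `mark_with_messages` are hypothetical, the "GC threads" are simulated by a sequential round-robin loop, each object carries an explicit `node` field standing in for the memory node that holds it, and the mechanisms NumaGiC uses to preserve parallelism are not modelled. The point is the rule itself: a node's collector marks only objects on its own node, and hands references to remote objects to the owning node's queue.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Obj:
    """Toy heap object. `node` is the (assumed) memory node that holds it."""
    name: str
    node: int
    refs: list = field(default_factory=list)
    marked: bool = False


def mark_with_messages(roots, num_nodes):
    """Mark everything reachable from `roots`; return the number of messages sent.

    Each loop iteration over `n` plays the role of the GC thread pinned to
    node n. That thread only marks objects whose home node is n. In a real
    collector the home node would be derived from the address, without
    touching the remote object; here it is read from the `node` field.
    """
    local_work = [deque() for _ in range(num_nodes)]  # per-node pending objects
    inbox = [deque() for _ in range(num_nodes)]       # per-node message queues
    messages = 0

    # Assumption: roots are handed to the node that owns them.
    for root in roots:
        local_work[root.node].append(root)

    progress = True
    while progress:
        progress = False
        for n in range(num_nodes):
            # Receive references that other nodes sent instead of following them.
            while inbox[n]:
                local_work[n].append(inbox[n].popleft())
            while local_work[n]:
                obj = local_work[n].pop()
                progress = True
                if obj.marked:
                    continue
                obj.marked = True  # local access: obj.node == n
                for child in obj.refs:
                    if child.node == n:
                        local_work[n].append(child)      # stay local
                    else:
                        inbox[child.node].append(child)  # notify the owner
                        messages += 1
    return messages
```

For example, with `a` and `b` on node 0, `c` and `d` on node 1, and edges `a -> b`, `a -> c`, `c -> d`, `c -> a`, marking from `a` crosses nodes twice (once for `a -> c`, once for `c -> a`), so two messages are sent and no object is marked by a thread on a different node.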



Information & Contributors

Information

Published In

ACM SIGARCH Computer Architecture News  Volume 43, Issue 1
ASPLOS'15
March 2015
676 pages
ISSN:0163-5964
DOI:10.1145/2786763

ASPLOS '15: Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems
March 2015
720 pages
ISBN:9781450328357
DOI:10.1145/2694344

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. NUMA
  2. garbage collection
  3. multicore

Qualifiers

  • Research-article
