Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Prudent Memory Reclamation in Procrastination-Based Synchronization

Published: 25 March 2016 Publication History

Abstract

Procrastination is the fundamental technique used in synchronization mechanisms such as Read-Copy-Update (RCU) where writers, in order to synchronize with readers, defer the freeing of an object until there are no readers referring to the object. The synchronization mechanism determines when the deferred object is safe to reclaim and when it is actually reclaimed. Hence, such memory reclamations are completely oblivious of the memory allocator state. This induces poor memory allocator performance, for instance, when the reclamations are ill-timed. Furthermore, deferred objects provide hints about the future that inform memory regions that are about to be freed. Although useful, hints are not exploited as deferred objects are not visible to memory allocators. We introduce Prudence, a dynamic memory allocator, that is tightly integrated with the synchronization mechanism to ensure visibility of deferred objects to the memory allocator. Such an integration enables Prudence to (i) identify the safe time to reclaim deferred objects' memory, (ii) have an inclusive view of the allocated, free and about-to-be-freed objects, and (iii) exploit optimizations based on the hints about the future during important state transitions. Our evaluation in the Linux kernel shows that Prudence integrated with RCU performs 3.9X to 28X better in micro-benchmarks compared to SLUB, a recent memory allocator in the Linux kernel. It also improves the overall performance perceptibly (4%-18%) for a mix of widely used synthetic and application benchmarks. Further, it performs better (up to 98%) in terms of object hits in caches, object cache churns, slab churns, peak memory usage and total fragmentation, when compared with the SLUB allocator.

References

[1]
Maya Arbel and Hagit Attiya. Concurrent updates with rcu: Search tree as an example. In Proceedings of the 2014 ACM Symposium on Principles of Distributed Computing, PODC '14, pages 196--205, New York, NY, USA, 2014. ACM.
[2]
Andrea Arcangeli, Mingming Cao, Paul E McKenney, and Dipankar Sarma. Using Read-Copy-Update Techniques for System V IPC in the Linux 2.5 Kernel. In USENIX Annual Technical Conference, FREENIX Track, pages 297--309, 2003.
[3]
David A Barrett and Benjamin G Zorn. Using lifetime predictors to improve memory allocation performance. In ACM SIGPLAN Notices, volume 28, pages 187--196. ACM, 1993.
[4]
Andrew Baumann, Paul Barham, Pierre-Evariste Dagand, Tim Harris, Rebecca Isaacs, Simon Peter, Timothy Roscoe, Adrian Schüpbach, and Akhilesh Singhania. The Multikernel: A New OS Architecture for Scalable Multicore Systems. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, SOSP '09, pages 29--44, New York, NY, USA, 2009. ACM.
[5]
Andrew Baumann, Jeremy Kerr, Jonathan Appavoo, Dilma Da Silva, Orran Krieger, and Robert W Wisniewski. Module hot-swapping for dynamic update and reconfiguration in K42. In 6th Linux. Conf. Au, 2005.
[6]
Emery D Berger, Kathryn S McKinley, Robert D Blumofe, and Paul R Wilson. Hoard: A scalable memory allocator for multithreaded applications. ACM Sigplan Notices, 35(11):117--128, 2000.
[7]
Emery D Berger, Benjamin G Zorn, and Kathryn S McKinley. Oopsla 2002: reconsidering custom memory allocation. ACM SIGPLAN Notices, 48(4S):46--57, 2013.
[8]
Jeff Bonwick. The Slab Allocator: An Object-Caching Kernel Memory Allocator. In USENIX summer, volume 16. Boston, MA, USA, 1994.
[9]
Austin T. Clements, M. Frans Kaashoek, and Nickolai Zeldovich. Scalable address spaces using rcu balanced trees. In Proceedings of the Seventeenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XVII, pages 199--210, New York, NY, USA, 2012. ACM.
[10]
Jonathan Corbet. The SLUB allocator. http://lwn.net/Articles/229984/, 2007.
[11]
Jonathan Corbet. Relocating rcu callbacks. http://lwn.net/Articles/522262/, 2012.
[12]
Jonathan Corbet. Epoll evolving. https://lwn.net/Articles/633422/, 2015.
[13]
Jason Evans. A scalable concurrent malloc (3) implementation for freebsd. In Proc. of the BSDCan Conference, Ottawa, Canada, 2006.
[14]
The Apache Software Foundation. Apache HTTP server benchmarking tool. https://httpd.apache.org/docs/2.2/programs/ab.html, 2015.
[15]
The Apache Software Foundation. Apache HTTP Server Project. http://httpd.apache.org/, 2015.
[16]
Benjamin Gamsa, Orran Krieger, Jonathan Appavoo, and Michael Stumm. Tornado: Maximizing locality and concurrency in a shared memory multiprocessor operating system. In OSDI, volume 99, pages 87--100, 1999.
[17]
Richard Golding, Peter Bosch, John Wilkes, USENIX Association, et al. Idleness is not sloth. In USENIX, pages 201--212, 1995.
[18]
Mel Gorman. Understanding the Linux virtual memory manager. Prentice Hall, 2004.
[19]
The PostgreSQL Global Development Group. pgbench. http://www.postgresql.org/docs/devel/static/pgbench.html, 2015.
[20]
The PostgreSQL Global Development Group. PostgreSQL. http://www.postgresql.org/, 2015.
[21]
Dinakar Guniguntala, Paul E McKenney, Josh Triplett, and Jonathan Walpole. The read-copy-update mechanism for supporting real-time applications on shared-memory multiprocessor systems with Linux. IBM Systems Journal, 47(2):221--236, 2008.
[22]
Thomas E Hart, Paul E McKenney, Angela Demke Brown, and Jonathan Walpole. Performance of memory reclamation for lockless synchronization. Journal of Parallel and Distributed Computing, 67(12):1270--1285, 2007.
[23]
Philip W Howard and Jonathan Walpole. Relativistic red-black trees. Concurrency and Computation: Practice and Experience, 2013.
[24]
Hajime Inoue, Darko Stefanović, and Stephanie Forrest. On the prediction of java object lifetimes. Computers, IEEE Transactions on, 55(7):880--892, 2006.
[25]
Rick Jones. NetPerf. http://www.netperf.org/, 2012.
[26]
Jeffrey Katcher. Postmark: A new file system benchmark. Technical report, Technical Report TR3022, Network Appliance, 1997. www.netapp.com/tech_library/3022.html, 1997.
[27]
H. T. Kung and Philip L. Lehman. Concurrent manipulation of binary search trees. ACM Trans. Database Syst., 5(3):354--382, September 1980.
[28]
Christoph Lameter. SLUB: The unqueued slab allocator. http://lwn.net/Articles/229096/, 2007.
[29]
Ran Liu, Heng Zhang, and Haibo Chen. Scalable read-mostly synchronization using passive reader-writer locks. In 2014 USENIX Annual Technical Conference (USENIX ATC 14), pages 219--230, Philadelphia, PA, June 2014. USENIX Association.
[30]
Yandong Mao, Eddie Kohler, and Robert Tappan Morris. Cache craftiness for fast multicore key-value storage. In Proceedings of the 7th ACM European conference on Computer Systems, pages 183--196. ACM, 2012.
[31]
Paul E McKenney. Exploiting Deferred Destructions: An Analysis of Read-Copy-Update Techniques in Operating System kernels. PhD thesis, Oregon Health & Science University, 2004.
[32]
Paul E McKenney. Structured deferral: synchronization via procrastination. Communications of the ACM, 56(7):40--49, 2013.
[33]
Paul E McKenney. RCU Linux usage. www.rdrop.com/users/paulmck/RCU/linuxusage.html, 2014.
[34]
Paul E McKenney, Jonathan Appavoo, Andi Kleen, Orran Krieger, Rusty Russell, Dipankar Sarma, and Maneesh Soni. Read-copy update. In AUUG Conference Proceedings, page 175. AUUG, Inc., 2001.
[35]
Paul E McKenney, Dipankar Sarma, Ingo Molnar, and Suparna Bhattacharya. Extending RCU for realtime and embedded workloads. In Ottawa Linux Symposium, pages v2, pages 123--138, 2006.
[36]
Paul E McKenney, Dipankar Sarma, and Maneesh Soni. Scaling dcache with RCU. Linux Journal, 2004(117):3, 2004.
[37]
Paul E McKenney and John D Slingwine. Read-copy update: Using execution history to solve concurrency problems. In Parallel and Distributed Computing and Systems, pages 509--518, 1998.
[38]
James Morris. Have You Driven an SELinux Lately? In Linux Symposium Proceedings, 2008.
[39]
Robert Olsson and Stefan Nilsson. Trash a dynamic lc-trie and hash data structure. In High Performance Switching and Routing, 2007. HPSR'07. Workshop on, pages 1--6. IEEE, 2007.
[40]
Dipankar Sarma and Paul E McKenney. Making RCU safe for deep sub-millisecond response realtime applications. In Proceedings of the 2004 USENIX Annual Technical Conference (FREENIX Track), pages 182--191, 2004.
[41]
Josh Triplett, Paul E McKenney, and Jonathan Walpole. Scalable concurrent hash tables via relativistic programming. ACM SIGOPS Operating Systems Review, 44(3):102--109, 2010.
[42]
Josh Triplett, Paul E McKenney, and Jonathan Walpole. Resizable, Scalable, Concurrent Hash Tables via Relativistic Programming. In USENIX Annual Technical Conference, page 11, 2011.
[43]
Stephen Tu, Wenting Zheng, Eddie Kohler, Barbara Liskov, and Samuel Madden. Speedy transactions in multicore in-memory databases. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, pages 18--32. ACM, 2013.
[44]
David Wentzlaff and Anant Agarwal. Factored operating systems (fos): the case for a scalable operating system for multicores. SIGOPS Oper. Syst. Rev., 43:76--85, April 2009.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGPLAN Notices
ACM SIGPLAN Notices  Volume 51, Issue 4
ASPLOS '16
April 2016
774 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/2954679
  • Editor:
  • Andy Gill
Issue’s Table of Contents
  • cover image ACM Conferences
    ASPLOS '16: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems
    March 2016
    824 pages
    ISBN:9781450340915
    DOI:10.1145/2872362
    • General Chair:
    • Tom Conte,
    • Program Chair:
    • Yuanyuan Zhou
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 March 2016
Published in SIGPLAN Volume 51, Issue 4

Check for updates

Author Tags

  1. dynamic memory allocator
  2. memory reclamation
  3. read-copy-update (RCU)

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)9
  • Downloads (Last 6 weeks)0
Reflects downloads up to 08 Feb 2025

Other Metrics

Citations

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media