research-article

Prudent Memory Reclamation in Procrastination-Based Synchronization

Authors:

Aravinda Prasad,

K. GopinathAuthors Info & Claims

ACM SIGPLAN Notices, Volume 51, Issue 4

Pages 99 - 112

https://doi.org/10.1145/2954679.2872405

Published: 25 March 2016 Publication History

Abstract

Procrastination is the fundamental technique used in synchronization mechanisms such as Read-Copy-Update (RCU) where writers, in order to synchronize with readers, defer the freeing of an object until there are no readers referring to the object. The synchronization mechanism determines when the deferred object is safe to reclaim and when it is actually reclaimed. Hence, such memory reclamations are completely oblivious of the memory allocator state. This induces poor memory allocator performance, for instance, when the reclamations are ill-timed. Furthermore, deferred objects provide hints about the future that inform memory regions that are about to be freed. Although useful, hints are not exploited as deferred objects are not visible to memory allocators. We introduce Prudence, a dynamic memory allocator, that is tightly integrated with the synchronization mechanism to ensure visibility of deferred objects to the memory allocator. Such an integration enables Prudence to (i) identify the safe time to reclaim deferred objects' memory, (ii) have an inclusive view of the allocated, free and about-to-be-freed objects, and (iii) exploit optimizations based on the hints about the future during important state transitions. Our evaluation in the Linux kernel shows that Prudence integrated with RCU performs 3.9X to 28X better in micro-benchmarks compared to SLUB, a recent memory allocator in the Linux kernel. It also improves the overall performance perceptibly (4%-18%) for a mix of widely used synthetic and application benchmarks. Further, it performs better (up to 98%) in terms of object hits in caches, object cache churns, slab churns, peak memory usage and total fragmentation, when compared with the SLUB allocator.

References

[1]

Maya Arbel and Hagit Attiya. Concurrent updates with rcu: Search tree as an example. In Proceedings of the 2014 ACM Symposium on Principles of Distributed Computing, PODC '14, pages 196--205, New York, NY, USA, 2014. ACM.

Digital Library

[2]

Andrea Arcangeli, Mingming Cao, Paul E McKenney, and Dipankar Sarma. Using Read-Copy-Update Techniques for System V IPC in the Linux 2.5 Kernel. In USENIX Annual Technical Conference, FREENIX Track, pages 297--309, 2003.

[3]

David A Barrett and Benjamin G Zorn. Using lifetime predictors to improve memory allocation performance. In ACM SIGPLAN Notices, volume 28, pages 187--196. ACM, 1993.

[4]

Andrew Baumann, Paul Barham, Pierre-Evariste Dagand, Tim Harris, Rebecca Isaacs, Simon Peter, Timothy Roscoe, Adrian Schüpbach, and Akhilesh Singhania. The Multikernel: A New OS Architecture for Scalable Multicore Systems. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, SOSP '09, pages 29--44, New York, NY, USA, 2009. ACM.

Digital Library

[5]

Andrew Baumann, Jeremy Kerr, Jonathan Appavoo, Dilma Da Silva, Orran Krieger, and Robert W Wisniewski. Module hot-swapping for dynamic update and reconfiguration in K42. In 6th Linux. Conf. Au, 2005.

[6]

Emery D Berger, Kathryn S McKinley, Robert D Blumofe, and Paul R Wilson. Hoard: A scalable memory allocator for multithreaded applications. ACM Sigplan Notices, 35(11):117--128, 2000.

Digital Library

[7]

Emery D Berger, Benjamin G Zorn, and Kathryn S McKinley. Oopsla 2002: reconsidering custom memory allocation. ACM SIGPLAN Notices, 48(4S):46--57, 2013.

[8]

Jeff Bonwick. The Slab Allocator: An Object-Caching Kernel Memory Allocator. In USENIX summer, volume 16. Boston, MA, USA, 1994.

Digital Library

[9]

Austin T. Clements, M. Frans Kaashoek, and Nickolai Zeldovich. Scalable address spaces using rcu balanced trees. In Proceedings of the Seventeenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XVII, pages 199--210, New York, NY, USA, 2012. ACM.

Digital Library

[10]

Jonathan Corbet. The SLUB allocator. http://lwn.net/Articles/229984/, 2007.

[11]

Jonathan Corbet. Relocating rcu callbacks. http://lwn.net/Articles/522262/, 2012.

[12]

Jonathan Corbet. Epoll evolving. https://lwn.net/Articles/633422/, 2015.

[13]

Jason Evans. A scalable concurrent malloc (3) implementation for freebsd. In Proc. of the BSDCan Conference, Ottawa, Canada, 2006.

[14]

The Apache Software Foundation. Apache HTTP server benchmarking tool. https://httpd.apache.org/docs/2.2/programs/ab.html, 2015.

[15]

The Apache Software Foundation. Apache HTTP Server Project. http://httpd.apache.org/, 2015.

[16]

Benjamin Gamsa, Orran Krieger, Jonathan Appavoo, and Michael Stumm. Tornado: Maximizing locality and concurrency in a shared memory multiprocessor operating system. In OSDI, volume 99, pages 87--100, 1999.

[17]

Richard Golding, Peter Bosch, John Wilkes, USENIX Association, et al. Idleness is not sloth. In USENIX, pages 201--212, 1995.

[18]

Mel Gorman. Understanding the Linux virtual memory manager. Prentice Hall, 2004.

Digital Library

[19]

The PostgreSQL Global Development Group. pgbench. http://www.postgresql.org/docs/devel/static/pgbench.html, 2015.

[20]

The PostgreSQL Global Development Group. PostgreSQL. http://www.postgresql.org/, 2015.

[21]

Dinakar Guniguntala, Paul E McKenney, Josh Triplett, and Jonathan Walpole. The read-copy-update mechanism for supporting real-time applications on shared-memory multiprocessor systems with Linux. IBM Systems Journal, 47(2):221--236, 2008.

Digital Library

[22]

Thomas E Hart, Paul E McKenney, Angela Demke Brown, and Jonathan Walpole. Performance of memory reclamation for lockless synchronization. Journal of Parallel and Distributed Computing, 67(12):1270--1285, 2007.

Digital Library

[23]

Philip W Howard and Jonathan Walpole. Relativistic red-black trees. Concurrency and Computation: Practice and Experience, 2013.

[24]

Hajime Inoue, Darko Stefanović, and Stephanie Forrest. On the prediction of java object lifetimes. Computers, IEEE Transactions on, 55(7):880--892, 2006.

Digital Library

[25]

Rick Jones. NetPerf. http://www.netperf.org/, 2012.

[26]

Jeffrey Katcher. Postmark: A new file system benchmark. Technical report, Technical Report TR3022, Network Appliance, 1997. www.netapp.com/tech_library/3022.html, 1997.

[27]

H. T. Kung and Philip L. Lehman. Concurrent manipulation of binary search trees. ACM Trans. Database Syst., 5(3):354--382, September 1980.

Digital Library

[28]

Christoph Lameter. SLUB: The unqueued slab allocator. http://lwn.net/Articles/229096/, 2007.

[29]

Ran Liu, Heng Zhang, and Haibo Chen. Scalable read-mostly synchronization using passive reader-writer locks. In 2014 USENIX Annual Technical Conference (USENIX ATC 14), pages 219--230, Philadelphia, PA, June 2014. USENIX Association.

Digital Library

[30]

Yandong Mao, Eddie Kohler, and Robert Tappan Morris. Cache craftiness for fast multicore key-value storage. In Proceedings of the 7th ACM European conference on Computer Systems, pages 183--196. ACM, 2012.

Digital Library

[31]

Paul E McKenney. Exploiting Deferred Destructions: An Analysis of Read-Copy-Update Techniques in Operating System kernels. PhD thesis, Oregon Health & Science University, 2004.

[32]

Paul E McKenney. Structured deferral: synchronization via procrastination. Communications of the ACM, 56(7):40--49, 2013.

Digital Library

[33]

Paul E McKenney. RCU Linux usage. www.rdrop.com/users/paulmck/RCU/linuxusage.html, 2014.

[34]

Paul E McKenney, Jonathan Appavoo, Andi Kleen, Orran Krieger, Rusty Russell, Dipankar Sarma, and Maneesh Soni. Read-copy update. In AUUG Conference Proceedings, page 175. AUUG, Inc., 2001.

[35]

Paul E McKenney, Dipankar Sarma, Ingo Molnar, and Suparna Bhattacharya. Extending RCU for realtime and embedded workloads. In Ottawa Linux Symposium, pages v2, pages 123--138, 2006.

[36]

Paul E McKenney, Dipankar Sarma, and Maneesh Soni. Scaling dcache with RCU. Linux Journal, 2004(117):3, 2004.

Digital Library

[37]

Paul E McKenney and John D Slingwine. Read-copy update: Using execution history to solve concurrency problems. In Parallel and Distributed Computing and Systems, pages 509--518, 1998.

[38]

James Morris. Have You Driven an SELinux Lately? In Linux Symposium Proceedings, 2008.

[39]

Robert Olsson and Stefan Nilsson. Trash a dynamic lc-trie and hash data structure. In High Performance Switching and Routing, 2007. HPSR'07. Workshop on, pages 1--6. IEEE, 2007.

[40]

Dipankar Sarma and Paul E McKenney. Making RCU safe for deep sub-millisecond response realtime applications. In Proceedings of the 2004 USENIX Annual Technical Conference (FREENIX Track), pages 182--191, 2004.

[41]

Josh Triplett, Paul E McKenney, and Jonathan Walpole. Scalable concurrent hash tables via relativistic programming. ACM SIGOPS Operating Systems Review, 44(3):102--109, 2010.

Digital Library

[42]

Josh Triplett, Paul E McKenney, and Jonathan Walpole. Resizable, Scalable, Concurrent Hash Tables via Relativistic Programming. In USENIX Annual Technical Conference, page 11, 2011.

[43]

Stephen Tu, Wenting Zheng, Eddie Kohler, Barbara Liskov, and Samuel Madden. Speedy transactions in multicore in-memory databases. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, pages 18--32. ACM, 2013.

Digital Library

[44]

David Wentzlaff and Anant Agarwal. Factored operating systems (fos): the case for a scalable operating system for multicores. SIGOPS Oper. Syst. Rev., 43:76--85, April 2009.

Digital Library

Index Terms

Prudent Memory Reclamation in Procrastination-Based Synchronization
1. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Memory management
        Allocation / deallocation strategies

Recommendations

Prudent Memory Reclamation in Procrastination-Based Synchronization
ASPLOS'16

Procrastination is the fundamental technique used in synchronization mechanisms such as Read-Copy-Update (RCU) where writers, in order to synchronize with readers, defer the freeing of an object until there are no readers referring to the object. The ...
Prudent Memory Reclamation in Procrastination-Based Synchronization
ASPLOS '16: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems

Procrastination is the fundamental technique used in synchronization mechanisms such as Read-Copy-Update (RCU) where writers, in order to synchronize with readers, defer the freeing of an object until there are no readers referring to the object. The ...
Performance of memory reclamation for lockless synchronization

Achieving high performance for concurrent applications on modern multiprocessors remains challenging. Many programmers avoid locking to improve performance, while others replace locks with non-blocking synchronization to protect against deadlock, ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGPLAN Notices

ACM SIGPLAN Notices Volume 51, Issue 4

ASPLOS '16

April 2016

774 pages

ISSN:0362-1340

EISSN:1558-1160

DOI:10.1145/2954679

Editor:
Andy Gill
University of Kansas, Lawrence, KS

Issue’s Table of Contents

ASPLOS '16: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems
March 2016
824 pages
ISBN:9781450340915
DOI:10.1145/2872362
General Chair:
Tom Conte
Georgia Tech, USA
,
Program Chair:
Yuanyuan Zhou
University of California, San Diego, USA

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 March 2016

Published in SIGPLAN Volume 51, Issue 4

Check for updates

Author Tags

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

7
Total Citations
View Citations
659
Total Downloads

Downloads (Last 12 months)9
Downloads (Last 6 weeks)0

Reflects downloads up to 08 Feb 2025

Other Metrics

View Author Metrics

Citations

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Issue’s Table of Contents