Challenges and solutions for fast remote persistent memory access

Authors:

David Andersen,

Michael Kaminsky

SoCC '20: Proceedings of the 11th ACM Symposium on Cloud Computing

Pages 105 - 119

https://doi.org/10.1145/3419111.3421294

Published: 12 October 2020

Abstract

Non-volatile main memory DIMMs (NVMMs), such as Intel's Optane DC Persistent Memory modules, provide data durability with orders of magnitude higher performance than prior durable technologies. This paper explores the challenges that arise when building high-performance networked systems for NVMM. We find that, compared to DRAM, NVMMs have distinctive fundamental properties that complicate networked access to NVMM from both the NIC and the CPU. We show that many of the challenges in efficient access to remote NVMM arise from the fact that CPU caches are not optimized for NVMM. To address these challenges, we propose a menu of solutions for current hardware and evaluate their benefits.

Supplementary Material

MP4 File (p105-kalia-presentation.mp4), 114.47 MB

Index Terms

Challenges and solutions for fast remote persistent memory access
1. Computer systems organization
  1. Dependable and fault-tolerant systems and networks
2. Networks
  1. Network protocols

Published In

SoCC '20: Proceedings of the 11th ACM Symposium on Cloud Computing

October 2020

535 pages

ISBN:9781450381376

DOI:10.1145/3419111

General Chair: Rodrigo Fonseca (Microsoft and Brown University)

Program Chairs: Christina Delimitrou (Cornell University), Beng Chin Ooi (National University of Singapore)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Science Foundation
Intel

Conference

SoCC '20: ACM Symposium on Cloud Computing

October 19 - 21, 2020

Virtual Event, USA

Acceptance Rates

SoCC '20 Paper Acceptance Rate 35 of 143 submissions, 24%;

Overall Acceptance Rate 169 of 722 submissions, 23%
