DOI: 10.1145/3357526.3357541
research-article
Public Access

Performance characterization of a DRAM-NVM hybrid memory architecture for HPC applications using Intel Optane DC persistent memory modules

Published: 30 September 2019

Abstract

Non-volatile, byte-addressable memory (NVM) has been introduced by Intel in the form of NVDIMMs named Intel® Optane™ DC PMM. This memory module can persist the data stored in it without the need for power. It expands the memory hierarchy into a hybrid memory system because its access latency and memory bandwidth differ from those of DRAM, which has been the predominant byte-addressable main memory technology. The Optane DC memory modules have up to 8x the capacity of DDR4 DRAM modules, which can expand the byte-addressable space up to 6 TB per node. Many applications can now scale up their problem size given such a memory system. We evaluate the capabilities of this DRAM-NVM hybrid memory system and its impact on High Performance Computing (HPC) applications. We characterize the Optane DC in comparison to DDR4 DRAM with a STREAM-like custom benchmark and measure the performance of HPC mini-apps such as VPIC, SNAP, LULESH and AMG under different configurations of Optane DC PMMs. We find that Optane-only executions are slower in terms of execution time than DRAM-only and Memory-mode executions by a minimum of 2 to 16% for VPIC and a maximum of 6x for LULESH.
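
For readers unfamiliar with STREAM-style measurement, the following is a minimal sketch of a triad kernel and its bandwidth accounting. It is not the authors' custom benchmark, which this page does not reproduce; the function names (`triad`, `triad_bytes`, `measure_triad`) and the default sizes are hypothetical and chosen only for illustration. Because it runs in pure Python, interpreter overhead dominates the timing, so the reported figure shows how the quantity is computed rather than what DRAM or Optane DC hardware can deliver.

```python
import time
from array import array


def triad(a, b, c, scalar):
    """STREAM-style triad kernel: a[i] = b[i] + scalar * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def triad_bytes(n, itemsize=8):
    """Bytes moved by one triad pass: read b, read c, write a."""
    return 3 * n * itemsize


def measure_triad(n=100_000, scalar=3.0, repeats=3):
    """Return (best_seconds, MB_per_second) over several triad passes.

    The sizes here are illustrative assumptions; a real bandwidth test
    would use arrays much larger than the last-level cache.
    """
    a = array("d", [0.0]) * n  # 8-byte doubles
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        triad(a, b, c, scalar)
        best = min(best, time.perf_counter() - start)
    return best, triad_bytes(n, a.itemsize) / best / 1e6


if __name__ == "__main__":
    seconds, mb_per_s = measure_triad()
    print(f"best pass: {seconds:.4f} s, about {mb_per_s:.1f} MB/s")
```

Taking the best of several passes follows the usual STREAM convention of reporting the fastest run, which filters out one-off noise such as page faults on first touch.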




Published In

MEMSYS '19: Proceedings of the International Symposium on Memory Systems
September 2019
517 pages
ISBN:9781450372060
DOI:10.1145/3357526

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Intel Optane DC
  2. NUMA
  3. NVM
  4. SICM
  5. hybrid memory
  6. memory allocation
  7. persistent memory

Qualifiers

  • Research-article

Conference

MEMSYS '19: The International Symposium on Memory Systems
September 30 - October 3, 2019
Washington, District of Columbia, USA
