DOI: 10.1145/3579371.3589051
research-article
Open access

DRAM Translation Layer: Software-Transparent DRAM Power Savings for Disaggregated Memory

Published: 17 June 2023

Abstract

Memory disaggregation is a promising way to scale the memory capacity and bandwidth shared by multiple server nodes flexibly and cost-effectively. DRAM power consumption, reported to be around 40% of total system power in datacenter servers, will become an even more serious concern in this high-capacity environment. Because average DRAM capacity utilization in today's datacenters is low, it is appealing to put unallocated or cold DRAM ranks into a power-saving mode. However, conventional DRAM address mapping, which uses fine-grained interleaving to maximize rank-level parallelism, is incompatible with such rank-level power management. Furthermore, existing DRAM power-saving techniques often require intrusive changes to the system stack, including the OS, the memory controller (MC), or even the DRAM devices, which poses additional challenges for deployment. We therefore propose the DRAM Translation Layer (DTL) for host software/MC-transparent DRAM power management with commodity DRAM devices. Inspired by the Flash Translation Layer (FTL) in modern SSDs, DTL is placed in the CXL memory controller to provide (i) flexible address mappings between host physical addresses and DRAM device physical addresses and (ii) host-transparent memory page migration. Leveraging DTL, we propose two DRAM power-saving techniques with different temporal granularities, rank-level power-down and hotness-aware self-refresh, to maximize the number of DRAM ranks that can enter low-power states while provisioning sufficient DRAM bandwidth. The first technique consolidates unallocated memory pages into a subset of ranks when a virtual machine (VM) is deallocated and turns those ranks off transparently to both the OS and the host MC. Our evaluation with CloudSuite benchmarks demonstrates that this technique reduces DRAM power by 31.6% on average at a 1.6% performance cost. The hotness-aware self-refresh scheme further reduces DRAM energy consumption by up to 14.9% with negligible performance loss by opportunistically migrating cold pages into a rank and putting that rank into self-refresh mode.
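
The rank-consolidation idea can be illustrated with a toy model. The sketch below is not the paper's design: the class `ToyDTL`, its methods, the segment granularity, and the first-fit placement policy are all assumptions made for illustration. It keeps a table from host segments to (rank, slot) locations and, when segments are deallocated, migrates the remaining ones toward low-numbered ranks so that the emptied ranks become candidates for power-down. The host-side segment names never change, which is the sense in which such a remapping stays transparent to the host.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDTL:
    """Hypothetical toy translation layer: host segment -> (rank, slot)."""
    num_ranks: int
    slots_per_rank: int
    table: dict = field(default_factory=dict)

    def _free_slots(self, rank):
        # Slots in `rank` that no host segment currently maps to.
        used = {slot for (r, slot) in self.table.values() if r == rank}
        return [s for s in range(self.slots_per_rank) if s not in used]

    def allocate(self, host_seg):
        # First-fit (assumed policy): fill low ranks first, keep high ranks empty.
        for rank in range(self.num_ranks):
            free = self._free_slots(rank)
            if free:
                self.table[host_seg] = (rank, free[0])
                return
        raise MemoryError("no free DRAM segment")

    def deallocate(self, host_segs):
        # Drop the mappings (e.g. when a VM goes away), then compact.
        for seg in host_segs:
            del self.table[seg]
        self.consolidate()

    def consolidate(self):
        # Visit segments from the highest rank down; move each into the
        # lowest-numbered rank below it that still has a free slot.
        for seg in sorted(self.table, key=lambda s: self.table[s], reverse=True):
            rank, _ = self.table[seg]
            for dst in range(rank):
                free = self._free_slots(dst)
                if free:
                    self.table[seg] = (dst, free[0])  # models a page migration
                    break

    def active_ranks(self):
        return sorted({r for r, _ in self.table.values()})

    def idle_ranks(self):
        # Ranks holding no mapped segment: power-down candidates.
        active = set(self.active_ranks())
        return [r for r in range(self.num_ranks) if r not in active]


dtl = ToyDTL(num_ranks=4, slots_per_rank=2)
for seg in ["a0", "a1", "a2", "b0", "b1", "b2"]:  # two hypothetical VMs, a and b
    dtl.allocate(seg)
print(dtl.active_ranks())         # [0, 1, 2]
dtl.deallocate(["a0", "a1", "a2"])
print(dtl.idle_ranks())           # [2, 3]
```

In this toy run, freeing the three segments of VM "a" lets the two segments that were in rank 2 move into rank 0, so one more rank becomes idle. A real design would also have to account for migration cost and bandwidth provisioning, which this sketch ignores.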

Cited By

  • (2025) Gate-controllable two-dimensional transition metal dichalcogenides for spintronic memory. Journal of Alloys and Compounds 1010 (177487). DOI: 10.1016/j.jallcom.2024.177487. Online publication date: Jan-2025
  • (2024) Salus: Efficient Security Support for CXL-Expanded GPU Memory. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 1-15. DOI: 10.1109/HPCA57654.2024.00027. Online publication date: 2-Mar-2024
  • (2023) High-Speed Communication in Memory Controller by Novel Pipeline Register Design. 2023 Second International Conference On Smart Technologies For Smart Nation (SmartTechCon), 600-604. DOI: 10.1109/SmartTechCon57526.2023.10391333. Online publication date: 18-Aug-2023

      Information & Contributors

      Information

      Published In

      ISCA '23: Proceedings of the 50th Annual International Symposium on Computer Architecture
      June 2023
      1225 pages
      ISBN:9798400700958
      DOI:10.1145/3579371
      This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. DRAM
      2. power management
      3. datacenters
      4. disaggregated memory
      5. CXL
      6. pooled memory
      7. address translation

      Qualifiers

      • Research-article

      Funding Sources

      • Samsung Electronics
      • Institute of Information & Communications Technology Planning & Evaluation (IITP) of Korea

      Conference

      ISCA '23
      Acceptance Rates

      Overall Acceptance Rate 543 of 3,203 submissions, 17%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months): 1,330
      • Downloads (Last 6 weeks): 152
      Reflects downloads up to 21 Nov 2024
