DOI: 10.1145/2228360.2228519
research-article

Hybrid DRAM/PRAM-based main memory for single-chip CPU/GPU

Published: 03 June 2012

Abstract

Single-chip CPU/GPU architectures are being adopted in high-end embedded systems such as smartphones and tablet PCs. Because DRAM is difficult to scale, the main memory subsystem is expected to combine DRAM with phase-change RAM (PRAM). In this work, we address performance optimization of hybrid DRAM/PRAM main memory for single-chip CPU/GPU. Because the CPU requires low latency while the GPU is relatively tolerant of long latency, DRAM is first allocated to the CPU, and PRAM, which has longer write latency, is allocated to the GPU. Then, to improve the write performance of GPU traffic, we propose (1) an in-DRAM write buffer to accommodate GPU write traffic, (2) dynamic hot data management to improve the efficiency of the write buffer, (3) runtime-adaptive adjustment of the write buffer size to meet a given CPU performance bound, and (4) CPU-aware DRAM access scheduling to give low latency to CPU traffic. Experiments show that the proposed method improves GPU performance by 1.02 to 44.2 times with modest CPU performance overhead (negligible when compute-intensive CPU programs run).
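
To make ideas (1) and (3) from the abstract concrete, the following is a minimal toy sketch in Python, not the paper's implementation. The abstract does not give the actual policies, so everything here is an assumption made for illustration: the class name `WriteBuffer`, the least-recently-written eviction rule, the one-line resize step, and the `cpu_slowdown` and `bound` inputs are all hypothetical.

```python
from collections import OrderedDict


class WriteBuffer:
    """Toy in-DRAM write buffer in front of PRAM for GPU writes (illustrative only)."""

    def __init__(self, capacity, min_capacity=1, max_capacity=64):
        self.capacity = capacity          # buffer size in lines (assumed unit)
        self.min_capacity = min_capacity
        self.max_capacity = max_capacity
        self.lines = OrderedDict()        # address -> data, oldest write first
        self.pram_writes = 0              # count of slow PRAM writes issued

    def write(self, addr, data):
        # A repeated write to a buffered line is absorbed in DRAM
        # and refreshes that line's recency.
        if addr in self.lines:
            self.lines.move_to_end(addr)
        self.lines[addr] = data
        self._evict_to_fit()

    def _evict_to_fit(self):
        # Write the least recently written lines back to PRAM
        # until the buffer fits its current capacity.
        while len(self.lines) > self.capacity:
            self.lines.popitem(last=False)
            self.pram_writes += 1

    def adjust(self, cpu_slowdown, bound, step=1):
        # Assumed policy: shrink the buffer when the measured CPU slowdown
        # exceeds the bound, otherwise grow it, within fixed limits.
        if cpu_slowdown > bound:
            self.capacity = max(self.min_capacity, self.capacity - step)
        else:
            self.capacity = min(self.max_capacity, self.capacity + step)
        self._evict_to_fit()
```

In this sketch, a GPU write stream that keeps returning to the same address stays in DRAM and reaches PRAM only when it is evicted, and a call such as `buf.adjust(cpu_slowdown=0.10, bound=0.05)` gives DRAM capacity back to the CPU by shrinking the buffer. A real controller would also need read handling, hot data detection, and access scheduling, none of which are modeled here.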

    Published In

    DAC '12: Proceedings of the 49th Annual Design Automation Conference
    June 2012
    1357 pages
    ISBN:9781450311991
    DOI:10.1145/2228360
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. main memory subsystem
    2. phase-change RAM
    3. single-chip CPU/GPU

    Qualifiers

    • Research-article

    Conference

    DAC '12
    DAC '12: The 49th Annual Design Automation Conference 2012
    June 3 - 7, 2012
    San Francisco, California

    Acceptance Rates

    Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

    Cited By

    • (2024) A High-bandwidth High-capacity Hybrid 3D Memory for GPUs. ACM SIGMETRICS Performance Evaluation Review 52:1 (67-68). DOI: 10.1145/3673660.3655057. Online publication date: 13-Jun-2024
    • (2024) A High-bandwidth High-capacity Hybrid 3D Memory for GPUs. Abstracts of the 2024 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems (67-68). DOI: 10.1145/3652963.3655057. Online publication date: 10-Jun-2024
    • (2024) H3DM: A High-bandwidth High-capacity Hybrid 3D Memory Design for GPUs. Proceedings of the ACM on Measurement and Analysis of Computing Systems 8:1 (1-28). DOI: 10.1145/3639038. Online publication date: 21-Feb-2024
    • (2023) MC-ELMM: Multi-Chip Endurance-Limited Memory Management. Proceedings of the International Symposium on Memory Systems (1-16). DOI: 10.1145/3631882.3631905. Online publication date: 2-Oct-2023
    • (2019) Sparse-Insertion Write Cache to Mitigate Write Disturbance Errors in Phase Change Memory. IEEE Transactions on Computers 68:5 (752-764). DOI: 10.1109/TC.2018.2881137. Online publication date: 1-May-2019
    • (2018) Set variation-aware shared LLC management for CPU-GPU heterogeneous architecture. 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE) (79-84). DOI: 10.23919/DATE.2018.8341983. Online publication date: Mar-2018
    • (2018) Shared Last-Level Cache Management and Memory Scheduling for GPGPUs with Hybrid Main Memory. ACM Transactions on Embedded Computing Systems 17:4 (1-25). DOI: 10.1145/3230643. Online publication date: 31-Jul-2018
    • (2018) M-CLOCK. ACM Transactions on Storage 14:3 (1-17). DOI: 10.1145/3216730. Online publication date: 3-Oct-2018
    • (2018) Data Scheduling Based on Data Label in Hybrid Storage Architecture. 2018 IEEE 15th International Conference on Mobile Ad Hoc and Sensor Systems (MASS) (537-542). DOI: 10.1109/MASS.2018.00083. Online publication date: Oct-2018
    • (2018) The New Hardware Development Trend and the Challenges in Data Management and Analysis. Data Science and Engineering 3:3 (263-276). DOI: 10.1007/s41019-018-0072-6. Online publication date: 24-Sep-2018
