DOI: 10.1145/2485922.2485955
research-article

Reducing memory access latency with asymmetric DRAM bank organizations

Published: 23 June 2013

Abstract

DRAM has been a de facto standard for main memory, and advances in process technology have led to a rapid increase in its capacity and bandwidth. In contrast, its random access latency has remained relatively stagnant, as it is still around 100 CPU clock cycles. Modern computer systems rely on caches or other latency tolerance techniques to lower the average access latency. However, not all applications have ample parallelism or locality that would help hide or reduce the latency. Moreover, applications' demands for memory space continue to grow, while the capacity gap between last-level caches and main memory is unlikely to shrink. Consequently, reducing the main-memory latency is important for application performance. Unfortunately, previous proposals have not adequately addressed this problem, as they have focused only on improving the bandwidth and capacity or reduced the latency at the cost of significant area overhead.
We propose asymmetric DRAM bank organizations to reduce the average main-memory access latency. We first analyze the access and cycle times of a modern DRAM device to identify key delay components for latency reduction. Then we reorganize a subset of DRAM banks to reduce their access and cycle times by half with low area overhead. By synergistically combining these reorganized DRAM banks with support for non-uniform bank accesses, we introduce a novel DRAM bank organization with center high-aspect-ratio mats called CHARM. Experiments on a simulated chip-multiprocessor system show that CHARM improves instructions per cycle and system-wide energy-delay product by up to 21% and 32%, respectively, with only a 3% increase in die area.
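
The latency argument can be illustrated with a minimal sketch. It is not the authors' model. It assumes a simple two-tier picture in which some fraction of memory accesses is served by the reorganized (fast) banks, whose access time is half that of the remaining banks, as the abstract states. The function name `average_latency`, its parameters, and the 50 ns baseline are hypothetical and chosen for illustration only; none of them comes from the paper.

```python
def average_latency(slow_ns: float, fast_fraction: float, speedup: float = 2.0) -> float:
    """Weighted-average access latency for a two-tier bank organization.

    slow_ns       -- access time of the unmodified banks (hypothetical value)
    fast_fraction -- share of accesses served by the fast banks, in [0, 1]
    speedup       -- how much faster the fast banks are; 2.0 mirrors the
                     "by half" reduction stated in the abstract
    """
    if not 0.0 <= fast_fraction <= 1.0:
        raise ValueError("fast_fraction must be between 0 and 1")
    if speedup <= 0.0:
        raise ValueError("speedup must be positive")
    fast_ns = slow_ns / speedup
    # Each access pays either the fast or the slow latency.
    return fast_fraction * fast_ns + (1.0 - fast_fraction) * slow_ns


if __name__ == "__main__":
    baseline_ns = 50.0  # assumed baseline, for illustration only
    for fraction in (0.0, 0.25, 0.5, 1.0):
        latency = average_latency(baseline_ns, fraction)
        print(f"fast fraction {fraction:.2f}: average latency {latency:.2f} ns")
```

Under these assumptions the average latency falls linearly with the share of accesses that land in the fast banks, so the gain depends on how much of the frequently accessed data is placed there.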

    Published In

    ISCA '13: Proceedings of the 40th Annual International Symposium on Computer Architecture
    June 2013
    686 pages
    ISBN:9781450320795
    DOI:10.1145/2485922
    • ACM SIGARCH Computer Architecture News, Volume 41, Issue 3
      ISCA '13
      June 2013
      666 pages
      ISSN:0163-5964
      DOI:10.1145/2508148
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    • IEEE CS

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. DRAM
    2. asymmetric bank organizations
    3. high-aspect-ratio mats
    4. microarchitecture

    Qualifiers

    • Research-article

    Conference

    ISCA'13
    Acceptance Rates

    ISCA '13 Paper Acceptance Rate 56 of 288 submissions, 19%;
    Overall Acceptance Rate 543 of 3,203 submissions, 17%

    Article Metrics

    • Downloads (Last 12 months): 152
    • Downloads (Last 6 weeks): 21
    Reflects downloads up to 19 Nov 2024

    Cited By

    • (2024) HiFi-DRAM: Enabling High-fidelity DRAM Research by Uncovering Sense Amplifiers with IC Imaging. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), pp. 133-149. DOI: 10.1109/ISCA59077.2024.00020. Online publication date: 29-Jun-2024
    • (2024) Agile-DRAM: Agile Trade-Offs in Memory Capacity, Latency, and Energy for Data Centers. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 1141-1153. DOI: 10.1109/HPCA57654.2024.00089. Online publication date: 2-Mar-2024
    • (2023) Unity ECC: Unified Memory Protection Against Bit and Chip Errors. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-16. DOI: 10.1145/3581784.3607081. Online publication date: 12-Nov-2023
    • (2023) A Low-Cost Reduced-Latency DRAM Architecture With Dynamic Reconfiguration of Row Decoder. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 31:1, pp. 128-141. DOI: 10.1109/TVLSI.2022.3219437. Online publication date: Jan-2023
    • (2023) A Comprehensive Evaluation of Convolutional Hardware Accelerators. IEEE Transactions on Circuits and Systems II: Express Briefs, 70:3, pp. 1149-1153. DOI: 10.1109/TCSII.2022.3223925. Online publication date: Mar-2023
    • (2023) DRAM Bender: An Extensible and Versatile FPGA-Based Infrastructure to Easily Test State-of-the-Art DRAM Chips. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42:12, pp. 5098-5112. DOI: 10.1109/TCAD.2023.3282172. Online publication date: Dec-2023
    • (2023) A Hardware-Based Approach to Determine the Frequently Accessed DRAM Pages for Multi-Core Systems. 2023 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology (JEEIT), pp. 146-153. DOI: 10.1109/JEEIT58638.2023.10185689. Online publication date: 22-May-2023
    • (2023) SHADOW: Preventing Row Hammer in DRAM with Intra-Subarray Row Shuffling. 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 333-346. DOI: 10.1109/HPCA56546.2023.10070966. Online publication date: Feb-2023
    • (2023) High-Performance and Power-Saving Mechanism for Page Activations Based on Full Independent DRAM Sub-Arrays in Multi-Core Systems. IEEE Access, 11, pp. 79801-79822. DOI: 10.1109/ACCESS.2023.3299848. Online publication date: 2023
    • (2022) ECMO: ECC Architecture Reusing Content-Addressable Memories for Obtaining High Reliability in DRAM. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 30:6, pp. 781-793. DOI: 10.1109/TVLSI.2022.3153894. Online publication date: Jun-2022
