DOI: 10.1145/3369583.3392670

PAC: Paged Adaptive Coalescer for 3D-Stacked Memory

Published: 23 June 2020

Abstract

Many contemporary data-intensive applications exhibit irregular and highly concurrent memory access patterns that challenge the performance of conventional memory systems. Driven by the growing need for high-bandwidth, low-latency memory, 3D-stacked memory devices such as the Hybrid Memory Cube (HMC) and High Bandwidth Memory (HBM) were designed to provide significantly higher throughput than standard JEDEC DDR devices. However, existing memory interfaces and coalescing models, designed for conventional DDR devices, cannot fully exploit the bandwidth potential of these new 3D-stacked memory devices. To remedy this disparity, we introduce a novel Paged Adaptive Coalescer (PAC) infrastructure with a scalable coalescing network for 3D-stacked memory. We present the design and simulated implementation of this approach on RISC-V embedded cores with attached HMC devices. Extensive evaluations show that the proposed PAC methodology yields an average coalescing efficiency of 56.01%, and that PAC reduces bank conflicts and power consumption by 85.16% and 59.21%, respectively. Overall, PAC achieves an average performance gain of 14.35% (and up to 26.06%) across 14 test suites. These results showcase the potential of the PAC methodology for architecture design targeting increasingly critical data-intensive algorithms and applications.

Supplementary Material

MP4 File (3369583.3392670.mp4)
Driven by the growing need for high-bandwidth, low-latency memory, 3D-stacked memory devices were designed to provide significantly higher throughput than standard JEDEC DDR devices. However, existing memory interfaces and coalescing models, designed for conventional DDR devices, cannot fully exploit the bandwidth potential of these new devices. To remedy this disparity, we introduce a new Paged Adaptive Coalescer (PAC) methodology and associated design for effectively performing dynamic memory coalescing (DMC) on emerging 3D-stacked memory. PAC is designed with a pipelined coalescing network that aggregates memory requests at the granularity of physical pages to enhance the bandwidth utilization of the 3D-stacked memory. It also extends the miss status holding registers (MSHRs) to adaptively merge requests of flexible sizes. In addition to the PAC design, we also present our simulated implementation and evaluation of this work.
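The idea described above, bucketing outstanding requests by physical page and merging adjacent requests within a page into wider accesses (loosely mimicking flexible-size MSHR merging), can be illustrated with a minimal sketch. This is not the paper's implementation: the page size, the `max_span` cap, and the merge policy are illustrative assumptions.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # assumed physical page size for illustration


def coalesce_requests(requests, max_span=256):
    """Toy page-granularity coalescer.

    `requests` is a list of (address, size) memory accesses. Requests are
    first bucketed by physical page, then adjacent or overlapping requests
    within a page are merged into wider accesses, subject to an assumed
    device limit `max_span` on the merged request size.
    """
    pages = defaultdict(list)
    for addr, size in requests:
        pages[addr // PAGE_SIZE].append((addr, size))

    coalesced = []
    for _, reqs in sorted(pages.items()):
        reqs.sort()
        cur_start, cur_end = reqs[0][0], reqs[0][0] + reqs[0][1]
        for addr, size in reqs[1:]:
            # Merge if the next request touches the current span and the
            # merged span stays within the size cap; otherwise emit.
            if addr <= cur_end and max(cur_end, addr + size) - cur_start <= max_span:
                cur_end = max(cur_end, addr + size)
            else:
                coalesced.append((cur_start, cur_end - cur_start))
                cur_start, cur_end = addr, addr + size
        coalesced.append((cur_start, cur_end - cur_start))
    return coalesced
```

For example, `coalesce_requests([(0, 16), (16, 16), (48, 16), (4096, 32)])` merges the two contiguous 16-byte reads into one 32-byte access, leaves the non-adjacent request at 48 alone, and keeps the request in the next page separate, yielding `[(0, 32), (48, 16), (4096, 32)]`.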




Published In

HPDC '20: Proceedings of the 29th International Symposium on High-Performance Parallel and Distributed Computing
June 2020
246 pages
ISBN: 9781450370523
DOI: 10.1145/3369583

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. 3D-stacked memory
  2. data-intensive computing
  3. memory coalescing

Qualifiers

  • Research-article

Conference

HPDC '20

Acceptance Rates

Overall Acceptance Rate 166 of 966 submissions, 17%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 147
  • Downloads (last 12 months): 7
  • Downloads (last 6 weeks): 0

Reflects downloads up to 19 Nov 2024
