DOI: 10.1145/3357526.3357557
Public Access

Statistical caching for near memory management

Published: 30 September 2019

Abstract

Modern GPUs often use near memory, or high-bandwidth memory, which may be managed as a cache when the application data is too large to fit in the near memory. Unlike CPU caches, a near memory cache is much larger. A recent approach is statistical caching, which has shown near-optimal results when managing large memory for file caching.
The prior work is idealized and not yet practical. This paper outlines two extensions. It first formulates a new caching algorithm called least expected use (LEU) replacement and shows, through examples, that the statistical solution automatically integrates two otherwise disparate policies. The paper then describes a system design to implement LEU. To position the new design for discussion, the paper draws parallels with two familiar ideas, branch prediction and spectral analysis, and considers the opportunities and challenges of achieving statistical caching in near memory.
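
To make the LEU idea concrete, the following is a minimal sketch, in Python, of one way a least-expected-use style policy could be simulated. It is an illustrative assumption rather than the paper's formulation: each block keeps a record of its observed reuse intervals, its "expected use" is estimated as the probability of reuse within a fixed horizon given its current age, and the resident block with the lowest estimate is evicted on a miss. The class name StatCache, the horizon parameter, and the estimation rule are all hypothetical.

    from collections import defaultdict

    class StatCache:
        """Toy simulator of an LEU-style policy: on a miss with a full cache,
        evict the resident block whose estimated probability of reuse within
        a fixed horizon is lowest, using the reuse intervals observed so far
        in the trace for each block."""

        def __init__(self, capacity, horizon=256):
            self.capacity = capacity
            self.horizon = horizon
            self.time = 0                         # logical clock: one tick per access
            self.last_access = {}                 # resident block -> last access time
            self.reuse_hist = defaultdict(list)   # block -> observed reuse intervals

        def _expected_use(self, block):
            # Fraction of past reuse intervals that would bring the next reuse
            # of this block within `horizon` accesses, given its current age.
            intervals = self.reuse_hist[block]
            if not intervals:
                return 0.0
            age = self.time - self.last_access[block]
            return sum(1 for r in intervals
                       if age < r <= age + self.horizon) / len(intervals)

        def access(self, block):
            """Process one reference; return True on a hit, False on a miss."""
            self.time += 1
            hit = block in self.last_access
            if hit:
                self.reuse_hist[block].append(self.time - self.last_access[block])
            elif len(self.last_access) >= self.capacity:
                victim = min(self.last_access, key=self._expected_use)
                del self.last_access[victim]
            self.last_access[block] = self.time
            return hit

    # Example: a cyclic trace of 4 blocks in a 3-block cache.
    trace = [1, 2, 3, 4] * 8
    cache = StatCache(capacity=3, horizon=8)
    hits = sum(cache.access(b) for b in trace)
    print(hits, "hits out of", len(trace), "accesses")

Under this estimate, a block whose past reuse intervals mostly exceed the horizon is evicted early, while a block with short, frequent reuse intervals is retained, which loosely illustrates how a single statistical criterion might integrate otherwise disparate eviction behaviors, as the abstract suggests.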


Published In

MEMSYS '19: Proceedings of the International Symposium on Memory Systems
September 2019
517 pages
ISBN: 9781450372060
DOI: 10.1145/3357526

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Short-paper

Conference

MEMSYS '19: The International Symposium on Memory Systems
September 30 - October 3, 2019
Washington, DC, USA
