Hardware support for protective and collaborative cache sharing

Authors:

Michael C. Huang

ISMM 2016: Proceedings of the 2016 ACM SIGPLAN International Symposium on Memory Management

Pages 24 - 35

https://doi.org/10.1145/2926697.2926705

Published: 14 June 2016

Abstract

Shared caches are generally optimized to maximize overall throughput, fairness, or both, among multiple competing programs. In shared environments and compute clouds, users are often unrelated to each other. In such circumstances, an overall gain in throughput does not justify an individual loss. This paper explores cache management policies that allow conservative sharing: they protect the cache occupancy of individual programs, yet enable full cache utilization whenever there is an opportunity to do so. We propose a hardware-based mechanism called cache rationing. Each program is assigned a portion of the shared cache as its ration. The hardware support protects the ration so that it cannot be taken away by peer programs while in use. A program can exceed its pre-allocated ration, but only by using space that another program has left unused within its own ration. We show that rationing provides good resource protection and full utilization of the shared cache for a variety of co-runs.
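The rationing policy described in the abstract can be illustrated with a small software model. The following is a minimal sketch under stated assumptions, not the hardware design from the paper: it assumes a fully associative cache, LRU ordering, rations that sum to the cache capacity, and it treats every block inside a program's ration as "in use". The class name `RationedCache` and its methods are hypothetical.

```python
from collections import OrderedDict


class RationedCache:
    """Toy model of cache rationing (illustrative assumption, not the paper's hardware)."""

    def __init__(self, rations):
        # rations: program name -> number of blocks guaranteed to that program.
        if any(r <= 0 for r in rations.values()):
            raise ValueError("every ration must be positive")
        self.rations = dict(rations)
        # Assumption: the rations partition the whole cache.
        self.capacity = sum(self.rations.values())
        # Keys are (program, address); insertion order tracks LRU (oldest first).
        self.blocks = OrderedDict()
        self.occupancy = {p: 0 for p in self.rations}

    def access(self, program, addr):
        """Return True on a hit, False on a miss (the block is then installed)."""
        key = (program, addr)
        if key in self.blocks:
            self.blocks.move_to_end(key)  # refresh LRU position
            return True
        if len(self.blocks) >= self.capacity:
            self._evict_for(program)
        # If the cache is not full, the program may borrow unused space freely.
        self.blocks[key] = None
        self.occupancy[program] += 1
        return False

    def _evict_for(self, program):
        if self.occupancy[program] < self.rations[program]:
            # The requester is under its ration and the cache is full, so some
            # peer holds borrowed space: reclaim from over-ration programs only.
            candidates = {p for p in self.rations
                          if self.occupancy[p] > self.rations[p]}
        else:
            # The requester already holds its full ration: it may only replace
            # its own blocks, so peers' rations stay protected.
            candidates = {program}
        victim = next(k for k in self.blocks if k[0] in candidates)
        del self.blocks[victim]
        self.occupancy[victim[0]] -= 1
```

In this model, a program running alone can fill the entire cache, and a peer that later starts missing reclaims space only up to its own ration, always from blocks that exceed the borrower's ration.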

Index Terms

Hardware support for protective and collaborative cache sharing
1. Hardware
  1. Integrated circuits
    1. Semiconductor memory

Published In

ISMM 2016: Proceedings of the 2016 ACM SIGPLAN International Symposium on Memory Management

June 2016

133 pages

ISBN:9781450343176

DOI:10.1145/2926697

General Chair:
Christine H. Flood,
Program Chair:
Zheng Zhang

ACM SIGPLAN Notices Volume 51, Issue 11
ISMM '16
November 2016
133 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/3241624
Editors:
Christine H. Flood (Redhat),
Zheng (Eddy) Zhang (Rutgers University)

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ISMM '16

Sponsor:

SIGPLAN

ISMM '16: International Symposium on Memory Management

June 14, 2016

Santa Barbara, CA, USA

Acceptance Rates

Overall Acceptance Rate 72 of 156 submissions, 46%
