An analytic study of dynamic hardware and software cache coherence strategies

Authors:

Harjinder S. Sandhu,

Kenneth C. Sevcik

SIGMETRICS '95/PERFORMANCE '95: Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems

Pages 167 - 177

https://doi.org/10.1145/223587.223606

Published: 01 May 1995

Abstract

Dynamic software cache coherence strategies use information about program sharing behaviour to manage caches at run-time and at a granularity defined by the application. The program-level information is obtained through annotations placed into the application by the user or the compiler. The coherence protocols may range from simple static algorithms to dynamic algorithms that use run-time data structures similar to the directories used in hardware strategies. In this paper, we present an analytic study of five dynamic software cache coherence algorithms and compare these to a representative hardware coherence strategy. The analytic model is constructed using four input parameters (write probability, locality, granularity, and system size) and solved by analysis of a Markov chain. We show that the fundamental tradeoffs between the different hardware and software strategies are captured in this model. The results of the study show that hardware schemes perform better for fine-grained data structures for much of the parameter space that we study. However, for coarse-grained data structures, various software algorithms are dominant over most of the parameter space. Further, hardware strategies are found to be more susceptible to the effects of contention, and also perform worse for the asymmetric workload that we study.
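
To make the phrase "solved by analysis of a Markov chain" concrete, the following is a minimal sketch of how a steady-state solution can be computed for a small coherence chain. It is not the model from the paper: the three block states (INVALID, SHARED, MODIFIED), the invalidation-style transitions, and the function names `transition_matrix` and `stationary` are all illustrative assumptions, and only two of the four parameters named in the abstract (write probability and locality) appear. Granularity and system size are left out entirely.

```python
# Toy Markov chain for one cached block, seen from a single processor's cache.
# ASSUMPTION: a generic invalidation protocol, not the model used in the paper.
# Each step is one reference to the block:
#   - it comes from the local processor with probability `locality`,
#     otherwise from some remote processor;
#   - it is a write with probability `write_prob`, otherwise a read.

def transition_matrix(write_prob, locality):
    """Return a row-stochastic matrix over (INVALID, SHARED, MODIFIED)."""
    local_read = locality * (1 - write_prob)
    local_write = locality * write_prob
    remote_read = (1 - locality) * (1 - write_prob)
    remote_write = (1 - locality) * write_prob
    return [
        # columns: to INVALID, to SHARED, to MODIFIED
        [remote_read + remote_write, local_read, local_write],  # from INVALID
        [remote_write, local_read + remote_read, local_write],  # from SHARED
        [remote_write, remote_read, local_read + local_write],  # from MODIFIED
    ]


def stationary(P, tol=1e-12, max_iters=100_000):
    """Approximate the stationary distribution by repeated multiplication."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    return pi


if __name__ == "__main__":
    P = transition_matrix(write_prob=0.2, locality=0.5)
    invalid, shared, modified = stationary(P)
    # In this toy chain, a local reference misses when the block is INVALID.
    print(f"P(invalid)={invalid:.4f} P(shared)={shared:.4f} P(modified)={modified:.4f}")
```

In a sketch like this, a per-reference cost could be attached to each state or transition and weighted by the stationary probabilities to compare protocols across the parameter space; the specific costs and states used in the paper are not given in the abstract.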


Published In

SIGMETRICS '95/PERFORMANCE '95: Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems

May 1995

340 pages

ISBN:0897916956

DOI:10.1145/223587

Editor:
Blaine D. Gaither
Hewlett-Packard, NSA, Fort Collins, CO

ACM SIGMETRICS Performance Evaluation Review Volume 23, Issue 1
May 1995
323 pages
ISSN:0163-5999
DOI:10.1145/223586
Editor:
Blaine D. Gaither
Hewlett-Packard, NSA, Fort Collins, CO

Copyright © 1995 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMETRICS: ACM Special Interest Group on Measurement and Evaluation
IFIP WG 7.3: IFIP WG 7.3

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Article

Conference

SIGMETRICS95

SIGMETRICS95: ACM SIGMETRICS Conference on Measurement & Modeling of Computer Systems

May 15 - 19, 1995

Ottawa, Ontario, Canada

Acceptance Rates

Overall Acceptance Rate 459 of 2,691 submissions, 17%
