Reactive NUMA: a design for unifying S-COMA and CC-NUMA

Published: 01 May 1997
DOI: 10.1145/264107.264205

Abstract

This paper proposes and evaluates a new approach to directory-based cache coherence protocols called Reactive NUMA (R-NUMA). An R-NUMA system combines a conventional CC-NUMA coherence protocol with a more-recent Simple-COMA (S-COMA) protocol. What makes R-NUMA novel is the way it dynamically reacts to program and system behavior to switch between CC-NUMA and S-COMA and exploit the best aspects of both protocols. This reactive behavior allows each node in an R-NUMA system to independently choose the best protocol for a particular page, thus providing much greater performance stability than either CC-NUMA or S-COMA alone. Our evaluation is both qualitative and quantitative. We first show the theoretical result that R-NUMA's worst-case performance is bounded within a small constant factor (i.e., two to three times) of the best of CC-NUMA and S-COMA. We then use detailed execution-driven simulation to show that, in practice, R-NUMA usually performs better than either a pure CC-NUMA or pure S-COMA protocol, and no more than 57% worse than the best of CC-NUMA and S-COMA, for our benchmarks and base system assumptions.
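
The per-node, per-page protocol choice described in the abstract can be illustrated with a small sketch. The abstract does not spell out the switching mechanism, so the code below assumes one plausible policy: each node counts how often it refetches remote data for a page and moves that page from CC-NUMA to S-COMA once the count reaches a threshold. The class name `NodePagePolicy`, the method names, and the threshold value are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    CC_NUMA = "cc-numa"
    S_COMA = "s-coma"


@dataclass
class NodePagePolicy:
    """Hypothetical per-node policy; one instance per node."""

    threshold: int = 4  # assumed value, not from the paper
    mode: dict = field(default_factory=dict)       # page -> Mode
    refetches: dict = field(default_factory=dict)  # page -> count

    def mode_of(self, page):
        # Pages start out in CC-NUMA mode in this sketch.
        return self.mode.get(page, Mode.CC_NUMA)

    def record_refetch(self, page):
        """Count one remote refetch and switch the page if needed."""
        if self.mode_of(page) is Mode.S_COMA:
            return Mode.S_COMA
        count = self.refetches.get(page, 0) + 1
        self.refetches[page] = count
        if count >= self.threshold:
            # This node now treats the page as an S-COMA page.
            self.mode[page] = Mode.S_COMA
        return self.mode_of(page)


# Two nodes touch the same page but decide independently.
node_a = NodePagePolicy(threshold=3)
node_b = NodePagePolicy(threshold=3)
for _ in range(3):
    node_a.record_refetch(page=7)
node_b.record_refetch(page=7)
print(node_a.mode_of(7), node_b.mode_of(7))
```

In this sketch node A, which refetches page 7 repeatedly, ends up using S-COMA for it, while node B keeps the same page in CC-NUMA mode, mirroring the abstract's point that each node chooses independently.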



Published In

ISCA '97: Proceedings of the 24th annual international symposium on Computer architecture
June 1997
350 pages
ISBN:0897919017
DOI:10.1145/264107
  • ACM SIGARCH Computer Architecture News, Volume 25, Issue 2
    Special Issue: Proceedings of the 24th annual international symposium on Computer architecture (ISCA '97)
    May 1997
    349 pages
    ISSN:0163-5964
    DOI:10.1145/384286

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ISCA97

Acceptance Rates

Overall Acceptance Rate 543 of 3,203 submissions, 17%


