Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

High-performance fractal coherence

Published: 24 February 2014 Publication History

Abstract

Bugs in cache coherence protocols can cause system failures. Despite many advances, verification runs into state explosion for even moderately-sized systems. As multicores' core counts increase, coherence verifiability continues to be a key problem. A recent proposal, called fractal coherence, avoids the state explosion problem by applying the idea of observational equivalence between a larger system and its smaller sub-systems. A fractal protocol for a larger system is verified by design if a minimal sub-system is verified completely. While fractal coherence is a significant step forward, there are two shortcomings: (1) Architectural limitation: To achieve fractal coherence's logical hierarchy, TreeFractal, the specific fractal protocol, employs a tree architecture where each miss traverses many levels up and down the tree and each level redundantly holds its sub-trees' coherence tags. (2) Protocol restrictions: TreeFractal imposes a restriction on responses to read requests that forces read requests to obtain clean blocks from the nearest sharer even if the shared L2 or L3 is faster. These limitations impose significant performance and coherence tag state overheads. In this paper, we propose architectural support for coherence protocols to achieve scalable performance and verifiability. To address the architectural limitation, we propose FlatFractal, a directory-based architecture which decouples fractal coherence's logical hierarchy from the architecture and eliminates redundant tag state. To address the protocol restriction, we propose a simple change to the protocol that, while preserving observational equivalence, allows read requests to obtain the blocks from the shared L2 or L3. Our simulations show that for 16 cores, FlatFractal performs, on average, 57% better than TreeFractal and within 3% of a conventional directory.

References

[1]
Open source development labs database test suite 2 v0.40 http://osdldbt.sourceforge.net/.
[2]
Postgresql. v9.2.0. http://www.postgresql.org/.
[3]
The standard performance evaluation corporation. SPECjbb2005 suite. http://www.spec.org/jbb2005/.
[4]
A. R. Alameldeen and D. A. Wood. Variability in architectural simulations of multi-threaded workloads. In Proceedings of the 9th International Symposium on High-Performance Computer Architecture, 2003.
[5]
AMD. Revision Guide for AMD Family 14H Models 00h-0Fh Processors, revision 3.,18. February 2013.
[6]
P. Barford and M. Crovella. Generating representative web workloads for network and server performance evaluation. In Proceedings of the 1998 ACM SIGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems, pages 151--160, 1998. ACM.
[7]
B. Bentley. Simulation Driven Verification. Presentation at Design Automation Summer School. 2009.
[8]
D. Culler, J. Singh, and A. Gupta. Parallel Computer Architecture: A Hardware/Software Approach. Morgan Kaufmann, 1st edition, 1998. The Morgan Kaufmann Series in Computer Architecture and Design.
[9]
D. Dill, A. Drexler, A. Hu, and C. Yang. Protocol verification as a hardware design aid. In Proceedings of the IEEE 1992 International Conference on Computer Design: VLSI in Computers and Processors, pages 522--525, 1992.
[10]
A. Felty and F. Stomp. A correctness proof of a cache coherence protocol. In Proceedings of the 11th Annual Conference on Computer Assurance, pages 128--141, Oct 1992.
[11]
H. Garavel, F. Lang, R. Mateescu, and W. Serwe. CADP 2010: A toolbox for the construction and analysis of distributed processes. In P. Abdulla and K. Leino, editors, Tools and Algorithms for the Construction and Analysis of Systems, volume 6605 of Lecture Notes in Computer Science, pages 372--387. Springer Berlin / Heidelberg, 2011. 10.1007/978-3-642-19835-9_33.
[12]
S. Gjessing, S. Krogdahl, and E. Munthe-Kaas. A linked list cache coherence protocol: verifying the bottom layer. In Proceedings of the 5th International Parallel Processing Symposium, pages 324--329, 1991.
[13]
S. Gjessing, S. Krogdahl, and E. Munthe-Kaas. A top down approach to the formal specification of SCI cache coherence. In Computer Aided Verification, 3rd International Workshop, pages 83--91. Springer, 1991.
[14]
Intel. Intel Xeon Processor E7--800/4800/2800 Product Families Specification Update. August 2013.
[15]
C. N. Ip and D. L. Dill. Better verification through symmetry. Formal Methods in System Design, 9(1-2):41--75, Aug. 1996.
[16]
J. Laudon and D. Lenoski. The SGI Origin: a ccNUMA highly scalable server. In Proceedings of the 24th Annual International Symposium on Computer Architecture, pages 241--251, 1997. ACM.
[17]
D. Lenoski, J. Laudon, K. Gharachorloo, A. Gupta, and J. Hennessy. The directory-based cache coherence protocol for the DASH multiprocessor. In Proceedings of the 17th Annual International Symposium on Computer Architecture, pages 148--159, 1990. ACM.
[18]
P. S. Magnusson, M. Christensson, J. Eskilson, D. Forsgren, G. Hållberg, J. Högberg, F. Larsson, A. Moestedt, and B. Werner. Simics: A full system simulation platform. Computer, 35(2):50--58, Feb. 2002.
[19]
M. M. K. Martin, D. J. Sorin, B. M. Beckmann, M. R. Marty, M. Xu, A. R. Alameldeen, K. E. Moore, M. D. Hill, and D. A. Wood. Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset. SIGARCH Computer Architecture News, 33(4):92--99, Nov. 2005.
[20]
M. R. Marty and M. D. Hill. Virtual hierarchies to support server consolidation. In Proceedings of the 34th Annual International Symposium on Computer Architecture, pages 46--56, New York, NY, USA, 2007. ACM.
[21]
R. Milner. A Calculus of Communicating Systems. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 1982.
[22]
S. Park and D. L. Dill. Verification of FLASH cache coherence protocol by aggregation of distributed transactions. In Proceedings of the 8th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 288--296, 1996. ACM.
[23]
F. Pong and M. Dubois. The verification of cache coherence protocols. In Proceedings of the 5th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 11--20, 1993. ACM.
[24]
F. Pong and M. Dubois. Verification techniques for cache coherence protocols. ACM Computing Surveys, 29(1):82--126, Mar. 1997.
[25]
F. Pong and M. Dubois. Formal verification of complex coherence protocols using symbolic state models. Journal of the ACM, 45(4):557--587, July 1998.
[26]
F. Pong, A. Nowatzyk, G. Aybay, and M. Dubois. Verifying distributed directory-based cache coherence protocols: S3.mp, a case study. In Euro-Par '95: Parallel Processing, volume 966 of Lecture Notes in Computer Science, pages 287--300, Springer Berlin/Heidelberg, 1995.
[27]
U. Stern and D. Dill. Automatic verification of the SCI cache coherence protocol. In Correct Hardware Design and Verification Methods: IFIP WG10.5 Advanced Research Working Conference Proceedings, 21--34, 1995.
[28]
S. Thoziyoor, N. Muralimanohar, J. H. Ahn, and N. P. Jouppi. Cacti 5.1. Technical report, HP Technical Report, 2008.
[29]
C. tsun Chou, P. K. Mannava, and S. Park. A simple method for parameterized verification of cache coherence protocols. In Formal Methods in Computer Aided Design, pages 382--398. Springer, 2004.
[30]
Apache HTTP Server v2.2.11. The apache software foundation. http://httpd.apache.org/.
[31]
K. Williams and R. Esser. Verification of the Futurebus+ cache coherence protocol: a case study in model checking. In Proceedings of the 27th Australasian Conference on Computer Science -- Volume 26, pages 65--71, Darlinghurst, Australia, Australia, 2004. Australian Computer Society, Inc.
[32]
A. W. Wilson, Jr. Hierarchical cache/bus architecture for shared memory multiprocessors. In Proceedings of the 14th Annual International Symposium on Computer Architecture, pages 244--252, 1987. ACM.
[33]
S. C. Woo, M. Ohara, E. Torrie, J. P. Singh, and A. Gupta. The SPLASH-2 programs: characterization and methodological considerations. In Proceedings of the 22nd Annual International Symposium on Computer Architecture, pages 24--36, 1995. ACM.
[34]
P. Yeung and K. Larsen. Practical assertion-based formal verification for SoC designs. In Proceedings of the 2005 International Symposium on System-on-Chip, pages 58--61, 2005.
[35]
M. Zhang, A. R. Lebeck, and D. J. Sorin. Fractal coherence: Scalably verifiable cache coherence. In Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, pages 471--482, Washington, DC, USA, 2010. IEEE Computer Society.

Index Terms

  1. High-performance fractal coherence

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM SIGARCH Computer Architecture News
    ACM SIGARCH Computer Architecture News  Volume 42, Issue 1
    ASPLOS '14
    March 2014
    729 pages
    ISSN:0163-5964
    DOI:10.1145/2654822
    Issue’s Table of Contents
    • cover image ACM Conferences
      ASPLOS '14: Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
      February 2014
      780 pages
      ISBN:9781450323055
      DOI:10.1145/2541940
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 24 February 2014
    Published in SIGARCH Volume 42, Issue 1

    Check for updates

    Author Tags

    1. cache coherence
    2. multicore
    3. verification

    Qualifiers

    • Research-article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)5
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 22 Nov 2024

    Other Metrics

    Citations

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media