Higher global bandwidth requirements for many applications and lower network cost have motivated the use of the Dragonfly network topology for high performance computing systems. In this paper we present the architecture of the Cray Cascade system, a distributed memory system based on the Dragonfly [1] network topology. We describe the structure of the system, its Dragonfly network, and the routing algorithms. We also describe a set of advanced features supporting both mainstream high performance computing applications and emerging global address space programming models.

We present a combination of performance results from prototype systems and simulation data for large systems. We demonstrate the value of the Dragonfly topology and the benefits obtained through extensive use of adaptive routing.
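
For readers unfamiliar with the topology, the following is a minimal sketch of how a canonical Dragonfly scales, using the parameter names p, a, and h from reference [1]. It models the generic topology only, with one global link between each pair of groups; it is not the Cascade configuration, which this abstract does not specify. The class and function names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dragonfly:
    """Canonical Dragonfly parameters, in the notation of reference [1]."""
    p: int  # terminals (endpoints) attached to each router
    a: int  # routers in each group
    h: int  # global (inter-group) links per router

    @property
    def router_radix(self) -> int:
        # Each router needs p terminal ports, a - 1 local ports
        # (all-to-all within the group), and h global ports.
        return self.p + (self.a - 1) + self.h

    @property
    def max_groups(self) -> int:
        # A group exposes a * h global links; with one link per pair
        # of groups it can reach a * h other groups, plus itself.
        return self.a * self.h + 1

    @property
    def max_terminals(self) -> int:
        return self.p * self.a * self.max_groups


def balanced(h: int) -> Dragonfly:
    """Balanced sizing a = 2p = 2h, as suggested in reference [1]."""
    return Dragonfly(p=h, a=2 * h, h=h)


if __name__ == "__main__":
    net = balanced(4)
    # prints: 15 33 1056
    print(net.router_radix, net.max_groups, net.max_terminals)
```

Even a modest router radix reaches a large endpoint count in this model, which is the cost argument behind the topology: every group is one global hop away from every other group.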

References

[1] J. Kim, W. J. Dally, S. Scott, and D. Abts, "Technology-Driven, Highly-Scalable Dragonfly Topology", Proc. of the International Symposium on Computer Architecture (ISCA), pages 77--88.

[2] R. Brightwell, K. Pedretti, K. Underwood, and T. Hudson, "SeaStar Interconnect: Balanced Bandwidth for Scalable Performance", IEEE Micro, 26(3):41--57, 2006.

[3] B. Alverson, D. Roweth, and L. Kaplan, "The Gemini System Interconnect", High-Performance Interconnects Symposium, pages 83--87, 2010.

[4] C. Leiserson, "Fat-trees: Universal networks for hardware-efficient supercomputing", IEEE Transactions on Computers, C-34(10):892--901, October 1985.

[5] S. Scott, D. Abts, J. Kim, and W. J. Dally, "The BlackWidow High-radix Clos Network", Proc. of the International Symposium on Computer Architecture (ISCA), pages 16--28, 2006.

[6] L. G. Valiant, "A scheme for fast parallel communication", SIAM Journal on Computing, 11(2):350--361, 1982.

[7] H. Pritchard, I. Gorodetsky, and D. Buntinas, "A uGNI based MPICH2 Nemesis Network Module for the Cray XE", Proceedings of the 18th European MPI Users' Group Conference on Recent Advances in the Message Passing Interface (EuroMPI'11), pages 110--119, Springer-Verlag, Berlin, Heidelberg, 2011.

[8] D. Abts, A. Bataineh, S. Scott, G. Faanes, J. Schwarzmeier, E. Lundberg, T. Johnson, M. Bye, and G. Schwoerer, "The Cray BlackWidow: A Highly Scalable Vector Multiprocessor", SC '07: Proceedings of the 2007 ACM/IEEE Conference on Supercomputing, pages 1--12.

[9] Cray Research, Inc., "SHMEM Technical Note for C", SG-2516 2.3, 1994.

[10] D. Chen, N. A. Eisley, P. Heidelberger, R. M. Senger, Y. Sugawara, S. Kumar, V. Salapura, D. Satterfield, B. Steinmacher-Burow, and J. Parker, "The IBM Blue Gene/Q Interconnection Fabric", IEEE Micro, 32(1):32, Jan.-Feb. 2012.

[11] T. Maruyama, "SPARC64(TM) VIIIfx: Fujitsu's New Generation Octo Core Processor for PETA Scale Computing", Proceedings of Hot Chips 21, IEEE Computer Society, 2009.

[12] B. Arimilli, R. Arimilli, V. Chung, S. Clark, W. Denzel, B. Drerup, T. Hoefler, J. Joyner, J. Lewis, J. Li, N. Ni, and R. Rajamony, "The PERCS High-Performance Interconnect", Proceedings of the 18th Symposium on High-Performance Interconnects (Hot Interconnects 2010), IEEE, Aug. 2010.


Cray Cascade: a scalable HPC system based on a Dragonfly network


Published In

SC '12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis

November 2012

1161 pages

ISBN:9781467308045

General Chair:
Jeffrey K. Hollingsworth
University of Maryland

Publisher

IEEE Computer Society Press

Washington, DC, United States

Publication History

Published: 10 November 2012

Qualifiers

Research-article

Conference

SC '12

Sponsor:

SIGHPC
SIGARCH
IEEE-CS

SC '12: International Conference for High Performance Computing, Networking, Storage and Analysis

November 10 - 16, 2012

Salt Lake City, Utah

Acceptance Rates

SC '12 Paper Acceptance Rate 100 of 461 submissions, 22%;

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Article Metrics

Total citations: 67
Total downloads: 854 (21 in the last 12 months, 6 in the last 6 weeks; reflects downloads up to 25 Nov 2024)
Cited By

Besta M, Gerstenberger R, Fischer M, Podstawski M, Blach N, Egeli B, Mitenkov G, Chlapek W, Michalewicz M, Niewiadomski H, Mueller J, Hoefler T, Mohror K, Arnold D, Badia R (2023). The Graph Database Interface: Scaling Online Transactional and Analytical Graph Workloads to Hundreds of Thousands of Cores. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1-18. Online publication date: 12-Nov-2023. https://dl.acm.org/doi/10.1145/3581784.3607068

Besta M, Renc P, Gerstenberger R, Sylos Labini P, Ziogas A, Chen T, Gianinazzi L, Scheidl F, Szenes K, Carigiet A, Iff P, Kwasniewski G, Kanakagiri R, Ge C, Jaeger S, Wąs J, Vella F, Hoefler T, Mohror K, Arnold D, Badia R (2023). High-Performance and Programmable Attentional Graph Neural Networks with Global Tensor Formulations. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1-16. Online publication date: 12-Nov-2023. https://dl.acm.org/doi/10.1145/3581784.3607067

Wang X (2023). A Study of Simulating Heterogeneous Workloads on Large-scale Interconnect Network. Proceedings of the 2023 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, pages 58-59. Online publication date: 21-Jun-2023. https://dl.acm.org/doi/10.1145/3573900.3593636

Jha S, Patke A, Brandt J, Gentile A, Lim B, Showerman M, Bauer G, Kaplan L, Kalbarczyk Z, Kramer W, Iyer R, Bhagwan R, Porter G (2020). Measuring congestion in high-performance datacenter interconnects. Proceedings of the 17th Usenix Conference on Networked Systems Design and Implementation, pages 37-58. Online publication date: 25-Feb-2020. https://dl.acm.org/doi/10.5555/3388242.3388246

Alzaid Z, Bhowmik S, Yuan X, Lang M, Ayguadé E, Hwu W, Badia R, Hofstee H (2020). Global link arrangement for practical Dragonfly. Proceedings of the 34th ACM International Conference on Supercomputing, pages 1-11. Online publication date: 29-Jun-2020. https://dl.acm.org/doi/10.1145/3392717.3392756

Mollah M, Wang W, Faizian P, Rahman M, Yuan X, Pakin S, Lang M (2019). Modeling Universal Globally Adaptive Load-Balanced Routing. ACM Transactions on Parallel Computing, 6(2), pages 1-23. Online publication date: 30-Aug-2019. https://dl.acm.org/doi/10.1145/3349620

Navaridas J, Lant J, Pascual J, Luján M, Goodacre J (2019). Design Exploration of Multi-tier Interconnection Networks for Exascale Systems. Proceedings of the 48th International Conference on Parallel Processing, pages 1-10. Online publication date: 5-Aug-2019. https://dl.acm.org/doi/10.1145/3337821.3337903

Li C, Dong D, Liao X, Kim J, Kim C, Eigenmann R, Ding C, McKee S (2019). DeepHiR. Proceedings of the ACM International Conference on Supercomputing, pages 403-413. Online publication date: 26-Jun-2019. https://dl.acm.org/doi/10.1145/3330345.3330381

Kang Y, Wang X, McGlohon N, Mubarak M, Chunduri S, Lan Z, Jin D, Liu J, Kale L (2019). Modeling and Analysis of Application Interference on Dragonfly+. Proceedings of the 2019 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, pages 161-172. Online publication date: 29-May-2019. https://dl.acm.org/doi/10.1145/3316480.3325517

Bhatele A, Jain N, Mubarak M, Gamblin T, Jin D, Liu J, Kale L (2019). Analyzing Cost-Performance Tradeoffs of HPC Network Designs under Different Constraints using Simulations. Proceedings of the 2019 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, pages 1-12. Online publication date: 29-May-2019. https://dl.acm.org/doi/10.1145/3316480.3325516
