DOI: 10.1109/SC.2018.00007
research-article

Exploiting idle resources in a high-radix switch for supplemental storage

Published: 26 July 2019

Abstract

A general-purpose switch for a high-performance network is usually designed with symmetric ports providing credit-based flow control and error recovery via link-level retransmission. Because port buffers must be sized for the longest links and modern asymmetric network topologies have a wide range of link lengths, we observe that there can be a significant amount of unused buffer memory, particularly in edge switches. We also observe that the tiled architecture used in many high-radix switches contains an abundance of internal bandwidth. We combine these observations to create a new switch architecture that allows ports to stash packets in unused buffers on other ports, accessible via excess internal bandwidth in the tiled switch. We explore this architecture through two use cases: end-to-end resilience and congestion mitigation. We find that stashing is highly effective and does not degrade network performance.
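The buffer-sizing observation can be illustrated with a short calculation. Under credit-based flow control, a port needs enough buffering to cover roughly one credit round trip at full link bandwidth, so a port provisioned for the longest link leaves space idle when it is attached to a short one. The sketch below is a minimal illustration under stated assumptions, not the model used in the paper: the propagation speed, link lengths, bandwidth, and function names are all hypothetical, and fixed switch and serialization latencies are lumped into a single optional term.

```python
# Assumed signal propagation speed in a cable, about two thirds of c.
# This value is an illustrative assumption, not a figure from the paper.
SIGNAL_SPEED_M_PER_S = 2.0e8


def credit_round_trip_bytes(link_length_m, bandwidth_bytes_per_s,
                            fixed_latency_s=0.0):
    """Bytes of buffering needed to keep a link busy for one credit round trip."""
    round_trip_s = 2.0 * link_length_m / SIGNAL_SPEED_M_PER_S + fixed_latency_s
    return bandwidth_bytes_per_s * round_trip_s


def unused_buffer_bytes(link_length_m, longest_link_m, bandwidth_bytes_per_s,
                        fixed_latency_s=0.0):
    """Buffer left idle when a port sized for the longest link drives a shorter one."""
    provisioned = credit_round_trip_bytes(
        longest_link_m, bandwidth_bytes_per_s, fixed_latency_s)
    needed = credit_round_trip_bytes(
        link_length_m, bandwidth_bytes_per_s, fixed_latency_s)
    return provisioned - needed


if __name__ == "__main__":
    # Hypothetical example: 100 Gb/s ports (12.5e9 bytes/s), sized for a
    # 100 m link, with one port attached to a 1 m cable.
    idle = unused_buffer_bytes(1.0, 100.0, 12.5e9)
    print(f"idle buffer on the short link: {idle:.0f} bytes")
```

In this hypothetical setting almost all of the provisioned buffer on the short link goes unused, which is the kind of slack the stashing architecture aims to reclaim. Real ports also hold retransmission state and per-virtual-channel buffers, which this sketch ignores.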


Cited By

  • (2021) CIB-HIER, ACM Transactions on Architecture and Code Optimization, 18:4 (1-21), DOI: 10.1145/3468062. Online publication date: 17-Jul-2021

Published In

SC '18: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis
November 2018
932 pages

Sponsors

In-Cooperation

  • IEEE CS

Publisher

IEEE Press

Author Tags

  1. buffer storage
  2. high performance computing
  3. high-radix switches
  4. multiprocessor interconnection networks
  5. packet switching

Qualifiers

  • Research-article

Conference

SC18

Acceptance Rates

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%
