DOI: 10.5555/602770.602794

Building a high-performance collective communication library

Published: 14 November 1994

Abstract

In this paper, we report on a project to develop a unified approach for building a library of collective communication operations that performs well on a cross-section of problems encountered in real applications. The target architecture is a two-dimensional mesh with worm-hole routing, but the techniques are more general. The approach differs from traditional library implementations in that we address the need for implementations that perform well for vectors of various sizes and for various grid dimensions, including non-power-of-two grids. We show how a general approach to hybrid algorithms yields good performance across the entire range of vector lengths. Moreover, many scalable implementations of application libraries require collective communication within groups of nodes. Our approach yields the same kind of performance for group collective communication. Results from the Intel Paragon system are included. To obtain this library for Intel systems, contact intercom@cs.utexas.edu.
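The abstract does not spell out the algorithms, so the following is only a minimal sketch of what a hybrid broadcast could look like, not the InterCom implementation. It assumes a simple latency/bandwidth cost model in which sending n items costs alpha + n * beta, and it compares two commonly used strategies: a tree broadcast (few messages, full vector every step) and a scatter followed by a ring collect (more messages, less data per message). The function names, the cost formulas, and the parameter values are illustrative assumptions; the formulas are approximations, and they use ceilings so that non-power-of-two node counts are handled.

```python
import math


def tree_broadcast_cost(p, n, alpha, beta):
    """Estimated cost of a tree broadcast among p nodes.

    Assumed model: ceil(log2 p) steps, each sending the full n-item vector.
    """
    if p <= 1:
        return 0.0
    return math.ceil(math.log2(p)) * (alpha + n * beta)


def scatter_collect_cost(p, n, alpha, beta):
    """Estimated cost of a scatter followed by a ring collect.

    Assumed model: the scatter takes ceil(log2 p) messages and the ring
    collect takes p - 1 messages; each phase moves about (p - 1) / p of
    the vector through any one node.
    """
    if p <= 1:
        return 0.0
    fraction = (p - 1) / p
    scatter = math.ceil(math.log2(p)) * alpha + fraction * n * beta
    collect = (p - 1) * alpha + fraction * n * beta
    return scatter + collect


def choose_broadcast(p, n, alpha, beta):
    """Return the name of the cheaper strategy under the assumed model."""
    tree = tree_broadcast_cost(p, n, alpha, beta)
    hybrid = scatter_collect_cost(p, n, alpha, beta)
    return "tree" if tree <= hybrid else "scatter_collect"


if __name__ == "__main__":
    # Hypothetical machine parameters: startup cost 100, per-item cost 1.
    for n in (8, 1_000, 100_000):
        print(n, choose_broadcast(p=12, n=n, alpha=100.0, beta=1.0))
```

Under this model the tree broadcast is preferred for short vectors, where startup cost dominates, and the scatter/collect variant for long vectors, where the per-item cost dominates. A library following this idea would pick the crossover point from measured machine parameters rather than fixed constants.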




Published In

Supercomputing '94: Proceedings of the 1994 ACM/IEEE conference on Supercomputing
November 1994
840 pages
ISBN:0818666056

Publisher

IEEE Computer Society Press

Washington, DC, United States

Qualifiers

  • Article

Conference

SC '94

Acceptance Rates

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%
