DOI: 10.1145/1815695.1815713

STAPL: standard template adaptive parallel library

Published: 24 May 2010

Abstract

The Standard Template Adaptive Parallel Library (STAPL) is a high-productivity parallel programming framework that extends C++ and the STL with unified support for shared and distributed memory parallelism. STAPL provides distributed data structures (pContainers), parallel algorithms (pAlgorithms), and a generic methodology for extending them to provide customized functionality. The STAPL runtime system provides the abstraction for communication and program execution. In this paper, we describe the major components of STAPL and present performance results for both algorithms and data structures showing scalability up to tens of thousands of processors.
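
To make the STL analogy concrete, the sketch below shows the style of program the abstract describes: a distributed pContainer processed by pAlgorithms, with the runtime system handling communication and execution. It is a minimal sketch under stated assumptions; the header names, the stapl_main entry point, and the p_array / p_generate / p_accumulate signatures are illustrative guesses consistent with the design above, not a verified excerpt of the library's API.

    // Illustrative sketch only: the headers, the stapl_main entry point, and the
    // p_array / p_generate / p_accumulate signatures are assumptions based on
    // the design described above, not a verified excerpt of the STAPL API.
    #include <cstddef>
    #include <p_array.h>        // assumed header for the p_array pContainer
    #include <p_algorithms.h>   // assumed header for the pAlgorithms

    // The runtime system starts the program on every location (processing
    // element); stapl_main plays the role that main plays in sequential C++.
    int stapl_main(int argc, char* argv[])
    {
      const std::size_t n = 1000000;

      // A pContainer: an array whose elements are distributed across locations.
      stapl::p_array<int> values(n);

      // pAlgorithms mirror their STL counterparts but execute in parallel over
      // the distributed elements; the runtime issues any needed communication.
      stapl::p_generate(values, [] { return 1; });
      const long sum = stapl::p_accumulate(values, 0L);

      return sum == static_cast<long>(n) ? 0 : 1;   // one contribution per element
    }

Under this model, moving from a sequential STL program to its parallel counterpart is largely a matter of swapping containers and algorithms for their p-prefixed equivalents; data distribution and the communication it implies stay hidden behind the pContainer and runtime abstractions described above.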

Information

Published In

SYSTOR '10: Proceedings of the 3rd Annual Haifa Experimental Systems Conference
May 2010
211 pages
ISBN:9781605589084
DOI:10.1145/1815695

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 May 2010

Author Tags

  1. high productivity parallel programming
  2. library
  3. parallel data structures

Qualifiers

  • Research-article

Conference

SYSTOR '10

Acceptance Rates

Overall Acceptance Rate 108 of 323 submissions, 33%

Article Metrics

  • Downloads (last 12 months): 5
  • Downloads (last 6 weeks): 0
Reflects downloads up to 13 Feb 2025

Cited By

  • (2024) Pure: Evolving Message Passing To Better Leverage Shared Memory Within Nodes. Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, pp. 133-146. DOI: 10.1145/3627535.3638503. Online publication date: 2-Mar-2024.
  • (2024) Custom Accessors: Enabling Scalable Data Ingestion, (Re-)Organization, and Analysis on Distributed Systems. 2024 IEEE International Conference on Big Data (BigData), pp. 189-198. DOI: 10.1109/BigData62323.2024.10825020. Online publication date: 15-Dec-2024.
  • (2024) Collection skeletons. Journal of Systems and Software, 213:C. DOI: 10.1016/j.jss.2024.112042. Online publication date: 1-Jul-2024.
  • (2023) A Halo abstraction for distributed n-dimensional structured grids within the C++ PGAS library DASH. PeerJ Computer Science, 9:e1203. DOI: 10.7717/peerj-cs.1203. Online publication date: 3-Feb-2023.
  • (2023) Didal: Distributed Data Library for Development of Parallel Fragmented Programs. Parallel Computing Technologies, pp. 30-41. DOI: 10.1007/978-3-031-41673-6_3. Online publication date: 15-Aug-2023.
  • (2020) Provably optimal parallel transport sweeps on semi-structured grids. Journal of Computational Physics, 109234. DOI: 10.1016/j.jcp.2020.109234. Online publication date: Jan-2020.
  • (2020) Automatic parallelization of representative-based clustering algorithms for multicore cluster systems. International Journal of Data Science and Analytics. DOI: 10.1007/s41060-020-00206-4. Online publication date: 7-Mar-2020.
  • (2019) Parallel edge-based sampling for static and dynamic graphs. Proceedings of the 16th ACM International Conference on Computing Frontiers, pp. 125-134. DOI: 10.1145/3310273.3323052. Online publication date: 30-Apr-2019.
  • (2019) An Evaluation of An Asynchronous Task Based Dataflow Approach For Uintah. 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC), pp. 652-657. DOI: 10.1109/COMPSAC.2019.10282. Online publication date: Jul-2019.
  • (2019) Nested Parallelism with Algorithmic Skeletons. Languages and Compilers for Parallel Computing, pp. 159-175. DOI: 10.1007/978-3-030-34627-0_12. Online publication date: 13-Nov-2019.