DOI: 10.1145/1815695.1815713

STAPL: standard template adaptive parallel library

Published: 24 May 2010

Abstract

The Standard Template Adaptive Parallel Library (STAPL) is a high-productivity parallel programming framework that extends C++ and the STL with unified support for shared and distributed memory parallelism. STAPL provides distributed data structures (pContainers), parallel algorithms (pAlgorithms), and a generic methodology for extending them to provide customized functionality. The STAPL runtime system provides the abstraction for communication and program execution. In this paper, we describe the major components of STAPL and present performance results for both algorithms and data structures showing scalability up to tens of thousands of processors.
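
To make the STL analogy concrete, the sketch below shows the style of program the abstract describes: a distributed pContainer processed by pAlgorithms, with the runtime system handling communication and execution. It is a minimal sketch under stated assumptions; the header names, the stapl_main entry point, and the p_array / p_generate / p_accumulate signatures are illustrative guesses consistent with the design above, not a verified excerpt of the library's API.

    // Illustrative sketch only: the headers, the stapl_main entry point, and the
    // p_array / p_generate / p_accumulate signatures are assumptions based on
    // the design described above, not a verified excerpt of the STAPL API.
    #include <cstddef>
    #include <p_array.h>        // assumed header for the p_array pContainer
    #include <p_algorithms.h>   // assumed header for the pAlgorithms

    // The runtime system starts the program on every location (processing
    // element); stapl_main plays the role that main plays in sequential C++.
    int stapl_main(int argc, char* argv[])
    {
      const std::size_t n = 1000000;

      // A pContainer: an array whose elements are distributed across locations.
      stapl::p_array<int> values(n);

      // pAlgorithms mirror their STL counterparts but execute in parallel over
      // the distributed elements; the runtime issues any needed communication.
      stapl::p_generate(values, [] { return 1; });
      const long sum = stapl::p_accumulate(values, 0L);

      return sum == static_cast<long>(n) ? 0 : 1;   // one contribution per element
    }

Under this model, moving from a sequential STL program to its parallel counterpart is largely a matter of swapping containers and algorithms for their p-prefixed equivalents; data distribution and the communication it implies stay hidden behind the pContainer and runtime abstractions described above.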

Information

Published In

SYSTOR '10: Proceedings of the 3rd Annual Haifa Experimental Systems Conference
May 2010
211 pages
ISBN:9781605589084
DOI:10.1145/1815695

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 May 2010

Author Tags

  1. high productivity parallel programming
  2. library
  3. parallel data structures

Qualifiers

  • Research-article

Conference

SYSTOR '10

Acceptance Rates

Overall Acceptance Rate 108 of 323 submissions, 33%

Article Metrics

  • Downloads (last 12 months): 5
  • Downloads (last 6 weeks): 0
Reflects downloads up to 13 Feb 2025

Cited By

  • (2024) Pure: Evolving Message Passing To Better Leverage Shared Memory Within Nodes. Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, pp. 133-146. DOI: 10.1145/3627535.3638503. Online publication date: 2-Mar-2024.
  • (2024) Custom Accessors: Enabling Scalable Data Ingestion, (Re-)Organization, and Analysis on Distributed Systems. 2024 IEEE International Conference on Big Data (BigData), pp. 189-198. DOI: 10.1109/BigData62323.2024.10825020. Online publication date: 15-Dec-2024.
  • (2024) Collection skeletons. Journal of Systems and Software, 213:C. DOI: 10.1016/j.jss.2024.112042. Online publication date: 1-Jul-2024.
  • (2023) A Halo abstraction for distributed n-dimensional structured grids within the C++ PGAS library DASH. PeerJ Computer Science, 9:e1203. DOI: 10.7717/peerj-cs.1203. Online publication date: 3-Feb-2023.
  • (2023) Didal: Distributed Data Library for Development of Parallel Fragmented Programs. Parallel Computing Technologies, pp. 30-41. DOI: 10.1007/978-3-031-41673-6_3. Online publication date: 15-Aug-2023.
  • (2020) Provably optimal parallel transport sweeps on semi-structured grids. Journal of Computational Physics, 109234. DOI: 10.1016/j.jcp.2020.109234. Online publication date: Jan-2020.
  • (2020) Automatic parallelization of representative-based clustering algorithms for multicore cluster systems. International Journal of Data Science and Analytics. DOI: 10.1007/s41060-020-00206-4. Online publication date: 7-Mar-2020.
  • (2019) Parallel edge-based sampling for static and dynamic graphs. Proceedings of the 16th ACM International Conference on Computing Frontiers, pp. 125-134. DOI: 10.1145/3310273.3323052. Online publication date: 30-Apr-2019.
  • (2019) An Evaluation of An Asynchronous Task Based Dataflow Approach For Uintah. 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC), pp. 652-657. DOI: 10.1109/COMPSAC.2019.10282. Online publication date: Jul-2019.
  • (2019) Nested Parallelism with Algorithmic Skeletons. Languages and Compilers for Parallel Computing, pp. 159-175. DOI: 10.1007/978-3-030-34627-0_12. Online publication date: 13-Nov-2019.