High-Level Support for Pipeline Parallelism on Many-Core Architectures

Siegfried Benkner, Enes Bajrovic, Erich Marth, Martin Sandrieser, Raymond Namyst & Samuel Thibault

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7484)

Included in the following conference series:

European Conference on Parallel Processing

Abstract

With the increasing architectural diversity of many-core architectures, the challenges of parallel programming and code portability will rise sharply. The EU project PEPPHER addresses these issues with a component-based approach to application development on top of a task-parallel execution model. Central to this approach are multi-architectural components, which encapsulate different implementation variants of application functionality tailored for different core types. An intelligent runtime system selects and dynamically schedules component implementation variants for efficient parallel execution on heterogeneous many-core architectures. On top of this model, we have developed language, compiler, and runtime support for a specific class of applications that can be expressed using the pipeline pattern. We propose C/C++ language annotations for specifying pipeline patterns and describe the associated compilation and runtime infrastructure. Experimental results indicate that our high-level approach can achieve performance comparable to manual parallelization.
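
As a rough illustration of the pipeline pattern named in the abstract, the sketch below chains stages through bounded queues so that different items can occupy different stages at the same time. It is not the PEPPHER annotation syntax, compiler, or runtime, none of which appear on this page. The names `run_pipeline`, `buffer_size`, and `_DONE` are hypothetical and chosen only for this example. Python threads are used to show the structure; in CPython they do not provide CPU parallelism for pure-Python stages, and error handling inside stages is omitted.

```python
import queue
import threading

_DONE = object()  # hypothetical sentinel marking the end of the stream


def run_pipeline(items, stages, buffer_size=4):
    """Pass every element of `items` through `stages`, in order.

    `stages` is a list of one-argument functions. Each stage runs in its
    own thread and talks to its neighbours through a bounded FIFO queue,
    so output order matches input order.
    """
    # One queue in front of each stage, plus one for the final results.
    queues = [queue.Queue(maxsize=buffer_size) for _ in range(len(stages) + 1)]

    def feeder():
        # Push the input stream into the first queue, then signal the end.
        for item in items:
            queues[0].put(item)
        queues[0].put(_DONE)

    def worker(stage, inbox, outbox):
        # Apply one stage to each incoming item until the sentinel arrives.
        while True:
            item = inbox.get()
            if item is _DONE:
                outbox.put(_DONE)  # propagate shutdown downstream
                return
            outbox.put(stage(item))

    threads = [threading.Thread(target=feeder)]
    threads += [
        threading.Thread(target=worker, args=(stage, queues[i], queues[i + 1]))
        for i, stage in enumerate(stages)
    ]
    for t in threads:
        t.start()

    # The calling thread drains the last queue, so bounded buffers never
    # fill up permanently.
    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)

    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Three illustrative stages: increment, double, convert to text.
    print(run_pipeline(range(5), [lambda x: x + 1, lambda x: x * 2, str]))
    # prints ['2', '4', '6', '8', '10']
```

The bounded queues apply back-pressure: a fast stage blocks once `buffer_size` items are waiting for its slower successor. A task-based runtime of the kind the abstract describes would presumably schedule stage executions as tasks rather than dedicate one thread per stage, but that detail is an assumption here.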

Author information

Authors and Affiliations

Research Group Scientific Computing, University of Vienna, Austria
Siegfried Benkner, Enes Bajrovic, Erich Marth & Martin Sandrieser
LaBRI-INRIA Bordeaux Sud-Ouest, University of Bordeaux, Talence, France
Raymond Namyst & Samuel Thibault

Editor information

Editors and Affiliations

University of Patras, Computer Technology Institute and Press “Diophantus”, N. Kazantzaki, 26504, Rio, Greece
Christos Kaklamanis
University of Patras, University Building B, 26504, Rio, Greece
Theodore Papatheodorou
Computer Technology Institute and Press “Diophantus”, University of Patras, N. Kazantzaki, 26504, Rio, Greece
Paul G. Spirakis

About this paper

Cite this paper

Benkner, S., Bajrovic, E., Marth, E., Sandrieser, M., Namyst, R., Thibault, S. (2012). High-Level Support for Pipeline Parallelism on Many-Core Architectures. In: Kaklamanis, C., Papatheodorou, T., Spirakis, P.G. (eds) Euro-Par 2012 Parallel Processing. Euro-Par 2012. Lecture Notes in Computer Science, vol 7484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32820-6_61

DOI: https://doi.org/10.1007/978-3-642-32820-6_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32819-0
Online ISBN: 978-3-642-32820-6
eBook Packages: Computer Science, Computer Science (R0)
