Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3437801.3441603acmconferencesArticle/Chapter ViewAbstractPublication PagesppoppConference Proceedingsconference-collections
research-article

Modernizing parallel code with pattern analysis

Published: 17 February 2021 Publication History

Abstract

Fifty years of parallel programming has generated a substantial legacy parallel codebase, creating a new portability challenge: re-parallelizing already parallel code. Our solution exploits inherently portable parallel patterns, and addresses the challenge of identifying patternization opportunities in legacy parallel code via constraint matching on traced dynamic dataflow graphs. Notably, this makes the analysis source-independent and equally applicable to sequential and parallel legacy code. We identify various map and reduction patterns, including compositions, in Pthreads code. Experiments with the Starbench suite show that our analysis is effective (finding 86% of the patterns known in the literature), accurate (reporting actual patterns in 98% of the cases), and efficient (scaling linearly with the size of the execution traces). We re-express the found patterns via a parallel pattern library, making code freely portable across CPU/GPU systems and performing competitively with hand-tuned implementations at zero additional effort.

References

[1]
J. R. Allen, Ken Kennedy, Carrie Porterfield, and Joe Warren. 1983. Conversion of Control Dependence to Data Dependence. In Principles of Programming Languages. ACM.
[2]
Michael Andersch, Chi Ching Chi, and Ben Juurlink. 2012. Using OpenMP superscalar for parallelization of embedded and consumer applications. In Embedded Computer Systems. IEEE.
[3]
Michael Andersch, Ben Juurlink, and Chi Ching Chi. 2011. A benchmark suite for evaluating parallel programming models. In Parallel Systems and Algorithms. Springer.
[4]
Roberto Baldoni, Emilio Coppa, Daniele Cono D'elia, Camil Demetrescu, and Irene Finocchi. 2018. A Survey of Symbolic Execution Techniques. Comput. Surveys 51, 3 (2018).
[5]
Marc Berndl and Laurie Hendren. 2003. Dynamic Profiling and Trace Cache Generation. In Code Generation and Optimization. IEEE.
[6]
Christian Bienia, Sanjeev Kumar, Jaswinder Pal Singh, and Kai Li. 2008. The PARSEC Benchmark Suite: Characterization and Architectural Implications. In Parallel Architectures and Compilation Techniques. ACM.
[7]
Eric Bodden, Laurie Hendren, Patrick Lam, Ondřej Lhoták, and Nomair A. Naeem. 2007. Collaborative Runtime Verification with Trace-matches. In Runtime Verification. Springer.
[8]
K. J. Brown, A. K. Sujeeth, H. J. Lee, T. Rompf, H. Chafi, M. Odersky, and K. Olukotun. 2011. A Heterogeneous Parallel Framework for Domain-Specific Languages. In Parallel Architectures and Compilation Techniques. IEEE.
[9]
Shuai Che, Michael Boyer, Jiayuan Meng, David Tarjan, Jeremy W. Sheaffer, Sang-Ha Lee, and Kevin Skadron. 2009. Rodinia: A Benchmark Suite for Heterogeneous Computing. In IEEE International Symposium on Workload Characterization. IEEE.
[10]
Geoffrey G. Chu. 2011. Improving combinatorial optimization. Ph.D. Dissertation. The University of Melbourne, Australia.
[11]
Daniele De Sensi, Tiziano De Matteis, Massimo Torquati, Gabriele Mencagli, and Marco Danelutto. 2017. Bringing Parallel Patterns Out of the Corner: The P3 ARSEC Benchmark Suite. ACM Transactions on Architecture and Code Optimization 14, 4 (2017).
[12]
Jeffrey Dean and Sanjay Ghemawat. 2008. MapReduce: Simplified Data Processing on Large Clusters. Commun. ACM 51, 1 (2008).
[13]
Etem Deniz and Alper Sen. 2016. Using Machine Learning Techniques to Detect Parallel Patterns of Multi-threaded Applications. International Journal of Parallel Programming 44, 4 (2016).
[14]
E. Deniz, A. Sen, B. Kahne, and J. Holt. 2015. MINIME: Pattern-Aware Multicore Benchmark Synthesizer. IEEE Trans. Comput. 64, 8 (2015).
[15]
Bruno Dufour, Karel Driesen, Laurie Hendren, and Clark Verbrugge. 2003. Dynamic Metrics for Java. In Object-Oriented Programing, Systems, Languages, and Applications. ACM.
[16]
August Ernstsson, Lu Li, and Christoph Kessler. 2018. SkePU 2: Flexible and Type-Safe Skeleton Programming for Heterogeneous Parallel Systems. International Journal of Parallel Programming 46, 1 (2018).
[17]
Fabian Gruber, Manuel Selva, Diogo Sampaio, Christophe Guillon, Antoine Moynault, Louis-Noël Pouchet, and Fabrice Rastello. 2019. Data-flow/Dependence Profiling for Structured Transformations. In Principles and Practice of Parallel Programming. ACM.
[18]
Jiawei Han, Hong Cheng, Dong Xin, and Xifeng Yan. 2007. Frequent Pattern Mining: Current Status and Future Directions. Data Mining and Knowledge Discovery 15, 1 (2007).
[19]
Gabriel Hjort Blindell. 2016. Instruction Selection: Principles, Methods, and Applications. Springer.
[20]
Zia Ul Huda, Rohit Atre, Ali Jannesari, and Felix Wolf. 2016. Automatic Parallel Pattern Detection in the Algorithm Structure Design Space. In International Parallel and Distributed Processing Symposium. IEEE.
[21]
Alain Ketterlin and Philippe Clauss. 2012. Profiling Data-Dependence to Assist Parallelization: Framework, Scope, and Optimization. In IEEE/ACM International Symposium on Microarchitecture. IEEE.
[22]
Minjang Kim, Hyesoon Kim, and Chi-Keung Luk. 2010. SD3: A Scalable Approach to Dynamic Data-Dependence Profiling. In IEEE/ACM International Symposium on Microarchitecture. IEEE.
[23]
Andreas Knüpfer, Dieter Kranzlmüller, and Wolfgang Nagel. 2004. Detection of Collective MPI Operation Patterns. In European PVM/MPI Users' Group Meeting. Springer.
[24]
Dieter Kranzlmüller, Andreas Knüpfer, and Wolfgang Nagel. 2004. Pattern Matching of Collective MPI Operations. In Parallel and Distributed Processing Techniques and Applications. CSREA Press.
[25]
Jim Kukunas. 2015. Power and Performance. Morgan Kaufmann.
[26]
Chris Lattner and Vikram Adve. 2004. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In Code Generation and Optimization. IEEE.
[27]
Jinsoo Lee, Wook-Shin Han, Romans Kasperovics, and Jeong-Hoon Lee. 2012. An In-Depth Comparison of Subgraph Isomorphism Algorithms in Graph Databases. Proceedings of the VLDB Endowment 6, 2 (2012).
[28]
Zhen Li, Ali Jannesari, and Felix Wolf. 2013. Discovery of Potential Parallelism in Sequential Programs. In Parallel Processing. IEEE.
[29]
Stanislav Manilov, Christos Vasiladiotis, and Björn Franke. 2018. Generalized Profile-Guided Iterator Recognition. In Compiler Construction. ACM.
[30]
Timothy Mattson, Beverly Sanders, and Berna Massingill. 2004. Patterns for Parallel Programming. Addison-Wesley.
[31]
Michael McCool, James Reinders, and Arch Robison. 2012. Structured Parallel Programming: Patterns for Efficient Computation. Morgan Kaufmann.
[32]
S. Midkiff. 2012. Automatic Parallelization: An Overview of Fundamental Compiler Techniques. Morgan & Claypool.
[33]
Nicholas Nethercote and Alan Mycroft. 2003. Redux: A Dynamic Dataflow Tracer. Electronic Notes in Theoretical Computer Science 89, 2 (2003).
[34]
Nicholas Nethercote and Julian Seward. 2007. Valgrind: A Framework for Heavyweight Dynamic Binary Instrumentation. In Programming Language Design and Implementation. ACM.
[35]
Nicholas Nethercote, Peter J. Stuckey, Ralph Becket, Sebastian Brand, Gregory J. Duck, and Guido Tack. 2007. MiniZinc: Towards a Standard CP Modelling Language. In Principles and Practice of Constraint Programming. Springer.
[36]
Tomas Öhberg, August Ernstsson, and Christoph Kessler. 2019. Hybrid CPU-GPU execution support in the skeleton programming framework SkePU. The Journal of Supercomputing (2019).
[37]
Jason A. Poovey, Brian P. Railing, and Thomas M. Conte. 2011. Parallel Pattern Detection for Architectural Improvements. In Hot Topics in Parallelism. USENIX.
[38]
R. Preissl, B. R. d. Supinski, M. Schulz, D. J. Quinlan, D. Kranzlmüller, and T. Panas. 2010. Exploitation of Dynamic Communication Patterns through Static Analysis. In Parallel Processing. IEEE.
[39]
S. Prema, Rupesh Nasre, R. Jehadeesan, and B. K. Panigrahi. 2019. A study on popular auto-parallelization frameworks. Concurrency and Computation: Practice and Experience 31, 17 (2019).
[40]
Francesca Rossi, Peter van Beek, and Toby Walsh (Eds.). 2006. Handbook of Constraint Programming. Elsevier.
[41]
Sean Rul, Hans Vandierendonck, and Koen De Bosschere. 2010. A Profile-based Tool for Finding Pipeline Parallelism in Sequential Programs. Parallel Comput. 36, 9 (2010).
[42]
Mehrzad Samadi, Davoud Anoushe Jamshidi, Janghaeng Lee, and Scott Mahlke. 2014. Paraprox: Pattern-based Approximation for Data Parallel Applications. In Architectural Support for Programming Languages and Operating Systems. ACM.
[43]
Howard M. Shapiro. 2003. Practical Flow Cytometry. Wiley.
[44]
The Clang Team. 2020. DataFlowSanitizer. https://clang.llvm.org/docs/DataFlowSanitizer.html
[45]
William Thies, Vikram Chandrasekhar, and Saman Amarasinghe. 2007. A Practical Approach to Exploiting Coarse-Grained Pipeline Parallelism in C Programs. In IEEE/ACM International Symposium on Microarchitecture. IEEE.
[46]
Georgios Tournavitis and Björn Franke. 2010. Semi-automatic Extraction and Exploitation of Hierarchical Pipeline Parallelism Using Profiling Information. In Parallel Architectures and Compilation Techniques. ACM.
[47]
Georgios Tournavitis, Zheng Wang, Björn Franke, and Michael F.P. O'Boyle. 2009. Towards a Holistic Approach to Auto-Parallelization: Integrating Profile-Driven Parallelism Detection and Machine-Learning Based Mapping. In Programming Language Design and Implementation. ACM.
[48]
Michael Voss, Rafael Asenjo, and James Reinders. 2019. Pro TBB: C++ Parallel Programming with Threading Building Blocks. Apress.
[49]
Mani Zandifar, Mustafa Abdul Jabbar, Alireza Majidi, David Keyes, Nancy M. Amato, and Lawrence Rauchwerger. 2015. Composing Algorithmic Skeletons to Express High-Performance Scientific Applications. In International Conference on Supercomputing. ACM.

Cited By

View all
  • (2022)Pattern-based Autotuning of OpenMP Loops using Graph Neural Networks2022 IEEE/ACM International Workshop on Artificial Intelligence and Machine Learning for Scientific Applications (AI4S)10.1109/AI4S56813.2022.00010(26-31)Online publication date: Nov-2022

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
PPoPP '21: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
February 2021
507 pages
ISBN:9781450382946
DOI:10.1145/3437801
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 February 2021

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. code modernization
  2. dynamic analysis
  3. parallel patterns
  4. pattern matching

Qualifiers

  • Research-article

Funding Sources

  • EPSRC

Conference

PPoPP '21

Acceptance Rates

PPoPP '21 Paper Acceptance Rate 31 of 150 submissions, 21%;
Overall Acceptance Rate 230 of 1,014 submissions, 23%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)58
  • Downloads (Last 6 weeks)11
Reflects downloads up to 12 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2022)Pattern-based Autotuning of OpenMP Loops using Graph Neural Networks2022 IEEE/ACM International Workshop on Artificial Intelligence and Machine Learning for Scientific Applications (AI4S)10.1109/AI4S56813.2022.00010(26-31)Online publication date: Nov-2022

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media