Interactive Trace-Based Analysis Toolset for Manual Parallelization of C Programs

Authors:

Mihai T. Lazarescu,

Luciano Lavagno

ACM Transactions on Embedded Computing Systems (TECS), Volume 14, Issue 1

Article No.: 13, Pages 1 - 20

https://doi.org/10.1145/2638556

Published: 21 January 2015

Abstract

Massive amounts of legacy sequential code need to be parallelized to make better use of modern multiprocessor architectures. Nevertheless, writing parallel programs is still a difficult task. Automated parallelization methods can be effective at the statement and loop levels and, more recently, at the task level, but they are still restricted to specific source code constructs or application domains. In this article, we present an innovative toolset that supports developers when they analyze code manually and make parallelization decisions. It automatically collects the program profile and data dependencies and represents them in an interactive graphical format that facilitates the analysis and discovery of manual parallelization opportunities. The toolset can be used for arbitrary sequential C programs and parallelization patterns. Its program-scope data dependency tracing at runtime can complement tools based on static code analysis and can, in turn, benefit from them. We also evaluated the effectiveness of the toolset in terms of the time needed to reach parallelization decisions and the quality of those decisions, and measured a significant improvement for several representative real-world applications.

Cited By

X. Do, S. Louise, and A. Cohen. 2019. Design and Performance Analysis of Real-Time Dynamic Streaming Applications. Languages and Compilers for Parallel Computing, 21--36. Online publication date: 13-Nov-2019. https://doi.org/10.1007/978-3-030-34627-0_2

N. Sultana, A. Rao, Z. Jin, P. Pashakhanloo, H. Zhu, K. Zhong, B. Loo, Y. Shoshitaishvili, and M. Naik. 2018. Making Break-ups Less Painful. Proceedings of the 2018 Workshop on Forming an Ecosystem Around Software Transformation, 14--19. Online publication date: 15-Oct-2018. https://dl.acm.org/doi/10.1145/3273045.3273046

Y. Wang and K. Kent. 2017. A Region-Based Approach to Pipeline Parallelism in Java Programs on Multicores. 2017 25th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP), 124--131. Online publication date: 2017. https://doi.org/10.1109/PDP.2017.69

Index Terms

1. Computing methodologies
  1. Parallel computing methodologies
    1. Parallel programming languages
2. Software and its engineering
  1. Software notations and tools
    1. General programming languages
      1. Language types
        1. Parallel programming languages

Published In

ACM Transactions on Embedded Computing Systems Volume 14, Issue 1

January 2015

443 pages

ISSN: 1539-9087

EISSN: 1558-3465

DOI: 10.1145/2724585

Editor:
Sandeep K. Shukla
Virginia Tech, USA

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Journal Family

ACM Journals for the Design of Smart and Connected Systems

Publication History

Published: 21 January 2015

Accepted: 01 June 2014

Revised: 01 December 2013

Received: 01 December 2012

Published in TECS Volume 14, Issue 1

Qualifiers

Research-article
Research
Refereed

Funding Sources

European Commission in the context of the FP7 HEAP and PHARAON projects
