Software behavior oriented parallelization

Authors:

Chengliang Zhang

PLDI '07: Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation

Pages 223 - 234

https://doi.org/10.1145/1250734.1250760

Published: 10 June 2007

Abstract

Many sequential applications are difficult to parallelize because of unpredictable control flow, indirect data access, and input-dependent parallelism. These difficulties led us to build a software system for behavior oriented parallelization (BOP), which allows a program to be parallelized based on partial information about program behavior, for example, a user reading just part of the source code, or a profiling tool examining only one or a few executions.

The basis of BOP is programmable software speculation, where a user or an analysis tool marks possibly parallel regions in the code, and the run-time system executes these regions speculatively. It is imperative to protect the entire address space during speculation. The main goal of the paper is to demonstrate that the general protection can be made cost effective by three novel techniques: programmable speculation, critical-path minimization, and value-based correctness checking. On a recently acquired multi-core, multi-processor PC, the BOP system reduced the end-to-end execution time by integer factors for a Lisp interpreter, a data compressor, a language parser, and a scientific library, with no change to the underlying hardware or operating system.
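The idea of speculating on a marked region and then checking correctness by value can be illustrated with a minimal Python sketch. This is not the BOP interface or its implementation: the function `run_with_speculation`, the `spec_reads` annotation, and the convention that each region takes a dictionary and returns a dictionary of updates are all assumptions made for illustration. A thread and a deep copy stand in for whatever isolation mechanism the real system uses to protect the address space.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


def run_with_speculation(state, lead, spec, spec_reads):
    """Run `lead` then `spec`, overlapping them when the values allow it.

    Hypothetical interface: each region takes a dict and returns a dict of
    updates; `spec_reads` lists the keys that `spec` depends on.
    Returns True if the speculative result was committed.
    """
    # Values the speculative region will observe, saved for the later check.
    seen = copy.deepcopy({k: state[k] for k in spec_reads})
    # Private copy, so speculation cannot disturb the real state.
    snapshot = copy.deepcopy(state)

    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(spec, snapshot)   # speculative region
        state.update(lead(state))              # non-speculative region
        spec_updates = future.result()

    # Value-based check: compare values, not whether a write happened.
    if all(state[k] == seen[k] for k in spec_reads):
        state.update(spec_updates)             # commit the speculative work
        return True

    state.update(spec(state))                  # abort and re-execute in order
    return False


def sum_ab(s):
    """Example speculative region: depends on 'a' and 'b', writes 'total'."""
    return {"total": s["a"] + s["b"]}


if __name__ == "__main__":
    s = {"a": 1, "b": 2, "total": 0}
    # The lead region rewrites 'a' with the same value, so the check passes.
    committed = run_with_speculation(s, lambda d: {"a": d["a"]}, sum_ab, ["a", "b"])
    print(committed, s)
```

In this sketch a write by the lead region does not by itself invalidate the speculation; only a changed value does. When the check fails, the speculative region is simply re-run in program order, so the final state matches a sequential execution either way.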

Index Terms

Software behavior oriented parallelization
1. Computing methodologies
  1. Parallel computing methodologies
    1. Parallel programming languages
2. Software and its engineering
  1. Software notations and tools
    1. Compilers
      1. Runtime environments
    2. General programming languages
      1. Language types
        Parallel programming languages

Published In

PLDI '07: Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation

June 2007

508 pages

ISBN:9781595936332

DOI:10.1145/1250734

General Chair: Jeanne Ferrante, University of California, San Diego, USA
Program Chair: Kathryn S. McKinley, University of Texas at Austin, USA

ACM SIGPLAN Notices Volume 42, Issue 6
Proceedings of the 2007 PLDI conference
June 2007
491 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/1273442

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

PLDI '07: ACM SIGPLAN Conference on Programming Language Design and Implementation

June 10 - 13, 2007

San Diego, California, USA

Acceptance Rates

Overall Acceptance Rate 406 of 2,067 submissions, 20%
