DOI: 10.1145/1250734.1250760

Software behavior oriented parallelization

Published: 10 June 2007

Abstract

Many sequential applications are difficult to parallelize because of unpredictable control flow, indirect data access, and input-dependent parallelism. These difficulties led us to build a software system for behavior oriented parallelization (BOP), which allows a program to be parallelized based on partial information about program behavior, for example, a user reading just part of the source code, or a profiling tool examining merely one or a few executions.
The basis of BOP is programmable software speculation, where a user or an analysis tool marks possibly parallel regions in the code, and the run-time system executes these regions speculatively. It is imperative to protect the entire address space during speculation. The main goal of the paper is to demonstrate that the general protection can be made cost-effective by three novel techniques: programmable speculation, critical-path minimization, and value-based correctness checking. On a recently acquired multi-core, multi-processor PC, the BOP system reduced the end-to-end execution time by integer factors for a Lisp interpreter, a data compressor, a language parser, and a scientific library, with no change to the underlying hardware or operating system.
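
To make the idea of speculation with value-based correctness checking more concrete, here is a minimal sketch in Python. It is not the BOP implementation: the abstract does not describe the system's interface, so the function `run_speculative`, the hand-written read set `reads`, the use of a thread with a deep-copied dictionary in place of whole-address-space protection, and the example regions are all assumptions made for illustration. The sketch overlaps a "lead" region with a speculative later region, then commits the speculative results only if every value the speculative region read is unchanged once the lead region finishes.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


def run_speculative(state, lead, spec, reads):
    """Overlap `lead` with a speculative run of `spec`; return True if it committed.

    Sequential meaning: lead(state), then state.update(spec(state)).
    `spec` must not modify its argument; it returns a dict of its writes.
    `reads` lists the keys `spec` reads (a hypothetical, hand-written read set).
    """
    # Remember the values the speculative region is about to read.
    seen = {k: copy.deepcopy(state[k]) for k in reads}
    # The speculative region works on a private copy, so it cannot disturb `lead`.
    private = copy.deepcopy(state)
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(spec, private)
        lead(state)                 # the non-speculative region runs meanwhile
        writes = future.result()
    # Value-based check: only final values matter, not whether `lead` wrote them.
    if all(state[k] == seen[k] for k in reads):
        state.update(writes)        # commit the speculative results
        return True
    state.update(spec(state))       # misspeculation: redo the region sequentially
    return False


def lead_restores(s):
    s["depth"] += 1                 # temporary change ...
    s["depth"] -= 1                 # ... restored before the region ends
    s["log"] = "lead done"


def lead_changes(s):
    s["depth"] += 1                 # the value really differs afterwards


def spec_region(s):
    return {"out": s["depth"] * 10}


a = {"depth": 2}
print(run_speculative(a, lead_restores, spec_region, ["depth"]), a)
b = {"depth": 2}
print(run_speculative(b, lead_changes, spec_region, ["depth"]), b)
```

In the first call the lead region writes `depth` but leaves it with the same value, so a value-based check accepts the speculation even though a check based on "was this location written" would reject it. In the second call the value differs, so the sketch falls back to running the region again after the lead region, which preserves the sequential result.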



Published In

PLDI '07: Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2007
508 pages
ISBN: 9781595936332
DOI: 10.1145/1250734
  • ACM SIGPLAN Notices, Volume 42, Issue 6
    Proceedings of the 2007 PLDI conference
    June 2007
    491 pages
    ISSN: 0362-1340
    EISSN: 1558-1160
    DOI: 10.1145/1273442

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. program behavior
  2. speculative parallelization

Qualifiers

  • Article

Conference

PLDI '07
Acceptance Rates

Overall Acceptance Rate 406 of 2,067 submissions, 20%
