Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1736020.1736030acmconferencesArticle/Chapter ViewAbstractPublication PagesasplosConference Proceedingsconference-collections
research-article

Speculative parallelization using software multi-threaded transactions

Published: 13 March 2010 Publication History

Abstract

With the right techniques, multicore architectures may be able to continue the exponential performance trend that elevated the performance of applications of all types for decades. While many scientific programs can be parallelized without speculative techniques, speculative parallelism appears to be the key to continuing this trend for general-purpose applications. Recently-proposed code parallelization techniques, such as those by Bridges et al. and by Thies et al., demonstrate scalable performance on multiple cores by using speculation to divide code into atomic units (transactions) that span multiple threads in order to expose data parallelism. Unfortunately, most software and hardware Thread-Level Speculation (TLS) memory systems and transactional memories are not sufficient because they only support single-threaded atomic units. Multi-threaded Transactions (MTXs) address this problem, but they require expensive hardware support as currently proposed in the literature. This paper proposes a Software MTX (SMTX) system that captures the applicability and performance of hardware MTX, but on existing multicore machines. The SMTX system yields a harmonic mean speedup of 13.36x on native hardware with four 6-core processors (24 cores in total) running speculatively parallelized applications.

References

[1]
M. Abadi, T. Harris, and M. Mehrara. Transactional memory with strong atomicity using off-the-shelf memory protection hardware. In PPoPP '09: Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 185--196. ACM, 2009.
[2]
A.-R. Adl-Tabatabai, B. T. Lewis, V. Menon, B. R. Murphy, B. Saha, and T. Shpeisman. Compiler and runtime support for efficient software transactional memory. In PLDI '06: Proceedings of the 2006 ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 26--37. ACM, June 2006.
[3]
R. Allen and K. Kennedy. Optimizing compilers for modern architectures: A dependence-based approach. Morgan Kaufmann Publishers Inc., 2002.
[4]
E. D. Berger, T. Yang, T. Liu, and G. Novark. Grace: safe multithreaded programming for C/C++. In OOPSLA '09: Proceeding of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications, pages 81--96. ACM, 2009.
[5]
A. Bhowmik and M. Franklin. A general compiler framework for speculative multithreading. In Proceedings of the 14th ACM Symposium on Parallel Algorithms and Architectures, pages 99--108, August 2002.
[6]
C. Bienia, S. Kumar, J. P. Singh, and K. Li. The PARSEC benchmark suite: characterization and architectural implications. In PACT '08: Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, pages 72--81. ACM, 2008.
[7]
M. Bridges, N. Vachharajani, Y. Zhang, T. Jablin, and D. August. Revisiting the sequential programming model for multi-core. In MICRO'07: Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture, pages 69--84. IEEE Computer Society, 2007.
[8]
M. J. Bridges. The VELOCITY Compiler: Extracting Efficient Multicore Execution from Legacy Sequential Codes. PhD thesis, Department of Computer Science, Princeton University, Princeton, New Jersey, United States, November 2008.
[9]
C. Cascaval, C. Blundell, M. Michael, H. W. Cain, P. Wu, S. Chiras, and S. Chatterjee. Software transactional memory: Why is it only a research toy? Queue, 6(5):46--58, 2008.
[10]
C. Ding, X. Shen, K. Kelsey, C. Tice, R. Huang, and C. Zhang. Software behavior oriented parallelization. In PLDI '07: Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 223--234. ACM, 2007.
[11]
M. J. Garzarán, M. Prvulovic, J. M. Llabería, V. Vi Ünals, L. Rauchwerger, and J. Torrellas. Tradeoffs in buffering speculative memory state for thread-level speculation in multiprocessors. ACM Transactions on Architecture Code Optimization, 2(3):247--279, 2005.
[12]
J. Giacomoni, T. Moseley, and M. Vachharajani. FastForward for efficient pipeline parallelism: a cache-optimized concurrent lock-free queue. In PPoPP '08: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 43--52. ACM, February 2008.
[13]
GNU Image Manipulation Program. http://www.gimp.org. K. Kelsey, T. Bai, C. Ding, and C. Zhang. Fast Track: A software system for speculative program optimization. In CGO '09: Proceedings of the 2009 International Symposium on Code Generation and Optimization, pages 157--168. IEEE Computer Society, May 2009.
[14]
M. Kulkarni, K. Pingali, B. Walter, G. Ramanarayanan, K. Bala, and L. P. Chew. Optimistic parallelism requires abstractions. In PLDI '07: Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 211--222. ACM, 2007.
[15]
J. Larus and R. Rajwar. Transactional Memory (Synthesis Lectures on Computer Architecture). Morgan & Claypool Publishers, 2007.
[16]
C. Lattner and V. Adve. LLVM: A compilation framework for lifelong program analysis & transformation. In CGO '04: Proceedings of the International Symposium on Code Generation and Optimization, page 75. IEEE Computer Society, 2004.
[17]
M. Mehrara, J. Hao, P.-C. Hsu, and S. Mahlke. Parallelizing sequential applications on commodity hardware using a low-cost software transactional memory. In PLDI '09: Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 166--176. ACM, 2009.
[18]
C. E. Oancea and A. Mycroft. Software thread-level speculation: an optimistic library implementation. In IWMSE '08: Proceedings of the 1st International Workshop on Multicore Software Engineering, pages 23--32. ACM, 2008.
[19]
M. Olszewski, J. Cutler, and J. G. Steffan. JudoSTM: A dynamic binary-rewriting approach to software transactional memory. In PACT'07: Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques, pages 365--375. IEEE Computer Society, 2007.
[20]
G. Ottoni, R. Rangan, A. Stoler, and D. I. August. Automatic thread extraction with decoupled software pipelining. In MICRO '05: Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture, pages 105--118. IEEE Computer Society, 2005.
[21]
E. Raman. Parallelization Techniques with Improved Dependence Handling. PhD thesis, Department of Computer Science, Princeton University, Princeton, New Jersey, United States, June 2009.
[22]
E. Raman, N. Vachharajani, R. Rangan, and D. I. August. Spice: speculative parallel iteration chunk execution. In CGO '08: Proceedings of the 2008 International Symposium on Code Generation and Optimization, pages 175--184. ACM, 2008.
[23]
L. Rauchwerger and D. A. Padua. The LRPD test: Speculative runtime parallelization of loops with privatization and reduction parallelization. IEEE Transactions on Parallel and Distributed Systems, 10(2):160---180, 1999.
[24]
P. Rundberg and P. Stenstrom. An all-software thread-level data dependence speculation system for multiprocessors. Journal of Instruction-Level Parallelism, 3, October 2001.
[25]
N. Shavit and D. Touitou. Software transactional memory. In PODC'95: Proceedings of the Fourteenth Annual ACM Symposium on Principles of Distributed Computing, pages 204--213. ACM, 1995.
[26]
Standard Performance Evaluation Corporation (SPEC). http://www.spec.org.
[27]
J. G. Steffan, C. Colohan, A. Zhai, and T. C. Mowry. The STAMPede approach to thread-level speculation. ACM Transactions on Computer Systems, 23(3):253--300, February 2005.
[28]
W. Thies, V. Chandrasekhar, and S. Amarasinghe. A practical approach to exploiting coarse-grained pipeline parallelism in C programs. In MICRO '07: Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture, pages 356--369. IEEE Computer Society, 2007.
[29]
C. Tian, M. Feng, V. Nagarajan, and R. Gupta. Copy or discard execution model for speculative parallelization on multicores. In MICRO '08: Proceedings of the 41st Annual IEEE/ACM International Symposium on Microarchitecture, pages 330--341. IEEE Computer Society, 2008.
[30]
N. Vachharajani. Intelligent Speculation for Pipelined Multithreading. PhD thesis, Department of Computer Science, Princeton University, Princeton, New Jersey, United States, November 2008.
[31]
N. Vachharajani, R. Rangan, E. Raman, M. J. Bridges, G. Ottoni, and D. I. August. Speculative decoupled software pipelining. In PACT'07: Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques, pages 49--59. IEEE Computer Society, 2007.
[32]
H. Zhong, M. Mehrara, S. Lieberman, and S. Mahlke. Uncovering hidden loop level parallelism in sequential applications. In HPCA '08: Proceedings of the 14th International Symposium on High-Performance Computer Architecture, 2008.
[33]
C. Zilles and G. Sohi. Master/slave speculative parallelization. In MICRO'02: Proceedings of the 35th Annual ACM/IEEE International Symposium on Microarchitecture, pages 85--96. IEEE Computer Society Press, 2002.

Cited By

View all
  • (2024)POSTER: OCToPus: Semantic-aware Concurrency Control for Blockchain TransactionsProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638494(463-465)Online publication date: 2-Mar-2024
  • (2022)On the choice of the best chunk size for the speculative execution of loopsPLOS ONE10.1371/journal.pone.026760217:5(e0267602)Online publication date: 17-May-2022
  • (2022)Predictive Execution of Parallel Simulations in Hard Real-Time SystemsIEEE Transactions on Computers10.1109/TC.2022.3147416(1-1)Online publication date: 2022
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ASPLOS XV: Proceedings of the fifteenth International Conference on Architectural support for programming languages and operating systems
March 2010
422 pages
ISBN:9781605588391
DOI:10.1145/1736020
  • General Chair:
  • James C. Hoe,
  • Program Chair:
  • Vikram S. Adve
  • cover image ACM SIGARCH Computer Architecture News
    ACM SIGARCH Computer Architecture News  Volume 38, Issue 1
    ASPLOS '10
    March 2010
    399 pages
    ISSN:0163-5964
    DOI:10.1145/1735970
    Issue’s Table of Contents
  • cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 45, Issue 3
    ASPLOS '10
    March 2010
    399 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/1735971
    Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 March 2010

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. automatic parallelization
  2. loop-level parallelism
  3. multi-threaded transactions
  4. pipelined parallelism
  5. software transactional memory
  6. thread-level speculation

Qualifiers

  • Research-article

Conference

ASPLOS '10

Acceptance Rates

ASPLOS XV Paper Acceptance Rate 32 of 181 submissions, 18%;
Overall Acceptance Rate 535 of 2,713 submissions, 20%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)40
  • Downloads (Last 6 weeks)9
Reflects downloads up to 02 Oct 2024

Other Metrics

Citations

Cited By

View all
  • (2024)POSTER: OCToPus: Semantic-aware Concurrency Control for Blockchain TransactionsProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638494(463-465)Online publication date: 2-Mar-2024
  • (2022)On the choice of the best chunk size for the speculative execution of loopsPLOS ONE10.1371/journal.pone.026760217:5(e0267602)Online publication date: 17-May-2022
  • (2022)Predictive Execution of Parallel Simulations in Hard Real-Time SystemsIEEE Transactions on Computers10.1109/TC.2022.3147416(1-1)Online publication date: 2022
  • (2021)Scalable FSM parallelization via path fusion and higher-order speculationProceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems10.1145/3445814.3446705(887-901)Online publication date: 19-Apr-2021
  • (2021)Quantifying the Semantic Gap Between Serial and Parallel Programming2021 IEEE International Symposium on Workload Characterization (IISWC)10.1109/IISWC53511.2021.00024(151-162)Online publication date: Nov-2021
  • (2021)Tasking framework for adaptive speculative parallel mesh generationThe Journal of Supercomputing10.1007/s11227-021-04158-978:5(1-32)Online publication date: 11-Nov-2021
  • (2020)SCAF: a speculation-aware collaborative dependence analysis frameworkProceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation10.1145/3385412.3386028(638-654)Online publication date: 11-Jun-2020
  • (2020)T4: Compiling Sequential Code for Effective Speculative Parallelization in Hardware2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA)10.1109/ISCA45697.2020.00024(159-172)Online publication date: May-2020
  • (2019)LernaACM Transactions on Storage10.1145/331036815:1(1-24)Online publication date: 22-Mar-2019
  • (2019)Processing transactions in a predefined orderProceedings of the 24th Symposium on Principles and Practice of Parallel Programming10.1145/3293883.3295730(120-132)Online publication date: 16-Feb-2019
  • Show More Cited By

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media