Making nested parallel transactions practical using lightweight hardware support

Authors: Nathan Bronson, Christos Kozyrakis, Kunle Olukotun

ICS '10: Proceedings of the 24th ACM International Conference on Supercomputing

Pages 61 - 71

https://doi.org/10.1145/1810085.1810097

Published: 02 June 2010

Abstract

Transactional Memory (TM) simplifies parallel programming by supporting parallel tasks that execute in an atomic and isolated way. To achieve the best possible performance, TM must support the nested parallelism available in real-world applications and supported by popular programming models. A few recent papers have proposed support for nested parallelism in software TM (STM) and hardware TM (HTM). However, the proposed designs are still impractical, as they either introduce excessive runtime overheads or require complex hardware structures.

This paper presents filter-accelerated, nested TM (FaNTM). We extend a hybrid TM based on hardware signatures to provide practical support for nested parallel transactions. In the FaNTM design, hardware filters provide continuous and nesting-aware conflict detection, which effectively eliminates the excessive overheads of software nested transactions. In contrast to a full HTM approach, FaNTM simplifies hardware by decoupling nested parallel transactions from caches using hardware filters. We also describe subtle correctness and liveness issues that do not exist in the non-nested baseline TM.
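To make the idea of signature-based, nesting-aware conflict detection more concrete, the following is a minimal software sketch. It is not the FaNTM hardware design: the signature width, the two hash functions, the `Txn` class, and the `conflicts` helper are all hypothetical names chosen for illustration. It only assumes what the abstract states, namely that filters summarize a transaction's read and write sets and that conflict checks take the nesting hierarchy into account (here, by letting a nested transaction access data held by its own ancestors).

```python
from dataclasses import dataclass
from typing import Optional

SIG_BITS = 64  # hypothetical signature width, chosen only for illustration


def hash_bits(addr: int) -> int:
    """Map an address to a two-bit mask within the signature (illustrative hashes)."""
    line = addr >> 3
    h1 = line % SIG_BITS
    h2 = (line * 31 + 7) % SIG_BITS
    return (1 << h1) | (1 << h2)


def may_contain(sig: int, addr: int) -> bool:
    """Bloom-filter style membership test: false positives are possible, false negatives are not."""
    bits = hash_bits(addr)
    return (sig & bits) == bits


@dataclass
class Txn:
    """Hypothetical transaction record holding read and write signatures."""
    name: str
    parent: Optional["Txn"] = None
    read_sig: int = 0
    write_sig: int = 0

    def read(self, addr: int) -> None:
        self.read_sig |= hash_bits(addr)

    def write(self, addr: int) -> None:
        self.write_sig |= hash_bits(addr)

    def is_ancestor_of(self, other: "Txn") -> bool:
        node = other.parent
        while node is not None:
            if node is self:
                return True
            node = node.parent
        return False


def conflicts(requester: Txn, addr: int, is_write: bool, holder: Txn) -> bool:
    """Return True if the requester's access to addr conflicts with the holder.

    Assumption of this sketch: accesses to data held by the requester itself
    or by one of its ancestors never count as conflicts.
    """
    if holder is requester or holder.is_ancestor_of(requester):
        return False
    if may_contain(holder.write_sig, addr):
        return True  # read-after-write or write-after-write
    return is_write and may_contain(holder.read_sig, addr)  # write-after-read
```

In this sketch, a child transaction that reads an address written by its parent is not flagged, whereas the same access against a sibling's write signature is. Because the signatures are lossy, two distinct addresses can alias and produce a spurious conflict, which is the usual trade-off of filter-based detection.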

We quantify the performance of FaNTM using STAMP applications and microbenchmarks that use concurrent data structures. First, we demonstrate that the runtime overhead of FaNTM is small (2.3% on average) when applications use only single-level parallelism. Second, we show that the incremental performance overhead of FaNTM is reasonable when the available parallelism is used in deeper nesting levels. We also demonstrate that nested parallel transactions on FaNTM run significantly faster (e.g., 12.4x) than those on a nested STM. Finally, we show how nested parallelism is used to improve the overall performance of a transactional microbenchmark.

Published In

ICS '10: Proceedings of the 24th ACM International Conference on Supercomputing

June 2010

365 pages

ISBN: 9781450300186

DOI: 10.1145/1810085

General Chair: Taisuke Boku (University of Tsukuba)

Program Chairs: Hiroshi Nakashima (Kyoto University), Avi Mendelson (Microsoft)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Science Foundation

Conference

ICS'10

Sponsor:

SIGARCH

ICS'10: International Conference on Supercomputing

June 2 - 4, 2010

Tsukuba, Ibaraki, Japan

Acceptance Rates

Overall Acceptance Rate 629 of 2,180 submissions, 29%

Bibliometrics

Total Citations: 11

Total Downloads: 173

Downloads (Last 12 months): 2

Downloads (Last 6 weeks): 1

Reflects downloads up to 20 Nov 2024

Cited By

Kim S, Han M, Baek W (2022). DPrime+DAbort: A High-Precision and Timer-Free Directory-Based Side-Channel Attack in Non-Inclusive Cache Hierarchies using Intel TSX. 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pages 67-81. Online publication date: Apr-2022.
https://doi.org/10.1109/HPCA53966.2022.00014

Park J, Baek W (2019). Analyzing and optimizing the performance and energy efficiency of transactional scientific applications on large-scale NUMA systems with HTM support. Journal of Parallel and Distributed Computing 127:C, pages 1-17. Online publication date: 1-May-2019.
https://dl.acm.org/doi/10.1016/j.jpdc.2018.12.007

Park J, Baek W (2018). Quantifying the Performance and Energy-Efficiency Impact of Hardware Transactional Memory on Scientific Applications on Large-Scale NUMA Systems. 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 804-813. Online publication date: May-2018.
https://doi.org/10.1109/IPDPS.2018.00090

Subramanian S, Jeffrey M, Abeydeera M, Lee H, Ying V, Emer J, Sanchez D (2017). Fractal. ACM SIGARCH Computer Architecture News 45:2, pages 587-599. Online publication date: 24-Jun-2017.
https://dl.acm.org/doi/10.1145/3140659.3080218

Subramanian S, Jeffrey M, Abeydeera M, Lee H, Ying V, Emer J, Sanchez D (2017). Fractal. Proceedings of the 44th Annual International Symposium on Computer Architecture, pages 587-599. Online publication date: 24-Jun-2017.
https://dl.acm.org/doi/10.1145/3079856.3080218

Kim T, Lee S, Park J, Kim J (2016). Efficient lifetime management of SSD-based RAIDs using dedup-assisted partial stripe writes. 2016 5th Non-Volatile Memory Systems and Applications Symposium (NVMSA), pages 1-6. Online publication date: Aug-2016.
https://doi.org/10.1109/NVMSA.2016.7547184

Kim S, Baek W (2016). HAPT: hardware-accelerated persistent transactions. 2016 5th Non-Volatile Memory Systems and Applications Symposium (NVMSA), pages 1-6. Online publication date: Aug-2016.
https://doi.org/10.1109/NVMSA.2016.7547181

Filipe R, Barreto J (2015). Nested Parallelism in Transactional Memory. Transactional Memory. Foundations, Algorithms, Tools, and Applications, pages 192-209. Online publication date: 2015.
https://doi.org/10.1007/978-3-319-14720-8_9

Titos-Gil J, Acacio M (2015). Hardware Approaches to Transactional Memory in Chip Multiprocessors. Handbook on Data Centers, pages 805-835. Online publication date: 17-Mar-2015.
https://doi.org/10.1007/978-1-4939-2092-1_27

Chang W, Jou J, Hsieh C, Lin D (2013). A Distributed Run-Time Dynamic Data Manager for Multi-core System Parallel Execution. Advances in Intelligent Systems and Applications - Volume 2, pages 741-750. Online publication date: 2013.
https://doi.org/10.1007/978-3-642-35473-1_73
