Abstract
Loop Thread-Level Speculation (TLS) on Hardware Transactional Memories is a promising strategy for improving application performance in the multicore era. However, the reuse of shared scalar or array variables introduces constraints (false dependences or false sharing) that obstruct efficient speculative parallelization. Speculative privatization relieves these constraints by creating a speculatively private copy of the data for each transaction, thus enabling scalable parallelization. To support it, this paper proposes two new clauses for the OpenMP parallel for construct that enable speculative privatization of scalars or arrays in may-DOACROSS loops: spec_private and spec_reduction. We also present an evaluation showing that, for certain loops, speed-ups of up to \(3.24\times \) can be obtained by applying speculative privatization in TLS.
Supported by FAPESP, grants 18/07446-8 and 18/15519-5.
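To make the constraint described in the abstract concrete, the following is a minimal sketch in C that uses only the standard OpenMP private and reduction clauses. It is not the paper's implementation and does not use spec_private or spec_reduction, whose exact syntax is not given here. The function names and the loop itself are hypothetical, chosen for illustration. The scalar tmp is reused by every iteration, which creates a false dependence; giving each thread its own copy removes it. The proposed clauses appear intended for the case where such privatization cannot be proven safe ahead of time and must instead be applied speculatively.

```c
#include <assert.h>
#include <stddef.h>

/* Sequential reference version (hypothetical example loop).
 * The scalar `tmp` is written and read in every iteration, so all
 * iterations reuse the same storage location: a false dependence. */
long sum_of_squares_seq(const int *a, size_t n) {
    assert(a != NULL || n == 0);
    long sum = 0;
    long tmp;
    for (size_t i = 0; i < n; i++) {
        tmp = (long)a[i] * a[i];
        sum += tmp;
    }
    return sum;
}

/* Parallel version using standard (non-speculative) OpenMP clauses.
 * private(tmp) gives every thread its own copy of `tmp`;
 * reduction(+:sum) gives every thread a private partial sum that is
 * combined when the loop ends. If the file is compiled without OpenMP
 * support, the pragma is ignored and the loop runs sequentially. */
long sum_of_squares_par(const int *a, size_t n) {
    assert(a != NULL || n == 0);
    long sum = 0;
    long tmp;
    #pragma omp parallel for private(tmp) reduction(+:sum)
    for (long i = 0; i < (long)n; i++) {
        tmp = (long)a[i] * a[i];
        sum += tmp;
    }
    return sum;
}
```

Both functions return the same value for the same input. In this loop the privatization is provably safe, so the standard clauses suffice; the speculative variants matter for loops where a true cross-iteration dependence may or may not occur at run time.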
Notes
1. Clang 4.0 was adapted to generate AST to support the new clauses, as explained in Sect. 4.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Salamanca, J., Baldassin, A. (2022). Using Hardware Transactional Memory to Implement Speculative Privatization in OpenMP. In: Chapman, B., Moreira, J. (eds) Languages and Compilers for Parallel Computing. LCPC 2020. Lecture Notes in Computer Science, vol 13149. Springer, Cham. https://doi.org/10.1007/978-3-030-95953-1_5
DOI: https://doi.org/10.1007/978-3-030-95953-1_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-95952-4
Online ISBN: 978-3-030-95953-1
eBook Packages: Computer Science, Computer Science (R0)