DOI: 10.1007/978-3-030-71058-3_1

Characterizing the Sharing Behavior of Applications Using Software Transactional Memory

Published: 15 November 2020

Abstract

Software Transactional Memory (STM) is an alternative abstraction for process synchronization in parallel programming. It is often easier to use than locks and avoids issues such as deadlocks. Many studies have investigated transactional schedulers as a way to improve STM performance. However, on current architectures with complex memory hierarchies, it is also important to map threads so that those sharing data execute close to each other in the memory hierarchy and can access STM-protected data faster. Successful thread mapping of an STM application requires an in-depth analysis of its sharing behavior to determine its suitability for different mapping policies and the expected performance gains. This paper characterizes the sharing behavior of the STAMP benchmark suite using information extracted from the STM runtime, providing guidance for sharing-aware thread mapping. Our main findings are that most STAMP applications are suitable for a static thread mapping approach to improve performance, since (1) the applications do not present dynamic behavior and (2) the sharing pattern does not change between executions. Furthermore, we show that sharing information gathered from the STM runtime can be used to analyze and reduce false sharing in TM applications.
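As an illustration of the kind of sharing information the abstract refers to, the following minimal sketch builds a communication matrix from a list of memory accesses. It is not the paper's instrumentation: the trace format of `(thread_id, address)` pairs, the function name `communication_matrix`, and the `granularity` parameter are all assumptions made for this example. Two threads are counted as sharing when they touch the same memory block.

```python
from collections import defaultdict
from itertools import combinations


def communication_matrix(accesses, num_threads, granularity=64):
    """Count, for each thread pair, how many memory blocks both threads touched.

    accesses: iterable of (thread_id, address) pairs (hypothetical trace format).
    granularity: block size in bytes; 64 approximates a typical cache line.
    """
    # Map each block to the set of threads that accessed it.
    sharers = defaultdict(set)
    for thread_id, address in accesses:
        sharers[address // granularity].add(thread_id)

    # Every pair of threads that touched the same block shares one more block.
    matrix = [[0] * num_threads for _ in range(num_threads)]
    for threads in sharers.values():
        for a, b in combinations(sorted(threads), 2):
            matrix[a][b] += 1
            matrix[b][a] += 1
    return matrix


# Hypothetical trace: threads 0 and 1 touch one cache line, threads 2 and 3 another.
trace = [(0, 0x1000), (1, 0x1008), (2, 0x2000), (3, 0x2010), (0, 0x1020), (2, 0x3000)]
matrix = communication_matrix(trace, num_threads=4)
for row in matrix:
    print(row)
```

In this made-up trace, a sharing-aware mapping would place threads 0 and 1 close together, and likewise threads 2 and 3. Rerunning with `granularity=8` yields an all-zero matrix, because no two threads touch the same 8-byte word. Sharing that appears at cache-line granularity but disappears at word granularity is one simple indicator of false sharing.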


Cited By

  • (2021) Sharing-Aware Data Mapping in Software Transactional Memory. Embedded Computer Systems: Architectures, Modeling, and Simulation, pp. 481-492. DOI: 10.1007/978-3-031-04580-6_32. Online publication date: 4-Jul-2021


Published In

Benchmarking, Measuring, and Optimizing: Third BenchCouncil International Symposium, Bench 2020, Virtual Event, November 15–16, 2020, Revised Selected Papers
Nov 2020
245 pages
ISBN: 978-3-030-71057-6
DOI: 10.1007/978-3-030-71058-3
Editors: Felix Wolf, Wanling Gao

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Software transactional memory
  2. STAMP
  3. Sharing behavior
  4. Communication matrix
  5. Thread mapping
  6. Characterization

Qualifiers

  • Article
