Method for Reducing Overhead of Shared Memory Access Instrumentation

Published: 22 October 2019

Abstract

Memory monitoring is crucial for understanding the memory access behavior of applications. In multithreaded programs especially, diagnosing concurrency bugs relies on tracking and analyzing accesses to shared memory. Instrumentation is widely used to obtain diagnostic information for runtime checks. However, instrumenting all memory accesses incurs a high performance overhead, slowing down a program's execution by an order of magnitude. This paper proposes a simple but novel method to address the performance degradation caused by instrumentation. It is based on the following key insight: accesses to thread-local stack memory do not need to be tracked. Recognizing this redundancy in memory access instrumentation and runtime checks, the paper presents the IIMA (Is Interesting Memory Access) algorithm to prune instrumentation. The algorithm is implemented on the LLVM infrastructure and evaluated across a range of purpose-built test cases and open source benchmarks. The results show that the method aggressively reduces the number of instrumented memory accesses, especially at low compiler optimization levels, and in turn reduces the runtime overhead.
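The abstract does not spell out how IIMA decides which accesses to skip, so the following is only a minimal sketch of the stated insight, not the paper's algorithm. It assumes a toy, loosely LLVM-like instruction list in which a stack slot counts as thread-local as long as its address is never stored to memory, passed to a call, or returned. The names `Instr`, `escaping_allocas`, and `is_interesting_access` are hypothetical, and the check ignores derived pointers (casts, address arithmetic) that a real analysis would have to follow.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Instr:
    """One instruction of a toy IR (hypothetical, loosely LLVM-like)."""
    op: str                      # "alloca", "load", "store", "call", or "ret"
    dest: str = ""               # name of the value this instruction defines
    args: Tuple[str, ...] = ()   # load: (pointer,); store: (value, pointer)


def escaping_allocas(body: List[Instr]) -> Set[str]:
    """Return stack slots whose address may become visible to other threads."""
    slots = {i.dest for i in body if i.op == "alloca"}
    escaped: Set[str] = set()
    for i in body:
        if i.op == "store":
            value, _pointer = i.args
            if value in slots:       # the slot's address is written to memory
                escaped.add(value)
        elif i.op in ("call", "ret"):
            # The address is handed to a callee or to the caller.
            escaped.update(a for a in i.args if a in slots)
    return escaped


def is_interesting_access(instr: Instr, body: List[Instr]) -> bool:
    """True if this load/store should still be instrumented."""
    if instr.op == "load":
        pointer = instr.args[0]
    elif instr.op == "store":
        pointer = instr.args[1]
    else:
        return False                 # not a memory access at all
    slots = {i.dest for i in body if i.op == "alloca"}
    # A non-escaping stack slot is private to the executing thread.
    return not (pointer in slots and pointer not in escaping_allocas(body))


body = [
    Instr("alloca", dest="%i"),
    Instr("alloca", dest="%buf"),
    Instr("store", args=("0", "%i")),            # private loop counter
    Instr("load", dest="%v", args=("%i",)),
    Instr("call", args=("%buf",)),               # address of %buf escapes
    Instr("store", args=("%v", "%buf")),
    Instr("load", dest="%g", args=("@global",)),
]
kept = [i for i in body if is_interesting_access(i, body)]
```

In this example only the store to `%buf` and the load from `@global` remain candidates for instrumentation; both accesses to the private slot `%i` are pruned. Treating every call argument as escaping is deliberately conservative: it can keep unnecessary instrumentation but does not drop an access that might be shared.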

Cited By

  • (2024) ParaShareDetect: Dynamic Instrumentation and Runtime Analysis for False Sharing Detection in Parallel Computing. 2024 4th International Conference on Computer, Control and Robotics (ICCCR), pp. 230-235. DOI: 10.1109/ICCCR61138.2024.10585404. Online publication date: 19-Apr-2024
  • (2021) Snowboard. Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles, pp. 66-83. DOI: 10.1145/3477132.3483549. Online publication date: 26-Oct-2021

Information & Contributors

Information

Published In

CSAE '19: Proceedings of the 3rd International Conference on Computer Science and Application Engineering
October 2019
942 pages
ISBN: 9781450362948
DOI: 10.1145/3331453
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Instrumentation
  2. LLVM
  3. Memory access
  4. Runtime overhead
  5. Shared memory
  6. Multithreaded programs

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CSAE 2019

Acceptance Rates

Overall Acceptance Rate 368 of 770 submissions, 48%

