Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article
Public Access

The CSI Framework for Compiler-Inserted Program Instrumentation

Published: 19 December 2017 Publication History

Abstract

The CSI framework provides comprehensive static instrumentation that a compiler can insert into a program-under-test so that dynamic-analysis tools - memory checkers, race detectors, cache simulators, performance profilers, code-coverage analyzers, etc. - can observe and investigate runtime behavior. Heretofore, tools based on compiler instrumentation would each separately modify the compiler to insert their own instrumentation. In contrast, CSI inserts a standard collection of instrumentation hooks into the program-under-test. Each CSI-tool is implemented as a library that defines relevant hooks, and the remaining hooks are "nulled" out and elided during either compile-time or link-time optimization, resulting in instrumented runtimes on par with custom instrumentation. CSI allows many compiler-based tools to be written as simple libraries without modifying the compiler, lowering the bar for the development of dynamic-analysis tools.
We have defined a standard API for CSI and modified LLVM to insert CSI hooks into the compiler's internal representation (IR) of the program. The API organizes IR objects - such as functions, basic blocks, and memory accesses - into flat and compact ID spaces, which not only simplifies the building of tools, but surprisingly enables faster maintenance of IR-object data than do traditional hash tables. CSI hooks contain a "property" parameter that allows tools to customize behavior based on static information without introducing overhead. CSI provides "forensic" tables that tools can use to associate IR objects with source-code locations and to relate IR objects to each other.
To evaluate the efficacy of CSI, we implemented six demonstration CSI-tools. One of our studies shows that compiling with CSI and linking with the "null" CSI-tool produces a tool-instrumented executable that is as fast as the original uninstrumented code. Another study, using a CSI port of Google's ThreadSanitizer, shows that the CSI-tool rivals the performance of Google's custom compiler-based implementation. All other demonstration CSI tools slow down the execution of the program-under-test by less than 70%.

References

[1]
A. V. Aho, M. S. Lam, R. Sethi, and J. D. Ullman. 2006. Compilers: Principles, Techniques, and Tools (second ed.).
[2]
Apache Software Foundation. 2016. ab -- Apache HTTP server benchmarking tool. Available at https://httpd.apache.org/docs/2.4/programs/ab.html. (2016).
[3]
David R. Barach, David H. Taenzer, and Robert E. Wells. 1982. A Technique for Finding Storage Allocation Errors in C-language Programs. SIGPLAN Notices, Vol. 17, 5 (May 1982), 16--24.
[4]
Andrew R. Bernat and Barton P. Miller. 2011. Anywhere, Any-time Binary Instrumentation. In PASTE. 9--16.
[5]
Walter Binder, Alex Villazón, Danilo Ansaloni, and Philippe Moret. 2009. @J: Towards Rapid Development of Dynamic Analysis Tools for the Java Virtual Machine. VMIL. Article 4,9 pages.
[6]
Derek Bruening, Evelyn Duesterwald, and Saman Amarasinghe. 2001. Design and Implementation of a Dynamic Optimization Framework for Windows. FDDO-4.
[7]
Eric Bruneton, Romain Lenglet, and Thierry Coupaye. 2002. ASM: A code manipulation tool to implement adaptable systems. Adaptable and Extensible Component Systems.
[8]
Randal E. Bryant and David R. O’Hallaron. 2015. Computer Systems: A Programmer's Perspective (3rd ed.). Pearson, USA.
[9]
Bryan Buck and Jeffrey K. Hollingsworth. 2000. An API for Runtime Code Patching. Int. J. High Perform. Comput. Appl. Vol. 14, 4 (Nov.2000), 317--329.
[10]
Mark Callaghan and Domas Mituzas. 2009. Poor Man's Profiler. Available at https://dom.as/2009/02/15/poor-mans-contention-profiling. (Feb.2009).
[11]
Clang. 2017. Clang 6 Documentatoin: ThreadSanitizer. http://clang.llvm.org/docs/ThreadSanitizer.html. (2017).
[12]
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. 2009. Introduction to Algorithms (third ed.). The MIT Press.
[13]
Thomas Coudray, Arnaud Fontaine, and Pierre Chifflier. 2015. Picon: Control Flow Integrity on LLVM IR. In SSTIC.
[14]
Markus Dahm. 1999. Byte Code Engineering. 267--277.
[15]
Bruno De Bus, Dominique Chanet, Bjorn De Sutter, Ludo Van Put, and Koen De Bosschere. 2004. The Design and Implementation of FIT: A Flexible Instrumentation Toolkit. PASTE. 29--34.
[16]
Chen Ding and Yutao Zhong. 2003. Predicting Whole-program Locality Through Reuse Distance Analysis. PLDI. 245--257.
[17]
Anne Dinning and Edith Schonberg. 1991. Detecting Access Anomalies in Programs with Critical Sections. PADD. 85--96.
[18]
Anne Carolyn Dinning. . 1990. Detecting Nondeterminism in Shared Memory Parallel Programs. Ph.D. Dissertation. Department of Computer Science, New York University.
[19]
DWARF Standards Committee. 2015. DWARF Debugging Information Format Version 4. (2015).
[20]
Mingdong Feng and Charles E. Leiserson. 1999. Efficient Detection of Determinacy Races in Cilk Programs. Theory of Computing Systems Vol. 32, 3 (1999), 301--326.
[21]
Free Software Foundation. 2009. GCC Wiki: LinkTimeOptimization. Available at https://gcc.gnu.org/wiki/LinkTimeOptimization. (Oct. . 2009).
[22]
Free Software Foundation. 2014. GNU Binutils. Available at https://www.gnu.org/software/binutils/. (September 2014).
[23]
Free Software Foundation. 2017. GNU Compiler Collection (GCC) Internals.
[24]
Free Software Foundation. 2017. GNU Compiler Collection (GCC) Internals: Plugins. Available at https://gcc.gnu.org/onlinedocs/gccint/Plugins.html. (April 2017).
[25]
Felix Garcia and Javier Fernandez. 2000. POSIX Thread Libraries. Linux Journal, Vol. 2000, 70es (Feb.2000).
[26]
Saturnino Garcia, Donghwan Jeon, Christopher M. Louie, and Michael Bedford Taylor. 2011. Kremlin: Rethinking and Rebooting gprof for the Multicore Age. SIGPLAN Not. Vol. 46 (June 2011), 458--469. Issue 6.
[27]
Google, Inc. 2015. Google C++Style Guide. Available at https://google.github.io/styleguide/cppguide.html. (2015).
[28]
Susan L. Graham, Peter B. Kessler, and Marshall K. McKusick. 1982. gprof: A Call Graph Execution Profiler. SIGPLAN'82 Symposium on Compiler Construction. 120--126.
[29]
Reed Hastings and Bob Joyce. 1992. Purify: Fast detection of memory leaks and access errors. Winter 1992 USENIX Conference. 125--138.
[30]
Yuxiong He, Charles E. Leiserson, and William M. Leiserson. 2010. The Cilkview Scalability Analyzer. In SPAA. 145--156.
[31]
Intel Corporation. 2015. Pin 2.14 User Guide. Available at https://software.intel.com/sites/landingpage/pintool/docs/71313/Pin/html/index.html. (January 2015).
[32]
Rohit Jalan and Arun Kejariwal. 2012. Trin-Trin: Who's calling? A Pin-Based Dynamic Call Graph Extraction Framework. International Journal of Parallel Programming, Vol. 40, 4 (2012), 410--442.
[33]
Donghwan Jeon, Saturnino Garcia, Chris Louie, and Michael Bedford Taylor. 2011. Kismet: Parallel Speedup Estimates for Serial Programs. OOPSLA. 519--536.
[34]
Teresa Johnson, Mehdi Amini, and Xinliang David Li. 2017. ThinLTO: Scalable and Incremental LTO. In CGO. 111--121.
[35]
Gregor Kiczales, Erik Hilsdale, Jim Hugunin, Mik Kersten, Jeffrey Palm, and William G. Griswold. 2001. An Overview of AspectJ. In ECOOP. 327--353.
[36]
Leslie Lamport. 1978. Time, Clocks and the Ordering of Events in a Distributed System. (July 1978), 558--565.
[37]
James R. Larus and Eric Schnarr. 1995. EEL: Machine-independent Executable Editing. In PLDI. 291--300.
[38]
Chris Lattner and Vikram Adve. 2004. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. CGO. 75.
[39]
Han Bok Lee and Benjamin Zorn. 1997. BIT: A Tool for Instrumenting Java Bytecodes. USENIX Symposium on Internet Technologies and Systems. USENIX Association, 73--82.
[40]
I-Ting Angelina Lee and Tao B. Schardl. 2015. Efficiently Detecting Races in Cilk Programs That Use Reducer Hyperobjects. SPAA. 111--122.
[41]
LLVM Project. 2016. LLVM Language Reference Manual. Available at http://llvm.org/docs/LangRef.html. (2016).
[42]
LLVM Project. 2016. LLVM Link Time Optimization: Design and Implementation. Available at http://llvm.org/docs/LinkTimeOptimization.html. (2016).
[43]
LLVM Project. 2016. Writing an LLVM Pass. Available at http://llvm.org/docs/WritingAnLLVMPass.html. (2016).
[44]
Chi-Keung Luk, Robert Cohn, Robert Muth, Harish Patil, Artur Klauser, Geoff Lowney, Steven Wallace, Vijay Janapa Reddi, and Kim Hazelwood. 2005. Pin: Building Customized Program Analysis Tools with Dynamic Instrumentation. PLDI. 190--200.
[45]
Lukávs Marek, Alex Villazón, Yudi Zheng, Danilo Ansaloni, Walter Binder, and Zhengwei Qi. 2012. DiSL: A Domain-Specific Language for Bytecode Instrumentation. AOSD. 239--250.
[46]
John Mellor-Crummey. 1991. On-the-fly Detection of Data Races for Programs with Nested Fork-Join Parallelism. Supercomputing. 24--33.
[47]
John Mellor-Crummey. 1993. Compile-Time Support for Efficient Data Race Detection in Shared-Memory Parallel Programs. PADD. 129--139.
[48]
Nicholas Nethercote and Julian Seward. 2007. Valgrind: a framework for heavyweight dynamic binary instrumentation. PLDI. 89--100.
[49]
Oracle. 2004. JVM#8482; Tool Interface (JVM TI). Available at http://docs.oracle.com/javase/1.5.0/docs/guide/jvmti/index.html. (2004).
[50]
James Reinders. 2005. VTune Performance Analyzer Essentials. Intel Press.
[51]
Ted Romer, Geoff Voelker, Dennis Lee, Alec Wolman, Wayne Wong, Hank Levy, Brian Bershad, and Brad Chen. 1997. Instrumentation and Optimization of Win32/Intel Executables Using Etch. NT. 1--7.
[52]
Stefan Savage, Michael Burrows, Greg Nelson, Patrick Sobalvarro, and Thomas Anderson. 1997. Eraser: A Dynamic Race Detector for Multi-Threaded Programs. ACM Transactions on Computer Systems Vol. 15, 4 (Nov. 1997), 391--411.
[53]
Tao B. Schardl, Bradley C. Kuszmaul, I-Ting Angelina Lee, William M. Leiserson, and Charles E. Leiserson. 2015. The Cilkprof Scalability Profiler. In SPAA. 89--100.
[54]
Konstantin Serebryany, Derek Bruening, Alexander Potapenko, and Dmitry Vyukov. 2012. AddressSanitizer: A Fast Address Sanity Checker. USENIX Annual Technical Conference.
[55]
Konstantin Serebryany and Timur Iskhodzhanov. 2009. ThreadSanitizer -- Data Race Detection in Practice. WBIA. 62--71.
[56]
Konstantin Serebryany, Alexander Potapenko, Timur Iskhodzhanov, and Dmitry Vyukov. 2011. Dynamic Race Detection with LLVM Compiler. Technical Report 37278. Google.
[57]
Sameer Shende, Allen D. Malony, Janice Cuny, Peter Beckman, Steve Karmesin, and Kathleen Lindlan. 1998. Portable Profiling and Tracing for Parallel, Scientific Applications Using C++. SPDT. 134--145.
[58]
Michael D. Smith. 1991. Tracing with pixie. Technical Report CSL-TR-91--497. Stanford University.
[59]
Amitabh Srivastava and Alan Eustace. 1994. ATOM: A System for Building Customized Program Analysis Tools. PLDI. 196--205.
[60]
Amitabh Srivastava and David W. Wall. 1992. A Practical System for Intermodule Code Optimization at Link-Time. Technical Report 92/6. Digital Western Research Laboratory.
[61]
Richard M. Stallman and the GCC Developer Community. 2016. Using the GNU Compiler Collection (for GCC version 6.1.0). Free Software Foundation.
[62]
Basile Starynkevitch. 2011. MELT -- A Translated Domain Specific Language Embedded in the GCC Compiler. DSL.
[63]
Mark Stephenson, Siva Kumar Sastry Hari, Yunsup Lee, Eiman Ebrahimi, Daniel R. Johnson, David Nellans, Mike O'Connor, and Stephen W. Keckler. 2015. Flexible Software Profiling of GPU Architectures. ISCA. 185--197.
[64]
Rabin A. Sugumar and Santosh G. Abraham. 1993. Efficient Simulation of Caches Under Optimal Replacement with Applications to Miss Characterization. In SIGMETRICS. 24--35.
[65]
Ian Lance Taylor. 2008. A new ELF linker. Available at https://research.google.com/pubs/archive/34417.pdf. (2008).
[66]
The Clang Team. 2017. Clang Plugins -- Clang 5 Documentation. Available at https://clang.llvm.org/docs/ClangPlugins.html. (2017).
[67]
Mustafa M Tikir and Jeffrey K Hollingsworth. 2002. Efficient instrumentation for code coverage testing. ISSTA. 86--96.
[68]
Gang-Ryung Uh, Robert Cohn, Bharadwaj Yadavalli, Ramesh Peri, and Ravi Ayyagari. 2006. Analyzing Dynamic Binary Instrumentation Overhead.
[69]
Raja Vallée-Rai, Etienne Gagnon, Laurie Hendren, Patrick Lam, Patrice Pominville, and Vijay Sundaresan. 2000. Optimizing Java Bytecode Using the Soot Framework: Is It Feasible? CC. Lecture Notes in Computer Science, Vol. Vol. 1781. 18--34.
[70]
William von Hagen. 2006. The Definitive Guide to GCC (second ed.). Apress, Chapter 6.
[71]
David W. Wall. 1989. Link-Time Code Modification. Technical Report 89/17. Digital Western Research Laboratory.
[72]
Josef Weidendorfer. 2008. Sequential Performance Analysis with Callgrind and KCachegrind. 2nd International Workshop on Parallel Tools for High Performance Computing. 93--113.
[73]
David A. Wheeler. 2001. More Than a Gigabuck: Estimating GNU/Linux's Size. Available at http://www.dwheeler.com/sloc/redhat71-v1/redhat71sloc.html. (June 2001).
[74]
Abe White. 2007. Serp. Available at http://serp.sourceforge.net/. (2007).
[75]
Yuan Yu, Tom Rodeheffer, and Wei Chen. 2005. RaceTrack: Efficient Detection of Data Race Conditions via Adaptive Tracking. SOSP. 221--234.

Cited By

View all
  • (2023)Optimization-Aware Compiler-Level Event ProfilingACM Transactions on Programming Languages and Systems10.1145/359147345:2(1-50)Online publication date: 26-Jun-2023
  • (2023)Towards Solving the Challenge of Minimal Overhead MonitoringCompanion of the 2023 ACM/SPEC International Conference on Performance Engineering10.1145/3578245.3584851(381-388)Online publication date: 15-Apr-2023
  • (2022)Creating concise and efficient dynamic analyses with ALDAProceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems10.1145/3503222.3507760(740-752)Online publication date: 28-Feb-2022
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Measurement and Analysis of Computing Systems
Proceedings of the ACM on Measurement and Analysis of Computing Systems  Volume 1, Issue 2
December 2017
480 pages
EISSN:2476-1249
DOI:10.1145/3175501
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 19 December 2017
Published in POMACS Volume 1, Issue 2

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. compiler-inserted instrumentation
  2. dynamic program analysis
  3. program instrumentation

Qualifiers

  • Research-article

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)141
  • Downloads (Last 6 weeks)33
Reflects downloads up to 12 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2023)Optimization-Aware Compiler-Level Event ProfilingACM Transactions on Programming Languages and Systems10.1145/359147345:2(1-50)Online publication date: 26-Jun-2023
  • (2023)Towards Solving the Challenge of Minimal Overhead MonitoringCompanion of the 2023 ACM/SPEC International Conference on Performance Engineering10.1145/3578245.3584851(381-388)Online publication date: 15-Apr-2023
  • (2022)Creating concise and efficient dynamic analyses with ALDAProceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems10.1145/3503222.3507760(740-752)Online publication date: 28-Feb-2022
  • (2021)TIP: Time-Proportional Instruction ProfilingMICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture10.1145/3466752.3480058(15-27)Online publication date: 18-Oct-2021
  • (2021)A Systematic Framework to Identify Violations of Scenario-dependent Driving Rules in Autonomous Vehicle SoftwareProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/34600825:2(1-25)Online publication date: 4-Jun-2021
  • (2021)JPortal: precise and efficient control-flow tracing for JVM programs with Intel processor traceProceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation10.1145/3453483.3454096(1080-1094)Online publication date: 19-Jun-2021
  • (2019)TapirACM Transactions on Parallel Computing10.1145/33656556:4(1-33)Online publication date: 17-Dec-2019
  • (2019)The CSI Framework for Compiler-Inserted Program InstrumentationACM SIGMETRICS Performance Evaluation Review10.1145/3308809.330886046:1(100-102)Online publication date: 17-Jan-2019
  • (2018)The CSI Framework for Compiler-Inserted Program InstrumentationACM SIGMETRICS Performance Evaluation Review10.1145/3292040.321965746:1(100-102)Online publication date: 12-Jun-2018
  • (2018)The CSI Framework for Compiler-Inserted Program InstrumentationAbstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems10.1145/3219617.3219657(100-102)Online publication date: 12-Jun-2018
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Get Access

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media