Tailoring programs for static analysis via program transformation

Authors: Rijnard van Tonder, Claire Le Goues

ICSE '20: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering

Pages 824 - 834

https://doi.org/10.1145/3377811.3380343

Published: 01 October 2020

Abstract

Static analysis is a proven technique for catching bugs during software development. However, analysis tooling must approximate, both theoretically and in the interest of practicality. False positives are a pervading manifestation of such approximations; tool configuration and customization are therefore crucial for usability and for directing analysis behavior. To suppress false positives, developers readily disable bug checks or insert comments that suppress spurious bug reports. Existing work shows that these mechanisms fall short of developer needs and present a significant pain point for using or adopting analyses. We draw on the insight that an analysis user always has one notable ability to influence analysis behavior regardless of analyzer options and implementation: modifying their program. We present a new technique for automated, generic, and temporary code changes that tailor a program to suppress spurious analysis errors. We adopt a rule-based approach where simple, declarative templates describe general syntactic changes for code patterns that are known to be problematic for the analyzer. Our technique promotes program transformation as a general primitive for improving the fidelity of analysis reports (we treat any given analyzer as a black box). We evaluate the technique using five different static analyzers supporting three different languages (C, Java, and PHP) on large, real-world programs (up to 800 KLOC). We show that our approach is effective in sidestepping long-standing and complex issues in analysis implementations.
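To make the rule-based idea in the abstract concrete, the following is a minimal sketch and not the paper's implementation. It assumes that a rule is a (match, rewrite) pair, uses Python regular expressions as a stand-in for declarative syntactic templates, and treats the analyzer as an opaque callable. The names `Rule`, `tailor`, `analyze_tailored`, `toy_analyzer`, and the `CHECK` macro are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Rule:
    """A declarative rewrite rule (hypothetical shape): match and rewrite templates."""
    name: str
    match: str    # regular expression standing in for a syntactic match template
    rewrite: str  # replacement text; \1, \2, ... refer to captured groups


def tailor(source: str, rules: List[Rule]) -> str:
    """Return a rewritten copy of the source; the input string is never modified."""
    out = source
    for rule in rules:
        out = re.sub(rule.match, rule.rewrite, out)
    return out


def analyze_tailored(source: str, rules: List[Rule],
                     analyzer: Callable[[str], List[str]]) -> List[str]:
    """Run a black-box analyzer on a temporary, tailored copy of the program."""
    return analyzer(tailor(source, rules))


def toy_analyzer(code: str) -> List[str]:
    """Toy stand-in for a real analyzer (assumption, for illustration only).

    It warns on every `x->` dereference unless an earlier line contains the
    guard `if (!(x)) abort();`. It does not understand the CHECK macro.
    """
    warnings: List[str] = []
    guarded = set()
    for lineno, line in enumerate(code.splitlines(), start=1):
        guard = re.match(r"\s*if \(!\((\w+)\)\) abort\(\);", line)
        if guard:
            guarded.add(guard.group(1))
        for var in re.findall(r"(\w+)->", line):
            if var not in guarded:
                warnings.append(f"line {lineno}: possible null dereference of {var}")
    return warnings


# Hypothetical rule: expose a project-specific assertion macro as a plain guard.
RULES = [Rule(name="expand-check",
              match=r"CHECK\((\w+)\);",
              rewrite=r"if (!(\1)) abort();")]

SOURCE = "    CHECK(p);\n    return p->len;\n"

if __name__ == "__main__":
    print("before:", toy_analyzer(SOURCE))
    print("after: ", analyze_tailored(SOURCE, RULES, toy_analyzer))
```

In this sketch the rewrite exists only in the copy handed to the analyzer, which mirrors the "temporary" property described above: the original source text is left as it was, and the analyzer itself is never reconfigured.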


Cited By

Zhang H, Pei Y, Chen J, Tan S, Chandra S, Blincoe K, Tonella P (2023). Statfier: Automated Testing of Static Analyzers via Semantic-Preserving Program Transformations. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 237-249. Online publication date: 30-Nov-2023. https://dl.acm.org/doi/10.1145/3611643.3616272

Ahad A, Jung C, Askar A, Kim D, Kim T, Kwon Y (2023). Pyfet: Forensically Equivalent Transformation for Python Binary Decompilation. 2023 IEEE Symposium on Security and Privacy (SP), 3296-3313. Online publication date: May-2023. https://doi.org/10.1109/SP46215.2023.10179370

Ma X, Yan J, Wang W, Yan J, Zhang J, Qiu Z, Grundy J (2021). Detecting memory-related bugs by tracking heap memory management of C++ smart pointers. Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering, 880-891. Online publication date: 15-Nov-2021. https://dl.acm.org/doi/10.1109/ASE51524.2021.9678836

Published In

ICSE '20: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering

June 2020

1640 pages

ISBN: 9781450371216

DOI: 10.1145/3377811

General Chairs: Gregg Rothermel (North Carolina State University), Doo-Hwan Bae (KAIST, South Korea)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

In-Cooperation

KIISE: Korean Institute of Information Scientists and Engineers
IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

Artifacts Available / v1.1

Qualifiers

Research-article

Funding Sources

National Science Foundation

Conference

ICSE '20

Sponsor:

SIGSOFT

ICSE '20: 42nd International Conference on Software Engineering

June 27 - July 19, 2020

Seoul, South Korea

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%
