BegBunch: benchmarking for C bug detection tools

Authors:

Cristina Cifuentes,

Christian Hoermann,

Michael Mounteney,

Bernhard Scholz

DEFECTS '09: Proceedings of the 2nd International Workshop on Defects in Large Software Systems: Held in conjunction with the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2009)

Pages 16 - 20

https://doi.org/10.1145/1555860.1555866

Published: 19 June 2009

Abstract

Benchmarks for bug detection tools are still in their infancy. Although various tools and techniques have been introduced in recent years, little effort has been spent on creating a benchmark suite and a harness for consistent quantitative and qualitative performance measurement. To assess the performance of a bug detection tool and to determine which tool is better than another for the type of code under analysis, the following questions arise: 1) how many bugs are correctly found, 2) what is the tool's average false positive rate, 3) how many bugs are missed by the tool altogether, and 4) does the tool scale?

In this paper we present our contribution to the C bug detection community: two benchmark suites that allow developers and users to evaluate the accuracy and scalability of a given tool. The two suites contain buggy, mature open source code; the bugs are representative of "real world" bugs. A harness accompanies each benchmark suite to automatically compute the qualitative and quantitative performance of a bug detection tool.

BegBunch has been tested to run on the Solaris™, Mac OS X and Linux operating systems. We show the generality of the harness by evaluating it with our own Parfait and three publicly available bug detection tools developed by others.
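The first three questions in the abstract map onto standard counts of true positives, false positives and false negatives. The sketch below is illustrative only and is not the BegBunch harness: the `Bug` record, the `score` function and the choice to match bugs by file, line and kind are all assumptions made for this example, and the false positive rate is taken here to mean the fraction of a tool's reports that do not correspond to a known bug.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bug:
    """Hypothetical bug identity: where it is and what kind it is."""
    file: str
    line: int
    kind: str


def score(known, reported):
    """Compare a tool's reports against the known bugs of a benchmark.

    Assumption: a report counts as correct only if it matches a known
    bug exactly on file, line and kind.
    """
    known, reported = set(known), set(reported)
    tp = len(known & reported)   # bugs correctly found
    fp = len(reported - known)   # reports that match no known bug
    fn = len(known - reported)   # known bugs the tool missed

    precision = tp / (tp + fp) if reported else 0.0
    recall = tp / (tp + fn) if known else 0.0
    # Share of reports that are false alarms (assumed definition).
    fp_rate = fp / (tp + fp) if reported else 0.0
    # Harmonic mean of precision and recall (F-measure).
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision,
            "recall": recall, "fp_rate": fp_rate, "f_measure": f_measure}


# Made-up data: four known bugs, three reports from a tool.
known_bugs = [
    Bug("a.c", 10, "buffer-overflow"),
    Bug("a.c", 42, "null-deref"),
    Bug("b.c", 7, "buffer-overflow"),
    Bug("c.c", 99, "memory-leak"),
]
tool_reports = [
    Bug("a.c", 10, "buffer-overflow"),
    Bug("a.c", 42, "null-deref"),
    Bug("d.c", 3, "null-deref"),
]
result = score(known_bugs, tool_reports)  # 2 found, 1 false alarm, 2 missed
```

Exact matching on line numbers is a deliberately strict choice for the sketch; a real harness might tolerate nearby lines or match on function names instead. The fourth question, scalability, is not covered here and would need timing and memory measurements on code bases of increasing size.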

Index Terms

BegBunch: benchmarking for C bug detection tools
1. General and reference
  1. Cross-computing tools and techniques
    1. Metrics
2. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        1. Software testing and debugging

Information

Published In

DEFECTS '09: Proceedings of the 2nd International Workshop on Defects in Large Software Systems: Held in conjunction with the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2009)

June 2009

34 pages

ISBN:9781605586540

DOI:10.1145/1555860

Editors:
Ben Liblit (University of Wisconsin-Madison),
Nachiappan Nagappan (Microsoft Research),
Thomas Zimmermann (Microsoft Research)

Copyright © 2009 Sun Microsystems, Inc.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

In-Cooperation

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ISSTA '09

Sponsor:

SIGSOFT

ISSTA '09: International Symposium on Software Testing and Analysis

July 19, 2009

Chicago, Illinois

Bibliometrics & Citations

Article Metrics

Total Citations: 35
Total Downloads: 550
Downloads (Last 12 months): 30
Downloads (Last 6 weeks): 2

Reflects downloads up to 06 Feb 2025

Citations

Cited By

Beyer D, Grunske L, Kettl M, Lingsch-Rosenfeld M, Raselimo M, Spinellis D, Constantinou E, Bacchelli A (2024). P3: A Dataset of Partial Program Fixes. Proceedings of the 21st International Conference on Mining Software Repositories, pages 123-127. Online publication date: 15-Apr-2024
https://dl.acm.org/doi/10.1145/3643991.3644889
Elahi H, Wang G (2024). Forward-porting and its limitations in fuzzer evaluation. Information Sciences: an International Journal, 662:C. Online publication date: 1-Mar-2024
https://dl.acm.org/doi/10.1016/j.ins.2024.120142
Bitchebe S, Kone Y, Olivier P, Boukhobza J, Bromberg Y, Hagimont D, Tchana A (2023). GuaNary: Efficient Buffer Overflow Detection In Virtualized Clouds Using Intel EPT-based Sub-Page Write Protection Support. Proceedings of the ACM on Measurement and Analysis of Computing Systems, 7:3, pages 1-26. Online publication date: 7-Dec-2023
https://dl.acm.org/doi/10.1145/3626787
Zhu H, Rubio-González C, Grundy J, Pollock L, Penta M (2023). On the Reproducibility of Software Defect Datasets. Proceedings of the 45th International Conference on Software Engineering, pages 2324-2335. Online publication date: 14-May-2023
https://dl.acm.org/doi/10.1109/ICSE48619.2023.00195
Yu P, Wu Y, Peng X, Peng J, Zhang J, Xie P, Zhao W (2023). ViolationTracker: Building Precise Histories for Static Analysis Violations. 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), pages 2022-2034. Online publication date: May-2023
https://doi.org/10.1109/ICSE48619.2023.00171
Widyasari R, Prana G, Haryono S, Wang S, Lo D (2022). Real world projects, real faults: evaluating spectrum based fault localization techniques on Python projects. Empirical Software Engineering, 27:6. Online publication date: 1-Nov-2022
https://dl.acm.org/doi/10.1007/s10664-022-10189-4
Patra J, Pradel M, Spinellis D, Gousios G, Chechik M, Di Penta M (2021). Semantic bug seeding: a learning-based approach for creating realistic bugs. Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 906-918. Online publication date: 20-Aug-2021
https://dl.acm.org/doi/10.1145/3468264.3468623
Liu K, Kim D, Bissyande T, Yoo S, Le Traon Y (2021). Mining Fix Patterns for FindBugs Violations. IEEE Transactions on Software Engineering, 47:1, pages 165-188. Online publication date: 1-Jan-2021
https://doi.org/10.1109/TSE.2018.2884955
Laurent M, Saillard E, Quinson M (2021). The MPI Bugs Initiative: a Framework for MPI Verification Tools Evaluation. 2021 IEEE/ACM 5th International Workshop on Software Correctness for HPC Applications (Correctness), pages 1-9. Online publication date: Nov-2021
https://doi.org/10.1109/Correctness54621.2021.00008
Tang H, Nadi S (2021). On using Stack Overflow comment-edit pairs to recommend code maintenance changes. Empirical Software Engineering, 26:4. Online publication date: 11-May-2021
https://doi.org/10.1007/s10664-021-09954-8
