DOI: 10.1145/1858996.1859013
research-article

Automatic construction of an effective training set for prioritizing static analysis warnings

Published: 20 September 2010

Abstract

To improve the ineffective warning prioritization of static analysis tools, various approaches have been proposed that compute a ranking score for each warning. In these approaches, an effective training set is vital for exploring which factors affect the ranking score and how. Manual approaches to building a training set can achieve high effectiveness but suffer from low efficiency (i.e., high cost), whereas existing automatic approaches suffer from low effectiveness. In this paper, we propose an automatic approach for constructing an effective training set. We select three categories of impact factors as input attributes of the training set and propose a new heuristic for identifying actionable warnings in order to label the training set automatically. Our empirical evaluation shows that, with the help of our constructed training set, the precision of the top 22 warnings for Lucene, the top 20 for ANT, and the top 6 for Spring reaches 100%.
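
The precision figures quoted in the abstract are top-N precision values: the share of the N highest-ranked warnings that turn out to be actionable. The following is a minimal sketch of that measure, assuming each ranked warning carries a boolean label marking whether it is actionable. The function name `precision_at_k` and the sample ranking are hypothetical illustrations, not taken from the paper.

```python
from typing import Sequence


def precision_at_k(ranked_labels: Sequence[bool], k: int) -> float:
    """Return the fraction of the k highest-ranked warnings that are actionable.

    ranked_labels is ordered from highest to lowest ranking score;
    True marks an actionable warning (hypothetical labeling).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_labels[:k]
    if not top:
        return 0.0
    # Booleans sum as 0/1, so this counts the actionable warnings in the top k.
    return sum(top) / len(top)


# Hypothetical ranking of six warnings, best-ranked first.
ranked = [True, True, True, False, True, False]
print(precision_at_k(ranked, 3))  # 1.0
print(precision_at_k(ranked, 4))  # 0.75
```

Under this reading, a precision of 100% for the top 22 warnings means that all 22 highest-ranked warnings were actionable.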



Published In

ASE '10: Proceedings of the 25th IEEE/ACM International Conference on Automated Software Engineering
September 2010
534 pages
ISBN:9781450301169
DOI:10.1145/1858996

Sponsors

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. generic-bug-related lines
  2. static analysis tools
  3. training-set construction
  4. warning prioritization

Qualifiers

  • Research-article

Conference

ASE10

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%

Article Metrics

  • Downloads (last 12 months): 15
  • Downloads (last 6 weeks): 5
Reflects downloads up to 21 Nov 2024

Cited By

  • (2024) Machine Learning for Actionable Warning Identification: A Comprehensive Survey. ACM Computing Surveys 57:2 (1-35). DOI: 10.1145/3696352. Online publication date: 19-Sep-2024
  • (2024) Reducing False Positives of Static Bug Detectors Through Code Representation Learning. 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) (681-692). DOI: 10.1109/SANER60148.2024.00075. Online publication date: 12-Mar-2024
  • (2023) Resolving Security Issues via Quality-Oriented Refactoring: A User Study. 2023 ACM/IEEE International Conference on Technical Debt (TechDebt) (82-91). DOI: 10.1109/TechDebt59074.2023.00016. Online publication date: May-2023
  • (2023) Mitigating False Positive Static Analysis Warnings: Progress, Challenges, and Opportunities. IEEE Transactions on Software Engineering 49:12 (5154-5188). DOI: 10.1109/TSE.2023.3329667. Online publication date: 1-Dec-2023
  • (2023) How to Find Actionable Static Analysis Warnings: A Case Study With FindBugs. IEEE Transactions on Software Engineering 49:4 (2856-2872). DOI: 10.1109/TSE.2023.3234206. Online publication date: 1-Apr-2023
  • (2023) An Empirical Study of Class Rebalancing Methods for Actionable Warning Identification. IEEE Transactions on Reliability 72:4 (1648-1662). DOI: 10.1109/TR.2023.3234982. Online publication date: Dec-2023
  • (2023) Uncertainty-aware consistency checking in industrial settings. 2023 ACM/IEEE 26th International Conference on Model Driven Engineering Languages and Systems (MODELS) (73-83). DOI: 10.1109/MODELS58315.2023.00026. Online publication date: 1-Oct-2023
  • (2023) Understanding Why and Predicting When Developers Adhere to Code-Quality Standards. Proceedings of the 45th International Conference on Software Engineering: Software Engineering in Practice (432-444). DOI: 10.1109/ICSE-SEIP58684.2023.00045. Online publication date: 17-May-2023
  • (2023) A critical comparison on six static analysis tools: Detection, agreement, and precision. Journal of Systems and Software 198 (111575). DOI: 10.1016/j.jss.2022.111575. Online publication date: Apr-2023
  • (2023) An unsupervised feature selection approach for actionable warning identification. Expert Systems with Applications 227 (120152). DOI: 10.1016/j.eswa.2023.120152. Online publication date: Oct-2023
