Predicting buggy changes inside an integrated development environment

Authors:

Janaki T. Madhavan,

E. James Whitehead, Jr.

eclipse '07: Proceedings of the 2007 OOPSLA workshop on eclipse technology eXchange

Pages 36 - 40

https://doi.org/10.1145/1328279.1328287

Published: 21 October 2007

Abstract

We present a tool that predicts whether the software under development inside an IDE has a bug. An IDE plugin performs this prediction, using the Change Classification technique to classify source code changes as buggy or clean during the editing session. Change Classification uses a Support Vector Machine (SVM), a machine learning classifier, trained on the changes to a project mined from its configuration management repository. Besides being language independent and relatively accurate, this technique can (a) classify a change immediately upon its completion and (b) predict buggy changes using features extracted solely from the change delta (added and deleted code) and the source code. Integrating Change Classification within an IDE can therefore flag potential bugs as the developer edits the source code, ideally reducing the time spent fixing bugs later. To this end, we have developed a Change Classification plugin for Eclipse based on a client-server architecture, which this paper describes.
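
The classification step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the token-based feature scheme, the four-entry toy history, the function names, and the hyperparameters are all assumptions made for illustration, and the linear SVM is trained here with plain subgradient descent on the hinge loss rather than with whatever SVM library the plugin uses.

```python
import re
from collections import Counter

def extract_features(added, deleted):
    """Bag-of-words over the change delta (hypothetical feature scheme).

    Tokens are prefixed with 'add:' or 'del:' so that the same identifier
    counts as a different feature when it is added than when it is deleted.
    """
    feats = Counter()
    for prefix, lines in (("add:", added), ("del:", deleted)):
        for line in lines:
            # Identifiers/keywords, or single punctuation characters.
            for tok in re.findall(r"[A-Za-z_]\w*|[^\sA-Za-z_0-9]", line):
                feats[prefix + tok] += 1
    return feats

def score(weights, bias, feats):
    """Signed distance-like score: positive means 'buggy'."""
    return bias + sum(weights.get(k, 0.0) * v for k, v in feats.items())

def train_linear_svm(samples, epochs=500, lr=0.01, lam=0.01):
    """Subgradient descent on the L2-regularised hinge loss.

    samples: list of (features, label) pairs, label +1 (buggy) or -1 (clean).
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, y in samples:
            margin = y * score(weights, bias, feats)
            for k in weights:               # L2 shrinkage of existing weights
                weights[k] *= 1.0 - lr * lam
            if margin < 1.0:                # hinge loss is active for this sample
                for k, v in feats.items():
                    weights[k] = weights.get(k, 0.0) + lr * y * v
                bias += lr * y
    return weights, bias

def classify(weights, bias, added, deleted):
    feats = extract_features(added, deleted)
    return "buggy" if score(weights, bias, feats) > 0 else "clean"

# Toy stand-in for changes mined from a repository: (added, deleted, label).
# These entries are invented for illustration only.
HISTORY = [
    (["if (ptr == null) return;"], [], -1),
    (["ptr.close();"], ["if (ptr == null) return;"], +1),
    (["log.info(msg);"], [], -1),
    (["buf[i + 1] = x;"], ["buf[i] = x;"], +1),
]

samples = [(extract_features(a, d), y) for a, d, y in HISTORY]
weights, bias = train_linear_svm(samples)

# Classify a change as soon as it is complete, using only its delta.
print(classify(weights, bias, ["ptr.close();"], ["if (ptr == null) return;"]))
```

In a client-server arrangement like the one the abstract mentions, training would plausibly run on the server against the mined history, while the IDE side would only need `extract_features` and a call equivalent to `classify`; that division of labour is an assumption, not something the abstract states.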

Index Terms

Predicting buggy changes inside an integrated development environment
1. Software and its engineering
  1. Software creation and management
    1. Software post-development issues
      1. Software version control
  2. Software notations and tools
    1. Development frameworks and environments
      1. Integrated and visual development environments
    2. Software configuration management and version control systems

Information & Contributors

Information

Published In

eclipse '07: Proceedings of the 2007 OOPSLA workshop on eclipse technology eXchange

October 2007

79 pages

ISBN:9781605580159

DOI:10.1145/1328279

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

OOPSLA07

Sponsor:

SIGPLAN

OOPSLA07: ACM SIGPLAN Object Oriented Programming Systems and Applications Conference

October 21, 2007

Montreal, Quebec, Canada

Acceptance Rates

Overall Acceptance Rate 38 of 79 submissions, 48%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 12
Total Downloads: 309
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 16 Nov 2024

Citations

Cited By

Sharma T, Kechagia M, Georgiou S, Tiwari R, Vats I, Moazen H, Sarro F (2024). A survey on machine learning techniques applied to source code. Journal of Systems and Software, 209:C. DOI: 10.1016/j.jss.2023.111934. Online publication date: 14-Mar-2024.
https://dl.acm.org/doi/10.1016/j.jss.2023.111934
Kawalerowicz M, Madeyski L (2023). Continuous build outcome prediction: an experimental evaluation and acceptance modelling. Applied Intelligence, 53:8 (8673-8692). DOI: 10.1007/s10489-023-04523-6. Online publication date: 18-Apr-2023.
https://doi.org/10.1007/s10489-023-04523-6
Kawalerowicz M, Madeyski L (2021). Jaskier: A Supporting Software Tool for Continuous Build Outcome Prediction Practice. Advances and Trends in Artificial Intelligence. From Theory to Practice (426-438). DOI: 10.1007/978-3-030-79463-7_36. Online publication date: 19-Jul-2021.
https://doi.org/10.1007/978-3-030-79463-7_36
Kawalerowicz M, Madeyski L (2021). Continuous Build Outcome Prediction: A Small-N Experiment in Settings of a Real Software Project. Advances and Trends in Artificial Intelligence. From Theory to Practice (412-425). DOI: 10.1007/978-3-030-79463-7_35. Online publication date: 19-Jul-2021.
https://doi.org/10.1007/978-3-030-79463-7_35
Xia X, Lo D, Wang X, Yang X (2016). Collective Personalized Change Classification With Multiobjective Search. IEEE Transactions on Reliability, 65:4 (1810-1829). DOI: 10.1109/TR.2016.2588139. Online publication date: Dec-2016.
https://doi.org/10.1109/TR.2016.2588139
Misirli A, Shihab E, Kamei Y (2016). Studying high impact fix-inducing changes. Empirical Software Engineering, 21:2 (605-641). DOI: 10.1007/s10664-015-9370-z. Online publication date: 1-Apr-2016.
https://dl.acm.org/doi/10.1007/s10664-015-9370-z
Shivaji S, Whitehead E, Akella R, Kim S (2013). Reducing Features to Improve Code Change-Based Bug Prediction. IEEE Transactions on Software Engineering, 39:4 (552-569). DOI: 10.1109/TSE.2012.43. Online publication date: 1-Apr-2013.
https://dl.acm.org/doi/10.1109/TSE.2012.43
Hall T, Beecham S, Bowes D, Gray D, Counsell S (2012). A Systematic Literature Review on Fault Prediction Performance in Software Engineering. IEEE Transactions on Software Engineering, 38:6 (1276-1304). DOI: 10.1109/TSE.2011.103. Online publication date: 1-Nov-2012.
https://dl.acm.org/doi/10.1109/TSE.2011.103
Maskeri G, Karnam D, Viswanathan S, Padmanabhuni S (2012). Bug Prediction Metrics Based Decision Support for Preventive Software Maintenance. Proceedings of the 2012 19th Asia-Pacific Software Engineering Conference - Volume 01 (260-269). DOI: 10.1109/APSEC.2012.43. Online publication date: 4-Dec-2012.
https://dl.acm.org/doi/10.1109/APSEC.2012.43
Hata H, Mizuno O, Kikuno T (2010). Fault-prone module detection using large-scale text features based on spam filtering. Empirical Software Engineering, 15:2 (147-165). DOI: 10.1007/s10664-009-9117-9. Online publication date: 1-Apr-2010.
https://dl.acm.org/doi/10.1007/s10664-009-9117-9
