DOI: 10.1145/2970276.2970359

Locus: locating bugs from software changes

Published: 25 August 2016

Abstract

Various information retrieval (IR) based techniques have recently been proposed to locate bugs automatically at the file level. However, their usefulness is often compromised by the coarse granularity of files and the lack of contextual information. To address this, we propose to locate bugs using software changes, which offer finer granularity than files and provide important contextual clues for bug fixing. We observe that bug-inducing changes can facilitate the bug-fixing process. For example, they help triage the bug-fixing task to the developers who committed the bug-inducing changes, or enable developers to fix bugs by reverting these changes. Our study further finds that change logs and the naturally small granularity of changes can help boost the performance of IR-based bug localization. Motivated by these observations, we propose an IR-based approach, Locus, to locate bugs from software changes, and evaluate it on six large open source projects. The results show that Locus significantly outperforms existing techniques at source-file-level localization. In particular, MAP and MRR improve, on average, by 20.1% and 20.5%, respectively. Locus is also capable of ranking the inducing changes within the top 5 for 41.0% of the bugs. The results also show that Locus can significantly reduce the number of lines that need to be scanned to locate a bug compared with existing techniques.
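The metrics quoted in the abstract (MAP, MRR, and a top-5 hit rate) can be illustrated with a minimal Python sketch based on their standard IR definitions. This is not taken from the Locus implementation: the function names, the choice to divide average precision by the total number of relevant items, and the toy rankings are all assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Sequence, Set, Tuple

Query = Tuple[Sequence[Hashable], Set[Hashable]]  # (ranked results, relevant items)


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/position of the first relevant item, or 0.0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average the precision values measured at each relevant item's position.

    Assumption: the sum is divided by the total number of relevant items,
    so relevant items missing from the ranking lower the score.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(relevant)


def top_n_hit(ranked: Sequence[Hashable], relevant: Set[Hashable], n: int) -> bool:
    """True if at least one relevant item appears in the first n results."""
    return any(item in relevant for item in ranked[:n])


def evaluate(queries: List[Query], n: int = 5) -> Dict[str, float]:
    """Aggregate MRR, MAP, and the Top-N hit rate over all queries (bug reports)."""
    if not queries:
        raise ValueError("at least one query is required")
    total = len(queries)
    return {
        "MRR": sum(reciprocal_rank(r, rel) for r, rel in queries) / total,
        "MAP": sum(average_precision(r, rel) for r, rel in queries) / total,
        "Top@N": sum(top_n_hit(r, rel, n) for r, rel in queries) / total,
    }


# Hypothetical toy data: two bug reports, each with a ranked list of
# candidates and the set of candidates that are actually buggy.
example_queries: List[Query] = [
    (["a", "b", "c"], {"b"}),
    (["x", "y", "z"], {"x", "z"}),
]

if __name__ == "__main__":
    print(evaluate(example_queries, n=1))
```

Here each query stands for one bug report, and the ranked list may contain files or changes depending on the granularity being evaluated.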


Published In

ASE '16: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering
August 2016
899 pages
ISBN: 9781450338455
DOI: 10.1145/2970276
General Chair: David Lo
Program Chairs: Sven Apel, Sarfraz Khurshid
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. bug localization
  2. information retrieval
  3. software analytics
  4. software changes

Qualifiers

  • Research-article

Conference

ASE'16
Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%

Cited By

  • (2024) FBDetect: Catching Tiny Performance Regressions at Hyperscale through In-Production Monitoring. Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles (522-540). DOI: 10.1145/3694715.3695977. Online publication date: 4-Nov-2024
  • (2024) Effective Vulnerable Function Identification based on CVE Description Empowered by Large Language Models. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (393-405). DOI: 10.1145/3691620.3695013. Online publication date: 27-Oct-2024
  • (2024) How Well Industry-Level Cause Bisection Works in Real-World: A Study on Linux Kernel. Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering (62-73). DOI: 10.1145/3663529.3663828. Online publication date: 10-Jul-2024
  • (2024) Aligning Programming Language and Natural Language: Exploring Design Choices in Multi-Modal Transformer-Based Embedding for Bug Localization. Proceedings of the Third ACM/IEEE International Workshop on NL-based Software Engineering (1-8). DOI: 10.1145/3643787.3648028. Online publication date: 20-Apr-2024
  • (2024) ChangeRCA: Finding Root Causes from Software Changes in Large Online Systems. Proceedings of the ACM on Software Engineering 1:FSE (24-46). DOI: 10.1145/3643728. Online publication date: 12-Jul-2024
  • (2024) Evaluating SZZ Implementations: An Empirical Study on the Linux Kernel. IEEE Transactions on Software Engineering 50:9 (2219-2239). DOI: 10.1109/TSE.2024.3406718. Online publication date: 29-May-2024
  • (2024) SpecNLP: A Pre-trained Model Enhanced with Spectrum Profile for Bug Localization. 2024 IEEE International Conference on Artificial Intelligence Testing (AITest) (81-86). DOI: 10.1109/AITest62860.2024.00018. Online publication date: 15-Jul-2024
  • (2024) Boosting fault localization of statements by combining topic modeling and Ochiai. Information and Software Technology 173 (107499). DOI: 10.1016/j.infsof.2024.107499. Online publication date: Sep-2024
  • (2024) A systematic mapping study of bug reproduction and localization. Information and Software Technology 165:C. DOI: 10.1016/j.infsof.2023.107338. Online publication date: 1-Jan-2024
  • (2024) When debugging encounters artificial intelligence: state of the art and open challenges. Science China Information Sciences 67:4. DOI: 10.1007/s11432-022-3803-9. Online publication date: 21-Feb-2024
