DOI: 10.1145/3387904.3389281
research-article

Exploiting Code Knowledge Graph for Bug Localization via Bi-directional Attention

Published: 12 September 2020

Abstract

Bug localization automatically locates the relevant source files within a software project given a natural-language description of a bug. In a large project containing hundreds or thousands of source files, developers spend a great deal of time understanding the bug reports generated by quality assurance and locating the buggy source files. Traditional methods depend heavily on information retrieval techniques that rank source files by their lexical similarity to bug reports. Recently, deep learning based models have been used to extract semantic information from code, yielding significant improvements in bug localization. However, a programming language is highly structured and logical, and code contains various relations within and across source files. We therefore propose KGBugLocator, which uses knowledge graph embeddings to extract these interrelations in code and a keyword-supervised bi-directional attention mechanism to regularize the model with interactive information between source files and bug reports. Extensive experiments on four different projects show that our model achieves new state-of-the-art (SOTA) results for bug localization.
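
To make the lexical baseline mentioned in the abstract concrete, the following is a minimal sketch of information-retrieval-style ranking: source files are scored by the cosine similarity between their TF-IDF vectors and the bug report's. It is not KGBugLocator and is not taken from the paper. The function names, the smoothed IDF formula, and the two toy files are assumptions made purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def rank_files(bug_report, files):
    """Return (file name, cosine similarity) pairs, most similar first.

    `files` maps a (hypothetical) file name to its source text.
    """
    docs = {name: Counter(tokenize(src)) for name, src in files.items()}
    n_docs = len(docs)

    # Document frequency: in how many files each term occurs.
    df = Counter()
    for counts in docs.values():
        df.update(counts.keys())

    def weigh(counts):
        # Term frequency times a smoothed inverse document frequency
        # (an assumed weighting; many variants exist).
        return {
            term: tf * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, tf in counts.items()
        }

    def cosine(a, b):
        dot = sum(w * b.get(term, 0.0) for term, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query = weigh(Counter(tokenize(bug_report)))
    scores = [(name, cosine(query, weigh(counts))) for name, counts in docs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy data, invented for this example.
files = {
    "Parser.java": "class Parser { Token parse(String input) { throw new NullPointerException(); } }",
    "Logger.java": "class Logger { void log(String message) { System.out.println(message); } }",
}
report = "NullPointerException thrown when the parser is given empty input"
ranking = rank_files(report, files)
```

In this toy case `Parser.java` ranks first because it shares the tokens `parser`, `input`, and `nullpointerexception` with the report, while `Logger.java` shares none. A purely lexical score of this kind cannot see relations between files, which is the gap the abstract says the knowledge graph embeddings are meant to address.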


Published In

ICPC '20: Proceedings of the 28th International Conference on Program Comprehension
July 2020
481 pages
ISBN: 9781450379588
DOI: 10.1145/3387904
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. bug localization
  2. code representation
  3. deep learning
  4. knowledge graph

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICPC '20
Article Metrics

  • Downloads (last 12 months): 125
  • Downloads (last 6 weeks): 24
Reflects downloads up to 25 Nov 2024

Cited By

  • (2024) UIGuider: Detecting Implicit Design Guidelines Using a Domain Knowledge Graph Approach. Electronics 13:7 (1210). DOI: 10.3390/electronics13071210. Online publication date: 26-Mar-2024
  • (2024) RLocator: Reinforcement Learning for Bug Localization. IEEE Transactions on Software Engineering 50:10 (2695-2708). DOI: 10.1109/TSE.2024.3452595. Online publication date: 1-Oct-2024
  • (2024) A survey on machine learning techniques applied to source code. Journal of Systems and Software 209:C. DOI: 10.1016/j.jss.2023.111934. Online publication date: 14-Mar-2024
  • (2024) A systematic mapping study of bug reproduction and localization. Information and Software Technology 165:C. DOI: 10.1016/j.infsof.2023.107338. Online publication date: 1-Jan-2024
  • (2024) bjEnet: a fast and accurate software bug localization method in natural language semantic space. Software Quality Journal 32:4 (1515-1538). DOI: 10.1007/s11219-024-09693-1. Online publication date: 22-Jul-2024
  • (2023) Capturing the long-distance dependency in the control flow graph via structural-guided attention for bug localization. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (2242-2250). DOI: 10.24963/ijcai.2023/249. Online publication date: 19-Aug-2023
  • (2023) Pre-training Code Representation with Semantic Flow Graph for Effective Bug Localization. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (579-591). DOI: 10.1145/3611643.3616338. Online publication date: 30-Nov-2023
  • (2023) Studying the Influence and Distribution of the Human Effort in a Hybrid Fitness Function for Search-Based Model-Driven Engineering. IEEE Transactions on Software Engineering 49:12 (5189-5202). DOI: 10.1109/TSE.2023.3329730. Online publication date: 1-Dec-2023
  • (2023) BL-GAN: Semi-Supervised Bug Localization via Generative Adversarial Network. IEEE Transactions on Knowledge and Data Engineering 35:11 (11112-11125). DOI: 10.1109/TKDE.2022.3225329. Online publication date: 1-Nov-2023
  • (2023) Documentation-Guided API Sequence Search without Worrying about the Text-API Semantic Gap. 2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) (343-354). DOI: 10.1109/SANER56733.2023.00040. Online publication date: Mar-2023
