DOI: 10.1145/3613372.3613396

Similar Bug Reports Recommendation System using BERT

Published: 25 September 2023

Abstract

Bug Reports (BRs) document software issues so that they can later be analyzed and corrected. Mozilla’s Bugzilla, for example, recorded over 8,000 new bugs for Firefox in 2020. A recommendation system can therefore be a valuable tool for improving productivity in software development, especially when maintainers must review and possibly fix a high volume of BRs. This study proposes and evaluates a BR recommendation system based on textual similarity, distinguished by its use of the state-of-the-art language model BERT as one of the factors in the similarity calculation. We use a dataset of 106k Mozilla BRs extracted from Bugzilla, an open-source platform. The main objective is to improve suggestions of BRs whose context is close to the one provided by the maintainer. We experimented with BERT in the similarity calculation both on its own and combined with the well-known TF-IDF vectorization technique. The results show gains of approximately 14% in the frequency of relevant BRs among the first 20 recommendations compared with a baseline that uses only TF-IDF vectorization. BERT improved the evaluated metrics (precision, feedback, and likelihood) when used as a complement to TF-IDF, but did not perform well in isolation. Overall, the findings could have implications for software development teams handling a high volume of BRs and could increase their productivity in resolving them.
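
The abstract does not spell out how the TF-IDF and BERT scores are combined, so the following is only a minimal sketch of the general idea, not the authors' implementation. It ranks reports by TF-IDF cosine similarity and optionally blends in a precomputed semantic similarity per report. The names `recommend`, `embedding_sims`, and `alpha`, the weighted average itself, and the naive whitespace tokenizer are all assumptions made for illustration; in practice the semantic scores would come from a BERT-style encoder, which is not included here.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF keeps weights positive even for terms in every document.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(query, reports, embedding_sims=None, alpha=0.5, top_k=20):
    """Return indices of the top_k reports most similar to the query.

    embedding_sims is a hypothetical list of precomputed semantic
    similarities (one per report, e.g. from a BERT-style encoder).
    alpha is a hypothetical weight: 1.0 means TF-IDF only, 0.0 means
    embedding similarity only. Both are assumptions, not from the paper.
    """
    vectors = tfidf_vectors(reports + [query])
    query_vec = vectors[-1]
    scored = []
    for i, vec in enumerate(vectors[:-1]):
        score = cosine(query_vec, vec)
        if embedding_sims is not None:
            score = alpha * score + (1 - alpha) * embedding_sims[i]
        scored.append((score, i))
    # Highest score first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:top_k]]


reports = [
    "browser crashes when opening a new tab",
    "font rendering is blurry on linux",
    "crash on startup when opening profile",
]
ranking = recommend("crashes when opening tab", reports)
```

With TF-IDF alone, the first report shares the most terms with the query and ranks first, while the second shares none and ranks last. Passing `embedding_sims` lets a semantic score reorder reports that share few literal terms with the query, which is the kind of complementary signal the abstract attributes to BERT.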

Cited By

  • (2024) Exploring the Role of Automation in Duplicate Bug Report Detection: An Industrial Case Study. Proceedings of the 5th ACM/IEEE International Conference on Automation of Software Test (AST 2024), 193-203. DOI: 10.1145/3644032.3644450. Online publication date: 15-Apr-2024

    Published In

    SBES '23: Proceedings of the XXXVII Brazilian Symposium on Software Engineering
    September 2023
    570 pages
    ISBN: 9798400707872
    DOI: 10.1145/3613372

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. BERT
    2. Bug Reports
    3. Bugzilla
    4. Recommendation System

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    SBES 2023
    SBES 2023: XXXVII Brazilian Symposium on Software Engineering
    September 25 - 29, 2023
    Campo Grande, Brazil

    Acceptance Rates

    Overall Acceptance Rate 147 of 427 submissions, 34%
