DOI: 10.1145/3639478.3641228

An Ensemble Method for Bug Triaging using Large Language Models

Published: 23 May 2024

Abstract

This study investigates the automation of bug triaging, the process of assigning bug reports to appropriate developers and components in software development. At the core of our investigation are six transformer-based Large Language Models (LLMs), which we fine-tuned using a sequence classification method tailored to bug triaging. Our results show that the DeBERTa model significantly outperforms its counterparts CodeBERT, DistilBERT, RoBERTa, ALBERT, and BERT in effectiveness. However, despite their varying performance, the models exhibit distinct degrees of orthogonality, indicating complementary strengths in their bug-triaging capabilities. Leveraging these orthogonal characteristics, we propose an ensemble method that combines these LLMs through voting and stacking techniques. Notably, our findings reveal that the voting-based ensemble surpasses all individual baselines in performance.
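To make the voting-based ensemble concrete, the sketch below shows one plausible reading of it: averaging class probabilities (soft voting) across independently fine-tuned sequence-classification checkpoints that share the same label set of developers or components. This is an illustrative assumption, not the authors' released implementation; the checkpoint paths are placeholders, only three of the six models are listed for brevity, and the paper may implement voting differently (e.g., hard majority voting over predicted labels).

# Minimal soft-voting sketch over fine-tuned transformer classifiers
# (hypothetical checkpoints; assumes a shared label set for bug triaging).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CHECKPOINTS = [
    "checkpoints/deberta-bug-triage",   # hypothetical fine-tuned DeBERTa
    "checkpoints/roberta-bug-triage",   # hypothetical fine-tuned RoBERTa
    "checkpoints/codebert-bug-triage",  # hypothetical fine-tuned CodeBERT
]

def ensemble_predict(report_text: str) -> int:
    """Average class probabilities across models and return the winning label id."""
    summed_probs = None
    for path in CHECKPOINTS:
        tokenizer = AutoTokenizer.from_pretrained(path)
        model = AutoModelForSequenceClassification.from_pretrained(path)
        model.eval()
        inputs = tokenizer(report_text, return_tensors="pt",
                           truncation=True, max_length=512)
        with torch.no_grad():
            logits = model(**inputs).logits
        probs = torch.softmax(logits, dim=-1)
        summed_probs = probs if summed_probs is None else summed_probs + probs
    # The label (developer/component) with the highest averaged probability wins.
    return int(summed_probs.argmax(dim=-1).item())

A hard-voting variant would instead take each model's argmax and pick the majority label, while the stacking variant mentioned in the abstract would feed the per-model probabilities into a meta-classifier (for example, logistic regression) trained on held-out data.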




Published In

ICSE-Companion '24: Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings
April 2024
531 pages
ISBN:9798400705021
DOI:10.1145/3639478

In-Cooperation

  • Faculty of Engineering of University of Porto

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 May 2024


Qualifiers

  • Short-paper

Conference

ICSE-Companion '24

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%


