
BERT- and TF-IDF-based feature extraction for long-lived bug prediction in FLOSS: A comparative study

Published: 01 August 2023

Abstract

Context:

Correctly predicting long-lived bugs could help maintenance teams plan their work and fix more of the bugs that degrade software quality and disturb the user experience across versions in Free/Libre Open-Source Software (FLOSS). Machine Learning and Text Mining methods have been applied to many real-world prediction problems, including bug report handling.

Objective:

Our research aims to compare the accuracy of Machine Learning classifiers for long-lived bug prediction in FLOSS when using Bidirectional Encoder Representations from Transformers (BERT)-based versus Term Frequency–Inverse Document Frequency (TF-IDF)-based feature extraction. In addition, we investigate BERT variants on the same task.

Method:

We collected bug reports from six popular FLOSS projects and applied Machine Learning classifiers to predict long-lived bugs. We then compared different feature extractors, based on BERT and TF-IDF, on the long-lived bug prediction task.
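The extraction-and-classification pipeline described above can be sketched as follows. This is a minimal illustration using scikit-learn, not the authors' actual code; the bug-report summaries and long-lived labels are invented for demonstration, and a BERT-based extractor would replace the TF-IDF step with contextual sentence embeddings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

# Toy bug-report summaries and labels (1 = long-lived, 0 = not);
# invented for illustration only.
reports = [
    "crash on startup after update",
    "memory leak in long running session",
    "typo in settings dialog label",
    "deadlock when saving large project",
]
labels = [1, 1, 0, 1]

# TF-IDF turns each report into a sparse vector of term weights.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(reports)

# Any scikit-learn classifier can consume these features;
# the paper evaluates several, including SVM.
clf = SVC(kernel="linear").fit(X, labels)
pred = clf.predict(vectorizer.transform(["crash when saving session"]))
print(pred)
```

With a BERT extractor, `X` would instead be a dense matrix of embeddings (one vector per report), but the classifier interface stays the same.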

Results:

We found that long-lived bug prediction using BERT-based feature extraction systematically outperformed TF-IDF-based extraction. With BERT features, SVM and Random Forest outperformed the other classifiers on almost all datasets. Furthermore, smaller BERT architectures proved competitive.
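A classifier comparison like the one reported here is typically done with cross-validation over the extracted feature matrix. The sketch below uses a synthetic matrix standing in for BERT sentence embeddings (e.g. 768-dimensional [CLS] vectors); the data, dimensions, and resulting scores are illustrative, not the paper's results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic features standing in for BERT embeddings of bug reports;
# generated here purely for illustration.
X, y = make_classification(n_samples=200, n_features=64, random_state=0)

# Mean 5-fold cross-validated accuracy for each classifier.
scores = {}
for name, clf in [
    ("SVM", SVC()),
    ("RandomForest", RandomForestClassifier(random_state=0)),
]:
    scores[name] = cross_val_score(clf, X, y, cv=5).mean()

for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

The same loop extends naturally to the other classifiers and to per-project feature matrices.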

Conclusion:

Our results demonstrate a promising avenue for predicting long-lived bugs based on BERT contextual embedding features and fine-tuning procedures.



Published In

Information and Software Technology, Volume 160, Issue C, August 2023, 251 pages

Publisher

Butterworth-Heinemann

United States


Author Tags

  1. Software maintenance
  2. Bug Tracking System
  3. Long-lived bugs
  4. Machine learning
  5. Text mining
  6. Natural Language Processing
  7. BERT

Qualifiers

  • Research-article


Cited By

  • (2024) Large Language Models for Software Engineering: A Systematic Literature Review. ACM Transactions on Software Engineering and Methodology. doi:10.1145/3695988. Online publication date: 20-Sep-2024.
  • (2024) Early and Realistic Exploitability Prediction of Just-Disclosed Software Vulnerabilities: How Reliable Can It Be? ACM Transactions on Software Engineering and Methodology, 33(6), 1–41. doi:10.1145/3654443. Online publication date: 27-Jun-2024.
  • (2024) How Does Pre-trained Language Model Perform on Deep Learning Framework Bug Prediction? In: Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings, pp. 346–347. doi:10.1145/3639478.3643113. Online publication date: 14-Apr-2024.
  • (2024) Content-based quality evaluation of scientific papers using coarse feature and knowledge entity network. Journal of King Saud University - Computer and Information Sciences, 36(6). doi:10.1016/j.jksuci.2024.102119. Online publication date: 1-Jul-2024.
  • (2024) Modelling customer requirement for mobile games based on online reviews using BW-CNN and S-Kano models. Expert Systems with Applications, 258(C). doi:10.1016/j.eswa.2024.125142. Online publication date: 15-Dec-2024.
  • (2024) Enhancing Accessibility in Online Shopping: A Dataset and Summarization Method for Visually Impaired Individuals. SN Computer Science, 5(8). doi:10.1007/s42979-024-03351-w. Online publication date: 2-Nov-2024.
  • (2024) A three-stage quality evaluation method for experience products: taking animation as an example. Multimedia Systems, 30(4). doi:10.1007/s00530-024-01401-0. Online publication date: 8-Jul-2024.
  • (2024) User Story Classification with Machine Learning and LLMs. In: Knowledge Science, Engineering and Management, pp. 161–175. doi:10.1007/978-981-97-5492-2_13. Online publication date: 16-Aug-2024.
  • (2023) Effective Recommendation of Cross-Project Correlated Issues based on Issue Metrics. In: Proceedings of the 14th Asia-Pacific Symposium on Internetware. doi:10.1145/3609437.3609462. Online publication date: 4-Aug-2023.
