A Survey on Bug Deduplication and Triage Methods from Multiple Points of View
<p>Bug management process.</p> "> Figure 2
<p>Information contained in a typical bug report.</p> "> Figure 3
<p>The life cycle of a bug report.</p> "> Figure 4
<p>Process of feature extraction and similarity calculation for bug reports.</p> "> Figure 5
<p>The number and trend of bug-deduplication- and triage-related work in recent years.</p> "> Figure 6
<p>How the state-of-the-art deduplication and triage techniques evolved.</p> "> Figure 7
<p>Overview of related work in recent years.</p> "> Figure 8
<p>Example image of a CNN for text classification.</p> "> Figure 9
<p>Calculation process in an LSTM unit for text classification.</p> "> Figure 10
<p>Transfer learning used to build ML models for text classification.</p> "> Figure 11
<p>Frames in a stack trace from Eclipse.</p> "> Figure 12
<p>Alignment process of stack traces.</p> "> Figure 13
<p>Seed processing in fuzz testing.</p> "> Figure 14
<p>New coverage cases.</p> "> Figure 15
<p>Source code converted to CFG.</p> "> Figure 16
<p>Weisfeiler–Lehman subtree kernel algorithm.</p> "> Figure 17
<p>General process of information-retrieval-based work.</p> "> Figure 18
<p>General process of machine-learning-based work.</p> ">
Abstract
:1. Introduction
- What is the roadmap of deduplication- and triage-related work? What mathematical methods are commonly used to address these problems?
- What are the main approaches currently used for deduplication and triage? What are the recent works on each approach and how are they implemented?
- What datasets are used in the related works? How are these works evaluated, and what are their actual results?
- What conclusions can be drawn from the current works? What are the potential research directions for the future?
- This paper summarizes the mathematical concepts and methods that are commonly used by bug deduplication and triage methods.
- This paper summarizes relevant works based on runtime information and analyzes the commonly used technical approaches from three perspectives. This paper has provides a comparison of the implementation methods and results of each work.
- This paper summarizes relevant works based on bug reports and explains the technical principles from two perspectives: information retrieval and machine learning. This paper provides detailed descriptions of the implementation approaches of various methods and a comparative analysis of their performance differences.
- This paper draws some empirical findings and proposes some possible future research points in terms of bug deduplication and triage.
2. Related Survey
3. The Roadmap of Existing Literature
3.1. Overview of Relevant Literature in Recent Years
3.2. Background Knowledge
3.2.1. Feature Extraction and Selection
3.2.2. Similarity Evaluation Model
3.3. Commonly Used Datasets
3.4. Evaluation Parameters
4. Works Based on Runtime Information
4.1. Methods Based on Comparing Stack Trace
4.2. Methods Based on Analysis Coverage
4.3. Methods Based on Context Comparison
5. Works Based on Bug Reports
5.1. Information Retrieval Approaches for Deduplication and Triage
5.2. Machine Learning Approaches for Deduplication and Triage
6. Evaluation Methods and Results
7. Findings and Future Direction
7.1. Findings from Existing Works
- The currently used BTS use the approaches based on bug reports to implement deduplication and triage, which is mainly determined by the ease of obtaining and transmitting bug reports. The biggest obstacle to using runtime information-based methods lies in the complete acquisition and format conversion of runtime information, which is also a possible research direction in the future. The similarity measurement used by BTS also generally requires more accurate text matching, which also reduces the effectiveness of automatic deduplication and requires more human resources to complete accurate deduplication and triage.
- Stack trace hash has been widely used in many works(∼50%) due to its ready availability and general benefits. It is helpful in identifying root causes and facilitating quick scenario reconstruction and has a certain level of usability. However, its accuracy in determining bug uniqueness is not high. For example, different paths leading to the same crash point may result in splitting of what should be considered the same bug into different ones. Similarly, identical call sequences with different specific values may result in grouping of bugs that should be considered different.
- Relying solely on coverage information is also inaccurate. This is mainly because there may be new execution paths unrelated to triggering the bug, which can lead to different coverage information for the same bug.
- When using runtime information of a program for bug deduplication and triage, false positives may occur because the same bug may exhibit different crash points, error messages, etc.
- In works based on information retrieval, the main sources of information include the bug’s basic attributes, crash dumps, stack traces, etc. Among them, stack trace is the most important analysis component, and almost all works(over 90%) refer to it to some extent.
- In works using machine learning methods based on bug reports, the basic approach aligns well with NLP processing approaches. Therefore, most of these works(∼67%) utilize neural network models such as LSTM and CNN. The key input for such works is the textual description information in the bug report. In some works(∼33%), the abilities of developers are also modeled and extracted as features to enhance the model’s recognition capability.
- More than 90% of works use open-source databases as test objects. For authors belonging to certain companies, in addition to public, open-source datasets, they also use the company’s datasets, such as JetBrains, Ericsson, etc.
- Generally, works based on runtime information tend to have better performance compared than those based on bug reports, but they also come with greater overhead. This is mainly because the accuracy of bug reports cannot be fully guaranteed.
7.2. Future Directions
- Stack trace is a primary source of information for works based on information retrieval. However, existing research has shown that this information may not be sufficient to accurately locate bug characteristics. Therefore, future work can consider studying methods to enrich and strengthen stack traces.
- Works based on bug reports heavily rely on bug descriptions, and the accuracy of these descriptions has a significant impact on the results. Currently, most works lack an evaluation analysis of the availability of bug reports. In future work, it would be beneficial to construct models to evaluate the usability of bug reports.
- Current works using DNN models mainly focus on CNN and LSTM. In future work, consideration of the use of updated models such as transformers can be explored to evaluate their effectiveness.
- Exploring ways to improve the efficiency of collecting runtime information and reducing the complexity of processing is a hopeful research point.
8. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Anvik, J.; Hiew, L.; Murphy, G.C. Coping with an open bug repository. In Proceedings of the 2005 OOPSLA Workshop on Eclipse Technology Exchange, San Diego, CA, USA, 16–17 October 2005; pp. 35–39. [Google Scholar]
- Banerjee, S.; Helmick, J.; Syed, Z.; Cukic, B. Eclipse vs. mozilla: A comparison of two large-scale open source problem report repositories. In Proceedings of the 2015 IEEE 16th International Symposium on High Assurance Systems Engineering, Daytona Beach Shores, FL, USA, 8–10 January 2015; pp. 263–270. [Google Scholar]
- Banerjee, S.; Cukic, B. On the cost of mining very large open source repositories. In Proceedings of the 2015 IEEE/ACM 1st International Workshop on Big Data Software Engineering, Florence, Italy, 23 May 2015; pp. 37–43. [Google Scholar]
- Angell, R.; Oztalay, B.; DeOrio, A. A topological approach to hardware bug triage. In Proceedings of the 2015 16th International Workshop on Microprocessor and SOC Test and Verification (MTV), Austin, TX, USA, 3–4 December 2015; pp. 20–25. [Google Scholar]
- Golagha, M.; Lehnhoff, C.; Pretschner, A.; Ilmberger, H. Failure clustering without coverage. In Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, Beijing, China, 15–19 July 2019; pp. 134–145. [Google Scholar]
- Mu, D.; Wu, Y.; Chen, Y.; Lin, Z.; Yu, C.; Xing, X.; Wang, G. An In-depth Analysis of Duplicated Linux Kernel Bug Reports. In Proceedings of the Network and Distributed Systems Security (NDSS) Symposium 2022, San Diego, CA, USA, 24–28 April 2022. [Google Scholar]
- Lee, D.G.; Seo, Y.S. Systematic Review of Bug Report Processing Techniques to Improve Software Management Performance. J. Inf. Process. Syst. 2019, 15, 967–985. [Google Scholar]
- Jahanshahi, H.; Cevik, M.; Mousavi, K.; Başar, A. ADPTriage: Approximate Dynamic Programming for Bug Triage. arXiv 2022, arXiv:2211.00872. [Google Scholar]
- Wu, H.; Ma, Y.; Xiang, Z.; Yang, C.; He, K. A spatial—Temporal graph neural network framework for automated software bug triaging. Knowl.-Based Syst. 2022, 241, 108308. [Google Scholar] [CrossRef]
- Zhao, Y.; He, T.; Chen, Z. A unified framework for bug report assignment. Int. J. Softw. Eng. Knowl. Eng. 2019, 29, 607–628. [Google Scholar] [CrossRef]
- Lee, D.G.; Seo, Y.S. Improving bug report triage performance using artificial intelligence based document generation model. Hum.-Centric Comput. Inf. Sci. 2020, 10, 26. [Google Scholar] [CrossRef]
- Neysiani, B.S.; Babamir, S.M. Automatic duplicate bug report detection using information retrieval-based versus machine learning-based approaches. In Proceedings of the 2020 6th International Conference on Web Research (ICWR), Tehran, Iran, 22–23 April 2020; pp. 288–293. [Google Scholar]
- Uddin, J.; Ghazali, R.; Deris, M.M.; Naseem, R.; Shah, H. A survey on bug prioritization. Artif. Intell. Rev. 2017, 47, 145–180. [Google Scholar] [CrossRef]
- Sawant, V.B.; Alone, N.V. A survey on various techniques for bug triage. Int. Res. J. Eng. Technol. 2015, 2, 917–920. [Google Scholar]
- Neysiani, B.S.; Babamir, S.M. Methods of feature extraction for detecting the duplicate bug reports in software triage systems. In Proceedings of the International Conference on Information Technology, Communications and Telecommunications (IRICT), Tehran, Iran, 1 March 2016; Volume 2016. [Google Scholar]
- Yadav, A.; Singh, S.K. Survey based classification of bug triage approaches. APTIKOM J. Comput. Sci. Inf. Technol. 2016, 1, 1–11. [Google Scholar] [CrossRef] [Green Version]
- Chhabra, D.; Malik, M.; Sharma, S. Literature survey on automatic bug triaging using machine learning techniques. In Proceedings of the AIP Conference Proceedings; AIP Publishing LLC: Melville, NY, USA, 2022; Volume 2555, p. 020017. [Google Scholar]
- Neysiani, B.S.; Babamir, S.M. Duplicate Detection Models for Bug Reports of Software Triage Systems: A Survey. Curr. Trends Comput. Sci. Appl. 2019, 1, 128–134. [Google Scholar]
- Pandey, N.; Sanyal, D.K.; Hudait, A.; Sen, A. Automated classification of software issue reports using machine learning techniques: An empirical study. Innov. Syst. Softw. Eng. 2017, 13, 279–297. [Google Scholar] [CrossRef]
- Goyal, A.; Sardana, N. Machine learning or information retrieval techniques for bug triaging: Which is better? e-Inform. Softw. Eng. J. 2017, 11, 117–141. [Google Scholar]
- Banerjee, S.; Cukic, B.; Adjeroh, D. Automated duplicate bug report classification using subsequence matching. In Proceedings of the 2012 IEEE 14th International Symposium on High-Assurance Systems Engineering, Omaha, NE, USA, 25–27 October 2012; pp. 74–81. [Google Scholar]
- Banerjee, S.; Syed, Z.; Helmick, J.; Cukic, B. A fusion approach for classifying duplicate problem reports. In Proceedings of the 2013 IEEE 24th International Symposium on Software Reliability Engineering (ISSRE), Pasadena, CA, USA, 4–7 November 2013; pp. 208–217. [Google Scholar]
- Prifti, T.; Banerjee, S.; Cukic, B. Detecting bug duplicate reports through local references. In Proceedings of the 7th International Conference on Predictive Models in Software Engineering, Banff, AB, Canada, 20–21 September 2011; pp. 1–9. [Google Scholar]
- Jiang, H.; Chen, X.; He, T.; Chen, Z.; Li, X. Fuzzy clustering of crowdsourced test reports for apps. ACM Trans. Internet Technol. (TOIT) 2018, 18, 1–28. [Google Scholar] [CrossRef]
- Xia, X.; Lo, D.; Ding, Y.; Al-Kofahi, J.M.; Nguyen, T.N.; Wang, X. Improving automated bug triaging with specialized topic model. IEEE Trans. Softw. Eng. 2016, 43, 272–297. [Google Scholar] [CrossRef]
- Panda, R.R.; Nagwani, N.K. Topic modeling and intuitionistic fuzzy set-based approach for efficient software bug triaging. Knowl. Inf. Syst. 2022, 64, 3081–3111. [Google Scholar] [CrossRef]
- Panda, R.R.; Nagwani, N.K. Classification and intuitionistic fuzzy set based software bug triaging techniques. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 6303–6323. [Google Scholar] [CrossRef]
- Jang, J.; Yang, G. A Bug Triage Technique Using Developer-Based Feature Selection and CNN-LSTM Algorithm. Appl. Sci. 2022, 12, 9358. [Google Scholar] [CrossRef]
- Choquette-Choo, C.A.; Sheldon, D.; Proppe, J.; Alphonso-Gibbs, J.; Gupta, H. A multi-label, dual-output deep neural network for automated bug triaging. In Proceedings of the 2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA), Boca Raton, FL, USA, 16–19 December 2019; pp. 937–944. [Google Scholar]
- Chauhan, R.; Sharma, S.; Goyal, A. DENATURE: Duplicate detection and type identification in open source bug repositories. Int. J. Syst. Assur. Eng. Manag. 2023, 14, 275–292. [Google Scholar] [CrossRef]
- Jiang, Y.; Su, X.; Treude, C.; Shang, C.; Wang, T. Does Deep Learning improve the performance of duplicate bug report detection? An empirical study. J. Syst. Softw. 2023, 198, 111607. [Google Scholar] [CrossRef]
- Dhaliwal, T.; Khomh, F.; Zou, Y. Classifying field crash reports for fixing bugs: A case study of Mozilla Firefox. In Proceedings of the 2011 27th IEEE International Conference on Software Maintenance (ICSM), Williamsburg, VA, USA, 25–30 September 2011; pp. 333–342. [Google Scholar]
- Dang, Y.; Wu, R.; Zhang, H.; Zhang, D.; Nobel, P. Rebucket: A method for clustering duplicate crash reports based on call stack similarity. In Proceedings of the 2012 34th International Conference on Software Engineering (ICSE), Zurich, Switzerland, 2–9 June 2012; pp. 1084–1093. [Google Scholar]
- Rodrigues, I.M.; Khvorov, A.; Aloise, D.; Vasiliev, R.; Koznov, D.; Fernandes, E.R.; Chernishev, G.; Luciv, D.; Povarov, N. TraceSim: An Alignment Method for Computing Stack Trace Similarity. Empir. Softw. Eng. 2022, 27, 53. [Google Scholar] [CrossRef]
- Shi, H.; Wang, G.; Fu, Y.; Hu, C.; Song, H.; Dong, J.; Tang, K.; Liang, K. Abaci-finder: Linux kernel crash classification through stack trace similarity learning. J. Parallel Distrib. Comput. 2022, 168, 70–79. [Google Scholar] [CrossRef]
- Dunn, T.; Banerjee, N.K.; Banerjee, S. GPU acceleration of document similarity measures for automated bug triaging. In Proceedings of the 2016 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), Ottawa, ON, Canada, 23–27 October 2016; pp. 140–145. [Google Scholar]
- Wu, R.; Zhang, H.; Cheung, S.C.; Kim, S. Crashlocator: Locating crashing faults based on crash stacks. In Proceedings of the 2014 International Symposium on Software Testing and Analysis, San Jose, CA, USA, 21–25 July 2014; pp. 204–214. [Google Scholar]
- Koopaei, N.E.; Hamou-Lhadj, A. CrashAutomata: An approach for the detection of duplicate crash reports based on generalizable automata. In Proceedings of the CASCON, Markham, ON, Canada, 2–4 November 2015; pp. 201–210. [Google Scholar]
- Sabor, K.K.; Hamou-Lhadj, A.; Larsson, A. Durfex: A feature extraction technique for efficient detection of duplicate bug reports. In Proceedings of the 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS), Prague, Czech Republic, 25–29 July 2017; pp. 240–250. [Google Scholar]
- Tian, Y.; Yu, S.; Fang, C.; Li, P. FuRong: Fusing report of automated Android testing on multi-devices. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Companion Proceedings, Seoul, Republic of Korea, 27 June–19 July 2020; pp. 49–52. [Google Scholar]
- Khvorov, A.; Vasiliev, R.; Chernishev, G.; Rodrigues, I.M.; Koznov, D.; Povarov, N. S3M: Siamese stack (trace) similarity measure. In Proceedings of the 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR), Madrid, Spain, 17–19 May 2021; pp. 266–270. [Google Scholar]
- Yeh, C.C.; Lu, H.L.; Lee, Y.H.; Chou, W.S.; Huang, S.K. CRAXTriage: A coverage based triage system. In Proceedings of the 2017 IEEE Conference on Dependable and Secure Computing, Taipei, Taiwan, 7–10 August 2017; pp. 408–415. [Google Scholar]
- Liu, Y. RESTCluster: Automated Crash Clustering for RESTful API. In Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, Rochester, MI, USA, 10–14 October 2022; pp. 1–3. [Google Scholar]
- Peng, J.; Zhang, M.; Wang, Q. Deduplication and Exploitability Determination of UAF Vulnerability Samples by Fast Clustering. KSII Trans. Internet Inf. Syst. 2016, 10, 4933–4956. [Google Scholar]
- Pham, V.T.; Khurana, S.; Roy, S.; Roychoudhury, A. Bucketing failing tests via symbolic analysis. In Proceedings of the Fundamental Approaches to Software Engineering: 20th International Conference, FASE 2017, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2017, Uppsala, Sweden, 22–29 April 2017; Proceedings 20. Springer: Berlin/Heidelberg, Germany, 2017; pp. 43–59. [Google Scholar]
- Moroo, A.; Aizawa, A.; Hamamoto, T. Reranking-based Crash Report Deduplication. In Proceedings of the SEKE, Pittsburgh, PA, USA, 5–7 July 2017; Volume 17, pp. 507–510. [Google Scholar]
- Cui, W.; Peinado, M.; Cha, S.K.; Fratantonio, Y.; Kemerlis, V.P. Retracer: Triaging crashes by reverse execution from partial memory dumps. In Proceedings of the 38th International Conference on Software Engineering, Austin, TX, USA, 14–22 May 2016; pp. 820–831. [Google Scholar]
- Eom, K.J.; Paik, J.Y.; Mok, S.K.; Jeon, H.G.; Cho, E.S.; Kim, D.W.; Ryu, J. Automated crash filtering for arm binary programs. In Proceedings of the 2015 IEEE 39th Annual Computer Software and Applications Conference, Taichung, Taiwan, 1–5 July 2015; Volume 2, pp. 478–483. [Google Scholar]
- Cui, W.; Ge, X.; Kasikci, B.; Niu, B.; Sharma, U.; Wang, R.; Yun, I. {REPT}: Reverse debugging of failures in deployed software. In Proceedings of the 13th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 18), Carlsbad, CA, USA, 8–10 October 2018; pp. 17–32. [Google Scholar]
- Xu, J.; Mu, D.; Xing, X.; Liu, P.; Chen, P.; Mao, B. Postmortem Program Analysis with Hardware-Enhanced Post-Crash Artifacts. In Proceedings of the USENIX Security Symposium, Vancouver, BC, Canada, 16–18 August 2017; pp. 17–32. [Google Scholar]
- Mu, D.; Du, Y.; Xu, J.; Xu, J.; Xing, X.; Mao, B.; Liu, P. Pomp++: Facilitating postmortem program diagnosis with value-set analysis. IEEE Trans. Softw. Eng. 2019, 47, 1929–1942. [Google Scholar] [CrossRef]
- Jiang, Z.; Jiang, X.; Hazimeh, A.; Tang, C.; Zhang, C.; Payer, M. Igor: Crash Deduplication Through Root-Cause Clustering. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, Virtual Event, Republic of Korea, 15–19 November 2021; pp. 3318–3336. [Google Scholar]
- van Tonder, R.; Kotheimer, J.; Le Goues, C. Semantic crash bucketing. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, Montpellier, France, 3–7 September 2018; pp. 612–622. [Google Scholar]
- Zhang, X.; Chen, J.; Feng, C.; Li, R.; Diao, W.; Zhang, K.; Lei, J.; Tang, C. DeFault: Mutual information-based crash triage for massive crashes. In Proceedings of the 44th International Conference on Software Engineering, Pittsburgh, PA, USA, 21–29 May 2022; pp. 635–646. [Google Scholar]
- Kallingal Joshy, A.; Le, W. FuzzerAid: Grouping Fuzzed Crashes Based On Fault Signatures. In Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, Rochester, MI, USA, 10–14 October 2022; pp. 1–12. [Google Scholar]
- Alawneh, A.; Alazzam, I.M.; Shatnawi, K. Locating Source Code Bugs in Software Information Systems Using Information Retrieval Techniques. Big Data Cogn. Comput. 2022, 6, 156. [Google Scholar] [CrossRef]
- Krasniqi, R.; Do, H. Automatically Capturing Quality-Related Concerns in Bug Report Descriptions for Efficient Bug Triaging. In Proceedings of the International Conference on Evaluation and Assessment in Software Engineering 2022, Gothenburg, Sweden, 13–15 June 2022; pp. 10–19. [Google Scholar]
- Lee, C.Y.; Hu, D.D.; Feng, Z.Y.; Yang, C.Z. Mining temporal information to improve duplication detection on bug reports. In Proceedings of the 2015 IIAI 4th International Congress on Advanced Applied Informatics, Okayama, Japan, 12–16 July 2015; pp. 551–555. [Google Scholar]
- Wang, S.; Khomh, F.; Zou, Y. Improving bug management using correlations in crash reports. Empir. Softw. Eng. 2016, 21, 337–367. [Google Scholar] [CrossRef]
- Rakha, M.S.; Bezemer, C.P.; Hassan, A.E. Revisiting the performance evaluation of automated approaches for the retrieval of duplicate issue reports. IEEE Trans. Softw. Eng. 2017, 44, 1245–1268. [Google Scholar] [CrossRef]
- Banerjee, S.; Syed, Z.; Helmick, J.; Culp, M.; Ryan, K.; Cukic, B. Automated triaging of very large bug repositories. Inf. Softw. Technol. 2017, 89, 1–13. [Google Scholar] [CrossRef]
- Savidov, G.; Fedotov, A. Casr-Cluster: Crash clustering for Linux applications. In Proceedings of the 2021 Ivannikov Ispras Open Conference (ISPRAS), 2–3 December 2021; pp. 47–51. [Google Scholar]
- Budhiraja, A.; Reddy, R.; Shrivastava, M. Lwe: Lda refined word embeddings for duplicate bug report detection. In Proceedings of the 40th International Conference on Software Engineering: Companion Proceeedings, Gothenburg, Sweden, 27 May–3 June 2018; pp. 165–166. [Google Scholar]
- Chaparro, O.; Florez, J.M.; Singh, U.; Marcus, A. Reformulating queries for duplicate bug report detection. In Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER), Hangzhou, China, 24–27 February 2019; pp. 218–229. [Google Scholar]
- Karasov, N.; Khvorov, A.; Vasiliev, R.; Golubev, Y.; Bryksin, T. Aggregation of Stack Trace Similarities for Crash Report Deduplication. arXiv 2022, arXiv:2205.00212. [Google Scholar]
- James, K.; Du, Y.; Das, S.; Monrose, F. Separating the Wheat from the Chaff: Using Indexing and Sub-Sequence Mining Techniques to Identify Related Crashes During Bug Triage. In Proceedings of the 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS), Guangzhou, China, 5–9 December 2022; pp. 31–42. [Google Scholar]
- Yang, H.; Xu, Y.; Li, Y.; Choi, H.D. K-Detector: Identifying Duplicate Crash Failures in Large-Scale Software Delivery. In Proceedings of the 2020 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), Coimbra, Portugal, 12–15 October 2020; pp. 1–6. [Google Scholar]
- Park, J.w.; Lee, M.W.; Kim, J.; Hwang, S.w.; Kim, S. Costriage: A cost-aware triage algorithm for bug reporting systems. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 7–11 August 2011; Volume 25, pp. 139–144. [Google Scholar]
- Hindle, A.; Alipour, A.; Stroulia, E. A contextual approach towards more accurate duplicate bug report detection and ranking. Empir. Softw. Eng. 2016, 21, 368–410. [Google Scholar] [CrossRef]
- Badashian, A.S. Realistic bug triaging. In Proceedings of the 38th International Conference on Software Engineering Companion, Austin, TX, USA, 14–22 May 2016; pp. 847–850. [Google Scholar]
- Zhang, T.; Yang, G.; Lee, B.; Chan, A.T. Guiding bug triage through developer analysis in bug reports. Int. J. Softw. Eng. Knowl. Eng. 2016, 26, 405–431. [Google Scholar] [CrossRef]
- Goyal, A. Effective Bug Triage for Non-Reproducible Bugs. In Proceedings of the 2017 IEEE/ACM 39th International Conference on Software Engineering Companion (ICSE-C), Buenos Aries, Argentina, 20–28 May 2017; pp. 487–488. [Google Scholar]
- Zhang, W.; Cui, Y.; Yoshida, T. En-lda: An novel approach to automatic bug report assignment with entropy optimized latent dirichlet allocation. Entropy 2017, 19, 173. [Google Scholar] [CrossRef] [Green Version]
- Hindle, A.; Onuczko, C. Preventing duplicate bug reports by continuously querying bug reports. Empir. Softw. Eng. 2019, 24, 902–936. [Google Scholar] [CrossRef]
- Yadav, A.; Singh, S.K.; Suri, J.S. Ranking of software developers based on expertise score for bug triaging. Inf. Softw. Technol. 2019, 112, 1–17. [Google Scholar] [CrossRef]
- Alazzam, I.; Aleroud, A.; Al Latifah, Z.; Karabatis, G. Automatic bug triage in software systems using graph neighborhood relations for feature augmentation. IEEE Trans. Comput. Soc. Syst. 2020, 7, 1288–1303. [Google Scholar] [CrossRef]
- Nath, V.; Sheldon, D.; Alphonso-Gibbs, J. Principal Component Analysis and Entropy-based Selection for the Improvement of Bug Triage. In Proceedings of the 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA), Virtually Online, 13–15 December 2021; pp. 541–546. [Google Scholar]
- Panda, R.R.; Nagwani, N.K. An Improved Software Bug Triaging Approach Based on Topic Modeling and Fuzzy Logic. In Proceedings of the Third Doctoral Symposium on Computational Intelligence: DoSCI 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 337–346. [Google Scholar]
- Wu, X.; Shan, W.; Zheng, W.; Chen, Z.; Ren, T.; Sun, X. An Intelligent Duplicate Bug Report Detection Method Based on Technical Term Extraction. In Proceedings of the 2023 IEEE/ACM International Conference on Automation of Software Test (AST), Melbourne, Australia, 15–16 May 2023; pp. 1–12. [Google Scholar]
- Ebrahimi, N.; Trabelsi, A.; Islam, M.S.; Hamou-Lhadj, A.; Khanmohammadi, K. An HMM-based approach for automatic detection and classification of duplicate bug reports. Inf. Softw. Technol. 2019, 113, 98–109. [Google Scholar] [CrossRef]
- Rodrigues, I.M.; Aloise, D.; Fernandes, E.R.; Dagenais, M. A soft alignment model for bug deduplication. In Proceedings of the 17th International Conference on Mining Software Repositories, Virtual Online, 29–30 June 2020; pp. 43–53. [Google Scholar]
- He, J.; Xu, L.; Yan, M.; Xia, X.; Lei, Y. Duplicate bug report detection using dual-channel convolutional neural networks. In Proceedings of the 28th International Conference on Program Comprehension, Virtual Online, 13–15 July 2020; pp. 117–127. [Google Scholar]
- Aggarwal, K.; Timbers, F.; Rutgers, T.; Hindle, A.; Stroulia, E.; Greiner, R. Detecting duplicate bug reports with software engineering domain knowledge. J. Softw. Evol. Process 2017, 29, e1821. [Google Scholar] [CrossRef]
- Dedík, V.; Rossi, B. Automated bug triaging in an industrial context. In Proceedings of the 2016 42th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), Limassol, Cyprus, 31 August–2 September 2016; pp. 363–367. [Google Scholar]
- Lin, M.J.; Yang, C.Z.; Lee, C.Y.; Chen, C.C. Enhancements for duplication detection in bug reports with manifold correlation features. J. Syst. Softw. 2016, 121, 223–233. [Google Scholar] [CrossRef]
- Lee, S.R.; Heo, M.J.; Lee, C.G.; Kim, M.; Jeong, G. Applying deep learning based automatic bug triager to industrial projects. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, Paderborn, Germany, 4–8 September 2017; pp. 926–931. [Google Scholar]
- Xuan, J.; Jiang, H.; Ren, Z.; Yan, J.; Luo, Z. Automatic bug triage using semi-supervised text classification. arXiv 2017, arXiv:1704.04769. [Google Scholar]
- Song, H.-Z.; Ma, Y.-T. DeepTriage:An Automatic Triage Method for Software Bugs Using Deep Learning. J. Chin. Comput. Syst. 2019, 40, 126–132. [Google Scholar]
- Chaparro, O. Improving bug reporting, duplicate detection, and localization. In Proceedings of the 2017 IEEE/ACM 39th International Conference on Software Engineering Companion (ICSE-C), Buenos Aires, Argentina, 20–28 May 2017; pp. 421–424. [Google Scholar]
- Xi, S.; Yao, Y.; Xiao, X.; Xu, F.; Lu, J. An effective approach for routing the bug reports to the right fixers. In Proceedings of the 10th Asia-Pacific Symposium on Internetware, Beijing, China, 16 September 2018; pp. 1–10. [Google Scholar]
- Xie, Q.; Wen, Z.; Zhu, J.; Gao, C.; Zheng, Z. Detecting duplicate bug reports with convolutional neural networks. In Proceedings of the 2018 25th Asia-Pacific Software Engineering Conference (APSEC), Nara, Japan, 4–7 December 2018; pp. 416–425. [Google Scholar]
- Alenezi, M.; Banitaan, S.; Zarour, M. Using categorical features in mining bug tracking systems to assign bug reports. arXiv 2018, arXiv:1804.07803. [Google Scholar] [CrossRef]
- Budhiraja, A.; Dutta, K.; Reddy, R.; Shrivastava, M. DWEN: Deep word embedding network for duplicate bug report detection in software repositories. In Proceedings of the 40th International Conference on Software Engineering: Companion Proceeedings, Gothenburg, Sweden, 27 May–3 June 2018; pp. 193–194. [Google Scholar]
- Kukkar, A.; Mohana, R.; Nayyar, A.; Kim, J.; Kang, B.G.; Chilamkurti, N. A novel deep-learning-based bug severity classification technique using convolutional neural networks and random forest with boosting. Sensors 2019, 19, 2964. [Google Scholar] [CrossRef]
- Xi, S.Q.; Yao, Y.; Xiao, X.S.; Xu, F.; Lv, J. Bug triaging based on tossing sequence modeling. J. Comput. Sci. Technol. 2019, 34, 942–956. [Google Scholar] [CrossRef]
- Mani, S.; Sankaran, A.; Aralikatte, R. Deeptriage: Exploring the effectiveness of deep learning for bug triaging. In Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, Kolkata, India, 3–5 January 2019; pp. 171–179. [Google Scholar]
- Catolino, G.; Palomba, F.; Zaidman, A.; Ferrucci, F. Not all bugs are the same: Understanding, characterizing, and classifying bug types. J. Syst. Softw. 2019, 152, 165–181. [Google Scholar] [CrossRef]
- Poddar, L.; Neves, L.; Brendel, W.; Marujo, L.; Tulyakov, S.; Karuturi, P. Train one get one free: Partially supervised neural network for bug report duplicate detection and clustering. arXiv 2019, arXiv:1903.12431. [Google Scholar]
- Sarkar, A.; Rigby, P.C.; Bartalos, B. Improving bug triaging with high confidence predictions at ericsson. In Proceedings of the 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME), Cleveland, OH, USA, 29 September–4 October 2019; pp. 81–91. [Google Scholar]
- Pahins, C.A.D.L.; D’Morison, F.; Rocha, T.M.; Almeida, L.M.; Batista, A.F.; Souza, D.F. T-REC: Towards accurate bug triage for technical groups. In Proceedings of the 2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA), Boca Raton, FL, USA, 16–19 December 2019; pp. 889–895. [Google Scholar]
- Guo, S.; Zhang, X.; Yang, X.; Chen, R.; Guo, C.; Li, H.; Li, T. Developer activity motivated bug triaging: Via convolutional neural network. Neural Process. Lett. 2020, 51, 2589–2606. [Google Scholar] [CrossRef]
- Xiao, G.; Du, X.; Sui, Y.; Yue, T. Hindbr: Heterogeneous information network based duplicate bug report prediction. In Proceedings of the 2020 IEEE 31st International Symposium on Software Reliability Engineering (ISSRE), Coimbra, Portugal, 12–15 October 2020; pp. 195–206. [Google Scholar]
- Zhang, W. Efficient bug triage for industrial environments. In Proceedings of the 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME), Adelaide, Australia, 28 September–2 October 2020; pp. 727–735. [Google Scholar]
- Russo, F.; Raju, R.; Clarke, C.; Yang, N.; Escalona, A.; Tappert, C.C.; Leider, A. Software Bug Triage Using Machine Learning and Natural Language Processing; Pace University: New York, NY, USA, 2020. [Google Scholar]
- Neysiani, B.S.; Babamir, S.M.; Aritsugi, M. Efficient feature extraction model for validation performance improvement of duplicate bug report detection in software bug triage systems. Inf. Softw. Technol. 2020, 126, 106344. [Google Scholar] [CrossRef]
- He, H.; Yang, S. Automatic Bug Triage Using Hierarchical Attention Networks. In Proceedings of the 2021 IEEE 21st International Conference on Software Quality, Reliability and Security Companion (QRS-C), Hainan Island, China, 6–10 December 2021; pp. 1043–1049. [Google Scholar]
- Wang, H.; Li, Q. Effective Bug Triage Based on a Hybrid Neural Network. In Proceedings of the 2021 28th Asia-Pacific Software Engineering Conference (APSEC), Taipei, Taiwan, 6–9 December 2021; pp. 82–91. [Google Scholar]
- Yu, X.; Wan, F.; Du, J.; Jiang, F.; Guo, L.; Lin, J. Bug Triage Model Considering Cooperative and Sequential Relationship. In Proceedings of the Wireless Algorithms, Systems, and Applications: 16th International Conference, WASA 2021, Nanjing, China, 25–27 June 2021; Proceedings, Part II 16. Springer: Berlin/Heidelberg, Germany, 2021; pp. 160–172. [Google Scholar]
- Zaidi, S.F.A.; Lee, C.G. Learning graph representation of bug reports to triage bugs using graph convolution network. In Proceedings of the 2021 International Conference on Information Networking (ICOIN), Jeju Island, Republic of Korea, 13–16 January 2021; pp. 504–507. [Google Scholar]
- Jahanshahi, H.; Chhabra, K.; Cevik, M.; Baþar, A. DABT: A dependency-aware bug triaging method. In Evaluation and Assessment in Software Engineering, Proceedings of the 25th International Conference on Evaluation and Assessment in Software Engineering, Virtual Online, 21–24 June 2021; Association for Computing Machinery: New York, NY, USA, 2021; pp. 504–507. [Google Scholar]
- Zaidi, S.F.A.; Lee, C.G. One-class classification based bug triage system to assign a newly added developer. In Proceedings of the 2021 International Conference on Information Networking (ICOIN), Jeju Island, Republic of Korea, 13–16 January 2021; pp. 738–741. [Google Scholar]
- Zhang, W.; Zhao, J.; Wang, S. SusTriage: Sustainable Bug Triage with Multi-modal Ensemble Learning. In Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, Melbourne, VIC, Australia, 14–17 December 2021; pp. 441–448. [Google Scholar]
- Aktaş, E.U. Automated Software Issue Triage in Large Scale Industrial Context. Ph.D. Thesis, Sabanci University, Tuzla, Türkiye, 2021. [Google Scholar]
- Aung, T.W.W.; Wan, Y.; Huo, H.; Sui, Y. Multi-triage: A multi-task learning framework for bug triage. J. Syst. Softw. 2022, 184, 111133. [Google Scholar] [CrossRef]
- Yu, X.; Wan, F.; Tang, B.; Zhan, D.; Peng, Q.; Yu, M.; Wang, Z.; Cui, S. Deep Bug Triage Model Based on Multi-head Self-attention Mechanism. In Proceedings of the Computer Supported Cooperative Work and Social Computing: 16th CCF Conference, ChineseCSCW 2021, Xiangtan, China, 26–28 November 2021; Revised Selected Papers, Part II. Springer: Berlin/Heidelberg, Germany, 2022; pp. 107–119. [Google Scholar]
- Chao, L.; Qiaoluan, X.; Yong, L.; Yang, X.; Hyun-Deok, C. DeepCrash: Deep metric learning for crash bucketing based on stack trace. In Proceedings of the 6th International Workshop on Machine Learning Techniques for Software Quality Evaluation, Singapore, 18 November 2022; pp. 29–34. [Google Scholar]
- Zaidi, S.F.A.; Woo, H.; Lee, C.G. Toward an effective bug triage system using transformers to add new developers. J. Sens. 2022, 2022, 4347004. [Google Scholar] [CrossRef]
- Samir, M.; Sherief, N.; Abdelmoez, W. Improving Bug Assignment and Developer Allocation in Software Engineering through Interpretable Machine Learning Models. Computers 2023, 12, 128. [Google Scholar] [CrossRef]
Cited Paper | Study on Commonly Used Methods | Study on Runtime Information-Based Approaches | Study on Information Retrieval Approaches | Study on Machine Learning Approaches |
---|---|---|---|---|
Neysiani et al. [12] | Not included | Not included | Bug report deduplication using IR methods | Bug report deduplication using ML methods |
Udden et al. [13] | Not included | Not included | Bug prioritization using data mining | Bug prioritization using machine learning |
Sawant et al. [14] | Not included | Not included | Bug report classification using text-based analysis, recommendation, etc. | Not included |
Neysiani et al. [15] | Not included | Not included | Not included | Features and general steps for bug report deduplication |
Yadav et al. [16] | Not included | Not included | Not included | Comparison of ML-based classification |
Chhabra et al. [17] | Not included | Not included | Factors to consider in bug triage | Not included |
Neysiani et al. [18] | Not included | Not included | General description of the IR-based methods | General description of the ML-based methods |
Lee et al. [7] | Not included | Not included | Deduplication using IR methods | Deduplication using NLP, naive Bayes, etc. |
Pandey et al. [19] | Not included | Not included | Not included | Bug triage using six ML models |
Goyal et al. [20] | Not included | Not included | Bug triage using IR methods | Bug triage using ML methods |
Our work | Feature extraction methods and similarity calculation methods | Deduplication and triage based on runtime stacks, coverage, and context | Deduplication and triage based on texture analysis, topic modeling, etc. | Deduplication and triage based on CNN, LSTM, transformer, etc. |
Dataset | Proportion in Works |
---|---|
Mozilla | ∼64.4% |
Eclipse | ∼37% |
Netbeans | ∼19.2% |
Openoffce | ∼11% |
Others | ∼60% |
Category | Work | Methods | Runtime Information | Effect | Dataset |
---|---|---|---|---|---|
Methods based on comparing stack traces | CrashAutomata | N-gram | Stack traces | F measure: 97% | 5.7 k traces from Mozilla |
DURFEX | Variable-length N-gram | Stack traces | 93% and 70% less execution time compared with 1, 2-g | 380 k traces from Firefox and Eclipse | |
FuRong | Levenshtein distance | Stack trace in Android bug log | 93.4% precision and 87.9% accuracy, on average | 91 bugs from 8 Android applications | |
S3M | biLSTM encoder | Stack traces | 0.96 and 0.76 RR@10 for JetBrain and Netbeans, respectively | 340 k traces from JetBrains and Netbeans | |
abaci-finder | kstack2vec, BiLSTM | Stack traces | 0.83 F1 score | 17 k traces from syzbot | |
Methods based on analyzing coverage | CRAXTriage | Coverage comparison | Bug execution path | Not mentioned | 11 programs |
Methods based on comparing contexts | Fast clustering for UA | Clustering in 2D plane | UAF bug context | 12.2 s clustering time | 1.2 K samples from IE8 |
Clustering based on symbolic analysis | symbolic analysis and clustering | Bug execution path | 50% cases allow for finer-grained analysis | 21 programs | |
Reranking-based deduplication | TF-IDF, Rebucket | races | ∼7 Stack T0% accuracy | 51 k reports from Launchpad and Firefox 48 | |
REPT | Hardware tracing, reverse debugging, and taint analysis | Program with bugs | 92% accuracy, on average | 14 programs | |
POMP | Reverse debugging, taint analysis | Program with bugs | More than 93% bug causes identified | 28 prgrams | |
POMP++ | reverse debugging, taint analysis | Program with bugs | 12% more data flow recovered | 30 programs | |
IgorFuzz | Graph similarity calculation, spectral clustering | Crash poc | Achieved the highest F score in 90% of cases | Magma and Moonlight benchmark | |
Triage based on bug signature | PIN, srcML, bear, C-Reduce | Program with bugs | 99.1% precision | Reports from 7 programs |
Work | Methods | Effects | Dataset |
---|---|---|---|
Deduplication through local references | Reducing search space based on temporal locality | Up to 53% recall rate | 74 k from FireFox |
Time-based deduplication | BM25Fext | 45 k from eclipse | 77% RR@20 |
FactorLCS | Enhancing LCS using size matching within group weight | ≥70% recall rate | 97 k+ from Firefox and 41 k+ from Eclipse |
Fusion approach For deduplication | MULAN-based multilabel classification model | 72% recall rate and up to 40% performance improvement | 111 k from Firefox |
Triaging for very large bug repositories | Text cosine similarity, time window, and document factors | ≥95% original recall and low duplicate recall as a filtration aide; ∼70% recall rate as triaging guide | 246 k from Eclipse, Firefox, and Open Office |
Deduplication using correlations | Stack trace signature, temporal locality, and crash comment textual similarity | 50% and 47% Recall Rate and 55% and 35% precision for the FireFox and Eclipse datasets, respectively | 1 k+ types from FireFox and MSR and 20 k+ from Eclipse |
LWE | LDA and word embedding | 0.558 RR@20% | 768 k from Mozilla |
Refined feature-based deduplication | Resolution field extraction | Not mentioned | 10∼22% recall rate improvement and 7∼18% precision improvement |
Duplication based on multiple factors | Reasonable parameter selection | 80% TP and 0.01% FP for deduplication | 3 M from syzbot |
Stack trace similarly aggregation | Aggregate computing | 15% RR@1 improvement | 40 k from Netbeans and 210 k from JetBrains |
CrashSearch | Locality-sensitive hashing | 11% F-score improvement compared with minor hashing | 1 k from eight real-world programs |
K-detector | AST comparison | 0.986 AUC on SAP HANA | 10 k dump from SAP HANA |
Reformulating queries | Three different queries | 42 k from 20 open-source projects | 56.6∼78% duplication detection |
CosTriage | Reduce cost of assigning bugs | 30% cost reduction | 13 k from Apache, 152 k from Eclipse, 5 k from Linux kernel, and 162 k from Mozilla |
Duplicate based on contextual approach | Multiple context feature comparison | 11.5% accuracy improvement, 41% Kappa improvement, and 16.8% AUC improvement | 37 k from Android, 43 k from Eclipse, 71 k from Mozilla, and 29 k from OpenOffice |
Triage based on developer analysis | Unigram model and Kullback–Leibler (KL) divergence | ∼75% precision and ∼40% and ∼52% F1 score | 8 k from Eclipse and 10 k from Mozilla |
TopicMiner | Multiple-topic model | 68.7% and 90.8% for top-one and top-five precision, respectively | 27 k from GCC, 42 k from OpenOffice, 46 k from Netbeans, 82 k from Eclipse, and 86 k from Mozilla |
Triage for non-reproducible bug | Time analysis, priority assignment, and NRFixer | ∼70% precision | Mozilla and Eclipse |
En-LDA | LDA and entropy calculation | 84% RR@5 for JDT and 58% RR@7 for Firefox | 3 k from Mozilla and 2 k from Eclipse |
Deduplication by continuous querying | Continuous querying | Over 42% duplication prevention | 222.4 k from Android, App Inventor, Bazaar, Cyanogenmod, Eclipse, K9Mail, Mozilla, MyTrack, OpenOffice, Openstack, Osmand, and Tempest |
Unified triage framework | Information gain, chi-square statistics, TF-IDF, LDA, and SVM | 49.22%, 85.99%, and 74.89% precision for Eclipse, Baiduinput, and Mooctest, respectively | 2 k reports from Eclipse and 0.2 k reports from Baiduinput |
Triage based on expertise score | Jaccard and cosine similarity | 89.49% accuracy, 89.53 % precision, 89.42% recall rate, and 89.49% F-score | 41 k reports from Mozilla, Eclipse, Netbeans, Firefox, and Freedesktop |
RSFH | LDA and graph classification | 0.732 accuracy, 0.871 precision, 0.732 recall rate, and 0.796 F score | 135 k from Bugzilla |
Feature extraction model for triage | TF-IDF and heuristic feature detection | 2% precision, 4.5% recall rate, and 5.9% F-score improvement | Not mentioned |
Triage based on principal component analysis | Principal component analysis and entropy-based keyword extraction | 90% top-10 team precision and 67% individual precision | 43 k from a private dataset |
Intuitionistic fuzzy-set-based triaging | Intuitionistic fuzzy sets (IFS) | 15% 0.93, 0.90, and 0.88 precision for Eclipse, Mozilla, and NetBeans, respectively | 32 k from Eclipse, Mozilla, and NetBeans |
Quality-based classifier | Multiple-feature extraction | 76% precision, 70% recall rate, and 70% F1 score | 5 k from Jira and Bugzilla |
Intuitionistic fuzzy-set-based triage | LDA and IFSim | 0.894 accuracy, 0.897 precision, 0.893 recall rate, and 0.896 F1 score for Eclipse | Eclipse |
TM-FBT | Topic modeling and fuzzy logic | 0.903, 0.887, and 0.851 precision for Eclipse, Mozilla, and NetBeans, respectively | Eclipse, Mozilla, and NetBeans |
CTEDB | Word2Vec, TextRank, SBERT, and DeBERTaV3 | 66 k from eclipse and 230 k from mozilla | Over 98% accuracy, ∼96% precision, 96% recall rate, and 96% F1 score |
Work | Methods | Effects | Dataset |
---|---|---|---|
HMM-based deduplication | Hidden Markov models (HMMs) | 76.5% and 73% average accuracy for Firefox and GNOME, respectively | 1 M from Firefox and 753 k from GNOME |
Soft alignment model for deduplication | Soft-attention alignment and DNN | 5% RR@K improvement | 25 k from Eclipse, 54 k from Mozilla, 11 k from NetBeans, and 15 k from OpenOffice |
Dual-channel CNN-based deduplication | Word2vec and dual-channel CNN | Over 0.95 accuracy, recall rate, precision, and F1 score | 90 k from OpenOffice, 246 k from Eclipse, and 184 k from Netbeans |
Domain knowledge-based deduplication | BM25F and multiple ML models | Up to 92% accuracy | 37 k from Android, 42 k from OpenOffice, 72 k from Mozilla, and 29 k from Eclipse |
Triage in industrial context | SVM and TF-IDF | 53% accuracy, 59% precision, and 47% recall rate | 2 k from Jira and 9 k from Mozilla |
Deduplication with manifold correlation features | TF-IDF, BM25, and word2Vec | 2.79∼28.97% RR@5 improvement | 6 k from ArgoUML, 9 k from Apache, and 4 k from SVN |
Deep-learning-based automatic bug triage | Word2Vec and CNN | 82.83% and 35.83% higher performance in top-one and top-three accuracy, respectively | 24 k from four datasets |
Semisupervised bug triage | Enhanced naive Bayes classifier | 6% accuracy improvement | 20 k from Eclipse |
DeepTriage-song | BiLSTM and LSTM | 42.96% top-one accuracy | 200 k from Eclipse and 220 k from Mozilla |
SeqTriagle | Bidirectional RNN and attention model | 5∼20% accuracy improvement | 210 k from Eclipse, 300 k from Mozilla, and 165 k from Gentoo |
DBR-CNN | Word embedding and CNN | 0.903 F score and 0.919 accuracy | 1.8 k from Hadoop, 12 k from hdfs, 7 k from Mapreduce, and 22 k from Spark |
TERFUR | NLP model, vector space model, and merging algorithm | 78.15% accuracy, 78.41% recall rate, and 75.82% F1 score | 0.3 k from Justforfun, 0.3 k from SE-1800, 0.4 k from iShopping, 0.2 k from CloudMusic, and 0.4 k from UBook |
Triage using categorical features | Naive Bayes classifier | 0.633, 0.584, and 0.38 F score for Netbeans, Freedesktop, and Firefox, respectively | Netbeans, Freedesktop, and Firefox |
DWEN | Word embedding and DNN | Over 0.7 RR@20 | 700 k from Mozilla and 100 k from OpenOffice |
Multilabel, dual-output DNN for triaging | Mutilabel classifier | 76% accuracy for team assignment and 55% accuracy for individual assignment | 236 k from a private dataset |
Triage Using CNN and RF with Boosting | CNN, and boosting-enhanced random forest (BCR) | 96.34% accuracy, and 96.43% F score | Mozilla, Eclipse, JBoss, OpenFOAM, and Firefox |
itriage | Tossing sequence model and RNN with GRU | 9.39% top-one accuracy improvement | 210 k from Eclipse, 300 k from Mozilla, and 165 k from Gentoo |
DeepTriage-mani | Bidirectional RNN and softmax classifier | 34∼47% accuracy | 383 k from Chromium, 314 k from Mozilla Core, and 162 k from Mozilla Firefox |
Triage based on bug cause | Enhanced LDA | 64% F score | 1 k+ from Apache, Eclipse, and Mozilla |
Partially supervised neural network for deduplication and clustering | Word embedding, bidirectional GRU units, topic clustering, and conditional attention-based deduplication | 0.95 and 0.88 F score for Firefox and JDT, respectively | 17 k from SnapS2R, 46 k from Eclipse, and 34 k from FireFox |
Triage with high confidence | TF-IDF, one-hot encoding, and logistic regression | 89.75% Precision, and 90.17% recall rate | 11 k from Ericsson |
T-REC | Vector space model, BM25F, and noisy-or classifier | 76.1% CC@5, 83.6% ACC@10, and 89.7% ACC@20 | 9.5 M from Sidia |
Developer activity-motivated triage | Work2vec and CNN | 0.7489 top-10 accuracy | 39 k from Eclipse, 15 k from Mozilla, and 19 k from Netbeans |
AI-based document generation model | LDA and backpropagation | Over 84% accuracy | 3 k from Bugzilla and 41 k from MSR |
Triage for industrial environments | LDA and DNN | 85.1%, 70.1%, and 92.1% RR@5 for JDT, Platform, and Firefox, respectively | 1 k from JDT, 4 k from Platform, and 13 k from Firefox |
NLP-based triage | Word2vec and LSTM | ∼78% accuracy | Not mentioned |
Hierarchical attention network for triage | Word2Ve, GloVe, and hierarchical attention network | 50∼65% accuracy | 633 k from Chromium, 1 M from Core, 1 M from Firefox, 187 k from Netbeans, and 318 k from Eclipse |
Mixed DNN for triage | LSTM and CNN | 80% and 60% top-five accuracy for Eclipse and Mozilla, respectively | 200 k from Eclipse and 220 k from Mozilla |
BTCSR | TF-IDF, LDA, random walk, and cooperative SkipGram | 51.26% RR@3, 63.25% RR@5, and 74.14% RR@10 | 14 k from Eclipse, 10 k from Mozilla, 11 k from Netbeans, and 2 k from GCC |
Triage using GCN | TF-IDF and graph convolutional network | 84%, 72.11%, and 66.5% top-10 accuracy for DT, Platform, and Firefox, respectively | 1 k from JDT, 4 k from Platform, and 37 k from Firefox |
HINDBR | Low-dimensional space vector conversion | 2 M from nine datasets | 98.83% accuracy and 97.08% F1 score |
DABT | Bug dependency graph, LDA, TF-IDF, and SVM | 50% bug fix time reduction | 16 k from JDT, 70 k from LibreOffice, and 112 k from Mozilla |
One-class classification-based triage | One-class SVM | ∼93% average accuracy and ∼53% average recall rate | 4 k from Platform and 20 k form Firefox |
SusTriage | Multimodal learning | 69% mean average precision improvement and 61% mean reciprocal rank improvement in Eclipse project | 16 k from Eclipse and 15 k from Mozilla |
Efficient feature extraction model | TF-IDF-based feature extraction | Android, Eclipse, Mozilla, and OpenOffice | 97% accuracy, precision, recall rate, and F1 score |
Triage in large-scale industrial contexts | TF-IDF and two-level classifier | Human resource reduction | 78 k reports |
Triage using CNN-LSTM | CNN and LSTM | 52.4% accuracy | 383 k from Chromium, 314 k from Firefox, and 162k from Mozilla core |
Multi-triage | AST extractor, context augmenter, text encoder, and AST encoder | 57% accuracy for developer triage and 47% accuracy for bug triage | 81.6 k from aspnetcore, azure-powershell, Eclipse, efcore, elasticserach, mixedrealitytoolkit-unity, monogame, nunit, realm-java, Roslyn, and rxjava |
Triage based on transfer learning | Transfer learning | 75.2%, 82.7%, 78.2%, and 79.3% accuracy for Chrome, Mozilla Core, Firefox, and a private dataset, respectively | 163 k from Chromium, 186 k from Mozilla Core, 138 k from Mozilla Firefox, and 75 k from a private dataset |
MSDBT | LSTM | 0.5424 RR@3, 0.6375 RR@5, and 0.745 RR@10 | 14 k from Mozilla, 10 k from Eclipse, 11 k from Netbeans, and 2 k from Gcc |
ST-DGNN | Joint random walk and graph recurrent convolutional neural network | ∼0.7 F1@k | 150 k from Eclipse and 170 k from Mozilla |
Deepcrash | frame2vec, Bi-LSTM, and Rebucket | 80.72% F score | 10 k from SAP hana and 47 k from Netbeans |
Triage using transformer | BERT and transfer learning | Over 60% top-10 accuracy | Eclipse, Firefox, and NetBeans |
DENATURE | TF-IDF, SVM, and logistic regression | 45 k from Eclipse | 88.8% accuracy |
XAI-based triage | XAI model | 208 K from Eclipse | Not mentioned |
CombineIRDL | IR + ML | 1 M from Eclipse, Mozilla, and OpenOffice | 7.1∼11.3% precision improvement |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Qian, C.; Zhang, M.; Nie, Y.; Lu, S.; Cao, H. A Survey on Bug Deduplication and Triage Methods from Multiple Points of View. Appl. Sci. 2023, 13, 8788. https://doi.org/10.3390/app13158788
Qian C, Zhang M, Nie Y, Lu S, Cao H. A Survey on Bug Deduplication and Triage Methods from Multiple Points of View. Applied Sciences. 2023; 13(15):8788. https://doi.org/10.3390/app13158788
Chicago/Turabian StyleQian, Cheng, Ming Zhang, Yuanping Nie, Shuaibing Lu, and Huayang Cao. 2023. "A Survey on Bug Deduplication and Triage Methods from Multiple Points of View" Applied Sciences 13, no. 15: 8788. https://doi.org/10.3390/app13158788
APA StyleQian, C., Zhang, M., Nie, Y., Lu, S., & Cao, H. (2023). A Survey on Bug Deduplication and Triage Methods from Multiple Points of View. Applied Sciences, 13(15), 8788. https://doi.org/10.3390/app13158788