When debugging encounters artificial intelligence: state of the art and open challenges

  • Review
Science China Information Sciences

Abstract

Software debugging and artificial intelligence are both active research topics in software engineering. Debugging techniques, which comprise fault localization and program repair, are an important part of the software development lifecycle for ensuring the quality of software systems. As the scale and complexity of software systems grow, developers seek to improve the effectiveness and efficiency of software debugging via artificial intelligence (artificial intelligence for software debugging, AI4SD). At the same time, many artificial intelligence models are being integrated into safety-critical areas such as autonomous driving, image recognition, and audio processing, where software debugging is urgently needed (software debugging for artificial intelligence, SD4AI). An AI-enhanced debugging technique could help debug AI systems more effectively, and a more robust and reliable AI approach could in turn better support debugging techniques. It is therefore important to consider AI4SD and SD4AI together. In this paper, we aim to show readers the path, the trends, and the potential of the interaction between these two directions. We select and review a total of 165 papers in AI4SD and SD4AI to answer three research questions, analyze the opportunities and challenges, and suggest future directions for this cross-cutting area.

References

  1. Garousi V, Rainer A, Lauvås Jr P, et al. Software-testing education: a systematic literature mapping. J Syst Software, 2020, 165: 110570

    Article  Google Scholar 

  2. Lou Y, Ghanbari A, Li X, et al. Can automated program repair refine fault localization? A unified debugging approach. In: Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2020. 75–87

  3. Monperrus M. Automatic software repair: a bibliography. ACM Computing Surveys, 2018, 51: 1–24

    Article  Google Scholar 

  4. Zakari A, Lee S P, Abreu R, et al. Multiple fault localization of software programs: a systematic literature review. Inf Software Tech, 2020, 124: 106312

    Article  Google Scholar 

  5. Lu G Z, Xu L, Yang Y B, et al. Predictive analysis for race detection in software-defined networks. Sci China Inf Sci, 2019, 62: 062101

    Article  Google Scholar 

  6. Fang C R, Chen Z Y, Xu B W. Comparing logic coverage criteria on test case prioritization. Sci China Inf Sci, 2012, 55: 2826–2840

    Article  Google Scholar 

  7. Zhou Y M, Leung H, Song Q B, et al. An in-depth investigation into the relationships between structural metrics and unit testability in object-oriented systems. Sci China Inf Sci, 2012, 55: 2800–2815

    Article  Google Scholar 

  8. Wang G, Shen R, Chen J, et al. Probabilistic delta debugging. In: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2021. 881–892

  9. Jiang J J, Xiong Y F, Xia X. A manual inspection of Defects4J bugs and its implications for automatic program repair. Sci China Inf Sci, 2019, 62: 200102

    Article  Google Scholar 

  10. Wang S, Lo D. AmaLgam+: composing rich information sources for accurate bug localization. J Software Evolu Process, 2016, 28: 921–942

    Article  CAS  ADS  Google Scholar 

  11. Pang N. Deep learning for code repair. Vancouver: University of British Columbia, 2018. https://people.ece.ubc.ca/qhanam/papers/npang_thesis_2018.pdf

  12. Safdari N, Alrubaye H, Aljedaani W, et al. Learning to rank faulty source files for dependent bug reports. In: Proceedings of Big Data: Learning, Analytics, and Applications, 2019. 109890B

  13. Zhang Z, Xie X. On the investigation of essential diversities for deep learning testing criteria. In: Proceedings of IEEE 19th International Conference on Software Quality, Reliability and Security, 2019. 394–405

  14. Devanbu P, Dwyer M, Elbaum S, et al. Deep learning & software engineering: state of research and future directions. 2020. ArXiv:2009.08525

  15. Pandey S K, Mishra R B, Tripathi A K. Machine learning based methods for software fault prediction: a survey. Expert Syst Appl, 2021, 172: 114595

    Article  Google Scholar 

  16. Ranjan P, Kumar S, Kumar U. Software fault prediction using computational intelligence techniques: a survey. Ind J Sci Tech, 2017, 10: 1–9

    Article  Google Scholar 

  17. Batool I, Khan T A. Software fault prediction using data mining, machine learning and deep learning techniques: a systematic literature review. Comput Electrical Eng, 2022, 100: 107886

    Article  Google Scholar 

  18. Durelli V H S, Durelli R S, Borges S S, et al. Machine learning applied to software testing: a systematic mapping study. IEEE Trans Rel, 2019, 68: 1189–1212

    Article  Google Scholar 

  19. Mahapatra S, Mishra S. Usage of machine learning in software testing. In: Proceedings of Automated Software Engineering: A Deep Learning-Based Approach, 2020. 39–54

  20. Braiek H B, Khomh F. On testing machine learning programs. J Syst Software, 2020, 164: 110542

    Article  Google Scholar 

  21. Zhang J M, Harman M, Ma L, et al. Machine learning testing: survey, landscapes and horizons. IEEE Trans Software Eng, 2022, 48: 1–36

    Article  Google Scholar 

  22. Riccio V, Jahangirova G, Stocco A, et al. Testing machine learning based systems: a systematic mapping. Empir Software Eng, 2020, 25: 5193–5254

    Article  Google Scholar 

  23. Wang Y, Jia P, Liu L, et al. A systematic review of fuzzing based on machine learning techniques. Plos One, 2020, 15: e0237749

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  24. Chen J, Patra J, Pradel M, et al. A survey of compiler testing. ACM Comput Surv, 2021, 53: 1–36

    Google Scholar 

  25. Li X, Jiang H, Ren Z, et al. Deep learning in software engineering. 2018. ArXiv:1805.04825

  26. Ferreira F, Silva L L, Valente M T. Software engineering meets deep learning: a mapping study. In: Proceedings of the 36th Annual ACM Symposium on Applied Computing, 2021. 1542–1549

  27. Yang Y, Xia X, Lo D, et al. A survey on deep learning for software engineering. 2020. ArXiv:2011.14597

  28. Serban A, van der Blom K, Hoos H, et al. Adoption and effects of software engineering best practices in machine learning. In: Proceedings of the 14th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, 2020. 1–12

  29. Arpteg A, Brinne B, Crnkovic-Friis L, et al. Software engineering challenges of deep learning. In: Proceedings of the 44th Euromicro Conference on Software Engineering and Advanced Applications, 2018. 50–59

  30. Zhang X, Yang Y, Feng Y, et al. Software engineering practice in the development of deep learning applications. 2019. ArXiv:1910.03156

  31. Yang Y, Xia X, Lo D, et al. Predictive models in software engineering: challenges and opportunities. ACM Trans Softw Eng Methodol, 2022, 31: 1–72

    Google Scholar 

  32. Lertvittayakumjorn P, Toni F. Explanation-based human debugging of NLP models: a survey. Trans Assoc Comput Linguistics, 2021, 9: 1508–1528

    Article  Google Scholar 

  33. Zhang Q, Zhao Y, Sun W, et al. Program repair: automated vs. manual. 2022. ArXiv:2203.05166

  34. Islam M J, Pan R, Nguyen G, et al. Repairing deep neural networks: fix patterns and challenges. In: Proceedings of IEEE/ACM 42nd International Conference on Software Engineering, 2020. 1135–1146

  35. Zhong W, Li C, Ge J, et al. Neural program repair: Systems, challenges and solutions. 2022. ArXiv:2202.10868

  36. Feng Y, Liu Q, Dou M Y, et al. Mubug: a mobile service for rapid bug tracking. Sci China Inf Sci, 2016, 59: 013101

    Article  Google Scholar 

  37. Zhang Z Y, Chen Z Y, Gao R Z, et al. An empirical study on constraint optimization techniques for test generation. Sci China Inf Sci, 2017, 60: 012105

    Article  Google Scholar 

  38. Zhao Y, Feng Y, Wang Y, et al. Quality assessment of crowdsourced test cases. Sci China Inf Sci, 2020, 63: 190102

    Article  Google Scholar 

  39. Staats M, Whalen M W, Heimdahl M P. Programs, tests, and oracles: the foundations of testing revisited. In: Proceedings of the 33rd International Conference on Software Engineering, 2011. 391–400

  40. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature, 2015, 521: 436–444

    Article  CAS  PubMed  ADS  Google Scholar 

  41. Barr A, Feigenbaum E A. The Handbook of Artificial Intelligence. Oxford: Butterworth-Heinemann, 1981

    Google Scholar 

  42. Feldt R, de Oliveira Neto F G, Torkar R. Ways of applying artificial intelligence in software engineering. In: Proceedings of IEEE/ACM 6th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering, 2018. 35–41

  43. Mou L, Li G, Zhang L, et al. Convolutional neural networks over tree structures for programming language processing. In: Proceedings of the 30th AAAI Conference on Artificial Intelligence, 2016

  44. Gu X, Zhang H, Zhang D, et al. Deep API learning. In: Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 2016. 631–642

  45. Wang S, Liu T, Tan L. Automatically learning semantic features for defect prediction. In: Proceedings of IEEE/ACM 38th International Conference on Software Engineering, 2016. 297–308

  46. Li X, Zhang L. Transforming programs and tests in tandem for fault localization. In: Proceedings of the ACM on Programming Languages, 2017. 1–30

  47. Xie X, Chen T Y, Kuo F C, et al. A theoretical analysis of the risk evaluation formulas for spectrum-based fault localization. ACM Trans Softw Eng Methodol, 2013, 22: 1–40

    Article  CAS  Google Scholar 

  48. Gao R, Wong W E. MSeer—an advanced technique for locating multiple bugs in parallel. IEEE Trans Software Eng, 2019, 45: 301–318

    Article  Google Scholar 

  49. Wang X Y, Jiang S J, Gao P F, et al. Cost-effective testing based fault localization with distance based test-suite reduction. Sci China Inf Sci, 2017, 60: 092112

    Article  Google Scholar 

  50. Wang Y, Huang Z Q, Li Y, et al. Lightweight fault localization combined with fault context to improve fault absolute rank. Sci China Inf Sci, 2017, 60: 092113

    Article  Google Scholar 

  51. Tu J, Xie X, Chen T Y, et al. On the analysis of spectrum based fault localization using hitting sets. J Syst Software, 2019, 147: 106–123

    Article  Google Scholar 

  52. Xu Z, Ma S, Zhang X, et al. Debugging with intelligence via probabilistic inference. In: Proceedings of the 40th International Conference on Software Engineering, 2018. 1171–1181

  53. Tu J, Xie X, Zhou Y, et al. A search based context-aware approach for understanding and localizing the fault via weighted call graph. In: Proceedings of the 3rd International Conference on Trustworthy Systems and their Applications, 2016. 64–72

  54. Cao J, Yang S, Jiang W, et al. BugPecker: locating faulty methods with deep learning on revision graphs. In: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, 2020. 1214–1218

  55. Wen M, Wu R, Cheung S C. Locus: locating bugs from software changes. In: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, 2016. 262–273

  56. Wong W E, Gao R, Li Y, et al. A survey on software fault localization. IEEE Trans Software Eng, 2016, 42: 707–740

    Article  Google Scholar 

  57. Weiser M D. Program slices: formal, psychological, and practical investigations of an automatic program abstraction method. Dissertation for Ph.D. Degree. Ann Arbor: University of Michigan, 1979

  58. Zhang X, He H, Gupta N, et al. Experimental evaluation of using dynamic slices for fault location. In: Proceedings of the 6th International Symposium on Automated Analysis-Driven Debugging, 2005. 33–42

  59. Wotawa F. Fault localization based on dynamic slicing and hitting-set computation. In: Proceedings of the 10th International Conference on Quality Software, 2010. 161–170

  60. Xie X, Xu B. Essential Spectrum-Based Fault Localization. Berlin: Springer, 2021

    Book  Google Scholar 

  61. Laghari G, Murgia A, Demeyer S. Fine-tuning spectrum based fault localisation with frequent method item sets. In: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, 2016. 274–285

  62. Zhang L, Li Z, Feng Y, et al. Improving fault-localization accuracy by referencing debugging history to alleviate structure bias in code suspiciousness. IEEE Trans Rel, 2020, 69: 1021–1049

    Article  Google Scholar 

  63. Zhang L, Yan L, Zhang Z, et al. A theoretical analysis on cloning the failed test cases to improve spectrum-based fault localization. J Syst Software, 2017, 129: 35–57

    Article  Google Scholar 

  64. Wen M, Chen J, Tian Y, et al. Historical spectrum based fault localization. IEEE Trans Software Eng, 2021, 47: 2348–2368

    Article  Google Scholar 

  65. Liblit B, Naik M, Zheng A X, et al. Scalable statistical bug isolation. SIGPLAN Not, 2005, 40: 15–26

    Article  Google Scholar 

  66. Nessa S, Abedin M, Wong W E, et al. Software fault localization using n-gram analysis. In: Proceedings of International Conference on Wireless Algorithms, Systems, and Applications, 2008. 548–559

  67. Guo Z Q, Zhou H C, Liu S R, et al. Information retrieval based bug localization: research problem, progress, and challenges (in Chinese). J Software, 2020, 31: 2826–2854

    Google Scholar 

  68. Zou W, Li E, Fang C. BLESER: bug localization based on enhanced semantic retrieval. 2021. ArXiv:2109.03555

  69. Ren Z, Jiang H, Xuan J, et al. Automated localization for unreproducible builds. In: Proceedings of the 40th International Conference on Software Engineering, 2018. 71–81

  70. de Souza H A, Chaim M L, Kon F. Spectrum-based software fault localization: a survey of techniques, advances, and challenges. 2016. ArXiv:1607.04347

  71. Zhang Z, Lei Y, Mao X, et al. CNN-FL: an effective approach for localizing faults using convolutional neural networks. In: Proceedings of IEEE 26th International Conference on Software Analysis, Evolution and Reengineering, 2019. 445–455

  72. Wong W E, Qi Y U. BP neural network-based effective fault localization. Int J Soft Eng Knowl Eng, 2009, 19: 573–597

    Article  Google Scholar 

  73. Zheng W, Hu D, Wang J. Fault localization analysis based on deep neural network. Math Problems Eng, 2016, 2016: 1–11

    CAS  Google Scholar 

  74. Zhang Z, Lei Y, Mao X, et al. A study of effectiveness of deep learning in locating real faults. Inf Software Tech, 2021, 131: 106486

    Article  Google Scholar 

  75. Lam A N, Nguyen A T, Nguyen H A, et al. Bug localization with combination of deep learning and information retrieval. In: Proceedings of IEEE/ACM 25th International Conference on Program Comprehension, 2017. 218–229

  76. Huo X, Li M. Enhancing the unified features to locate buggy files by exploiting the sequential nature of source code. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence, 2017. 1909–1915

  77. Shi Z, Keung J, Bennin K E, et al. Comparing learning to rank techniques in hybrid bug localization. Appl Soft Computing, 2018, 62: 636–648

    Article  Google Scholar 

  78. Chen Z F, Ma W W Y, Lin W, et al. A study on the changes of dynamic feature code when fixing bugs: towards the benefits and costs of Python dynamic features. Sci China Inf Sci, 2018, 61: 012107

    Article  Google Scholar 

  79. Le X B D, Le Q L, Lo D, et al. Enhancing automated program repair with deductive verification. In: Proceedings of IEEE International Conference on Software Maintenance and Evolution, 2016. 428–432

  80. Gopinath D, Wang K, Hua J, et al. Repairing intricate faults in code using machine learning and path exploration. In: Proceedings of IEEE International Conference on Software Maintenance and Evolution, 2016. 453–457

  81. Roychoudhury A, Xiong Y F. Automated program repair: a step towards software automation. Sci China Inf Sci, 2019, 62: 200103

    Article  Google Scholar 

  82. Kong X, Zhang L, Wong W E, et al. Experience report: how do techniques, programs, and tests impact automated program repair? In: Proceedings of IEEE 26th International Symposium on Software Reliability Engineering, 2015. 194–204

  83. Wen M, Liu Y, Cheung S C. Boosting automated program repair with bug-inducing commits. In: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: New Ideas and Emerging Results, 2020. 77–80

  84. Marginean A, Bader J, Chandra S, et al. SapFix: automated end-to-end repair at scale. In: Proceedings of IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice, 2019. 269–278

  85. Bader J, Scott A, Pradel M, et al. Getafix: learning to fix bugs automatically. In: Proceedings of the ACM on Programming Languages, 2019. 1–27

  86. Motwani M, Soto M, Brun Y, et al. Quality of automated program repair on real-world defects. IEEE Trans Software Eng, 2022, 48: 637–661

    Article  Google Scholar 

  87. Smith E K, Barr E T, Goues C L, et al. Is the cure worse than the disease? Overfitting in automated program repair. In: Proceedings of the 10th Joint Meeting on Foundations of Software Engineering, 2015. 532–543

  88. Le Goues C, Dewey-Vogt M, Forrest S, et al. A systematic study of automated program repair: fixing 55 out of 105 bugs for $8 each. In: Proceedings of the 34th International Conference on Software Engineering, 2012. 3–13

  89. Qi Y, Mao X, Lei Y. Efficient automated program repair through fault-recorded testing prioritization. In: Proceedings of IEEE International Conference on Software Maintenance, 2013. 180–189

  90. Weimer W, Fry Z P, Forrest S. Leveraging program equivalence for adaptive program repair: Models and first results. In: Proceedings of the 28th IEEE/ACM International Conference on Automated Software Engineering, 2013. 356–366

  91. Kim J, Kim J, Lee E. VFL: variable-based fault localization. Inf Software Tech, 2019, 107: 179–191

    Article  MathSciNet  Google Scholar 

  92. Wang S, Liu K, Lin B, et al. Beep: fine-grained fix localization by learning to predict buggy code elements. 2021. ArXiv:2111.07739

  93. Liu K, Koyuncu A, Bissyande T F, et al. You cannot fix what you cannot find! An investigation of fault localization bias in benchmarking automated program repair systems. In: Proceedings of the 12th IEEE Conference on Software Testing, Validation and Verification, 2019. 102–113

  94. Monperrus M. A critical review of “automatic patch generation learned from human-written patches”: essay on the problem statement and the evaluation of automatic software repair. In: Proceedings of the 36th International Conference on Software Engineering, 2014. 234–242

  95. Wang S, Mao X, Niu N, et al. Multi-location program repair strategies learned from past successful experience. 2018. ArXiv:1810.12556

  96. Motwani M, Sankaranarayanan S, Just R, et al. Do automated program repair techniques repair hard and important bugs? Empirical Software Eng, 2018, 23: 2901–2947

    Article  Google Scholar 

  97. Liu K, Kim D, Bissyande T F, et al. Mining fix patterns for FindBugs violations. IEEE Trans Software Eng, 2021, 47: 165–188

    Article  Google Scholar 

  98. Tufano M, Watson C, Bavota G, et al. An empirical study on learning bug-fixing patches in the wild via neural machine translation. ACM Trans Softw Eng Methodol, 2019, 28: 1–29

    Article  Google Scholar 

  99. Chen Z, Kommrusch S J, Tufano M, et al. SEQUENCER: sequence-to-sequence learning for end-to-end program repair. IEEE Trans Software Eng, 2021. doi: https://doi.org/10.1109/TSE.2019.2940179

  100. Lutellier T, Pham H V, Pang L, et al. CoCoNuT: combining context-aware neural translation models using ensemble for program repair. In: Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2020. 101–114

  101. Cao J, Li M, Chen X, et al. DeepFD: automated fault diagnosis and localization for deep learning programs. 2022. ArXiv:2205.01938

  102. Li Z, Ma X, Xu C, et al. Operational calibration: debugging confidence errors for dnns in the field. In: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2020. 901–913

  103. Yan S, Tao G, Liu X, et al. Correlations between deep neural network model coverage criteria and model quality. In: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2020. 775–787

  104. Brown L. Tesla driver killed in crash posted videos of himself driving hands-free. 2021. https://www.marketwatch.com/story/tesla-driver-killed-in-crash-posted-videos-of-himself-driving-hands-free-11621220917

  105. Marijan D, Gotlieb A. Software testing for machine learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, 2020. 34: 13576–13582

    Article  Google Scholar 

  106. Shen W, Li Y, Han Y, et al. Boundary sampling to boost mutation testing for deep learning models. Inf Software Tech, 2021, 130: 106413

    Article  Google Scholar 

  107. Shen G, Liu Y, Tao G, et al. Backdoor scanning for deep neural networks through k-arm optimization. In: Proceedings of International Conference on Machine Learning, 2021. 9525–9536

  108. Meng L, Li Y, Chen L, et al. Measuring discrimination to boost comparative testing for multiple deep learning models. In: Proceedings of IEEE/ACM 43rd International Conference on Software Engineering, 2021. 385–396

  109. Lourenço R, Freire J, Shasha D. Debugging machine learning pipelines. In: Proceedings of the 3rd International Workshop on Data Management for End-to-End Machine Learning, 2019. 1–10

  110. Feng Y, Shi Q, Gao X, et al. DeepGini: prioritizing massive tests to enhance the robustness of deep neural networks. In: Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2020. 177–188

  111. Krishnan S, Wu E. PALM: machine learning explanations for iterative debugging. In: Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics, 2017. 1–6

  112. Koh P W, Liang P. Understanding black-box predictions via influence functions. In: Proceedings of International Conference on Machine Learning, 2017. 1885–1894

  113. Cao Y, Yu A F, Aday A, et al. Efficient repair of polluted machine learning systems via causal unlearning. In: Proceedings of the Asia Conference on Computer and Communications Security, 2018. 735–747

  114. Zhang H, Chan W. Apricot: a weight-adaptation approach to fixing deep learning models. In: Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering, 2019. 376–387

  115. Shen W, Li Y, Chen L, et al. Multiple-boundary clustering and prioritization to promote neural network retraining. In: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, 2020. 410–422

  116. Zhang X, Yin Z, Feng Y, et al. NeuralVis: visualizing and interpreting deep learning models. In: Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering, 2019. 1106–1109

  117. Eniser H F, Gerasimou S, Sen A. DeepFault: fault localization for deep neural networks. 2019. ArXiv:1902.05974

  118. Guidotti D, Leofante F, Pulina L, et al. Verification and repair of neural networks: a progress report on convolutional models. In: Proceedings of International Conference of the Italian Association for Artificial Intelligence, 2019. 405–417

  119. Zhang Y, Chen Y, Cheung S C, et al. An empirical study on TensorFlow program bugs. In: Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2018. 129–140

  120. Islam M J, Nguyen G, Pan R, et al. A comprehensive study on deep learning bug characteristics. In: Proceedings of the 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2019. 510–520

  121. Humbatova N, Jahangirova G, Bavota G, et al. Taxonomy of real faults in deep learning systems. In: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, 2020. 1110–1121

  122. Kitchenham B A, Budgen D, Brereton P. Evidence-Based Software Engineering and Systematic Reviews: Volume 4. Boca Raton: CRC Press, 2015

    Book  Google Scholar 

  123. Basili V R, Caldiera G, Rombach H D. The goal question metric approach. In: Encyclopedia of Software Engineering. 1994. 528–532

  124. Colanzi T E, Assunção W K G, Farah P R, et al. A review of ten years of the symposium on search-based software engineering. In: Proceedings of International Symposium on Search Based Software Engineering, 2019. 42–57

  125. Ye X, Shen H, Ma X, et al. From word embeddings to document similarities for improved information retrieval in software engineering. In: Proceedings of the 38th International Conference on Software Engineering, 2016. 404–415

  126. Long F, Rinard M. Automatic patch generation by learning correct code. In: Proceedings of the 43rd Annual ACM SIGPLANSIGACT Symposium on Principles of Programming Languages, 2016. 298–312

  127. Xuan J, Martinez M, DeMarco F, et al. Nopol: automatic repair of conditional statement bugs in Java programs. IEEE Trans Software Eng, 2017, 43: 34–55

    Article  Google Scholar 

  128. Le X B D, Lo D, Le Goues C. History driven program repair. In: Proceedings of IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering, 2016. 213–224

  129. Xiong Y, Wang J, Yan R, et al. Precise condition synthesis for program repair. In: Proceedings of IEEE/ACM 39th International Conference on Software Engineering, 2017. 416–426

  130. Ma S, Liu Y, Lee W C, et al. MODE: automated neural network model debugging via state differential analysis and input selection. In: Proceedings of the 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2018. 175–186

  131. Agrawal A, Fu W, Menzies T. What is wrong with topic modeling? And how to fix it using search-based software engineering. Inf Software Tech, 2018, 98: 74–88

    Article  Google Scholar 

  132. Peng Z, Xiao X, Hu G, et al. ABFL: an autoencoder based practical approach for software fault localization. Inf Sci, 2020, 510: 108–121

    Article  Google Scholar 

  133. Huo X, Li M, Zhou Z H. Control flow graph embedding based on multi-instance decomposition for bug localization. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence, 2020. 4223–4230

  134. Li X, Li W, Zhang Y, et al. DeepFL: integrating multiple fault diagnosis dimensions for deep fault localization. In: Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2019. 169–180

  135. Qi G, Yao L, Uzunov A V. Fault detection and localization in distributed systems using recurrent convolutional neural networks. In: Proceedings of International Conference on Advanced Data Mining and Applications, 2017. 33–48

  136. Huo X, Li M, Zhou Z H. Learning unified features from natural and programming languages for locating buggy source code. In: Proceedings of the 25th International Joint Conference on Artificial Intelligence, 2016. 1606–1612

  137. Liang H, Sun L, Wang M, et al. Deep learning with customized abstract syntax tree for bug localization. IEEE Access, 2019, 7: 116309

    Article  Google Scholar 

  138. Golagha M, Pretschner A, Briand L C. Can we predict the quality of spectrum-based fault localization? In: Proceedings of IEEE 13th International Conference on Software Testing, Validation and Verification, 2020. 4–15

  139. Gu Y, Xuan J, Zhang H, et al. Does the fault reside in a stack trace? Assisting crash localization by predicting crashing fault residence. J Syst Software, 2019, 148: 88–104

    Article  Google Scholar 

  140. Kim Y, Mun S, Yoo S, et al. Precise learn-to-rank fault localization using dynamic and static features of target programs. ACM Trans Softw Eng Methodol, 2019, 28: 1–34

    Article  Google Scholar 

  141. Xia X, Lo D. An effective change recommendation approach for supplementary bug fixes. Autom Softw Eng, 2017, 24: 455–498

    Article  Google Scholar 

  142. Mohri M, Rostamizadeh A, Talwalkar A. Foundations of Machine Learning. Cambridge: MIT Press, 2018

    Google Scholar 

  143. Pan Y, Xiao X, Hu G, et al. ALBFL: a novel neural ranking model for software fault localization via combining static and dynamic features. In: Proceedings of IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications, 2020. 785–792

  144. Ye X, Bunescu R, Liu C. Mapping bug reports to relevant files: a ranking model, a fine-grained benchmark, and feature evaluation. IEEE Trans Software Eng, 2016, 42: 379–402

    Article  Google Scholar 

  145. Yang X L, Lo D, Xia X, et al. High-impact bug report identification with imbalanced learning strategies. J Comput Sci Technol, 2017, 32: 181–198

    Article  Google Scholar 

  146. Guo Z, Li Y, Ma W, et al. Boosting crash-inducing change localization with rank-performance-based feature subset selection. Empir Software Eng, 2020, 25: 1905–1950

    Article  Google Scholar 

  147. Wu R, Wen M, Cheung S C, et al. ChangeLocator: locate crash-inducing changes based on crash reports. Empir Software Eng, 2018, 23: 2866–2900

    Article  Google Scholar 

  148. Li A, Lei Y, Mao X. Towards more accurate fault localization: an approach based on feature selection using branching execution probability. In: Proceedings of IEEE International Conference on Software Quality, Reliability and Security, 2016. 431–438

  149. Feyzi F. CGT-FL: using cooperative game theory to effective fault localization in presence of coincidental correctness. Empir Software Eng, 2020, 25: 3873–3927

    Article  Google Scholar 

  150. Amar A, Rigby P C. Mining historical test logs to predict bugs and localize faults in the test logs. In: Proceedings of IEEE/ACM 41st International Conference on Software Engineering, 2019. 140–151

  151. Koyuncu A, Bissyande T F, Kim D, et al. D&C: a divide-and-conquer approach to IR-based bug localization. 2019. ArXiv:1902.02703

  152. Yang B, He Y, Liu H, et al. A lightweight fault localization approach based on XGBoost. In: Proceedings of IEEE 20th International Conference on Software Quality, Reliability and Security, 2020. 168–179

  153. Nath A, Domingos P. Learning tractable probabilistic models for fault localization. In: Proceedings of the AAAI Conference on Artificial Intelligence: Volume 30. 2016

  154. Popescu M C, Balas V E, Perescu-Popescu L, et al. Multilayer perceptron and neural networks. WSEAS Trans Circuits Syst, 2009, 8: 579–588

    Google Scholar 

  155. Maru A, Dutta A, Kumar K V, et al. Effective software fault localization using a back propagation neural network. In: Proceedings of Computational Intelligence in Data Mining, 2020. 513–526

  156. Dutta A, Pant N, Mitra P, et al. Effective fault localization using an ensemble classifier. In: Proceedings of International Conference on Quality, Reliability, Risk, Maintenance, and Safety Engineering, 2019. 847–855

  157. Li Y, Wang S, Nguyen T N. Fault localization with code coverage representation learning. In: Proceedings of IEEE/ACM 43rd International Conference on Software Engineering, 2021. 661–673

  158. Polisetty S, Miranskyy A, Başar A. On usefulness of the deep-learning-based bug localization models to practitioners. In: Proceedings of the 15th International Conference on Predictive Models and Data Analytics in Software Engineering, 2019. 16–25

  159. Mahapatra R, Negi A. Effective software fault localization using GA-RBF neural network. J Theor Applied Inform Technol, 2016, 90: 168–174

  160. Sohn J, Yoo S. FLUCCS: using code and change metrics to improve fault localization. In: Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2017. 273–283

  161. Choi K, Sohn J, Yoo S. Learning fault localisation for both humans and machines using multi-objective GP. In: Proceedings of International Symposium on Search Based Software Engineering, 2018. 349–355

  162. Xuan J, Monperrus M. Learning to combine multiple ranking metrics for fault localization. In: Proceedings of IEEE International Conference on Software Maintenance and Evolution, 2014. 191–200

  163. Zou D, Liang J, Xiong Y, et al. An empirical study of fault localization families and their combinations. IEEE Trans Software Eng, 2021, 47: 332–347

  164. Liu P, Chen Y, Nie X, et al. FluxRank: a widely-deployable framework to automatically localizing root cause machines for software service failure mitigation. In: Proceedings of IEEE 30th International Symposium on Software Reliability Engineering, 2019. 35–46

  165. Le T D B, Lo D, Goues C L, et al. A learning-to-rank based fault localization approach using likely invariants. In: Proceedings of the 25th International Symposium on Software Testing and Analysis, 2016. 177–188

  166. Küçük Y, Henderson T A, Podgurski A. Improving fault localization by integrating value and predicate based causal inference techniques. In: Proceedings of IEEE/ACM 43rd International Conference on Software Engineering, 2021. 649–660

  167. Podgurski A, Küçük Y. CounterFault: value-based fault localization by modeling and predicting counterfactual outcomes. In: Proceedings of IEEE International Conference on Software Maintenance and Evolution, 2020. 382–393

  168. Lou Y, Zhu Q, Dong J, et al. Boosting coverage-based fault localization via graph-based representation learning. In: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2021. 664–676

  169. Maamar M, Lazaar N, Loudni S, et al. Fault localization using itemset mining under constraints. Autom Softw Eng, 2017, 24: 341–368

  170. Yan M, Xia X, Fan Y, et al. Just-In-Time defect identification and localization: a two-phase framework. IEEE Trans Software Eng, 2022, 48: 82–101

  171. Zaman T S, Han X, Yu T. SCMiner: localizing system-level concurrency faults from large system call traces. In: Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering, 2019. 515–526

  172. Hoang T, Oentaryo R J, Le T D B, et al. Network-clustered multi-modal bug localization. IEEE Trans Software Eng, 2019, 45: 1002–1023

  173. Cheng S, Yan X, Khan A A. A similarity integration method based information retrieval and word embedding in bug localization. In: Proceedings of IEEE 20th International Conference on Software Quality, Reliability and Security, 2020. 180–187

  174. Pradel M, Sen K. DeepBugs: a learning approach to name-based bug detection. Proc ACM Program Lang, 2018, 2: 1–25

  175. Liu G, Lu Y, Shi K, et al. Convolutional neural networks-based locating relevant buggy code files for bug reports affected by data imbalance. IEEE Access, 2019, 7: 131304–131316

  176. Xiao Y, Keung J, Bennin K E, et al. Improving bug localization with word embedding and enhanced convolutional neural networks. Inf Software Tech, 2019, 105: 17–29

  177. Li G, Liu H, Jin J, et al. Deep learning based identification of suspicious return statements. In: Proceedings of IEEE 27th International Conference on Software Analysis, Evolution and Reengineering, 2020. 480–491

  178. Zhang Y, Lo D, Xia X, et al. Fusing multi-abstraction vector space models for concern localization. Empir Software Eng, 2018, 23: 2279–2322

  179. Mills C, Parra E, Pantiuchina J, et al. On the relationship between bug reports and queries for text retrieval-based bug localization. Empir Software Eng, 2020, 25: 3086–3127

  180. Almhana R, Mkaouer W, Kessentini M, et al. Recommending relevant classes for bug reports using multi-objective search. In: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, 2016. 286–295

  181. Almhana R, Kessentini M, Mkaouer W. Method-level bug localization using hybrid multi-objective search. Inf Software Tech, 2021, 131: 106474

  182. Mikolov T, Sutskever I, Chen K, et al. Distributed representations of words and phrases and their compositionality. In: Proceedings of Advances in Neural Information Processing Systems, 2013. 3111–3119

  183. Briem J A, Smit J, Sellik H, et al. OffSide: learning to identify mistakes in boundary conditions. In: Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops, 2020. 203–208

  184. Liu G, Lu Y, Shi K, et al. Mapping bug reports to relevant source code files based on the vector space model and word embedding. IEEE Access, 2019, 7: 78870–78881

  185. Zhang W, Li Z, Wang Q, et al. FineLocator: a novel approach to method-level fine-grained bug localization by query expansion. Inf Software Tech, 2019, 110: 121–135

  186. Zhu Z, Li Y, Tong H, et al. CooBa: cross-project bug localization via adversarial transfer learning. In: Proceedings of the 29th International Joint Conference on Artificial Intelligence, 2020. 3565–3571

  187. Zhong H, Mei H. Learning a graph-based classifier for fault localization. Sci China Inf Sci, 2020, 63: 162101

  188. Jonsson L, Broman D, Magnusson M, et al. Automatic localization of bugs to faulty components in large scale software systems using Bayesian classification. In: Proceedings of IEEE International Conference on Software Quality, Reliability and Security, 2016. 423–430

  189. Huang Q, Lo D, Xia X, et al. Which packages would be affected by this bug report? In: Proceedings of IEEE 28th International Symposium on Software Reliability Engineering, 2017. 124–135

  190. Le T D B, Thung F, Lo D. Will this localization tool be effective for this bug? Mitigating the impact of unreliability of information retrieval based bug localization tools. Empir Software Eng, 2017, 22: 2237–2279

  191. Li Z, Jiang Z, Chen X, et al. Laprob: a label propagation-based software bug localization method. Inf Software Tech, 2021, 130: 106410

  192. Rahman M M, Roy C K. Improving IR-based bug localization with context-aware query reformulation. In: Proceedings of the 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2018. 621–632

  193. Li X, Wong W E, Gao R, et al. Genetic algorithm-based test generation for software product line with the integration of fault localization techniques. Empir Software Eng, 2018, 23: 1–51

  194. Chatterjee P, Chatterjee A, Campos J, et al. Diagnosing software faults using multiverse analysis. In: Proceedings of the 29th International Joint Conference on Artificial Intelligence, 2020. 1629–1635

  195. Elmishali A, Stern R, Kalech M. An artificial intelligence paradigm for troubleshooting software bugs. Eng Appl Artif Intelligence, 2018, 69: 147–156

  196. Liu B, Nejati S, Lucia S, et al. Effective fault localization of automotive Simulink models: achieving the trade-off between test oracle effort and fault localization accuracy. Empir Software Eng, 2019, 24: 444–490

  197. Zhang Z, Lei Y, Mao X, et al. Improving deep-learning-based fault localization with resampling. J Software Evolu Process, 2021, 33: e2312

  198. Japkowicz N, Stephen S. The class imbalance problem: a systematic study. Intell Data Anal, 2002, 6: 429–449

  199. Graves A, Mohamed A R, Hinton G. Speech recognition with deep recurrent neural networks. In: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, 2013. 6645–6649

  200. Jarrett K, Kavukcuoglu K, Ranzato M, et al. What is the best multi-stage architecture for object recognition? In: Proceedings of IEEE 12th International Conference on Computer Vision, 2009. 2146–2153

  201. Xia X, Gong L, Le T D B, et al. Diversity maximization speedup for localizing faults in single-fault and multi-fault programs. Autom Softw Eng, 2016, 23: 43–75

  202. Liu Y, Li M, Wu Y, et al. A weighted fuzzy classification approach to identify and manipulate coincidental correct test cases for fault localization. J Syst Software, 2019, 151: 20–37

  203. Zhang M, Li Y, Li X, et al. An empirical study of boosting spectrum-based fault localization via PageRank. IEEE Trans Software Eng, 2021, 47: 1089–1113

  204. Chen J, Ma H, Zhang L. Enhanced compiler bug isolation via memoized search. In: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, 2020. 78–89

  205. Zhang X Y, Zheng Z, Cai K Y. Exploring the usefulness of unlabelled test cases in software fault localization. J Syst Software, 2018, 136: 278–290

  206. Gupta R, Kanade A, Shevade S. Neural attribution for semantic bug-localization in student programs. In: Proceedings of Advances in Neural Information Processing Systems, 2019. 32

  207. Sundararajan M, Taly A, Yan Q. Axiomatic attribution for deep networks. In: Proceedings of International Conference on Machine Learning, 2017. 3319–3328

  208. He J, Xu L, Yan M, et al. Duplicate bug report detection using dual-channel convolutional neural networks. In: Proceedings of the 28th International Conference on Program Comprehension, 2020. 117–127

  209. Ni Z, Li B, Sun X, et al. Analyzing bug fix for automatic bug cause classification. J Syst Software, 2020, 163: 110538

  210. Yan X B, Liu B, Wang S H. A test restoration method based on genetic algorithm for effective fault localization in multiple-fault programs. J Syst Software, 2021, 172: 110861

  211. Zheng Y, Wang Z, Fan X, et al. Localizing multiple software faults based on evolution algorithm. J Syst Software, 2018, 139: 107–123

  212. Gao M, Li P, Chen C, et al. Research on software multiple fault localization method based on machine learning. In: Proceedings of MATEC Web of Conferences: volume 232, 2018. 01060

  213. Behera R K, Shukla S, Rath S K, et al. Software reliability assessment using machine learning technique. In: Proceedings of International Conference on Computational Science and Its Applications. Springer, 2018. 403–411

  214. Li Z, Chen T H, Shang W. Where shall we log? Studying and suggesting logging locations in code blocks. In: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, 2020. 361–372

  215. Vasic M, Kanade A, Maniatis P, et al. Neural program repair by jointly learning to localize and repair. 2019. ArXiv:1904.01720

  216. Chappell T, Cifuentes C, Krishnan P, et al. Machine learning for finding bugs: an initial report. In: Proceedings of IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation, 2017. 21–26

  217. Yang X, Yu Z, Wang J, et al. Understanding static code warnings: an incremental AI approach. Expert Syst Appl, 2021, 167: 114134

  218. Lin Y, Sun J, Tran L, et al. Break the dead end of dynamic slicing: localizing data and control omission bug. In: Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, 2018. 509–519

  219. Yu X, Liu J, Yang Z, et al. The Bayesian network based program dependence graph and its application to fault localization. J Syst Software, 2017, 134: 44–53

  220. Hofer B, Nica I, Wotawa F. AI for localizing faults in spreadsheets. In: Proceedings of IFIP International Conference on Testing Software and Systems, 2017. 71–87

  221. Terra-Neves M, Machado N, Lynce I, et al. Concurrency debugging with MaxSMT. In: Proceedings of the 33rd AAAI Conference on Artificial Intelligence, 2019. 1608–1616

  222. Mesbah A, Rice A, Johnston E, et al. DeepDelta: learning to repair compilation errors. In: Proceedings of the 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2019. 925–936

  223. Zou W, Lo D, Kochhar P S, et al. Smart contract development: challenges and opportunities. IEEE Trans Software Eng, 2021, 47: 2084–2106

  224. Yu X L, Al-Bataineh O, Lo D, et al. Smart contract repair. ACM Trans Softw Eng Methodol, 2020, 29: 1–32

  225. Yuan Y, Banzhaf W. ARJA: automated repair of Java programs via multi-objective genetic programming. IEEE Trans Software Eng, 2018, 46: 1040–1067

  226. Yuan Y, Banzhaf W. Toward better evolutionary program repair. ACM Trans Softw Eng Methodol, 2020, 29: 1–53

  227. Oliveira V P L, Souza E F, Goues C L, et al. Improved representation and genetic operators for linear genetic programming for automated program repair. Empir Software Eng, 2018, 23: 2980–3006

  228. Lee J, Song D, So S, et al. Automatic diagnosis and correction of logical errors for functional programming assignments. Proc ACM Program Lang, 2018, 2: 1–30

  229. Machado N, Quinta D, Lucia B, et al. Concurrency debugging with differential schedule projections. ACM Trans Softw Eng Methodol, 2016, 25: 1–37

  230. Pan R, Hu Q, Xu G, et al. Automatic repair of regular expressions. Proc ACM Program Lang, 2019, 3: 1–29

  231. Koyuncu A, Liu K, Bissyande T F, et al. FixMiner: mining relevant fix patterns for automated program repair. Empir Software Eng, 2020, 25: 1980–2024

  232. Gulwani S, Radiček I, Zuleger F. Automated clustering and program repair for introductory programming assignments. SIGPLAN Not, 2018, 53: 465–480

  233. Falleri J R, Morandat F, Blanc X, et al. Fine-grained and accurate source code differencing. In: Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering, 2014. 313–324

  234. Sakkas G, Endres M, Cosman B, et al. Type error feedback via analytic program repair. In: Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation, 2020. 16–30

  235. White M, Tufano M, Martinez M, et al. Sorting and transforming program repair ingredients via deep learning code similarities. In: Proceedings of IEEE 26th International Conference on Software Analysis, Evolution and Reengineering, 2019. 479–490

  236. Yi X, Chen L, Mao X, et al. Efficient automated repair of high floating-point errors in numerical libraries. Proc ACM Program Lang, 2019, 3: 1–29

  237. Jiang J, Xiong Y, Zhang H, et al. Shaping program repair space with existing patches and similar code. In: Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2018. 298–309

  238. Jiang N, Lutellier T, Tan L. CURE: code-aware neural machine translation for automatic program repair. In: Proceedings of IEEE/ACM 43rd International Conference on Software Engineering, 2021. 1161–1173

  239. Koyuncu A, Liu K, Bissyande T F, et al. iFixR: bug report driven program repair. In: Proceedings of the 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2019. 314–325

  240. Zhu Q, Sun Z, Xiao Y A, et al. A syntax-guided edit decoder for neural program repair. In: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2021. 341–353

  241. Sun Z, Zhu Q, Xiong Y, et al. TreeGen: a tree-based transformer architecture for code generation. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence, 2020. 8984–8991

  242. Shariffdeen R, Noller Y, Grunske L, et al. Concolic program repair. In: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2021. 390–405

  243. Lee J, Hong S, Oh H. MemFix: static analysis-based repair of memory deallocation errors for C. In: Proceedings of the 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2018. 95–106

  244. Li Y, Wang S, Nguyen T N. DLFix: context-based code transformation learning for automated program repair. In: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, 2020. 602–614

  245. Wen M, Chen J, Wu R, et al. Context-aware patch generation for better automated program repair. In: Proceedings of IEEE/ACM 40th International Conference on Software Engineering, 2018. 1–11

  246. Wang S, Wen M, Lin B, et al. Automated patch correctness assessment: how far are we? In: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, 2020. 968–980

  247. Ziarko W, Shan N. Machine learning through data classification and reduction. Fundamenta Informaticae, 1997, 30: 373–382

  248. Patil T R, Sherekar S S. Performance analysis of Naive Bayes and J48 classification algorithm for data classification. Int J Comput Sci Appl, 2013, 6: 256–261

  249. Kleinbaum D G, Dietz K, Gail M, et al. Logistic Regression. Berlin: Springer, 2002

  250. Arcuri A, Briand L. A practical guide for using statistical tests to assess randomized algorithms in software engineering. In: Proceedings of the 33rd International Conference on Software Engineering, 2011. 1–10

  251. Platt J. Sequential minimal optimization: a fast algorithm for training support vector machines. 1998. https://www.microsoft.com/en-us/research/publication/sequential-minimal-optimization-a-fast-algorithm-for-training-support-vector-machines/

  252. Xiong Y, Liu X, Zeng M, et al. Identifying patch correctness in test-based program repair. In: Proceedings of the 40th International Conference on Software Engineering, 2018. 789–799

  253. Liang J, Ji R, Jiang J, et al. Interactive patch filtering as debugging aid. In: Proceedings of IEEE International Conference on Software Maintenance and Evolution, 2021. 239–250

  254. Saha R K, Lyu Y, Yoshida H, et al. Elixir: effective object-oriented program repair. In: Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering, 2017. 648–659

  255. Tan S H, Yoshida H, Prasad M R, et al. Anti-patterns in search-based program repair. In: Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 2016. 727–738

  256. Long F, Rinard M. Staged program repair with condition synthesis. In: Proceedings of the 10th Joint Meeting on Foundations of Software Engineering, 2015. 166–178

  257. Le X B D, Thung F, Lo D, et al. Overfitting in semantics-based automated program repair. Empir Software Eng, 2018, 23: 3007–3033

  258. Yasunaga M, Liang P. Graph-based, self-supervised program repair from diagnostic feedback. In: Proceedings of International Conference on Machine Learning, 2020. 10799–10808

  259. Wang K, Singh R, Su Z. Dynamic neural program embedding for program repair. 2017. ArXiv:1711.07163

  260. Gupta R, Kanade A, Shevade S. Deep reinforcement learning for syntactic error repair in student programs. In: Proceedings of the 33rd AAAI Conference on Artificial Intelligence, 2019. 930–937

  261. Traver V J. On compiler error messages: what they say and what they mean. Adv Hum-Comput Interaction, 2010, 2010: 1–26

  262. Allamanis M, Brockschmidt M, Khademi M. Learning to represent programs with graphs. In: Proceedings of International Conference on Learning Representations, 2018

  263. Gember-Jacobson A, Akella A, Mahajan R, et al. Automatically repairing network control planes using an abstract representation. In: Proceedings of the 26th Symposium on Operating Systems Principles, 2017. 359–373

  264. Dinella E, Dai H, Li Z, et al. Hoppity: learning graph transformations to detect and fix bugs in programs. In: Proceedings of International Conference on Learning Representations, 2020

  265. Gupta K, Christensen P E, Chen X, et al. Synthesize, execute and debug: learning to repair for neural program synthesis. 2020. ArXiv:2007.08095

  266. Tian H, Liu K, Kaboré A K, et al. Evaluating representation learning of code changes for predicting patch correctness in program repair. In: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, 2020. 981–992

  267. Yang G, Min K, Lee B. Applying deep learning algorithm to automatic bug localization and repair. In: Proceedings of the 35th Annual ACM Symposium on Applied Computing, 2020. 1634–1641

  268. Lourenço R, Freire J, Shasha D. BugDoc: algorithms to debug computational processes. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, 2020. 463–478

  269. Pham H V, Lutellier T, Qi W, et al. CRADLE: cross-backend validation to detect and localize bugs in deep learning libraries. In: Proceedings of IEEE/ACM 41st International Conference on Software Engineering, 2019. 1027–1038

  270. Wardat M, Le W, Rajan H. DeepLocalize: fault localization for deep neural networks. In: Proceedings of IEEE/ACM 43rd International Conference on Software Engineering, 2021. 251–262

  271. Dolby J, Shinnar A, Allain A, et al. Ariadne: analysis for machine learning programs. In: Proceedings of the 2nd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages, 2018. 1–10

  272. Cheng D, Cao C, Xu C, et al. Manifesting bugs in machine learning code: an explorative study with mutation testing. In: Proceedings of IEEE International Conference on Software Quality, Reliability and Security, 2018. 313–324

  273. Wu X, Zheng W, Xia X, et al. Data quality matters: a case study on data label correctness for security bug report prediction. IEEE Trans Software Eng, 2022, 48: 2541–2556

  274. Tao G, Ma S, Liu Y, et al. TRADER: trace divergence analysis and embedding regulation for debugging recurrent neural networks. In: Proceedings of IEEE/ACM 42nd International Conference on Software Engineering, 2020. 986–998

  275. Kim E, Gopinath D, Pasareanu C, et al. A programmatic and semantic approach to explaining and debugging neural network based object detectors. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020. 11128–11137

  276. Sotoudeh M, Thakur A V. Provable repair of deep neural networks. In: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2021. 588–603

  277. Song K, Tan X, Lu J. Neural machine translation with error correction. 2020. ArXiv:2007.10681

  278. Zhang Y, Ren L, Chen L, et al. Detecting numerical bugs in neural network architectures. In: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2020. 826–837

  279. Schoop E, Huang F, Hartmann B. UMLAUT: debugging deep learning programs using program structure and model behavior. In: Proceedings of the CHI Conference on Human Factors in Computing Systems, 2021. 1–16

  280. Zhang X, Zhai J, Ma S, et al. AUTOTRAINER: an automatic DNN training problem detection and repair system. In: Proceedings of IEEE/ACM 43rd International Conference on Software Engineering, 2021. 359–371

  281. Sun Z, Zhang J M, Harman M, et al. Automatic testing and improvement of machine translation. In: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, 2020. 974–985

  282. Jebnoun H, Braiek H B, Rahman M M, et al. The scent of deep learning code: an empirical study. In: Proceedings of the 17th International Conference on Mining Software Repositories, 2020. 420–430

  283. Fan Y, Xia X, Lo D, et al. What makes a popular academic AI repository? Empir Software Eng, 2021, 26: 2

  284. Liu J, Huang Q, Xia X, et al. Is using deep learning frameworks free? Characterizing technical debt in deep learning frameworks. In: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Software Engineering in Society, 2020. 1–10

  285. Liu J, Huang Q, Xia X, et al. An exploratory study on the introduction and removal of different types of technical debt in deep learning frameworks. Empir Software Eng, 2021, 26: 16

  286. Han J, Deng S, Lo D, et al. An empirical study of the dependency networks of deep learning libraries. In: Proceedings of IEEE International Conference on Software Maintenance and Evolution, 2020. 868–878

  287. Sun X, Zhou T, Li G, et al. An empirical study on real bugs for machine learning programs. In: Proceedings of the 24th Asia-Pacific Software Engineering Conference, 2017. 348–357

  288. Zhang R, Xiao W, Zhang H, et al. An empirical study on program failures of deep learning jobs. In: Proceedings of IEEE/ACM 42nd International Conference on Software Engineering, 2020. 1159–1170

  289. Jia L, Zhong H, Wang X, et al. An empirical study on bugs inside TensorFlow. In: Proceedings of International Conference on Database Systems for Advanced Applications. Berlin: Springer, 2020. 604–620

  290. Garcia J, Feng Y, Shen J, et al. A comprehensive study of autonomous vehicle bugs. In: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, 2020. 385–396

  291. Chen Z, Yao H, Lou Y, et al. An empirical study on deployment faults of deep learning based mobile applications. In: Proceedings of IEEE/ACM 43rd International Conference on Software Engineering, 2021. 674–685

  292. Just R, Jalali D, Ernst M D. Defects4J: a database of existing faults to enable controlled testing studies for Java programs. In: Proceedings of the International Symposium on Software Testing and Analysis, 2014. 437–440

  293. Abreu R, Zoeteweij P, van Gemund A J. An evaluation of similarity coefficients for software fault localization. In: Proceedings of the 12th Pacific Rim International Symposium on Dependable Computing, 2006. 39–46

  294. Wong W E, Qi Y, Zhao L, et al. Effective fault localization using code coverage. In: Proceedings of the 31st Annual International Computer Software and Applications Conference, 2007. 449–456

  295. Rao P, Zheng Z, Chen T Y, et al. Impacts of test suite’s class imbalance on spectrum-based fault localization techniques. In: Proceedings of the 13th International Conference on Quality Software, 2013. 260–267

  296. Shu T, Ye T, Ding Z, et al. Fault localization based on statement frequency. Inf Sci, 2016, 360: 43–56

  297. Feyzi F, Parsa S. Inforence: effective fault localization based on information-theoretic analysis and statistical causal inference. Front Comput Sci, 2019, 13: 735–759

  298. Madeiral F, Urli S, Maia M, et al. BEARS: an extensible Java bug benchmark for automatic program repair studies. In: Proceedings of IEEE 26th International Conference on Software Analysis, Evolution and Reengineering, 2019. 468–478

  299. Saha R K, Lyu Y, Lam W, et al. Bugs.jar: a large-scale, diverse dataset of real-world Java bugs. In: Proceedings of the 15th International Conference on Mining Software Repositories, 2018. 10–13

  300. Song Y, Xie X, Liu Q, et al. A comprehensive empirical investigation on failure clustering in parallel debugging. J Syst Software, 2022, 193: 111452

  301. Song Y, Xie X, Zhang X, et al. Evolving ranking-based failure proximities for better clustering in fault isolation. In: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, 2022

  302. Chen T, Cheung S, Yiu S. Metamorphic Testing: A New Approach for Generating Next Test Cases. Technical Report HKUST-CS98-01. Hong Kong University of Science and Technology, 1998

  303. Xie X, Ho J W K, Murphy C, et al. Testing and validating machine learning classifiers by metamorphic testing. J Syst Software, 2011, 84: 544–558

  304. Xie X, Zhang Z, Chen T Y, et al. METTLE: a METamorphic testing approach to assessing and validating unsupervised machine learning systems. IEEE Trans Rel, 2020, 69: 1293–1322

  305. Xie X, Ho J, Murphy C, et al. Application of metamorphic testing to supervised classifiers. In: Proceedings of the 9th International Conference on Quality Software, 2009. 135–144

  306. Chen S, Jin S, Xie X. Testing your question answering software via asking recursively. In: Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering, 2021. 104–116

  307. Grottke M, Trivedi K S. A classification of software faults. J Reliab Engin Assoc Japan, 2005, 27: 425–438

Acknowledgements

This work was partially supported by National Natural Science Foundation of China (Grant Nos. 62250610224, 61972289, 61832009). We sincerely appreciate the valuable suggestions from the anonymous reviewers for our paper.

Author information

Corresponding authors

Correspondence to Xiaoyuan Xie or Baowen Xu.

About this article

Cite this article

Song, Y., Xie, X. & Xu, B. When debugging encounters artificial intelligence: state of the art and open challenges. Sci. China Inf. Sci. 67, 141101 (2024). https://doi.org/10.1007/s11432-022-3803-9

  • DOI: https://doi.org/10.1007/s11432-022-3803-9
