Abstract
Software debugging and artificial intelligence (AI) are both active research topics in software engineering. Debugging, which comprises fault localization and program repair, is an important part of the software development lifecycle for ensuring the quality of software systems. As the scale and complexity of software systems grow, developers seek to improve the effectiveness and efficiency of software debugging with artificial intelligence (artificial intelligence for software debugging, AI4SD). Conversely, many AI models are being integrated into safety-critical areas such as autonomous driving, image recognition, and audio processing, where software debugging is both necessary and urgent (software debugging for artificial intelligence, SD4AI). An AI-enhanced debugging technique could help debug AI systems more effectively, and a more robust and reliable AI approach could in turn better support debugging techniques. It is therefore important to consider AI4SD and SD4AI together. In this paper, we show readers the path, the trend, and the potential of the interaction between these two directions. We select and review a total of 165 papers in AI4SD and SD4AI to answer three research questions, analyze the opportunities and challenges, and suggest future directions for this cross-cutting area.
Zhong H, Mei H. Learning a graph-based classifier for fault localization. Sci China Inf Sci, 2020, 63: 162101
Jonsson L, Broman D, Magnusson M, et al. Automatic localization of bugs to faulty components in large scale software systems using Bayesian classification. In: Proceedings of IEEE International Conference on Software Quality, Reliability and Security, 2016. 423–430
Huang Q, Lo D, Xia X, et al. Which packages would be affected by this bug report? In: Proceedings of IEEE 28th International Symposium on Software Reliability Engineering, 2017. 124–135
Le T D B, Thung F, Lo D. Will this localization tool be effective for this bug? Mitigating the impact of unreliability of information retrieval based bug localization tools. Empir Software Eng, 2017, 22: 2237–2279
Li Z, Jiang Z, Chen X, et al. Laprob: a label propagation-based software bug localization method. Inf Software Tech, 2021, 130: 106410
Rahman M M, Roy C K. Improving IR-based bug localization with context-aware query reformulation. In: Proceedings of the 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2018. 621–632
Li X, Wong W E, Gao R, et al. Genetic algorithm-based test generation for software product line with the integration of fault localization techniques. Empir Software Eng, 2018, 23: 1–51
Chatterjee P, Chatterjee A, Campos J, et al. Diagnosing software faults using multiverse analysis. In: Proceedings of the 29th International Joint Conference on Artificial Intelligence, 2020. 1629–1635
Elmishali A, Stern R, Kalech M. An artificial intelligence paradigm for troubleshooting software bugs. Eng Appl Artif Intelligence, 2018, 69: 147–156
Liu B, Nejati S, Lucia S, et al. Effective fault localization of automotive Simulink models: achieving the trade-off between test oracle effort and fault localization accuracy. Empir Software Eng, 2019, 24: 444–490
Zhang Z, Lei Y, Mao X, et al. Improving deep-learning-based fault localization with resampling. J Software Evolu Process, 2021, 33: e2312
Japkowicz N, Stephen S. The class imbalance problem: a systematic study. Intell Data Anal, 2002, 6: 429–449
Graves A, Mohamed A R, Hinton G. Speech recognition with deep recurrent neural networks. In: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, 2013. 6645–6649
Jarrett K, Kavukcuoglu K, Ranzato M, et al. What is the best multi-stage architecture for object recognition? In: Proceedings of IEEE 12th International Conference on Computer Vision, 2009. 2146–2153
Xia X, Gong L, Le T D B, et al. Diversity maximization speedup for localizing faults in single-fault and multi-fault programs. Autom Softw Eng, 2016, 23: 43–75
Liu Y, Li M, Wu Y, et al. A weighted fuzzy classification approach to identify and manipulate coincidental correct test cases for fault localization. J Syst Software, 2019, 151: 20–37
Zhang M, Li Y, Li X, et al. An empirical study of boosting spectrum-based fault localization via PageRank. IEEE Trans Software Eng, 2021, 47: 1089–1113
Chen J, Ma H, Zhang L. Enhanced compiler bug isolation via memoized search. In: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, 2020. 78–89
Zhang X Y, Zheng Z, Cai K Y. Exploring the usefulness of unlabelled test cases in software fault localization. J Syst Software, 2018, 136: 278–290
Gupta R, Kanade A, Shevade S. Neural attribution for semantic bug-localization in student programs. In: Proceedings of Advances in Neural Information Processing Systems, 2019. 32
Sundararajan M, Taly A, Yan Q. Axiomatic attribution for deep networks. In: Proceedings of International Conference on Machine Learning, 2017. 3319–3328
He J, Xu L, Yan M, et al. Duplicate bug report detection using dual-channel convolutional neural networks. In: Proceedings of the 28th International Conference on Program Comprehension, 2020. 117–127
Ni Z, Li B, Sun X, et al. Analyzing bug fix for automatic bug cause classification. J Syst Software, 2020, 163: 110538
Yan X B, Liu B, Wang S H. A test restoration method based on genetic algorithm for effective fault localization in multiple-fault programs. J Syst Software, 2021, 172: 110861
Zheng Y, Wang Z, Fan X, et al. Localizing multiple software faults based on evolution algorithm. J Syst Software, 2018, 139: 107–123
Gao M, Li P, Chen C, et al. Research on software multiple fault localization method based on machine learning. In: Proceedings of MATEC Web of Conferences: volume 232, 2018. 01060
Behera R K, Shukla S, Rath S K, et al. Software reliability assessment using machine learning technique. In: Proceedings of International Conference on Computational Science and Its Applications. Springer, 2018. 403–411
Li Z, Chen T H, Shang W. Where shall we log? Studying and suggesting logging locations in code blocks. In: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, 2020. 361–372
Vasic M, Kanade A, Maniatis P, et al. Neural program repair by jointly learning to localize and repair. 2019. ArXiv:1904.01720
Chappell T, Cifuentes C, Krishnan P, et al. Machine learning for finding bugs: an initial report. In: Proceedings of IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation, 2017. 21–26
Yang X, Yu Z, Wang J, et al. Understanding static code warnings: an incremental AI approach. Expert Syst Appl, 2021, 167: 114134
Lin Y, Sun J, Tran L, et al. Break the dead end of dynamic slicing: localizing data and control omission bug. In: Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, 2018. 509–519
Yu X, Liu J, Yang Z, et al. The Bayesian network based program dependence graph and its application to fault localization. J Syst Software, 2017, 134: 44–53
Hofer B, Nica I, Wotawa F. AI for localizing faults in spreadsheets. In: Proceedings of IFIP International Conference on Testing Software and Systems, 2017. 71–87
Terra-Neves M, Machado N, Lynce I, et al. Concurrency debugging with MaxSMT. In: Proceedings of the 33rd AAAI Conference on Artificial Intelligence, 2019. 1608–1616
Mesbah A, Rice A, Johnston E, et al. DeepDelta: learning to repair compilation errors. In: Proceedings of the 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2019. 925–936
Zou W, Lo D, Kochhar P S, et al. Smart contract development: challenges and opportunities. IEEE Trans Software Eng, 2021, 47: 2084–2106
Yu X L, Al-Bataineh O, Lo D, et al. Smart contract repair. ACM Trans Softw Eng Methodol, 2020, 29: 1–32
Yuan Y, Banzhaf W. ARJA: automated repair of Java programs via multi-objective genetic programming. IEEE Trans Software Eng, 2018, 46: 1040–1067
Yuan Y, Banzhaf W. Toward better evolutionary program repair. ACM Trans Softw Eng Methodol, 2020, 29: 1–53
Oliveira V P L, Souza E F, Goues C L, et al. Improved representation and genetic operators for linear genetic programming for automated program repair. Empir Software Eng, 2018, 23: 2980–3006
Lee J, Song D, So S, et al. Automatic diagnosis and correction of logical errors for functional programming assignments. Proc ACM Program Lang, 2018, 2: 1–30
Machado N, Quinta D, Lucia B, et al. Concurrency debugging with differential schedule projections. ACM Trans Softw Eng Methodol, 2016, 25: 1–37
Pan R, Hu Q, Xu G, et al. Automatic repair of regular expressions. Proc ACM Program Lang, 2019, 3: 1–29
Koyuncu A, Liu K, Bissyande T F, et al. FixMiner: mining relevant fix patterns for automated program repair. Empir Software Eng, 2020, 25: 1980–2024
Gulwani S, Radiček I, Zuleger F. Automated clustering and program repair for introductory programming assignments. SIGPLAN Not, 2018, 53: 465–480
Falleri J R, Morandat F, Blanc X, et al. Fine-grained and accurate source code differencing. In: Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering, 2014. 313–324
Sakkas G, Endres M, Cosman B, et al. Type error feedback via analytic program repair. In: Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation, 2020. 16–30
White M, Tufano M, Martinez M, et al. Sorting and transforming program repair ingredients via deep learning code similarities. In: Proceedings of IEEE 26th International Conference on Software Analysis, Evolution and Reengineering, 2019. 479–490
Yi X, Chen L, Mao X, et al. Efficient automated repair of high floating-point errors in numerical libraries. Proc ACM Program Lang, 2019, 3: 1–29
Jiang J, Xiong Y, Zhang H, et al. Shaping program repair space with existing patches and similar code. In: Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2018. 298–309
Jiang N, Lutellier T, Tan L. CURE: code-aware neural machine translation for automatic program repair. In: Proceedings of IEEE/ACM 43rd International Conference on Software Engineering, 2021. 1161–1173
Koyuncu A, Liu K, Bissyande T F, et al. iFixR: bug report driven program repair. In: Proceedings of the 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2019. 314–325
Zhu Q, Sun Z, Xiao Y A, et al. A syntax-guided edit decoder for neural program repair. In: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2021. 341–353
Sun Z, Zhu Q, Xiong Y, et al. TreeGen: a tree-based transformer architecture for code generation. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence, 2020. 8984–8991
Shariffdeen R, Noller Y, Grunske L, et al. Concolic program repair. In: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2021. 390–405
Lee J, Hong S, Oh H. MemFix: static analysis-based repair of memory deallocation errors for C. In: Proceedings of the 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2018. 95–106
Li Y, Wang S, Nguyen T N. DLFix: context-based code transformation learning for automated program repair. In: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, 2020. 602–614
Wen M, Chen J, Wu R, et al. Context-aware patch generation for better automated program repair. In: Proceedings of IEEE/ACM 40th International Conference on Software Engineering, 2018. 1–11
Wang S, Wen M, Lin B, et al. Automated patch correctness assessment: how far are we? In: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, 2020. 968–980
Ziarko W, Shan N. Machine learning through data classification and reduction. Fundamenta Informaticae, 1997, 30: 373–382
Patil T R, Sherekar S S. Performance analysis of Naive Bayes and J48 classification algorithm for data classification. Int J Comput Sci Appl, 2013, 6: 256–261
Kleinbaum D G, Dietz K, Gail M, et al. Logistic Regression. Berlin: Springer, 2002
Arcuri A, Briand L. A practical guide for using statistical tests to assess randomized algorithms in software engineering. In: Proceedings of the 33rd International Conference on Software Engineering, 2011. 1–10
Platt J. Sequential minimal optimization: a fast algorithm for training support vector machines. 1998. https://www.microsoft.com/en-us/research/publication/sequential-minimal-optimization-a-fast-algorithm-for-training-support-vector-machines/
Xiong Y, Liu X, Zeng M, et al. Identifying patch correctness in test-based program repair. In: Proceedings of the 40th International Conference on Software Engineering, 2018. 789–799
Liang J, Ji R, Jiang J, et al. Interactive patch filtering as debugging aid. In: Proceedings of IEEE International Conference on Software Maintenance and Evolution, 2021. 239–250
Saha R K, Lyu Y, Yoshida H, et al. Elixir: effective object-oriented program repair. In: Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering, 2017. 648–659
Tan S H, Yoshida H, Prasad M R, et al. Anti-patterns in search-based program repair. In: Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 2016. 727–738
Long F, Rinard M. Staged program repair with condition synthesis. In: Proceedings of the 10th Joint Meeting on Foundations of Software Engineering, 2015. 166–178
Le X B D, Thung F, Lo D, et al. Overfitting in semantics-based automated program repair. Empir Software Eng, 2018, 23: 3007–3033
Yasunaga M, Liang P. Graph-based, self-supervised program repair from diagnostic feedback. In: Proceedings of International Conference on Machine Learning, 2020. 10799–10808
Wang K, Singh R, Su Z. Dynamic neural program embedding for program repair. 2017. ArXiv:1711.07163
Gupta R, Kanade A, Shevade S. Deep reinforcement learning for syntactic error repair in student programs. In: Proceedings of the 33rd AAAI Conference on Artificial Intelligence, 2019. 930–937
Traver V J. On compiler error messages: what they say and what they mean. Adv Hum-Comput Interaction, 2010, 2010: 1–26
Allamanis M, Brockschmidt M, Khademi M. Learning to represent programs with graphs. In: Proceedings of International Conference on Learning Representations, 2018
Gember-Jacobson A, Akella A, Mahajan R, et al. Automatically repairing network control planes using an abstract representation. In: Proceedings of the 26th Symposium on Operating Systems Principles, 2017. 359–373
Dinella E, Dai H, Li Z, et al. Hoppity: learning graph transformations to detect and fix bugs in programs. In: Proceedings of International Conference on Learning Representations, 2020
Gupta K, Christensen P E, Chen X, et al. Synthesize, execute and debug: learning to repair for neural program synthesis. 2020. ArXiv:2007.08095
Tian H, Liu K, Kaboré A K, et al. Evaluating representation learning of code changes for predicting patch correctness in program repair. In: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, 2020. 981–992
Yang G, Min K, Lee B. Applying deep learning algorithm to automatic bug localization and repair. In: Proceedings of the 35th Annual ACM Symposium on Applied Computing, 2020. 1634–1641
Lourenço R, Freire J, Shasha D. BugDoc: algorithms to debug computational processes. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, 2020. 463–478
Pham H V, Lutellier T, Qi W, et al. CRADLE: cross-backend validation to detect and localize bugs in deep learning libraries. In: Proceedings of IEEE/ACM 41st International Conference on Software Engineering, 2019. 1027–1038
Wardat M, Le W, Rajan H. DeepLocalize: fault localization for deep neural networks. In: Proceedings of IEEE/ACM 43rd International Conference on Software Engineering, 2021. 251–262
Dolby J, Shinnar A, Allain A, et al. Ariadne: analysis for machine learning programs. In: Proceedings of the 2nd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages, 2018. 1–10
Cheng D, Cao C, Xu C, et al. Manifesting bugs in machine learning code: an explorative study with mutation testing. In: Proceedings of IEEE International Conference on Software Quality, Reliability and Security, 2018. 313–324
Wu X, Zheng W, Xia X, et al. Data quality matters: a case study on data label correctness for security bug report prediction. IEEE Trans Software Eng, 2022, 48: 2541–2556
Tao G, Ma S, Liu Y, et al. TRADER: trace divergence analysis and embedding regulation for debugging recurrent neural networks. In: Proceedings of IEEE/ACM 42nd International Conference on Software Engineering, 2020. 986–998
Kim E, Gopinath D, Pasareanu C, et al. A programmatic and semantic approach to explaining and debugging neural network based object detectors. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020. 11128–11137
Sotoudeh M, Thakur A V. Provable repair of deep neural networks. In: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2021. 588–603
Song K, Tan X, Lu J. Neural machine translation with error correction. 2020. ArXiv:2007.10681
Zhang Y, Ren L, Chen L, et al. Detecting numerical bugs in neural network architectures. In: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2020. 826–837
Schoop E, Huang F, Hartmann B. UMLAUT: debugging deep learning programs using program structure and model behavior. In: Proceedings of the CHI Conference on Human Factors in Computing Systems, 2021. 1–16
Zhang X, Zhai J, Ma S, et al. AUTOTRAINER: an automatic DNN training problem detection and repair system. In: Proceedings of IEEE/ACM 43rd International Conference on Software Engineering, 2021. 359–371
Sun Z, Zhang J M, Harman M, et al. Automatic testing and improvement of machine translation. In: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, 2020. 974–985
Jebnoun H, Braiek H B, Rahman M M, et al. The scent of deep learning code: an empirical study. In: Proceedings of the 17th International Conference on Mining Software Repositories, 2020. 420–430
Fan Y, Xia X, Lo D, et al. What makes a popular academic AI repository? Empir Software Eng, 2021, 26: 2
Liu J, Huang Q, Xia X, et al. Is using deep learning frameworks free? Characterizing technical debt in deep learning frameworks. In: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Software Engineering in Society, 2020. 1–10
Liu J, Huang Q, Xia X, et al. An exploratory study on the introduction and removal of different types of technical debt in deep learning frameworks. Empir Software Eng, 2021, 26: 16
Han J, Deng S, Lo D, et al. An empirical study of the dependency networks of deep learning libraries. In: Proceedings of IEEE International Conference on Software Maintenance and Evolution, 2020. 868–878
Sun X, Zhou T, Li G, et al. An empirical study on real bugs for machine learning programs. In: Proceedings of the 24th Asia-Pacific Software Engineering Conference, 2017. 348–357
Zhang R, Xiao W, Zhang H, et al. An empirical study on program failures of deep learning jobs. In: Proceedings of IEEE/ACM 42nd International Conference on Software Engineering, 2020. 1159–1170
Jia L, Zhong H, Wang X, et al. An empirical study on bugs inside TensorFlow. In: Proceedings of International Conference on Database Systems for Advanced Applications. Berlin: Springer, 2020. 604–620
Garcia J, Feng Y, Shen J, et al. A comprehensive study of autonomous vehicle bugs. In: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, 2020. 385–396
Chen Z, Yao H, Lou Y, et al. An empirical study on deployment faults of deep learning based mobile applications. In: Proceedings of IEEE/ACM 43rd International Conference on Software Engineering, 2021. 674–685
Just R, Jalali D, Ernst M D. Defects4J: a database of existing faults to enable controlled testing studies for Java programs. In: Proceedings of the International Symposium on Software Testing and Analysis, 2014. 437–440
Abreu R, Zoeteweij P, van Gemund A J. An evaluation of similarity coefficients for software fault localization. In: Proceedings of the 12th Pacific Rim International Symposium on Dependable Computing, 2006. 39–46
Wong W E, Qi Y, Zhao L, et al. Effective fault localization using code coverage. In: Proceedings of the 31st Annual International Computer Software and Applications Conference, 2007. 449–456
Rao P, Zheng Z, Chen T Y, et al. Impacts of test suite’s class imbalance on spectrum-based fault localization techniques. In: Proceedings of the 13th International Conference on Quality Software, 2013. 260–267
Shu T, Ye T, Ding Z, et al. Fault localization based on statement frequency. Inf Sci, 2016, 360: 43–56
Feyzi F, Parsa S. Inforence: effective fault localization based on information-theoretic analysis and statistical causal inference. Front Comput Sci, 2019, 13: 735–759
Madeiral F, Urli S, Maia M, et al. BEARS: an extensible Java bug benchmark for automatic program repair studies. In: Proceedings of IEEE 26th International Conference on Software Analysis, Evolution and Reengineering, 2019. 468–478
Saha R K, Lyu Y, Lam W, et al. Bugs.jar: a large-scale, diverse dataset of real-world Java bugs. In: Proceedings of the 15th International Conference on Mining Software Repositories, 2018. 10–13
Song Y, Xie X, Liu Q, et al. A comprehensive empirical investigation on failure clustering in parallel debugging. J Syst Software, 2022, 193: 111452
Song Y, Xie X, Zhang X, et al. Evolving ranking-based failure proximities for better clustering in fault isolation. In: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, 2022
Chen T, Cheung S, Yiu S. Metamorphic Testing: A New Approach for Generating Next Test Cases. Technical Report HKUST-CS98-01. Hong Kong University of Science and Technology, 1998
Xie X, Ho J W K, Murphy C, et al. Testing and validating machine learning classifiers by metamorphic testing. J Syst Software, 2011, 84: 544–558
Xie X, Zhang Z, Chen T Y, et al. METTLE: a METamorphic testing approach to assessing and validating unsupervised machine learning systems. IEEE Trans Rel, 2020, 69: 1293–1322
Xie X, Ho J, Murphy C, et al. Application of metamorphic testing to supervised classifiers. In: Proceedings of the 9th International Conference on Quality Software, 2009. 135–144
Chen S, Jin S, Xie X. Testing your question answering software via asking recursively. In: Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering, 2021. 104–116
Grottke M, Trivedi K S. A classification of software faults. J Reliab Engin Assoc Japan, 2005, 27: 425–438
Acknowledgements
This work was partially supported by the National Natural Science Foundation of China (Grant Nos. 62250610224, 61972289, 61832009). We sincerely appreciate the valuable suggestions from the anonymous reviewers of our paper.
Cite this article
Song, Y., Xie, X. & Xu, B. When debugging encounters artificial intelligence: state of the art and open challenges. Sci. China Inf. Sci. 67, 141101 (2024). https://doi.org/10.1007/s11432-022-3803-9
DOI: https://doi.org/10.1007/s11432-022-3803-9