Influential Global and Local Contexts Guided Trace Representation for Fault Localization

Published: 26 April 2023

Abstract

Trace data is critical for fault localization (FL), which analyzes suspicious statements potentially responsible for a failure. However, existing trace representations are limited in two main respects: (1) the trace information of a statement is restricted to a local context (i.e., a single test case) and ignores the global context (i.e., all test cases of a test suite); (2) they represent a statement only by its ‘occurrence’, which carries weak FL semantics.
Thus, we propose UNITE: an inflUential coNtext-GuIded Trace rEpresentation, which represents the trace from both global and local contexts with influential semantics for FL. UNITE embodies two key ideas: (1) it leverages the widely used information-retrieval weighting over local and global contexts to reflect how important a statement (a word) is to a test case (a document) among all test cases of a test suite (a collection), where a single test case and the whole test suite represent the local and global contexts, respectively; (2) it further elaborates the trace representation from ‘occurrence’ (weak semantics) to ‘influence’ (strong semantics) by combining program dependencies. Large-scale experiments on 12 FL techniques and 20 programs show that UNITE significantly improves FL effectiveness.
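
The first idea (statement as word, test case as document, test suite as collection) can be illustrated with a textbook TF-IDF weighting over an execution-count matrix. This is a minimal sketch under stated assumptions, not UNITE's actual formula, which the abstract does not give: the function name `tfidf_trace`, the input layout `counts[t][s]`, and the particular TF and IDF variants are all hypothetical choices made for illustration. The dependence-based ‘influence’ step is not modeled here.

```python
import math
from typing import List


def tfidf_trace(counts: List[List[int]]) -> List[List[float]]:
    """Weight each statement per test case with plain TF-IDF.

    counts[t][s] is assumed to hold how many times statement s was
    executed by test case t (a hypothetical input layout).
    """
    n_tests = len(counts)
    n_stmts = len(counts[0]) if counts else 0

    # Global context: number of test cases that execute each statement.
    doc_freq = [
        sum(1 for t in range(n_tests) if counts[t][s] > 0)
        for s in range(n_stmts)
    ]

    weights: List[List[float]] = []
    for row in counts:
        total = sum(row)  # all statement executions in this test case
        weighted_row = []
        for s, c in enumerate(row):
            if c == 0 or total == 0:
                weighted_row.append(0.0)
                continue
            tf = c / total                         # local context
            idf = math.log(n_tests / doc_freq[s])  # global context
            weighted_row.append(tf * idf)
        weights.append(weighted_row)
    return weights


# Three test cases (rows) over three statements (columns).
counts = [
    [1, 2, 0],
    [1, 0, 1],
    [1, 0, 0],
]
weights = tfidf_trace(counts)
for row in weights:
    print(row)
```

In this sketch a statement executed by every test case receives weight zero, because its inverse document frequency is log(1) = 0, while a statement executed by only a few test cases is weighted up. Whether that behavior suits a given FL technique is a design choice; smoothed IDF variants avoid the zero.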


    Information

    Published In

    ACM Transactions on Software Engineering and Methodology, Volume 32, Issue 3
    May 2023
    937 pages
    ISSN: 1049-331X
    EISSN: 1557-7392
    DOI: 10.1145/3594533
    Editor: Mauro Pezzè

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 26 April 2023
    Online AM: 15 December 2022
    Accepted: 26 November 2022
    Revised: 10 October 2022
    Received: 18 February 2022
    Published in TOSEM Volume 32, Issue 3

    Author Tags

    1. Fault localization
    2. trace representation
    3. statement weighting
    4. program dependence
    5. suspiciousness

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China
    • Fundamental Research Funds for the Central Universities
    • National Defense Basic Scientific Research Project
    • Major Key Project of PCL

    Article Metrics

    • Downloads (last 12 months): 186
    • Downloads (last 6 weeks): 16
    Reflects downloads up to 13 Feb 2025

    Cited By

    • (2025) CG-FL: A data augmentation approach using context-aware genetic algorithm for fault localization. Journal of Systems and Software 222 (112359). DOI: 10.1016/j.jss.2025.112359. Online publication date: Apr-2025
    • (2024) Knowledge-Augmented Mutation-Based Bug Localization for Hardware Design Code. ACM Transactions on Architecture and Code Optimization. DOI: 10.1145/3660526. Online publication date: 22-Apr-2024
    • (2024) Advanced White-Box Heuristics for Search-Based Fuzzing of REST APIs. ACM Transactions on Software Engineering and Methodology 33:6 (1-36). DOI: 10.1145/3652157. Online publication date: 27-Jun-2024
    • (2024) Statement Types and Error Rates: How are they Related for Boosting Fault Localization? 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering - Companion (SANER-C) (1-8). DOI: 10.1109/SANER-C62648.2024.00031. Online publication date: 12-Mar-2024
    • (2024) Improving effort-aware defect prediction by directly learning to rank software modules. Information and Software Technology 165:C. DOI: 10.1016/j.infsof.2023.107250. Online publication date: 1-Jan-2024
    • (2024) Model-domain failing test augmentation with Generative Adversarial Networks. Expert Systems with Applications: An International Journal 238:PE. DOI: 10.1016/j.eswa.2023.121901. Online publication date: 27-Feb-2024
    • (2024) A multi-objective effort-aware defect prediction approach based on NSGA-II. Applied Soft Computing 149:PA. DOI: 10.1016/j.asoc.2023.110941. Online publication date: 1-Feb-2024
    • (2024) A Systematic Exploration of Mutation‐Based Fault Localization Formulae. Software Testing, Verification and Reliability. DOI: 10.1002/stvr.1905. Online publication date: 11-Nov-2024
    • (2023) Contrastive Coincidental Correctness Representation Learning. 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE) (252-263). DOI: 10.1109/ISSRE59848.2023.00074. Online publication date: 9-Oct-2023
    • (2023) Revisiting ‘revisiting supervised methods for effort‐aware cross‐project defect prediction’. IET Software 17:4 (472-495). DOI: 10.1049/sfw2.12133. Online publication date: 27-Jun-2023
