research-article

Exploring Better Black-Box Test Case Prioritization via Log Analysis

Published: 26 April 2023

Abstract

Test case prioritization (TCP), which has been widely studied in regression testing, aims to optimize the execution order of test cases so as to detect more faults earlier. TCP techniques are divided into white-box test case prioritization (WTCP) and black-box test case prioritization (BTCP). WTCP can achieve better prioritization effectiveness by utilizing source code information, but it is not applicable in many practical scenarios where source code is unavailable (e.g., outsourced testing). BTCP does not rely on source code information, but it tends to be less effective than WTCP. That is, both WTCP and BTCP suffer from limitations in practical use.
To improve the practicability of TCP, we aim to explore better BTCP that significantly bridges the effectiveness gap between BTCP and WTCP. In this work, instead of statically analyzing the test cases themselves, as existing BTCP techniques do, we conduct the first study to explore whether this goal can be achieved via log analysis. Specifically, we propose to mine the test logs produced during test execution so as to reflect test behaviors more fully, and we design a new BTCP framework (called LogTCP) that includes log pre-processing, log representation, and test case prioritization components. Based on the LogTCP framework, we instantiate seven log-based BTCP techniques by combining different log representation strategies with different prioritization strategies.
We conduct an empirical study to explore the effectiveness of LogTCP. Based on 10 diverse open-source Java projects from GitHub, we compare LogTCP with three representative BTCP techniques and four representative WTCP techniques. Our results show that all of our LogTCP techniques largely perform better than all the compared BTCP techniques in terms of average fault detection, to the extent that they become competitive with the WTCP techniques. This demonstrates the great potential of logs in practical TCP.
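
The three components named in the abstract (log pre-processing, log representation, and test case prioritization) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `preprocess`, `represent`, `distance`, and `prioritize` are hypothetical, the token-count representation is only one possible representation strategy, and the greedy farthest-first ordering is an assumed, illustrative prioritization strategy rather than one confirmed by the abstract.

```python
import math
import re
from collections import Counter


def preprocess(log_text):
    """Lowercase a test log, mask volatile values, and split it into tokens."""
    text = log_text.lower()
    # Mask hex values and numbers so that run-specific values do not dominate.
    text = re.sub(r"0x[0-9a-f]+|\d+", "<num>", text)
    return re.findall(r"<num>|[a-z_]+", text)


def represent(tokens):
    """Token-count representation of one log (an illustrative assumption)."""
    return Counter(tokens)


def distance(a, b):
    """Euclidean distance between two sparse count vectors."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in keys))


def prioritize(logs):
    """Return test names ordered so each next test is farthest from those chosen.

    `logs` maps a test name to the raw log text produced by running that test.
    """
    vectors = {name: represent(preprocess(text)) for name, text in logs.items()}
    remaining = sorted(vectors)
    # Seed with the test whose log contains the most tokens (ties: by name).
    first = max(remaining, key=lambda n: sum(vectors[n].values()))
    order = [first]
    remaining.remove(first)
    while remaining:
        # Pick the test whose nearest already-selected test is farthest away.
        nxt = max(
            remaining,
            key=lambda n: min(distance(vectors[n], vectors[s]) for s in order),
        )
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

In this sketch, calling `prioritize` on a mapping from test names to log text returns the test names in execution order. A test whose log closely resembles that of an already-selected test is deferred, on the assumption that behaviorally different tests are more likely to expose different faults.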

Published In

ACM Transactions on Software Engineering and Methodology, Volume 32, Issue 3
May 2023
937 pages
ISSN: 1049-331X
EISSN: 1557-7392
DOI: 10.1145/3594533
Editor: Mauro Pezzè

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 April 2023
Online AM: 28 October 2022
Accepted: 12 October 2022
Revised: 17 September 2022
Received: 04 January 2022
Published in TOSEM Volume 32, Issue 3

Author Tags

  1. Test case prioritization
  2. log analysis
  3. regression testing

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • Fund projects in the technical field of the foundation strengthening plan
