Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3650212.3680364acmconferencesArticle/Chapter ViewAbstractPublication PagesisstaConference Proceedingsconference-collections
research-article

Towards More Complete Constraints for Deep Learning Library Testing via Complementary Set Guided Refinement

Published: 11 September 2024 Publication History

Abstract

Deep learning library is important in AI systems. Recently, many works have been proposed to ensure its reliability. They often model inputs of tensor operations as constraints to guide the generation of test cases. However, these constraints may narrow the search space, resulting in incomplete testing. This paper introduces a complementary set-guided refinement that can enhance the completeness of constraints. The basic idea is to see if the complementary set of constraints yields valid test cases. If so, the original constraint is incomplete and needs refinement. Based on this idea, we design an automatic constraint refinement tool, DeepConstr, which adopts a genetic algorithm to refine constraints for better completeness. We evaluated it on two DL libraries, PyTorch and TensorFlow. DeepConstr discovered 84 unknown bugs, out of which 72 were confirmed, with 51 fixed. Compared to state-of-the-art fuzzers, DeepConstr increased coverage for 43.44% of operators supported by NNSmith, and 59.16% of operators supported by NeuRI.

References

[1]
2022. GCOV. https://gcc.gnu.org/onlinedocs/gcc/Gcov.html
[2]
2022. Source-based Code Coverage — Clang 15.0.0 documentation. https://releases.llvm.org/15.0.0/tools/clang/docs/SourceBasedCodeCoverage.html
[3]
2024. Artifact for DeepConstr. https://doi.org/10.5281/zenodo.12669927
[4]
Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, and Michael Isard. 2016. $TensorFlow$: a system for $Large-Scale$ machine learning. In 12th USENIX symposium on operating systems design and implementation (OSDI 16). 265–283.
[5]
Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, and Shyamal Anadkat. 2023. Gpt-4 technical report. arXiv preprint arXiv:2303.08774.
[6]
Marcel Böhme, Cristian Cadar, and Abhik Roychoudhury. 2020. Fuzzing: Challenges and reflections. IEEE Software, 38, 3 (2020), 79–86.
[7]
Leonardo De Moura and Nikolaj Bjørner. 2008. Z3: An efficient SMT solver. In International conference on Tools and Algorithms for the Construction and Analysis of Systems. 337–340.
[8]
Yinlin Deng, Chunqiu Steven Xia, Haoran Peng, Chenyuan Yang, and Lingming Zhang. 2023. Large language models are zero-shot fuzzers: Fuzzing deep-learning libraries via large language models. In Proceedings of the 32nd ACM SIGSOFT international symposium on software testing and analysis. 423–435.
[9]
Yinlin Deng, Chunqiu Steven Xia, Chenyuan Yang, Shizhuo Dylan Zhang, Shujing Yang, and Lingming Zhang. 2024. Large language models are edge-case generators: Crafting unusual programs for fuzzing deep learning libraries. In Proceedings of the 46th IEEE/ACM International Conference on Software Engineering. 1–13.
[10]
Yinlin Deng, Chenyuan Yang, Anjiang Wei, and Lingming Zhang. 2022. Fuzzing deep-learning libraries via automated relational api inference. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 44–56.
[11]
Adeel Ehsan, Mohammed Ahmad M. E. Abuhaliqa, Cagatay Catal, and Deepti Mishra. 2022. RESTful API Testing Methodologies: Rationale, Challenges, and Solution Directions. Applied Sciences, 12, 9 (2022), issn:2076-3417 https://doi.org/10.3390/app12094369
[12]
Luciano Floridi and Massimo Chiriatti. 2020. GPT-3: Its nature, scope, limits, and consequences. Minds and Machines, 30 (2020), 681–694.
[13]
Jingzhou Fu, Jie Liang, Zhiyong Wu, and Yu Jiang. 2024. Sedar: Obtaining High-Quality Seeds for DBMS Fuzzing via Cross-DBMS SQL Transfer. In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering. 1–12.
[14]
Patrice Godefroid, Bo-Yuan Huang, and Marina Polishchuk. 2020. Intelligent REST API Data Fuzzing. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2020). Association for Computing Machinery, New York, NY, USA. 725–736. isbn:9781450370431 https://doi.org/10.1145/3368089.3409719
[15]
Jiazhen Gu, Xuchuan Luo, Yangfan Zhou, and Xin Wang. 2022. Muffin: Testing Deep Learning Libraries via Neural Architecture Fuzzing. In Proceedings of the 44th International Conference on Software Engineering (ICSE ’22). Association for Computing Machinery, New York, NY, USA. 1418–1430. isbn:9781450392211 https://doi.org/10.1145/3510003.3510092
[16]
Jianmin Guo, Yu Jiang, Yue Zhao, Quan Chen, and Jiaguang Sun. 2018. Dlfuzz: Differential fuzzing testing of deep learning systems. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 739–743.
[17]
Qianyu Guo, Xiaofei Xie, Yi Li, Xiaoyu Zhang, Yang Liu, Xiaohong Li, and Chao Shen. 2020. Audee: Automated Testing for Deep Learning Frameworks. In 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE). 486–498.
[18]
Charles R. Harris, K. Jarrod Millman, Stéfan J. van der Walt, Ralf Gommers, Pauli Virtanen, David Cournapeau, Eric Wieser, Julian Taylor, Sebastian Berg, Nathaniel J. Smith, Robert Kern, Matti Picus, Stephan Hoyer, Marten H. van Kerkwijk, Matthew Brett, Allan Haldane, Jaime Fernández del Río, Mark Wiebe, Pearu Peterson, Pierre Gérard-Marchant, Kevin Sheppard, Tyler Reddy, Warren Weckesser, Hameer Abbasi, Christoph Gohlke, and Travis E. Oliphant. 2020. Array programming with NumPy. Nature, 585, 7825 (2020), Sept., 357–362. https://doi.org/10.1038/s41586-020-2649-2
[19]
John H Holland. 1992. Genetic algorithms. Scientific american, 267, 1 (1992), 66–73.
[20]
Yu Jiang, Jie Liang, Fuchen Ma, Yuanliang Chen, Chijin Zhou, Yuheng Shen, Zhiyong Wu, Jingzhou Fu, Mingzhe Wang, and Shanshan Li. 2024. When Fuzzing Meets LLMs: Challenges and Opportunities. In Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering. 492–496.
[21]
Thijs Klooster, Fatih Turkmen, Gerben Broenink, Ruben Ten Hove, and Marcel Böhme. 2023. Continuous fuzzing: a study of the effectiveness and scalability of fuzzing in CI/CD pipelines. In 2023 IEEE/ACM International Workshop on Search-Based and Fuzz Testing (SBFT). 25–32.
[22]
Wen Li, Haoran Yang, Xiapu Luo, Long Cheng, and Haipeng Cai. 2023. Pyrtfuzz: Detecting bugs in python runtimes via two-level collaborative fuzzing. In Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security. 1645–1659.
[23]
Jie Liang, Zhiyong Wu, Jingzhou Fu, Yiyuan Bai, Qiang Zhang, and Yu Jiang. 2024. $WingFuzz$: Implementing Continuous Fuzzing for $DBMSs$. In 2024 USENIX Annual Technical Conference (USENIX ATC 24). 479–492.
[24]
Jiawei Liu, Jinkun Lin, Fabian Ruffy, Cheng Tan, Jinyang Li, Aurojit Panda, and Lingming Zhang. 2023. NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers. In Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, ASPLOS 2023, Vancouver, BC, Canada, March 25-29, 2023, Tor M. Aamodt, Natalie D. Enright Jerger, and Michael M. Swift (Eds.). ACM, 530–543.
[25]
Jiawei Liu, Jinjun Peng, Yuyao Wang, and Lingming Zhang. 2023. Neuri: Diversifying dnn generation via inductive rule inference. arXiv preprint arXiv:2302.02261.
[26]
Stephan Lukasczyk, Florian Kroiß, and Gordon Fraser. 2023. An Empirical Study of Automated Unit Test Generation for Python. Empirical Softw. Engg., 28, 2 (2023), jan, 46 pages. issn:1382-3256 https://doi.org/10.1007/s10664-022-10248-w
[27]
Weisi Luo, Dong Chai, Xiaoyue Run, Jiang Wang, Chunrong Fang, and Zhenyu Chen. 2021. Graph-Based Fuzz Testing for Deep Learning Inference Engines. In Proceedings of the 43rd International Conference on Software Engineering (ICSE ’21). IEEE Press, 288–299. isbn:9781450390859 https://doi.org/10.1109/ICSE43902.2021.00037
[28]
Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. 2017. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083.
[29]
Ruijie Meng, Martin Mirchev, Marcel Böhme, and Abhik Roychoudhury. 2024. Large language model guided protocol fuzzing. In Proceedings of the 31st Annual Network and Distributed System Security Symposium (NDSS).
[30]
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. 8024–8035 pages.
[31]
Kexin Pei, Yinzhi Cao, Junfeng Yang, and Suman Jana. 2017. Deepxplore: Automated whitebox testing of deep learning systems. In proceedings of the 26th Symposium on Operating Systems Principles. 1–18.
[32]
H. Pham, T. Lutellier, W. Qi, and L. Tan. 2019. CRADLE: Cross-Backend Validation to Detect and Localize Bugs in Deep Learning Libraries. In 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). IEEE Computer Society, Los Alamitos, CA, USA. 1027–1038. https://doi.org/10.1109/ICSE.2019.00107
[33]
Reese T Prosser. 1959. Applications of boolean matrices to the analysis of flow diagrams. In Papers presented at the December 1-3, 1959, eastern joint IRE-AIEE-ACM computer conference. 133–138.
[34]
John Regehr. 2017. Responsible and Effective Bugfinding. https://blog.regehr.org/archives/2037
[35]
Kostya Serebryany. 2017. OSS-Fuzz - Google’ s continuous fuzzing service for open source software. USENIX Association, Vancouver, BC.
[36]
Uri Shaham, Maor Ivgi, Avia Efrat, Jonathan Berant, and Omer Levy. 2023. Zeroscrolls: A zero-shot benchmark for long text understanding. arXiv preprint arXiv:2305.14196.
[37]
Ali Shatnawi, Ghadeer Al-Bdour, Raffi Al-Qurran, and Mahmoud Al-Ayyoub. 2018. A comparative study of open source deep learning frameworks. In 2018 9th international conference on information and communication systems (icics). 72–77.
[38]
Jingyi Shi, Yang Xiao, Yuekang Li, Yeting Li, Dongsong Yu, Chendong Yu, Hui Su, Yufeng Chen, and Wei Huo. 2023. ACETest: Automated Constraint Extraction for Testing Deep Learning Operators. In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis. 690–702.
[39]
Michael Sutton, Adam Greene, and Pedram Amini. 2007. Fuzzing: brute force vulnerability discovery. Pearson Education.
[40]
Cornelis Joost "Keith" van Rijsbergen. 1979. Information Retrieval (2nd ed.). Butterworth, London, GB; Boston, MA. isbn:0-408-70929-4
[41]
Zan Wang, Ming Yan, Junjie Chen, Shuang Liu, and Dongdi Zhang. 2020. Deep Learning Library Testing via Effective Model Generation. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2020). Association for Computing Machinery, New York, NY, USA. 788–799. isbn:9781450370431 https://doi.org/10.1145/3368089.3409761
[42]
Anjiang Wei, Yinlin Deng, Chenyuan Yang, and Lingming Zhang. 2022. Free lunch for testing: Fuzzing deep-learning libraries from open source. In Proceedings of the 44th International Conference on Software Engineering. 995–1007.
[43]
Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc V Le, and Denny Zhou. 2022. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. In Advances in Neural Information Processing Systems 35 (NeurIPS 2022) Main Conference Track.
[44]
Chunqiu Steven Xia, Matteo Paltenghi, Jia Le Tian, Michael Pradel, and Lingming Zhang. 2024. Fuzz4all: Universal fuzzing with large language models. In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering. 1–13.
[45]
Dongwei Xiao, Zhibo Liu, Yuanyuan Yuan, Qi Pang, and Shuai Wang. 2022. Metamorphic testing of deep learning compilers. Proceedings of the ACM on Measurement and Analysis of Computing Systems, 6, 1 (2022), 1–28.
[46]
Danning Xie, Yitong Li, Mijung Kim, Hung Viet Pham, Lin Tan, Xiangyu Zhang, and Michael W Godfrey. 2022. DocTer: documentation-guided fuzzing for testing deep learning API functions. In Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis. 176–188.
[47]
Quan Zhang, Yifeng Ding, Yongqiang Tian, Jianmin Guo, Min Yuan, and Yu Jiang. 2021. Advdoor: adversarial backdoor attack of deep learning system. In Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis. 127–138.
[48]
Quan Zhang, Yongqiang Tian, Yifeng Ding, Shanshan Li, Chengnian Sun, Yu Jiang, and Jiaguang Sun. 2023. CoopHance: Cooperative Enhancement for Robustness of Deep Learning Systems. In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis. 753–765.
[49]
Xufan Zhang, Jiawei Liu, Ning Sun, Chunrong Fang, Jia Liu, Jiang Wang, Dong Chai, and Zhenyu Chen. 2021. Duo: Differential fuzzing for deep learning operators. IEEE Transactions on Reliability, 70, 4 (2021), 1671–1685.
[50]
Chijin Zhou, Quan Zhang, Lihua Guo, Mingzhe Wang, Yu Jiang, Qing Liao, Zhiyong Wu, Shanshan Li, and Bin Gu. 2023. Towards better semantics exploration for browser fuzzing. Proceedings of the ACM on Programming Languages, 7, OOPSLA2, 604–631.
[51]
Chijin Zhou, Quan Zhang, Mingzhe Wang, Lihua Guo, Jie Liang, Zhe Liu, Mathias Payer, and Yu Jiang. 2022. Minerva: browser API fuzzing with dynamic mod-ref analysis. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2022, Singapore, Singapore, November 14-18, 2022. ACM, 1135–1147.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ISSTA 2024: Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis
September 2024
1928 pages
ISBN:9798400706127
DOI:10.1145/3650212
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 September 2024

Permissions

Request permissions for this article.

Check for updates

Badges

Author Tags

  1. DL library
  2. Fuzzing
  3. Large Language Model

Qualifiers

  • Research-article

Conference

ISSTA '24
Sponsor:

Acceptance Rates

Overall Acceptance Rate 58 of 213 submissions, 27%

Upcoming Conference

ISSTA '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 79
    Total Downloads
  • Downloads (Last 12 months)79
  • Downloads (Last 6 weeks)39
Reflects downloads up to 18 Nov 2024

Other Metrics

Citations

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media