research-article

Open access

Fuzz4All: Universal Fuzzing with Large Language Models

Authors: Chunqiu Steven Xia, Matteo Paltenghi, Michael Pradel, Lingming Zhang

ICSE '24: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering

Article No.: 126, Pages 1 - 13

https://doi.org/10.1145/3597503.3639121

Published: 12 April 2024

Abstract

Fuzzing has achieved tremendous success in discovering bugs and vulnerabilities in various software systems. Systems under test (SUTs) that take in programming or formal languages as inputs, e.g., compilers, runtime engines, constraint solvers, and software libraries with accessible APIs, are especially important as they are fundamental building blocks of software development. However, existing fuzzers for such systems often target a specific language, and thus cannot be easily applied to other languages or even other versions of the same language. Moreover, the inputs generated by existing fuzzers are often limited to specific features of the input language, and thus can hardly reveal bugs related to other or new features. This paper presents Fuzz4All, the first fuzzer that is universal in the sense that it can target many different input languages and many different features of these languages. The key idea behind Fuzz4All is to leverage large language models (LLMs) as an input generation and mutation engine, which enables the approach to produce diverse and realistic inputs for any practically relevant language. To realize this potential, we present a novel autoprompting technique, which creates LLM prompts that are well-suited for fuzzing, and a novel LLM-powered fuzzing loop, which iteratively updates the prompt to create new fuzzing inputs. We evaluate Fuzz4All on nine systems under test that take in six different languages (C, C++, Go, SMT2, Java, and Python) as inputs. The evaluation shows, across all six languages, that universal fuzzing achieves higher coverage than existing, language-specific fuzzers. Furthermore, Fuzz4All has identified 98 bugs in widely used systems, such as GCC, Clang, Z3, CVC5, OpenJDK, and the Qiskit quantum computing platform, with 64 bugs already confirmed by developers as previously unknown.
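The two components named in the abstract, autoprompting and the LLM-powered fuzzing loop, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so the prompt-scoring rule, the instruction templates in `STRATEGIES`, the outcome labels (`"ok"`, `"invalid"`, `"crash"`), and the `stub_generate` and `stub_run_sut` stand-ins are all assumptions made for illustration. A real setup would replace the stubs with an LLM call and an invocation of the system under test.

```python
import random
from typing import Callable, List

# Hypothetical instruction templates used to vary the prompt between iterations.
STRATEGIES = [
    "Please create a new program that differs from the example above.",
    "Please mutate the example above into a variant.",
    "Please create a program semantically equivalent to the example above.",
]


def build_prompt(base_prompt: str, example: str, strategy: str) -> str:
    """Combine the base prompt, an optional prior input, and an instruction."""
    parts = [base_prompt]
    if example:
        parts.append("Example:\n" + example)
    parts.append(strategy)
    return "\n\n".join(parts)


def autoprompt(candidates: List[str], generate: Callable[[str], str],
               run_sut: Callable[[str], str], samples: int = 3) -> str:
    """Pick the candidate prompt whose generated inputs are most often accepted.

    Assumed scoring rule: count how many sampled inputs the SUT reports as "ok".
    """
    def score(prompt: str) -> int:
        return sum(run_sut(generate(prompt)) == "ok" for _ in range(samples))
    return max(candidates, key=score)


def fuzz_loop(base_prompt: str, generate: Callable[[str], str],
              run_sut: Callable[[str], str], iterations: int,
              seed: int = 0) -> List[str]:
    """Iteratively update the prompt with earlier inputs and collect crashes."""
    rng = random.Random(seed)
    example = ""            # most recent accepted input, fed back into the prompt
    findings: List[str] = []
    for _ in range(iterations):
        prompt = build_prompt(base_prompt, example, rng.choice(STRATEGIES))
        candidate = generate(prompt)
        outcome = run_sut(candidate)
        if outcome == "crash":
            findings.append(candidate)
        elif outcome == "ok":
            example = candidate
    return findings


def stub_generate(prompt: str) -> str:
    """Deterministic stand-in for an LLM; output depends only on prompt length."""
    samples = ["int x = 1;", "int y = x + 1;", "int z = overflow(y);", "int w ="]
    return samples[len(prompt) % len(samples)]


def stub_run_sut(program: str) -> str:
    """Stand-in for a compiler run: 'crash', 'ok', or 'invalid'."""
    if "overflow" in program:
        return "crash"
    return "ok" if program.endswith(";") else "invalid"


if __name__ == "__main__":
    best = autoprompt(["Generate C programs.", "Write C code using arrays."],
                      stub_generate, stub_run_sut)
    for crash in fuzz_loop(best, stub_generate, stub_run_sut, iterations=20):
        print("crashing input:", crash)
```

Passing `generate` and `run_sut` as plain callables keeps the loop independent of any particular model or target language, which mirrors the universality goal the abstract describes.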


Information & Contributors

Information

Published In

ICSE '24: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering

May 2024

2942 pages

ISBN: 9798400702174

DOI: 10.1145/3597503

Co-chairs: Ana Paiva, Rui Abreu

Program Co-chairs: Abhik Roychoudhury, Margaret Storey

Copyright © 2024 Copyright is held by the owner/author(s). Publication rights licensed to ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

Faculty of Engineering of University of Porto

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Science Foundation

Conference

ICSE '24

Sponsor:

SIGSOFT

ICSE '24: IEEE/ACM 46th International Conference on Software Engineering

April 14 - 20, 2024

Lisbon, Portugal
