DOI: 10.1145/3611643.3613078
Research article
Open access

Assisting Static Analysis with Large Language Models: A ChatGPT Experiment

Published: 30 November 2023

Abstract

Recent advances in Large Language Models (LLMs) such as ChatGPT have exhibited strong capabilities for comprehending and responding to questions across a variety of domains. Surprisingly, ChatGPT even possesses a strong understanding of program code. In this paper, we investigate where and how LLMs can assist static analysis by asking appropriate questions. In particular, we target a specific bug-finding tool whose static analysis produces many false positives. In our evaluation, we find that these false positives can be effectively pruned by asking carefully constructed questions about function-level behaviors or function summaries. Specifically, in a pilot study of 20 false positives, GPT-3.5 successfully pruned 8 out of 20, whereas GPT-4 achieved a near-perfect result of 16 out of 20; the four failures involve cases that our questions do not currently consider or support, e.g., concurrency. Additionally, GPT-4 identified one false negative (a missed bug). We find LLMs to be a promising tool that can enable more effective and efficient program analysis.
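To make the pruning idea in the abstract concrete, the sketch below shows one way a report-review question could be posed to a chat model through the OpenAI API. It is only an illustration of the general approach described above, not the authors' actual prompts or tooling: the openai Python package, the gpt-4 model name, the review_report helper, and the example function and warning are all assumptions made for this sketch.

# A minimal sketch of LLM-assisted report review (not the authors' tooling).
# Assumes the `openai` Python package and an OPENAI_API_KEY in the environment;
# the prompt wording and the example function/warning below are hypothetical.
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "You are assisting a static analyzer. Given a C function and one of the "
    "analyzer's warnings, decide whether the warning is a TRUE BUG or a FALSE "
    "POSITIVE, and briefly explain which initializations or path constraints "
    "the analyzer may have missed."
)

def review_report(function_source: str, warning: str, model: str = "gpt-4") -> str:
    """Ask a function-level question about a single analyzer warning."""
    user_prompt = (
        "Function under analysis:\n"
        f"{function_source}\n"
        f"Analyzer warning: {warning}\n\n"
        "Question: on the reported path, can the flagged variable actually be "
        "used before it is initialized? Answer TRUE BUG or FALSE POSITIVE, "
        "then justify briefly."
    )
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # keep answers as repeatable as possible for pruning decisions
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content

# Hypothetical false positive: the analyzer does not see that get_value()
# writes through its output parameter whenever it returns 0.
EXAMPLE_FUNCTION = """
int get_value(int *out);          /* returns 0 on success and sets *out */

int caller(void)
{
    int v;
    if (get_value(&v) != 0)
        return -1;                /* early exit on failure */
    return v;                     /* warning: 'v' may be used uninitialized */
}
"""

if __name__ == "__main__":
    print(review_report(EXAMPLE_FUNCTION,
                        "caller(): variable 'v' may be used uninitialized"))

Asking about one function at a time, as in this sketch, mirrors the abstract's focus on function-level behaviors and function summaries, and it keeps each prompt small enough to fit comfortably in a model's context window.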





Published In

ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
November 2023
2215 pages
ISBN: 9798400703270
DOI: 10.1145/3611643

This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery
New York, NY, United States


Author Tags

1. bug detection
2. large language model
3. static analysis


Conference

ESEC/FSE '23

Acceptance Rates

Overall Acceptance Rate: 112 of 543 submissions, 21%

Article Metrics

• Downloads (Last 12 months): 2,518
• Downloads (Last 6 weeks): 241

Reflects downloads up to 16 Dec 2024


Cited By

• (2024) Software engineering education in the era of conversational AI: current trends and future directions. Frontiers in Artificial Intelligence, 10.3389/frai.2024.14363507. Online publication date: 29-Aug-2024.
• (2024) Role and Challenges of ChatGPT or Similar Generative Artificial Intelligence in Reinforced Concrete Technology. SSRN Electronic Journal, 10.2139/ssrn.4681731. Online publication date: 2024.
• (2024) If At First You Don’t Succeed, Try, Try, Again...? Insights and LLM-informed Tooling for Detecting Retry Bugs in Software Systems. Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles, 10.1145/3694715.3695971 (63-78). Online publication date: 4-Nov-2024.
• (2024) Effectiveness of ChatGPT for Static Analysis: How Far Are We? Proceedings of the 1st ACM International Conference on AI-Powered Software, 10.1145/3664646.3664777 (151-160). Online publication date: 10-Jul-2024.
• (2024) Automated Unit Test Improvement using Large Language Models at Meta. Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering, 10.1145/3663529.3663839 (185-196). Online publication date: 10-Jul-2024.
• (2024) Using Run-Time Information to Enhance Static Analysis of Machine Learning Code in Notebooks. Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering, 10.1145/3663529.3663785 (497-501). Online publication date: 10-Jul-2024.
• (2024) Exploring AI for Vulnerability Detection and Repair. 2024 Cyber Awareness and Research Symposium (CARS), 10.1109/CARS61786.2024.10778769 (1-9). Online publication date: 28-Oct-2024.
• (2024) Managing Linux servers with LLM-based AI agents: An empirical evaluation with GPT4. Machine Learning with Applications, 10.1016/j.mlwa.2024.10057017 (100570). Online publication date: Sep-2024.
• (2024) Detecting command injection vulnerabilities in Linux-based embedded firmware with LLM-based taint analysis of library functions. Computers & Security, 10.1016/j.cose.2024.103971144 (103971). Online publication date: Sep-2024.
• (2024) Outside the Comfort Zone: Analysing LLM Capabilities in Software Vulnerability Detection. Computer Security – ESORICS 2024, 10.1007/978-3-031-70879-4_14 (271-289). Online publication date: 16-Sep-2024.
