Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Grammar-based whitebox fuzzing

Published: 07 June 2008 Publication History

Abstract

Whitebox fuzzing is a form of automatic dynamic test generation, based on symbolic execution and constraint solving, designed for security testing of large applications. Unfortunately, the current effectiveness of whitebox fuzzing is limited when testing applications with highly-structured inputs, such as compilers and interpreters. These applications process their inputs in stages, such as lexing, parsing and evaluation. Due to the enormous number of control paths in early processing stages, whitebox fuzzing rarely reaches parts of the application beyond those first stages.
In this paper, we study how to enhance whitebox fuzzing of complex structured-input applications with a grammar-based specification of their valid inputs. We present a novel dynamic test generation algorithm where symbolic execution directly generates grammar-based constraints whose satisfiability is checked using a custom grammar-based constraint solver. We have implemented this algorithm and evaluated it on a large security-critical application, the JavaScript interpreter of Internet Explorer 7 (IE7). Results of our experiments show that grammar-based whitebox fuzzing explores deeper program paths and avoids dead-ends due to non-parsable inputs. Compared to regular whitebox fuzzing, grammar-based whitebox fuzzing increased coverage of the code generation module of the IE7 JavaScript interpreter from 53% to 81% while using three times fewer tests.

References

[1]
D. Aitel. The Advantages of Block-Based Protocol Analysis for Security Testing. Immunity Inc., February, 2002.
[2]
S. Artzi, A. Kie?un, J. Dolby, F. Tip, D. Dig, A. Paradkar, and M. D. Ernst. Finding bugs in dynamic Web applications. Technical Report MIT-CSAIL-TR-2008-006, MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, Feb. 2008.
[3]
D. Bird and C. Munoz. Automatic Generation of Random Self-Checking Test Cases. IBM Systems Journal, 22(3):229--245, 1983.
[4]
N. Borisov, D. Brumley, H. Wang, J. Dunagan, P. Joshi, and C. Guo. Generic application-level protocol analyzer and its language. In NDSS, 2007.
[5]
C. Boyapati, S. Khurshid, and D. Marinov. Korat: automated testing based on Java predicates. In ISSTA, 2002.
[6]
C. Cadar, V. Ganesh, P. Pawlowski, D. Dill, and D. Engler. EXE: automatically generating inputs of death. In CCS, 2006.
[7]
K. Claessen and J. Hughes. QuickCheck: A lightweight tool for random testing of Haskell programs. In ICFP, 2000.
[8]
D. Coppit and J. Lian. yagg: an easy-to-use generator for structured test inputs. In ASE, 2005.
[9]
W. Cui, J. Kannan, and H. J. Wang. Discoverer: Automatic protocol reverse engineering from network traces. In USENIX Security Symposium, 2007.
[10]
B. Daniel, D. Dig, K. Garcia, and D. Marinov. Automated testing of refactoring engines. In FSE, 2007.
[11]
M. Emmi, R. Majumdar, and K. Sen. Dynamic test input generation for database applications. In ISSTA, 2007.
[12]
J. E. Forrester and B. P. Miller. An Empirical Study of the Robustness of Windows NT Applications Using Random Testing. In Proceedings of the 4th USENIX Windows System Symposium, Seattle, August 2000.
[13]
P. Godefroid. Compositional Dynamic Test Generation. In POPL, 2007.
[14]
P. Godefroid, N. Klarlund, and K. Sen. DART: Directed automated random testing. In PLDI, 2005.
[15]
P. Godefroid, M. Levin, and D. Molnar. Active property checking. Technical Report MSR-TR-2007-91, Microsoft, 2007.
[16]
P. Godefroid, M. Levin, and D. Molnar. Automated whitebox fuzz testing. In NDSS, 2008.
[17]
K. Hanford. Automatic Generation of Test Cases. IBM Systems Journal, 9(4), 1970.
[18]
J. Hopcroft and J. Ullman. Introduction to automata theory, languages and computation. Addison-Wesley Series in Computer Science, 1979.
[19]
S. Khurshid and D. Marinov. TestEra: Specification-Based Testing of Java Programs Using SAT. In ASE, 2004.
[20]
J. King. Symbolic execution and program testing. Communications of the ACM, 19(7):385--394, 1976.
[21]
R. Lämmel and W. Schulte. Controllable combinatorial coverage in grammar-based testing. In TestCom, 2006.
[22]
R. Majumdar and K. Sen. LATEST: Lazy dynamic test input generation. Technical Report UCB/EECS-2007-36, EECS Department, University of California, Berkeley, 2007.
[23]
R. Majumdar and R.-G. Xu. Directed test generation using symbolic grammars. In ASE, 2007.
[24]
B. Malloy and J. Power. An interpretation of Purdom?s algorithm for automatic generation of test cases. In ICIS, 2001.
[25]
P. Maurer. Generating test data with enhanced context-free grammars. IEEE Software, 7(4), 1990.
[26]
B. McKenzie. Generating strings at random from a context free grammar. Technical Report TR-COSC 10/97, Department of Computer Science, University of Canterbury, 1997.
[27]
D. Melski and T. Reps. Interconvertbility of set constraints and context-free language reachability. In PEPM, 1997.
[28]
B. P. Miller, L. Fredriksen, and B. So. An empirical study of the reliability of UNIX utilities. Communications of the ACM, 33(12), 1990.
[29]
R. C. Moore. Removing left recursion from context-free grammars. In Proceedings of the first conference on North American chapter of the Association for Computational Linguistics, 2000.
[30]
C. Pacheco, S. K. Lahiri, M. D. Ernst, and T. Ball. Feedbackdirected random test generation. In ICSE, 2007.
[31]
R. Pang, V. Paxson, R. Sommer, and L. Peterson. binpac: a yacc for writing application protocol parsers. In IMC, 2006.
[32]
P. Purdom. A sentence generator for testing parsers. BIT Numerical Mathematics, 12(3), 1972.
[33]
D. J. Salomon and G. V. Cormack. Scannerless NSLR(1) parsing of programming languages. In PLDI, 1989.
[34]
K. Sen, D. Marinov, and G. Agha. CUTE: a concolic unit testing engine for C. In FSE, 2005.
[35]
E. Sirer and B. Bershad. Using production grammars in software testing. In DSL, 1999.
[36]
K. Sullivan, J. Yang, D. Coppit, S. Khurshid, and D. Jackson. Software assurance by bounded exhaustive testing. In ISSTA, 2004.
[37]
M. Sutton, A. Greene, and P. Amini. Fuzzing: Brute Force Vulnerability Discovery. Addison-Wesley, 2007.
[38]
M. Utting, A. Pretschner, and B. Legeard. A Taxonomy of Model-Based Testing. Department of Computer Science, The University of Waikato, New Zealand, Tech. Rep, 4, 2006.
[39]
G. Wassermann and Z. Su. Sound and precise analysis of Web applications for injection vulnerabilities. In PLDI, 2007.

Cited By

View all
  • (2024)ConjunCT: Learning Inductive Invariants to Prove Unbounded Instruction Safety Against Microarchitectural Timing Attacks2024 IEEE Symposium on Security and Privacy (SP)10.1109/SP54263.2024.00180(3735-3753)Online publication date: 19-May-2024
  • (2023)µFUZZProceedings of the 32nd USENIX Conference on Security Symposium10.5555/3620237.3620312(1325-1342)Online publication date: 9-Aug-2023
  • (2023) FormatFuzzer : Effective Fuzzing of Binary File Formats ACM Transactions on Software Engineering and Methodology10.1145/3628157Online publication date: 17-Oct-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGPLAN Notices
ACM SIGPLAN Notices  Volume 43, Issue 6
PLDI '08
June 2008
382 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/1379022
Issue’s Table of Contents
  • cover image ACM Conferences
    PLDI '08: Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation
    June 2008
    396 pages
    ISBN:9781595938602
    DOI:10.1145/1375581
    • General Chair:
    • Rajiv Gupta,
    • Program Chair:
    • Saman Amarasinghe
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 June 2008
Published in SIGPLAN Volume 43, Issue 6

Check for updates

Author Tags

  1. automatic test generation
  2. grammars
  3. program verification
  4. software testing

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)228
  • Downloads (Last 6 weeks)20
Reflects downloads up to 10 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)ConjunCT: Learning Inductive Invariants to Prove Unbounded Instruction Safety Against Microarchitectural Timing Attacks2024 IEEE Symposium on Security and Privacy (SP)10.1109/SP54263.2024.00180(3735-3753)Online publication date: 19-May-2024
  • (2023)µFUZZProceedings of the 32nd USENIX Conference on Security Symposium10.5555/3620237.3620312(1325-1342)Online publication date: 9-Aug-2023
  • (2023) FormatFuzzer : Effective Fuzzing of Binary File Formats ACM Transactions on Software Engineering and Methodology10.1145/3628157Online publication date: 17-Oct-2023
  • (2023)Demystify the Fuzzing Methods: A Comprehensive SurveyACM Computing Surveys10.1145/362337556:3(1-38)Online publication date: 5-Oct-2023
  • (2023)Horus: Accelerating Kernel Fuzzing through Efficient Host-VM Memory Access ProceduresACM Transactions on Software Engineering and Methodology10.1145/361166533:1(1-25)Online publication date: 8-Aug-2023
  • (2023)Large Language Models for Fuzzing Parsers (Registered Report)Proceedings of the 2nd International Fuzzing Workshop10.1145/3605157.3605173(31-38)Online publication date: 17-Jul-2023
  • (2023)Rare Path Guided FuzzingProceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3597926.3598136(1295-1306)Online publication date: 12-Jul-2023
  • (2023)NaturalFuzz: Natural Input Generation for Big Data AnalyticsProceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering10.1109/ASE56229.2023.00034(1592-1603)Online publication date: 11-Nov-2023
  • (2023)Improving fuzzing assessment methods through the analysis of metrics and experimental conditionsComputers and Security10.1016/j.cose.2022.102946124:COnline publication date: 1-Jan-2023
  • (2023)RLTrace: Synthesizing High-Quality System Call Traces for OS Fuzz TestingInformation Security10.1007/978-3-031-49187-0_6(99-118)Online publication date: 1-Dec-2023
  • Show More Cited By

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media