research-article

Open access

PyRTFuzz: Detecting Bugs in Python Runtimes via Two-Level Collaborative Fuzzing

Authors:

Haipeng Cai

CCS '23: Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security

Pages 1645 - 1659

https://doi.org/10.1145/3576915.3623166

Published: 21 November 2023

Abstract

Given the widespread use of Python and its lasting impact, the security and reliability of the Python runtime system are critical. Yet even as real-world bugs in Python runtimes are reported at a growing rate, technique and tool support for automatically detecting such bugs is still largely lacking. In this paper, we present PyRTFuzz, a novel fuzzing technique/tool for holistically testing Python runtimes, including the language interpreter and its runtime libraries. PyRTFuzz combines generation- and mutation-based fuzzing at the compiler- and application-testing levels, respectively. This combination is enabled by static/dynamic analysis for extracting runtime API descriptions, a declarative specification language for generating valid and diverse Python code, and a custom type-guided mutation strategy for format/structure-aware application input generation. We implemented PyRTFuzz for the primary Python implementation (CPython) and applied it to three versions of the runtime. Our experiments revealed 61 new, demonstrably exploitable bugs, including bugs in the interpreter, with most in the runtime libraries. Our results also demonstrated the promising scalability and cost-effectiveness of PyRTFuzz and its potential for further bug discovery. The two-level collaborative fuzzing methodology instantiated in PyRTFuzz may also apply to other language runtimes, especially those of interpreted languages.
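To make the two-level idea concrete, the following is a minimal sketch and not PyRTFuzz's implementation. It assumes a tiny hand-written table of API descriptions (`API_SPECS`) in place of the analysis-extracted descriptions and specification language the abstract mentions. The helper names `generate_app`, `mutate`, and `fuzz` are hypothetical. Level 1 generates a small Python application around one runtime API; level 2 mutates that application's input while preserving the input's type.

```python
import random

# Hypothetical API descriptions: (module, function, parameter type).
# A real tool would extract these automatically; here they are hand-written.
API_SPECS = [
    ("json", "loads", str),
    ("binascii", "hexlify", bytes),
    ("math", "isqrt", int),
]

# One seed input per parameter type (illustrative values only).
SEEDS = {str: '{"a": [1, 2]}', bytes: b"seed", int: 16}


def generate_app(module, func):
    """Level 1 (generation): emit source for a tiny app calling one runtime API."""
    return (
        f"import {module}\n"
        f"def run(data):\n"
        f"    return {module}.{func}(data)\n"
    )


def mutate(value, rng):
    """Level 2 (mutation): type-guided, so the result keeps the input's type."""
    if isinstance(value, int):
        return value + rng.choice([-1, 1, 2**31, -(2**63)])
    if isinstance(value, bytes):
        buf = bytearray(value or b"\x00")
        buf[rng.randrange(len(buf))] = rng.randrange(256)
        return bytes(buf)
    if isinstance(value, str):
        pos = rng.randrange(len(value) + 1)
        return value[:pos] + rng.choice('[]{}",:0 ') + value[pos:]
    raise TypeError(f"no mutator for {type(value).__name__}")


def fuzz(rounds=50, seed=0):
    """Run every generated app on mutated inputs; collect unexpected exceptions."""
    rng = random.Random(seed)
    findings = []
    for module, func, param_type in API_SPECS:
        namespace = {}
        source = generate_app(module, func)
        exec(compile(source, f"<app:{module}.{func}>", "exec"), namespace)
        data = SEEDS[param_type]
        for _ in range(rounds):
            data = mutate(data, rng)
            try:
                namespace["run"](data)
            except (ValueError, TypeError):
                continue  # ordinary rejection of bad input, not a runtime bug
            except Exception as exc:  # anything else is worth a closer look
                findings.append((module, func, data, repr(exc)))
    return findings


if __name__ == "__main__":
    print(fuzz())
```

Treating `ValueError` and `TypeError` as expected rejections is a simplification made for this sketch. A fuzzer aimed at the runtime itself would also need to watch for interpreter crashes and hangs, which an in-process `try`/`except` cannot observe.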

Cited By

Nong Y, Fang R, Yi G, Zhao K, Luo X, Chen F, Cai H, Roychoudhury A, Paiva A, Abreu R, Storey M. (2024). VGX: Large-Scale Sample Generation for Boosting Learning-Based Software Vulnerability Analyses. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, 1-13. Online publication date: 20-May-2024.
https://dl.acm.org/doi/10.1145/3597503.3639116

Index Terms

PyRTFuzz: Detecting Bugs in Python Runtimes via Two-Level Collaborative Fuzzing
1. Security and privacy
  1. Software and application security
    1. Software security engineering
2. Theory of computation
  1. Semantics and reasoning
    1. Program reasoning
      1. Program analysis

Published In

CCS '23: Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security

November 2023

3722 pages

ISBN: 9798400700507

DOI: 10.1145/3576915

General Chairs: Weizhi Meng (Technical University of Denmark), Christian D. Jensen (Technical University of Denmark)

Program Chairs: Cas Cremers (CISPA Helmholtz Center for Information Security), Engin Kirda (Khoury College of Computer Sciences)

Copyright © 2023 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CCS '23

Sponsor:

SIGSAC

CCS '23: ACM SIGSAC Conference on Computer and Communications Security

November 26 - 30, 2023

Copenhagen, Denmark

Acceptance Rates

Overall Acceptance Rate 1,261 of 6,999 submissions, 18%
