research-article

Open access

Evolutionary Generative Fuzzing for Differential Testing of the Kotlin Compiler

Authors:

Călin Georgescu,

Mitchell Olsthoorn,

Pouria Derakhshanfar,

Annibale PanichellaAuthors Info & Claims

FSE 2024: Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering

Pages 197 - 207

https://doi.org/10.1145/3663529.3663864

Published: 10 July 2024 Publication History

Abstract

Compiler correctness is a cornerstone of reliable software development. However, systematic testing of compilers is infeasible, given the vast space of possible programs and the complexity of modern programming languages. In this context, differential testing offers a practical methodology as it addresses the oracle problem by comparing the output of alternative compilers given the same set of programs as input. In this paper, we investigate the effectiveness of differential testing in finding bugs within the Kotlin compilers developed at JetBrains. We propose a black-box generative approach that creates input programs for the K1 and K2 compilers. First, we build workable models of Kotlin semantic (semantic interface) and syntactic (enriched context-free grammar) language features, which are subsequently exploited to generate random code snippets. Second, we extend random sampling by introducing two genetic algorithms (GAs) that aim to generate more diverse input programs. Our case study shows that the proposed approach effectively detects bugs in K1 and K2; these bugs have been confirmed and (some) fixed by JetBrains developers. While we do not observe a significant difference w.r.t. the number of defects uncovered by the different search algorithms, random search and GAs are complementary as they find different categories of bugs. Finally, we provide insights into the relationships between the size, complexity, and fault detection capability of the generated input programs.

References

[1]

Raja Ben Abdessalem, Annibale Panichella, Shiva Nejati, Lionel C Briand, and Thomas Stifter. 2018. Testing autonomous cars for feature interaction failures using many-objective search. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. 143–154.

Digital Library

[2]

Zohreh Aghababaeyan, Manel Abdellatif, Lionel Briand, S Ramesh, and Mojtaba Bagherzadeh. 2023. Black-box testing of deep neural networks through test case diversity. IEEE Transactions on Software Engineering.

Digital Library

[3]

Marat Akhin and Mikhail Belyaev. 2020. Kotlin language specification: Kotlin/Core. JetBrains Research.

[4]

Saswat Anand, Edmund K Burke, Tsong Yueh Chen, John Clark, Myra B Cohen, Wolfgang Grieskamp, Mark Harman, Mary Jean Harrold, Phil McMinn, and Antonia Bertolino. 2013. An orchestrated survey of methodologies for automated software test case generation. Journal of Systems and Software, 86, 8 (2013), 1978–2001.

Digital Library

[5]

Theodore W Anderson and Donald A Darling. 1954. A test of goodness of fit. Journal of the American statistical association, 49, 268 (1954), 765–769.

[6]

Andrea Arcuri. 2019. RESTful API automated test case generation with EvoMaster. ACM Transactions on Software Engineering and Methodology (TOSEM), 28, 1 (2019), 1–37.

Digital Library

[7]

Andrea Arcuri and Gordon Fraser. 2013. Parameter tuning or default values? An empirical investigation in search-based software engineering. Empirical Software Engineering, 18 (2013), 594–623.

[8]

Earl T Barr, Mark Harman, Phil McMinn, Muzammil Shahbaz, and Shin Yoo. 2014. The oracle problem in software testing: A survey. IEEE transactions on software engineering, 41, 5 (2014), 507–525.

Digital Library

[9]

Béatrice Bérard, Michel Bidoit, Alain Finkel, François Laroussinie, Antoine Petit, Laure Petrucci, and Philippe Schnoebelen. 2013. Systems and software verification: model-checking techniques and tools. Springer Science & Business Media.

[10]

Abdulazeez S Boujarwah and Kassem Saleh. 1997. Compiler test case generation methods: a survey and assessment. Information and software technology, 39, 9 (1997), 617–625.

[11]

Georgescu Calin. 2023. Empirical Study Data for Test Program-Based Generative Fuzzing for Differential Testing of the Kotlin Compiler. https://doi.org/10.5281/zenodo.8221889

[12]

Georgescu Calin. 2023. Fuzzer Implementation for Test Program-Based Generative Fuzzing for Differential Testing of the Kotlin Compiler. https://doi.org/10.5281/zenodo.8222995

[13]

José Campos, Yan Ge, Nasser Albunian, Gordon Fraser, Marcelo Eler, and Andrea Arcuri. 2018. An empirical evaluation of evolutionary algorithms for unit test suite generation. Information and Software Technology, 104 (2018), 207–235.

[14]

Junjie Chen, Jibesh Patra, Michael Pradel, Yingfei Xiong, Hongyu Zhang, Dan Hao, and Lu Zhang. 2020. A survey of compiler testing. ACM Computing Surveys (CSUR), 53, 1 (2020), 1–36.

Digital Library

[15]

Yuting Chen, Ting Su, and Zhendong Su. 2019. Deep differential testing of JVM implementations. In 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). 1257–1268.

Digital Library

[16]

Yuting Chen, Ting Su, Chengnian Sun, Zhendong Su, and Jianjun Zhao. 2016. Coverage-directed differential testing of JVM implementations. In proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation. 85–99.

Digital Library

[17]

William Jay Conover. 1999. Practical nonparametric statistics. 350, john wiley & sons.

[18]

Davide Corradini, Amedeo Zampieri, Michele Pasqua, and Mariano Ceccato. 2021. Empirical comparison of black-box test case generation tools for RESTful APIs. In 2021 IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM). 226–236.

[19]

Pouria Derakhshanfar, Xavier Devroey, Annibale Panichella, Andy Zaidman, and Arie van Deursen. 2022. Generating Class-Level Integration Tests Using Call Site Information. IEEE Transactions on Software Engineering, 49, 4 (2022), 2069–2087.

Digital Library

[20]

Gordon Fraser and Andrea Arcuri. 2011. Evosuite: automatic test suite generation for object-oriented software. In Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering. 416–419.

Digital Library

[21]

Gordon Fraser and Andrea Arcuri. 2015. 1600 faults in 100 projects: automatically finding faults while achieving high coverage with evosuite. Empirical software engineering, 20 (2015), 611–639.

[22]

HyungSeok Han, DongHyeon Oh, and Sang Kil Cha. 2019. CodeAlchemist: Semantics-Aware Code Generation to Find Vulnerabilities in JavaScript Engines. In NDSS.

[23]

Christian Holler, Kim Herzig, and Andreas Zeller. 2012. Fuzzing with code fragments. In 21st USENIX Security Symposium (USENIX Security 12). 445–458.

[24]

Donald E Knuth. 1968. Semantics of context-free languages. Mathematical systems theory, 2, 2 (1968), 127–145.

[25]

Patrick Kreutzer, Stefan Kraus, and Michael Philippsen. 2020. Language-agnostic generation of compilable test programs. In 2020 IEEE 13th International Conference on Software Testing, Validation and Verification (ICST). 39–50.

[26]

Yuwei Li, Shouling Ji, Chenyang Lyu, Yuan Chen, Jianhai Chen, Qinchen Gu, Chunming Wu, and Raheem Beyah. 2020. V-fuzz: Vulnerability prediction-assisted evolutionary fuzzing for binary programs. IEEE Transactions on Cybernetics, 52, 5 (2020), 3745–3756.

[27]

Vsevolod Livinskii, Dmitry Babokin, and John Regehr. 2020. Random testing for C and C++ compilers with YARPGen. Proceedings of the ACM on Programming Languages, 4, OOPSLA (2020), 1–25.

Digital Library

[28]

R Timothy Marler and Jasbir S Arora. 2004. Survey of multi-objective optimization methods for engineering. Structural and multidisciplinary optimization, 26 (2004), 369–395.

[29]

William M McKeeman. 1998. Differential testing for software. Digital Technical Journal, 10, 1 (1998), 100–107.

[30]

Phil McMinn. 2011. Search-based software testing: Past, present and future. In 2011 IEEE Fourth International Conference on Software Testing, Verification and Validation Workshops. 153–163.

Digital Library

[31]

Annibale Panichella, Fitsum Meshesha Kifetew, and Paolo Tonella. 2015. Reformulating branch coverage as a many-objective optimization problem. In 2015 IEEE 8th international conference on software testing, verification and validation (ICST). 1–10.

[32]

Annibale Panichella, Fitsum Meshesha Kifetew, and Paolo Tonella. 2017. Automated test case generation as a many-objective optimisation problem with dynamic selection of the targets. IEEE Transactions on Software Engineering, 44, 2 (2017), 122–158.

[33]

Sebastiano Panichella, Alessio Gambi, Fiorella Zampetti, and Vincenzo Riccio. 2021. Sbst tool competition 2021. In 2021 IEEE/ACM 14th International Workshop on Search-Based Software Testing (SBST). 20–27.

[34]

Terence Parr. 2013. The definitive ANTLR 4 reference. The Definitive ANTLR 4 Reference, 1–326.

[35]

Paul Purdom. 1972. A sentence generator for testing parsers. BIT Numerical Mathematics, 12, 3 (1972), 366–375.

Digital Library

[36]

Sanjay Rawat, Vivek Jain, Ashish Kumar, Lucian Cojocar, Cristiano Giuffrida, and Herbert Bos. 2017. Vuzzer: Application-aware evolutionary fuzzing. In NDSS. 17, 1–14.

[37]

Dominic Steinhöfel and Andreas Zeller. 2022. Input Invariants. In ESEC/FSE 2022: Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 583––594.

[38]

Daniil Stepanov, Marat Akhin, and Mikhail Belyaev. 2021. Type-Centric Kotlin Compiler Fuzzing: Preserving Test Program Correctness by Preserving Types. In 2021 14th IEEE Conference on Software Testing, Verification and Validation (ICST). 318–328.

[39]

Martijn van Meerten, Burcu Kulahcioglu Ozkan, and Annibale Panichella. 2023. Evolutionary Approach for Concurrency Testing of Ripple Blockchain Consensus Algorithm. In 2023 IEEE/ACM 45th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP). 36–47.

[40]

Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. 2011. Finding and understanding bugs in C compilers. In Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation. 283–294.

Digital Library

Index Terms

Evolutionary Generative Fuzzing for Differential Testing of the Kotlin Compiler
1. Software and its engineering
  1. Software creation and management
    1. Search-based software engineering
    2. Software verification and validation
      1. Software defect analysis
        Software testing and debugging
  2. Software notations and tools
    1. Compilers
2. Theory of computation
  1. Design and analysis of algorithms
    1. Mathematical optimization
      1. Discrete optimization
        Optimization with randomized search heuristics
        Evolutionary algorithms

Recommendations

Automated conformance testing for JavaScript engines via deep compiler fuzzing
PLDI 2021: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation

JavaScript (JS) is a popular, platform-independent programming language. To ensure the interoperability of JS programs across different platforms, the implementation of a JS engine should conform to the ECMAScript standard. However, doing so is ...
Compiler fuzzing through deep learning
ISSTA 2018: Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis

Random program generation — fuzzing — is an effective technique for discovering bugs in compilers but successful fuzzers require extensive development effort for every language supported by the compiler, and often leave parts of the language space ...
Evaluating evolutionary testability for structure-oriented testing with software measurements

Test case design is the most important test activity with respect to test quality. For this reason, a large number of testing methods have been developed to assist the tester with the definition of appropriate, error-sensitive test data. Evolutionary ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

FSE 2024: Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering

July 2024

715 pages

ISBN:9798400706585

DOI:10.1145/3663529

General Chair:
Marcelo d'Amorim
North Carolina State University, USA

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 10 July 2024

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Conference

FSE '24

Sponsor:

SIGSOFT

FSE '24: 32nd ACM International Conference on the Foundations of Software Engineering

July 15 - 19, 2024

Porto de Galinhas, Brazil

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

0
Total Citations
171
Total Downloads

Downloads (Last 12 months)171
Downloads (Last 6 weeks)44

Reflects downloads up to 21 Nov 2024

Other Metrics

View Author Metrics

Citations

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Media

Figures

Other

Tables

View Table of Contents