research-article

Open access

Fuzzing Loop Optimizations in Compilers for C++ and Data-Parallel Languages

Authors:

Vsevolod Livinskii,

Dmitry Babokin,

John RegehrAuthors Info & Claims

Proceedings of the ACM on Programming Languages, Volume 7, Issue PLDI

Article No.: 181, Pages 1826 - 1847

https://doi.org/10.1145/3591295

Published: 06 June 2023 Publication History

Abstract

Compilers are part of the foundation upon which software systems are built; they need to be as correct as possible. This paper is about stress-testing loop optimizers; it presents a major reimplementation of Yet Another Random Program Generator (YARPGen), an open-source generative compiler fuzzer. This new version has found 122 bugs, both in compilers for data-parallel languages, such as the Intel® Implicit SPMD Program Compiler and the Intel® oneAPI DPC++ compiler, and in C++ compilers such as GCC and Clang/LLVM. The first main contribution of our work is a novel method for statically avoiding undefined behavior when generating loops; the resulting programs conform to the relevant language standard, enabling automated testing. The second main contribution is a collection of mechanisms for increasing the diversity of generated loop code; in our evaluation, we demonstrate that these make it possible to trigger loop optimizations significantly more often, providing opportunities to discover bugs in the optimizers.

References

[1]

Abdulazeez S. Boujarwah and Kassem Saleh. 1997. Compiler Test Case Generation Methods: A Survey and Assessment. Information and software technology, 39, 9 (1997), 617–625. https://doi.org/10.1016/S0950-5849(97)00017-7

[2]

Nader Boushehri, Ryan Berger, Stefan Mada, and John Regehr. 2022. Automated Translation Validation for an LLVM Backend. Presented at 2022 LLVM Developers’ Meeting. https://llvm.org/devmtg/2022-11/slides/TechTalk18-AutomatedTranslationValidation.pdf [Online, accessed 03-16-2023]

[3]

C. J. Burgess and M. Saidi. 1996. The Automatic Generation of Test Cases for Optimizing Fortran Compilers. Information and Software Technology, 38, 2 (1996), Jan., 111–119. issn:0950-5849 https://doi.org/10.1016/0950-5849(95)01055-6

[4]

Junjie Chen, Jibesh Patra, Michael Pradel, Yingfei Xiong, Hongyu Zhang, Dan Hao, and Lu Zhang. 2020. A Survey of Compiler Testing. ACM Computing Surveys (CSUR), 53, 1 (2020), 1–36. https://doi.org/10.1145/3363562

Digital Library

[5]

Chris Cummins, Pavlos Petoumenos, Alastair Murray, and Hugh Leather. 2018. Compiler Fuzzing through Deep Learning. In Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis. 95–105. https://doi.org/10.1145/3213846.3213848

Digital Library

[6]

Karine Even-Mendoza, Cristian Cadar, and Alastair F. Donaldson. 2020. Closer to the Edge: Testing Compilers More Thoroughly by Being Less Conservative about Undefined Behaviour. In Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering. 1219–1223. https://doi.org/10.1145/3324884.3418933

Digital Library

[7]

Karine Even-Mendoza, Cristian Cadar, and Alastair F. Donaldson. 2022. CsmithEdge: More Effective Compiler Testing by Handling Undefined Behaviour Less Conservatively. Empirical Software Engineering, 27, 6 (2022), 1–35. https://doi.org/10.1007/s10664-022-10146-1

Digital Library

[8]

Paul Feautrier. 1992. Some efficient solutions to the affine scheduling problem. Part I. One-dimensional time. International Journal of Parallel Programming, 21, 5 (1992), 313–347. https://doi.org/10.1007/BF01407835 Publisher: Springer

Digital Library

[9]

Paul Feautrier. 1992. Some efficient solutions to the affine scheduling problem. Part II. Multidimensional time. International Journal of Parallel Programming, 21, 6 (1992), 389–420. https://doi.org/10.1007/BF01379404 Publisher: Springer

[10]

Alex Groce, Chaoqiang Zhang, Eric Eide, Yang Chen, and John Regehr. 2012. Swarm Testing. In Proceedings of the 2012 International Symposium on Software Testing and Analysis. 78–88. https://doi.org/10.1145/2338965.2336763

Digital Library

[11]

William Hatch, Pierce Darragh, and Eric Eide. 2023. Xsmith. https://gitlab.flux.utah.edu/xsmith/xsmith [Online, accessed 03-16-2023]

[12]

He Jiang, Zhide Zhou, Zhilei Ren, Jingxuan Zhang, and Xiaochen Li. 2022. CTOS: Compiler Testing for Optimization Sequences of LLVM. IEEE Transactions on Software Engineering, 48, 7 (2022), 2339–2358. https://doi.org/10.1109/TSE.2021.3058671

[13]

Khronos ® SYCL™ Working Group. 2020. SYCL™ Specification 1.2.1. https://registry.khronos.org/SYCL/specs/sycl-1.2.1.pdf [Online, accessed 12-08-2022]

[14]

Vu Le, Mehrdad Afshari, and Zhendong Su. 2014. Compiler Validation via Equivalence modulo Inputs. ACM SIGPLAN Notices, 49, 6 (2014), 216–226. https://doi.org/10.1145/2666356.2594334

Digital Library

[15]

Vu Le, Chengnian Sun, and Zhendong Su. 2015. Finding Deep Compiler Bugs via Guided Stochastic Program Mutation. ACM SIGPLAN Notices, 50, 10 (2015), 386–399. https://doi.org/10.1145/2858965.2814319

Digital Library

[16]

Christopher Lidbury, Andrei Lascu, Nathan Chong, and Alastair F. Donaldson. 2015. Many-Core Compiler Fuzzing. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’15). 65–76. isbn:978-1-4503-3468-6 https://doi.org/10.1145/2737924.2737986

Digital Library

[17]

Martin Liška. 2022. C-Vise. https://github.com/marxin/cvise [Online, accessed 11-08-2022]

[18]

Xiao Liu, Xiaoting Li, Rupesh Prajapati, and Dinghao Wu. 2019. DeepFuzz: Automatic Generation of Syntax Valid C Programs for Fuzz Testing. In Proceedings of the AAAI Conference on Artificial Intelligence. 33, 1044–1051. https://doi.org/10.1609/aaai.v33i01.33011044

Digital Library

[19]

Vsevolod Livinskii, Dmitry Babokin, and John Regehr. 2020. Random Testing for C and C++ Compilers with YARPGen. In Proceedings of the ACM on Programming Languages. 4, 1–25. https://doi.org/10.1145/3485529

Digital Library

[20]

Nuno P. Lopes, Juneyoung Lee, Chung-Kil Hur, Zhengyang Liu, and John Regehr. 2021. Alive2: Bounded Translation Validation for LLVM. In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation. 65–79. https://doi.org/10.1145/3453483.3454030

Digital Library

[21]

Michael McCool, James Reinders, and Arch Robison. 2012. Structured Parallel Programming: Patterns for Efficient Computation (1st ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA. isbn:978-0-12-391443-9

Digital Library

[22]

Eriko Nagai, Hironobu Awazu, Nagisa Ishiura, and Naoya Takeda. 2012. Random Testing of C Compilers Targeting Arithmetic Optimization. In Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2012). 48–53.

[23]

Eriko Nagai, Atsushi Hashimoto, and Nagisa Ishiura. 2013. Scaling up Size and Number of Expressions in Random Testing of Arithmetic Optimization of C Compilers. In Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2013). 88–93.

[24]

Eriko Nagai, Atsushi Hashimoto, and Nagisa Ishiura. 2014. Reinforcing Random Testing of Arithmetic Optimization of C Compilers by Scaling up Size and Number of Expressions. IPSJ Transactions on System LSI Design Methodology, 7 (2014), 91–100. https://doi.org/10.2197/ipsjtsldm.7.91

[25]

Kazuhiro Nakamura and Nagisa Ishiura. 2015. Introducing Loop Statements in Random Testing of C Compilers Based on Expected Value Calculation. In Proc. of the Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2015). 226–227.

[26]

Kazuhiro Nakamura and Nagisa Ishiura. 2016. Random Testing of C Compilers Based on Test Program Generation by Equivalence Transformation. In 2016 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS). IEEE, 676–679. https://doi.org/10.1109/APCCAS.2016.7804063

[27]

Matt Pharr and William R. Mark. 2012. ISPC: A SPMD Compiler for High-Performance CPU Programming. In 2012 Innovative Parallel Computing (InPar). IEEE, 1–13. https://doi.org/10.1109/InPar.2012.6339601

[28]

John Regehr, Yang Chen, Pascal Cuoq, Eric Eide, Chucky Ellison, and Xuejun Yang. 2012. Test-Case Reduction for C Compiler Bugs. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation. 335–346. https://doi.org/10.1145/2345156.2254104

Digital Library

[29]

Yuyang Rong, Stephen Neuendorffer, and Hao Chen. 2022. IRFuzzer: Improving IR Fuzzing with More Diversified Input. Presented at 2022 LLVM Developers’ Meeting. https://llvm.org/devmtg/2022-11/slides/Lightning2-ImprovingIRFuzzingWithMoreDiversifiedInput.pdf [Online, accessed 03-16-2023]

[30]

Richard L. Sauder. 1962. A General Test Data Generator for COBOL. In Proceedings of the Spring Joint Computer Conference. 317–323. https://doi.org/10.1145/1460833.1460869

Digital Library

[31]

Chengnian Sun, Vu Le, and Zhendong Su. 2016. Finding Compiler Bugs via Live Code Mutation. In Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications. 849–863. https://doi.org/10.1145/2983990.2984038

Digital Library

[32]

Chengnian Sun, Yuanbo Li, Qirun Zhang, Tianxiao Gu, and Zhendong Su. 2018. Perses: Syntax-guided Program Reduction. In Proceedings of the 40th International Conference on Software Engineering. 361–371. https://doi.org/10.1145/3180155.3180236

Digital Library

[33]

Yan Wang, Peng Jia, Luping Liu, Cheng Huang, and Zhonglin Liu. 2020. A Systematic Review of Fuzzing Based on Machine Learning Techniques. PlOS One, 15, 8 (2020), e0237749. https://doi.org/10.1371/journal.pone.0237749

[34]

Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. 2011. Finding and Understanding Bugs in C Compilers. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’11). Association for Computing Machinery, 283–294. isbn:978-1-4503-0663-8 https://doi.org/10.1145/1993498.1993532

Digital Library

[35]

Zhide Zhou, Zhilei Ren, Guojun Gao, and He Jiang. 2021. An Empirical Study of Optimization Bugs in GCC and LLVM. Journal of Systems and Software, 174 (2021), 110884. https://doi.org/10.1016/j.jss.2020.110884

Cited By

Gao FChen HZhou YWang KFilkov VRay BZhou M(2024)Shoot Yourself in the Foot — Efficient Code Causes Inefficiency in Compiler OptimizationsProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695548(1846-1857)Online publication date: 27-Oct-2024
https://dl.acm.org/doi/10.1145/3691620.3695548
Chen HFilkov VRay BZhou M(2024)Efficient Code Causes Inefficiency in Compiler OptimizationsProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695369(2426-2428)Online publication date: 27-Oct-2024
https://dl.acm.org/doi/10.1145/3691620.3695369
Zeng JTang YZhang JLi WHuang Z(2024)Detecting Optimizing Compiler Bugs via History-Driven Test Program MutationProceedings of the 15th Asia-Pacific Symposium on Internetware10.1145/3671016.3671387(145-154)Online publication date: 24-Jul-2024
https://dl.acm.org/doi/10.1145/3671016.3671387
Show More Cited By

Index Terms

Fuzzing Loop Optimizations in Compilers for C++ and Data-Parallel Languages
1. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Empirical software validation
  2. Software notations and tools
    1. Compilers
      1. Source code generation

Recommendations

Random testing for C and C++ compilers with YARPGen

Compilers should not crash and they should not miscompile applications. Random testing is an effective method for finding compiler bugs that have escaped other kinds of testing. This paper presents Yet Another Random Program Generator (YARPGen), a random ...
Finding and understanding bugs in C compilers
PLDI '11: Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation

Compilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to ...
Finding and understanding bugs in C compilers
PLDI '11

Compilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Programming Languages

Proceedings of the ACM on Programming Languages Volume 7, Issue PLDI

June 2023

2020 pages

EISSN:2475-1421

DOI:10.1145/3554310

Editor:
Michael Hicks
Amazon, USA

Issue’s Table of Contents

Copyright © 2023 Owner/Author.

This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 June 2023

Published in PACMPL Volume 7, Issue PLDI

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Funding Sources

Intel Corporation

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

8
Total Citations
View Citations
1,876
Total Downloads

Downloads (Last 12 months)1,019
Downloads (Last 6 weeks)97

Reflects downloads up to 19 Nov 2024

Other Metrics

View Author Metrics

Citations

Cited By

Gao FChen HZhou YWang KFilkov VRay BZhou M(2024)Shoot Yourself in the Foot — Efficient Code Causes Inefficiency in Compiler OptimizationsProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695548(1846-1857)Online publication date: 27-Oct-2024
https://dl.acm.org/doi/10.1145/3691620.3695548
Chen HFilkov VRay BZhou M(2024)Efficient Code Causes Inefficiency in Compiler OptimizationsProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695369(2426-2428)Online publication date: 27-Oct-2024
https://dl.acm.org/doi/10.1145/3691620.3695369
Zeng JTang YZhang JLi WHuang Z(2024)Detecting Optimizing Compiler Bugs via History-Driven Test Program MutationProceedings of the 15th Asia-Pacific Symposium on Internetware10.1145/3671016.3671387(145-154)Online publication date: 24-Jul-2024
https://dl.acm.org/doi/10.1145/3671016.3671387
Theodoridis TSu Z(2024)Refined Input, Degraded Output: The Counterintuitive World of Compiler BehaviorProceedings of the ACM on Programming Languages10.1145/36564048:PLDI(671-691)Online publication date: 20-Jun-2024
https://dl.acm.org/doi/10.1145/3656404
Li STheodoridis TSu Z(2024)Boosting Compiler Testing by Injecting Real-World CodeProceedings of the ACM on Programming Languages10.1145/36563868:PLDI(223-245)Online publication date: 20-Jun-2024
https://dl.acm.org/doi/10.1145/3656386
Sotiropoulos TChaliasos SSu Z(2024)API-Driven Program Synthesis for Testing Static Typing ImplementationsProceedings of the ACM on Programming Languages10.1145/36329048:POPL(1850-1881)Online publication date: 5-Jan-2024
https://dl.acm.org/doi/10.1145/3632904
Li XLiang HFeng X(2024)Verified Validation for Affine Scheduling in Polyhedral CompilationTheoretical Aspects of Software Engineering10.1007/978-3-031-64626-3_17(287-305)Online publication date: 29-Jul-2024
https://dl.acm.org/doi/10.1007/978-3-031-64626-3_17
Li CJiang YXu CSu ZDruschel PKaufmann AMace JFlinn JSeltzer M(2023)Validating JIT Compilers via Compilation Space ExplorationProceedings of the 29th Symposium on Operating Systems Principles10.1145/3600006.3613140(66-79)Online publication date: 23-Oct-2023
https://dl.acm.org/doi/10.1145/3600006.3613140

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Media

Figures

Other

Tables

View Issue’s Table of Contents