Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article
Open access

Fuzzing Loop Optimizations in Compilers for C++ and Data-Parallel Languages

Published: 06 June 2023 Publication History

Abstract

Compilers are part of the foundation upon which software systems are built; they need to be as correct as possible. This paper is about stress-testing loop optimizers; it presents a major reimplementation of Yet Another Random Program Generator (YARPGen), an open-source generative compiler fuzzer. This new version has found 122 bugs, both in compilers for data-parallel languages, such as the Intel® Implicit SPMD Program Compiler and the Intel® oneAPI DPC++ compiler, and in C++ compilers such as GCC and Clang/LLVM. The first main contribution of our work is a novel method for statically avoiding undefined behavior when generating loops; the resulting programs conform to the relevant language standard, enabling automated testing. The second main contribution is a collection of mechanisms for increasing the diversity of generated loop code; in our evaluation, we demonstrate that these make it possible to trigger loop optimizations significantly more often, providing opportunities to discover bugs in the optimizers.

References

[1]
Abdulazeez S. Boujarwah and Kassem Saleh. 1997. Compiler Test Case Generation Methods: A Survey and Assessment. Information and software technology, 39, 9 (1997), 617–625. https://doi.org/10.1016/S0950-5849(97)00017-7
[2]
Nader Boushehri, Ryan Berger, Stefan Mada, and John Regehr. 2022. Automated Translation Validation for an LLVM Backend. Presented at 2022 LLVM Developers’ Meeting. https://llvm.org/devmtg/2022-11/slides/TechTalk18-AutomatedTranslationValidation.pdf [Online, accessed 03-16-2023]
[3]
C. J. Burgess and M. Saidi. 1996. The Automatic Generation of Test Cases for Optimizing Fortran Compilers. Information and Software Technology, 38, 2 (1996), Jan., 111–119. issn:0950-5849 https://doi.org/10.1016/0950-5849(95)01055-6
[4]
Junjie Chen, Jibesh Patra, Michael Pradel, Yingfei Xiong, Hongyu Zhang, Dan Hao, and Lu Zhang. 2020. A Survey of Compiler Testing. ACM Computing Surveys (CSUR), 53, 1 (2020), 1–36. https://doi.org/10.1145/3363562
[5]
Chris Cummins, Pavlos Petoumenos, Alastair Murray, and Hugh Leather. 2018. Compiler Fuzzing through Deep Learning. In Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis. 95–105. https://doi.org/10.1145/3213846.3213848
[6]
Karine Even-Mendoza, Cristian Cadar, and Alastair F. Donaldson. 2020. Closer to the Edge: Testing Compilers More Thoroughly by Being Less Conservative about Undefined Behaviour. In Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering. 1219–1223. https://doi.org/10.1145/3324884.3418933
[7]
Karine Even-Mendoza, Cristian Cadar, and Alastair F. Donaldson. 2022. CsmithEdge: More Effective Compiler Testing by Handling Undefined Behaviour Less Conservatively. Empirical Software Engineering, 27, 6 (2022), 1–35. https://doi.org/10.1007/s10664-022-10146-1
[8]
Paul Feautrier. 1992. Some efficient solutions to the affine scheduling problem. Part I. One-dimensional time. International Journal of Parallel Programming, 21, 5 (1992), 313–347. https://doi.org/10.1007/BF01407835 Publisher: Springer
[9]
Paul Feautrier. 1992. Some efficient solutions to the affine scheduling problem. Part II. Multidimensional time. International Journal of Parallel Programming, 21, 6 (1992), 389–420. https://doi.org/10.1007/BF01379404 Publisher: Springer
[10]
Alex Groce, Chaoqiang Zhang, Eric Eide, Yang Chen, and John Regehr. 2012. Swarm Testing. In Proceedings of the 2012 International Symposium on Software Testing and Analysis. 78–88. https://doi.org/10.1145/2338965.2336763
[11]
William Hatch, Pierce Darragh, and Eric Eide. 2023. Xsmith. https://gitlab.flux.utah.edu/xsmith/xsmith [Online, accessed 03-16-2023]
[12]
He Jiang, Zhide Zhou, Zhilei Ren, Jingxuan Zhang, and Xiaochen Li. 2022. CTOS: Compiler Testing for Optimization Sequences of LLVM. IEEE Transactions on Software Engineering, 48, 7 (2022), 2339–2358. https://doi.org/10.1109/TSE.2021.3058671
[13]
Khronos ® SYCL™ Working Group. 2020. SYCL™ Specification 1.2.1. https://registry.khronos.org/SYCL/specs/sycl-1.2.1.pdf [Online, accessed 12-08-2022]
[14]
Vu Le, Mehrdad Afshari, and Zhendong Su. 2014. Compiler Validation via Equivalence modulo Inputs. ACM SIGPLAN Notices, 49, 6 (2014), 216–226. https://doi.org/10.1145/2666356.2594334
[15]
Vu Le, Chengnian Sun, and Zhendong Su. 2015. Finding Deep Compiler Bugs via Guided Stochastic Program Mutation. ACM SIGPLAN Notices, 50, 10 (2015), 386–399. https://doi.org/10.1145/2858965.2814319
[16]
Christopher Lidbury, Andrei Lascu, Nathan Chong, and Alastair F. Donaldson. 2015. Many-Core Compiler Fuzzing. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’15). 65–76. isbn:978-1-4503-3468-6 https://doi.org/10.1145/2737924.2737986
[17]
Martin Liška. 2022. C-Vise. https://github.com/marxin/cvise [Online, accessed 11-08-2022]
[18]
Xiao Liu, Xiaoting Li, Rupesh Prajapati, and Dinghao Wu. 2019. DeepFuzz: Automatic Generation of Syntax Valid C Programs for Fuzz Testing. In Proceedings of the AAAI Conference on Artificial Intelligence. 33, 1044–1051. https://doi.org/10.1609/aaai.v33i01.33011044
[19]
Vsevolod Livinskii, Dmitry Babokin, and John Regehr. 2020. Random Testing for C and C++ Compilers with YARPGen. In Proceedings of the ACM on Programming Languages. 4, 1–25. https://doi.org/10.1145/3485529
[20]
Nuno P. Lopes, Juneyoung Lee, Chung-Kil Hur, Zhengyang Liu, and John Regehr. 2021. Alive2: Bounded Translation Validation for LLVM. In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation. 65–79. https://doi.org/10.1145/3453483.3454030
[21]
Michael McCool, James Reinders, and Arch Robison. 2012. Structured Parallel Programming: Patterns for Efficient Computation (1st ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA. isbn:978-0-12-391443-9
[22]
Eriko Nagai, Hironobu Awazu, Nagisa Ishiura, and Naoya Takeda. 2012. Random Testing of C Compilers Targeting Arithmetic Optimization. In Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2012). 48–53.
[23]
Eriko Nagai, Atsushi Hashimoto, and Nagisa Ishiura. 2013. Scaling up Size and Number of Expressions in Random Testing of Arithmetic Optimization of C Compilers. In Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2013). 88–93.
[24]
Eriko Nagai, Atsushi Hashimoto, and Nagisa Ishiura. 2014. Reinforcing Random Testing of Arithmetic Optimization of C Compilers by Scaling up Size and Number of Expressions. IPSJ Transactions on System LSI Design Methodology, 7 (2014), 91–100. https://doi.org/10.2197/ipsjtsldm.7.91
[25]
Kazuhiro Nakamura and Nagisa Ishiura. 2015. Introducing Loop Statements in Random Testing of C Compilers Based on Expected Value Calculation. In Proc. of the Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2015). 226–227.
[26]
Kazuhiro Nakamura and Nagisa Ishiura. 2016. Random Testing of C Compilers Based on Test Program Generation by Equivalence Transformation. In 2016 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS). IEEE, 676–679. https://doi.org/10.1109/APCCAS.2016.7804063
[27]
Matt Pharr and William R. Mark. 2012. ISPC: A SPMD Compiler for High-Performance CPU Programming. In 2012 Innovative Parallel Computing (InPar). IEEE, 1–13. https://doi.org/10.1109/InPar.2012.6339601
[28]
John Regehr, Yang Chen, Pascal Cuoq, Eric Eide, Chucky Ellison, and Xuejun Yang. 2012. Test-Case Reduction for C Compiler Bugs. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation. 335–346. https://doi.org/10.1145/2345156.2254104
[29]
Yuyang Rong, Stephen Neuendorffer, and Hao Chen. 2022. IRFuzzer: Improving IR Fuzzing with More Diversified Input. Presented at 2022 LLVM Developers’ Meeting. https://llvm.org/devmtg/2022-11/slides/Lightning2-ImprovingIRFuzzingWithMoreDiversifiedInput.pdf [Online, accessed 03-16-2023]
[30]
Richard L. Sauder. 1962. A General Test Data Generator for COBOL. In Proceedings of the Spring Joint Computer Conference. 317–323. https://doi.org/10.1145/1460833.1460869
[31]
Chengnian Sun, Vu Le, and Zhendong Su. 2016. Finding Compiler Bugs via Live Code Mutation. In Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications. 849–863. https://doi.org/10.1145/2983990.2984038
[32]
Chengnian Sun, Yuanbo Li, Qirun Zhang, Tianxiao Gu, and Zhendong Su. 2018. Perses: Syntax-guided Program Reduction. In Proceedings of the 40th International Conference on Software Engineering. 361–371. https://doi.org/10.1145/3180155.3180236
[33]
Yan Wang, Peng Jia, Luping Liu, Cheng Huang, and Zhonglin Liu. 2020. A Systematic Review of Fuzzing Based on Machine Learning Techniques. PlOS One, 15, 8 (2020), e0237749. https://doi.org/10.1371/journal.pone.0237749
[34]
Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. 2011. Finding and Understanding Bugs in C Compilers. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’11). Association for Computing Machinery, 283–294. isbn:978-1-4503-0663-8 https://doi.org/10.1145/1993498.1993532
[35]
Zhide Zhou, Zhilei Ren, Guojun Gao, and He Jiang. 2021. An Empirical Study of Optimization Bugs in GCC and LLVM. Journal of Systems and Software, 174 (2021), 110884. https://doi.org/10.1016/j.jss.2020.110884

Cited By

View all
  • (2024)Shoot Yourself in the Foot — Efficient Code Causes Inefficiency in Compiler OptimizationsProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695548(1846-1857)Online publication date: 27-Oct-2024
  • (2024)Efficient Code Causes Inefficiency in Compiler OptimizationsProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695369(2426-2428)Online publication date: 27-Oct-2024
  • (2024)Detecting Optimizing Compiler Bugs via History-Driven Test Program MutationProceedings of the 15th Asia-Pacific Symposium on Internetware10.1145/3671016.3671387(145-154)Online publication date: 24-Jul-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Programming Languages
Proceedings of the ACM on Programming Languages  Volume 7, Issue PLDI
June 2023
2020 pages
EISSN:2475-1421
DOI:10.1145/3554310
Issue’s Table of Contents
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 June 2023
Published in PACMPL Volume 7, Issue PLDI

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. YARPGen
  2. automated testing
  3. compiler defect
  4. compiler testing
  5. random program generation
  6. random testing

Qualifiers

  • Research-article

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1,019
  • Downloads (Last 6 weeks)97
Reflects downloads up to 19 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Shoot Yourself in the Foot — Efficient Code Causes Inefficiency in Compiler OptimizationsProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695548(1846-1857)Online publication date: 27-Oct-2024
  • (2024)Efficient Code Causes Inefficiency in Compiler OptimizationsProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695369(2426-2428)Online publication date: 27-Oct-2024
  • (2024)Detecting Optimizing Compiler Bugs via History-Driven Test Program MutationProceedings of the 15th Asia-Pacific Symposium on Internetware10.1145/3671016.3671387(145-154)Online publication date: 24-Jul-2024
  • (2024)Refined Input, Degraded Output: The Counterintuitive World of Compiler BehaviorProceedings of the ACM on Programming Languages10.1145/36564048:PLDI(671-691)Online publication date: 20-Jun-2024
  • (2024)Boosting Compiler Testing by Injecting Real-World CodeProceedings of the ACM on Programming Languages10.1145/36563868:PLDI(223-245)Online publication date: 20-Jun-2024
  • (2024)API-Driven Program Synthesis for Testing Static Typing ImplementationsProceedings of the ACM on Programming Languages10.1145/36329048:POPL(1850-1881)Online publication date: 5-Jan-2024
  • (2024)Verified Validation for Affine Scheduling in Polyhedral CompilationTheoretical Aspects of Software Engineering10.1007/978-3-031-64626-3_17(287-305)Online publication date: 29-Jul-2024
  • (2023)Validating JIT Compilers via Compilation Space ExplorationProceedings of the 29th Symposium on Operating Systems Principles10.1145/3600006.3613140(66-79)Online publication date: 23-Oct-2023

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media